WO2025227131A1 - Dystrophin constructs and methods of use thereof - Google Patents

Dystrophin constructs and methods of use thereof

Info

Publication number
WO2025227131A1
Authority
WO
WIPO (PCT)
Prior art keywords
seq
domain
sequence
partial
midi
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
PCT/US2025/026535
Other languages
French (fr)
Inventor
Jason Allen WEST
Zachary Corkin KENNEDY
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Canbridge Pharmaceuticals Inc United States
Original Assignee
Canbridge Pharmaceuticals Inc United States
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Canbridge Pharmaceuticals Inc United States
Publication of WO2025227131A1

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79 Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/85 Vectors or expression systems specially adapted for eukaryotic hosts for animal cells
    • C12N15/86 Viral vectors
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K38/00 Medicinal preparations containing peptides
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/435 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
    • C07K14/46 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates
    • C07K14/47 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates from mammals
    • C07K14/4701 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates from mammals not used
    • C07K14/4707 Muscular dystrophy
    • C07K14/4708 Duchenne dystrophy
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K48/00 Medicinal preparations containing genetic material which is inserted into cells of the living body to treat genetic diseases; Gene therapy
    • A61K48/005 Medicinal preparations containing genetic material which is inserted into cells of the living body to treat genetic diseases; Gene therapy characterised by an aspect of the 'active' part of the composition delivered, i.e. the nucleic acid delivered
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2750/00 MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA ssDNA viruses
    • C12N2750/00011 Details
    • C12N2750/14011 Parvoviridae
    • C12N2750/14111 Dependovirus, e.g. adenoassociated viruses
    • C12N2750/14141 Use of virus, viral particle or viral elements as a vector
    • C12N2750/14143 Use of virus, viral particle or viral elements as a vector viral genome or elements thereof as genetic vector
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2800/00 Nucleic acids vectors
    • C12N2800/22 Vectors comprising a coding region that has been codon optimised for expression in a respective host
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2800/00 Nucleic acids vectors
    • C12N2800/40 Systems of functionally co-operating vectors
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2830/00 Vector systems having a special element relevant for transcription
    • C12N2830/48 Vector systems having a special element relevant for transcription regulating transport or export of RNA, e.g. RRE, PRE, WPRE, CTE
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2840/00 Vectors comprising a special translation-regulating system
    • C12N2840/44 Vectors comprising a special translation-regulating system being a specific part of the splice mechanism, e.g. donor, acceptor

Definitions

  • MD Muscular dystrophy
  • Some forms of MD present symptoms at birth or develop during childhood, while others may not appear until middle age or later.
  • the disorders differ in terms of the distribution and extent of muscle weakness, age of onset, rate of progression, and pattern of inheritance.
  • DMD Duchenne muscular dystrophy
  • the disease occurs due to a defective DMD gene that results in absence of dystrophin, a protein that is involved in maintaining the integrity of muscle.
  • individuals with DMD may have symptoms such as trouble walking and running, falling frequently, fatigue, learning disabilities/difficulties, heart issues as a result of impact on heart muscle functioning, and breathing problems due to weakening of respiratory muscles involved in lung function.
  • Symptoms of muscle weakness associated with DMD typically begin in childhood, often between 3 and 6 years of age. DMD mainly affects males and in rare cases may affect females. About one in every 3,500 boys is affected by this disorder. As the disease progresses, life-threatening heart and respiratory problems can occur.
  • the DMD gene is one of the largest known human genes. Its largest isoform contains 79 exons and encodes a 427 kDa dystrophin protein. The extremely large size of the gene contributes to a complex mutational spectrum, with more than 7,000 different mutations and a high spontaneous mutation rate. The most severe phenotype associated with DMD is most often caused by out-of-frame mutations, resulting in complete loss of dystrophin protein expression. In-frame mutations that allow for the synthesis of an internally truncated but partially functional protein are associated with a milder phenotype known as Becker muscular dystrophy (BMD).
  • BMD Becker muscular dystrophy
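The distinction drawn above between out-of-frame and in-frame mutations can be illustrated with a short sketch. It assumes the simplest case, a single contiguous deletion within the coding sequence, and applies the general rule that a deletion whose length is a multiple of three leaves the downstream reading frame intact. The function name and the example lengths are hypothetical and are not taken from the disclosure.

```python
def preserves_reading_frame(deleted_nucleotides: int) -> bool:
    """Return True if a contiguous coding-sequence deletion keeps the frame.

    Assumption: one simple deletion and no other changes. Real
    genotype-phenotype prediction for DMD/BMD is more complex than this.
    """
    if deleted_nucleotides < 0:
        raise ValueError("deletion length must be non-negative")
    return deleted_nucleotides % 3 == 0


# Hypothetical deletion lengths, for illustration only.
for length in (150, 151):
    label = "in-frame" if preserves_reading_frame(length) else "out-of-frame"
    print(f"{length} nt deleted: {label}")
```

An in-frame result here corresponds only to the possibility of an internally truncated protein; it says nothing about whether that protein is functional.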
  • the present disclosure provides isolated recombinant nucleic acid molecules encoding truncated dystrophin proteins that can be delivered and expressed in a subject using a dual adeno-associated virus (AAV) vector system to allow expression of truncated dystrophin proteins that are otherwise too large to fit into a single AAV system.
  • AAV adeno-associated virus
  • the truncated dystrophin proteins can be used in place of a wild-type dystrophin to restore dystrophin expression and function in a subject in need thereof.
  • the present disclosure also provides systems and methods for expressing or delivering a truncated dystrophin protein in a subject, and methods for treating a subject having a dystrophin-associated disease or disorder, e.g., muscular dystrophy, e.g., Duchenne muscular dystrophy (DMD).
  • the AAV vectors can be delivered to a subject in need thereof, e.g., a subject having a dystrophin-associated disease or disorder, e.g., muscular dystrophy, e.g., DMD, to produce a significant level of a functional truncated dystrophin, and to protect muscle fibers from injury, increase muscle strength, and reduce and/or prevent fibrosis in the subject.
  • the present invention is directed to a system for generating a truncated human dystrophin protein, comprising a first recombinant nucleic acid molecule and a second recombinant nucleic acid molecule, wherein the first nucleic acid molecule comprises a first coding region encoding an N-terminal portion of the truncated dystrophin protein and a 3’ ribozyme, wherein the first coding region is operably linked to the 3’ ribozyme at its 3’ end, wherein the second nucleic acid molecule comprises a second coding region encoding a C-terminal portion of the truncated dystrophin protein and a 5’ ribozyme, wherein the second coding region is operably linked to the 5’ ribozyme at its 5’ end, wherein upon ribo
  • the first coding region is operably linked to two or more 3’ ribozymes at its 3’ end. In some embodiments, the two or more 3’ ribozymes are the same 3’ ribozyme. In some embodiments, the two or more 3’ ribozymes are different 3’ ribozymes.
  • the second coding region is operably linked to two or more 5’ ribozymes at its 5’ end. In some embodiments, the two or more 5’ ribozymes are the same 5’ ribozymes. In some embodiments, the two or more 5’ ribozymes are different 5’ ribozymes.
  • the 5’ ribozyme and the 3’ ribozyme are each independently selected from the group consisting of Twister (Osa), Twister (Dre), Twister (Nvi), Twister (Sbi), Twister (Envl), Twister (Spu), Twister (Cpa), Twister Sister, Hammerhead (RzB), HDV, Pistol, Varkud Satellite (VS), Hatchet, Hairpin, and Hovlinc (Hov).
  • the 5’ ribozyme and the 3’ ribozyme are each independently selected from the group consisting of SEQ ID NOs: 6 - 20.
  • the first nucleic acid molecule further comprises an intron splice donor sequence
  • the second nucleic acid molecule further comprises an intron splice acceptor sequence
  • the splice donor sequence is positioned between the first coding region and the 3’ ribozyme.
  • the splice donor sequence is selected from the group consisting of SEQ ID NOs: 133 - 136.
  • the splice donor sequence is positioned within a region of the truncated dystrophin protein coding for a region selected from the group consisting of the R7 domain, the R8 domain, the R9 domain, the R10 domain, the R11 domain, the R12 domain, the R13 domain, the R14 domain, the R15 domain, the R16 domain, the R17 domain, the R18 domain, the R19 domain, the H3 domain, the R20 domain, the R21 domain, and the R22 domain.
  • the splice donor sequence is positioned within a region of the truncated dystrophin protein coding for a region selected from the group consisting of the R8 domain, the R19 domain, the H3 domain, the R20 domain, and the R21 domain.
  • the splice donor sequence is not positioned within the R21 domain.
  • the splice donor sequence is positioned at least 10 nucleotides, at least 15 nucleotides, at least 20 nucleotides, at least 30 nucleotides, at least 40 nucleotides, at least 50 nucleotides, or more from the 3’ ribozyme.
  • the splice acceptor sequence is positioned between the 5’ ribozyme and the second coding region.
  • the splice acceptor sequence is selected from the group consisting of SEQ ID NOs: 137 - 141.
  • the splice acceptor sequence is positioned at least 10 nucleotides, at least 15 nucleotides, at least 20 nucleotides, at least 30 nucleotides, at least 40 nucleotides, at least 50 nucleotides, or more from the 5’ ribozyme.
  • the splice donor sequence and the splice acceptor sequence are positioned such that the resulting spliced intron is between 50 - 200 bp in length.
  • the splice donor sequence and splice acceptor sequence are positioned such that the resulting spliced intron encodes a single predominant reading frame.
  • a stop codon sequence is introduced into the splice donor sequence or the splice acceptor sequence.
  • At least one of the first coding region and the second coding region is at least 2000 nucleotides in length, or at least 2200 nucleotides in length, or at least 2400 nucleotides in length, or at least 2600 nucleotides in length.
  • the first coding region and the second coding region are each at least 2000 nucleotides in length, or at least 2200 nucleotides in length, or at least 2400 nucleotides in length, or at least 2600 nucleotides in length.
  • the third coding region is at least 4920 nucleotides in length, or at least 5100 nucleotides in length, or at least 5300 nucleotides in length.
  • the first coding region and the second coding region do not share a region of substantial sequence identity.
  • the 3’ end of the first coding region does not have a sequence identity to the 5’ end of the second coding region.
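The length and spacing limits recited above can be collected into a simple design check. The sketch below is illustrative only: it uses the lowest thresholds mentioned (coding regions of at least 2000 nucleotides, at least 10 nucleotides between each splice site and its ribozyme, a spliced intron of 50 to 200 bp) and applies them all at once, even though the source presents them as separate optional embodiments. The class, field names, and example values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DualVectorLayout:
    """Hypothetical summary of a two-molecule design; all values in nucleotides."""
    first_coding_len: int          # first coding region (N-terminal portion)
    second_coding_len: int         # second coding region (C-terminal portion)
    donor_to_3p_ribozyme: int      # splice donor to 3' ribozyme spacing
    acceptor_to_5p_ribozyme: int   # 5' ribozyme to splice acceptor spacing
    spliced_intron_len: int        # length of the resulting spliced intron


def check_layout(layout, min_coding=2000, min_spacing=10, intron_range=(50, 200)):
    """Return a list of violated constraints; an empty list means all checks pass."""
    problems = []
    if layout.first_coding_len < min_coding:
        problems.append("first coding region too short")
    if layout.second_coding_len < min_coding:
        problems.append("second coding region too short")
    if layout.donor_to_3p_ribozyme < min_spacing:
        problems.append("splice donor too close to 3' ribozyme")
    if layout.acceptor_to_5p_ribozyme < min_spacing:
        problems.append("splice acceptor too close to 5' ribozyme")
    low, high = intron_range
    if not low <= layout.spliced_intron_len <= high:
        problems.append("spliced intron length out of range")
    return problems


# Made-up example values, not taken from the disclosure.
example = DualVectorLayout(2400, 2600, 20, 15, 120)
print(check_layout(example) or "all checks pass")
```

Stricter embodiments (for example, 2600-nucleotide coding regions or 50-nucleotide spacing) can be tested by passing different keyword arguments.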
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein further comprises H1 Domain (SEQ ID NO: 22).
  • the truncated dystrophin protein comprises an amino acid sequence selected from the group consisting of:
  • a. midi-Dys Δ R1-R15 (SEQ ID NO: 83)
  • b. midi-Dys Δ R2-R15 (SEQ ID NO: 84)
  • c. midi-Dys Δ R3-R15 (SEQ ID NO: 85)
  • d. midi-Dys Δ H2-R15 (SEQ ID NO: 86)
  • e. midi-Dys Δ R4-R15 (SEQ ID NO: 87)
  • f. midi-Dys Δ R5-R15 (SEQ ID NO: 88)
  • g. midi-Dys Δ exon 13-33 (SEQ ID NO: 93)
  • h. midi-Dys Δ exon 13-39 (SEQ ID NO: 94)
  • i. midi-Dys Δ exon 13-41 (SEQ ID NO: 95)
  • j. midi-Dys Δ exon 13-48 (SEQ ID NO: 96)
  • k. midi-Dys Δ exon 15-39 (SEQ ID NO: 97)
  • l. midi-Dys Δ exon 15-41 (SEQ ID NO: 98)
  • m. midi-Dys Δ exon 15-48 (SEQ ID NO: 99)
  • n. midi-Dys Δ exon 17-39 (SEQ ID NO: 100)
  • o. midi-Dys Δ exon 17-41 (SEQ ID NO: 101)
  • p. midi-Dys Δ exon 17-48 (SEQ ID NO: 102)
  • q. midi-Dys Δ exon 18-39 (SEQ ID NO: 220)
  • r. midi-Dys Δ exon 18-41 (SEQ ID NO: 221)
  • s. midi-Dys Δ exon 18-48 (SEQ ID NO: 222)
  • t. midi-Dys Δ exon 19-39 (SEQ ID NO: 103)
  • u. midi-Dys Δ exon 19-41 (SEQ ID NO: 104)
  • v. midi-Dys Δ exon 19-48 (SEQ ID NO: 105)
  • w. midi-Dys Δ exon 21-41 (SEQ ID NO: 106)
  • x. midi-Dys Δ exon 21-42 (SEQ ID NO: 223)
  • y. midi-Dys Δ exon 21-48 (SEQ ID NO: 107).
  • the truncated dystrophin protein further comprises R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), and R19 Domain (SEQ ID NO: 42).
  • the truncated dystrophin protein comprises an amino acid sequence selected from the group consisting of:
  • a. midi-Dys Δ R1-R15 (SEQ ID NO: 83)
  • b. midi-Dys Δ R2-R15 (SEQ ID NO: 84)
  • c. midi-Dys Δ R3-R15 (SEQ ID NO: 85)
  • d. midi-Dys Δ H2-R15 (SEQ ID NO: 86)
  • e. midi-Dys Δ R4-R15 (SEQ ID NO: 87)
  • f. midi-Dys Δ R5-R15 (SEQ ID NO: 88)
  • g. midi-Dys Δ exon 10-33 (SEQ ID NO: 89)
  • h. midi-Dys Δ exon 10-39 (SEQ ID NO: 90)
  • i. midi-Dys Δ exon 10-41 (SEQ ID NO: 91)
  • j. midi-Dys Δ exon 11-33 (SEQ ID NO: 216)
  • k. midi-Dys Δ exon 11-39 (SEQ ID NO: 217)
  • l. midi-Dys Δ exon 11-41 (SEQ ID NO: 218)
  • m. midi-Dys Δ exon 13-33 (SEQ ID NO: 93)
  • n. midi-Dys Δ exon 13-39 (SEQ ID NO: 94)
  • o. midi-Dys Δ exon 13-41 (SEQ ID NO: 95)
  • p. midi-Dys Δ exon 15-39 (SEQ ID NO: 97)
  • q. midi-Dys Δ exon 15-41 (SEQ ID NO: 98)
  • r. midi-Dys Δ exon 17-39 (SEQ ID NO: 100)
  • s. midi-Dys Δ exon 17-41 (SEQ ID NO: 101)
  • t. midi-Dys Δ exon 18-39 (SEQ ID NO: 220)
  • u. midi-Dys Δ exon 18-41 (SEQ ID NO: 221)
  • v. midi-Dys Δ exon 19-39 (SEQ ID NO: 103)
  • w. midi-Dys Δ exon 19-41 (SEQ ID NO: 104)
  • x. midi-Dys Δ exon 21-41 (SEQ ID NO: 106)
  • y. midi-Dys Δ exon 21-42 (SEQ ID NO: 223).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), R4 domain (SEQ ID NO: 27), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), a partial H1 domain (SEQ ID NO: 404), a partial R11 domain (SEQ ID NO: 412), R12 Domain (SEQ ID NO: 35), R13 Domain (SEQ ID NO: 36), R14 domain (SEQ ID NO: 37), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), a partial H1 domain (SEQ ID NO: 404), a partial R14 Domain (SEQ ID NO: 413), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), a partial H1 domain (SEQ ID NO: 404), a partial R15 Domain (SEQ ID NO: 414), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), a partial H1 domain (SEQ ID NO: 404), a partial R19 Domain (SEQ ID NO: 415), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), a partial R1 domain (SEQ ID NO: 405), a partial R11 domain (SEQ ID NO: 412), R12 Domain (SEQ ID NO: 35), R13 Domain (SEQ ID NO: 36), R14 Domain (SEQ ID NO: 37), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), a partial R1 domain (SEQ ID NO: 405), a partial R14 Domain (SEQ ID NO: 413), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), a partial R1 domain (SEQ ID NO: 405), a partial R15 Domain (SEQ ID NO: 414), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), a partial R1 domain (SEQ ID NO: 405), a partial R19 Domain (SEQ ID NO: 415), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), a partial R2 domain (SEQ ID NO: 406), a partial R11 domain (SEQ ID NO: 412), R12 Domain (SEQ ID NO: 35), R13 Domain (SEQ ID NO: 36), R14 Domain (SEQ ID NO: 37), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), a partial R2 domain (SEQ ID NO: 406), a partial R14 Domain (SEQ ID NO: 413), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), a partial R2 domain (SEQ ID NO: 406), a partial R15 Domain (SEQ ID NO: 414), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), a partial R2 domain (SEQ ID NO: 406), a partial R19 Domain (SEQ ID NO: 415), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), a partial R3 domain (SEQ ID NO: 407), a partial R14 Domain (SEQ ID NO: 413), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), a partial R3 domain (SEQ ID NO: 407), a partial R15 Domain (SEQ ID NO: 414), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), a partial R3 domain (SEQ ID NO: 407), a partial R19 Domain (SEQ ID NO: 415), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), a partial R3 domain (SEQ ID NO: 408), a partial R14 Domain (SEQ ID NO: 413), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), a partial R3 domain (SEQ ID NO: 408), a partial R15 Domain (SEQ ID NO: 414), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), a partial R3 domain (SEQ ID NO: 408), a partial R19 Domain (SEQ ID NO: 415), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), a partial R4 domain (SEQ ID NO: 409), a partial R14 Domain (SEQ ID NO: 413), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), a partial R4 domain (SEQ ID NO: 409), a partial R15 Domain (SEQ ID NO: 414), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), a partial R4 domain (SEQ ID NO: 409), a partial R19 Domain (SEQ ID NO: 415), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), a partial R4 domain (SEQ ID NO: 410), a partial R14 Domain (SEQ ID NO: 413), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), a partial R4 domain (SEQ ID NO: 410), a partial R15 Domain (SEQ ID NO: 414), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), a partial R4 domain (SEQ ID NO: 410), a partial R19 Domain (SEQ ID NO: 415), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), R4 domain (SEQ ID NO: 27), a partial R5 domain (SEQ ID NO: 411), a partial R15 Domain (SEQ ID NO: 414), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), R4 domain (SEQ ID NO: 27), a partial R5 domain (SEQ ID NO: 411), a partial R16 Domain (SEQ ID NO: 416), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), R4 domain (SEQ ID NO: 27), a partial R5 domain (SEQ ID NO: 411), a partial R19 Domain (SEQ ID NO: 415), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • The truncated dystrophin proteins described above are depicted in Table 1 below:
  • the truncated dystrophin protein comprises an amino acid sequence selected from the group consisting of SEQ ID NOs: 83-107 and 216-223, or an amino acid sequence at least about 90% identical thereto.
  • the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 86, or an amino acid sequence at least about 90% identical thereto.
  • the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 95, or an amino acid sequence at least about 90% identical thereto.
  • the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 101, or an amino acid sequence at least about 90% identical thereto.
  • the amino acid sequence of the truncated dystrophin protein is not identical to an amino acid sequence of SEQ ID NO: 143.
  • the truncated dystrophin protein is not a polypeptide of 2361 amino acids. In some embodiments, the truncated dystrophin protein is less than 2361 amino acids in length. In some embodiments, the truncated dystrophin protein is greater than 2361 amino acids in length.
  • the truncated dystrophin protein is functional.
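As a rough illustration of the "at least about 90% identical" criterion recited above, the following Python sketch computes percent identity between two sequences that have already been aligned. The function names, the gap character, and the choice of denominator (aligned columns, ignoring columns that are gaps in both sequences) are assumptions made for illustration only. This section does not specify how identity is calculated, and the toy sequences used with it are not taken from the sequence listing.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the columns of a pairwise alignment.

    Assumption: identity = matching residues / aligned columns, where
    columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep every column except those that are gaps in both sequences.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("alignment has no comparable columns")
    # A residue paired with a gap never counts as a match.
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 90.0) -> bool:
    """True if the aligned pair reaches the (hypothetical) threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

For example, two aligned ten-residue toy peptides that differ at one position give 90% identity under this definition. A different denominator (for instance, the length of the shorter sequence) would give different values, so the convention should be fixed before comparing candidates.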
  • the first nucleic acid molecule is present in a first viral vector, and the second nucleic acid molecule is present in a second viral vector.
  • the first viral vector and the second viral vector are each independently selected from the group consisting of an adenoviral vector, an adeno-associated viral vector, a lentiviral vector, a vaccinia vector, a herpes simplex viral vector, and an Epstein-Barr viral vector.
  • the first viral vector is an adeno-associated viral (AAV) vector.
  • the second viral vector is an AAV vector.
  • the AAV vector is selected from the group consisting of an AAV1, AAV2, AAV3, AAV4, AAV5, AAV6, AAV7, AAV8, AAV9, AAV10, AAV11, AAV12, AAV13, AAVrh74, AAV-rh10, AAV-DJ, AAV-LK03, AAV-MYO, AAV-MYO2, AAV-MYO3, MYO3A-AAV, MYO4A-AAV, and MYO4E-AAV.
  • the first AAV vector further comprises a first promoter operably linked to the first nucleic acid molecule.
  • the second AAV vector further comprises a second promoter operably linked to the second nucleic acid molecule.
  • the promoter comprises a tissue specific promoter or a ubiquitous promoter.
  • the promoter comprises a CK8 promoter, an MHCK7 promoter, an SPC5 promoter, or a minimal CKM promoter.
  • the promoter comprises a nucleotide sequence selected from the group consisting of SEQ ID NOs: 144-150, or a nucleotide sequence at least 95% identical thereto.
  • the first and/or second AAV vectors further comprise an inverted terminal repeat (ITR) sequence.
  • the ITR sequence comprises a nucleotide sequence of SEQ ID NO: 202 and/or 203, or a nucleotide sequence at least 95% identical thereto.
  • the first and/or second AAV vectors further comprise an intron region.
  • the intron region comprises a nucleotide sequence of SEQ ID NO: 156 or 157, or a nucleotide sequence at least 95% identical thereto.
  • the first and/or second AAV vectors further comprise a polyadenylation sequence.
  • the polyadenylation sequence comprises a nucleotide sequence of SEQ ID NO: 151 or 152, or a nucleotide sequence at least 95% identical thereto.
  • the first and/or second AAV vectors further comprise a Woodchuck Hepatitis Virus Posttranscriptional Regulatory Element (WPRE).
  • the WPRE comprises a nucleotide sequence of any one of SEQ ID NOs: 153-155, or a nucleotide sequence at least 95% identical thereto.
  • the first and/or second AAV vectors further comprise a Kozak sequence.
  • the present invention is directed to a vector system for expressing a truncated human dystrophin protein, comprising a first AAV vector and a second AAV vector, wherein the first AAV vector comprises a first nucleic acid molecule comprising a first coding region encoding an N-terminal portion of the truncated dystrophin protein and a 3’ ribozyme, where the first coding region is operably linked to the 3’ ribozyme at its 3’ end, wherein the second AAV vector comprises a second nucleic acid molecule comprising a second coding region encoding a C-terminal portion of the truncated dystrophin protein and a 5’ ribozyme, where the second coding region is operably linked to the 5’ ribozyme at its 5’ end, wherein upon ribozyme-mediated catalytic ligation, the first coding region and the second coding region forms a third coding region encoding for the t
  • the first coding region is operably linked to two or more 3’ ribozymes at its 3’ end. In some embodiments, the two or more 3’ ribozymes are the same 3’ ribozyme. In some embodiments, the two or more 3’ ribozymes are different 3’ ribozymes.
  • the second coding region is operably linked to two or more 5’ ribozymes at its 5’ end. In some embodiments, the two or more 5’ ribozymes are the same 5’ ribozymes. In some embodiments, the two or more 5’ ribozymes are different 5’ ribozymes.
  • the 5’ ribozyme and the 3’ ribozyme are each independently selected from the group consisting of Twister (Osa), Twister (Dre), Twister (Nvi), Twister (Sbi), Twister (Envl), Twister (Spu), Twister (Cpa), Twister Sister, Hammerhead (RzB), HDV, Pistol, Varkud Satellite (VS), Hatchet, Hairpin, and Hovlinc (Hov).
  • the 5’ ribozyme and the 3’ ribozyme are each independently selected from the group consisting of SEQ ID NOs: 6-20.
  • the first nucleic acid molecule further comprises an intron splice donor sequence.
  • the second nucleic acid molecule further comprises an intron splice acceptor sequence.
  • the splice donor sequence is positioned between the first coding region and the 3’ ribozyme.
  • the splice donor sequence is selected from the group consisting of SEQ ID NOs: 133-136.
  • the splice donor sequence is positioned at least 10 nucleotides, at least 15 nucleotides, at least 20 nucleotides, at least 30 nucleotides, at least 40 nucleotides, at least 50 nucleotides, or more from the 3’ ribozyme.
  • the splice acceptor sequence is positioned between the 5’ ribozyme and the second coding region.
  • the splice acceptor sequence is selected from the group consisting of SEQ ID NOs: 137-141.
  • the splice acceptor sequence is positioned at least 10 nucleotides, at least 15 nucleotides, at least 20 nucleotides, at least 30 nucleotides, at least 40 nucleotides, at least 50 nucleotides, or more from the 5’ ribozyme.
  • the splice donor sequence and the splice acceptor sequence are positioned such that the resulting spliced intron is between 50 and 200 bp in length.
  • the splice donor sequence and splice acceptor sequence are positioned such that the resulting spliced intron encodes a single predominant reading frame.
  • a stop codon sequence is introduced into the splice donor sequence or the splice acceptor sequence.
  • At least one of the first coding region and the second coding region is at least 2000 nucleotides in length, or at least 2200 nucleotides in length, or at least 2400 nucleotides in length, or at least 2600 nucleotides in length.
  • the first coding region and the second coding region are each at least 2000 nucleotides in length, or at least 2200 nucleotides in length, or at least 2400 nucleotides in length, or at least 2600 nucleotides in length.
  • the third coding region is at least 4920 nucleotides in length, or at least 5100 nucleotides in length, or at least 5300 nucleotides in length.
  • the first coding region and the second coding region do not share a region of substantial sequence identity.
  • the 3’ end of the first coding region does not have a sequence identity to the 5’ end of the second coding region.
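The positional and length limits recited above for the split coding regions can be collected into a simple design check. The Python sketch below is illustrative only: the class and field names are hypothetical, the default thresholds are the lowest values recited above (2000 nucleotides per coding region, 4920 nucleotides for the joined coding region, 10 nucleotides of spacing from each ribozyme, and a 50 to 200 bp spliced intron), and treating the joined coding region as the sum of the two halves is an assumption. The text presents these limits as separate embodiments, so applying them all at once is also an assumption.

```python
from dataclasses import dataclass


@dataclass
class SplitDesign:
    """Hypothetical summary of a two-vector design (all lengths in nt/bp)."""
    first_cds_nt: int             # first (N-terminal) coding region
    second_cds_nt: int            # second (C-terminal) coding region
    donor_to_ribozyme_nt: int     # splice donor to 3' ribozyme spacing
    acceptor_to_ribozyme_nt: int  # 5' ribozyme to splice acceptor spacing
    intron_bp: int                # intron removed after ligation


def check_design(d: SplitDesign, min_cds: int = 2000, min_joined: int = 4920,
                 min_spacing: int = 10, intron_range: tuple = (50, 200)) -> list:
    """Return a list of human-readable violations (empty if none)."""
    problems = []
    if d.first_cds_nt < min_cds:
        problems.append("first coding region shorter than minimum")
    if d.second_cds_nt < min_cds:
        problems.append("second coding region shorter than minimum")
    # Assumption: the joined (third) coding region is the sum of both halves.
    if d.first_cds_nt + d.second_cds_nt < min_joined:
        problems.append("joined coding region shorter than minimum")
    if d.donor_to_ribozyme_nt < min_spacing:
        problems.append("splice donor too close to 3' ribozyme")
    if d.acceptor_to_ribozyme_nt < min_spacing:
        problems.append("splice acceptor too close to 5' ribozyme")
    low, high = intron_range
    if not low <= d.intron_bp <= high:
        problems.append("spliced intron outside allowed length range")
    return problems
```

A design with coding regions of 2600 and 2400 nucleotides, 20-nucleotide spacings, and a 120 bp intron passes every check, whereas shortening the first coding region to 1800 nucleotides triggers both the per-region and the joined-length checks.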
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein further comprises H1 Domain (SEQ ID NO: 22).
  • the truncated dystrophin protein comprises an amino acid sequence selected from the group consisting of:
  • a. midi-Dys Δ R1-R15 (SEQ ID NO: 83),
  • b. midi-Dys Δ R2-R15 (SEQ ID NO: 84),
  • c. midi-Dys Δ R3-R15 (SEQ ID NO: 85),
  • d. midi-Dys Δ H2-R15 (SEQ ID NO: 86),
  • e. midi-Dys Δ R4-R15 (SEQ ID NO: 87),
  • f. midi-Dys Δ R5-R15 (SEQ ID NO: 88),
  • g. midi-Dys Δ exon 13-33 (SEQ ID NO: 93),
  • h. midi-Dys Δ exon 13-39 (SEQ ID NO: 94),
  • i. midi-Dys Δ exon 13-41 (SEQ ID NO: 95),
  • j. midi-Dys Δ exon 13-48 (SEQ ID NO: 96),
  • k. midi-Dys Δ exon 15-39 (SEQ ID NO: 97),
  • l. midi-Dys Δ exon 15-41 (SEQ ID NO: 98),
  • m. midi-Dys Δ exon 15-48 (SEQ ID NO: 99),
  • n. midi-Dys Δ exon 17-39 (SEQ ID NO: 100),
  • o. midi-Dys Δ exon 17-41 (SEQ ID NO: 101),
  • p. midi-Dys Δ exon 17-48 (SEQ ID NO: 102),
  • q. midi-Dys Δ exon 18-39 (SEQ ID NO: 220),
  • r. midi-Dys Δ exon 18-41 (SEQ ID NO: 221),
  • s. midi-Dys Δ exon 18-48 (SEQ ID NO: 222),
  • t. midi-Dys Δ exon 19-39 (SEQ ID NO: 103),
  • u. midi-Dys Δ exon 19-41 (SEQ ID NO: 104),
  • v. midi-Dys Δ exon 19-48 (SEQ ID NO: 105),
  • w. midi-Dys Δ exon 21-41 (SEQ ID NO: 106),
  • x. midi-Dys Δ exon 21-42 (SEQ ID NO: 223), and
  • y. midi-Dys Δ exon 21-48 (SEQ ID NO: 107).
  • the truncated dystrophin protein further comprises R16 Domain
  • the truncated dystrophin protein comprises an amino acid sequence selected from the group consisting of:
  • a. midi-Dys Δ R1-R15 (SEQ ID NO: 83),
  • b. midi-Dys Δ R2-R15 (SEQ ID NO: 84),
  • c. midi-Dys Δ R3-R15 (SEQ ID NO: 85),
  • d. midi-Dys Δ H2-R15 (SEQ ID NO: 86),
  • e. midi-Dys Δ R4-R15 (SEQ ID NO: 87),
  • f. midi-Dys Δ R5-R15 (SEQ ID NO: 88),
  • g. midi-Dys Δ exon 10-33 (SEQ ID NO: 89),
  • h. midi-Dys Δ exon 10-39 (SEQ ID NO: 90),
  • i. midi-Dys Δ exon 10-41 (SEQ ID NO: 91),
  • j. midi-Dys Δ exon 11-33 (SEQ ID NO: 216),
  • k. midi-Dys Δ exon 11-39 (SEQ ID NO: 217),
  • l. midi-Dys Δ exon 11-41 (SEQ ID NO: 218),
  • m. midi-Dys Δ exon 13-33 (SEQ ID NO: 93),
  • n. midi-Dys Δ exon 13-39 (SEQ ID NO: 94),
  • o. midi-Dys Δ exon 13-41 (SEQ ID NO: 95),
  • p. midi-Dys Δ exon 15-39 (SEQ ID NO: 97),
  • q. midi-Dys Δ exon 15-41 (SEQ ID NO: 98),
  • r. midi-Dys Δ exon 17-39 (SEQ ID NO: 100),
  • s. midi-Dys Δ exon 17-41 (SEQ ID NO: 101),
  • t. midi-Dys Δ exon 18-39 (SEQ ID NO: 220),
  • u. midi-Dys Δ exon 18-41 (SEQ ID NO: 221),
  • v. midi-Dys Δ exon 19-39 (SEQ ID NO: 103),
  • w. midi-Dys Δ exon 19-41 (SEQ ID NO: 104),
  • x. midi-Dys Δ exon 21-41 (SEQ ID NO: 106), and
  • y. midi-Dys Δ exon 21-42 (SEQ ID NO: 223).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), R4 domain (SEQ ID NO: 27), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), a partial H1 domain (SEQ ID NO: 404), a partial R11 domain (SEQ ID NO: 412), R12 Domain (SEQ ID NO: 35), R13 Domain (SEQ ID NO: 36), R14 domain (SEQ ID NO: 37), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), a partial H1 domain (SEQ ID NO: 404), a partial R14 Domain (SEQ ID NO: 413), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), a partial H1 domain (SEQ ID NO: 404), a partial R15 Domain (SEQ ID NO: 414), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), a partial H1 domain (SEQ ID NO: 404), a partial R19 Domain (SEQ ID NO: 415), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), a partial R1 domain (SEQ ID NO: 405), a partial R11 domain (SEQ ID NO: 412), R12 Domain (SEQ ID NO: 35), R13 Domain (SEQ ID NO: 36), R14 Domain (SEQ ID NO: 37), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), a partial R1 domain (SEQ ID NO: 405), a partial R14 Domain (SEQ ID NO: 413), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), a partial R1 domain (SEQ ID NO: 405), a partial R15 Domain (SEQ ID NO: 414), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), a partial R1 domain (SEQ ID NO: 405), a partial R19 Domain (SEQ ID NO: 415), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), a partial R2 domain (SEQ ID NO: 406), a partial R11 domain (SEQ ID NO: 412), R12 Domain (SEQ ID NO: 35), R13 Domain (SEQ ID NO: 36), R14 Domain (SEQ ID NO: 37), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), a partial R2 domain (SEQ ID NO: 406), a partial R14 Domain (SEQ ID NO: 413), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), a partial R2 domain (SEQ ID NO: 406), a partial R15 Domain (SEQ ID NO: 414), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), a partial R2 domain (SEQ ID NO: 406), a partial R19 Domain (SEQ ID NO: 415), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), a partial R3 domain (SEQ ID NO: 407), a partial R14 Domain (SEQ ID NO: 413), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), a partial R3 domain (SEQ ID NO: 407), a partial R15 Domain (SEQ ID NO: 414), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), a partial R3 domain (SEQ ID NO: 407), a partial R19 Domain (SEQ ID NO: 415), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), a partial R3 domain (SEQ ID NO: 408), a partial R14 Domain (SEQ ID NO: 413), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), a partial R3 domain (SEQ ID NO: 408), a partial R15 Domain (SEQ ID NO: 414), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), a partial R3 domain (SEQ ID NO: 408), a partial R19 Domain (SEQ ID NO: 415), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), a partial R4 domain (SEQ ID NO: 409), a partial R14 Domain (SEQ ID NO: 413), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), a partial R4 domain (SEQ ID NO: 409), a partial R15 Domain (SEQ ID NO: 414), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), a partial R4 domain (SEQ ID NO: 409), a partial R19 Domain (SEQ ID NO: 415), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), a partial R4 domain (SEQ ID NO: 410), a partial R14 Domain (SEQ ID NO: 413), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), a partial R4 domain (SEQ ID NO: 410), a partial R15 Domain (SEQ ID NO: 414), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), a partial R4 domain (SEQ ID NO: 410), a partial R19 Domain (SEQ ID NO: 415), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), Hl domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), R4 domain (SEQ ID NO:27), a partial R5 domain (SEQ ID NO: 411), a partial R15 Domain (SEQ ID NO: 414), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO:42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 21), Hl domain (
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), R4 domain (SEQ ID NO: 27), a partial R5 domain (SEQ ID NO: 411), a partial R16 Domain (SEQ ID NO: 416), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), R4 domain (SEQ ID NO: 27), a partial R5 domain (SEQ ID NO: 411), a partial R19 Domain (SEQ ID NO: 415), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin proteins described above are depicted in Table 1 above.
  • the truncated dystrophin protein comprises an amino acid sequence selected from the group consisting of SEQ ID NOs: 83-107 and 216-223, or an amino acid sequence at least about 90% identical thereto.
  • the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 86, or an amino acid sequence at least about 90% identical thereto.
  • the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 95, or an amino acid sequence at least about 90% identical thereto.
  • the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 101, or an amino acid sequence at least about 90% identical thereto. In some embodiments, the amino acid sequence of the truncated dystrophin protein is not identical to an amino acid sequence of SEQ ID NO: 143.
  • the truncated dystrophin protein is not a polypeptide of 2361 amino acids. In some embodiments, the truncated dystrophin protein is less than 2361 amino acids in length. In some embodiments, the truncated dystrophin protein is greater than 2361 amino acids in length.
  • the truncated dystrophin protein is functional.
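  • The "at least about 90% identical" language above can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned to equal length (with "-" marking gaps) and that identity is counted over aligned columns; the function names and the toy strings are hypothetical, and the disclosure may rely on a different identity definition or alignment tool.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over two pre-aligned, equal-length sequences ('-' = gap)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    # Drop columns where both sequences carry a gap; they hold no information.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no aligned columns to compare")
    # A column counts as a match only when both residues are equal and not a gap.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 90.0) -> bool:
    """True when the pair is at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy 10-residue strings (illustrative only): one mismatch in ten columns.
print(percent_identity("MLWWEEVEDC", "MLWWEEVEDA"))
```

In practice the alignment step itself (and how gaps are scored) changes the reported value, so the threshold check is only as meaningful as the alignment that feeds it.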
  • the first coding region comprises the sequence selected from the group consisting of SEQ ID NOs: 284, 286, 288, 290, 291, 293, 295 and 297.
  • the second coding region comprises the sequence selected from the group consisting of SEQ ID NOs: 285, 287, 289, 292, 294 and 296.
  • the first coding sequence comprises SEQ ID NO: 286 and the second coding sequence comprises SEQ ID NO:287. In some embodiments, the first coding sequence comprises SEQ ID NO: 288 and the second coding sequence comprises SEQ ID NO:289. In some embodiments, the first coding sequence comprises SEQ ID NO: 290 and the second coding sequence comprises SEQ ID NO:289. In some embodiments, the first coding sequence comprises SEQ ID NO: 291 and the second coding sequence comprises SEQ ID NO:292. In some embodiments, the first coding sequence comprises SEQ ID NO: 293 and the second coding sequence comprises SEQ ID NO:294.
  • the first coding sequence comprises SEQ ID NO: 295 and the second coding sequence comprises SEQ ID NO:296. In some embodiments, the first coding sequence comprises SEQ ID NO: 297 and the second coding sequence comprises SEQ ID NO:292. In some embodiments, the first coding sequence comprises SEQ ID NO: 288 and the second coding sequence comprises SEQ ID NO:289. In some embodiments, the first coding sequence comprises SEQ ID NO: 288 and the second coding sequence comprises SEQ ID NO:287.
  • the AAV vector is selected from the group consisting of an AAV1, AAV2, AAV3, AAV4, AAV5, AAV6, AAV7, AAV8, AAV9, AAV10, AAV11, AAV12, AAV13, AAVrh74, AAV-rh10, AAV-DJ, AAV-LK03, AAV-MYO, AAV-MYO2, AAV-MYO3, MYO3A-AAV, MYO4A-AAV, and MYO4E-AAV.
  • the first AAV vector further comprises a first promoter operably linked to the first nucleic acid molecule.
  • the second AAV vector further comprises a second promoter operably linked to the second nucleic acid molecule.
  • the promoter comprises a tissue specific promoter or a ubiquitous promoter.
  • the promoter comprises a CK8 promoter, an MHCK7 promoter, an SPC5-12 promoter, or a minimal CKM promoter.
  • the promoter comprises a nucleotide sequence selected from the group consisting of SEQ ID NOs: 144-150, or a nucleotide sequence at least 95% identical thereto.
  • the first and/or second AAV vectors further comprise an inverted terminal repeat (ITR) sequence.
  • the ITR sequence comprises a nucleotide sequence of SEQ ID NO: 202 and/or 203, or a nucleotide sequence at least 95% identical thereto.
  • the first and/or second AAV vectors further comprise an intron region.
  • the intron region comprises a nucleotide sequence of SEQ ID NO: 156 or 157, or a nucleotide sequence at least 95% identical thereto.
  • the first and/or second AAV vectors further comprise a polyadenylation sequence.
  • the polyadenylation sequence comprises a nucleotide sequence of SEQ ID NO: 151 or 152, or a nucleotide sequence at least 95% identical thereto.
  • the first and/or second AAV vectors further comprise a Woodchuck Hepatitis Virus Posttranscriptional Regulatory Element (WPRE).
  • the WPRE comprises a nucleotide sequence of SEQ ID NO: 153, 154, or 155, or a nucleotide sequence at least 95% identical thereto.
  • the first and/or second AAV vectors further comprise a Kozak sequence.
  • the first AAV vector comprises, in 5’ to 3’ order, a 5’ ITR sequence, a promoter sequence, a first coding region encoding an N-terminal portion of the truncated dystrophin protein, an intron splice donor sequence, a 3’ ribozyme, and a 3’ ITR sequence.
  • the first AAV vector comprises, in 5’ to 3’ order, a 5’ ITR sequence, a promoter sequence, a first coding region encoding an N-terminal portion of the truncated dystrophin protein, an intron splice donor sequence, a 3’ ribozyme, a polyadenylation sequence, and a 3’ ITR sequence.
  • the first AAV vector comprises, in 5’ to 3’ order, a 5’ ITR sequence, a promoter sequence, an intron region, a first coding region encoding an N-terminal portion of the truncated dystrophin protein, an intron splice donor sequence, a 3’ ribozyme, a polyadenylation sequence, and a 3’ ITR sequence.
  • the first AAV vector comprises a sequence selected from the group consisting of SEQ ID NO: 160, 162, 164, 166, 168, 170, 172, 174, 176, 178, 180, 182, 184,
  • the second AAV vector comprises, in 5’ to 3’ order, a 5’ ITR sequence, a promoter sequence, a 5’ ribozyme, an intron splice acceptor sequence, a second coding region encoding a C-terminal portion of the truncated dystrophin protein, and a 3’ ITR sequence.
  • the second AAV vector comprises, in 5’ to 3’ order, a 5’ ITR sequence, a promoter sequence, a 5’ ribozyme, an intron splice acceptor sequence, a second coding region encoding a C-terminal portion of the truncated dystrophin protein, a polyadenylation sequence, and a 3’ ITR sequence.
  • the second AAV vector comprises, in 5’ to 3’ order, a 5’ ITR sequence, a promoter sequence, a 5’ ribozyme, an intron splice acceptor sequence, a second coding region encoding a C-terminal portion of the truncated dystrophin protein, a WPRE sequence, a polyadenylation sequence, and a 3’ ITR sequence.
  • the second AAV vector comprises a sequence selected from the group consisting of SEQ ID NO:161, 163, 165, 167, 169, 171, 173, 175, 177, 179, 181, 183, 185,
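  • The 5’ to 3’ element orders recited above for the first and second AAV vectors can be written down as ordered tuples and checked programmatically. The sketch below is illustrative only: the element labels are shorthand chosen here, and `follows_order` is a hypothetical helper that tests whether a shorter layout keeps the same relative order as the fullest recited layout.

```python
# Fullest recited layouts, 5' to 3' (labels are shorthand, not sequence names).
FIRST_VECTOR_ORDER = (
    "5' ITR", "promoter", "intron region", "Nt coding region",
    "splice donor", "3' ribozyme", "polyA", "3' ITR",
)
SECOND_VECTOR_ORDER = (
    "5' ITR", "promoter", "5' ribozyme", "splice acceptor",
    "Ct coding region", "WPRE", "polyA", "3' ITR",
)


def follows_order(elements, reference) -> bool:
    """True if `elements` appears in `reference` in the same relative order."""
    remaining = iter(reference)
    # `e in remaining` consumes the iterator up to the match, so every later
    # element must be found further downstream (further 3').
    return all(e in remaining for e in elements)


# The shortest recited first-vector layout omits the intron region and polyA.
minimal_first = ("5' ITR", "promoter", "Nt coding region",
                 "splice donor", "3' ribozyme", "3' ITR")
print(follows_order(minimal_first, FIRST_VECTOR_ORDER))
```

Treating optional elements (intron region, WPRE, polyadenylation sequence) as deletions from one reference order keeps the three recited variants of each vector consistent with a single layout.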
  • the first AAV vector comprises the sequence of SEQ ID NO: 303, and the second AAV vector comprises the sequence of SEQ ID NO: 304.
  • the first AAV vector comprises the sequence of SEQ ID NO: 305, and the second AAV vector comprises the sequence of SEQ ID NO: 306.
  • the first AAV vector comprises the sequence of SEQ ID NO: 307, and the second AAV vector comprises the sequence of SEQ ID NO: 306.
  • the first AAV vector comprises the sequence of SEQ ID NO: 308, and the second AAV vector comprises the sequence of SEQ ID NO: 309.
  • the first AAV vector comprises the sequence of SEQ ID NO: 310, and the second AAV vector comprises the sequence of SEQ ID NO: 311.
  • the first AAV vector comprises the sequence of SEQ ID NO: 312, and the second AAV vector comprises the sequence of SEQ ID NO: 313.
  • the first AAV vector comprises the sequence of SEQ ID NO: 314, and the second AAV vector comprises the sequence of SEQ ID NO: 309.
  • the first AAV vector comprises the sequence of SEQ ID NO: 394, and the second AAV vector comprises the sequence of SEQ ID NO: 309.
  • the first AAV vector comprises the sequence of SEQ ID NO: 395, and the second AAV vector comprises the sequence of SEQ ID NO: 306.
  • the first AAV vector comprises the sequence of SEQ ID NO: 395, and the second AAV vector comprises the sequence of SEQ ID NO: 304.
  • the present invention is directed to a pharmaceutical composition comprising a first nucleic acid molecule and a second nucleic acid molecule of the present disclosure, and a pharmaceutically acceptable excipient.
  • the first nucleic acid molecule and the second nucleic acid molecule are present at a ratio of 1:1, 1:2, 1:3, 1:4, 1:5, 2:1, 3:1, 4:1, or 5:1.
  • the present invention is directed to a pharmaceutical composition comprising a first AAV vector and a second AAV vector of the present disclosure, and a pharmaceutically acceptable excipient.
  • the first AAV vector and the second AAV vector are present at a ratio of 1:1, 1:2, 1:3, 1:4, 1:5, 2:1, 3:1, 4:1, or 5:1.
  • the present invention is directed to a pharmaceutical composition comprising an isolated recombinant dystrophin protein as described below, a nucleic acid encoding such dystrophin protein, a viral genome encoding such dystrophin protein, or a host cell incorporating such a nucleic acid or viral genome.
  • the present invention is directed to a method for treating a dystrophin-associated disorder in a subject in need thereof, comprising administering a therapeutically effective amount of the first nucleic acid molecule and the second nucleic acid molecule of the present disclosure, or a therapeutically effective amount of the first AAV vector and the second AAV vector of the present disclosure, or the pharmaceutical composition of the present disclosure, thereby treating the dystrophin-associated disorder in the subject.
  • the present invention is directed to a method for increasing expression of dystrophin in a subject having or diagnosed with having a dystrophin-associated disorder, comprising administering a therapeutically effective amount of the first nucleic acid molecule and the second nucleic acid molecule of the present disclosure, or a therapeutically effective amount of the first AAV vector and the second AAV vector of the present disclosure, or the pharmaceutical composition of the present disclosure, thereby increasing expression of dystrophin in the subject.
  • the present invention is directed to a method for increasing muscle mass or muscle strength and/or preventing fibrosis in a subject having or diagnosed with having a dystrophin-associated disorder, comprising administering a therapeutically effective amount of the first nucleic acid molecule and the second nucleic acid molecule of the present disclosure, or a therapeutically effective amount of the first AAV vector and the second AAV vector of the present disclosure, or the pharmaceutical composition of the present disclosure, thereby increasing muscle strength and/or preventing fibrosis in the subject.
  • the dystrophin-associated disorder is muscular dystrophy.
  • the dystrophin-associated disorder is Duchenne muscular dystrophy.
  • the first nucleic acid molecule and the second nucleic acid molecule, the first AAV vector and the second AAV vector, or the pharmaceutical composition is administered by intramuscular injection, or intravenous injection.
  • the first AAV vector and the second AAV vector are administered together.
  • the first AAV vector and the second AAV vector are administered separately.
  • the present invention is directed to an isolated recombinant dystrophin protein that has at least 90% sequence identity to a sequence selected from the group consisting of SEQ ID NOs: 83, 85-87, 89-107, and 216-223.
  • the present disclosure provides an isolated nucleic acid molecule encoding a truncated human dystrophin protein that has at least 90% sequence identity to a sequence selected from the group consisting of SEQ ID NOS: 83, 85-87, 89-107, and 216-223.
  • the isolated nucleic acid molecule comprises a nucleotide sequence selected from the group consisting of SEQ ID NOS: 108-132, 224-231, 260-280, and 396-403, or a nucleotide sequence at least 90% identical thereto.
  • the isolated nucleic acid molecule is part of an isolated recombinant vector.
  • the present invention is directed to an isolated recombinant viral genome comprising a nucleic acid molecule encoding a truncated human dystrophin protein comprising an amino acid sequence selected from the group consisting of SEQ ID NOS: 83-107 and 216-223.
  • the isolated nucleic acid molecule comprises a nucleotide sequence selected from the group consisting of SEQ ID NOS: 108-132, 224-231, 260-280, and 396-403, or a nucleotide sequence at least 90% identical thereto.
  • the present invention is directed to a host cell comprising the first nucleic acid molecule and/or the second nucleic acid molecule of the present disclosure, or the first AAV vector and/or the second AAV vector of the present disclosure.
  • the cell is a mammalian cell, an insect cell, or a bacterial cell.
  • the present invention is directed to a method of making a first recombinant adeno-associated virus (rAAV) particle, the method comprising providing a host cell comprising the first nucleic acid molecule of the present disclosure, and incubating the host cell under conditions suitable to encapsulate the first nucleic acid in an AAV capsid protein; thereby making the first rAAV particle.
  • the present invention is directed to a method of making a second recombinant adeno-associated virus (rAAV) particle, the method comprising providing a host cell comprising the second nucleic acid molecule of the present disclosure, and incubating the host cell under conditions suitable to encapsulate the second nucleic acid in an AAV capsid protein; thereby making the second rAAV particle.
  • the cell is a mammalian cell, an insect cell, or a bacterial cell.
  • the present invention is directed to a system for use in the treatment of a dystrophin-associated disorder.
  • the present invention is directed to a first nucleic acid molecule and a second nucleic acid molecule for use in the treatment of a dystrophin-associated disorder.
  • the present invention is directed to a first AAV vector and a second AAV vector for use in the treatment of a dystrophin-associated disorder.
  • the present invention is directed to a pharmaceutical composition for use in the treatment of a dystrophin-associated disorder.
  • the present invention is directed to an isolated nucleic acid molecule for use in the treatment of a dystrophin-associated disorder.
  • FIG. 1 is a schematic depicting the dual-vector system for expressing a dystrophin protein.
  • the system comprises two separate vectors, such as AAV vectors.
  • the first AAV vector comprises a first nucleic acid molecule comprising a coding region for the N-terminal (Nt) portion of a dystrophin protein (e.g., a truncated dystrophin), a splicing donor sequence (SD) and a 3’ ribozyme (3’Rz), under the control of a muscle-specific promoter and flanked by two ITR sequences.
  • the second AAV vector comprises a second nucleic acid molecule comprising a coding region for the C-terminal (Ct) portion of a dystrophin protein (e.g., a truncated dystrophin), a splicing acceptor sequence (SA) and a 5’ ribozyme (5’Rz), under the control of a muscle-specific promoter and flanked by two ITR sequences.
  • Ribozymes are utilized on the 3' end of the Nt vector and 5' end of the Ct vector to create precise RNA termini and scarless trans-ligation, followed by splicing mediated by the splice donor and acceptor sequences, to generate a single RNA molecule containing a single open reading frame encoding the Nt portion and the Ct portion of the dystrophin protein.
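  • The sequence of events described for FIG. 1 (ribozyme self-cleavage, trans-ligation, then splicing) can be pictured with a toy string model. This is a conceptual sketch only: the function name, the placeholder splice donor, splice acceptor, and ribozyme strings, and the short "coding" sequences are all hypothetical, and real RNA processing is not a string operation.

```python
def assemble_transcript(nt_coding: str, ct_coding: str,
                        sd: str = "gtaagt",      # placeholder splice donor
                        sa: str = "ttttcag",     # placeholder splice acceptor
                        rz3: str = "<3Rz>",      # placeholder 3' ribozyme
                        rz5: str = "<5Rz>") -> str:
    """Toy model: two pre-RNAs -> one RNA with a single open reading frame."""
    # Transcripts as produced from the Nt vector and the Ct vector.
    nt_pre = nt_coding + sd + rz3
    ct_pre = rz5 + sa + ct_coding

    # 1. Ribozyme self-cleavage leaves precise termini on each RNA.
    nt_cleaved = nt_pre[:-len(rz3)]
    ct_cleaved = ct_pre[len(rz5):]

    # 2. Trans-ligation joins the 3' end of the Nt RNA to the 5' end of the Ct RNA.
    ligated = nt_cleaved + ct_cleaved

    # 3. Splicing removes the span from the splice donor through the acceptor.
    intron_start = len(nt_coding)
    intron_end = intron_start + len(sd) + len(sa)
    return ligated[:intron_start] + ligated[intron_end:]


print(assemble_transcript("ATGGCC", "GAATAA"))
```

The model makes one point explicit: after the three steps, nothing from the ribozymes or the splice signals remains between the Nt and Ct coding regions, which is what allows a single continuous open reading frame.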
  • FIG. 2 depicts various truncated dystrophin proteins having deletions of different domains or deletions of regions encoded by specific exons.
  • “Y” indicates the inclusion of the particular domain in the truncated dystrophin protein, with domains as described in Table 1.
  • “N” indicates the absence of the particular domain in the truncated dystrophin protein.
  • “P” indicates that the particular domain is partially included in the truncated dystrophin protein. Partially included (“P”, “P1” and “P2”) is used here when the truncated protein includes at least one but not all of the amino acids encoded by that particular domain as described in Table 1.
  • domains are exemplary listings of these sub-domain sequences, and the exact boundaries between each sub-domain can be shifted from the example sequences provided here, which could vary the example categorizations provided here, particularly the “P” (Partially included) category.
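  • The “Y” / “N” / “P” categories used in FIG. 2 reduce to a simple rule on how many residues of a domain are retained. The sketch below is a minimal illustration; `classify_domain` and the residue counts are hypothetical, and, as noted above, shifting the domain boundaries can change which category a given construct falls into.

```python
def classify_domain(residues_included: int, domain_length: int) -> str:
    """Return 'Y' (fully included), 'N' (absent), or 'P' (partially included)."""
    if domain_length <= 0:
        raise ValueError("domain_length must be positive")
    if not 0 <= residues_included <= domain_length:
        raise ValueError("residues_included must be between 0 and domain_length")
    if residues_included == 0:
        return "N"
    if residues_included == domain_length:
        return "Y"
    # At least one, but not all, of the domain's amino acids are present.
    return "P"


# Hypothetical counts for a 110-residue domain.
for kept in (0, 42, 110):
    print(kept, classify_domain(kept, 110))
```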
  • FIG. 3 depicts various truncated dystrophin proteins having deletions of different domains or deletions of regions encoded by specific exons.
  • Y indicates the inclusion of the particular exon in the truncated dystrophin protein, with exons defined in SEQ ID NOs: 315-393.
  • N indicates the absence of the particular exon in the truncated dystrophin protein.
  • P indicates that the particular exon is partially included in the truncated dystrophin protein, as described in Table 1.
  • Exon sequences are based on NM_004006.3.
  • Exons 1 and 79 include untranslated regions (UTRs) which are not required for encoding midi-Dys amino acids, hence the “U” designation.
  • FIG. 4 depicts the results of delivering various truncated midi-Dys expression plasmid combinations to primary human skeletal muscle cells cultured in vitro.
  • FIG. 4 includes Western blot analysis of cell lysates 24-to-48 hours after transfection of DNA expression plasmids, demonstrating that dual plasmid delivery of truncated midi-Dys sequences results in varying levels of truncated midi-Dys protein expression.
  • FIG. 5 depicts the results of utilizing different split sites (SS) to divide truncated midi-Dys into N-terminal (Nt) and C-terminal (Ct) plasmids which are transfected into primary human skeletal muscle cells.
  • FIG. 5 shows Western blot results of cell lysates 24-to-48 hours after transfection of various Nt and Ct combinations to express truncated midi-Dystrophins.
  • FIG. 6 depicts the results of utilizing different codon optimized and CpG augmented midi-Dystrophin sequences to express midi-Dystrophin protein (SEQ ID NO: 259) in primary human skeletal muscle cells.
  • FIG. 6 shows Western blot results of cell lysates 24-to-48 hours after transfection of various codon optimized truncated midi-Dystrophins.
  • FIG. 7 depicts the results of utilizing two different dual AAV vector designs to express midi-Dystrophin proteins in primary human skeletal muscle cells, utilizing transfected AAV cis-plasmids.
  • the present disclosure provides isolated recombinant nucleic acid molecules encoding truncated dystrophin proteins, that can be delivered and expressed in a subject using a dual adeno-associated virus (AAV) vector system, to allow expression of truncated dystrophin proteins that are otherwise too large to fit into a single AAV system.
  • the truncated dystrophin proteins can be used to restore the expression and function of a wild-type dystrophin in a subject in need thereof.
  • the present disclosure also provides systems and methods for expressing or delivering a truncated dystrophin protein in a subject, and methods for treating a subject having a dystrophin-associated disease or disorder, e.g., muscular dystrophy, e.g., Duchenne muscular dystrophy (DMD).
  • Gene therapy is a promising therapeutic approach for the treatment of many diseases, such as genetic disease and/or diseases that can be treated by expression of a therapeutic protein.
  • Recombinant AAV vectors are generally regarded as one of the safest and most effective classes of vectors for gene therapy.
  • one challenge that hampers the development and clinical deployment of therapeutic AAV vectors for gene therapy is their packaging capacity, which is restricted to approximately 4.7 kb of DNA.
  • expressing a full-length dystrophin protein of 3,686 amino acids from a single AAV vector is not possible, because the DMD coding sequence for the dystrophin protein exceeds the packaging limit of AAV genomes by more than 2-fold. Accordingly, attention has focused on creating smaller versions of dystrophin that eliminate non-essential subdomains while maintaining at least some function of the full-length protein.
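  • The "more than 2-fold" statement follows from simple arithmetic on the two figures given above (3,686 amino acids and approximately 4.7 kb). The sketch below works it through; the constant and function names are hypothetical, and the calculation counts only the coding sequence plus a stop codon, ignoring the ITRs, promoter, and other elements that also consume packaging capacity.

```python
AAV_PACKAGING_LIMIT_NT = 4_700        # approximate limit stated in the text
FULL_LENGTH_DYSTROPHIN_AA = 3_686     # length stated in the text


def coding_length_nt(n_amino_acids: int, include_stop: bool = True) -> int:
    """Nucleotides needed to encode a protein (3 per residue, plus a stop codon)."""
    return 3 * (n_amino_acids + (1 if include_stop else 0))


def fold_over_limit(n_amino_acids: int,
                    limit_nt: int = AAV_PACKAGING_LIMIT_NT) -> float:
    """Ratio of coding length to the packaging limit."""
    return coding_length_nt(n_amino_acids) / limit_nt


cds_nt = coding_length_nt(FULL_LENGTH_DYSTROPHIN_AA)
print(cds_nt, round(fold_over_limit(FULL_LENGTH_DYSTROPHIN_AA), 2))
```

Because regulatory elements reduce the space left for coding sequence, the real shortfall for a single vector is larger than this coding-only ratio suggests.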
  • the present disclosure is based, at least in part, on the development of truncated dystrophin proteins that can be delivered using a dual AAV vector system to a subject for expression.
  • AAV vectors described herein can be used to administer and/or deliver a functional truncated dystrophin protein in order to achieve sustained and high concentrations and/or more consistent levels of the dystrophin protein.
  • the compositions and methods described herein can be used in the treatment of disorders associated with a lack of a dystrophin protein and/or activity, such as muscular dystrophy, e.g., Duchenne muscular dystrophy (DMD).
  • an element means one element or more than one element, e.g., a plurality of elements.
  • Adeno-associated virus As used herein, the term “adeno-associated virus” or “AAV” refers to members of the Dependoparvovirus genus or a variant, e.g., a functional variant, thereof. In some embodiments, the AAV is wild-type, or naturally occurring. In some embodiments, the AAV is recombinant.
  • an “AAV particle” refers to a particle or a virion comprising an AAV capsid, e.g., an AAV capsid variant, and a polynucleotide, e.g., a viral genome.
  • the viral genome of the AAV particle comprises at least one payload region encoding a protein of interest, e.g., a truncated dystrophin protein or portion thereof, and at least one ITR.
  • the AAV particle is capable of delivering a nucleic acid, e.g., a payload region, encoding a payload to cells, typically, mammalian, e.g., human, cells.
  • an AAV particle of the present disclosure may be produced recombinantly.
  • an AAV particle may be derived from any serotype, described herein or known in the art, including combinations of serotypes (e.g., “pseudotyped” AAV) or from various genomes (e.g., single stranded or self-complementary).
  • the AAV particle may be replication defective, targeted to a specific tissue or subset of tissues, and/or detargeted from a specific tissue or subset of tissues.
  • the AAV particle may comprise a peptide, e.g., targeting peptide, present, e.g., inserted into, the capsid to enhance tropism for a desired target tissue. It is to be understood that reference to the AAV particle of the disclosure also includes pharmaceutical compositions thereof, even if not explicitly recited.
  • AAV vector As used herein, the term “AAV vector” or “AAV construct” refers to a vector derived from an adeno-associated virus serotype.
  • AAV vector refers to a vector that includes AAV nucleotide sequences as well as heterologous nucleotide sequences.
  • AAV vectors require only the 145 base terminal repeats in cis to generate virus. All other viral sequences are dispensable and may be supplied in trans (Muzyczka (1992) Curr. Topics Microbiol. Immunol. 158:97-129).
  • the recombinant AAV vector genome will only retain the inverted terminal repeat (ITR) sequences so as to maximize the size of the transgene that can be efficiently packaged by the vector.
  • ITRs need not be the wild-type nucleotide sequences, and may be altered, e.g., by the insertion, deletion or substitution of nucleotides, as long as the sequences provide for functional rescue, replication and packaging.
  • Administering includes dispensing, delivering or applying a composition of the disclosure to a subject by any suitable route for delivery of the composition to the desired location in the subject. Alternatively, or in combination, delivery is by the topical, parenteral or oral route, intracerebral injection, intramuscular injection, subcutaneous/intradermal injection, intravenous injection, buccal administration, transdermal delivery and administration by the rectal, colonic, vaginal, intranasal or respiratory tract route.
  • capsid refers to the exterior, e.g., a protein shell, of a virus particle, e.g., an AAV particle, that is substantially (e.g., >50%, >60%, >70%, >80%, >90%, >95%, >99%, or 100%) protein.
  • the capsid is an AAV capsid comprising an AAV capsid protein described herein, e.g., a VP1, VP2, and/or VP3 polypeptide.
  • the AAV capsid protein can be a wild-type AAV capsid protein or a variant, e.g., a structural and/or functional variant from a wild-type or a reference capsid protein, referred to herein as an “AAV capsid variant.”
  • the AAV capsid variant described herein has the ability to enclose, e.g., encapsulate, a viral genome and/or is capable of entry into a cell, e.g., a mammalian cell.
  • Codon optimization refers to a process of changing codons of a given gene in such a manner that the polypeptide sequence encoded by the gene remains the same while the changed codons improve the process of expression of the polypeptide sequence.
  • codon optimization is performed on the DNA sequence to change the human codons to codons that are more effective for expression in E. coli.
  • Human protein-coding nucleic acid sequences delivered by vectors such as AAV can be further codon optimized for more effective expression in human cells and/or specific human cell or tissue types.
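  • Codon optimization as defined above (changing codons while leaving the encoded polypeptide unchanged) can be sketched in a few lines. The example below builds the standard genetic code and swaps codons for synonymous ones taken from a small preference table; `PREFERRED_CODON` is a hypothetical, illustrative table rather than measured codon usage data, and the input sequence is a toy example, not a sequence from this disclosure.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ('*' = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Hypothetical preference table: amino acid -> codon to use (illustrative only).
PREFERRED_CODON = {"L": "CTG", "E": "GAG", "V": "GTG"}


def translate(dna: str) -> str:
    """Translate an in-frame DNA coding sequence into one-letter amino acids."""
    dna = dna.upper()
    if len(dna) % 3:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def codon_optimize(dna: str, preferred=PREFERRED_CODON) -> str:
    """Replace codons with preferred synonymous codons; the protein is unchanged."""
    dna = dna.upper()
    if len(dna) % 3:
        raise ValueError("sequence length must be a multiple of 3")
    out = []
    for i in range(0, len(dna), 3):
        codon = dna[i:i + 3]
        aa = CODON_TABLE[codon]
        new_codon = preferred.get(aa, codon)
        if CODON_TABLE[new_codon] != aa:
            raise ValueError(f"preferred codon {new_codon} is not synonymous for {aa}")
        out.append(new_codon)
    optimized = "".join(out)
    # Invariant from the definition: the encoded polypeptide stays the same.
    assert translate(optimized) == translate(dna)
    return optimized


print(codon_optimize("ATGCTTTGGTGGGAAGAAGTA"))
```

The final assertion inside `codon_optimize` encodes the defining constraint directly, so any preference table that would alter the protein is rejected rather than silently applied.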
  • Contacting As used herein, the term “contacting” (i.e., contacting a cell with an agent) is intended to include incubating the agent and the cell together in vitro (e.g., adding the agent to cells in culture) or administering the agent to a subject such that the agent and cells of the subject are contacted in vivo.
  • the term “contacting” is not intended to include exposure of cells to an agent that may occur naturally in a subject (i.e., exposure that may occur as a result of a natural physiological process).
  • CpG motif As used herein, a “CpG motif” is a pattern of bases that includes a central CpG (“p” refers to the phosphodiester link between consecutive C and G nucleotides) surrounded by at least one base flanking (on the 3' and the 5' side of) the central CpG.
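  • The CpG motif definition above (a central CG with at least one flanking base on each side) translates directly into a position scan. The sketch below is illustrative: `find_cpg_motifs` is a hypothetical helper, positions are 0-based indices of the central C, and the input strings are toy examples.

```python
def find_cpg_motifs(seq: str) -> list[int]:
    """Return 0-based positions of each central C in a flanked CpG."""
    seq = seq.upper()
    # Start at index 1 (needs a 5' flanking base) and stop so that the G at
    # i + 1 is followed by at least one 3' flanking base.
    return [i for i in range(1, len(seq) - 2) if seq[i:i + 2] == "CG"]


# The CGs at the very start and very end lack a flanking base on one side.
print(find_cpg_motifs("CGACGTCG"))
```

Under this reading of the definition, a CG dinucleotide at either terminus of a sequence is not counted, because it has a flanking base on only one side.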
  • Dystrophin refers to a sarcolemmal protein associated with the dystrophin-associated protein complex (DAPC) (Hoffman et al., Cell
  • the DAPC is composed of multiple proteins at the muscle sarcolemma that form a structural link between the extracellular matrix (ECM) and the cytoskeleton via dystrophin, an actin binding protein, and alpha-dystroglycan, a laminin-binding protein. These structural links act to stabilize the muscle cell membrane during contraction and protect against contraction-induced damage. With dystrophin loss, membrane fragility results in sarcolemmal tears and an influx of calcium, triggering calcium-activated proteases and segmental fiber necrosis (Straub et al., Curr Opin. Neurol. 10(2): 168-75, 1997).
  • the dystrophin (DMD) gene is 2.2 megabases at locus Xp21 and has 79 exons encoding a protein of over 3500 amino acids. Normal skeletal muscle tissue contains only small amounts of dystrophin, but its absence or abnormal expression leads to the development of severe and incurable symptoms. Some mutations in the dystrophin gene lead to the production of defective dystrophin and a severe dystrophic phenotype in affected patients. Some mutations in the dystrophin gene lead to partially-functional dystrophin protein and a much milder dystrophic phenotype in affected patients.
  • the term “functional” or “functional protein” refers to a truncated protein that is capable of, partially or completely, restoring a function of an endogenously expressed full-length protein.
  • a functional truncated dystrophin protein is capable of, partially or completely, restoring a function, e.g., supporting a link between the extracellular matrix and the cytoskeleton, of a full-length dystrophin protein in vitro or in vivo, e.g., in a disease model in an animal, or in a human.
  • Dystrophin-associated disorder refers to diseases or disorders having a deficiency in the DMD gene, such as a heritable, e.g., X-linked, mutation in DMD resulting in deficient or defective dystrophin protein expression in patient cells.
  • Dystrophin-associated disorders include, but are not limited to muscular dystrophies, e.g., Duchenne muscular dystrophy (DMD), or Becker muscular dystrophy (BMD).
  • Isolated refers to a substance or entity that is altered or removed from the natural state, e.g., altered or removed from at least some of the components with which it is associated in the natural state.
  • a nucleic acid or a peptide naturally present in a living animal is not “isolated,” but the same nucleic acid or peptide partially or completely separated from the coexisting materials of its natural state is “isolated.”
  • An isolated nucleic acid or protein can exist in substantially purified form, or can exist in a non-native environment such as, for example, a host cell.
  • Muscle cell/tissue refers to a cell or group of cells derived from muscle of any kind, for example, skeletal muscle and smooth muscle, e.g., from the digestive tract, urinary bladder, blood vessels or cardiac tissue.
  • Muscular dystrophy As used herein, the term “muscular dystrophy” or “muscular dystrophies” refers to a group of hereditary muscle diseases that weaken skeletal muscles. Muscular dystrophies are characterized by a genetic defect resulting in muscle weakness or loss of muscle tissue which progressively increases over time.
  • Muscular dystrophies include, but are not limited to, Duchenne muscular dystrophy, Becker muscular dystrophy, myotonic dystrophy, congenital muscular dystrophy, distal muscular dystrophy, Emery-Dreifuss muscular dystrophy, facioscapulohumeral muscular dystrophy, limb girdle muscular dystrophy, and oculopharyngeal muscular dystrophy.
  • Becker muscular dystrophy As used herein, the term “Becker muscular dystrophy (BMD)” refers to an X-linked muscle disease caused by in-frame mutations of the dystrophin gene. These BMD-causing mutations result in the production of a truncated isoform of dystrophin protein that is partially functional and expressed at reduced amounts. The reduced levels of a truncated dystrophin protein lead to progressive skeletal and cardiac muscle dysfunction. BMD presents with reduced severity compared with Duchenne muscular dystrophy (DMD).
  • mutations refers to a change and/or alteration.
  • mutations may be changes and/or alterations to proteins (including peptides and polypeptides) and/or nucleic acids (including polynucleic acids).
  • mutations comprise changes and/or alterations to a protein and/or nucleic acid sequence. Such changes and/or alterations may comprise the addition, substitution and/or deletion of one or more amino acids (in the case of proteins and/or peptides) and/or nucleotides (in the case of nucleic acids and/or polynucleic acids).
  • mutations comprise the addition and/or substitution of amino acids and/or nucleotides
  • such additions and/or substitutions may comprise 1 or more amino acid and/or nucleotide residues and may include modified amino acids and/or nucleotides.
  • One or more mutations may result in a “mutant,” “derivative,” or “variant,” e.g., of a nucleic acid sequence or polypeptide or protein sequence.
  • Naturally occurring means existing in nature without artificial aid, or involvement of the hand of man. “Naturally occurring” or “wild-type” may refer to a native form of a biomolecule, sequence, or entity.
  • nucleic acid As used herein, the terms “nucleic acid,” “polynucleotide,” and “oligonucleotide” refer to any nucleic acid polymers composed of either polydeoxyribonucleotides (containing 2-deoxy-D-ribose), or polyribonucleotides (containing D-ribose), or any other type of polynucleotide that is an N-glycoside of a purine or pyrimidine base, or modified purine or pyrimidine bases. There is no intended distinction in length between the terms “nucleic acid,” “polynucleotide,” and “oligonucleotide,” and these terms will be used interchangeably.
  • Operably linked refers to a functional connection between two or more molecules, constructs, transcripts, entities, moieties or the like.
  • Particle is a virus comprising at least two components, a protein capsid and a polynucleotide sequence enclosed within the capsid.
  • Payload As used herein, “payload” or “payload region” or “transgene” refers to one or more polynucleotides or polynucleotide regions encoded by or within a viral genome or an expression product of such polynucleotide or polynucleotide region, e.g., a transgene, a polynucleotide encoding a polypeptide.
  • Peptide refers to a chain of amino acids that is less than or equal to about 50 amino acids long, e.g., about 5, 10, 15, 20, 25, 30, 35, 40, 45, or 50 amino acids long.
  • Pharmaceutically acceptable refers to those compositions and/or dosage forms which are, within the scope of sound medical judgment, suitable for use in contact with the tissues of human beings and animals without excessive toxicity, irritation, allergic response, or other problem or complication, commensurate with a reasonable benefit/risk ratio.
  • Pharmaceutically acceptable excipient As used herein, the term “pharmaceutically acceptable excipient” refers to any ingredient other than active agents (e.g., as described herein) present in pharmaceutical compositions and having the properties of being substantially nontoxic and non-inflammatory in subjects. In some embodiments, pharmaceutically acceptable excipients are vehicles capable of suspending and/or dissolving active agents.
  • Excipients may include, for example: antiadherents, antioxidants, binders, coatings, compression aids, disintegrants, dyes (colors), emollients, emulsifiers, fillers (diluents), film formers or coatings, flavors, fragrances, glidants (flow enhancers), lubricants, preservatives, printing inks, sorbents, suspending or dispersing agents, sweeteners, and waters of hydration.
  • Excipients include, but are not limited to: butylated hydroxytoluene (BHT), calcium carbonate, calcium phosphate (dibasic), calcium stearate, croscarmellose, cross-linked polyvinyl pyrrolidone, citric acid, crospovidone, cysteine, ethylcellulose, gelatin, hydroxypropyl cellulose, hydroxypropyl methylcellulose, lactose, magnesium stearate, maltitol, mannitol, methionine, methylcellulose, methyl paraben, microcrystalline cellulose, polyethylene glycol, polyvinyl pyrrolidone, povidone, pregelatinized starch, propyl paraben, retinyl palmitate, shellac, silicon dioxide, sodium carboxymethyl cellulose, sodium citrate, sodium starch glycolate, sorbitol, starch (corn), stearic acid, sucrose, talc, titanium dioxide, vitamin A, vitamin E,
  • Polypeptide refers to an organic polymer consisting of a large number of amino-acid residues bonded together in a chain.
  • a monomeric protein molecule is a polypeptide.
  • the term “preventing” refers to partially or completely delaying onset of an infection, disease, disorder and/or condition; partially or completely delaying onset of one or more symptoms, features, or clinical manifestations of a particular infection, disease, disorder, and/or condition; partially or completely delaying progression from an infection, a particular disease, disorder and/or condition; and/or decreasing the risk of developing pathology associated with the infection, the disease, disorder, and/or condition.
  • Promoter refers to a nucleic acid site to which a polymerase enzyme will bind to initiate transcription (DNA to RNA) or reverse transcription (RNA to DNA).
  • Recombinant nucleic acid molecule refers to a nucleic acid molecule or a polynucleotide having sequences that are not naturally joined together.
  • An amplified or assembled recombinant polynucleotide may be included in a suitable vector, and the vector can be used to transform a suitable host cell.
  • a recombinant polynucleotide may serve a non-coding function (e.g., promoter, origin of replication, ribosome-binding site, etc.) as well.
  • regulatory sequence is intended to include promoters, enhancers and other expression control elements (e.g., polyadenylation signals). Such regulatory sequences are described, for example, in Goeddel; Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. (1990). Regulatory sequences include those which direct constitutive expression of a nucleotide sequence in many types of host cells, those which are constitutively active, those which are inducible, and those which direct expression of the nucleotide sequence only in certain host cells (e.g., tissue-specific regulatory sequences).
  • the expression vectors of the disclosure can be introduced into host cells to thereby produce proteins or portions thereof, including fusion proteins or portions thereof, encoded by nucleic acids as described herein.
  • Ribozyme refers to an RNA molecule capable of acting as an enzyme.
  • some ribozymes are capable of cleaving RNA molecules.
  • RNA cleaving ribozymes typically consist at least of a catalytic domain and a recognition sequence that is recognized by the catalytic domain.
  • the catalytic domain can be a part of the same RNA molecule as the recognition sequence, and thus mediate cis-cleavage.
  • the catalytic domain can be a separate RNA molecule from the RNA molecule comprising the recognition sequence, and thus mediate trans-cleavage.
  • sequence identity refers to the overall relatedness between polymeric molecules, e.g., between nucleic acid molecules (e.g. DNA molecules and/or RNA molecules) and/or between polypeptide molecules. Sequence identity of polymeric molecules to one another can be calculated as the percentage of nucleotides or amino acid residues in a candidate sequence that are identical to the nucleotides or amino acid residues in a given polymeric molecule, after aligning the sequences and introducing gaps, if necessary, to achieve the maximum percent sequence identity, and not considering any conservative substitutions as part of the sequence identity.
  • Alignment for purposes of determining percent sequence identity can be achieved in various ways that are within the skill in the art, for instance, using publicly available computer software such as BLAST, BLAST-2, or ALIGN. Those skilled in the art can determine appropriate parameters for measuring alignment, including any algorithms needed to achieve maximal alignment over the full length of the sequences being compared.
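The percent identity calculation described above can be illustrated with a minimal Python sketch. It assumes the two sequences have already been aligned (for example by one of the tools named above) and are supplied as equal-length strings with "-" marking gaps. The function name `percent_identity` and the choice of the number of aligned columns as the denominator are illustrative assumptions; other denominators are in common use and give different values.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two pre-aligned, equal-length sequences.

    Only exact residue matches count; conservative substitutions are
    treated as mismatches. The denominator (aligned columns) is an
    assumption made for this sketch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Ignore columns in which both sequences carry a gap.
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b)
        if not (a == gap and b == gap)
    ]
    if not columns:
        raise ValueError("no residues to compare")
    # A column is a match only if both residues are identical and not a gap.
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    # Toy alignment: 4 identical columns out of 6.
    print(round(percent_identity("ACGT-A", "ACCTTA"), 2))
```

The alignment step itself is deliberately left out; in practice the gapped strings would come from an alignment program rather than being typed by hand.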
  • Similarity refers to the overall relatedness between polymeric molecules, e.g. between polynucleotide molecules (e.g. DNA molecules and/or RNA molecules) and/or between polypeptide molecules. Calculation of percent similarity of polymeric molecules to one another can be performed in the same manner as a calculation of percent identity, except that calculation of percent similarity takes into account conservative substitutions as are understood in the art.
  • split Site refers to the location in the nucleic acid sequence where a truncated midi-Dystrophin transgenic sequence is divided into two portions, which will then be delivered to cells by an N-terminal (Nt) vector (or plasmid) and a C-terminal (Ct) vector or plasmid.
  • Nt N-terminal vector
  • Ct C-terminal vector
  • split Sites must be selected carefully based on their sequence properties and can be empirically optimized for enhanced ability to join the Nt and Ct sequences for midi-Dys protein expression.
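As a purely illustrative sketch of the split site concept, the Python below divides a toy nucleotide string into an Nt portion and a Ct portion at a chosen position and checks that concatenating the two portions restores the original sequence. The function names, the toy sequence, and the use of a zero-based index are assumptions for illustration; the sketch does not model how the two portions are joined in cells, nor the sequence properties used to choose a split site.

```python
from __future__ import annotations


def split_at_site(transgene: str, split_site: int) -> tuple[str, str]:
    """Divide a transgene sequence into (Nt portion, Ct portion).

    split_site is a zero-based index (an assumption of this sketch);
    the Nt portion is everything before it, the Ct portion the rest.
    """
    if not 0 < split_site < len(transgene):
        raise ValueError("split site must fall strictly inside the sequence")
    return transgene[:split_site], transgene[split_site:]


def rejoin(nt_portion: str, ct_portion: str) -> str:
    """Concatenate the two portions back into one sequence."""
    return nt_portion + ct_portion


if __name__ == "__main__":
    toy_transgene = "ATGGCCAAGTTT"  # hypothetical sequence, not from the disclosure
    nt, ct = split_at_site(toy_transgene, 6)
    print(nt, ct, rejoin(nt, ct) == toy_transgene)
```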
  • Subject refers to any organism to which a composition in accordance with the disclosure may be administered, e.g., for experimental, diagnostic, prophylactic, and/or therapeutic purposes.
  • subject or patient refers to an organism who may seek, who may require, who is receiving, or who will receive treatment or who is under care by a trained professional for a particular disease or condition.
  • Typical subjects include animals (e.g., mammals such as mice, rats, rabbits, dogs, non-human primates, and humans).
  • the subject is a mammal, e.g., a primate, e.g., a human.
  • the subject is a human.
  • the subject is a child. In other embodiments, the subject is an adult.
  • a subject or patient may be susceptible to or suspected of having a dystrophin-associated disorder, e.g., muscular dystrophy, e.g., DMD. In certain embodiments, a subject or patient may be diagnosed with a dystrophin-associated disorder, e.g., muscular dystrophy, e.g., DMD.
  • the term “substantially” refers to the qualitative condition of exhibiting total or near-total extent or degree of a characteristic or property of interest.
  • One of ordinary skill in the biological arts will understand that biological and chemical phenomena rarely, if ever, go to completion and/or proceed to completeness or achieve or avoid an absolute result.
  • the term “substantially” is therefore used herein to capture the potential lack of completeness inherent in many biological and chemical phenomena.
  • therapeutic agent refers to any agent that, when administered to a subject has a therapeutic, diagnostic, and/or prophylactic effect and/or elicits a desired biological and/or pharmacological effect.
  • therapeutically effective amount means an amount of an agent to be delivered (e.g., nucleic acid, drug, therapeutic agent, diagnostic agent, prophylactic agent, etc.) that is sufficient, when administered to a subject suffering from or susceptible to an infection, disease, disorder, and/or condition, to treat, improve symptoms of, diagnose, prevent, and/or delay the onset of the infection, disease, disorder, and/or condition.
  • a therapeutically effective amount is provided in a single dose.
  • a therapeutically effective amount is administered in a dosage regimen comprising a plurality of doses.
  • a unit dosage form may be considered to comprise a therapeutically effective amount of a particular agent or entity if it comprises an amount that is effective when administered as part of such a dosage regimen.
  • Treating refers to partially or completely alleviating, ameliorating, improving, relieving, reversing, delaying onset of, inhibiting progression of, reducing severity of, and/or reducing incidence of one or more symptoms or features of a particular infection, disease, disorder, and/or condition. Treatment may be administered to a subject who does not exhibit signs of a disease, disorder, and/or condition and/or to a subject who exhibits only early signs of a disease, disorder, and/or condition for the purpose of decreasing the risk of developing pathology associated with the disease, disorder, and/or condition.
  • a “truncated dystrophin” is any protein shorter than the full length 427 kDa Dystrophin protein isoform.
  • a truncated dystrophin refers to any transgenic Dystrophin protein delivered by either a single AAV vector or reconstituted by a dual vector system, since in both cases, these delivery technologies are incapable of delivering the full-length 427 kDa dystrophin protein.
  • a truncated dystrophin protein as described herein, is functional, i.e., the truncated dystrophin protein is capable of restoring, partially or completely, a function of a full-length dystrophin protein in vitro (e.g., supporting a link between the extracellular matrix and the cytoskeleton) or in vivo, e.g., in a disease model in an animal, or in a human.
  • Vector is any molecule or moiety which transports, transduces or otherwise acts as a carrier of a heterologous molecule.
  • Vectors of the present disclosure may be produced recombinantly and may be based on and/or may comprise adeno-associated virus (AAV) parent or reference sequence(s). Such parent or reference AAV sequences may serve as an original, second, third or subsequent sequence for engineering vectors.
  • AAV adeno-associated virus
  • such parent or reference AAV sequences may comprise any one or more of the following sequences: a polynucleotide sequence encoding a polypeptide or multi-polypeptide, having a sequence that may be wild-type or modified from wild-type and which sequence may encode full-length or partial sequence of a protein, protein domain, or one or more subunits of dystrophin protein and variants thereof; a polynucleotide encoding dystrophin protein and variants thereof, having a sequence that may be wild-type or modified from wild-type; and a transgene encoding dystrophin protein and variants thereof that may or may not be modified from wild-type sequence.
  • Viral genome As used herein, a “viral genome” or “vector genome” is a polynucleotide comprising at least one inverted terminal repeat (ITR) and at least one encoded payload. A viral genome encodes at least one copy of the payload.
  • ITR inverted terminal repeat
  • Wild-type is a native form of a biomolecule, sequence, or entity.
  • wild type is a term of the art understood by skilled persons and means the typical form of an organism, strain, gene or characteristic as it occurs in nature as distinguished from mutant or variant forms.
  • Dystrophin is a cytoplasmic protein encoded by the DMD gene, which is a part of a protein complex that connects the cytoskeleton of a muscle fiber to the surrounding extracellular matrix through the cell membrane.
  • the dystrophin protein, located primarily in skeletal and cardiac muscles with smaller amounts expressed in the brain, acts as a shock absorber during muscle fiber contraction by linking the actin of the contractile apparatus to the layer of connective tissue that surrounds each muscle fiber.
  • dystrophin is localized at the cytoplasmic face of the sarcolemma membrane.
  • the full-length dystrophin muscle isoform (Dp427m) is a large (427 kDa) protein comprising a number of subdomains that contribute to its function. These subdomains include, in order from the amino-terminus toward the carboxy-terminus, the N-terminal actin-binding domain (ACBD), a central so-called “rod” domain, a cysteine-rich (CR) domain and lastly a carboxy-terminal (CT) domain.
  • ACBD N-terminal actin-binding domain
  • CR cysteine-rich domain
  • CT carboxy-terminal
  • the rod domain is comprised of 4 proline-rich hinge domains (abbreviated H) and 24 spectrin-like repeats (abbreviated R) in the following order: a first hinge domain (H1), 3 spectrin-like repeats (R1, R2, R3), a second hinge domain (H2), 16 more spectrin-like repeats (R4, R5, R6, R7, R8, R9, R10, R11, R12, R13, R14, R15, R16, R17, R18, R19), a third hinge domain (H3), 5 more spectrin-like repeats (R20, R21, R22, R23, R24), and finally a fourth hinge domain (H4).
  • H proline-rich hinge domains
  • R spectrin-like repeats
  • DGC dystrophin-associated glycoprotein complex
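The domain order described above can be written out as a simple ordered list, which makes the stated counts (4 hinge domains and 24 spectrin-like repeats) easy to check. The following Python sketch is illustrative only; the variable names are assumptions, and the labels are plain strings rather than sequences.

```python
# Rod domain in N- to C-terminal order, as described in the text:
# H1, R1-R3, H2, R4-R19, H3, R20-R24, H4.
ROD_DOMAIN = (
    ["H1"]
    + [f"R{i}" for i in range(1, 4)]     # R1, R2, R3
    + ["H2"]
    + [f"R{i}" for i in range(4, 20)]    # R4 through R19
    + ["H3"]
    + [f"R{i}" for i in range(20, 25)]   # R20 through R24
    + ["H4"]
)

# Full-length protein: actin-binding domain, rod domain, CR domain, CT domain.
FULL_LENGTH = ["ACBD"] + ROD_DOMAIN + ["CR", "CT"]

hinges = [d for d in ROD_DOMAIN if d.startswith("H")]
repeats = [d for d in ROD_DOMAIN if d.startswith("R")]

if __name__ == "__main__":
    print(len(hinges), "hinges;", len(repeats), "repeats")
    print(" - ".join(FULL_LENGTH))
```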
  • the DMD gene is one of the largest known human genes at approximately 2.2 Mb.
  • the gene is located on the X chromosome at position Xp21 and contains 79 exons.
  • the most common mutations that cause Duchenne muscular dystrophy (DMD) or Becker muscular dystrophy (BMD) are large deletion mutations of one or more exons (60-70%), but duplication mutations (5-10%) and single nucleotide variants (including small deletions or insertions, single-base changes, and splice site changes, accounting for approximately 25%-35% of pathogenic variants in males with DMD and about 10%-20% of males with BMD) can also cause pathogenic dystrophin variants.
  • mutations often lead to a frame shift resulting in a premature stop codon and a truncated, non-functional or unstable protein.
  • Nonsense point mutations can also result in premature termination codons with the same result.
  • the BMD genotype is similar to DMD in that deletions are present in the dystrophin gene. However, these deletions leave the reading frame intact. Thus an internally truncated but partially functional dystrophin protein is created.
  • changing a DMD genotype to a BMD genotype is a common strategy to correct dystrophin. There are many strategies to correct dystrophin, many of which rely on restoring the reading frame of the endogenous dystrophin. This shifts the disease genotype from DMD to BMD.
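The difference between a frame-shifting deletion and a deletion that leaves the reading frame intact can be shown on a toy coding sequence. The Python sketch below is a simplified illustration under stated assumptions: the sequence is hypothetical, and the only rule modeled is that a deletion whose length is a multiple of three nucleotides keeps the downstream codons in register. It does not model exon boundaries, splicing, or protein stability.

```python
def codons(seq: str) -> list:
    """Split a sequence into complete codons, dropping any trailing partial codon."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def delete(seq: str, start: int, length: int) -> str:
    """Remove `length` nucleotides beginning at zero-based position `start`."""
    return seq[:start] + seq[start + length:]


def preserves_reading_frame(deleted_nucleotides: int) -> bool:
    """A deletion keeps the frame when its length is a multiple of three."""
    if deleted_nucleotides <= 0:
        raise ValueError("deletion length must be positive")
    return deleted_nucleotides % 3 == 0


if __name__ == "__main__":
    toy = "ATGGCTGAAAAGCTG"  # hypothetical sequence: ATG GCT GAA AAG CTG
    print(codons(delete(toy, 3, 3)))  # in-frame: downstream codons unchanged
    print(codons(delete(toy, 3, 1)))  # frame shift: downstream codons altered
```

In the in-frame case one codon is lost and the remaining codons read as before, which corresponds to the internally truncated protein described for BMD; in the frame-shift case every codon after the deletion changes.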
  • the present disclosure provides truncated dystrophin gene sequences and expression vectors containing the same.
  • Such genes and expression vectors are useful in gene therapy to prevent or treat dystrophin-associated disorders, e.g., muscular dystrophies, e.g., Duchenne muscular dystrophy (DMD), in subjects in need thereof.
  • Expression of functional truncated proteins in transduced muscle cells is able to replicate and replace at least some of the function normally attributable to full-length dystrophin, such as supporting a mechanically strong link between the extracellular matrix and the cytoskeleton.
  • the present disclosure provides an isolated, recombinant nucleic acid molecule encoding a truncated human dystrophin protein, wherein the truncated dystrophin protein comprises an ABCD domain (SEQ ID NO: 21), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein further comprises an H1 Domain (SEQ ID NO: 22).
  • the truncated dystrophin protein comprises an amino acid sequence selected from the group consisting of:
  • a. midi-Dys ΔR1-R15 (SEQ ID NO: 83),
  • b. midi-Dys ΔR2-R15 (SEQ ID NO: 84),
  • c. midi-Dys ΔR3-R15 (SEQ ID NO: 85),
  • d. midi-Dys ΔH2-R15 (SEQ ID NO: 86),
  • e. midi-Dys ΔR4-R15 (SEQ ID NO: 87),
  • f. midi-Dys ΔR5-R15 (SEQ ID NO: 88),
  • g. midi-Dys Δ exon 13-33 (SEQ ID NO: 93),
  • h. midi-Dys Δ exon 13-39 (SEQ ID NO: 94),
  • i. midi-Dys Δ exon 13-41 (SEQ ID NO: 95),
  • j. midi-Dys Δ exon 13-48 (SEQ ID NO: 96),
  • k. midi-Dys Δ exon 15-39 (SEQ ID NO: 97),
  • l. midi-Dys Δ exon 15-41 (SEQ ID NO: 98),
  • m. midi-Dys Δ exon 15-48 (SEQ ID NO: 99),
  • n. midi-Dys Δ exon 17-39 (SEQ ID NO: 100),
  • o. midi-Dys Δ exon 17-41 (SEQ ID NO: 101),
  • p. midi-Dys Δ exon 17-48 (SEQ ID NO: 102),
  • q. midi-Dys Δ exon 18-39 (SEQ ID NO: 220),
  • r. midi-Dys Δ exon 18-41 (SEQ ID NO: 221),
  • s. midi-Dys Δ exon 18-48 (SEQ ID NO: 222),
  • t. midi-Dys Δ exon 19-39 (SEQ ID NO: 103),
  • u. midi-Dys Δ exon 19-41 (SEQ ID NO: 104),
  • v. midi-Dys Δ exon 19-48 (SEQ ID NO: 105),
  • w. midi-Dys Δ exon 21-41 (SEQ ID NO: 106),
  • x. midi-Dys Δ exon 21-42 (SEQ ID NO: 223), and
  • y. midi-Dys Δ exon 21-48 (SEQ ID NO: 107).
  • the truncated dystrophin protein further comprises an R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), and R19 Domain (SEQ ID NO:42).
  • the truncated dystrophin protein comprises an amino acid sequence selected from the group consisting of:
  • a. midi-Dys ΔR1-R15 (SEQ ID NO: 83),
  • b. midi-Dys ΔR2-R15 (SEQ ID NO: 84),
  • c. midi-Dys ΔR3-R15 (SEQ ID NO: 85),
  • d. midi-Dys ΔH2-R15 (SEQ ID NO: 86),
  • e. midi-Dys ΔR4-R15 (SEQ ID NO: 87),
  • f. midi-Dys ΔR5-R15 (SEQ ID NO: 88),
  • g. midi-Dys Δ exon 10-33 (SEQ ID NO: 89),
  • h. midi-Dys Δ exon 10-39 (SEQ ID NO: 90),
  • i. midi-Dys Δ exon 10-41 (SEQ ID NO: 91),
  • j. midi-Dys Δ exon 11-33 (SEQ ID NO: 216),
  • k. midi-Dys Δ exon 11-39 (SEQ ID NO: 217),
  • l. midi-Dys Δ exon 11-41 (SEQ ID NO: 218),
  • m. midi-Dys Δ exon 13-33 (SEQ ID NO: 93),
  • n. midi-Dys Δ exon 13-39 (SEQ ID NO: 94),
  • o. midi-Dys Δ exon 13-41 (SEQ ID NO: 95),
  • p. midi-Dys Δ exon 15-39 (SEQ ID NO: 97),
  • q. midi-Dys Δ exon 15-41 (SEQ ID NO: 98),
  • r. midi-Dys Δ exon 17-39 (SEQ ID NO: 100),
  • s. midi-Dys Δ exon 17-41 (SEQ ID NO: 101),
  • t. midi-Dys Δ exon 18-39 (SEQ ID NO: 220),
  • u. midi-Dys Δ exon 18-41 (SEQ ID NO: 221),
  • v. midi-Dys Δ exon 19-39 (SEQ ID NO: 103),
  • w. midi-Dys Δ exon 19-41 (SEQ ID NO: 104),
  • x. midi-Dys Δ exon 21-41 (SEQ ID NO: 106), and
  • y. midi-Dys Δ exon 21-42 (SEQ ID NO: 223).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), Hl domain (SEQ ID NO: 22), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO:42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), Hl domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO:42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), Hl domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO:24), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO:42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), Hl domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO:24), R3 domain (SEQ ID NO:25), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO:42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), Hl domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO:24), R3 domain (SEQ ID NO:25), H2 domain (SEQ ID NO:26), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), RI8 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO:42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), Hl domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO:24), R3 domain (SEQ ID NO:25), H2 domain (SEQ ID NO:26), R4 domain (SEQ ID NO:27), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO:42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
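The six domain compositions listed immediately above share a fixed C-terminal portion (R16 through CT) and differ only in how many N-terminal rod domains follow the ABCD domain. The Python sketch below restates that pattern as data. The SEQ ID NO values are copied from the text; the names `SEQ_ID`, `COMMON_TAIL`, `N_TERMINAL_SERIES`, and `embodiment` are hypothetical helpers introduced for illustration, and "H1" denotes the first hinge domain.

```python
# Domain label -> SEQ ID NO, as recited in the embodiments above.
SEQ_ID = {
    "ABCD": 21, "H1": 22, "R1": 23, "R2": 24, "R3": 25, "H2": 26, "R4": 27,
    "R16": 39, "R17": 40, "R18": 41, "R19": 42, "H3": 43,
    "R20": 44, "R21": 45, "R22": 46, "R23": 47, "R24": 48,
    "H4": 49, "CR": 50, "CT": 51,
}

# Portion common to all six embodiments.
COMMON_TAIL = [
    "R16", "R17", "R18", "R19", "H3",
    "R20", "R21", "R22", "R23", "R24", "H4", "CR", "CT",
]

# N-terminal rod domains added one at a time across the six embodiments.
N_TERMINAL_SERIES = ["H1", "R1", "R2", "R3", "H2", "R4"]


def embodiment(n_extra: int) -> list:
    """ABCD, then the first n_extra N-terminal rod domains, then the common tail."""
    if not 1 <= n_extra <= len(N_TERMINAL_SERIES):
        raise ValueError("n_extra must be between 1 and 6")
    return ["ABCD"] + N_TERMINAL_SERIES[:n_extra] + COMMON_TAIL


def describe(domains: list) -> str:
    """Render a domain list with its SEQ ID NOs."""
    return ", ".join(f"{d} (SEQ ID NO: {SEQ_ID[d]})" for d in domains)


if __name__ == "__main__":
    for n in range(1, 7):
        print(describe(embodiment(n)))
```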
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), a partial H1 domain (SEQ ID NO: 404), a partial R11 domain (SEQ ID NO: 412), R12 Domain (SEQ ID NO: 35), R13 Domain (SEQ ID NO: 36), R14 domain (SEQ ID NO: 37), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), a partial H1 domain (SEQ ID NO: 404), a partial R14 Domain (SEQ ID NO: 413), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), a partial H1 domain (SEQ ID NO: 404), a partial R15 Domain (SEQ ID NO: 414), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), a partial H1 domain (SEQ ID NO: 404), a partial R19 Domain (SEQ ID NO: 415), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), a partial R1 domain (SEQ ID NO: 405), a partial R11 domain (SEQ ID NO: 412), R12 Domain (SEQ ID NO: 35), R13 Domain (SEQ ID NO: 36), R14 Domain (SEQ ID NO: 37), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), a partial R1 domain (SEQ ID NO: 405), a partial R14 Domain (SEQ ID NO: 413), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), a partial R1 domain (SEQ ID NO: 405), a partial R15 Domain (SEQ ID NO: 414), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), a partial R1 domain (SEQ ID NO: 405), a partial R19 Domain (SEQ ID NO: 415), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), a partial R2 domain (SEQ ID NO: 406), a partial R11 domain (SEQ ID NO: 412), R12 Domain (SEQ ID NO: 35), R13 Domain (SEQ ID NO: 36), R14 Domain (SEQ ID NO: 37), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), a partial R2 domain (SEQ ID NO: 406), a partial R14 Domain (SEQ ID NO: 413), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), a partial R2 domain (SEQ ID NO: 406), a partial R15 Domain (SEQ ID NO: 414), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), a partial R2 domain (SEQ ID NO: 406), a partial R19 Domain (SEQ ID NO: 415), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), a partial R3 domain (SEQ ID NO: 407), a partial R14 Domain (SEQ ID NO: 413), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), a partial R3 domain (SEQ ID NO: 407), a partial R15 Domain (SEQ ID NO: 414), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), a partial R3 domain (SEQ ID NO: 407), a partial R19 Domain (SEQ ID NO: 415), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), a partial R3 domain (SEQ ID NO: 408), a partial R14 Domain (SEQ ID NO: 413), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), a partial R3 domain (SEQ ID NO: 408), a partial R15 Domain (SEQ ID NO: 414), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), a partial R3 domain (SEQ ID NO: 408), a partial R19 Domain (SEQ ID NO: 415), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), a partial R4 domain (SEQ ID NO: 409), a partial R14 Domain (SEQ ID NO: 413), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), a partial R4 domain (SEQ ID NO: 409), a partial R15 Domain (SEQ ID NO: 414), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), a partial R4 domain (SEQ ID NO: 409), a partial R19 Domain (SEQ ID NO: 415), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), a partial R4 domain (SEQ ID NO: 410), a partial R14 Domain (SEQ ID NO: 413), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), a partial R4 domain (SEQ ID NO: 410), a partial R15 Domain (SEQ ID NO: 414), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), a partial R4 domain (SEQ ID NO: 410), a partial R19 Domain (SEQ ID NO: 415), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), R4 domain (SEQ ID NO: 27), a partial R5 domain (SEQ ID NO: 411), a partial R15 Domain (SEQ ID NO: 414), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), R4 domain (SEQ ID NO: 27), a partial R5 domain (SEQ ID NO: 411), a partial R16 Domain (SEQ ID NO: 416), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), R4 domain (SEQ ID NO: 27), a partial R5 domain (SEQ ID NO: 411), a partial R19 Domain (SEQ ID NO: 415), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
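The domain compositions listed above share one pattern: all domains upstream of an N-terminal partial repeat are kept, the two partial repeats are joined, and all domains downstream of the C-terminal partial repeat are kept. The following is a minimal illustrative sketch of that bookkeeping, not the applicant's method. The function name `build_layout` and the data structures are hypothetical; the SEQ ID NOs are only those recited in the embodiments above, and the domain order is assumed to follow the N- to C-terminal order in which those embodiments list the domains.

```python
# Assumed N- to C-terminal domain order, following the order used in the embodiments above.
ORDER = (
    ["ABCD", "H1", "R1", "R2", "R3", "H2"]
    + [f"R{i}" for i in range(4, 20)]      # R4 .. R19
    + ["H3"]
    + [f"R{i}" for i in range(20, 25)]     # R20 .. R24
    + ["H4", "CR", "CT"]
)

# Full-length domain SEQ ID NOs recited in the embodiments above.
# R5 to R14 are absent because no embodiment above retains them in full.
FULL_SEQ_ID = {
    "ABCD": 21, "H1": 22, "R1": 23, "R2": 24, "R3": 25, "H2": 26, "R4": 27,
    "R15": 38, "R16": 39, "R17": 40, "R18": 41, "R19": 42, "H3": 43,
    "R20": 44, "R21": 45, "R22": 46, "R23": 47, "R24": 48,
    "H4": 49, "CR": 50, "CT": 51,
}


def build_layout(n_partial, n_seq_id, c_partial, c_seq_id):
    """Return the ordered (domain, SEQ ID NO) list for one truncated construct.

    n_partial / c_partial name the repeats that are only partially retained
    on the N-terminal and C-terminal side of the internal deletion.
    """
    i = ORDER.index(n_partial)
    j = ORDER.index(c_partial)
    if i >= j:
        raise ValueError("the N-terminal partial domain must precede the C-terminal one")
    upstream = [(d, FULL_SEQ_ID[d]) for d in ORDER[:i]]
    downstream = [(d, FULL_SEQ_ID[d]) for d in ORDER[j + 1:]]
    junction = [(f"partial {n_partial}", n_seq_id), (f"partial {c_partial}", c_seq_id)]
    return upstream + junction + downstream


# Example: the embodiment joining partial R2 (SEQ ID NO: 406) to partial R19 (SEQ ID NO: 415).
for domain, seq_id in build_layout("R2", 406, "R19", 415):
    print(f"{domain}: SEQ ID NO: {seq_id}")
```

Enumerating the layouts this way makes it easy to cross-check each recited embodiment against the others for a missing or duplicated domain.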
  • the truncated dystrophin protein comprises an amino acid sequence selected from the group consisting of SEQ ID NOs: 83-107 and 216-223, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 83, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 84, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 85, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 86, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 87, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 88, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 89, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 90, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 91, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 92, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 93, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 94, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 95, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 96, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 97, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 98, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 99, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 100, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 101, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 102, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 103, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 104, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 105, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 106, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 107, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 216, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 217, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 218, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 219, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 220, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 221, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 222, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 223, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
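The "at least 90% ... 99% identical" language above can be illustrated with a small calculation. The sketch below assumes the two sequences have already been aligned to equal length (with `-` for gaps) and counts identical columns; this section does not specify an alignment algorithm or an identity formula, so both are assumptions. The function names and the toy sequences are hypothetical and are not any of the recited SEQ ID NOs.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding the same residue in both sequences.

    Both inputs must be pre-aligned (equal length, '-' marks a gap).
    Columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no alignable columns")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 90.0) -> bool:
    """True if the aligned pair is at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy ten-residue sequences differing at one position.
reference = "MLWWEEVEDC"
variant = "MLWWEEVEDA"
print(percent_identity(reference, variant))
print(meets_threshold(reference, variant, 95.0))
```

The same calculation applies to nucleotide sequences; only the alphabet changes.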
  • the amino acid sequence of the truncated dystrophin protein is not identical to the amino acid sequence of SEQ ID NO: 143. In some embodiments, the truncated dystrophin protein is not a polypeptide of 2361 amino acids. In some embodiments, the truncated dystrophin protein is less than 2361 amino acids in length. In some embodiments, the truncated dystrophin protein is greater than 2361 amino acids in length.
  • the recombinant nucleic acid molecule encoding the truncated dystrophin protein comprises a nucleotide sequence selected from the group consisting of SEQ ID NOS: 108-132, 224-231, 260-280, and 396-403, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 108, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 109, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 110, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 111, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 112, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 113, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 114, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 115, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 116, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 117, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 118, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 119, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 120, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 121, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 122, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 123, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 124, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 125, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 126, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 127, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 128, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 129, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 130, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 131, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 132, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 224, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 225, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 226, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 227, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 228, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 229, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 230, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 231, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 260, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 261, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 262, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 263, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 264, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 265, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 266, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 267, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 268, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 269, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 270, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 271, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 272, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 273, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 274, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 275, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 276, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 277, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 278, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 279, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 280, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 396, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 397, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 398, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 399, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 400, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 401, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 402, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 403, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the present disclosure also provides expression vectors and cells comprising the isolated nucleic acid molecules encoding the truncated dystrophin proteins, as described herein in Tables 3 and 4, or a portion (e.g., a 3’ portion or 5’ portion) of the isolated nucleic acid molecules.
  • the host cell is a mammalian cell, an insect cell, or a bacterial cell.
  • the present disclosure also provides systems for efficiently and reliably generating a large single nucleic acid molecule that encodes a protein of interest, e.g., a truncated dystrophin protein, as described herein, whose coding sequence is too large to package into a single expression vector, e.g., an adeno-associated virus (AAV) vector.
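The size constraint described above can be made concrete with a short calculation. The sketch below assumes a packaging capacity of about 4,700 nucleotides for a single AAV genome; that figure is a commonly cited approximation and is not stated in this document. The 2361-residue length is used purely as an illustrative input, and the helper names and the `overhead_nt` parameter (promoter, polyadenylation signal, and similar elements) are hypothetical.

```python
# Assumption: approximate single-vector AAV packaging capacity, in nucleotides.
ASSUMED_AAV_CAPACITY_NT = 4700


def coding_length_nt(n_amino_acids: int) -> int:
    """Length of an open reading frame in nucleotides, including one stop codon."""
    return 3 * (n_amino_acids + 1)


def fits_single_vector(n_amino_acids: int, overhead_nt: int = 0,
                       capacity_nt: int = ASSUMED_AAV_CAPACITY_NT) -> bool:
    """True if the coding sequence plus regulatory overhead fits in one vector."""
    return coding_length_nt(n_amino_acids) + overhead_nt <= capacity_nt


# Illustrative input: a 2361-residue protein.
orf = coding_length_nt(2361)
print(orf, fits_single_vector(2361))

# Splitting the coding region roughly in half, each half carried by its own vector.
n_half, c_half = 1180, 1181
print(fits_single_vector(n_half - 1), fits_single_vector(c_half))
```

Under these assumptions the full coding region exceeds one vector while each half fits, which is the situation a two-vector system is meant to address.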
  • the present disclosure also provides systems for delivery and expression of a protein of interest, e.g., a truncated dystrophin protein, as described herein.
  • the invention utilizes ribozyme-mediated trans-ligation of two or more nucleic acid molecules to assemble a single nucleic acid molecule encoding a protein of interest.
  • Ribozymes are small catalytic RNA sequences, found widely in nature, that are capable of nucleotide-specific self-cleavage. Ribozyme cleavage generates unique 2',3'-phosphate and 5'-hydroxyl termini, and mammalian cells have an inherent capacity to catalyze the trans-ligation of independent RNAs that have been cleaved by ribozymes.
  • the efficient and precise nature of ribozyme cleavage, which produces precise and unique nucleotide termini, allows a trans-ligated RNA to be scarless and to maintain a protein-coding open reading frame.
  • the ligated mRNAs can behave essentially indistinguishably from their natural full-length counterparts, in that they can be spliced using conventional introns and translated into functional proteins.
  • a first nucleic acid molecule comprising a first coding region encoding a first portion of the dystrophin protein (e.g., an N-terminal portion of the truncated dystrophin protein), and a second nucleic acid molecule comprising a second coding region encoding a second portion of the dystrophin protein (e.g., a C-terminal portion of the truncated dystrophin protein).
  • a single nucleic acid molecule is assembled comprising a third (i.e., combined), coding region which encodes the truncated dystrophin protein
  • the first nucleic acid molecule and the second nucleic acid molecule are included in separate viral vectors for delivery to target cells, e.g., muscle cells.
  • the viral vector is an adeno-associated viral vector (AAV).
  • the present disclosure provides a system for generating a truncated human dystrophin protein, comprising a first recombinant nucleic acid molecule and a second recombinant nucleic acid molecule, wherein the first nucleic acid molecule comprises a first coding region encoding an N-terminal portion of the truncated dystrophin protein and a 3’ ribozyme, where the first coding region is operably linked to the 3’ ribozyme at its 3’ end, wherein the second nucleic acid molecule comprises a second coding region encoding a C-terminal portion of the truncated dystrophin protein and a 5’ ribozyme, where the second coding region is operably linked to the 5’ ribozyme at its 5’ end, wherein upon ribozyme-mediated catalytic ligation, the first coding region and the second coding region form a third coding region encoding for the t
  • the first coding region is operably linked to two or more 3’ ribozymes at its 3’ end.
  • the second coding region is operably linked to two or more 5’ ribozymes at its 5’ end.
  • the 3’ ribozyme in the first nucleic acid molecule is able to catalyze itself out of the nucleic acid molecule leaving a 3’P or 2’,3’ cyclic phosphate (cP) end.
  • the 5’ ribozyme in the second nucleic acid molecule is able to catalyze itself out of the nucleic acid molecule leaving a 5’ OH end.
  • the 3’P or 2’,3’ cP end and the 5’ OH end of nucleic acid molecules that have undergone ribozyme-mediated cleavage can be ligated together.
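By way of non-limiting illustration, the end pairing described in the preceding items can be sketched as a small check. The function name `ends_are_ligatable` and the string labels used for the end chemistries are assumptions made for this sketch and do not appear in the disclosure.

```python
# Illustrative sketch only; the labels and the function name are hypothetical.
# After self-cleavage, the 3' ribozyme leaves a 3'P or 2',3'-cP end on the
# first molecule, and the 5' ribozyme leaves a 5'-OH end on the second molecule.
LIGATABLE_3P_ENDS = frozenset({"3'P", "2',3'-cP"})
LIGATABLE_5P_END = "5'-OH"


def ends_are_ligatable(first_molecule_3p_end: str, second_molecule_5p_end: str) -> bool:
    """Return True when the two processed ends match the pairing described above."""
    return (
        first_molecule_3p_end in LIGATABLE_3P_ENDS
        and second_molecule_5p_end == LIGATABLE_5P_END
    )
```

The check encodes only the pairing stated above; it makes no claim about ligation efficiency or about which ligase acts on the ends.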
  • the coding region of the first nucleic acid molecule which encodes the N-terminal portion of the truncated dystrophin protein
  • the coding region of the second nucleic acid molecule which encodes the C-terminal portion of the truncated dystrophin protein
  • the functional truncated dystrophin is able to restore the expression and function, partially or completely, of an endogenously expressed full-length dystrophin protein.
  • the 3’ ribozyme is selected from the group consisting of Twister (Osa), Twister (Dre), Twister (Nvi), Twister (Sbi), Twister (Envl), Twister (Spu), Twister (Cpa), Twister Sister, Hammerhead (HH), Hepatitis Delta Virus (HDV), Pistol, Varkud Satellite (VS), Hatchet, Hairpin, and Hovlinc (Hov), or a variant or fragment thereof.
  • the 3’ ribozyme comprises a sequence selected from the group consisting of SEQ ID NOs: 6 - 20.
  • the 3’ ribozyme comprises a sequence of SEQ ID NO: 6. In some embodiments, the 3’ ribozyme comprises a sequence of SEQ ID NO:7. In some embodiments, the 3’ ribozyme comprises a sequence of SEQ ID NO:8. In some embodiments, the 3’ ribozyme comprises a sequence of SEQ ID NO:9. In some embodiments, the 3’ ribozyme comprises a sequence of SEQ ID NO: 10. In some embodiments, the 3’ ribozyme comprises a sequence of SEQ ID NO: 11. In some embodiments, the 3’ ribozyme comprises a sequence of SEQ ID NO: 12.
  • the 3’ ribozyme comprises a sequence of SEQ ID NO: 13. In some embodiments, the 3’ ribozyme comprises a sequence of SEQ ID NO: 14. In some embodiments, the 3’ ribozyme comprises a sequence of SEQ ID NO: 15. In some embodiments, the 3’ ribozyme comprises a sequence of SEQ ID NO: 16. In some embodiments, the 3’ ribozyme comprises a sequence of SEQ ID NO: 17. In some embodiments, the 3’ ribozyme comprises a sequence of SEQ ID NO: 18. In some embodiments, the 3’ ribozyme comprises a sequence of SEQ ID NO: 19. In some embodiments, the 3’ ribozyme comprises a sequence of SEQ ID NO:20.
  • the 5’ ribozyme is selected from the group consisting of Twister (Osa), Twister (Dre), Twister (Nvi), Twister (Sbi), Twister (Envl), Twister (Spu), Twister (Cpa), Twister Sister, Hammerhead (HH), Hepatitis Delta Virus (HDV), Pistol, Varkud Satellite (VS), Hatchet, Hairpin, and Hovlinc (Hov), or a variant or fragment thereof.
  • the 5’ ribozyme comprises a sequence selected from the group consisting of SEQ ID NOs: 6 - 20.
  • the 5’ ribozyme comprises a sequence of SEQ ID NO: 6. In some embodiments, the 5’ ribozyme comprises a sequence of SEQ ID NO:7. In some embodiments, the 5’ ribozyme comprises a sequence of SEQ ID NO:8. In some embodiments, the 5’ ribozyme comprises a sequence of SEQ ID NO:9. In some embodiments, the 5’ ribozyme comprises a sequence of SEQ ID NO: 10. In some embodiments, the 5’ ribozyme comprises a sequence of SEQ ID NO: 11. In some embodiments, the 5’ ribozyme comprises a sequence of SEQ ID NO: 12.
  • the 5’ ribozyme comprises a sequence of SEQ ID NO: 13. In some embodiments, the 5’ ribozyme comprises a sequence of SEQ ID NO: 14. In some embodiments, the 5’ ribozyme comprises a sequence of SEQ ID NO: 15. In some embodiments, the 5’ ribozyme comprises a sequence of SEQ ID NO: 16. In some embodiments, the 5’ ribozyme comprises a sequence of SEQ ID NO: 17. In some embodiments, the 5’ ribozyme comprises a sequence of SEQ ID NO: 18. In some embodiments, the 5’ ribozyme comprises a sequence of SEQ ID NO: 19. In some embodiments, the 5’ ribozyme comprises a sequence of SEQ ID NO:20.
  • the 3’ ribozyme comprises an HDV ribozyme.
  • the 5’ ribozyme comprises an HH ribozyme.
  • the HDV ribozyme is selected from the group consisting of HDV, HDV68, HDV67, HDV56, genHDV, and antiHDV, or a variant or fragment thereof.
  • the HH ribozyme is RzB ribozyme.
  • the 3’ ribozyme comprises a Twister ribozyme.
  • the 5’ ribozyme comprises an HH ribozyme.
  • the Twister ribozyme is selected from the group consisting of Twister (Osa), Twister (Dre), Twister (Nvi), Twister (Sbi), Twister (Envl), Twister (Spu), and Twister (Cpa).
  • the HH ribozyme is RzB ribozyme.
  • the 3’ ribozyme comprises a Twister ribozyme.
  • the 5’ ribozyme comprises a Twister ribozyme.
  • the Twister ribozyme is selected from the group consisting of Twister (Osa), Twister (Dre), Twister (Nvi), Twister (Sbi), Twister (Envl), Twister (Spu), and Twister (Cpa).
  • the Twister ribozyme is Twister (Osa).
  • Pre-mRNA splicing by the spliceosome has been shown to enhance mRNA translation, either through deposition of factors which promote a pioneer round of translation or through promoting RNA processing and export to the cytoplasm.
  • the addition of a chimeric cis-splicing intron within a transgene has also been shown to promote transgene protein expression.
  • the addition of intron splice donor and intron splice acceptor sites that are recognized and cis-spliced by the spliceosome may enhance protein expression from split precursor RNA molecules.
  • the first nucleic acid molecule further comprises an intron splice donor sequence
  • the second nucleic acid molecule further comprises an intron splice acceptor sequence.
  • the first nucleic acid molecule comprises a first coding region encoding an N-terminal portion of the truncated dystrophin protein, an intron splice donor sequence, and a 3’ ribozyme
  • the second nucleic acid molecule comprises a second coding region encoding a C-terminal portion of the truncated dystrophin protein, an intron splice acceptor sequence, and a 5’ ribozyme.
  • the splice donor sequence is positioned between the first coding region and the 3’ ribozyme. In some embodiments, the splice donor sequence is selected from the group consisting of SEQ ID NOs: 133 - 136. In some embodiments, the splice donor sequence comprises a sequence of SEQ ID NO: 133. In some embodiments, the splice donor sequence comprises a sequence of SEQ ID NO: 134. In some embodiments, the splice donor sequence comprises a sequence of SEQ ID NO: 135. In some embodiments, the splice donor sequence comprises a sequence of SEQ ID NO: 136.
  • the splice donor sequence is positioned at least 10 nucleotides, at least 15 nucleotides, at least 20 nucleotides, at least 30 nucleotides, at least 40 nucleotides, at least 50 nucleotides, or more from the 3’ ribozyme.
  • the splice donor sequence is positioned within a region of the truncated dystrophin protein coding for a region selected from the group consisting of the R7 domain, the R8 domain, the R9 domain, the R10 domain, the R11 domain, the R12 domain, the R13 domain, the R14 domain, the R15 domain, the R16 domain, the R17 domain, the R18 domain, the R19 domain, the H3 domain, the R20 domain, the R21 domain, and the R22 domain.
  • the splice donor sequence is positioned within a region of the truncated dystrophin protein coding for a region selected from the group consisting of the R8 domain, the R19 domain, the H3 domain, the R20 domain, and the R21 domain. In some embodiments, the splice donor sequence is not positioned within the R21 domain.
  • the splice acceptor sequence is positioned between the 5’ ribozyme and the second coding region.
  • the splice acceptor sequence is selected from the group consisting of SEQ ID NOs: 137-141.
  • the splice acceptor sequence comprises a sequence of SEQ ID NO: 137.
  • the splice acceptor sequence comprises a sequence of SEQ ID NO: 138.
  • the splice acceptor sequence comprises a sequence of SEQ ID NO: 139.
  • the splice acceptor sequence comprises a sequence of SEQ ID NO: 140.
  • the splice acceptor sequence comprises a sequence of SEQ ID NO: 141. In some embodiments, the splice acceptor sequence is positioned at least 10 nucleotides, at least 15 nucleotides, at least 20 nucleotides, at least 30 nucleotides, at least 40 nucleotides, at least 50 nucleotides, or more from the 5’ ribozyme.
  • the splice donor sequence and the splice acceptor sequence are positioned such that the resulting spliced intron is between 50 - 200 bp in length. In some embodiments, the splice donor sequence and splice acceptor sequence are positioned such that the resulting spliced intron encodes a single predominant reading frame. In some embodiments, a stop codon sequence is introduced into the splice donor sequence or the splice acceptor sequence.
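By way of non-limiting illustration, the positional constraints recited above (a minimum spacing between each splice site and its ribozyme, and a spliced intron of 50 - 200 bp) can be checked as follows. The class `SpliceLayout`, the function `check_layout`, and the choice of 10 nucleotides as the default spacing (the smallest of the recited minimums) are assumptions made for this sketch.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class SpliceLayout:
    """Hypothetical description of one split-construct design."""
    donor_to_3p_ribozyme_nt: int     # spacing between splice donor and 3' ribozyme
    acceptor_to_5p_ribozyme_nt: int  # spacing between 5' ribozyme and splice acceptor
    spliced_intron_bp: int           # length of the intron removed after ligation


def check_layout(
    layout: SpliceLayout,
    min_spacing_nt: int = 10,
    intron_range_bp: Tuple[int, int] = (50, 200),
) -> List[str]:
    """Return a list of violated constraints; an empty list means none were found."""
    problems = []
    if layout.donor_to_3p_ribozyme_nt < min_spacing_nt:
        problems.append("splice donor is too close to the 3' ribozyme")
    if layout.acceptor_to_5p_ribozyme_nt < min_spacing_nt:
        problems.append("splice acceptor is too close to the 5' ribozyme")
    low, high = intron_range_bp
    if not low <= layout.spliced_intron_bp <= high:
        problems.append("spliced intron length is outside the allowed range")
    return problems
```

For example, `check_layout(SpliceLayout(20, 15, 120))` returns an empty list, while a design with a 5-nucleotide donor spacing and a 300 bp intron reports two problems. Passing `min_spacing_nt=50` applies the strictest of the recited spacing alternatives.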
  • the first nucleic acid molecule and the second nucleic acid molecule are ligated together by an endogenous ligase that exists in the native cell or tissue in which the nucleic acid assembly is taking place.
  • the systems of the present invention comprise an exogenous ligase to induce the ligation of the processed nucleic acid molecules together.
  • the ligase is RNA 2',3'-Cyclic Phosphate and 5'-OH (RtcB) ligase.
  • the coding region of the first nucleic acid molecule, which encodes the N-terminal portion of the truncated dystrophin protein, and the coding region of the second nucleic acid molecule, which encodes the C-terminal portion of the truncated dystrophin protein, form a longer nucleic acid molecule comprising a third coding region which encodes the truncated dystrophin protein.
  • At least one of the first coding region and the second coding region is at least 2000 nucleotides in length. In some embodiments, at least one of the first coding region and the second coding region is at least 2100 nucleotides in length. In some embodiments, at least one of the first coding region and the second coding region is at least 2200 nucleotides in length. In some embodiments, at least one of the first coding region and the second coding region is at least 2300 nucleotides in length. In some embodiments, at least one of the first coding region and the second coding region is at least 2400 nucleotides in length.
  • At least one of the first coding region and the second coding region is at least 2500 nucleotides in length. In some embodiments, at least one of the first coding region and the second coding region is at least 2600 nucleotides in length.
  • the first coding region and the second coding region are each at least 2000 nucleotides in length. In some embodiments, the first coding region and the second coding region are each at least 2100 nucleotides in length. In some embodiments, the first coding region and the second coding region are each at least 2200 nucleotides in length. In some embodiments, the first coding region and the second coding region are each at least 2300 nucleotides in length. In some embodiments, the first coding region and the second coding region are each at least 2400 nucleotides in length. In some embodiments, the first coding region and the second coding region are each at least 2500 nucleotides in length. In some embodiments, the first coding region and the second coding region are each at least 2600 nucleotides in length.
  • the first coding region and the second coding region do not share a region of substantial sequence identity, i.e., the sequence identity of the first coding region and the second coding region is less than 90%, less than 85%, less than 80%, less than 75%, less than 70%, less than 65%, less than 60%, less than 55%, less than 50%, less than 45%, less than 40%, less than 35%, or less than 30%.
  • the 3’ end of the first coding region does not have a sequence identity to the 5’ end of the second coding region.
  • the third coding region, i.e., the combination of the first coding region and the second coding region, is at least 4920 nucleotides in length. In some embodiments, the third coding region is at least 5000 nucleotides in length. In some embodiments, the third coding region is at least 5100 nucleotides in length. In some embodiments, the third coding region is at least 5200 nucleotides in length. In some embodiments, the third coding region is at least 5300 nucleotides in length.
  • the truncated human dystrophin proteins are functional.
  • the truncated human dystrophin protein comprises at least 1640 amino acids.
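By way of non-limiting illustration, the nucleotide and amino acid lower bounds recited above are consistent with one another under the standard three nucleotides per codon. The sketch below assumes that the stated coding-region length does not count a stop codon; that assumption, and the function name `amino_acids_encoded`, are not taken from the disclosure.

```python
CODON_LENGTH_NT = 3  # standard genetic code: three nucleotides per amino acid


def amino_acids_encoded(coding_region_nt: int) -> int:
    """Number of amino acids encoded by an in-frame coding region.

    Assumes the length excludes any stop codon (an assumption of this sketch).
    """
    if coding_region_nt < 0 or coding_region_nt % CODON_LENGTH_NT != 0:
        raise ValueError("coding region length must be a non-negative multiple of 3")
    return coding_region_nt // CODON_LENGTH_NT


# A combined (third) coding region of 4920 nucleotides corresponds to 1640 residues.
assert amino_acids_encoded(4920) == 1640
```

Under the same assumption, two halves of 2460 nucleotides each would together reach the 4920-nucleotide lower bound.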
  • the truncated human dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein further comprises H1 Domain (SEQ ID NO: 22).
  • the truncated dystrophin protein comprises an amino acid sequence selected from the group consisting of:
    a. midi-Dys ΔR1-R15 (SEQ ID NO: 83),
    b. midi-Dys ΔR2-R15 (SEQ ID NO: 84),
    c. midi-Dys ΔR3-R15 (SEQ ID NO: 85),
    d. midi-Dys ΔH2-R15 (SEQ ID NO: 86),
    e. midi-Dys ΔR4-R15 (SEQ ID NO: 87),
    f. midi-Dys ΔR5-R15 (SEQ ID NO: 88),
    g. midi-Dys Δ exon 13-33 (SEQ ID NO: 93),
    h. midi-Dys Δ exon 13-39 (SEQ ID NO: 94),
    i. midi-Dys Δ exon 13-41 (SEQ ID NO: 95),
    j. midi-Dys Δ exon 13-48 (SEQ ID NO: 96),
    k. midi-Dys Δ exon 15-39 (SEQ ID NO: 97),
    l. midi-Dys Δ exon 15-41 (SEQ ID NO: 98),
    m. midi-Dys Δ exon 15-48 (SEQ ID NO: 99),
    n. midi-Dys Δ exon 17-39 (SEQ ID NO: 100),
    o. midi-Dys Δ exon 17-41 (SEQ ID NO: 101),
    p. midi-Dys Δ exon 17-48 (SEQ ID NO: 102),
    q. midi-Dys Δ exon 18-39 (SEQ ID NO: 220),
    r. midi-Dys Δ exon 18-41 (SEQ ID NO: 221),
    s. midi-Dys Δ exon 18-48 (SEQ ID NO: 222),
    t. midi-Dys Δ exon 19-39 (SEQ ID NO: 103),
    u. midi-Dys Δ exon 19-41 (SEQ ID NO: 104),
    v. midi-Dys Δ exon 19-48 (SEQ ID NO: 105),
    w. midi-Dys Δ exon 21-41 (SEQ ID NO: 106),
    x. midi-Dys Δ exon 21-42 (SEQ ID NO: 223), and
    y. midi-Dys Δ exon 21-48 (SEQ ID NO: 107).
  • the truncated dystrophin protein further comprises R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), and R19
  • the truncated dystrophin protein comprises an amino acid sequence selected from the group consisting of:
    a. midi-Dys ΔR1-R15 (SEQ ID NO: 83),
    b. midi-Dys ΔR2-R15 (SEQ ID NO: 84),
    c. midi-Dys ΔR3-R15 (SEQ ID NO: 85),
    d. midi-Dys ΔH2-R15 (SEQ ID NO: 86),
    e. midi-Dys ΔR4-R15 (SEQ ID NO: 87),
    f. midi-Dys ΔR5-R15 (SEQ ID NO: 88),
    g. midi-Dys Δ exon 10-33 (SEQ ID NO: 89),
    h. midi-Dys Δ exon 10-39 (SEQ ID NO: 90),
    i. midi-Dys Δ exon 10-41 (SEQ ID NO: 91),
    j. midi-Dys Δ exon 11-33 (SEQ ID NO: 216),
    k. midi-Dys Δ exon 11-39 (SEQ ID NO: 217),
    l. midi-Dys Δ exon 11-41 (SEQ ID NO: 218),
    m. midi-Dys Δ exon 13-33 (SEQ ID NO: 93),
    n. midi-Dys Δ exon 13-39 (SEQ ID NO: 94),
    o. midi-Dys Δ exon 13-41 (SEQ ID NO: 95),
    p. midi-Dys Δ exon 15-39 (SEQ ID NO: 97),
    q. midi-Dys Δ exon 15-41 (SEQ ID NO: 98),
    r. midi-Dys Δ exon 17-39 (SEQ ID NO: 100),
    s. midi-Dys Δ exon 17-41 (SEQ ID NO: 101),
    t. midi-Dys Δ exon 18-39 (SEQ ID NO: 220),
    u. midi-Dys Δ exon 18-41 (SEQ ID NO: 221),
    v. midi-Dys Δ exon 19-39 (SEQ ID NO: 103),
    w. midi-Dys Δ exon 19-41 (SEQ ID NO: 104),
    x. midi-Dys Δ exon 21-41 (SEQ ID NO: 106), and
    y. midi-Dys Δ exon 21-42 (SEQ ID NO: 223).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO:42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO:42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO:24), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO:42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO:24), R3 domain (SEQ ID NO:25), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO:42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO:24), R3 domain (SEQ ID NO:25), H2 domain (SEQ ID NO:26), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO:42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO:24), R3 domain (SEQ ID NO:25), H2 domain (SEQ ID NO:26), R4 domain (SEQ ID NO:27), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO:42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
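By way of non-limiting illustration, the six whole-domain compositions recited immediately above differ only in how many N-terminal domains are retained ahead of the R16 Domain. The sketch below regenerates those domain lists from a single ordered list. The function name `retained_domains`, and the reading that each composition corresponds to a deletion beginning at a named domain and running through R15, are assumptions made for this sketch.

```python
from typing import List

# N-terminal domains in the order they appear in the compositions above,
# followed by R5 as the last possible deletion start (assumed for this sketch).
N_TERMINAL_ORDER = ["ABCD", "H1", "R1", "R2", "R3", "H2", "R4", "R5"]

# Block shared by all six compositions, from R16 through the C-terminus.
C_TERMINAL_BLOCK = [
    "R16", "R17", "R18", "R19", "H3", "R20", "R21",
    "R22", "R23", "R24", "H4", "CR", "CT",
]


def retained_domains(first_deleted: str) -> List[str]:
    """Domains kept when the deletion starts at `first_deleted` and runs through R15."""
    allowed_starts = N_TERMINAL_ORDER[2:]  # R1, R2, R3, H2, R4, R5
    if first_deleted not in allowed_starts:
        raise ValueError(f"unsupported deletion start: {first_deleted}")
    kept_n_terminal = N_TERMINAL_ORDER[: N_TERMINAL_ORDER.index(first_deleted)]
    return kept_n_terminal + C_TERMINAL_BLOCK
```

For example, `retained_domains("R1")` yields the ABCD and H1 domains followed by the R16 through CT block, and `retained_domains("R5")` additionally retains R1, R2, R3, H2, and R4.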
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), a partial H1 domain (SEQ ID NO: 404), a partial R11 domain (SEQ ID NO:412), R12 Domain (SEQ ID NO: 35), R13 Domain (SEQ ID NO: 36), R14 domain (SEQ ID NO: 37), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO:42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), a partial H1 domain (SEQ ID NO: 404), a partial R14 Domain (SEQ ID NO: 413), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO:42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), a partial H1 domain (SEQ ID NO: 404), a partial R15 Domain (SEQ ID NO: 414), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO:42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), a partial H1 domain (SEQ ID NO: 404), a partial R19 Domain (SEQ ID NO:415), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), a partial R1 domain (SEQ ID NO: 405), a partial R11 domain (SEQ ID NO: 412), R12 Domain (SEQ ID NO: 35), R13 Domain (SEQ ID NO: 36), R14 Domain (SEQ ID NO: 37), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO:42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 21), H1 domain (
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), a partial R1 domain (SEQ ID NO: 405), a partial R14 Domain (SEQ ID NO: 413), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO:42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), a partial R1 domain (SEQ ID NO: 405), a partial R15 Domain (SEQ ID NO: 414), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO:42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), a partial R1 domain (SEQ ID NO: 405), a partial R19 Domain (SEQ ID NO:415), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), a partial R2 domain (SEQ ID NO: 406), a partial R11 domain (SEQ ID NO: 412), R12 Domain (SEQ ID NO: 35), R13 Domain (SEQ ID NO: 36), R14 Domain (SEQ ID NO: 37), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO:42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 21), H1 domain (
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), a partial R2 domain (SEQ ID NO: 406), a partial R14 Domain (SEQ ID NO: 413), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO:42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), a partial R2 domain (SEQ ID NO: 406), a partial R15 Domain (SEQ ID NO: 414), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO:42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), a partial R2 domain (SEQ ID NO: 406), a partial R19 Domain (SEQ ID NO: 415), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), a partial R3 domain (SEQ ID NO: 407), a partial R14 Domain (SEQ ID NO: 413), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO:42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), a partial R3 domain (SEQ ID NO: 407), a partial R15 Domain (SEQ ID NO: 414), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO:42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), a partial R3 domain (SEQ ID NO: 407), a partial R19 Domain (SEQ ID NO:415), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), a partial R3 domain (SEQ ID NO: 408), a partial R14 Domain (SEQ ID NO: 413), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO:42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), a partial R3 domain (SEQ ID NO: 408), a partial R15 Domain (SEQ ID NO: 414), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO:42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), a partial R3 domain (SEQ ID NO: 408), a partial R19 Domain (SEQ ID NO:415), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), a partial R4 domain (SEQ ID NO: 409), a partial R14 Domain (SEQ ID NO: 413), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO:42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 21), H1 domain (
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), a partial R4 domain (SEQ ID NO: 409), a partial R15 Domain (SEQ ID NO: 414), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO:42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 21), H1 domain (
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), a partial R4 domain (SEQ ID NO: 409), a partial R19 Domain (SEQ ID NO: 415), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), a partial R4 domain (SEQ ID NO: 410), a partial R14 Domain (SEQ ID NO: 413), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO:42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 21), H1 domain (
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), a partial R4 domain (SEQ ID NO: 410), a partial R15 Domain (SEQ ID NO: 414), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO:42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 21), H1 domain (
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), a partial R4 domain (SEQ ID NO: 410), a partial R19 Domain (SEQ ID NO: 415), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), R4 domain (SEQ ID NO: 27), a partial R5 domain (SEQ ID NO: 411), a partial R15 Domain (SEQ ID NO: 414), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), R4 domain (SEQ ID NO: 27), a partial R5 domain (SEQ ID NO: 411), a partial R16 Domain (SEQ ID NO: 416), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
  • the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), R4 domain (SEQ ID NO: 27), a partial R5 domain (SEQ ID NO: 411), a partial R19 Domain (SEQ ID NO: 415), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
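By way of illustration only, the domain compositions recited above can be written as ordered lists of (domain, SEQ ID NO) pairs. The Python sketch below is a hedged example: the names `N_TERMINAL`, `C_TERMINAL`, `build_construct`, and `junction_of` are hypothetical and introduced here for illustration, and the numbers are SEQ ID NO labels only, not sequences. It encodes two of the recited compositions and extracts the central junction domains that distinguish them.

```python
from typing import List, Tuple

# A domain is represented as (domain name, SEQ ID NO label).
Domain = Tuple[str, int]

# Domains shared at the N-terminal end of the recited compositions.
N_TERMINAL: List[Domain] = [
    ("ABCD", 21), ("H1", 22), ("R1", 23), ("R2", 24), ("R3", 25), ("H2", 26),
]

# Domains shared at the C-terminal end of the recited compositions.
C_TERMINAL: List[Domain] = [
    ("H3", 43), ("R20", 44), ("R21", 45), ("R22", 46), ("R23", 47),
    ("R24", 48), ("H4", 49), ("CR", 50), ("CT", 51),
]


def build_construct(junction: List[Domain]) -> List[Domain]:
    """Return the full ordered domain list for a given central junction."""
    return N_TERMINAL + junction + C_TERMINAL


def junction_of(construct: List[Domain]) -> List[Domain]:
    """Return the domains that are not part of the shared N- or C-terminal sets."""
    shared = set(N_TERMINAL) | set(C_TERMINAL)
    return [domain for domain in construct if domain not in shared]


# Partial R4 (SEQ ID NO: 410) joined to partial R19 (SEQ ID NO: 415).
construct_a = build_construct([("partial R4", 410), ("partial R19", 415)])

# Full R4 (SEQ ID NO: 27), partial R5 (SEQ ID NO: 411), partial R19 (SEQ ID NO: 415).
construct_b = build_construct(
    [("R4", 27), ("partial R5", 411), ("partial R19", 415)]
)
```

Calling `junction_of(construct_a)` and `junction_of(construct_b)` returns only the central domains, which is where these two compositions differ.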
  • the truncated human dystrophin protein comprises an amino acid sequence selected from the group consisting of SEQ ID NOs: 83-107 and 216-223, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 83, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 84, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 85, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 86, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 87, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 88, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 89, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 90, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 91, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 92, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 93, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 94, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 95, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 96, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 97, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 98, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 99, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 100, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 101, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 102, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 103, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 104, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 105, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 106, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 107, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 216, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 217, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 218, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 219, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 220, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 221, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 222, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 223, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the truncated human dystrophin protein does not comprise 2361 amino acids. In some embodiments, the truncated human dystrophin protein is greater than 2361 amino acids in length. In some embodiments, the truncated human dystrophin protein is less than 2361 amino acids in length. In some embodiments, the truncated human dystrophin protein is not identical to the sequence of SEQ ID NO: 143.
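The "at least 90% ... identical" language above can be illustrated with a simple calculation. The Python sketch below is a minimal example under stated assumptions: it takes two sequences that have already been aligned to equal length (with `-` marking gaps) and counts identical positions over the full alignment length. The alignment method and the identity denominator are not specified in this passage, so both choices here are assumptions, and the function names `percent_identity` and `meets_threshold` are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding the same residue in both sequences.

    Assumes the inputs are pre-aligned, equal-length strings in which '-'
    marks a gap. Gap columns count toward the length but never as matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 90.0) -> bool:
    """True if the aligned pair is at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

In practice, a pairwise alignment tool would typically produce the aligned strings first; this sketch covers only the counting step.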
  • the third coding region (created by trans-ligation of the first and second coding regions) comprises a nucleotide sequence selected from the group consisting of SEQ ID NOS: 108-132, 224-231, 260-280, and 396-403, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the third coding region comprises a nucleotide sequence of SEQ ID NO: 108, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:108, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto
  • the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO: 108, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the third coding region comprises a nucleotide sequence of SEQ ID NO: 109, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:109, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto
  • the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO: 109, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the third coding region comprises a nucleotide sequence of SEQ ID NO: 110, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:110, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto
  • the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:110, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the third coding region comprises a nucleotide sequence of SEQ ID NO: 111, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO: 111, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto
  • the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:111, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the third coding region comprises a nucleotide sequence of SEQ ID NO: 112, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:112, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto
  • the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO: 112, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the third coding region comprises a nucleotide sequence of SEQ ID NO: 113, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:113, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto
  • the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:113, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the third coding region comprises a nucleotide sequence of SEQ ID NO: 114, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:114, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto
  • the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO: 114, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the third coding region comprises a nucleotide sequence of SEQ ID NO: 115, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:115, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto
  • the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:115, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the third coding region comprises a nucleotide sequence of SEQ ID NO: 116, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:116, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto
  • the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:116, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the third coding region comprises a nucleotide sequence of SEQ ID NO: 117, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:117, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto
  • the second coding region comprises a second portion of the nucleotide sequence of SEQ ID NO: 117, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the third coding region comprises a nucleotide sequence of SEQ ID NO: 118, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO: 118, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto
  • the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO: 118, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the third coding region comprises a nucleotide sequence of SEQ ID NO: 119, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:119, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto
  • the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:119, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the third coding region comprises a nucleotide sequence of SEQ ID NO: 120, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:120, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto
  • the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO: 120, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the third coding region comprises a nucleotide sequence of SEQ ID NO: 121, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:121, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto
  • the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:121, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the third coding region comprises a nucleotide sequence of SEQ ID NO: 122, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO: 122, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto
  • the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:122, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the third coding region comprises a nucleotide sequence of SEQ ID NO: 123, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO: 123, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto
  • the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO: 123, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the third coding region comprises a nucleotide sequence of SEQ ID NO: 124, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO: 124, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto
  • the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO: 124, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the third coding region comprises a nucleotide sequence of SEQ ID NO: 125, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO: 125, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto
  • the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO: 125, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the third coding region comprises a nucleotide sequence of SEQ ID NO: 126, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:126, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto
  • the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO: 126, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the third coding region comprises a nucleotide sequence of SEQ ID NO: 127, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO: 127, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto
  • the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO: 127, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the third coding region comprises a nucleotide sequence of SEQ ID NO: 128, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:128, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto
  • the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO: 128, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the third coding region comprises a nucleotide sequence of SEQ ID NO: 129, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:129, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto
  • the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO: 129, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the third coding region comprises a nucleotide sequence of SEQ ID NO: 130, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:130, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto
  • the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:130, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the third coding region comprises a nucleotide sequence of SEQ ID NO: 131, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:131, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto
  • the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:131, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the third coding region comprises a nucleotide sequence of SEQ ID NO: 132, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO: 132, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto
  • the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO: 132, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the third coding region comprises a nucleotide sequence of SEQ ID NO: 224, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:224, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto
  • the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:224, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the third coding region comprises a nucleotide sequence of SEQ ID NO: 225, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:225, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto
  • the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:225, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the third coding region comprises a nucleotide sequence of SEQ ID NO: 226, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:226, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto
  • the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:226, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the third coding region comprises a nucleotide sequence of SEQ ID NO: 227, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:227, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto
  • the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:227, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the third coding region comprises a nucleotide sequence of SEQ ID NO: 228, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:228, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto
  • the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:228, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the third coding region comprises a nucleotide sequence of SEQ ID NO: 229, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:229, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto
  • the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:229, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the third coding region comprises a nucleotide sequence of SEQ ID NO: 230, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:230, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto
  • the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:230, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the third coding region comprises a nucleotide sequence of SEQ ID NO: 231, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:231, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto
  • the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:231, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the third coding region comprises a nucleotide sequence of SEQ ID NO: 260, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:260, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto
  • the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:260, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the third coding region comprises a nucleotide sequence of SEQ ID NO: 261, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:261, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto
  • the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:261, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the third coding region comprises a nucleotide sequence of SEQ ID NO: 262, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:262, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto
  • the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:262, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the third coding region comprises a nucleotide sequence of SEQ ID NO: 263, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:263, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto
  • the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:263, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the third coding region comprises a nucleotide sequence of SEQ ID NO: 264, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:264, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto
  • the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:264, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the third coding region comprises a nucleotide sequence of SEQ ID NO: 265, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:265, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto
  • the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:265, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the third coding region comprises a nucleotide sequence of SEQ ID NO: 266, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:266, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto
  • the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:266, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the third coding region comprises a nucleotide sequence of SEQ ID NO: 267, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:267, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto
  • the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:267, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the third coding region comprises a nucleotide sequence of SEQ ID NO: 268, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:268, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto
  • the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:268, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the third coding region comprises a nucleotide sequence of SEQ ID NO: 269, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:269, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto
  • the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:269, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the third coding region comprises a nucleotide sequence of SEQ ID NO: 270, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:270, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto
  • the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:270, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the third coding region comprises a nucleotide sequence of SEQ ID NO: 271, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:271, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto
  • the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:271, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the third coding region comprises a nucleotide sequence of SEQ ID NO: 272, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:272, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto
  • the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:272, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the third coding region comprises a nucleotide sequence of SEQ ID NO: 273, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:273, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto
  • the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:273, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the third coding region comprises a nucleotide sequence of SEQ ID NO: 274, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:274, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto
  • the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:274, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the third coding region comprises a nucleotide sequence of SEQ ID NO: 275, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:275, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto
  • the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:275, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the third coding region comprises a nucleotide sequence of SEQ ID NO: 276, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:276, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto
  • the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:276, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the third coding region comprises a nucleotide sequence of SEQ ID NO: 277, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:277, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto
  • the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:277, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the third coding region comprises a nucleotide sequence of SEQ ID NO: 278, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:278, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto
  • the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:278, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the third coding region comprises a nucleotide sequence of SEQ ID NO: 279, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:279, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto
  • the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:279, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the third coding region comprises a nucleotide sequence of SEQ ID NO: 280, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:280, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto
  • the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:280, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the third coding region comprises a nucleotide sequence of SEQ ID NO: 396, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:396, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto
  • the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:396, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the third coding region comprises a nucleotide sequence of SEQ ID NO: 397, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:397, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto
  • the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:397, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the third coding region comprises a nucleotide sequence of SEQ ID NO: 398, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:398, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto
  • the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:398, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the third coding region comprises a nucleotide sequence of SEQ ID NO: 399, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:399, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto
  • the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:399, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the third coding region comprises a nucleotide sequence of SEQ ID NO: 400, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:400, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto
  • the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:400, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the third coding region comprises a nucleotide sequence of SEQ ID NO: 401, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:401, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto
  • the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:401, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the third coding region comprises a nucleotide sequence of SEQ ID NO: 402, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:402, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto
  • the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:402, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the third coding region comprises a nucleotide sequence of SEQ ID NO: 403, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
  • the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:403, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto
  • the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:403, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
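The recurring "at least 90% ... identical" language above can be illustrated with a minimal sketch. The function names below are hypothetical and are not part of the disclosure. The sketch assumes the two nucleotide sequences have already been aligned to equal length, with "-" marking gaps, and it counts identity as matching columns divided by alignment length. Other definitions of percent identity exist, and the result depends on the alignment method and its parameters.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over a pre-computed pairwise alignment.

    Assumption: both strings are the same length and use '-' for gaps.
    A column counts as a match only if both residues are equal and not a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    pairs = zip(aligned_a.upper(), aligned_b.upper())
    matches = sum(1 for x, y in pairs if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 90.0) -> bool:
    """True if the aligned pair reaches the stated identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

For example, two aligned 10-nucleotide sequences that differ at a single position are 90% identical under this definition and would satisfy the lowest threshold recited above.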
  • the first isolated nucleic acid molecule and the second isolated nucleic acid molecule of the system can also be introduced into a vector.
  • the first isolated nucleic acid molecule and the second isolated nucleic acid molecule are encoded in separate vectors.
  • the isolated nucleic acid molecules of the invention can be cloned into a number of types of vectors.
  • the nucleic acid molecules can be cloned into a vector including, but not limited to a plasmid, a phagemid, a phage derivative, an animal virus, and a cosmid.
  • Vectors of particular interest include expression vectors, replication vectors, probe generation vectors, and sequencing vectors.
  • the vector may be provided to a cell in the form of a viral vector.
  • Viral vector technology is well known in the art and is described, for example, in Sambrook et al. (2012, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, New York), and in other virology and molecular biology manuals.
  • Viruses, which are useful as vectors include, but are not limited to, retroviruses, adenoviruses, adeno-associated viruses, herpes viruses, and lentiviruses.
  • the isolated nucleic acid molecule is introduced into a vector derived from an adeno-associated virus (AAV) particle.
  • AAV belong to the genus Dependovirus of the Parvoviridae family and, as used herein, include any serotype of the over 100 serotypes of AAV known.
  • serotypes of AAV have genomic sequences with a significant homology at the level of amino acids and nucleic acids, provide an identical series of genetic functions, produce virions that are essentially equivalent in physical and functional terms, and replicate and assemble through practically identical mechanisms.
  • Peptide insertions into any of these serotypes may also enhance the tissue-specific tropism and therefore also be used to introduce isolated nucleic acid molecules (For examples, see: Yu, CY., Yuan, Z., Cao, Z. et al. A muscle-targeting peptide displayed on AAV2 improves muscle tropism on systemic delivery.
  • the AAV genome is approximately 4.7 kilobases long and is composed of single-stranded deoxyribonucleic acid (ssDNA) which may be either positive- or negative-sensed.
  • the genome comprises two open reading frames (ORFs) encoding the proteins responsible for replication (Rep) and the structural protein of the capsid (Cap).
  • the open reading frames are flanked by two inverted terminal repeats (ITRs), which serve as the origin of replication of the viral genome.
  • the rep frame is made of four overlapping genes encoding Rep proteins (Rep78, Rep68, Rep52, Rep40).
  • the cap frame contains overlapping nucleotide sequences of three capsid proteins: VP1, VP2 and VP3.
  • Rep proteins are important for replication and packaging, while the capsid proteins are assembled to create the protein shell of the AAV, or AAV capsid. See Carter B, Adeno-associated virus and adeno-associated virus vectors for gene delivery, Lassie D, et al., Eds., “Gene Therapy: Therapeutic Mechanisms and Strategies” (Marcel Dekker, Inc., New York, NY, US, 2000) and Gao G, et al., J. Virol. 2004; 78(12):6381-6388.
  • AAV have been explored as vectors for delivery of gene therapeutics because of several unique features.
  • Non-limiting examples of the features include (i) the ability to infect both dividing and non-dividing cells; (ii) a broad host range for infectivity, including human cells; (iii) wild-type AAV has not been associated with any disease and has not been shown to replicate in infected cells; (iv) the lack of cell-mediated immune response against the vector; and (v) the non-integrative nature in a host chromosome, thereby reducing potential for long-term genetic alterations.
  • infection with AAV vectors has minimal influence on changing the pattern of cellular gene expression (Stilwell and Samulski et al., Biotechniques, 2003, 34, 148, the contents of which are herein incorporated by reference in their entirety).
  • AAV vectors for protein delivery may be recombinant viral vectors which are replication defective as they lack sequences encoding functional Rep and Cap proteins within the viral genome.
  • the defective AAV vectors may lack most or all coding sequences and essentially only contain one or two AAV ITR sequences and a payload sequence.
  • AAV vectors may be modified to enhance the efficiency of delivery.
  • modified AAV vectors of the present disclosure can be packaged efficiently and can be used to successfully infect the target cells at high frequency and with minimal toxicity.
  • AAV vector means a vector derived from an adeno-associated virus serotype, including without limitation, serotype 1 (AAV1), serotype 2 (AAV2), serotype 3 (AAV3), serotype 4 (AAV4), serotype 5 (AAV5), serotype 6 (AAV6), serotype 7 (AAV7), serotype 8 (AAV8), serotype 9 (AAV9), serotype 10 (AAV10), serotype 11 (AAV11), serotype 12 (AAV12), serotype 13 (AAV13), AAVrh74, AAV-rh10, AAV-DJ, AAV-LK03, AAV-MYO, AAV-MYO2, AAV-MYO3, MYO3A-AAV, MYO4A-AAV, or MYO4E-AAV. Expression of a particular gene contained within an AAV vector can be specifically targeted to one or more types of cells by choosing the appropriate combination of AAV
  • the present disclosure provides a dual-vector system, where a transgene, e.g., a truncated dystrophin protein, is split into two separate vectors, e.g., AAV vectors. Co-infection of a cell with these two AAV vectors results in the transcription of an assembled RNA that could not be encoded by a single AAV vector because of the packaging limits of AAV.
  • the present disclosure provides a vector system for expressing a truncated human dystrophin protein, comprising a first AAV vector and a second AAV vector, wherein the first AAV vector comprises a first nucleic acid molecule comprising a first coding region encoding an N-terminal portion of the truncated dystrophin protein and a 3’ ribozyme, and the second AAV vector comprises a second nucleic acid molecule comprising a second coding region encoding a C-terminal portion of the truncated dystrophin protein and a 5’ ribozyme.
  • the first coding region and the second coding region form a third coding region encoding the truncated human dystrophin protein.
  • the truncated dystrophin protein is able to restore the expression and function of an endogenously expressed full-length dystrophin protein.
  • the first coding region and second coding region comprised in the first AAV and second AAV vectors may be the sequences of any first coding region and second coding regions described herein.
  • the first nucleic acid molecule within the first AAV vector further comprises an intron splice donor sequence
  • the second nucleic acid molecule within the second AAV vector further comprises an intron splice acceptor sequence.
  • the splice donor sequence is positioned between the first coding region and the 3’ ribozyme. In some embodiments, the splice acceptor sequence is positioned between the 5’ ribozyme and the second coding region.
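The dual-vector arrangement described above can be sketched in a few lines. Everything named here is hypothetical and for illustration only: the capacity constant simply reuses the approximately 4.7 kilobase wild-type genome length mentioned earlier as a rough stand-in for the packaging limit, the split position is arbitrary, and the layouts list only the elements whose relative order the text states (coding region, splice donor or acceptor, ribozyme). It is not a description of any specific construct in the disclosure.

```python
# Assumption: a rough packaging limit taken from the ~4.7 kb wild-type genome length.
ASSUMED_CAPACITY_NT = 4700

# Element order as stated in the text; other elements (ITRs, promoter, polyA) are omitted.
LAYOUTS = {
    "first_vector": ["first coding region", "splice donor", "3' ribozyme"],
    "second_vector": ["5' ribozyme", "splice acceptor", "second coding region"],
}


def needs_dual_vector(cassette_nt: int, capacity_nt: int = ASSUMED_CAPACITY_NT) -> bool:
    """True if a cassette of this length exceeds the assumed single-vector capacity."""
    return cassette_nt > capacity_nt


def split_coding_region(cds: str, split_index: int) -> tuple[str, str]:
    """Split a coding sequence into a 5' (first) and a 3' (second) portion."""
    if not 0 < split_index < len(cds):
        raise ValueError("split_index must fall strictly inside the coding sequence")
    return cds[:split_index], cds[split_index:]


# Toy sequence: joining the two portions recovers the full (third) coding region.
full_cds = "ATGAAACCCGGGTTTTAA"
first_portion, second_portion = split_coding_region(full_cds, 9)
assert first_portion + second_portion == full_cds
```

The final assertion mirrors the statement that the first and second coding regions together form the third coding region.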
  • the vectors also include conventional control elements which are operably linked to the transgene in a manner which permits its transcription, translation and/or expression in a cell transfected with the plasmid vector or infected with the virus produced by the invention.
  • operably linked sequences include both expression control sequences that are contiguous with the gene of interest (e.g., a truncated dystrophin protein or portion thereof) and expression control sequences that act in trans or at a distance to control the gene of interest.
  • Expression control sequences include appropriate transcription initiation, termination, promoter and enhancer sequences; efficient RNA processing signals such as splicing and polyadenylation (polyA) signals; sequences that stabilize cytoplasmic mRNA; sequences that enhance translation efficiency (i.e., Kozak consensus sequence); sequences that enhance protein stability; and when desired, sequences that enhance secretion of the encoded product.
  • the first AAV vector comprises a first promoter operably linked to the first nucleic acid molecule comprising a first coding region encoding an N-terminal portion of the truncated dystrophin protein.
  • the second AAV vector comprises a second promoter operably linked to the second nucleic acid molecule comprising a second coding region encoding a C-terminal portion of the truncated dystrophin protein.
  • the first and second promoters are identical. In some embodiments the first and second promoters are different.
  • Promoters may be naturally occurring or non-naturally occurring.
  • Non-limiting examples of promoters include viral promoters, plant promoters and mammalian promoters.
  • the promoters may be human promoters.
  • the first promoter and/or the second promoter is a ubiquitous promoter or a tissue specific promoter.
  • the promoter is a ubiquitous promoter that results in expression in one or more, e.g., multiple, cells and/or tissues.
  • a promoter which drives or promotes expression in most mammalian tissues includes, but is not limited to, human elongation factor 1α-subunit (EF1α), cytomegalovirus (CMV) immediate-early enhancer and/or promoter, chicken β-actin (CBA) and its derivative CAG, glucuronidase (GUSB), and ubiquitin C (UBC).
  • the promoter is a tissue specific promoter, e.g., a muscle specific promoter, e.g., an actin promoter, a myosin promoter, and a creatine kinase promoter.
  • the promoter is a CK8 promoter, an MHCK7 promoter, an SPC5-12 promoter, a MCK promoter, a desmin promoter, or a Calpain3 promoter.
  • the promoter comprises a nucleotide sequence selected from the group consisting of SEQ ID NOs: 144-150, or a nucleotide sequence at least 95% identical thereto.
  • the promoter comprises a nucleotide sequence of SEQ ID NO: 144, or a nucleotide sequence at least 95% identical thereto. In some embodiments, the promoter comprises a nucleotide sequence of SEQ ID NO: 145, or a nucleotide sequence at least 95% identical thereto. In some embodiments, the promoter comprises a nucleotide sequence of SEQ ID NO: 146, or a nucleotide sequence at least 95% identical thereto. In some embodiments, the promoter comprises a nucleotide sequence of SEQ ID NO: 147, or a nucleotide sequence at least 95% identical thereto.
  • the promoter comprises a nucleotide sequence of SEQ ID NO: 148, or a nucleotide sequence at least 95% identical thereto. In some embodiments, the promoter comprises a nucleotide sequence of SEQ ID NO: 149, or a nucleotide sequence at least 95% identical thereto. In some embodiments, the promoter comprises a nucleotide sequence of SEQ ID NO: 150, or a nucleotide sequence at least 95% identical thereto.
  • the first AAV vector and/or the second AAV vector further comprise an enhancer. Enhancer sequences found on a vector regulate the expression of the gene contained therein.
  • enhancers are bound with protein factors to enhance the transcription of a gene. Enhancers may be located upstream or downstream of the gene they regulate. Enhancers may also be tissue-specific to enhance transcription in a specific cell or tissue type.
  • the vector of the present invention comprises one or more enhancers to boost transcription of the gene present within the vector.
  • the first AAV vector and/or the second AAV vector further comprise an intron or a fragment or derivative thereof.
  • the intron may enhance expression of a truncated dystrophin protein or portion thereof, as described herein.
  • the first AAV vector and/or the second AAV vector may comprise a human beta-globin intron or a fragment or variant thereof.
  • the intron comprises one or more human beta-globin sequences (e.g., including fragments/variants thereof).
  • the first AAV vector and/or the second AAV vector may comprise an SV40 intron or others known in the art.
  • the intron region comprises a nucleotide sequence of SEQ ID NO: 156, or a nucleotide sequence at least 95% identical thereto. In some embodiments, the intron region comprises a nucleotide sequence of SEQ ID NO: 157, or a nucleotide sequence at least 95% identical thereto.
  • the first AAV vector and/or the second AAV vector further comprise an inverted terminal repeat (ITR) sequence.
  • the ITR sequence is positioned either 5’ or 3’ relative to the transgene, e.g., the truncated dystrophin protein or portion thereof.
  • the first AAV vector and/or the second AAV vector have two ITRs. These two ITRs flank the payload region, e.g., the truncated dystrophin protein or portion thereof, at the 5’ and 3’ ends.
  • the ITR functions as an origin of replication comprising a recognition site for replication.
  • the ITR comprises a sequence region which can be complementary and symmetrically arranged.
  • the ITR incorporated into a viral vector described herein may be comprised of a naturally occurring polynucleotide sequence or a recombinantly derived polynucleotide sequence.
  • the AAV vector comprises two ITRs.
  • the ITRs are of the same serotype as one another.
  • the ITRs are of different serotypes.
  • Non-limiting examples include zero, one or both of the ITRs having the same serotype as the capsid.
  • both ITRs of the AAV vectors are AAV2 ITRs.
  • each ITR may be about 100 to about 150 nucleotides in length.
  • the ITR comprises 100-180 nucleotides in length, e.g., about 100-115, about 100-120, about 100-130, about 100-140, about 100-150, about 100-160, about 100-170, about 100-
  • Non-limiting examples of ITR length include 120, 130, 140, 141, 142, and 145 nucleotides in length.
  • the ITR sequence comprises a nucleotide sequence of SEQ ID NO: 202, or a nucleotide sequence at least 95% identical thereto. In some embodiments, the ITR sequence comprises a nucleotide sequence of SEQ ID NO: 203, or a nucleotide sequence at least 95% identical thereto.
  • the first AAV vector and/or the second AAV vector further comprise a polyadenylation (polyA) sequence.
  • the polyA sequence comprises a length of about 40-600 nucleotides, e.g., about 40-300 nucleotides, about 40-250 nucleotides, about 100-400 nucleotides, about 100-300 nucleotides, about 100-200 nucleotides, about 200-600 nucleotides, about 200-500 nucleotides, about 200-400 nucleotides, about 200-300 nucleotides, about 300-600 nucleotides, about 300-500 nucleotides, about 300-400 nucleotides, about 400-600 nucleotides, about 400-500 nucleotides, or about 500-600 nucleotides.
  • the polyadenylation sequence is a bovine growth hormone (bGH) polyA sequence.
  • the polyadenylation sequence comprises a nucleotide sequence of SEQ ID NO: 151, or a nucleotide sequence at least 95% identical thereto.
  • the polyadenylation sequence is a synthetic bovine growth hormone (bGH) polyA sequence.
  • the polyadenylation sequence comprises a nucleotide sequence of SEQ ID NO: 152, or a nucleotide sequence at least 95% identical thereto.
  • the first AAV vector and/or the second AAV vector further comprise an untranslated region (UTR).
  • the 5’ UTR starts at the transcription start site and ends at the start codon and the 3’ UTR starts immediately following the stop codon and continues until the termination signal for transcription.
  • Features typically found in abundantly expressed genes of specific target organs may be engineered into UTRs to enhance the stability and protein production.
  • any UTR from any gene known in the art may be incorporated into the AAV vectors. These UTRs, or portions thereof, may be placed in the same orientation as in the gene from which they were selected or they may be altered in orientation or location.
  • the UTR used in the AAV vector may be inverted, shortened, lengthened, or made with one or more other 5' UTRs or 3' UTRs known in the art.
  • the term “altered,” as it relates to a UTR means that the UTR has been changed in some way in relation to a reference sequence.
  • a 3' or 5' UTR may be altered relative to a wild type or native UTR by the change in orientation or location as taught above or may be altered by the inclusion of additional nucleotides, deletion of nucleotides, swapping or transposition of nucleotides.
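The UTR boundaries defined above (the 5’ UTR runs from the transcription start site up to the start codon, and the 3’ UTR begins immediately after the stop codon) can be illustrated with a small sketch. The function name, the 0-based coordinates, and the toy sequence are assumptions made for illustration. The sketch works on a DNA-alphabet transcript and treats the stop codon as part of the coding sequence.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def split_transcript(transcript: str, cds_start: int, cds_end: int) -> tuple[str, str, str]:
    """Split a transcript into (5' UTR, coding sequence, 3' UTR).

    cds_start: 0-based index of the first nucleotide of the start codon.
    cds_end:   0-based index just past the last nucleotide of the stop codon.
    """
    transcript = transcript.upper()
    cds = transcript[cds_start:cds_end]
    if len(cds) < 6 or len(cds) % 3 != 0:
        raise ValueError("coding sequence must be a whole number of codons")
    if not cds.startswith("ATG"):
        raise ValueError("coding sequence must begin with a start codon")
    if cds[-3:] not in STOP_CODONS:
        raise ValueError("coding sequence must end with a stop codon")
    # Everything before the start codon is 5' UTR; everything after the stop codon is 3' UTR.
    return transcript[:cds_start], cds, transcript[cds_end:]


five_utr, cds, three_utr = split_transcript("GCCACCATGGCCTAAGGGTTT", 6, 15)
```

In this toy example the three returned pieces concatenate back to the input transcript, so an "altered" UTR in the sense described above could be modeled by replacing, shortening, or lengthening the first or third piece before rejoining.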
  • the first AAV vector and/or the second AAV vector further comprise a Kozak sequence.
  • Kozak sequences, which are commonly known to be involved in the process by which the ribosome initiates translation of many genes, are usually included in 5’ UTRs.
  • the first AAV vector and/or the second AAV vector further comprise a post transcriptional regulatory element, e.g., a Woodchuck Hepatitis Virus Posttranscriptional Regulatory Element (WPRE), e.g., in the 3’ UTR.
  • the WPRE comprises a nucleotide sequence of SEQ ID NO: 153, or a nucleotide sequence at least 95% identical thereto. In some embodiments, the WPRE comprises a nucleotide sequence of SEQ ID NO: 154, or a nucleotide sequence at least 95% identical thereto. In some embodiments, the WPRE comprises a nucleotide sequence of SEQ ID NO: 155, or a nucleotide sequence at least 95% identical thereto.
  • the first AAV vector and/or the second AAV vector further comprise one or more filler sequences.
  • the filler sequence may be a wild-type sequence or an engineered sequence.
  • a filler sequence may be a variant of a wild-type sequence.
  • the AAV vector comprises one or more filler sequences in order to have the optimal length for packaging.
  • the AAV vector comprises any portion of a filler sequence, e.g., 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, or 99% of a filler sequence.
  • the filler sequences can be located at any position within the AAV vector, for example, 3’ to the 5’ ITR sequence, 5’ to the promoter sequence, 3’ to the polyadenylation sequence, or 5’ to the 3’ ITR sequence.
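The use of filler sequence to bring a vector genome to a desired length for packaging can be sketched as simple arithmetic. This is a hypothetical illustration: the function name, the element names, the element lengths, and the target length in the example are all assumptions chosen for demonstration, not values taken from this disclosure.

```python
def filler_length_needed(element_lengths: dict, target_length: int) -> int:
    """Return how many filler nucleotides are needed to reach target_length.

    element_lengths maps each vector element (ITRs, promoter, coding
    region, and so on) to its length in nucleotides.
    """
    total = sum(element_lengths.values())
    if total > target_length:
        raise ValueError("elements already exceed the target length")
    return target_length - total


# Hypothetical element lengths, for illustration only.
example_elements = {
    "5' ITR": 145,
    "promoter": 600,
    "coding region": 3000,
    "polyadenylation sequence": 225,
    "3' ITR": 145,
}
needed = filler_length_needed(example_elements, target_length=4500)
```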
  • the vectors to be introduced into a cell may also contain either a selectable marker gene or a reporter gene or both to facilitate identification and selection of expressing cells from the population of cells sought to be transfected or infected through viral vectors.
  • the selectable marker may be carried on a separate piece of DNA and used in a co-transfection procedure. Both selectable markers and reporter genes may be flanked with appropriate regulatory sequences to enable expression in the host cells.
  • Useful selectable markers include, for example, antibiotic-resistance genes, such as neo and the like. Reporter genes are used for identifying potentially transfected cells and for evaluating the functionality of regulatory sequences.
  • a reporter gene is a gene that is not present in or expressed by the recipient organism or tissue and that encodes a polypeptide whose expression is manifested by some easily detectable property, e.g., enzymatic activity. Expression of the reporter gene is assayed at a suitable time after the DNA has been introduced into the recipient cells.
  • Suitable reporter genes may include genes encoding luciferase, beta-galactosidase, chloramphenicol acetyl transferase, secreted alkaline phosphatase, or the green fluorescent protein gene. Suitable expression systems are well known and may be prepared using known techniques or obtained commercially.
  • the first AAV vector comprises, in 5’ to 3’ order, a 5’ ITR sequence, a promoter sequence, a first coding region encoding an N-terminal portion of the truncated dystrophin protein, an intron splice donor sequence, a 3’ ribozyme, and a 3’ ITR sequence.
  • the first AAV vector comprises, in 5’ to 3’ order, a 5’ ITR sequence, a promoter sequence, a first coding region encoding an N-terminal portion of the truncated dystrophin protein, an intron splice donor sequence, a 3’ ribozyme, a polyadenylation sequence, and a 3’ ITR sequence.
  • the first AAV vector comprises, in 5’ to 3’ order, a 5’ ITR sequence, a promoter sequence, an intron region, a first coding region encoding an N-terminal portion of the truncated dystrophin protein, an intron splice donor sequence, a 3’ ribozyme, a polyadenylation sequence, and a 3’ ITR sequence.
  • the first AAV vector comprises a nucleotide sequence selected from the group consisting of SEQ ID NO: 160, 162, 164, 166, 168, 170, 172, 174, 176, 178, 180, 182, 184, 186, 188, 190, 192, 194, 196, 198, 200, 301, 303, 305, 307, 308, 310, 312, 314, 394, and 395, or a nucleotide sequence at least 90% identical thereto.
  • the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 160, or a nucleotide sequence at least 90% identical thereto.
  • the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 162, or a nucleotide sequence at least 90% identical thereto.
  • the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 164, or a nucleotide sequence at least 90% identical thereto.
  • the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 166, or a nucleotide sequence at least 90% identical thereto.
  • the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 168, or a nucleotide sequence at least 90% identical thereto.
  • the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 170, or a nucleotide sequence at least 90% identical thereto.
  • the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 172, or a nucleotide sequence at least 90% identical thereto.
  • the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 174, or a nucleotide sequence at least 90% identical thereto.
  • the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 176, or a nucleotide sequence at least 90% identical thereto.
  • the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 178, or a nucleotide sequence at least 90% identical thereto.
  • the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 180, or a nucleotide sequence at least 90% identical thereto.
  • the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 182, or a nucleotide sequence at least 90% identical thereto.
  • the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 184, or a nucleotide sequence at least 90% identical thereto.
  • the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 186, or a nucleotide sequence at least 90% identical thereto.
  • the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 188, or a nucleotide sequence at least 90% identical thereto.
  • the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 190, or a nucleotide sequence at least 90% identical thereto.
  • the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 192, or a nucleotide sequence at least 90% identical thereto.
  • the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 194, or a nucleotide sequence at least 90% identical thereto.
  • the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 196, or a nucleotide sequence at least 90% identical thereto.
  • the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 198, or a nucleotide sequence at least 90% identical thereto.
  • the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 200, or a nucleotide sequence at least 90% identical thereto.
  • the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 301, or a nucleotide sequence at least 90% identical thereto.
  • the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 303, or a nucleotide sequence at least 90% identical thereto.
  • the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 305, or a nucleotide sequence at least 90% identical thereto.
  • the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 307, or a nucleotide sequence at least 90% identical thereto.
  • the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 308, or a nucleotide sequence at least 90% identical thereto.
  • the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 310, or a nucleotide sequence at least 90% identical thereto.
  • the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 312, or a nucleotide sequence at least 90% identical thereto.
  • the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 314, or a nucleotide sequence at least 90% identical thereto.
  • the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 394, or a nucleotide sequence at least 90% identical thereto.
  • the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 395, or a nucleotide sequence at least 90% identical thereto.
  • the second AAV vector comprises, in 5’ to 3’ order, a 5’ ITR sequence, a promoter sequence, a 5’ ribozyme, an intron splice acceptor sequence, a second coding region encoding a C-terminal portion of the truncated dystrophin protein, and a 3’ ITR sequence.
  • the second AAV vector comprises, in 5’ to 3’ order, a 5’ ITR sequence, a promoter sequence, a 5’ ribozyme, an intron splice acceptor sequence, a second coding region encoding a C-terminal portion of the truncated dystrophin protein, a polyadenylation sequence, and a 3’ ITR sequence.
  • the second AAV vector comprises, in 5’ to 3’ order, a 5’ ITR sequence, a promoter sequence, a 5’ ribozyme, an intron splice acceptor sequence, a second coding region encoding a C-terminal portion of the truncated dystrophin protein, a WPRE sequence, a polyadenylation sequence, and a 3’ ITR sequence.
  • the second AAV vector comprises a nucleotide sequence selected from the group consisting of SEQ ID NO: 161, 163, 165, 167, 169, 171, 173, 175, 177, 179, 181, 183, 185, 187, 189, 191, 193, 195, 197, 199, 201, 302, 304, 306, 309, 311, and 313, or a nucleotide sequence at least 90% identical thereto.
  • the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 161, or a nucleotide sequence at least 90% identical thereto.
  • the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 163, or a nucleotide sequence at least 90% identical thereto.
  • the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 165, or a nucleotide sequence at least 90% identical thereto.
  • the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 167, or a nucleotide sequence at least 90% identical thereto.
  • the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 169, or a nucleotide sequence at least 90% identical thereto.
  • the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 171, or a nucleotide sequence at least 90% identical thereto.
  • the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 173, or a nucleotide sequence at least 90% identical thereto.
  • the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 175, or a nucleotide sequence at least 90% identical thereto.
  • the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 177, or a nucleotide sequence at least 90% identical thereto.
  • the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 179, or a nucleotide sequence at least 90% identical thereto.
  • the second AAV vector comprises a nucleotide sequence of SEQ ID NO:181, or a nucleotide sequence at least 90% identical thereto.
  • the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 183, or a nucleotide sequence at least 90% identical thereto.
  • the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 185, or a nucleotide sequence at least 90% identical thereto.
  • the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 187, or a nucleotide sequence at least 90% identical thereto.
  • the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 189, or a nucleotide sequence at least 90% identical thereto.
  • the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 191, or a nucleotide sequence at least 90% identical thereto.
  • the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 193, or a nucleotide sequence at least 90% identical thereto.
  • the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 195, or a nucleotide sequence at least 90% identical thereto.
  • the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 197, or a nucleotide sequence at least 90% identical thereto.
  • the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 199, or a nucleotide sequence at least 90% identical thereto.
  • the second AAV vector comprises a nucleotide sequence of SEQ ID NO:201, or a nucleotide sequence at least 90% identical thereto.
  • the second AAV vector comprises a nucleotide sequence of SEQ ID NO:302, or a nucleotide sequence at least 90% identical thereto.
  • the second AAV vector comprises a nucleotide sequence of SEQ ID NO:304, or a nucleotide sequence at least 90% identical thereto.
  • the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 306, or a nucleotide sequence at least 90% identical thereto.
  • the second AAV vector comprises a nucleotide sequence of SEQ ID NO:309, or a nucleotide sequence at least 90% identical thereto.
  • the second AAV vector comprises a nucleotide sequence of SEQ ID NO:311, or a nucleotide sequence at least 90% identical thereto.
  • the second AAV vector comprises a nucleotide sequence of SEQ ID NO:313, or a nucleotide sequence at least 90% identical thereto.
  • the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 160, or a nucleotide sequence at least 90% identical thereto
  • the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 161, or a nucleotide sequence at least 90% identical thereto.
  • the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 162, or a nucleotide sequence at least 90% identical thereto
  • the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 163, or a nucleotide sequence at least 90% identical thereto.
  • the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 164, or a nucleotide sequence at least 90% identical thereto
  • the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 165, or a nucleotide sequence at least 90% identical thereto.
  • the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 166, or a nucleotide sequence at least 90% identical thereto
  • the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 167, or a nucleotide sequence at least 90% identical thereto.
  • the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 168, or a nucleotide sequence at least 90% identical thereto
  • the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 169, or a nucleotide sequence at least 90% identical thereto.
  • the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 170, or a nucleotide sequence at least 90% identical thereto
  • the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 171, or a nucleotide sequence at least 90% identical thereto.
  • the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 172, or a nucleotide sequence at least 90% identical thereto
  • the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 173, or a nucleotide sequence at least 90% identical thereto.
  • the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 174, or a nucleotide sequence at least 90% identical thereto
  • the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 175, or a nucleotide sequence at least 90% identical thereto.
  • the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 176, or a nucleotide sequence at least 90% identical thereto
  • the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 177, or a nucleotide sequence at least 90% identical thereto.
  • the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 178, or a nucleotide sequence at least 90% identical thereto
  • the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 179, or a nucleotide sequence at least 90% identical thereto.
  • the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 180, or a nucleotide sequence at least 90% identical thereto
  • the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 181, or a nucleotide sequence at least 90% identical thereto.
  • the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 182, or a nucleotide sequence at least 90% identical thereto
  • the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 183, or a nucleotide sequence at least 90% identical thereto.
  • the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 184, or a nucleotide sequence at least 90% identical thereto
  • the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 185, or a nucleotide sequence at least 90% identical thereto.
  • the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 186, or a nucleotide sequence at least 90% identical thereto
  • the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 187, or a nucleotide sequence at least 90% identical thereto.
  • the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 188, or a nucleotide sequence at least 90% identical thereto
  • the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 189, or a nucleotide sequence at least 90% identical thereto.
  • the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 190, or a nucleotide sequence at least 90% identical thereto
  • the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 191, or a nucleotide sequence at least 90% identical thereto.
  • the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 192, or a nucleotide sequence at least 90% identical thereto
  • the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 193, or a nucleotide sequence at least 90% identical thereto.
  • the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 194, or a nucleotide sequence at least 90% identical thereto
  • the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 195, or a nucleotide sequence at least 90% identical thereto.
  • the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 196, or a nucleotide sequence at least 90% identical thereto
  • the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 197, or a nucleotide sequence at least 90% identical thereto.
  • the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 198, or a nucleotide sequence at least 90% identical thereto
  • the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 199, or a nucleotide sequence at least 90% identical thereto.
  • the first AAV vector comprises a nucleotide sequence of SEQ ID NO:200, or a nucleotide sequence at least 90% identical thereto
  • the second AAV vector comprises a nucleotide sequence of SEQ ID NO:201, or a nucleotide sequence at least 90% identical thereto.
  • the first AAV vector comprises a nucleotide sequence of SEQ ID NO:301, or a nucleotide sequence at least 90% identical thereto
  • the second AAV vector comprises a nucleotide sequence of SEQ ID NO:302, or a nucleotide sequence at least 90% identical thereto.
  • the first AAV vector comprises a nucleotide sequence of SEQ ID NO:303, or a nucleotide sequence at least 90% identical thereto
  • the second AAV vector comprises a nucleotide sequence of SEQ ID NO:304, or a nucleotide sequence at least 90% identical thereto.
  • the first AAV vector comprises a nucleotide sequence of SEQ ID NO:305, or a nucleotide sequence at least 90% identical thereto
  • the second AAV vector comprises a nucleotide sequence of SEQ ID NO:306, or a nucleotide sequence at least 90% identical thereto.
  • the first AAV vector comprises a nucleotide sequence of SEQ ID NO:307, or a nucleotide sequence at least 90% identical thereto
  • the second AAV vector comprises a nucleotide sequence of SEQ ID NO:306, or a nucleotide sequence at least 90% identical thereto.
  • the first AAV vector comprises a nucleotide sequence of SEQ ID NO:308, or a nucleotide sequence at least 90% identical thereto
  • the second AAV vector comprises a nucleotide sequence of SEQ ID NO:309, or a nucleotide sequence at least 90% identical thereto.
  • the first AAV vector comprises a nucleotide sequence of SEQ ID NO:310, or a nucleotide sequence at least 90% identical thereto
  • the second AAV vector comprises a nucleotide sequence of SEQ ID NO:311, or a nucleotide sequence at least 90% identical thereto.
  • the first AAV vector comprises a nucleotide sequence of SEQ ID NO:312, or a nucleotide sequence at least 90% identical thereto
  • the second AAV vector comprises a nucleotide sequence of SEQ ID NO:313, or a nucleotide sequence at least 90% identical thereto.
  • the first AAV vector comprises a nucleotide sequence of SEQ ID NO:314, or a nucleotide sequence at least 90% identical thereto
  • the second AAV vector comprises a nucleotide sequence of SEQ ID NO:309, or a nucleotide sequence at least 90% identical thereto.
  • the first AAV vector comprises a nucleotide sequence of SEQ ID NO:394, or a nucleotide sequence at least 90% identical thereto
  • the second AAV vector comprises a nucleotide sequence of SEQ ID NO:309, or a nucleotide sequence at least 90% identical thereto.
  • the first AAV vector comprises a nucleotide sequence of SEQ ID NO:395, or a nucleotide sequence at least 90% identical thereto
  • the second AAV vector comprises a nucleotide sequence of SEQ ID NO:306, or a nucleotide sequence at least 90% identical thereto.
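The first-vector/second-vector pairings listed above can be summarized as a lookup table. The sketch below simply transcribes the SEQ ID NO pairs from the preceding statements. The names `VECTOR_PAIRS` and `second_vector_for` are hypothetical, and the table says nothing about the "at least 90% identical" variants.

```python
# Map each first-vector SEQ ID NO to its paired second-vector SEQ ID NO.
# SEQ ID NOs 160-200 (even) pair with the next odd number (161-201).
VECTOR_PAIRS = {first: first + 1 for first in range(160, 201, 2)}

# Remaining pairs, several of which share a second vector.
VECTOR_PAIRS.update({
    301: 302,
    303: 304,
    305: 306,
    307: 306,
    308: 309,
    310: 311,
    312: 313,
    314: 309,
    394: 309,
    395: 306,
})


def second_vector_for(first_seq_id: int) -> int:
    """Return the second-vector SEQ ID NO paired with a first-vector SEQ ID NO."""
    try:
        return VECTOR_PAIRS[first_seq_id]
    except KeyError:
        raise KeyError(f"SEQ ID NO: {first_seq_id} is not a listed first vector") from None
```

For example, `second_vector_for(307)` and `second_vector_for(395)` both return 306, reflecting that one second vector is paired with more than one first vector.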
  • Adeno-associated viral (AAV) production includes processes and methods for producing AAV particles and vectors which can contact a target cell to deliver a payload, e.g. a recombinant viral construct, which includes a nucleic acid molecule encoding a payload molecule, e.g., a truncated dystrophin protein or portion thereof.
  • a method of making a recombinant AAV particle of the present disclosure comprising (i) providing a host cell comprising a viral genome described herein, e.g., a nucleic acid comprising a coding region encoding a truncated dystrophin protein or portion thereof, and (ii) incubating the host cell under conditions suitable to encapsulate the viral genome in a capsid protein, thereby making the recombinant AAV particle.
  • the method comprises prior to step (i), introducing a first nucleic acid comprising the viral genome into a cell.
  • the host cell comprises a second nucleic acid encoding the capsid protein.
  • the second nucleic acid is introduced into the host cell prior to, concurrently with, or after the first nucleic acid molecule.
  • the host cell is a bacterial cell, a mammalian cell (e.g., a HEK293 cell), or an insect cell (e.g., an Sf9 cell).
  • a method for making a first recombinant AAV particle comprises providing a host cell comprising a first nucleic acid molecule encoding an N-terminal portion of a truncated dystrophin protein, and incubating the host cell under conditions suitable to encapsulate the first nucleic acid in an AAV capsid protein; thereby making the first recombinant AAV particle.
  • a method for making a second recombinant AAV particle comprises providing a host cell comprising a second nucleic acid molecule encoding a C-terminal portion of a truncated dystrophin protein, and incubating the host cell under conditions suitable to encapsulate the second nucleic acid in an AAV capsid protein; thereby making the second recombinant AAV particle.
  • methods are provided herein of producing AAV particles or vectors by (a) contacting a viral production cell with one or more viral expression constructs encoding at least one AAV capsid protein, and one or more payload constructs encoding a payload molecule, e.g., a truncated dystrophin protein or portion thereof, e.g., an N-terminal portion of a truncated dystrophin protein, or a C-terminal portion of a truncated dystrophin protein; (b) culturing the viral production cell under conditions such that at least one AAV particle or vector is produced; and (c) isolating the AAV particle or vector from the production stream.
  • a viral expression construct may encode at least one structural protein and/or at least one non-structural protein.
  • the structural protein may include any of the native or wild type capsid proteins VP1, VP2, and/or VP3, or a chimeric protein thereof.
  • the VP1 capsid protein may be an sL65 VP1 capsid protein.
  • the non-structural protein may include any of the native or wild type Rep78, Rep68, Rep52, and/or Rep40 proteins or a chimeric protein thereof.
  • contacting occurs via transient transfection, viral transduction, and/or electroporation.
  • the viral production cell is selected from a mammalian cell and an insect cell.
  • the insect cell includes a Spodoptera frugiperda insect cell.
  • the insect cell includes an Sf9 insect cell.
  • the insect cell includes an Sf21 insect cell.
  • the payload construct vector of the present disclosure may include, in various embodiments, at least one inverted terminal repeat (ITR) and may include mammalian DNA.
  • the AAV particles or viral vectors may be formulated as a pharmaceutical composition with one or more acceptable excipients.
  • the AAV particles are produced in an insect cell (e.g., Spodoptera frugiperda (Sf9) cell) using a method described herein.
  • the insect cell is contacted using viral transduction which may include baculoviral transduction.
  • the AAV particles are produced in a mammalian cell (e.g., HEK293 cell) using a method described herein.
  • the mammalian cell is contacted using viral transduction which may include multiplasmid transient transfection (such as triple plasmid transient transfection).
  • the AAV particle production method described herein produces greater than 10¹, greater than 10², greater than 10³, greater than 10⁴, or greater than 10⁵ AAV particles in a viral production cell.
  • a process of the present disclosure includes production of viral particles in a viral production cell using a viral production system which includes at least one viral expression construct and at least one payload construct.
  • the at least one viral expression construct and at least one payload construct can be co-transfected (e.g. dual transfection, triple transfection) into a viral production cell.
  • the transfection is completed using standard molecular biology techniques known and routinely performed by a person skilled in the art.
  • the viral production cell provides the cellular machinery necessary for expression of the proteins and other biomaterials necessary for producing the AAV particles, including Rep proteins which replicate the payload construct and Cap proteins which assemble to form a capsid that encloses the replicated payload constructs.
  • the resulting AAV particle is extracted from the viral production cells and processed into a pharmaceutical preparation for administration.
  • an AAV particle disclosed herein may, without being bound by theory, contact a target cell and enter the cell.
  • the AAV particles may subsequently contact the nucleus of the target cell to deliver the payload construct.
  • the payload construct, e.g., a recombinant viral construct, may be delivered to the nucleus of the target cell, wherein the payload molecule encoded by the payload construct may be expressed.
  • the process for production of viral particles utilizes seed cultures of viral production cells that include one or more baculoviruses (e.g., a Baculoviral Expression Vector (BEV) or a baculovirus infected insect cell (BIIC) that has been transfected with a viral expression construct and a payload construct vector).
  • large scale production of AAV particles utilizes a bioreactor.
  • a bioreactor may allow for the precise measurement and/or control of variables that support the growth and activity of viral production cells such as mass, temperature, mixing conditions (impeller RPM or wave oscillation), CO2 concentration, O2 concentration, gas sparge rates and volumes, gas overlay rates and volumes, pH, Viable Cell Density (VCD), cell viability, cell diameter, and/or optical density (OD).
  • the bioreactor is used for batch production in which the entire culture is harvested at an experimentally determined time point and AAV particles are purified.
  • the bioreactor is used for continuous production in which a portion of the culture is harvested at an experimentally determined time point for purification of AAV particles, and the remaining culture in the bioreactor is refreshed with additional growth media components.
  • AAV viral particles can be extracted from viral production cells in a process which includes cell lysis, clarification, sterilization and purification.
  • Cell lysis includes any process that disrupts the structure of the viral production cell, thereby releasing AAV particles.
  • cell lysis may include thermal shock, chemical, or mechanical lysis methods.
  • Clarification can include the gross purification of the mixture of lysed cells, media components, and AAV particles.
  • clarification includes centrifugation and/or filtration, including but not limited to depth end, tangential flow, and/or hollow fiber filtration.
  • the end result of viral production is a purified collection of AAV particles which include two components: (1) a payload construct (e.g. a recombinant AAV vector genome construct) and (2) a viral capsid.
  • the viral production cell may be selected from any biological organism, including prokaryotic (e.g., bacterial) cells, and eukaryotic cells, including insect cells, yeast cells and mammalian cells.
  • the AAV particles of the present disclosure may be produced in a viral production cell that includes a mammalian cell.
  • Viral production cells may comprise mammalian cells such as A549, WEHI, 3T3, 10T1/2, BHK, MDCK, COS 1, COS 7, BSC 1, BSC 40, BMT 10, VERO, WI38, HeLa, HEK293, HEK293T (293T), Saos, C2C12, L cells, HT1080, Huh7, HepG2, C127, 3T3, CHO, HeLa cells, KB cells, BHK and primary fibroblast, hepatocyte, and myoblast cells derived from mammals.
  • Viral production cells can include cells derived from any mammalian species including, but not limited to, human, monkey, mouse, rat, rabbit, and hamster, or from any cell type, including but not limited to fibroblast, hepatocyte, tumor cell, cell line transformed cell, etc.
  • AAV particles are produced in mammalian cells using a multiplasmid transient transfection method (such as triple plasmid transient transfection).
  • the multiplasmid transient transfection method includes transfection of the following three different constructs: (i) a payload construct, (ii) a Rep/Cap construct (parvoviral Rep and parvoviral Cap), and (iii) a helper construct.
  • the triple transfection method of the three components of AAV particle production may be utilized to produce small lots of virus for assays including transduction efficiency, target tissue (tropism) evaluation, and stability.
  • the triple transfection method of the three components of AAV particle production may be utilized to produce large lots of materials for clinical or commercial applications.
  • mammalian viral production cells (e.g., 293T cells) can be in an adhesion/adherent state (e.g., with calcium phosphate) or a suspension state (e.g., with polyethyleneimine (PEI)).
  • the mammalian viral production cell is transfected with plasmids required for production of AAV (i.e., AAV rep/cap construct, an adenoviral helper construct, and/or ITR flanked payload construct).
  • the transfection process can include optional medium changes (e.g. medium changes for cells in adhesion form, no medium changes for cells in suspension form, medium changes for cells in suspension form if desired).
  • the transfection process can include transfection mediums such as DMEM or F17.
  • the transfection medium can include serum or can be serum-free (e.g. cells in adhesion state with calcium phosphate and with serum, cells in suspension state with PEI and without serum).
  • Cells can subsequently be collected by scraping (adherent form) and/or pelleting (suspension form and scraped adherent form) and transferred into a receptacle. Collection steps can be repeated as necessary for full collection of produced cells. Next, cell lysis can be achieved by consecutive freeze-thaw cycles (-80°C to 37°C), chemical lysis (such as adding the detergent Triton), mechanical lysis, or by allowing the cell culture to degrade after reaching ~0% viability. Cellular debris is removed by centrifugation and/or depth filtration. The samples are quantified for AAV particles by DNase-resistant genome titration by DNA qPCR or digital PCR.
  • AAV particle titers are measured according to genome copy number (genome particles per milliliter). Genome particle concentrations are based on DNA qPCR of the vector DNA as previously reported (Clark et al. (1999) Hum. Gene Ther., 10:1031-1039; Veldwijk et al. (2002) Mol. Ther., 6:272-278, the contents of which are each incorporated by reference in their entireties as related to the measurement of particle concentrations).
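As a minimal sketch of how a genome-copy titer (genome particles per milliliter) can be derived from a qPCR readout, the following assumes a linear standard curve of the form Ct = slope × log10(copies) + intercept. The function names, the standard-curve parameters, and the volumes and dilution factor are hypothetical illustrations, not values taken from the disclosure or from the cited references.

```python
def copies_from_ct(ct, slope, intercept):
    """Invert an assumed standard curve Ct = slope * log10(copies) + intercept.

    Returns the estimated number of genome copies in the qPCR reaction.
    """
    return 10 ** ((ct - intercept) / slope)


def titer_vg_per_ml(ct, slope, intercept, template_ul, dilution_factor):
    """Convert a Ct value into a titer in vector genomes per milliliter.

    template_ul     -- volume of diluted sample loaded per reaction (uL), hypothetical
    dilution_factor -- total dilution applied to the sample before loading, hypothetical
    """
    copies = copies_from_ct(ct, slope, intercept)
    copies_per_ul = copies / template_ul              # copies per uL of diluted sample
    return copies_per_ul * 1000.0 * dilution_factor   # back-calculate to undiluted vg/mL


# Example with made-up numbers: slope -3.32, intercept 38.0, Ct 18.08,
# 5 uL of a 1:1000 dilution loaded per reaction.
example_titer = titer_vg_per_ml(18.08, -3.32, 38.0, 5.0, 1000.0)
```

In practice the slope and intercept would come from a dilution series of a quantified standard run on the same plate.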
  • the AAV particles or viral vectors of the present disclosure may be produced in a viral production cell that includes an insect cell. Any insect cell which allows for replication of parvovirus and which can be maintained in culture can be used in accordance with the present disclosure.
  • AAV viral production cells commonly used for production of recombinant AAV particles include, but are not limited to, Spodoptera frugiperda, including, but not limited to, the Sf9 or Sf21 cell lines, Drosophila cell lines, or mosquito cell lines, such as Aedes albopictus derived cell lines.
  • Use of insect cells for expression of heterologous proteins is well documented, as are methods of introducing nucleic acids, such as vectors, e.g., insect-cell compatible vectors, into such cells and methods of maintaining such cells in culture.
  • Expansion, culturing, transfection, infection and storage of insect cells can be carried out in any cell culture media, cell transfection media or storage media known in the art, including HyClone™ SFX-Insect™ Cell Culture Media, Expression System ESF AF™ Insect Cell Culture Medium, ThermoFisher Sf-900 II™ media, ThermoFisher Sf-900 III™ media, or ThermoFisher Grace’s Insect Media.
  • Insect cell mixtures of the present disclosure can also include any of the formulation additives or elements described in the present disclosure, including (but not limited to) salts, acids, bases, buffers, surfactants (such as Poloxamer 188/Pluronic F-68), and other known culture media elements.
  • Formulation additives can be incorporated gradually or as “spikes” (incorporation of large volumes in a short time).
  • the AAV particles or viral vectors of the present disclosure may be produced in a baculoviral system using a viral expression construct and a payload construct vector.
  • the baculoviral system includes Baculovirus expression vectors (BEVs) and/or baculovirus infected insect cells (BIICs).
  • a viral expression construct or a payload construct of the present disclosure can be a bacmid, also known as a baculovirus plasmid or recombinant baculovirus genome.
  • a viral expression construct or a payload construct of the present disclosure can be a polynucleotide incorporated by homologous recombination (transposon donor/acceptor system) into a bacmid by standard molecular biology techniques known and performed by a person skilled in the art.
  • Transfection of separate viral replication cell populations produces two or more groups (e.g. two, three) of baculoviruses (BEVs), one or more groups of which can include the viral expression construct (Expression BEV), and one or more groups of which can include the payload construct (Payload BEV).
  • the baculoviruses may be used to infect a viral production cell for production of AAV particles or viral vector.
  • the process includes transfection of a single viral replication cell population to produce a single baculovirus (BEV) group which includes both the viral expression construct and the payload construct.
  • These baculoviruses may be used to infect a viral production cell for production of AAV particles or viral vector.
  • BEVs are produced using a Bacmid Transfection agent, such as Promega FuGENE® HD, WFI water, or ThermoFisher Cellfectin® II Reagent.
  • BEVs are produced and expanded in viral production cells, such as an insect cell.
  • the AAV particles or viral vectors of the present disclosure may be produced in insect cells (e.g., Sf9 cells).
  • the AAV particles or viral vectors of the present disclosure may be produced using triple transfection.
  • the AAV particles or viral vectors of the present disclosure may be produced in mammalian cells.
  • the AAV particles or viral vectors of the present disclosure may be produced by triple transfection in mammalian cells.
  • the AAV particles or viral vectors of the present disclosure may be produced by triple transfection in HEK293 cells.
  • the AAV particles or vectors encoding the truncated dystrophin protein or portion thereof, as described herein, may be useful in the fields of human disease, veterinary applications and a variety of in vivo and in vitro settings.
  • the AAV particles or vectors of the present disclosure may be useful in the field of medicine for the treatment, prophylaxis, palliation, or amelioration of dystrophin-associated diseases and/or disorders, e.g., muscular dystrophy, e.g., DMD.
  • the AAV particles or vectors of the disclosure are used for the prevention and/or treatment of dystrophin-associated disorders, e.g., muscular dystrophy, e.g.,
  • the present disclosure provides compositions comprising the isolated nucleic acid molecules encoding the truncated dystrophin protein or portion thereof of the invention, and one or more excipients.
  • the present disclosure also provides pharmaceutical compositions comprising the isolated nucleic acid molecules encoding the truncated dystrophin protein or portion thereof of the invention, and one or more pharmaceutically acceptable excipients.
  • the present disclosure also provides pharmaceutical compositions comprising the AAV vectors comprising the isolated nucleic acid molecules encoding the truncated dystrophin protein or portion thereof, and one or more pharmaceutically acceptable excipients.
  • the pharmaceutical composition comprises a first nucleic acid molecule comprising a first coding region encoding an N-terminal portion of the truncated dystrophin protein, and a second nucleic acid molecule comprising a second coding region encoding a C-terminal portion of the truncated dystrophin protein.
  • the first nucleic acid molecule and the second nucleic acid molecule are presented at a ratio of 1:1, 1:2, 1:3, 1:4, 1:5, 2:1, 3:1, 4:1, or 5:1.
  • the pharmaceutical composition comprises a first AAV vector comprising a first nucleic acid molecule comprising a first coding region encoding an N-terminal portion of the truncated dystrophin protein, and a second AAV vector comprising a second nucleic acid molecule comprising a second coding region encoding a C-terminal portion of the truncated dystrophin protein.
  • the first AAV vector and the second AAV vector are presented at a ratio of 1:1, 1:2, 1:3, 1:4, 1:5, 2:1, 3:1, 4:1, or 5:1.
  • the present disclosure provides a first pharmaceutical composition comprising a first AAV vector comprising a first nucleic acid molecule comprising a first coding region encoding an N-terminal portion of the truncated dystrophin protein, and a second pharmaceutical composition comprising a second AAV vector comprising a second nucleic acid molecule comprising a second coding region encoding a C-terminal portion of the truncated dystrophin protein.
  • the first pharmaceutical composition and the second pharmaceutical composition are administered at a ratio of 1:1, 1:2, 1:3, 1:4, 1:5, 2:1, 3:1, 4:1, or 5:1.
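The ratios listed above can be illustrated with a small arithmetic sketch that splits a combined amount between the first (N-terminal) and second (C-terminal) component. The function name, the choice of vector genomes as the unit, and the example total are assumptions made for illustration only; the disclosure does not specify the basis on which the ratio is computed.

```python
# Ratios enumerated in the text, written as "first:second".
ALLOWED_RATIOS = {"1:1", "1:2", "1:3", "1:4", "1:5", "2:1", "3:1", "4:1", "5:1"}


def split_by_ratio(total_amount, ratio):
    """Split a combined amount into (first, second) parts for a given ratio.

    total_amount -- combined amount of both components (hypothetical unit: vg)
    ratio        -- string such as "1:2", restricted to the ratios listed in the text
    """
    if ratio not in ALLOWED_RATIOS:
        raise ValueError(f"ratio {ratio!r} is not one of the listed ratios")
    first_part, second_part = (int(p) for p in ratio.split(":"))
    first = total_amount * first_part / (first_part + second_part)
    second = total_amount - first
    return first, second


# Example with a made-up total of 6e13 vg at a 1:2 ratio.
first_vg, second_vg = split_by_ratio(6e13, "1:2")
```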
  • the pharmaceutically acceptable excipient may be any functional molecules as vehicles, adjuvants, carriers, or diluents known in the art.
  • the pharmaceutically acceptable excipient may be a transfection facilitating agent, which may include surface active agents, such as immune-stimulating complexes (ISCOMS), Freund’s incomplete adjuvant, LPS analog including monophosphoryl lipid A, muramyl peptides, quinone analogs, vesicles such as squalene and squalene, hyaluronic acid, lipids, liposomes, calcium ions, viral proteins, polyanions, polycations, or nanoparticles, or other known transfection facilitating agents.
  • compositions which are suitable for administration to humans, it will be understood by the skilled artisan that such compositions are generally suitable for administration to any other animal, e.g., to non-human animals, e.g. non-human mammals. Modification of pharmaceutical compositions suitable for administration to humans in order to render the compositions suitable for administration to various animals is well understood, and the ordinarily skilled veterinary pharmacologist can design and/or perform such modification with merely ordinary, if any, experimentation.
  • compositions include, but are not limited to, humans and/or other primates; mammals, including commercially relevant mammals such as cattle, pigs, horses, sheep, cats, dogs, mice, and/or rats; and/or birds, including commercially relevant birds such as poultry, chickens, ducks, geese, and/or turkeys.
  • the compositions are administered to humans, human patients, or subjects.
  • a pharmaceutical composition in accordance with the present disclosure may be prepared, packaged, and/or sold in bulk, as a single unit dose, and/or as a plurality of single unit doses.
  • a “unit dose” refers to a discrete amount of the pharmaceutical composition comprising a predetermined amount of the active ingredient.
  • the amount of the active ingredient is generally equal to the dosage of the active ingredient which would be administered to a subject and/or a convenient fraction of such a dosage such as, for example, one-half or one-third of such a dosage.
  • pharmaceutically acceptable carrier refers to any substantially non-toxic carrier conventionally useable for administration of pharmaceuticals in which the isolated nucleic acid molecules or AAV vectors of the present disclosure will remain stable and bioavailable.
  • the pharmaceutically acceptable carrier must be of sufficiently high purity and of sufficiently low toxicity to render it suitable for administration to the mammal being treated.
  • the pharmaceutically acceptable carrier can be liquid or solid and is selected, with the planned manner of administration in mind, to provide for the desired bulk, consistency, etc., when combined with an active agent and other components of a given composition.
  • Suitable pharmaceutically acceptable carriers include any and all solvents, dispersion media, coatings, antibacterial and antifungal agents, isotonic and absorption delaying agents, and the like that are physiologically compatible.
  • Pharmaceutically acceptable carriers also include sterile aqueous solutions or dispersions and sterile powders for the extemporaneous preparation of sterile injectable solutions or dispersion.
  • the use of such media and agents for pharmaceutically active substances is well known in the art. Except insofar as any conventional media or agent is incompatible with the gene therapy vector, use thereof in the pharmaceutical compositions of the disclosure is contemplated. Supplementary active compounds can also be incorporated into the compositions.
  • compositions of the disclosure may be formulated for delivery to animals for veterinary purposes (e.g., livestock (cattle, pigs, dogs, mice, rats)) and other non-human mammalian subjects, as well as to human subjects.
  • the pharmaceutical compositions of the present disclosure are in the form of injectable compositions.
  • the compositions can be prepared as an injectable, either as liquid solutions or suspensions.
  • the preparation may also be emulsified. Suitable excipients are, for example, water, saline, dextrose, glycerol, ethanol, phosphate buffered saline or the like and combinations thereof.
  • the preparation may contain minor amounts of auxiliary substances such as wetting or emulsifying agents, pH-buffering agents, adjuvants, surfactant or immunopotentiators.
  • Sterile injectable solutions can be prepared by incorporating the compositions of the disclosure in the required amount in an appropriate solvent with one or a combination of ingredients enumerated above, as required, followed by filtered sterilization.
  • dispersions are prepared by incorporating the active compound into a sterile vehicle which contains a basic dispersion medium and the required other ingredients from those enumerated above.
  • the preferred methods of preparation include vacuum drying and freeze-drying, which yield a powder of the active ingredient plus any additional desired ingredient from a previously sterile-filtered solution thereof.
  • Toxicity and therapeutic efficacy of nucleic acid molecules described herein can be determined by standard pharmaceutical procedures in cell cultures or experimental animals, e.g., for determining the ED50 (the dose therapeutically effective in 50% of the population). Data obtained from cell culture assays and/or animal studies can be used in formulating a range of dosage for use in humans. The dosage typically will lie within a range of concentrations that include the ED50 with little or no toxicity. The dosage may vary within this range depending upon the dosage form employed and the route of administration utilized. For any compound used in the method of the disclosure, the therapeutically effective dose can be estimated initially from cell culture assays.
  • Formulations of the pharmaceutical compositions described herein may be prepared by any method known or hereafter developed in the art of pharmacology. In general, such preparatory methods include the step of bringing the active ingredient into association with an excipient and/or one or more other accessory ingredients, and then, if necessary and/or desirable, dividing, shaping and/or packaging the product into a desired single- or multi-dose unit.
  • Relative amounts of the active ingredient, the pharmaceutically acceptable excipient, and/or any additional ingredients in a pharmaceutical composition in accordance with the disclosure will vary, depending upon the identity, size, and/or condition of the subject treated and further depending upon the route by which the composition is to be administered.
  • the composition may comprise between 0.1% and 99% (w/w) of the active ingredient.
  • the composition may comprise between 0.1% and 100%, e.g., between 0.5% and 50%, between 1-30%, between 5-80%, or at least 80% (w/w) active ingredient.
  • the pharmaceutical composition of the disclosure can be formulated using one or more excipients to: (1) increase stability; (2) increase cell transfection or transduction; (3) permit the sustained or delayed release; (4) alter the biodistribution (e.g., targeting specific tissues or cell types); (5) increase the translation of encoded protein in vivo; (6) alter the release profile of encoded protein in vivo; and/or (7) allow for regulatable expression of the payload.
  • Formulations of the present disclosure can include, without limitation, saline, lipidoids, liposomes, lipid nanoparticles, polymers, lipoplexes, core-shell nanoparticles, peptides, proteins, cells transfected with viral vectors (e.g., for transplantation into a subject), nanoparticle mimics and combinations thereof. Further, the viral vectors of the present disclosure may be formulated using self-assembled nucleic acid nanoparticles.
  • the viral vectors encoding the truncated dystrophin protein or portion thereof may be formulated to optimize baricity and/or osmolality.
  • the baricity and/or osmolality of the formulation may be optimized to ensure optimal distribution in the muscle tissues or cells.
  • the formulations of the disclosure can include one or more excipients, each in an amount that together increases the stability of the AAV particle, increases cell transfection or transduction by the viral particle, increases the expression of viral particle encoded protein, and/or alters the release profile of AAV particle encoded proteins.
  • a pharmaceutically acceptable excipient may be at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% pure.
  • an excipient is approved for use for humans and for veterinary use.
  • an excipient may be approved by United States Food and Drug Administration.
  • an excipient may be of pharmaceutical grade.
  • an excipient may meet the standards of the United States Pharmacopoeia (USP), the European Pharmacopoeia (EP), the British Pharmacopoeia, and/or the International Pharmacopoeia.
  • Excipients, as used herein, include, but are not limited to, any and all solvents, dispersion media, diluents, or other liquid vehicles, dispersion or suspension aids, surface active agents, isotonic agents, thickening or emulsifying agents, preservatives, and the like, as suited to the particular dosage form desired.
  • Various excipients for formulating pharmaceutical compositions and techniques for preparing the composition are known in the art (see Remington: The Science and Practice of Pharmacy, 21st Edition, A. R. Gennaro, Lippincott, Williams & Wilkins, Baltimore, MD, 2006; the contents of which are herein incorporated by reference in their entirety).
  • any conventional excipient medium may be contemplated within the scope of the present disclosure, except insofar as any conventional excipient medium may be incompatible with a substance or its derivatives, such as by producing any undesirable biological effect or otherwise interacting in a deleterious manner with any other component(s) of the pharmaceutical composition.
  • formulations may comprise at least one excipient which is an inactive ingredient.
  • inactive ingredient refers to one or more agents that do not contribute to the activity of the pharmaceutical composition included in formulations.
  • all, none, or some of the inactive ingredients which may be used in the formulations of the present disclosure may be approved by the US Food and Drug Administration (FDA).
  • the present disclosure provides methods of use of the systems and compositions of the disclosure, which generally include administering the isolated nucleic acid molecules encoding the truncated dystrophin protein or portion thereof, the AAV vectors comprising the isolated nucleic acid molecules, or the compositions or pharmaceutical compositions of the disclosure.
  • the present disclosure provides methods for delivering a truncated dystrophin protein or increasing expression of a functional dystrophin (e.g., truncated dystrophin protein) in a subject having or diagnosed with having a dystrophin-associated disorder.
  • the present disclosure provides methods for treating a dystrophin-associated disorder in a subject in need thereof.
  • the present disclosure further provides methods for increasing muscle mass or muscle strength and/or preventing fibrosis in a subject having or diagnosed with having a dystrophin-associated disorder.
  • the methods generally comprise administering a therapeutically effective amount of the isolated nucleic acid molecules encoding the truncated dystrophin protein or portion thereof of the disclosure, the AAV vectors comprising the isolated nucleic acid molecules, or the pharmaceutical composition of the disclosure.
  • the dystrophin-associated disorder is muscular dystrophy.
  • Muscular dystrophies include, but are not limited to, Duchenne muscular dystrophy, Becker muscular dystrophy, myotonic dystrophy, congenital muscular dystrophy, distal muscular dystrophy, Emery-Dreifuss muscular dystrophy, facioscapulohumeral muscular dystrophy, limb girdle muscular dystrophy, and oculopharyngeal muscular dystrophy.
  • the truncated dystrophin protein of the present disclosure may be any truncated dystrophin protein described herein.
  • the truncated dystrophin protein comprises an amino acid sequence selected from the group consisting of SEQ ID NOs: 83-107 and 216-223.
  • the nucleic acid molecules encoding the truncated dystrophin protein include those having a nucleic acid sequence selected from the group consisting of SEQ ID NOs: 108-132, 224-231, 260-280, and 396-403, in certain embodiments, operably linked to regulatory elements for constitutive, muscle-specific (including skeletal, smooth muscle and cardiac muscle-specific) expression, and other regulatory elements such as poly A sites.
  • Such nucleic acids may be in the context of a recombinant AAV genome vector, for example, flanked by ITR sequences, particularly, AAV2 ITR sequences.
  • the methods comprise administering to a subject in need thereof a first nucleic acid molecule comprising a first coding region encoding an N-terminal portion of the truncated dystrophin protein, and a second nucleic acid molecule comprising a second coding region encoding a C-terminal portion of the truncated dystrophin protein.
  • the methods comprise administering to a subject in need thereof a first AAV vector comprising a first nucleic acid molecule comprising a first coding region encoding an N-terminal portion of the truncated dystrophin protein, and a second AAV vector comprising a second nucleic acid molecule comprising a second coding region encoding a C-terminal portion of the truncated dystrophin protein.
  • the methods comprise administering to a subject in need thereof a first AAV vector comprising a nucleotide sequence selected from the group consisting of SEQ ID NOs: 160, 162, 164, 166, 168, 170, 172, 174, 176, 178, 180, 182, 184, 186, 188, 190, 192, 194, 196, 198, 200, 301, 303, 305, 307, 308, 310, 312, 314, 394, and 395, or a nucleotide sequence at least 90% identical thereto; and a second AAV vector comprising a nucleotide sequence selected from the group consisting of SEQ ID NOs: 161, 163, 165, 167, 169, 171, 173, 175, 177, 179, 181, 183, 185, 187, 189, 191, 193, 195, 197, 199, 201, 302, 304, 306, 309, 311, 313, or a nucleotide sequence selected from the
  • the subject has been diagnosed with and/or has symptom(s) associated with muscular dystrophy, e.g., DMD.
  • the isolated nucleic acid molecules encoding the truncated dystrophin protein or portion thereof, the AAV vectors, or the pharmaceutical composition of the disclosure may be administered by any suitable route of administration.
  • a route of administration may refer to any administration pathway known in the art, including but not limited to aerosol, enteral, nasal, ophthalmic, oral, parenteral, rectal, transdermal (e.g., topical cream or ointment, patch), or vaginal.
  • “Parenteral” refers to a route of administration that is generally associated with injection, including infraorbital, infusion, intraarterial, intracapsular, intracardiac, intradermal, intramuscular, intraperitoneal, intrapulmonary, intraspinal, intrasternal, intrathecal, intrauterine, intravenous, subarachnoid, subcapsular, subcutaneous, transmucosal, or transtracheal.
  • the route of administration in accordance with the methods described herein includes local or regional muscle injection to improve local muscle function in patients, systemic delivery (such as intravenous, intra-arterial, intraperitoneal) to all muscles in a region or in the whole body in patients, or in vitro infection of myogenic stem cells with an AAV or lentiviral vector followed by local and/or systemic delivery.
  • the nucleic acid molecules or vectors are administered intramuscularly, subcutaneously, or intravenously. Intramuscular, subcutaneous, or intravenous administration should result in expression of the transgene product in cells of the muscle (including skeletal muscle, cardiac muscle, and/or smooth muscle).
  • the first nucleic acid molecule and the second nucleic acid molecule, or the pharmaceutical composition comprising the same, are administered together. In some embodiments, the first nucleic acid molecule and the second nucleic acid molecule, or the pharmaceutical composition comprising the same, are administered separately.
  • the first AAV vector and the second AAV vector, or the pharmaceutical composition comprising the same, are administered together. In some embodiments, the first AAV vector and the second AAV vector, or the pharmaceutical composition comprising the same, are administered separately.
  • “therapeutically effective amount” refers to an amount of a truncated dystrophin protein, peptide, or fragment thereof, or an isolated nucleic acid molecule encoding the same, or an AAV vector comprising the same, that produces a desired therapeutic effect in a subject, such as preventing or treating a target condition, alleviating symptoms associated with the condition, or producing a desired physiological effect.
  • the precise amount will vary depending upon a variety of factors, including but not limited to the physiological condition of the subject (including age, sex, disease type and stage, general physical condition, responsiveness to a given dosage, and type of medication), the nature of a pharmaceutically acceptable carrier or carriers in the formulation, and the route of administration.
  • an effective or therapeutically effective amount may vary depending on whether the truncated dystrophin protein, peptide, or fragment thereof is administered alone or in combination with a compound, drug, therapy or other therapeutic method or modality.
  • One skilled in the clinical and pharmacological arts will be able to determine an effective amount or therapeutically effective amount through routine experimentation.
  • the isolated nucleic acid molecules and/or the vectors of the disclosure are provided in a therapeutically effective amount to elicit the desired effect, e.g., increase dystrophin expression and/or activity.
  • the quantity of the viral particle to be administered both according to number of treatments and amount, will also depend on factors such as the clinical status, age, previous treatments, the general health and/or age of the subject, other diseases present, and the severity of the disorder. Precise amounts of active ingredient required to be administered depend on the judgment of the gene therapist and will be particular to each individual patient.
  • treatment of a subject with a therapeutically effective amount of the nucleic acid molecules and/or the vectors of the disclosure can include a single treatment or, preferably, can include a series of treatments.
  • the effective dosage used for treatment may increase or decrease over the course of a particular treatment. Changes in dosage may result from the results of diagnostic assays as described herein.
  • the pharmaceutical compositions can be included in a container, pack, or dispenser together with instructions for administration.
  • a therapeutically effective amount of a viral particle of the disclosure is in titers ranging from about 1×10⁵, about 1.5×10⁵, about 2×10⁵, about 2.5×10⁵, about 3×10⁵, about 3.5×10⁵, about 4×10⁵, about 4.5×10⁵, about 5×10⁵, about 5.5×10⁵, about 6×10⁵, about 6.5×10⁵, about 7×10⁵, about 7.5×10⁵, about 8×10⁵, about 8.5×10⁵, about 9×10⁵, about 9.5×10⁵, about 1×10⁶, about 1.5×10⁶, about 2×10⁶, about 2.5×10⁶, about 3×10⁶, about 3.5×10⁶, about 4×10⁶, about 4.5×10⁶
  • a therapeutically effective amount of a viral particle of the disclosure is in genome copies (“GC”), also referred to as “viral genomes” (“vg”), ranging from: about 1×10⁵, about 1.5×10⁵, about 2×10⁵, about 2.5×10⁵, about 3×10⁵, about 3.5×10⁵, about 4×10⁵, about 4.5×10⁵, about 5×10⁵, about 5.5×10⁵, about 6×10⁵, about 6.5×10⁵, about 7×10⁵, about 7.5×10⁵, about 8×10⁵, about 8.5×10⁵, about 9×10⁵, about 9.5×10⁵, about 1×10⁶, about 1.5×10⁶, about 2×10⁶, about 2.5×10⁶, about 3×10⁶, about 3.5xl
  • Any method known in the art can be used to determine the genome copy (GC) number of the viral compositions of the disclosure.
  • One method for performing AAV GC number titration is as follows: purified AAV viral particle samples are first treated with DNase to eliminate unencapsulated AAV genome DNA or contaminating plasmid DNA from the production process. The DNase-resistant particles are then subjected to heat treatment to release the genome from the capsid. The released genomes are then quantitated by real-time PCR or digital PCR using primer/probe sets targeting a specific region of the viral genome.
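For the digital PCR option mentioned above, the quantification step is commonly based on Poisson statistics over the partitions. The sketch below illustrates that general calculation only; it is not taken from the disclosure, and the function names, partition volume, reaction volume, template volume, and dilution factor are all hypothetical parameters.

```python
import math


def copies_per_partition(positive, total):
    """Poisson estimate of mean copies per partition from the positive fraction.

    lambda = -ln(1 - positive / total); undefined when every partition is positive.
    """
    if not 0 <= positive < total:
        raise ValueError("need 0 <= positive < total partitions")
    return -math.log(1.0 - positive / total)


def dpcr_titer_vg_per_ml(positive, total, partition_volume_ul,
                         reaction_volume_ul, template_volume_ul, dilution_factor):
    """Back-calculate genome copies per mL of the undiluted sample.

    All volumes are in microliters; dilution_factor is the total dilution of the
    DNase-treated, heat-released sample before it was added to the reaction.
    """
    lam = copies_per_partition(positive, total)
    copies_per_ul_reaction = lam / partition_volume_ul
    copies_in_reaction = copies_per_ul_reaction * reaction_volume_ul
    copies_per_ul_template = copies_in_reaction / template_volume_ul
    return copies_per_ul_template * dilution_factor * 1000.0


# Example with made-up numbers: 10,000 of 20,000 partitions positive,
# 0.001 uL partitions, 20 uL reaction, 2 uL template, 1:10,000 dilution.
example_gc_per_ml = dpcr_titer_vg_per_ml(10000, 20000, 0.001, 20.0, 2.0, 1e4)
```

The Poisson correction matters because a positive partition may contain more than one genome; simply counting positive partitions would underestimate the titer at high occupancy.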
  • nucleic acid molecules or gene therapy vectors provided herein may be administered in combination with other treatments for muscular dystrophy, including corticosteroids, beta blockers and ACE inhibitors.
  • the methods may alleviate or reduce symptoms of muscular dystrophy.
  • Deletion of dystrophin results in mechanical instability causing myofibers to weaken and eventually break during contraction.
  • Patients with DMD first display skeletal muscle weakness in early childhood, which progresses rapidly to loss of muscle mass, spinal curvature known as kyphosis, paralysis and ultimately death from cardiorespiratory failure before 30 years of age.
  • Skeletal muscles of DMD patients also develop muscle hypertrophy, particularly of the calf, and show evidence of focal necrotic myofibers, abnormal variation in myofiber diameter, increased fat deposition and fibrosis, as well as lack of dystrophin staining in immunohistological sections.
  • the methods of treatment provided herein are intended to slow or arrest the progression of DMD, or other muscular dystrophy disease, or to reduce the severity of one or more symptoms associated with DMD, or other muscular dystrophy disease.
  • the methods provided herein are intended to reduce muscle degeneration, induce/improve muscle regeneration, and/or prevent/reduce downstream pathologies including inflammation and fibrosis that interfere with muscle regeneration and cause loss of movement, orthopedic complications, and, ultimately, respiratory and cardiac failure.
  • Efficacy may be monitored by measuring changes from baseline in gross motor function using the North Star Ambulatory Assessment (NSAA) (scale is ordinal with 34 as the maximum score indicating fully-independent function) or an age-appropriate modified assessment, by assessing changes in ambulatory function (e.g., 6-min (distance walked ⁇ 300m, between 300 and 400m, or >400m)), by performing a timed function test to measure changes from baseline in time taken to stand from a supine position (1 to 8 s (good), 8 to 20 s (moderate), and 20 to 35 s (poor)), by performing time to climb (4 steps) and time to run/walk assessments (10 meters), as well as myometry to evaluate changes from baseline in strength of upper and lower extremities.
  • NSAA North Star Ambulatory Assessment
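The timed supine-to-stand bands quoted above (1 to 8 s, 8 to 20 s, 20 to 35 s) can be expressed as a small lookup. This is only a sketch: the stated ranges share their boundary values, so the code assumes that a boundary time falls into the better category, and it reports anything outside 1 to 35 s as unclassified. The function name is hypothetical.

```python
def classify_supine_to_stand(seconds):
    """Map a supine-to-stand time (seconds) to the good/moderate/poor bands.

    Assumption: shared boundaries (8 s, 20 s) are assigned to the better band.
    """
    if seconds < 1 or seconds > 35:
        return "unclassified"  # outside the ranges given in the text
    if seconds <= 8:
        return "good"
    if seconds <= 20:
        return "moderate"
    return "poor"
```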
  • Efficacy may also be monitored by measuring changes (reduction) from baseline in serum creatine kinase (CK) levels (normal: 35-175 U/L, DMD: 500-20,000 U/L), an enzyme that is found in abnormally high levels when muscle is damaged, serum or urine creatinine levels (DMD: 10-25 µmol/L, mild BMD: 20-30 µmol/L, normal: >53 µmol/L), and truncated dystrophin protein levels in muscle biopsies.
  • the percentage of myofibers positive for truncated dystrophin protein expression is also a method to establish efficacy of treatment.
  • Magnetic Resonance Imaging may also be performed to assess fatty tissue infiltration in skeletal muscle (fat fraction).
  • Although skeletal muscle symptoms are considered the defining characteristic of DMD, patients most commonly die of respiratory or cardiac failure.
  • DMD patients develop dilated cardiomyopathy (DCM) due to the absence of dystrophin in cardiomyocytes, which is required for contractile function. This leads to an influx of extracellular calcium, triggering protease activation, cardiomyocyte death, tissue necrosis, and inflammation, ultimately leading to accumulation of fat and fibrosis.
  • DCM dilated cardiomyopathy
  • This process first affects the left ventricle (LV), which is responsible for pumping blood to most of the body and is thicker and therefore experiences a greater workload.
  • LV left ventricle
  • Atrophic cardiomyocytes exhibit a loss of striations, vacuolization, fragmentation, and nuclear degeneration. Functionally, atrophy and scarring lead to structural instability and hypokinesis of the LV, ultimately progressing to general DCM.
  • DMD may be associated with various electrocardiogram changes such as sinus tachycardia, reduction of circadian index, decreased heart rate variability, short PR interval, right ventricular hypertrophy, S-T segment depression and prolonged QTc.
  • the methods provided herein can slow or arrest the progression of DMD and other muscular dystrophy, particularly to reduce the progression of or attenuate cardiac dysfunction and/or maintain or improve cardiac function. Efficacy may be monitored by periodic evaluation of signs and symptoms of cardiac involvement or heart failure that are appropriate for the age and disease stage of the trial population, using serial electrocardiograms (ECG), and serial noninvasive imaging studies (e.g., echocardiography or cardiac magnetic resonance imaging (CMR)).
  • ECG serial electrocardiograms
  • CMR cardiac magnetic resonance imaging
  • CMR may be used to monitor changes from baseline in forced vital capacity (FVC), forced expiratory volume (FEV1), maximum inspiratory pressure (MIP), maximum expiratory pressure (MEP), peak expiratory flow (PEF), peak cough flow, left ventricular ejection fraction (LVEF), left ventricular fractional shortening (LVFS), inflammation, and fibrosis.
  • ECG may be used to monitor conduction abnormalities and arrhythmias. In particular, ECG may be used to assess normalization of the PR interval, R waves in V1, Q waves in V6, ventricular repolarization, QS waves in inferior and/or upper lateral wall, conduction disturbances in right bundle branch, QTc, and QRS.
  • the present disclosure also provides a system or a pharmaceutical composition for use in the treatment of a dystrophin-associated disorder.
  • the present disclosure provides a first nucleic acid molecule and a second nucleic acid molecule, or a first AAV vector and a second AAV vector, for use in the treatment of a dystrophin-associated disorder.
  • the present disclosure provides an isolated nucleic acid molecule for use in the treatment of a dystrophin-associated disorder.
  • kits for conveniently and/or effectively carrying out methods of the present disclosure.
  • kits will comprise sufficient amounts and/or numbers of components to allow a user to perform multiple treatments of a subject(s) and/or to perform multiple experiments.
  • kits may further include reagents and/or instructions for creating and/or synthesizing compounds and/or compositions of the present disclosure.
  • kits may also include one or more buffers.
  • kits of the disclosure may include components for making protein or nucleic acid arrays or libraries and thus, may include, for example, solid supports.
  • kit components may be packaged either in aqueous media or in lyophilized form.
  • the container means of the kits will generally include at least one vial, test tube, flask, bottle, syringe or other container means, into which a component may be placed, and suitably aliquoted. Where there is more than one kit component (labeling reagent and label may be packaged together), kits may also generally contain second, third or other additional containers into which additional components may be separately placed. In some embodiments, kits may also comprise second container means for containing sterile, pharmaceutically acceptable buffers and/or other diluents. In some embodiments, various combinations of components may be comprised in one or more vials.
  • Kits of the present disclosure may also typically include means for containing compounds and/or compositions of the present disclosure, e.g., proteins, nucleic acids, and any other reagent containers in close confinement for commercial sale.
  • Such containers may include injection or blow-molded plastic containers into which desired vials are retained.
  • kit components are provided in one and/or more liquid solutions.
  • liquid solutions are aqueous solutions, with sterile aqueous solutions being particularly used.
  • kit components may be provided as dried powder(s). When reagents and/or components are provided as dry powders, such powders may be reconstituted by the addition of suitable volumes of solvent. In some embodiments, it is envisioned that solvents may also be provided in another container means.
  • kits may include instructions for employing kit components as well as the use of any other reagent not included in the kit. Instructions may include variations that may be implemented.
  • DNA expression plasmids were utilized for a cost-effective and higher throughput method of assessing the relative protein coding potential for various truncated Dystrophin construct designs.
  • the pcDNA3.1(+) expression plasmid was used for delivery of various truncated Dystrophin designs described in Table 6 above.
  • the pcDNA3.1(+) expression plasmid includes a CMV promoter and bovine growth hormone (bGH) polyA sequence, and a multicloning site located between these two elements where protein coding sequences and various regulatory elements can be inserted.
  • bGH bovine growth hormone
  • these expression plasmids were transfected into cells, including primary human skeletal muscle cells, and assessed for their protein coding ability via Western blotting (or similar techniques).
  • Those DNA expression plasmids that resulted in noteworthy or desirable levels of midi-Dystrophin protein production were then incorporated into the designs for AAV vectors and production (see Example 5: AAV Design and production).
  • midi-Dys sequences are divided into an N-terminal and C-terminal portion, linked to splice donor or acceptor and 3’ ribozyme or 5’ ribozyme sequences, and cloned into pcDNA3.1(+) expression plasmids which puts expression under the control of a CMV promoter.
  • the Ct plasmid also includes a 3xFLAG tag on the C-terminal end of the encoded midi-Dys, and an anti-FLAG antibody is used for detection of midi-Dys protein expression.
  • Shown below in Table 7 are the results of densitometry analysis (using ImageJ) of the Western blot results depicted in FIG. 4.
  • the relative densitometry intensities are normalized to the midi-Dys ΔH2-R15 protein expression from SEQ ID NOs 4 + 5 (bold).
  • Certain truncated midi-Dys are expressed more than 3x greater than the starting midi-Dys ΔH2-R15 sequence constructs (SEQ ID NOs 4 + 5).
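The normalization described above (each band intensity divided by the intensity of a chosen reference construct) can be sketched as follows. The lane names and intensity values in the usage note are hypothetical placeholders and are not data from Table 7; the function name is likewise an assumption.

```python
def normalize_to_reference(intensities, reference):
    """Return each band intensity as a fold-change relative to the reference lane.

    intensities : dict mapping lane name -> raw densitometry value (e.g. from ImageJ)
    reference   : name of the lane every other lane is normalized to
    """
    ref_value = intensities[reference]
    if ref_value <= 0:
        raise ValueError("reference intensity must be positive")
    return {lane: value / ref_value for lane, value in intensities.items()}
```

For example, with made-up values `{"reference": 1200.0, "construct_a": 3900.0}`, the reference lane maps to 1.0 and `construct_a` to 3.25, which is how a statement such as "more than 3x greater" would be read off the normalized table.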
  • the nucleic acid sequences used to encode a given truncated midi-Dys protein were optimized based on codon usage properties (so called “codon optimization”).
  • Multiple codon optimized versions of the gene encoding a midi-Dys Δ exon 18-41 truncated Dystrophin construct (SEQ ID NO: 221) were designed for further testing and characterization.
  • For codon optimization, we started with the human reference genome (hg19) version of a truncated Dys sequence (SEQ ID NO:260).
  • hg19 human reference genome version of a truncated Dys sequence
  • We then applied various human codon usage statistics to the sequence, based either on codon usage across the entire human genome or on codon usage in a highly expressed muscle transcript (the Titin gene).
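One simple way to apply codon usage statistics to a protein sequence is to pick, for each amino acid, the codon with the highest usage weight. The sketch below shows that strategy only; it is not presented as the procedure used to generate the sequences in the disclosure. The tiny usage table is a hypothetical placeholder with made-up weights, and a real run would substitute a full table derived from the genome-wide or transcript-specific statistics mentioned above.

```python
# Hypothetical placeholder weights for three amino acids; NOT real human codon usage.
TOY_USAGE = {
    "M": {"ATG": 1.0},
    "L": {"CTG": 0.4, "CTC": 0.2, "TTG": 0.1},
    "K": {"AAG": 0.6, "AAA": 0.4},
}


def back_translate(protein, usage):
    """Encode a protein using the highest-weight codon for every residue."""
    codons = []
    for residue in protein:
        if residue not in usage:
            raise KeyError(f"no codon usage entry for residue {residue!r}")
        table = usage[residue]
        # max() with key=table.get returns the codon carrying the largest weight.
        codons.append(max(table, key=table.get))
    return "".join(codons)
```

Swapping the usage table (genome-wide versus a single highly expressed transcript) changes the output sequence without changing the encoded protein, which is what allows several codon versions of one construct to be compared by Western blot.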
  • FIG. 6 shows Western blot results of cell lysates 24-to-48 hours after transfection of various codon optimized truncated midi-Dystrophins.
  • the MANDRA antibody is used to detect midi-Dystrophin protein expression, which detects an epitope included in all truncated midi-Dys designs as well as the larger full-length 427 kDa DMD protein (which is endogenously expressed by primary human skeletal muscle cells).
  • Table 9 shows the results of densitometry analysis (using ImageJ) of the Western blot results depicted in FIG. 6.
  • the relative densitometry intensities are normalized to the midi-Dys sequence derived from the reference human genome build hg19, SEQ ID NO: 260 (bold). Certain truncated midi-Dys codon optimized sequences result in greater than 100-fold higher protein expression compared to the hg19 sequence.
  • Table 9 Densitometry data from Figure 6.
  • nucleic acids encoding the truncated dystrophin proteins are divided into two portions, an N-terminal (Nt) portion and a C-terminal (Ct) portion, where each portion is inserted into a separate vector for expression.
  • DNA expression plasmids can be used in lieu of AAV vectors for in vitro testing.
  • the divided Nt and Ct sequences are cloned into a suitable DNA expression plasmid and then transfected into an appropriate cell line in vitro.
  • the resulting transfected cells can then be studied for the amount of resulting RNA and/or protein produced by the DNA plasmid(s).
  • the protein-coding nucleic acids flanking the Nt SD intron and Ct SA intron are critical for efficient splicing to occur.
  • split sites were selected by first finding locations in the Dystrophin encoding nucleic acid sequence that either (1) had a splice site consensus nucleotide sequence, a well-established pattern found at the boundaries of exons and introns in pre-mRNA, or (2) could be changed into a splice site consensus sequence without altering the amino acid sequence using degenerate codons for the specified amino acid sequence. In practice, this makes certain amino acid motifs more amenable to split site selection than others, on the basis that the nucleic acid sequence optimal for splicing is needed.
  • the protein-coding nucleic acid at the 3’ end of the Nt Vector may be a “G”, while additional nucleic acids are likely to confer additional influence on the efficiency of splicing.
  • “CAG”, “AAG” and “TAG” are enriched at the 3’ end of the 5’ exon in the human genome.
  • a 5’-G on the 3’ exon is also enriched, as are “GT” and “GTT” sequences.
  • the protein coding nucleic acids at the 5’ end of the Ct vector may be a “G”, “GT”, “GA”, or “GTT”.
  • the following amino acid motifs in Dystrophin can be considered as a potential split site: QV, SG, PG, TG, AG, FRF, FRL, FRS, FRY, FRC, FRW, SRF, SRL, SRS, SRY, SRC, SRW, YRF, YRL, YRS, YRY, YRC, YRW, CRF, CRL, CRS, CRY, CRC, CRW, LRF, LRL, LRS, LRY, LRC, LRW, PRF, PRL, PRS, PRY, PRC, PRW, HRF, HRL, HRS, HRY, HRC, HRW, RRF, RRL, RRS, RRY, RRC, RRW, IRF, IRL, IRS, IRY, IRC, IRW, TRF,
  • split sites may be at any nucleic acid position of a codon, either before the first nucleic acid for a given codon, after the first nucleic acid for a given codon, after the second nucleic acid of a given codon, or after the third nucleic acid for a given codon.
  • arginine (“R”) and methionine (“M”) which can be encoded by an “AGG” and “ATG”, respectively, can be utilized for split sites after the second nucleic acid of their codon sequences, providing a 3’ “AG” or “AT” in the Nt vector protein coding sequence as well as a 5’ “G” in the Ct vector protein coding sequence.
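The nucleotide-level rule described above (a 5’ exon ending in “CAG”, “AAG” or “TAG”, followed by a 3’ exon beginning with “G”) can be scanned for directly in a coding sequence. The sketch below reports each candidate together with its position within the codon, matching the four cases listed above. It only finds exact matches already present in the sequence; candidates that would require a synonymous codon change are not searched for, and the function name is hypothetical.

```python
DONOR_EXON_ENDS = ("CAG", "AAG", "TAG")  # enriched 3' ends of the 5' exon per the text


def candidate_split_sites(cds):
    """Return (split_index, phase) pairs for exact consensus matches in a CDS.

    split_index : number of nucleotides that would remain in the Nt portion
    phase       : split_index % 3 (0 = between codons, 1 = after the first
                  nucleotide of a codon, 2 = after the second nucleotide)
    """
    cds = cds.upper()
    sites = []
    for i in range(3, len(cds)):
        # Last three Nt-side bases must match a donor motif; first Ct-side base must be G.
        if cds[i - 3:i] in DONOR_EXON_ENDS and cds[i] == "G":
            sites.append((i, i % 3))
    return sites
```

For instance, `candidate_split_sites("ATGCAGGTT")` returns `[(6, 0)]`: the split falls between the second and third codons, leaving “CAG” at the 3’ end of the Nt portion and “GTT” at the 5’ end of the Ct portion.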
  • split sites that allow for optimized distribution of a portion of the midi-dystrophin encoding sequence to the Nt vector and Ct vector are also required, as those splice sites result in Nt and Ct sequences whose sizes are amenable to packaging into an AAV vector when combined with the other regulatory elements required for AAV function (e.g., promoters, polyadenylation sequence, etc.).
  • the present designs it was found that split sites between the R17 to R24 subdomains of the dystrophin protein were ideal for distributing the nucleic acid sequence equitably between the Nt and Ct vectors, particularly the R20 subdomain.
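The size consideration above can be checked with simple arithmetic: each vector carries its share of the coding sequence plus its own regulatory elements, and both totals have to fit the packaging limit. The sketch below is illustrative only. All lengths are hypothetical, and the packaging capacity is supplied as a parameter because the text does not state a figure; roughly 4.7 kb is a commonly cited approximate limit for AAV and is used in the usage note purely as an assumption.

```python
def split_balance(cds_length, split_index, nt_overhead, ct_overhead, capacity):
    """Summarize how a split site distributes payload between the Nt and Ct vectors.

    cds_length  : length of the full truncated-dystrophin coding sequence (nt)
    split_index : number of coding nucleotides assigned to the Nt vector
    nt_overhead : non-coding elements on the Nt vector (ITRs, promoter, etc.) in nt
    ct_overhead : non-coding elements on the Ct vector in nt
    capacity    : assumed packaging limit per vector in nt
    """
    if not 0 <= split_index <= cds_length:
        raise ValueError("split_index must lie within the coding sequence")
    nt_total = split_index + nt_overhead
    ct_total = (cds_length - split_index) + ct_overhead
    return {
        "nt_total": nt_total,
        "ct_total": ct_total,
        "fits": nt_total <= capacity and ct_total <= capacity,
        "imbalance": abs(nt_total - ct_total),
    }
```

With made-up numbers, `split_balance(7000, 3500, 1100, 1000, 4700)` gives totals of 4600 and 4500 nt, so both vectors fit, whereas moving the split to 4000 pushes the Nt vector to 5100 nt and the pair no longer fits.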
  • Table 10 Various split sites generated within truncated Dystrophin proteins
  • the location of the split site is indicated either by an underlined amino acid in cases where the split site is located in nucleic acids with the tripartite nucleotide sequence for the indicated amino acid, or as a hyphen when the split site occurs at the nucleic acids in between the tripartite nucleic acid sequences for two amino acids. ** In the nucleic acid sequences provided, the location of the split site is indicated by a hyphen.
  • Because the variable regions for the truncated Dystrophins described in Table 6 can be designed in an N-terminal (Nt) vector, all of the designed constructs to test these various truncated dystrophins can utilize the same C-terminal (Ct) vector, including the same split site for dividing the truncated Dystrophin protein into two vectors. This allows for streamlined screening of truncated Dystrophin designs that can minimize other variables in the testing. Optimizing split sites for a given truncated Dystrophin can improve the efficiency, and therefore the protein expression, of the dual plasmid or dual vector design.
  • the truncated dystrophin protein (midi-Dys Δ exon 13-41, SEQ ID NO: 95) can be divided into a Nt portion encoded by a first coding region comprising the sequence of SEQ ID NO: 204, and a Ct portion encoded by a second coding region comprising the sequence of SEQ ID NO: 205.
  • the truncated dystrophin protein (midi-Dys Δ exon 13-41, SEQ ID NO: 95) can be divided into a Nt portion encoded by a first coding region comprising the sequence of SEQ ID NO: 206, and a Ct portion encoded by a second coding region comprising the sequence of SEQ ID NO: 207.
  • the truncated dystrophin protein (midi-Dys Δ exon 13-41, SEQ ID NO: 95) can be divided into a Nt portion encoded by a first coding region comprising the sequence of SEQ ID NO: 208, and a Ct portion encoded by a second coding region comprising the sequence of SEQ ID NO: 209.
  • the truncated dystrophin protein (midi-Dys ΔH2-R15, SEQ ID NO: 86) can be divided into a Nt portion encoded by a first coding region comprising the sequence of SEQ ID NO: 210, and a Ct portion encoded by a second coding region comprising the sequence of SEQ ID NO: 211.
  • the truncated dystrophin protein (midi-Dys Δ exon 17-41, SEQ ID NO: 101) was divided into a Nt portion encoded by a first coding region comprising the sequence of SEQ ID NO: 212, and a Ct portion encoded by a second coding region comprising the sequence of SEQ ID NO: 213.
  • FIG. 5 depicts the results of utilizing different split sites (SS) described above in Table
  • FIG. 5 shows Western blot results of cell lysates 24-to-48 hours after transfection of various Nt and Ct combinations to express truncated midi-Dystrophins.
  • the MANDRA antibody was used to detect midi-Dystrophin protein expression, which detects an epitope included in all truncated midi-Dys designs as well as the full-length 427 kDa DMD protein (which is endogenously expressed by primary human skeletal muscle cells).
  • The corresponding Ct segment of the midi-Dys ΔH2-R15 (SEQ ID NO: 5) was linked to a ribozyme (SEQ ID NO: 159) and splice acceptor sequence (SEQ ID NO: 137) on its 5’ end and a stop codon or a 3xFLAG tag plus a stop codon on its 3’ end, and subcloned into the pcDNA3.1(+) plasmid, which created SEQ ID NO: 246 (SEQ ID NO: 246 includes the 3xFLAG tag to enable FLAG-based detection of truncated dystrophin expression). This same strategy is applied to any Nt and Ct segment for testing via transfection of DNA expression plasmids.
  • This strategy is not limited to the pcDNA3.1(+) expression plasmid.
  • Any suitable backbone plasmid can be used, which could vary the promoter, polyA sequence, and other regulatory elements.
  • the plasmid can be an AAV cis-plasmid which contains the ITR sequences necessary for DNA packaging into AAV particles during AAV production.
  • DNA expression plasmids containing the entire midi-Dystrophin encoding sequence, or the N-terminal (Nt) portion and the C-terminal (Ct) portion for each of the truncated dystrophin proteins, as described in Examples 1 and 2 were designed and prepared.
  • truncated dystrophins can also be established via determining if proper interactions with other DAPC members occur in cultured skeletal muscle cells in vitro, such as by co-immunoprecipitation experiments that establish truncated Dystrophin interactions with alpha-Dystrobrevin, another key DAPC member.
  • AAV vectors (cis plasmids) expressing either the N-terminal (Nt) portion or the C-terminal (Ct) portion of each of the truncated dystrophin proteins, as described in Example 1, are designed.
  • the Nt vector comprises, in 5’ to 3’ order, a 5’ ITR sequence, a promoter sequence, a first coding region encoding an N-terminal portion of the truncated dystrophin protein, an intron splice donor sequence, a 3’ ribozyme (also referred to as “Nt Ribozyme” as shown below), and a 3’ ITR sequence.
  • the Nt vector may also comprise a polyadenylation sequence and/or an intron region.
  • the Ct vector comprises, in 5’ to 3’ order, a 5’ ITR sequence, a promoter sequence, a 5’ ribozyme (also referred to as “Ct Ribozyme” as shown below), an intron splice acceptor sequence, a second coding region encoding a C-terminal portion of the truncated dystrophin protein, and a 3’ ITR sequence.
  • the Ct vector may also comprise a polyadenylation sequence and/or a post transcriptional regulatory element (e.g., WPRE).
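The 5’ to 3’ element order given above for the two vectors can be captured as ordered lists and used to assemble a construct from named parts. This is a sketch for illustration: the part sequences in the usage note are short placeholders, the function name is hypothetical, and the optional elements (polyadenylation sequence, intron region, WPRE) are omitted because their positions are not specified in the passage.

```python
# Required elements in 5' -> 3' order, as listed in the text.
NT_VECTOR_LAYOUT = [
    "5' ITR", "promoter", "Nt coding region",
    "splice donor", "3' ribozyme", "3' ITR",
]
CT_VECTOR_LAYOUT = [
    "5' ITR", "promoter", "5' ribozyme",
    "splice acceptor", "Ct coding region", "3' ITR",
]


def assemble_vector(layout, parts):
    """Concatenate part sequences in layout order, failing on any missing element."""
    missing = [name for name in layout if name not in parts]
    if missing:
        raise KeyError(f"missing elements: {missing}")
    return "".join(parts[name] for name in layout)
```

Calling `assemble_vector(NT_VECTOR_LAYOUT, parts)` with a dictionary of placeholder sequences returns them joined in the stated order, and a missing element raises an error instead of silently producing an incomplete construct.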
  • AAV vectors expressing the Nt portion and the Ct portion of the truncated dystrophin protein are shown below.
  • the designed AAV vectors can be directly transfected into cells in vitro as DNA plasmids, or they can be packaged into recombinant AAVs (rAAVs) of any given serotypes. These rAAV particles can then be exposed to cells (for transduction) or animals and further characterized.
  • the cis-plasmid containing the intended AAV genome can either be directly transfected into cells in vitro as a DNA plasmid or used in the process of rAAV packaging to produce AAV vectors containing the intended AAV genome. These rAAV particles can then be used to transduce cells in vitro or administered to animals to assess the effects of transduction in on-target as well as off-target tissues.
  • AAV designs described above were delivered to primary human skeletal muscle cells in vitro via DNA-based transfection of the cis-plasmid. These designs included the starting dual vector configuration (SEQ ID NOs 301 + 302) to deliver midi-Dys ΔH2-R15 protein (SEQ ID NO: 86) as well as an optimized sequence version (SEQ ID NOs 314 + 309) to deliver midi-Dys ΔH2-R15 protein (SEQ ID NO: 86).
  • FIG. 7 depicts the results of DNA-based transfection of AAV cis-plasmids with one representative dual vector configuration (SEQ ID NO 314 + 309) based on the results of codon optimization, split site selection, and re-configuration of other regulatory elements presented in Examples 1, 2 and 3. This dual vector combination was compared to the starting dual vector configuration (SEQ ID NO 301 + 302).
  • FIG. 7 shows Western blot results of cell lysates 24-to-48 hours after transfection, indicating the levels of midi-Dys ΔH2-R15 protein (SEQ ID NO: 86) obtained.
  • the optimized dual vector combination led to significantly higher expression of truncated midi-Dystrophin compared to the starting design (SEQ ID NO 301 + 302).
  • the relative densitometry intensities were normalized to the midi-Dys ΔH2-R15 codon V1 split site #1 configuration (SEQ ID NOs 301 + 302, bold) and presented in Table 12 below.
  • the codon and split site optimized midi-Dys ΔH2-R15 (SEQ ID NOs 314 + 309) expressed 294x higher levels of protein than the starting midi-Dys (SEQ ID NOs 301 + 302) based on densitometry.
  • Cells such as HEK293T cells, primary human skeletal muscle cells (from either healthy or DMD patients), or immortalized muscle cells (such as C2C12 cells) are co-transduced with varying amounts of Nt and Ct AAV vector(s). RT-PCR and Western blotting are performed to evaluate the resulting RNA transcripts and truncated proteins, respectively. Co-transduction of the Nt- and the Ct-vectors results in detectable protein corresponding to the expected length of the correctly combined Nt- and Ct-portion of the truncated dystrophin protein in the cells.
  • Antibodies to detect dystrophin protein can be directed at epitopes originating from the Nt vector sequence (such as DysB, Dys3, DYS-3241, etc.) or the Ct vector (Dys2, MANDRA1, etc.), or portions of the dystrophin that may be encoded by either the Nt or Ct vector depending on the exact design (MANDYS106). It is expected that optimized Nt and Ct combinatorial conditions based on optimized truncated dystrophins that further utilized the codon and split site optimizations described in the examples above will result in high levels of truncated dystrophin expression.
  • the function and activity of the truncated dystrophin protein is evaluated in vitro.
  • DAPC Dystrophin-associated protein complex
  • Because dystrophin protein primarily serves the function of a structural component of DAPC, co-localization with other DAPC members is indicative that functional activity of dystrophin has been restored.
  • Co-immunoprecipitation experiments further establish direct interactions between truncated dystrophin and other DAPC members.
  • midi-dystrophin proteins in vitro are further ascertained via their effect on cell stimulation or contractile properties in either myotubes or cardiomyocytes, where midi-dystrophin may enhance or restore function in healthy-donor or DMD-patient derived cells. It has been established that cells from healthy donors exhibit specific, reproducible and quantifiable stimulation and contractile properties that can be studied in vitro. These phenotypes can be further perturbed by administration of tetanus contractions or myosin ATPase inhibitors, and the impact of treatment can be characterized.
  • cell lines from healthy-donors where the dystrophin gene has been knocked-out, are created to allow functional characterization of the impact of a midi-dystrophin transgene in a null background. Measures such as twitch force, contraction energy, contraction velocity, time to peak contraction, relaxation velocity, time to 50% relaxation of the peak force, time to 90% relaxation of the peak force, % force over time (“fatigue”), force-frequency relationships, tissue stiffness, tissue strain, and Calcium flux have all been characterized in vitro and shown to be impacted by the presence, absence, and augmentation of dystrophin protein levels, and these measures will thus be evaluated for the midi-dystrophin proteins.
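Several of the contractile measures listed above (twitch force, time to peak contraction, time to 50% and 90% relaxation) can be read off a sampled force trace. The following is a minimal sketch under stated assumptions: the first sample is taken as both the baseline force and the stimulus time, relaxation times are measured from the peak, and values are reported at sample resolution without interpolation. The function name and the example trace are hypothetical.

```python
def twitch_metrics(times, forces):
    """Compute basic twitch kinetics from paired time and force samples."""
    if len(times) != len(forces) or len(forces) < 2:
        raise ValueError("need matching time/force samples")
    baseline = forces[0]  # assumption: trace starts at rest
    peak_idx = max(range(len(forces)), key=forces.__getitem__)
    amplitude = forces[peak_idx] - baseline

    def time_to_relax(fraction):
        # First post-peak sample at which force has dropped by `fraction` of amplitude.
        threshold = baseline + amplitude * (1.0 - fraction)
        for t, f in zip(times[peak_idx:], forces[peak_idx:]):
            if f <= threshold:
                return t - times[peak_idx]
        return None  # trace ended before reaching the threshold

    return {
        "twitch_force": amplitude,
        "time_to_peak": times[peak_idx] - times[0],
        "time_to_50_relax": time_to_relax(0.5),
        "time_to_90_relax": time_to_relax(0.9),
    }
```

Applied to a toy trace with times `[0, 1, 2, 3, 4, 5]` and forces `[0, 5, 10, 6, 4, 0.5]`, the peak of 10 occurs at time 2, force first falls to half the peak at the sample two time units later, and it falls to within 10% of baseline three time units after the peak.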
  • AAV vectors containing the coding region for the N-terminal (Nt) portion and the C-terminal (Ct) portion for each of the truncated dystrophin proteins, as described in Example 1, are designed.
  • the Nt and Ct AAV vector pairs are prepared to express RNAs that when joined via the ribozyme technology and spliced, encode a truncated but highly functional human dystrophin protein.
  • the expression of each vector is under the control of a muscle-specific promoter.
  • Each pair of the Nt- and Ct-AAV vectors is packaged separately in an AAV capsid, e.g., a muscle-tropic AAV capsid, such as AAV6, AAV9, AAVrh74, AAV-MYO, AAV-MYO2, AAV-MYO3, MYO3A-AAV, MYO4A-AAV, or MYO4E-AAV, to generate two AAV particles. These particles are mixed and then dosed.
  • the mixture may be a 1:1 ratio of Nt to Ct vector, but can also be, for example, 2:1, 3:1, 4:1, or 1:2, 1:3, 1:4.
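The Nt:Ct mixing ratios listed above translate directly into per-vector amounts once a total is chosen. The sketch below shows the arithmetic; the total dose used in the usage note is a hypothetical number for illustration, since no dose is given in this passage, and the function name is an assumption.

```python
def split_dose(total_vg, nt_parts, ct_parts):
    """Divide a total vector-genome amount between Nt and Ct particles by ratio.

    total_vg           : total vector genomes to deliver (hypothetical input)
    nt_parts, ct_parts : the two sides of the ratio, e.g. 2 and 1 for 2:1
    """
    if nt_parts <= 0 or ct_parts <= 0:
        raise ValueError("ratio parts must be positive")
    nt_vg = total_vg * nt_parts / (nt_parts + ct_parts)
    return nt_vg, total_vg - nt_vg
```

For a made-up total of 3e13 vector genomes at a 2:1 ratio, this yields 2e13 of the Nt particle and 1e13 of the Ct particle; at 1:1 the total is split evenly.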
  • Dystrophin-KO mice, which display an age-dependent disease phenotype characterized by degenerating and regenerating myofibers, necrosis and fibrosis, are injected with the Nt- and Ct-AAV particle pair.
  • Other mouse models of DMD exist and can be tested, including but not limited to C56B16 mdx (or “mdx”), mdx4cv, and mdx/utrn-/-.
  • Biomarkers of muscle health are assessed via serum collection and analysis. Serum creatine kinase, ALT, AST, and an N-terminal fragment of Titin are all biomarkers used to assess muscle health/damage and are predictive of dystrophin transgene function in mdx mice.
  • One month post-injection, a timepoint at which severe myofiber degeneration occurs in D2-mdx mice, muscle histology is assessed. It is expected that the injected mice have preserved muscle architecture, similar to saline-injected age-matched wild-type mice, and in contrast to the degeneration observed in saline-injected D2-mdx control littermates.
  • Timepoints post injection can be extended for longer periods of time (for example, 6 weeks, 8 weeks, 3 months, 6 months, 9 months, or 12 months) to measure durability of the effect of treatment.
  • In vivo characterization can be extended to other animal models, including but not limited to canine models (Golden Retriever Muscular Dystrophy [GRMD] dogs or ΔE50 muscular dystrophy [DE50-MD] dogs) or non-human primates. Similar characterizations can be performed as described above, although each model may demonstrate different genotypes and phenotypes to take into consideration for each particular assay.
  • the human amino acid sequence of the midi-Dystrophin transgene is often exchanged for the corresponding canine sequence to reduce the immunogenicity risk of expression of the transgene (for example, see: Kodippili K, Hakim CH, Pan X, Yang HT, Yue Y, Zhang Y, Shin JH, Yang NN, Duan D. Dual AAV Gene Therapy for Duchenne Muscular Dystrophy with a 7-kb Mini-Dystrophin Gene in the Canine Model. Hum Gene Ther. 2018 Mar;29(3):299-311).
  • midi-Dystrophin can be established by restoration of the missing dystrophin-associated glycoprotein complex (DAPC). Function can be further established by establishing reductions in muscle degeneration and fibrosis as well as improved myofiber size distribution. In addition, function can be established by testing midi-dystrophin’s ability to protect muscle from eccentric contraction-induced force loss.
  • DAPC dystrophin-associated glycoprotein complex
  • midi-Dystrophins are delivered that include a FLAG tag, which enables FLAG detection as a means to identify the expression of the midi-dystrophin transgene. Function of the midi-dystrophin can be

Abstract

The present disclosure provides gene constructs that encode a truncated dystrophin protein. The present disclosure also provides systems and methods for delivering a truncated dystrophin protein in a subject, and methods for treating a subject having a dystrophin-associated disease or disorder, e.g., muscular dystrophy, e.g., Duchenne muscular dystrophy (DMD).

Description

DYSTROPHIN CONSTRUCTS AND METHODS OF USE THEREOF
RELATED APPLICATION
[0001] This application claims the benefit of priority to U.S. Provisional Application No. 63/639,556, filed on April 26, 2024, the entire contents of which are incorporated herein by reference.
SEQUENCE LISTING
[0002] The instant application contains a Sequence Listing which has been submitted electronically in XML format and is hereby incorporated by reference in its entirety. Said XML copy, created on April 23, 2025, is named 135388-00820.xml and is 1,500,856 bytes in size.
BACKGROUND OF THE INVENTION
[0003] Muscular’ dystrophy (MD) is a group of genetic diseases characterized by progressive weakness and degeneration of muscle mass. Some forms of MD present symptoms at birth or develop during childhood, while others may not appear until middle age or later. The disorders differ in terms of the distribution and extent of muscle weakness, age of onset, rate of progression, and pattern of inheritance.
[0004] Duchenne muscular dystrophy (DMD) is the most common form of MD. The disease occurs due to a defective DMD gene that results in absence of dystrophin, a protein that is involved in maintaining the integrity of muscle. As a result of this genetic defect, individuals with DMD may have symptoms such as trouble walking and running, falling frequently, fatigue, learning disabilities/difficulties, heart issues as a result of impact on heart muscle functioning, and breathing problems due to weakening of respiratory muscles involved in lung function. Symptoms of muscle weakness associated with DMD typically begin in childhood, often between 3 to 6 years of age. DMD mainly affects males and in rare cases may affect females. About one in every 3,500 boys are affected by this disorder. As the disease progresses, lifethreatening heart and respiratory problems can occur.
[0005] The DMD gene is one of the largest known human genes. Its largest isoform contains 79 exons and encodes a 427 kDa dystrophin protein. The extremely large size of the gene contributes to a complex mutational spectrum, with more than 7,000 different mutations and a high spontaneous mutation rate. The most severe phenotype associated with DMD is most often caused by out-of-frame mutations, resulting in complete loss of dystrophin protein expression. In-frame mutations that allow for the synthesis of an internally truncated but partially functional protein are associated with a milder phenotype known as Becker muscular dystrophy (BMD).
[0006] Current treatments for DMD focus on managing symptoms, including corticosteroid medications to slow down the progression of muscle weakness, stretching and exercise programs, and use of equipment such as braces or a wheelchair as walking becomes more difficult. Exon-skipping therapies, which use antisense oligonucleotides to restore small amounts of dystrophin in specific subsets of DMD patients with a genetic phenotype amenable to the skipping strategy, are also used but require repeated administration. Furthermore, gene therapy strategies to replace the missing dystrophin have been limited by an inability to efficiently deliver the large and complex dystrophin gene sequence.
[0007] Accordingly, there remains an unmet need to develop new therapies to treat muscular dystrophies and to ameliorate deficiencies in patients afflicted with DMD-associated disorders caused by mutations in the dystrophin gene.
SUMMARY OF THE INVENTION
[0008] The present disclosure provides isolated recombinant nucleic acid molecules encoding truncated dystrophin proteins that can be delivered and expressed in a subject using a dual adeno-associated virus (AAV) vector system, allowing expression of truncated dystrophin proteins that are otherwise too large to fit into a single AAV system. The truncated dystrophin proteins can be used to restore dystrophin expression and function in place of wild-type dystrophin in a subject in need thereof. The present disclosure also provides systems and methods for expressing or delivering a truncated dystrophin protein in a subject, and methods for treating a subject having a dystrophin-associated disease or disorder, e.g., muscular dystrophy, e.g., Duchenne muscular dystrophy (DMD). In some embodiments, the AAV vectors can be delivered to a subject in need thereof, e.g., a subject having a dystrophin-associated disease or disorder, e.g., muscular dystrophy, e.g., DMD, to produce a significant level of a functional truncated dystrophin, and to protect muscle fibers from injury, increase muscle strength, and reduce and/or prevent fibrosis in the subject. The AAV vectors can also be used for increasing muscular force and/or increasing muscle mass in order to address the gene defect observed in DMD patients.
[0009] Accordingly, in one aspect, the present invention is directed to a system for generating a truncated human dystrophin protein, comprising a first recombinant nucleic acid molecule and a second recombinant nucleic acid molecule, wherein the first nucleic acid molecule comprises a first coding region encoding an N-terminal portion of the truncated dystrophin protein and a 3’ ribozyme, wherein the first coding region is operably linked to the 3’ ribozyme at its 3’ end, wherein the second nucleic acid molecule comprises a second coding region encoding a C-terminal portion of the truncated dystrophin protein and a 5’ ribozyme, wherein the second coding region is operably linked to the 5’ ribozyme at its 5’ end, wherein upon ribozyme-mediated catalytic ligation (“trans-ligation”), the first coding region and the second coding region form a third coding region encoding the complete truncated human dystrophin protein, and wherein the truncated human dystrophin protein comprises at least 1640 amino acids.
[0010] In some embodiments, the first coding region is operably linked to two or more 3’ ribozymes at its 3’ end. In some embodiments, the two or more 3’ ribozymes are the same 3’ ribozyme. In some embodiments, the two or more 3’ ribozymes are different 3’ ribozymes.
[0011] In some embodiments, the second coding region is operably linked to two or more 5’ ribozymes at its 5’ end. In some embodiments, the two or more 5’ ribozymes are the same 5’ ribozyme. In some embodiments, the two or more 5’ ribozymes are different 5’ ribozymes.
[0012] In some embodiments, the 5’ ribozyme and the 3’ ribozyme are each independently selected from the group consisting of Twister (Osa), Twister (Dre), Twister (Nvi), Twister (Sbi), Twister (Envl), Twister (Spu), Twister (Cpa), Twister Sister, Hammerhead (RzB), HDV, Pistol, Varkud Satellite (VS), Hatchet, Hairpin, and Hovlinc (Hov).
[0013] In some embodiments, the 5’ ribozyme and the 3’ ribozyme are each independently selected from the group consisting of SEQ ID NOs: 6 - 20.
[0014] In some embodiments, the first nucleic acid molecule further comprises an intron splice donor sequence, and the second nucleic acid molecule further comprises an intron splice acceptor sequence.
[0015] In some embodiments, the splice donor sequence is positioned between the first coding region and the 3’ ribozyme.
[0016] In some embodiments, the splice donor sequence is selected from the group consisting of SEQ ID NOs: 133 - 136.
[0017] In some embodiments, the splice donor sequence is positioned within a region of the truncated dystrophin protein coding for a region selected from the group consisting of the R7 domain, the R8 domain, the R9 domain, the R10 domain, the R11 domain, the R12 domain, the R13 domain, the R14 domain, the R15 domain, the R16 domain, the R17 domain, the R18 domain, the R19 domain, the H3 domain, the R20 domain, the R21 domain, and the R22 domain.
[0018] In some embodiments, the splice donor sequence is positioned within a region of the truncated dystrophin protein coding for a region selected from the group consisting of the R8 domain, the R19 domain, the H3 domain, the R20 domain, and the R21 domain.
[0019] In some embodiments, the splice donor sequence is not positioned within the R21 domain.
[0020] In some embodiments, the splice donor sequence is positioned at least 10 nucleotides, at least 15 nucleotides, at least 20 nucleotides, at least 30 nucleotides, at least 40 nucleotides, at least 50 nucleotides, or more from the 3’ ribozyme.
[0021] In some embodiments, the splice acceptor sequence is positioned between the 5’ ribozyme and the second coding region.
[0022] In some embodiments, the splice acceptor sequence is selected from the group consisting of SEQ ID NOs: 137-141.
[0023] In some embodiments, the splice acceptor sequence is positioned at least 10 nucleotides, at least 15 nucleotides, at least 20 nucleotides, at least 30 nucleotides, at least 40 nucleotides, at least 50 nucleotides, or more from the 5’ ribozyme.
[0024] In some embodiments, the splice donor sequence and the splice acceptor sequence are positioned such that the resulting spliced intron is between 50 and 200 bp in length.
[0025] In some embodiments, the splice donor sequence and splice acceptor sequence are positioned such that the resulting spliced intron encodes a single predominant reading frame.
[0026] In some embodiments, a stop codon sequence is introduced into the splice donor sequence or the splice acceptor sequence.
[0027] In some embodiments, at least one of the first coding region and the second coding region is at least 2000 nucleotides in length, or at least 2200 nucleotides in length, or at least 2400 nucleotides in length, or at least 2600 nucleotides in length.
[0028] In some embodiments, the first coding region and the second coding region are each at least 2000 nucleotides in length, or at least 2200 nucleotides in length, or at least 2400 nucleotides in length, or at least 2600 nucleotides in length.
[0029] In some embodiments, the third coding region is at least 4920 nucleotides in length, or at least 5100 nucleotides in length, or at least 5300 nucleotides in length.
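The length thresholds recited above are arithmetically consistent: a protein of at least 1640 amino acids ([0009]) requires a coding region of at least 1640 x 3 = 4920 nucleotides ([0029]). The following is a minimal sketch of that conversion. The function names are hypothetical, and the assumption that the recited nucleotide lengths exclude a stop codon is an inference from the arithmetic, not a statement in the disclosure.

```python
CODON_LENGTH = 3  # nucleotides per codon


def min_coding_nucleotides(amino_acids: int, include_stop: bool = False) -> int:
    """Return the nucleotides needed to encode `amino_acids` residues.

    `include_stop` adds one codon for a stop codon (assumption: the
    thresholds in the text are taken without it).
    """
    if amino_acids < 0:
        raise ValueError("amino_acids must be non-negative")
    codons = amino_acids + (1 if include_stop else 0)
    return codons * CODON_LENGTH


def max_encoded_amino_acids(nucleotides: int) -> int:
    """Return the number of complete codons in a coding region of this length."""
    if nucleotides < 0:
        raise ValueError("nucleotides must be non-negative")
    return nucleotides // CODON_LENGTH


if __name__ == "__main__":
    # 1640 amino acids correspond to the 4920-nucleotide threshold.
    print(min_coding_nucleotides(1640))
    # Residue counts implied by the larger thresholds of 5100 and 5300 nucleotides.
    print(max_encoded_amino_acids(5100), max_encoded_amino_acids(5300))
```

Under the same assumption, the 5100- and 5300-nucleotide thresholds correspond to 1700 and 1766 complete codons, respectively.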
[0030] In some embodiments, the first coding region and the second coding region do not share a region of substantial sequence identity.
[0031] In some embodiments, the 3’ end of the first coding region does not have a sequence identity to the 5’ end of the second coding region.
[0032] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0033] In some embodiments, the truncated dystrophin protein further comprises H1 Domain (SEQ ID NO: 22).
[0034] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence selected from the group consisting of: a. midi-Dys A R1-R15 (SEQ ID NO: 83), b. midi-Dys A R2-R15 (SEQ ID NO: 84), c. midi-Dys A R3-R15 (SEQ ID NO: 85), d. midi-Dys A H2-R15 (SEQ ID NO: 86), e. midi-Dys A R4-R15 (SEQ ID NO: 87), f. midi-Dys A R5-R15 (SEQ ID NO: 88), g. midi-Dys A exon 13-33 (SEQ ID NO: 93), h. midi-Dys A exon 13-39 (SEQ ID NO: 94), i. midi-Dys A exon 13-41 (SEQ ID NO: 95), j. midi-Dys A exon 13-48 (SEQ ID NO: 96), k. midi-Dys A exon 15-39 (SEQ ID NO: 97), l. midi-Dys A exon 15-41 (SEQ ID NO: 98), m. midi-Dys A exon 15-48 (SEQ ID NO: 99), n. midi-Dys A exon 17-39 (SEQ ID NO: 100), o. midi-Dys A exon 17-41 (SEQ ID NO: 101), p. midi-Dys A exon 17-48 (SEQ ID NO: 102), q. midi-Dys A exon 18-39 (SEQ ID NO: 220), r. midi-Dys A exon 18-41 (SEQ ID NO: 221), s. midi-Dys A exon 18-48 (SEQ ID NO: 222), t. midi-Dys A exon 19-39 (SEQ ID NO: 103), u. midi-Dys A exon 19-41 (SEQ ID NO: 104), v. midi-Dys A exon 19-48 (SEQ ID NO: 105), w. midi-Dys A exon 21-41 (SEQ ID NO: 106), x. midi-Dys A exon 21-42 (SEQ ID NO: 223), and y. midi-Dys A exon 21-48 (SEQ ID NO: 107).
[0035] In some embodiments, the truncated dystrophin protein further comprises R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), and R19 Domain (SEQ ID NO:42).
[0036] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence selected from the group consisting of a. midi-Dys A R1-R15 (SEQ ID NO: 83), b. midi-Dys A R2-R15 (SEQ ID NO: 84), c. midi-Dys A R3-R15 (SEQ ID NO: 85), d. midi-Dys A H2-R15 (SEQ ID NO: 86), e. midi-Dys A R4-R15 (SEQ ID NO: 87), f. midi-Dys A R5-R15 (SEQ ID NO: 88), g. midi-Dys A exon 10-33 (SEQ ID NO: 89), h. midi-Dys A exon 10-39 (SEQ ID NO: 90), i. midi-Dys A exon 10-41 (SEQ ID NO: 91), j. midi-Dys A exon 11-33 (SEQ ID NO: 216), k. midi-Dys A exon 11-39 (SEQ ID NO: 217), l. midi-Dys A exon 11-41 (SEQ ID NO: 218), m. midi-Dys A exon 13-33 (SEQ ID NO: 93), n. midi-Dys A exon 13-39 (SEQ ID NO: 94), o. midi-Dys A exon 13-41 (SEQ ID NO: 95), p. midi-Dys A exon 15-39 (SEQ ID NO: 97), q. midi-Dys A exon 15-41 (SEQ ID NO: 98), r. midi-Dys A exon 17-39 (SEQ ID NO: 100), s. midi-Dys A exon 17-41 (SEQ ID NO: 101), t. midi-Dys A exon 18-39 (SEQ ID NO: 220), u. midi-Dys A exon 18-41 (SEQ ID NO: 221), v. midi-Dys A exon 19-39 (SEQ ID NO: 103), w. midi-Dys A exon 19-41 (SEQ ID NO: 104), x. midi-Dys A exon 21-41 (SEQ ID NO: 106), and y. midi-Dys A exon 21-42 (SEQ ID NO: 223).
[0037] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0038] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0039] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0040] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0041] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0042] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), R4 domain (SEQ ID NO: 27), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
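The six domain compositions in paragraphs [0037] to [0042] follow one pattern: a fixed N-terminal pair (ABCD and hinge 1, written H1 here), zero to five additional N-terminal domains added in order (R1, R2, R3, H2, R4), and a fixed block running from R16 through CT. The following is a minimal sketch of that pattern, not a representation used by the disclosure. The SEQ ID NOs are copied from those paragraphs, while the names `DOMAIN_SEQ_ID`, `domain_layout`, and `seq_ids` are hypothetical.

```python
from typing import List

# SEQ ID NOs as recited in paragraphs [0037]-[0042].
DOMAIN_SEQ_ID = {
    "ABCD": 21, "H1": 22, "R1": 23, "R2": 24, "R3": 25, "H2": 26, "R4": 27,
    "R16": 39, "R17": 40, "R18": 41, "R19": 42, "H3": 43, "R20": 44,
    "R21": 45, "R22": 46, "R23": 47, "R24": 48, "H4": 49, "CR": 50, "CT": 51,
}

FIXED_N_TERMINAL = ["ABCD", "H1"]
OPTIONAL_N_TERMINAL = ["R1", "R2", "R3", "H2", "R4"]  # added one at a time, in order
FIXED_C_TERMINAL = ["R16", "R17", "R18", "R19", "H3", "R20", "R21",
                    "R22", "R23", "R24", "H4", "CR", "CT"]


def domain_layout(extra_domains: int) -> List[str]:
    """Return the ordered domain list for 0..5 extra N-terminal domains.

    extra_domains=0 corresponds to paragraph [0037]; 5 corresponds to [0042].
    """
    if not 0 <= extra_domains <= len(OPTIONAL_N_TERMINAL):
        raise ValueError("extra_domains must be between 0 and 5")
    return FIXED_N_TERMINAL + OPTIONAL_N_TERMINAL[:extra_domains] + FIXED_C_TERMINAL


def seq_ids(layout: List[str]) -> List[int]:
    """Map each domain name in a layout to its SEQ ID NO."""
    return [DOMAIN_SEQ_ID[name] for name in layout]


if __name__ == "__main__":
    for n in range(len(OPTIONAL_N_TERMINAL) + 1):
        print(n, domain_layout(n))
```

Keeping the fixed and optional blocks separate makes it easy to see that the embodiments differ only in how much of the N-terminal rod region is retained.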
[0043] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), a partial H1 domain (SEQ ID NO: 404), a partial R11 domain (SEQ ID NO: 412), R12 Domain (SEQ ID NO: 35), R13 Domain (SEQ ID NO: 36), R14 domain (SEQ ID NO: 37), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0044] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), a partial H1 domain (SEQ ID NO: 404), a partial R14 Domain (SEQ ID NO: 413), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0045] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), a partial H1 domain (SEQ ID NO: 404), a partial R15 Domain (SEQ ID NO: 414), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0046] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), a partial H1 domain (SEQ ID NO: 404), a partial R19 Domain (SEQ ID NO: 415), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0047] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), a partial R1 domain (SEQ ID NO: 405), a partial R11 domain (SEQ ID NO: 412), R12 Domain (SEQ ID NO: 35), R13 Domain (SEQ ID NO: 36), R14 Domain (SEQ ID NO: 37), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0048] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), a partial R1 domain (SEQ ID NO: 405), a partial R14 Domain (SEQ ID NO: 413), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0049] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), a partial R1 domain (SEQ ID NO: 405), a partial R15 Domain (SEQ ID NO: 414), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0050] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), a partial R1 domain (SEQ ID NO: 405), a partial R19 Domain (SEQ ID NO: 415), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0051] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), a partial R2 domain (SEQ ID NO: 406), a partial R11 domain (SEQ ID NO: 412), R12 Domain (SEQ ID NO: 35), R13 Domain (SEQ ID NO: 36), R14 Domain (SEQ ID NO: 37), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0052] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), a partial R2 domain (SEQ ID NO: 406), a partial R14 Domain (SEQ ID NO: 413), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0053] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), a partial R2 domain (SEQ ID NO: 406), a partial R15 Domain (SEQ ID NO: 414), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0054] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), a partial R2 domain (SEQ ID NO: 406), a partial R19 Domain (SEQ ID NO: 415), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0055] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), a partial R3 domain (SEQ ID NO: 407), a partial R14 Domain (SEQ ID NO: 413), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0056] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), a partial R3 domain (SEQ ID NO: 407), a partial R15 Domain (SEQ ID NO: 414), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0057] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), a partial R3 domain (SEQ ID NO: 407), a partial R19 Domain (SEQ ID NO: 415), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0058] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), a partial R3 domain (SEQ ID NO: 408), a partial R14 Domain (SEQ ID NO: 413), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0059] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), a partial R3 domain (SEQ ID NO: 408), a partial R15 Domain (SEQ ID NO: 414), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0060] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), a partial R3 domain (SEQ ID NO: 408), a partial R19 Domain (SEQ ID NO: 415), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0061] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), a partial R4 domain (SEQ ID NO: 409), a partial R14 Domain (SEQ ID NO: 413), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0062] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), a partial R4 domain (SEQ ID NO: 409), a partial R15 Domain (SEQ ID NO: 414), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0063] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), a partial R4 domain (SEQ ID NO: 409), a partial R19 Domain (SEQ ID NO: 415), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0064] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), a partial R4 domain (SEQ ID NO: 410), a partial R14 Domain (SEQ ID NO: 413), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0065] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), a partial R4 domain (SEQ ID NO: 410), a partial R15 Domain (SEQ ID NO: 414), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0066] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), a partial R4 domain (SEQ ID NO: 410), a partial R19 Domain (SEQ ID NO: 415), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0067] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), R4 domain (SEQ ID NO: 27), a partial R5 domain (SEQ ID NO: 411), a partial R15 Domain (SEQ ID NO: 414), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0068] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), R4 domain (SEQ ID NO: 27), a partial R5 domain (SEQ ID NO: 411), a partial R16 Domain (SEQ ID NO: 416), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0069] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), R4 domain (SEQ ID NO: 27), a partial R5 domain (SEQ ID NO: 411), a partial R19 Domain (SEQ ID NO: 415), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0070] The truncated dystrophin proteins described above are depicted in Table 1 below:
Table 1: Truncated Dystrophins and corresponding SEQ ID NOs and subdomains included.
[0071] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence selected from the group consisting of SEQ ID NOs: 83-107 and 216-223, or an amino acid sequence at least about 90% identical thereto.
[0072] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 86, or an amino acid sequence at least about 90% identical thereto.
[0073] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 95, or an amino acid sequence at least about 90% identical thereto.
[0074] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 101, or an amino acid sequence at least about 90% identical thereto.
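This section does not state how percent identity is calculated. As an illustration only, the following minimal sketch computes identity over two sequences that have already been aligned to equal length, with "-" marking gaps, and compares the result with a 90% threshold. The function names, the choice of the full alignment length as the denominator, and the short example strings are all assumptions made for this sketch; a practical comparison would first align the sequences with an established alignment tool.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding the same residue in both sequences.

    Both inputs must be pre-aligned to the same length; '-' denotes a gap.
    Denominator is the full alignment length (one of several conventions).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 90.0) -> bool:
    """Return True when the aligned pair is at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Hypothetical 10-residue strings differing at one position.
    reference = "MLWWEEVEDC"
    candidate = "MLWWEEVEDA"
    print(percent_identity(reference, candidate))
    print(meets_identity_threshold(reference, candidate))
```

Other conventions divide by the shorter sequence length or by the number of non-gap columns, so the denominator should be fixed before any threshold is applied.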
[0075] In some embodiments, the amino acid sequence of the truncated dystrophin protein is not identical to an amino acid sequence of SEQ ID NO: 143.
[0076] In some embodiments, the truncated dystrophin protein is not a polypeptide of 2361 amino acids. In some embodiments, the truncated dystrophin protein is less than 2361 amino acids in length. In some embodiments, the truncated dystrophin protein is greater than 2361 amino acids in length.
[0077] In some embodiments, the truncated dystrophin protein is functional.
[0078] In some embodiments, the first nucleic acid molecule is present in a first viral vector, and the second nucleic acid molecule is present in a second viral vector.
[0079] In some embodiments, the first viral vector and the second viral vector are each independently selected from the group consisting of an adenoviral vector, an adeno-associated viral vector, a lentiviral vector, a vaccinia vector, a herpes simplex viral vector, and an Epstein-Barr viral vector.
[0080] In some embodiments, the first viral vector is an adeno-associated viral (AAV) vector, and the second viral vector is an AAV vector.
[0081] In some embodiments, the AAV vector is selected from the group consisting of an AAV1, AAV2, AAV3, AAV4, AAV5, AAV6, AAV7, AAV8, AAV9, AAV10, AAV11, AAV12, AAV13, AAVrh74, AAV-rh10, AAV-DJ, AAV-LK03, AAV-MYO, AAV-MYO2, AAV-MYO3, MYO3A-AAV, MYO4A-AAV, and MYO4E-AAV.
[0082] In some embodiments, the first AAV vector further comprises a first promoter operably linked to the first nucleic acid molecule. In some embodiments, the second AAV vector further comprises a second promoter operably linked to the second nucleic acid molecule.
[0083] In some embodiments, the promoter comprises a tissue specific promoter or a ubiquitous promoter. In some embodiments, the promoter comprises a CK8 promoter, an MHCK7 promoter, an SPC5-12 promoter, or a minimal CKM promoter. In some embodiments, the promoter comprises a nucleotide sequence selected from the group consisting of SEQ ID NOs: 144-150, or a nucleotide sequence at least 95% identical thereto.
[0084] In some embodiments, the first and/or second AAV vectors further comprise an inverted terminal repeat (ITR) sequence. In some embodiments, the ITR sequence comprises a nucleotide sequence of SEQ ID NO: 202 and/or 203, or a nucleotide sequence at least 95% identical thereto.
[0085] In some embodiments, the first and/or second AAV vectors further comprise an intron region. In some embodiments, the intron region comprises a nucleotide sequence of SEQ ID NO: 156 or 157, or a nucleotide sequence at least 95% identical thereto.
[0086] In some embodiments, the first and/or second AAV vectors further comprise a polyadenylation sequence. In some embodiments, the polyadenylation sequence comprises a nucleotide sequence of SEQ ID NO: 151 or 152, or a nucleotide sequence at least 95% identical thereto.
[0087] In some embodiments, the first and/or second AAV vectors further comprise a Woodchuck Hepatitis Virus Posttranscriptional Regulatory Element (WPRE). In some embodiments, the WPRE comprises a nucleotide sequence of any one of SEQ ID NOs: 153-155, or a nucleotide sequence at least 95% identical thereto.
[0088] In some embodiments, the first and/or second AAV vectors further comprise a Kozak sequence.
[0089] In one aspect, the present invention is directed to a vector system for expressing a truncated human dystrophin protein, comprising a first AAV vector and a second AAV vector, wherein the first AAV vector comprises a first nucleic acid molecule comprising a first coding region encoding an N-terminal portion of the truncated dystrophin protein and a 3’ ribozyme, where the first coding region is operably linked to the 3’ ribozyme at its 3’ end, wherein the second AAV vector comprises a second nucleic acid molecule comprising a second coding region encoding a C-terminal portion of the truncated dystrophin protein and a 5’ ribozyme, where the second coding region is operably linked to the 5’ ribozyme at its 5’ end, wherein upon ribozyme-mediated catalytic ligation, the first coding region and the second coding region form a third coding region encoding the truncated human dystrophin protein, and wherein the truncated human dystrophin protein comprises at least 1640 amino acids.
[0090] In some embodiments, the first coding region is operably linked to two or more 3’ ribozymes at its 3’ end. In some embodiments, the two or more 3’ ribozymes are the same 3’ ribozyme. In some embodiments, the two or more 3’ ribozymes are different 3’ ribozymes.
[0091] In some embodiments, the second coding region is operably linked to two or more 5’ ribozymes at its 5’ end. In some embodiments, the two or more 5’ ribozymes are the same 5’ ribozyme. In some embodiments, the two or more 5’ ribozymes are different 5’ ribozymes.
[0092] In some embodiments, the 5’ ribozyme and the 3’ ribozyme are each independently selected from the group consisting of Twister (Osa), Twister (Dre), Twister (Nvi), Twister (Sbi), Twister (Envl), Twister (Spu), Twister (Cpa), Twister Sister, Hammerhead (RzB), HDV, Pistol, Varkud Satellite (VS), Hatchet, Hairpin, and Hovlinc (Hov).
[0093] In some embodiments, the 5’ ribozyme and the 3’ ribozyme are each independently selected from the group consisting of SEQ ID NOs: 6-20.
[0094] In some embodiments, the first nucleic acid molecule further comprises an intron splice donor sequence, and the second nucleic acid molecule further comprises an intron splice acceptor sequence.
[0095] In some embodiments, the splice donor sequence is positioned between the first coding region and the 3’ ribozyme.
[0096] In some embodiments, the splice donor sequence is selected from the group consisting of SEQ ID NOs: 133 - 136.
[0097] In some embodiments, the splice donor sequence is positioned at least 10 nucleotides, at least 15 nucleotides, at least 20 nucleotides, at least 30 nucleotides, at least 40 nucleotides, at least 50 nucleotides, or more from the 3’ ribozyme.
[0098] In some embodiments, the splice acceptor sequence is positioned between the 5’ ribozyme and the second coding region.
[0099] In some embodiments, the splice acceptor sequence is selected from the group consisting of SEQ ID NOs: 137-141.
[0100] In some embodiments, the splice acceptor sequence is positioned at least 10 nucleotides, at least 15 nucleotides, at least 20 nucleotides, at least 30 nucleotides, at least 40 nucleotides, at least 50 nucleotides, or more from the 5’ ribozyme.
[0101] In some embodiments, the splice donor sequence and the splice acceptor sequence are positioned such that the resulting spliced intron is between 50 and 200 bp in length.
[0102] In some embodiments, the splice donor sequence and splice acceptor sequence are positioned such that the resulting spliced intron encodes a single predominant reading frame.
[0103] In some embodiments, a stop codon sequence is introduced into the splice donor sequence or the splice acceptor sequence.
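By way of non-limiting illustration, the positional limits recited above for the splice donor sequence, the splice acceptor sequence, and the resulting spliced intron can be expressed as a simple design check. The class `SpliceLayout`, the function `check_layout`, and the choice of 10 nucleotides as the default minimum spacing are assumptions made for this sketch; how the distances are measured on an actual construct is not specified here and is left to the caller.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class SpliceLayout:
    """Hypothetical summary of one split-vector design."""
    donor_to_3p_ribozyme_nt: int     # splice donor to 3' ribozyme spacing
    acceptor_to_5p_ribozyme_nt: int  # 5' ribozyme to splice acceptor spacing
    spliced_intron_bp: int           # length of the intron removed by splicing


def check_layout(layout: SpliceLayout,
                 min_spacing_nt: int = 10,
                 intron_range_bp: Tuple[int, int] = (50, 200)) -> List[str]:
    """Return a list of violated limits (an empty list means all are met)."""
    problems = []
    if layout.donor_to_3p_ribozyme_nt < min_spacing_nt:
        problems.append("splice donor is too close to the 3' ribozyme")
    if layout.acceptor_to_5p_ribozyme_nt < min_spacing_nt:
        problems.append("splice acceptor is too close to the 5' ribozyme")
    low, high = intron_range_bp
    if not low <= layout.spliced_intron_bp <= high:
        problems.append("spliced intron length is outside the 50 to 200 bp range")
    return problems


if __name__ == "__main__":
    print(check_layout(SpliceLayout(20, 15, 120)))  # all limits met
    print(check_layout(SpliceLayout(5, 15, 300)))   # two limits violated
```

The stricter spacing embodiments (15, 20, 30, 40, or 50 nucleotides) can be checked by passing a different `min_spacing_nt`.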
[0104] In some embodiments, at least one of the first coding region and the second coding region is at least 2000 nucleotides in length, or at least 2200 nucleotides in length, or at least 2400 nucleotides in length, or at least 2600 nucleotides in length.
[0105] In some embodiments, the first coding region and the second coding region are each at least 2000 nucleotides in length, or at least 2200 nucleotides in length, or at least 2400 nucleotides in length, or at least 2600 nucleotides in length.
[0106] In some embodiments, the third coding region is at least 4920 nucleotides in length, or at least 5100 nucleotides in length, or at least 5300 nucleotides in length.
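By way of non-limiting illustration, the nucleotide lengths recited above appear to follow from the amino acid lengths at three nucleotides per codon: a protein of at least 1640 amino acids corresponds to a coding region of at least 4920 nucleotides when no stop codon is counted. The helper names below are assumptions made for this sketch, and whether a given recited length includes a stop codon is not stated in the text.

```python
NT_PER_CODON = 3


def min_coding_nt(n_amino_acids: int, include_stop: bool = False) -> int:
    """Nucleotides needed to encode n_amino_acids (optionally plus a stop codon)."""
    if n_amino_acids < 0:
        raise ValueError("amino acid count must be non-negative")
    stop = NT_PER_CODON if include_stop else 0
    return n_amino_acids * NT_PER_CODON + stop


def max_encoded_aa(n_nucleotides: int) -> int:
    """Whole codons (amino acids) that fit in n_nucleotides, ignoring any stop codon."""
    if n_nucleotides < 0:
        raise ValueError("nucleotide count must be non-negative")
    return n_nucleotides // NT_PER_CODON


if __name__ == "__main__":
    print(min_coding_nt(1640))   # lower bound on the third coding region
    print(max_encoded_aa(5100))  # amino acids encoded by 5100 nucleotides
    print(max_encoded_aa(5300))  # whole codons in 5300 nucleotides
```

Under the same arithmetic, two coding regions of at least 2600 nucleotides each give a ligated coding region of at least 5200 nucleotides.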
[0107] In some embodiments, the first coding region and the second coding region do not share a region of substantial sequence identity.
[0108] In some embodiments, the 3’ end of the first coding region does not have a sequence identity to the 5’ end of the second coding region.
[0109] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0110] In some embodiments, the truncated dystrophin protein further comprises H1 Domain (SEQ ID NO: 22).
[0111] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence selected from the group consisting of:
a. midi-Dys ΔR1-R15 (SEQ ID NO: 83),
b. midi-Dys ΔR2-R15 (SEQ ID NO: 84),
c. midi-Dys ΔR3-R15 (SEQ ID NO: 85),
d. midi-Dys ΔH2-R15 (SEQ ID NO: 86),
e. midi-Dys ΔR4-R15 (SEQ ID NO: 87),
f. midi-Dys ΔR5-R15 (SEQ ID NO: 88),
g. midi-Dys Δexon 13-33 (SEQ ID NO: 93),
h. midi-Dys Δexon 13-39 (SEQ ID NO: 94),
i. midi-Dys Δexon 13-41 (SEQ ID NO: 95),
j. midi-Dys Δexon 13-48 (SEQ ID NO: 96),
k. midi-Dys Δexon 15-39 (SEQ ID NO: 97),
l. midi-Dys Δexon 15-41 (SEQ ID NO: 98),
m. midi-Dys Δexon 15-48 (SEQ ID NO: 99),
n. midi-Dys Δexon 17-39 (SEQ ID NO: 100),
o. midi-Dys Δexon 17-41 (SEQ ID NO: 101),
p. midi-Dys Δexon 17-48 (SEQ ID NO: 102),
q. midi-Dys Δexon 18-39 (SEQ ID NO: 220),
r. midi-Dys Δexon 18-41 (SEQ ID NO: 221),
s. midi-Dys Δexon 18-48 (SEQ ID NO: 222),
t. midi-Dys Δexon 19-39 (SEQ ID NO: 103),
u. midi-Dys Δexon 19-41 (SEQ ID NO: 104),
v. midi-Dys Δexon 19-48 (SEQ ID NO: 105),
w. midi-Dys Δexon 21-41 (SEQ ID NO: 106),
x. midi-Dys Δexon 21-42 (SEQ ID NO: 223), and
y. midi-Dys Δexon 21-48 (SEQ ID NO: 107).
[0112] In some embodiments, the truncated dystrophin protein further comprises R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), and R19 Domain (SEQ ID NO: 42).
[0113] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence selected from the group consisting of:
a. midi-Dys ΔR1-R15 (SEQ ID NO: 83),
b. midi-Dys ΔR2-R15 (SEQ ID NO: 84),
c. midi-Dys ΔR3-R15 (SEQ ID NO: 85),
d. midi-Dys ΔH2-R15 (SEQ ID NO: 86),
e. midi-Dys ΔR4-R15 (SEQ ID NO: 87),
f. midi-Dys ΔR5-R15 (SEQ ID NO: 88),
g. midi-Dys Δexon 10-33 (SEQ ID NO: 89),
h. midi-Dys Δexon 10-39 (SEQ ID NO: 90),
i. midi-Dys Δexon 10-41 (SEQ ID NO: 91),
j. midi-Dys Δexon 11-33 (SEQ ID NO: 216),
k. midi-Dys Δexon 11-39 (SEQ ID NO: 217),
l. midi-Dys Δexon 11-41 (SEQ ID NO: 218),
m. midi-Dys Δexon 13-33 (SEQ ID NO: 93),
n. midi-Dys Δexon 13-39 (SEQ ID NO: 94),
o. midi-Dys Δexon 13-41 (SEQ ID NO: 95),
p. midi-Dys Δexon 15-39 (SEQ ID NO: 97),
q. midi-Dys Δexon 15-41 (SEQ ID NO: 98),
r. midi-Dys Δexon 17-39 (SEQ ID NO: 100),
s. midi-Dys Δexon 17-41 (SEQ ID NO: 101),
t. midi-Dys Δexon 18-39 (SEQ ID NO: 220),
u. midi-Dys Δexon 18-41 (SEQ ID NO: 221),
v. midi-Dys Δexon 19-39 (SEQ ID NO: 103),
w. midi-Dys Δexon 19-41 (SEQ ID NO: 104),
x. midi-Dys Δexon 21-41 (SEQ ID NO: 106), and
y. midi-Dys Δexon 21-42 (SEQ ID NO: 223).
[0114] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0115] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0116] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0117] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0118] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0119] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), R4 domain (SEQ ID NO: 27), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
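By way of non-limiting illustration, the six whole-domain embodiments above can each be derived by removing one contiguous run of domains from an ordered domain list. In the sketch below, the ordering in `DOMAIN_ORDER` is an assumption inferred from the order in which the domains are recited in this section, the function `delete_range` is hypothetical, and the reading of a construct name such as "R1-R15" as the deleted range is likewise an inference rather than a statement taken from the text.

```python
# Assumed N-terminal to C-terminal domain order, inferred from the
# order in which the domains are listed in the embodiments above.
DOMAIN_ORDER = (
    ["ABCD", "H1", "R1", "R2", "R3", "H2"]
    + [f"R{i}" for i in range(4, 20)]   # R4 through R19
    + ["H3"]
    + [f"R{i}" for i in range(20, 25)]  # R20 through R24
    + ["H4", "CR", "CT"]
)


def delete_range(first: str, last: str) -> list:
    """Return the domain list with first..last (inclusive) removed."""
    start = DOMAIN_ORDER.index(first)
    end = DOMAIN_ORDER.index(last)
    if start > end:
        raise ValueError("first domain must precede last domain")
    return DOMAIN_ORDER[:start] + DOMAIN_ORDER[end + 1:]


if __name__ == "__main__":
    # Removing R1 through R15 leaves ABCD, H1, R16-R19, H3, R20-R24, H4, CR, CT.
    print(delete_range("R1", "R15"))
    # Removing H2 through R15 additionally retains R1, R2 and R3.
    print(delete_range("H2", "R15"))
```

The partial-domain embodiments that follow are not covered by this sketch, because their junctions fall inside a domain rather than at a domain boundary.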
[0120] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), a partial H1 domain (SEQ ID NO: 404), a partial R11 domain (SEQ ID NO: 412), R12 Domain (SEQ ID NO: 35), R13 Domain (SEQ ID NO: 36), R14 Domain (SEQ ID NO: 37), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0121] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), a partial H1 domain (SEQ ID NO: 404), a partial R14 Domain (SEQ ID NO: 413), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0122] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), a partial H1 domain (SEQ ID NO: 404), a partial R15 Domain (SEQ ID NO: 414), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0123] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), a partial H1 domain (SEQ ID NO: 404), a partial R19 Domain (SEQ ID NO: 415), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0124] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), a partial R1 domain (SEQ ID NO: 405), a partial R11 domain (SEQ ID NO: 412), R12 Domain (SEQ ID NO: 35), R13 Domain (SEQ ID NO: 36), R14 Domain (SEQ ID NO: 37), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0125] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), a partial R1 domain (SEQ ID NO: 405), a partial R14 Domain (SEQ ID NO: 413), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0126] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), a partial R1 domain (SEQ ID NO: 405), a partial R15 Domain (SEQ ID NO: 414), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0127] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), a partial R1 domain (SEQ ID NO: 405), a partial R19 Domain (SEQ ID NO: 415), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0128] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), a partial R2 domain (SEQ ID NO: 406), a partial R11 domain (SEQ ID NO: 412), R12 Domain (SEQ ID NO: 35), R13 Domain (SEQ ID NO: 36), R14 Domain (SEQ ID NO: 37), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0129] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), a partial R2 domain (SEQ ID NO: 406), a partial R14 Domain (SEQ ID NO: 413), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0130] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), a partial R2 domain (SEQ ID NO: 406), a partial R15 Domain (SEQ ID NO: 414), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0131] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), a partial R2 domain (SEQ ID NO: 406), a partial R19 Domain (SEQ ID NO: 415), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0132] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), a partial R3 domain (SEQ ID NO: 407), a partial R14 Domain (SEQ ID NO: 413), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0133] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), a partial R3 domain (SEQ ID NO: 407), a partial R15 Domain (SEQ ID NO: 414), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0134] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), a partial R3 domain (SEQ ID NO: 407), a partial R19 Domain (SEQ ID NO: 415), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0135] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), a partial R3 domain (SEQ ID NO: 408), a partial R14 Domain (SEQ ID NO: 413), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0136] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), a partial R3 domain (SEQ ID NO: 408), a partial R15 Domain (SEQ ID NO: 414), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0137] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), a partial R3 domain (SEQ ID NO: 408), a partial R19 Domain (SEQ ID NO: 415), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0138] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), a partial R4 domain (SEQ ID NO: 409), a partial R14 Domain (SEQ ID NO: 413), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0139] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), a partial R4 domain (SEQ ID NO: 409), a partial R15 Domain (SEQ ID NO: 414), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0140] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), a partial R4 domain (SEQ ID NO: 409), a partial R19 Domain (SEQ ID NO: 415), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0141] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), a partial R4 domain (SEQ ID NO: 410), a partial R14 Domain (SEQ ID NO: 413), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0142] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), a partial R4 domain (SEQ ID NO: 410), a partial R15 Domain (SEQ ID NO: 414), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0143] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), a partial R4 domain (SEQ ID NO: 410), a partial R19 Domain (SEQ ID NO: 415), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0144] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), R4 domain (SEQ ID NO: 27), a partial R5 domain (SEQ ID NO: 411), a partial R15 Domain (SEQ ID NO: 414), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0145] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), R4 domain (SEQ ID NO: 27), a partial R5 domain (SEQ ID NO: 411), a partial R16 Domain (SEQ ID NO: 416), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0146] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), R4 domain (SEQ ID NO: 27), a partial R5 domain (SEQ ID NO: 411), a partial R19 Domain (SEQ ID NO: 415), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0147] The truncated dystrophin proteins described above are depicted in Table 1 above.
[0148] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence selected from the group consisting of SEQ ID NOs: 83-107 and 216-223, or an amino acid sequence at least about 90% identical thereto.
[0149] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 86, or an amino acid sequence at least about 90% identical thereto.
[0150] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 95, or an amino acid sequence at least about 90% identical thereto.
[0151] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 101, or an amino acid sequence at least about 90% identical thereto.
[0152] In some embodiments, the amino acid sequence of the truncated dystrophin protein is not identical to an amino acid sequence of SEQ ID NO: 143.
[0153] In some embodiments, the truncated dystrophin protein is not a polypeptide of 2361 amino acids. In some embodiments, the truncated dystrophin protein is less than 2361 amino acids in length. In some embodiments, the truncated dystrophin protein is greater than 2361 amino acids in length.
[0154] In some embodiments, the truncated dystrophin protein is functional.
[0155] In some embodiments, the first coding region comprises the sequence selected from the group consisting of SEQ ID NOs: 284, 286, 288, 290, 291, 293, 295 and 297.
[0156] In some embodiments, the second coding region comprises the sequence selected from the group consisting of SEQ ID NOs: 285, 287, 289, 292, 294 and 296.
[0157] In some embodiments, the first coding sequence comprises SEQ ID NO: 286 and the second coding sequence comprises SEQ ID NO:287. In some embodiments, the first coding sequence comprises SEQ ID NO: 288 and the second coding sequence comprises SEQ ID NO:289. In some embodiments, the first coding sequence comprises SEQ ID NO: 290 and the second coding sequence comprises SEQ ID NO:289. In some embodiments, the first coding sequence comprises SEQ ID NO: 291 and the second coding sequence comprises SEQ ID NO:292. In some embodiments, the first coding sequence comprises SEQ ID NO: 293 and the second coding sequence comprises SEQ ID NO:294. In some embodiments, the first coding sequence comprises SEQ ID NO: 295 and the second coding sequence comprises SEQ ID NO:296. In some embodiments, the first coding sequence comprises SEQ ID NO: 297 and the second coding sequence comprises SEQ ID NO:292. In some embodiments, the first coding sequence comprises SEQ ID NO: 288 and the second coding sequence comprises SEQ ID NO:289. In some embodiments, the first coding sequence comprises SEQ ID NO: 288 and the second coding sequence comprises SEQ ID NO:287.
[0158] In some embodiments, the AAV vector is selected from the group consisting of an AAV1, AAV2, AAV3, AAV4, AAV5, AAV6, AAV7, AAV8, AAV9, AAV10, AAV11, AAV12, AAV13, AAVrh74, AAV-rh10, AAV-DJ, AAV-LK03, AAV-MYO, AAV-MYO2, AAV-MYO3, MYO3A-AAV, MYO4A-AAV, and MYO4E-AAV.
[0159] In some embodiments, the first AAV vector further comprises a first promoter operably linked to the first nucleic acid molecule. In some embodiments, the second AAV vector further comprises a second promoter operably linked to the second nucleic acid molecule.
[0160] In some embodiments, the promoter comprises a tissue specific promoter or a ubiquitous promoter. In some embodiments, the promoter comprises a CK8 promoter, an MHCK7 promoter, an SPC5-12 promoter, or a minimal CKM promoter. In some embodiments, the promoter comprises a nucleotide sequence selected from the group consisting of SEQ ID NOs: 144-150, or a nucleotide sequence at least 95% identical thereto.
[0161] In some embodiments, the first and/or second AAV vectors further comprise an inverted terminal repeat (ITR) sequence. In some embodiments, the ITR sequence comprises a nucleotide sequence of SEQ ID NO: 202 and/or 203, or a nucleotide sequence at least 95% identical thereto.
[0162] In some embodiments, the first and/or second AAV vectors further comprise an intron region. In some embodiments, the intron region comprises a nucleotide sequence of SEQ ID NO: 156 or 157, or a nucleotide sequence at least 95% identical thereto.
[0163] In some embodiments, the first and/or second AAV vectors further comprise a polyadenylation sequence. In some embodiments, the polyadenylation sequence comprises a nucleotide sequence of SEQ ID NO: 151 or 152, or a nucleotide sequence at least 95% identical thereto.
[0164] In some embodiments, the first and/or second AAV vectors further comprise a Woodchuck Hepatitis Virus Posttranscriptional Regulatory Element (WPRE). In some embodiments, the WPRE comprises a nucleotide sequence of SEQ ID NO: 153, 154, or 155, or a nucleotide sequence at least 95% identical thereto.
[0165] In some embodiments, the first and/or second AAV vectors further comprise a Kozak sequence.
[0166] In some embodiments, the first AAV vector comprises, in 5’ to 3’ order, a 5’ ITR sequence, a promoter sequence, a first coding region encoding an N-terminal portion of the truncated dystrophin protein, an intron splice donor sequence, a 3’ ribozyme, and a 3’ ITR sequence.
[0167] In some embodiments, the first AAV vector comprises, in 5’ to 3’ order, a 5’ ITR sequence, a promoter sequence, a first coding region encoding an N-terminal portion of the truncated dystrophin protein, an intron splice donor sequence, a 3’ ribozyme, a polyadenylation sequence, and a 3’ ITR sequence.
[0168] In some embodiments, the first AAV vector comprises, in 5’ to 3’ order, a 5’ ITR sequence, a promoter sequence, an intron region, a first coding region encoding an N-terminal portion of the truncated dystrophin protein, an intron splice donor sequence, a 3’ ribozyme, a polyadenylation sequence, and a 3’ ITR sequence.
[0169] In some embodiments, the first AAV vector comprises a sequence selected from the group consisting of SEQ ID NOs: 160, 162, 164, 166, 168, 170, 172, 174, 176, 178, 180, 182, 184, 186, 188, 190, 192, 194, 196, 198, 200, 301, 303, 305, 307, 308, 310, 312, 314, 394, and 395.
[0170] In some embodiments, the second AAV vector comprises, in 5’ to 3’ order, a 5’ ITR sequence, a promoter sequence, a 5’ ribozyme, an intron splice acceptor sequence, a second coding region encoding a C-terminal portion of the truncated dystrophin protein, and a 3’ ITR sequence.
[0171] In some embodiments, the second AAV vector comprises, in 5’ to 3’ order, a 5’ ITR sequence, a promoter sequence, a 5’ ribozyme, an intron splice acceptor sequence, a second coding region encoding a C-terminal portion of the truncated dystrophin protein, a polyadenylation sequence, and a 3’ ITR sequence.
[0172] In some embodiments, the second AAV vector comprises, in 5’ to 3’ order, a 5’ ITR sequence, a promoter sequence, a 5’ ribozyme, an intron splice acceptor sequence, a second coding region encoding a C-terminal portion of the truncated dystrophin protein, a WPRE sequence, a polyadenylation sequence, and a 3’ ITR sequence.
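The 5’ to 3’ element orders recited for the first and second AAV vectors can be written down as ordered lists, which makes it easy to see that the longer layouts only add elements to the shorter ones without reordering them. The following is a minimal sketch; the dictionary keys and the helper name preserves_order are hypothetical labels introduced for illustration, and only the element orders are taken from the embodiments above.

```python
# Ordered genome elements, 5' to 3', transcribed from the embodiments above.
# The keys ("minimal", "with_polyA", ...) are hypothetical labels.
FIRST_VECTOR = {
    "minimal": ["5' ITR", "promoter", "Nt coding region",
                "splice donor", "3' ribozyme", "3' ITR"],
    "with_polyA": ["5' ITR", "promoter", "Nt coding region",
                   "splice donor", "3' ribozyme", "polyA", "3' ITR"],
    "with_intron_and_polyA": ["5' ITR", "promoter", "intron region",
                              "Nt coding region", "splice donor",
                              "3' ribozyme", "polyA", "3' ITR"],
}

SECOND_VECTOR = {
    "minimal": ["5' ITR", "promoter", "5' ribozyme", "splice acceptor",
                "Ct coding region", "3' ITR"],
    "with_polyA": ["5' ITR", "promoter", "5' ribozyme", "splice acceptor",
                   "Ct coding region", "polyA", "3' ITR"],
    "with_wpre_and_polyA": ["5' ITR", "promoter", "5' ribozyme",
                            "splice acceptor", "Ct coding region",
                            "WPRE", "polyA", "3' ITR"],
}


def preserves_order(base, extended):
    """Return True if every element of `base` occurs in `extended`
    in the same relative order (i.e., `base` is a subsequence)."""
    remaining = iter(extended)
    # `element in remaining` consumes the iterator up to the first match,
    # so later elements must be found after earlier ones.
    return all(element in remaining for element in base)


# Each extended layout keeps the minimal element order intact.
first_ok = all(preserves_order(FIRST_VECTOR["minimal"], layout)
               for layout in FIRST_VECTOR.values())
second_ok = all(preserves_order(SECOND_VECTOR["minimal"], layout)
                for layout in SECOND_VECTOR.values())
```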
[0173] In some embodiments, the second AAV vector comprises a sequence selected from the group consisting of SEQ ID NOs: 161, 163, 165, 167, 169, 171, 173, 175, 177, 179, 181, 183, 185, 187, 189, 191, 193, 195, 197, 199, 201, 302, 304, 306, 309, 311, and 313.
[0174] In some embodiments, the first AAV vector comprises the sequence of SEQ ID NO: 303, and the second AAV vector comprises the sequence of SEQ ID NO: 304. In some embodiments, the first AAV vector comprises the sequence of SEQ ID NO: 305, and the second AAV vector comprises the sequence of SEQ ID NO: 306. In some embodiments, the first AAV vector comprises the sequence of SEQ ID NO: 307, and the second AAV vector comprises the sequence of SEQ ID NO: 306. In some embodiments, the first AAV vector comprises the sequence of SEQ ID NO: 308, and the second AAV vector comprises the sequence of SEQ ID NO: 309. In some embodiments, the first AAV vector comprises the sequence of SEQ ID NO: 310, and the second AAV vector comprises the sequence of SEQ ID NO: 311. In some embodiments, the first AAV vector comprises the sequence of SEQ ID NO: 312, and the second AAV vector comprises the sequence of SEQ ID NO: 313. In some embodiments, the first AAV vector comprises the sequence of SEQ ID NO: 314, and the second AAV vector comprises the sequence of SEQ ID NO: 309. In some embodiments, the first AAV vector comprises the sequence of SEQ ID NO: 394, and the second AAV vector comprises the sequence of SEQ ID NO: 309. In some embodiments, the first AAV vector comprises the sequence of SEQ ID NO: 395, and the second AAV vector comprises the sequence of SEQ ID NO: 306. In some embodiments, the first AAV vector comprises the sequence of SEQ ID NO: 395, and the second AAV vector comprises the sequence of SEQ ID NO: 304.
[0175] In one aspect, the present invention is directed to a pharmaceutical composition comprising a first nucleic acid molecule and a second nucleic acid molecule of the present disclosure, and a pharmaceutically acceptable excipient.
[0176] In some embodiments, the first nucleic acid molecule and the second nucleic acid molecule are present at a ratio of 1:1, 1:2, 1:3, 1:4, 1:5, 2:1, 3:1, 4:1, or 5:1.
[0177] In another aspect, the present invention is directed to a pharmaceutical composition comprising a first AAV vector and a second AAV vector of the present disclosure, and a pharmaceutically acceptable excipient.
[0178] In some embodiments, the first AAV vector and the second AAV vector are present at a ratio of 1:1, 1:2, 1:3, 1:4, 1:5, 2:1, 3:1, 4:1, or 5:1.
[0179] In another aspect, the present invention is directed to a pharmaceutical composition comprising an isolated recombinant dystrophin protein as described below, a nucleic acid encoding such dystrophin protein, a viral genome encoding such dystrophin protein, or a host cell incorporating such a nucleic acid or viral genome.
[0180] In one aspect, the present invention is directed to a method for treating a dystrophin-associated disorder in a subject in need thereof, comprising administering a therapeutically effective amount of the first nucleic acid molecule and the second nucleic acid molecule of the present disclosure, or a therapeutically effective amount of the first AAV vector and the second AAV vector of the present disclosure, or the pharmaceutical composition of the present disclosure, thereby treating the dystrophin-associated disorder in the subject.
[0181] In another aspect, the present invention is directed to a method for increasing expression of dystrophin in a subject having or diagnosed with having a dystrophin-associated disorder, comprising administering a therapeutically effective amount of the first nucleic acid molecule and the second nucleic acid molecule of the present disclosure, or a therapeutically effective amount of the first AAV vector and the second AAV vector of the present disclosure, or the pharmaceutical composition of the present disclosure, thereby increasing expression of dystrophin in the subject.
[0182] In a further aspect, the present invention is directed to a method for increasing muscle mass or muscle strength and/or preventing fibrosis in a subject having or diagnosed with having a dystrophin-associated disorder, comprising administering a therapeutically effective amount of the first nucleic acid molecule and the second nucleic acid molecule of the present disclosure, or a therapeutically effective amount of the first AAV vector and the second AAV vector of the present disclosure, or the pharmaceutical composition of the present disclosure, thereby increasing muscle strength and/or preventing fibrosis in the subject.
[0183] In some embodiments, the dystrophin-associated disorder is muscular dystrophy.
[0184] In some embodiments, the dystrophin-associated disorder is Duchenne muscular dystrophy.
[0185] In some embodiments, the first nucleic acid molecule and the second nucleic acid molecule, the first AAV vector and the second AAV vector, or the pharmaceutical composition is administered by intramuscular injection, or intravenous injection.
[0186] In some embodiments, the first AAV vector and the second AAV vector are administered together.
[0187] In some embodiments, the first AAV vector and the second AAV vector are administered separately.
[0188] In one aspect, the present invention is directed to an isolated recombinant dystrophin protein that has at least 90% sequence identity to a sequence selected from the group consisting of SEQ ID NOs: 83, 85-87, 89-107, and 216-223.
[0189] In one aspect, the present disclosure provides an isolated nucleic acid molecule encoding a truncated human dystrophin protein that has at least 90% sequence identity to a sequence selected from the group consisting of SEQ ID NOs: 83, 85-87, 89-107, and 216-223.
[0190] In some embodiments, the isolated nucleic acid molecule comprises a nucleotide sequence selected from the group consisting of SEQ ID NOs: 108-132, 224-231, 260-280, and 396-403, or a nucleotide sequence at least 90% identical thereto. In another embodiment, the isolated nucleic acid molecule is part of an isolated recombinant vector.
[0191] In one aspect, the present invention is directed to an isolated recombinant viral genome comprising a nucleic acid molecule encoding a truncated human dystrophin protein comprising an amino acid sequence selected from the group consisting of SEQ ID NOs: 83-107 and 216-223.
[0192] In some embodiments, the isolated nucleic acid molecule comprises a nucleotide sequence selected from the group consisting of SEQ ID NOs: 108-132, 224-231, 260-280, and 396-403, or a nucleotide sequence at least 90% identical thereto.
[0193] In one aspect, the present invention is directed to a host cell comprising the first nucleic acid molecule and/or the second nucleic acid molecule of the present disclosure, or the first AAV vector and/or the second AAV vector of the present disclosure.
[0194] In some embodiments, the cell is a mammalian cell, an insect cell, or a bacterial cell.
[0195] In another aspect, the present invention is directed to a method of making a first recombinant adeno-associated virus (rAAV) particle, the method comprising providing a host cell comprising the first nucleic acid molecule of the present disclosure, and incubating the host cell under conditions suitable to encapsulate the first nucleic acid in an AAV capsid protein; thereby making the first rAAV particle.
[0196] In yet another aspect, the present invention is directed to a method of making a second recombinant adeno-associated virus (rAAV) particle, the method comprising providing a host cell comprising the second nucleic acid molecule of the present disclosure, and incubating the host cell under conditions suitable to encapsulate the second nucleic acid in an AAV capsid protein; thereby making the second rAAV particle.
[0197] In some embodiments, the cell is a mammalian cell, an insect cell, or a bacterial cell.
[0198] In one aspect, the present invention is directed to a system for use in the treatment of a dystrophin-associated disorder.
[0199] In another aspect, the present invention is directed to a first nucleic acid molecule and a second nucleic acid molecule for use in the treatment of a dystrophin-associated disorder.
[0200] In one aspect, the present invention is directed to a first AAV vector and a second AAV vector for use in the treatment of a dystrophin-associated disorder.
[0201] In another aspect, the present invention is directed to a pharmaceutical composition for use in the treatment of a dystrophin-associated disorder.
[0202] In yet another aspect, the present invention is directed to an isolated nucleic acid molecule for use in the treatment of a dystrophin-associated disorder.
[0203] The details of various aspects or embodiments of the present disclosure are set forth below. Other features, objects, and advantages of the disclosure will be apparent from the description and the claims. In the description, the singular forms also include the plural unless the context clearly dictates otherwise. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art in the field of this disclosure. In the case of conflict, the present description will control.
BRIEF DESCRIPTION OF DRAWINGS
[0204] FIG. 1 is a schematic depicting the dual-vector system for expressing a dystrophin protein. The system comprises two separate vectors, such as AAV vectors. The first AAV vector comprises a first nucleic acid molecule comprising a coding region for the N-terminal (Nt) portion of a dystrophin protein (e.g., a truncated dystrophin), a splicing donor sequence (SD) and a 3’ ribozyme (3’Rz), under the control of a muscle-specific promoter and flanked by two ITR sequences. The second AAV vector comprises a second nucleic acid molecule comprising a coding region for the C-terminal (Ct) portion of a dystrophin protein (e.g., a truncated dystrophin), a splicing acceptor sequence (SA) and a 5’ ribozyme (5’Rz), under the control of a muscle-specific promoter and flanked by two ITR sequences. Ribozymes are utilized on the 3’ end of the Nt vector and the 5’ end of the Ct vector to create precise RNA termini and scarless trans-ligation, followed by splicing mediated by the splice donor and acceptor sequences, to generate a single RNA molecule containing a single open reading frame encoding the Nt portion and the Ct portion of the dystrophin protein.
[0205] FIG. 2 depicts various truncated dystrophin proteins having deletions of different domains or deletions of regions encoded by specific exons. “Y” indicates the inclusion of the particular domain in the truncated dystrophin protein, with domains as described in Table 1. “N” indicates the absence of the particular domain in the truncated dystrophin protein. “P” indicates that the particular domain is partially included in the truncated dystrophin protein. Partially included (“P”, “P1” and “P2”) is used here when the truncated protein includes at least one but not all of the amino acids of that particular domain as described in Table 1. Defined domains are exemplary listings of these sub-domain sequences, and the exact boundaries between sub-domains can be shifted from the example sequences provided here, which could change the example categorizations provided here, particularly the “P” (partially included) category.
[0206] FIG. 3 depicts various truncated dystrophin proteins having deletions of different domains or deletions of regions encoded by specific exons. “Y” indicates the inclusion of the particular exon in the truncated dystrophin protein, with exons defined in SEQ ID NOs: 315-393. “N” indicates the absence of the particular exon in the truncated dystrophin protein. “P” indicates that the particular exon is partially included in the truncated dystrophin protein, as described in Table 1. Exon sequences are based on NM_004006.3. Exons 1 and 79 include untranslated regions (UTRs) which are not required for encoding midi-Dys amino acids, hence the “U” designation.
[0207] FIG. 4 depicts the results of delivering various truncated midi-Dys expression plasmid combinations to primary human skeletal muscle cells cultured in vitro. FIG. 4 includes a Western blot analysis of cell lysates 24-to-48 hours after transfection of DNA expression plasmids, demonstrating that dual plasmid delivery of truncated midi-Dys sequences results in varying levels of truncated midi-Dys protein expression.
[0208] FIG. 5 depicts the results of utilizing different split sites (SS) to divide truncated midi-Dys into N-terminal (Nt) and C-terminal (Ct) plasmids which are transfected into primary human skeletal muscle cells. FIG. 5 shows Western blot results of cell lysates 24-to-48 hours after transfection of various Nt and Ct combinations to express truncated midi-Dystrophins.
[0209] FIG. 6 depicts the results of utilizing different codon optimized and CpG augmented midi-Dystrophin sequences to express midi-Dystrophin protein (SEQ ID NO: 259) in primary human skeletal muscle cells. FIG. 6 shows Western blot results of cell lysates 24-to-48 hours after transfection of various codon optimized truncated midi-Dystrophins.
[0210] FIG. 7 depicts the results of utilizing two different dual AAV vector designs to express midi-Dystrophin proteins in primary human skeletal muscle cells, utilizing transfected AAV cis-plasmids.
DETAILED DESCRIPTION OF THE INVENTION
[0211] The present disclosure provides isolated recombinant nucleic acid molecules encoding truncated dystrophin proteins, that can be delivered and expressed in a subject using a dual adeno-associated virus (AAV) vector system, to allow expression of truncated dystrophin proteins that are otherwise too large to fit into a single AAV system. The truncated dystrophin proteins can be used to restore the expression and function of a wild-type dystrophin in a subject in need thereof. The present disclosure also provides systems and methods for expressing or delivering a truncated dystrophin protein in a subject, and methods for treating a subject having a dystrophin-associated disease or disorder, e.g., muscular dystrophy, e.g., Duchenne muscular dystrophy (DMD).
[0212] Gene therapy is a promising therapeutic approach for the treatment of many diseases, such as genetic diseases and/or diseases that can be treated by expression of a therapeutic protein. Recombinant AAV vectors are generally regarded as one of the safest and most effective classes of vectors for gene therapy. However, one challenge that hampers the development and clinical deployment of therapeutic AAV vectors for gene therapy is their packaging capacity, which is restricted to approximately 4.7 kb of DNA. As a result, expressing a full-length dystrophin protein of 3,686 amino acids using a single AAV vector is not possible, since the coding sequence for the dystrophin protein exceeds the packaging limit of AAV genomes by more than 2-fold. Accordingly, attention has focused on creating smaller versions of dystrophin that eliminate non-essential subdomains while maintaining at least some function of the full-length protein.
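The "more than 2-fold" statement follows from simple arithmetic on the two figures given in this paragraph. The sketch below is a back-of-the-envelope check only: it counts just the open reading frame (three nucleotides per codon plus one stop codon) and ignores the promoter, ITRs, polyadenylation sequence, and other elements that also occupy space in a vector genome. The variable names are illustrative.

```python
# Figures as stated in the text above.
FULL_LENGTH_AA = 3686        # residues in full-length dystrophin
AAV_CAPACITY_NT = 4700       # approximate AAV packaging capacity (4.7 kb)

# Three nucleotides per codon, plus one stop codon.
coding_nt = FULL_LENGTH_AA * 3 + 3

# How many times larger the open reading frame is than the capacity.
fold_over_capacity = coding_nt / AAV_CAPACITY_NT

print(f"ORF length: {coding_nt} nt")
print(f"Fold over AAV capacity: {fold_over_capacity:.2f}")
```

Because regulatory elements are left out, the true gap between a full-length expression cassette and the packaging limit is larger than this estimate.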
[0213] The present disclosure is based, at least in part, on the development of truncated dystrophin proteins that can be delivered using a dual AAV vector system to a subject for expression. Without wishing to be bound by theory, it is believed that the AAV vectors described herein can be used to administer and/or deliver a functional truncated dystrophin protein in order to achieve sustained and high concentrations and/or more consistent levels of the dystrophin protein. The compositions and methods described herein can be used in the treatment of disorders associated with a lack of a dystrophin protein and/or activity, such as muscular dystrophy, e.g., Duchenne muscular dystrophy (DMD).
I. Definitions
[0214] As used herein, each of the following terms has the meaning associated with it in this section.
[0215] The articles “a” and “an” are used herein to refer to one or to more than one (i.e., to at least one) of the grammatical object of the article. By way of example, “an element” means one element or more than one element, e.g., a plurality of elements.
[0216] The term "including" is used herein to mean, and is used interchangeably with, the phrase "including but not limited to".
[0217] The term "or" is used herein to mean, and is used interchangeably with, the term "and/or," unless context clearly indicates otherwise.
[0218] Adeno-associated virus (AAV): As used herein, the term “adeno-associated virus” or “AAV” refers to members of the Dependoparvovirus genus or a variant, e.g., a functional variant, thereof. In some embodiments, the AAV is wild-type, or naturally occurring. In some embodiments, the AAV is recombinant.
[0219] AAV Particle: As used herein, an “AAV particle” refers to a particle or a virion comprising an AAV capsid, e.g., an AAV capsid variant, and a polynucleotide, e.g., a viral genome. In some embodiments, the viral genome of the AAV particle comprises at least one payload region encoding a protein of interest, e.g., a truncated dystrophin protein or portion thereof, and at least one ITR. In some embodiments, the AAV particle is capable of delivering a nucleic acid, e.g., a payload region, encoding a payload to cells, typically, mammalian, e.g., human, cells. In some embodiments, an AAV particle of the present disclosure may be produced recombinantly. In some embodiments, an AAV particle may be derived from any serotype, described herein or known in the art, including combinations of serotypes (e.g., “pseudotyped” AAV) or from various genomes (e.g., single stranded or self-complementary). In some embodiments, the AAV particle may be replication defective, targeted to a specific tissue or subset of tissues, and/or detargeted to a specific tissue or subset of tissues. In some embodiments, the AAV particle may comprise a peptide, e.g., targeting peptide, present, e.g., inserted into, the capsid to enhance tropism for a desired target tissue. It is to be understood that reference to the AAV particle of the disclosure also includes pharmaceutical compositions thereof, even if not explicitly recited.
[0220] AAV vector: As used herein, the term "AAV vector" or “AAV construct” refers to a vector derived from an adeno-associated virus serotype. "AAV vector" refers to a vector that includes AAV nucleotide sequences as well as heterologous nucleotide sequences. AAV vectors require only the 145 base terminal repeats in cis to generate virus. All other viral sequences are dispensable and may be supplied in trans (Muzyczka (1992) Curr. Topics Microbiol. Immunol. 158:97-129). Typically, the recombinant AAV vector genome will only retain the inverted terminal repeat (ITR) sequences so as to maximize the size of the transgene that can be efficiently packaged by the vector. The ITRs need not be the wild-type nucleotide sequences, and may be altered, e.g., by the insertion, deletion or substitution of nucleotides, as long as the sequences provide for functional rescue, replication and packaging.
[0221] Administering: As used herein, the term “administering” to a subject includes dispensing, delivering or applying a composition of the disclosure to a subject by any suitable route for delivery of the composition to the desired location in the subject. Alternatively, or in combination, delivery is by the topical, parenteral or oral route, intracerebral injection, intramuscular injection, subcutaneous/intradermal injection, intravenous injection, buccal administration, transdermal delivery and administration by the rectal, colonic, vaginal, intranasal or respiratory tract route.
[0222] Capsid: As used herein, the term “capsid” refers to the exterior, e.g., a protein shell, of a virus particle, e.g., an AAV particle, that is substantially (e.g., >50%, >60%, >70%, >80%, >90%, >95%, >99%, or 100%) protein. In some embodiments, the capsid is an AAV capsid comprising an AAV capsid protein described herein, e.g., a VP1, VP2, and/or VP3 polypeptide. The AAV capsid protein can be a wild-type AAV capsid protein or a variant, e.g., a structural and/or functional variant from a wild-type or a reference capsid protein, referred to herein as an “AAV capsid variant.” In some embodiments, the AAV capsid variant described herein has the ability to enclose, e.g., encapsulate, a viral genome and/or is capable of entry into a cell, e.g., a mammalian cell.
[0223] Codon optimization: As used herein, the term “codon optimization” refers to a process of changing codons of a given gene in such a manner that the polypeptide sequence encoded by the gene remains the same while the changed codons improve the process of expression of the polypeptide sequence. For example, if the polypeptide is of a human protein sequence and expressed in E. coli, expression will often be improved if codon optimization is performed on the DNA sequence to change the human codons to codons that are more effective for expression in E. coli. Human protein-coding nucleic acid sequences delivered by vectors such as AAV can be further codon optimized for more effective expression in human cells and/or specific human cell or tissue types.
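The defining property of codon optimization, namely that the codons change while the encoded polypeptide does not, can be shown with a minimal sketch. The translation table is the standard genetic code; the PREFERRED mapping, the function names, and the toy input sequence are hypothetical and chosen only for illustration. They do not represent the optimization procedure or any sequence of the present disclosure.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Hypothetical preferred codon per amino acid (illustration only).
PREFERRED = {"L": "CTG", "A": "GCC", "K": "AAG", "E": "GAG"}


def codons(dna: str):
    """Split an in-frame DNA string into consecutive codons."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return [dna[i:i + 3] for i in range(0, len(dna), 3)]


def translate(dna: str) -> str:
    """Translate in-frame DNA; '*' marks a stop codon."""
    return "".join(CODON_TABLE[c] for c in codons(dna))


def optimize(dna: str) -> str:
    """Swap each codon for the preferred synonymous codon, if one is listed."""
    return "".join(PREFERRED.get(CODON_TABLE[c], c) for c in codons(dna))


original = "ATGTTAGCAAAAGAATAA"      # hypothetical toy input: M L A K E stop
optimized = optimize(original)

# The nucleotide sequence changes, but the encoded protein does not.
same_protein = translate(original) == translate(optimized)
```

A realistic procedure would draw codon preferences from usage tables for the target cell type and would typically weigh further constraints (for example, GC content or CpG content), which this sketch omits.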
[0224] Contacting: As used herein, the term "contacting" (i.e., contacting a cell with an agent) is intended to include incubating the agent and the cell together in vitro (e.g., adding the agent to cells in culture) or administering the agent to a subject such that the agent and cells of the subject are contacted in vivo. The term "contacting" is not intended to include exposure of cells to an agent that may occur naturally in a subject (i.e., exposure that may occur as a result of a natural physiological process).
[0225] CpG Motif: As used herein, a “CpG motif” is a pattern of bases that includes a central CpG (“p” refers to the phosphodiester link between consecutive C and G nucleotides) surrounded by at least one base flanking (on the 3' and the 5' side of) the central CpG.
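One way to read this definition is that a CG dinucleotide counts as a CpG motif only when at least one base is present on each side of it. The sketch below locates such motifs under that reading; the function name and the example strings are hypothetical, and positions are reported as 0-based indices of the C.

```python
def find_cpg_motifs(seq: str) -> list[int]:
    """Return 0-based positions of each 'CG' that has at least one
    flanking base on both its 5' and 3' sides."""
    seq = seq.upper()
    # Start at index 1 so a 5' flanking base exists; stop so that
    # index i + 2 (the 3' flanking base) is still inside the sequence.
    return [i for i in range(1, len(seq) - 2) if seq[i:i + 2] == "CG"]


# Hypothetical example: the CG at the very start lacks a 5' flank and the
# CG at the very end lacks a 3' flank, so only the internal CG qualifies.
positions = find_cpg_motifs("CGACGTTCG")
```

Counting the returned positions gives a simple measure of how many flanked CpG sites a sequence contains, which may be useful when comparing sequence variants.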
[0226] Certain constructs are described containing the Greek symbol “Δ”. In other instances, the same constructs may be described as “delta”. As used herein, the word “delta” and the symbol “Δ” are used interchangeably.
[0227] Dystrophin: As used herein, the term “dystrophin” refers to a sarcolemmal protein associated with the dystrophin-associated protein complex (DAPC) (Hoffman et al., Cell 51(6):919-28, 1987). The DAPC is composed of multiple proteins at the muscle sarcolemma that form a structural link between the extracellular matrix (ECM) and the cytoskeleton via dystrophin, an actin binding protein, and alpha-dystroglycan, a laminin-binding protein. These structural links act to stabilize the muscle cell membrane during contraction and protect against contraction-induced damage. With dystrophin loss, membrane fragility results in sarcolemmal tears and an influx of calcium, triggering calcium-activated proteases and segmental fiber necrosis (Straub et al., Curr Opin. Neurol. 10(2):168-75, 1997). This uncontrolled cycle of muscle degeneration and regeneration ultimately exhausts the muscle stem cell population, resulting in progressive muscle weakness, endomysial inflammation, and fibrotic scarring (Sacco et al., Cell, 2010. 143(7): p. 1059-71). The dystrophin (DMD) gene is 2.2 megabases at locus Xp21 and has 79 exons encoding a protein which is over 3500 amino acids. Normal skeletal muscle tissue contains only small amounts of dystrophin but its absence or abnormal expression leads to the development of severe and incurable symptoms. Some mutations in the dystrophin gene lead to the production of defective dystrophin and severe dystrophic phenotype in affected patients. Some mutations in the dystrophin gene lead to partially-functional dystrophin protein and a much milder dystrophic phenotype in affected patients.
[0228] Functional: As used herein, the term “functional” or “functional protein” refers to a truncated protein that is capable of, partially or completely, restoring a function of an endogenously expressed full-length protein. A functional truncated dystrophin protein is capable of, partially or completely, restoring a function, e.g., supporting a link between the extracellular matrix and the cytoskeleton, of a full-length dystrophin protein in vitro or in vivo, e.g., in a disease model in an animal, or in a human.
[0229] Dystrophin-associated disorder: The terms “Dystrophin-associated disorder,” “Dystrophin-associated disease,” and the like refer to diseases or disorders having a deficiency in the DMD gene, such as a heritable, e.g., X-linked, mutation in DMD resulting in deficient or defective dystrophin protein expression in patient cells. Dystrophin-associated disorders include, but are not limited to muscular dystrophies, e.g., Duchenne muscular dystrophy (DMD), or Becker muscular dystrophy (BMD).
[0230] Isolated: As used herein, the term “isolated” refers to a substance or entity that is altered or removed from the natural state, e.g., altered or removed from at least some of the components with which it is associated in the natural state. For example, a nucleic acid or a peptide naturally present in a living animal is not “isolated,” but the same nucleic acid or peptide partially or completely separated from the coexisting materials of its natural state is “isolated.” An isolated nucleic acid or protein can exist in substantially purified form, or can exist in a non-native environment such as, for example, a host cell. Such polynucleotides could be part of a vector and/or such polynucleotides or polypeptides could be part of a composition and still be isolated in that such vector or composition is not part of the environment in which it is found in nature. In some embodiments, an isolated nucleic acid is recombinant, e.g., incorporated into a vector.
[0231] Muscle cell/tissue: As used herein, the term “muscle cell” or “muscle tissue” refers to a cell or group of cells derived from muscle of any kind, for example, skeletal muscle and smooth muscle, e.g., from the digestive tract, urinary bladder, blood vessels or cardiac tissue. Such muscle cells may be differentiated or undifferentiated, such as myoblasts, myocytes, myotubes, cardiomyocytes and cardiomyoblasts.
[0232] Muscular dystrophy: As used herein, the term "muscular dystrophy" or "muscular dystrophies" refers to a group of hereditary muscle diseases that weaken skeletal muscles. Muscular dystrophies are characterized by a genetic defect resulting in muscle weakness or loss of muscle tissue which progressively increases over time. Muscular dystrophies include, but are not limited to, Duchenne muscular dystrophy, Becker muscular dystrophy, myotonic dystrophy, congenital muscular dystrophy, distal muscular dystrophy, Emery-Dreifuss muscular dystrophy, facioscapulohumeral muscular dystrophy, limb girdle muscular dystrophy, and oculopharyngeal muscular dystrophy.
[0233] Duchenne muscular dystrophy (DMD): As used herein, the term “Duchenne muscular dystrophy (DMD)” refers to a fatal, X-linked genetic disease caused by mutations in the dystrophin gene and a complete loss of a functional dystrophin protein. As a result of this genetic defect, individuals with DMD may have symptoms such as trouble walking and running, falling frequently, fatigue, learning disabilities/difficulties, heart issues as a result of impact on heart muscle functioning, and breathing problems due to weakening of respiratory muscles involved in lung function. Symptoms of muscle weakness associated with DMD typically begin in childhood, often between 3 and 6 years of age. Most individuals with DMD require full-time wheelchair use in their early teens and eventually lose their ability to do daily tasks, such as using restrooms, washing and eating, independently. The disease progresses to life-threatening heart and respiratory failure, often resulting in premature death in the 20s or 30s.
[0234] Becker muscular dystrophy (BMD): As used herein, the term “Becker muscular dystrophy (BMD)” refers to an X-linked muscle disease caused by in-frame mutations of the dystrophin gene. These BMD-causing mutations result in the production of a truncated isoform of dystrophin protein that is partially functional and expressed at reduced amounts. The reduced levels of a truncated dystrophin protein lead to progressive skeletal and cardiac muscle dysfunction. BMD presents with reduced severity compared with Duchenne muscular dystrophy (DMD).
[0235] Mutation: As used herein, the term “mutation” refers to a change and/or alteration. In some embodiments, mutations may be changes and/or alterations to proteins (including peptides and polypeptides) and/or nucleic acids (including polynucleic acids). In some embodiments, mutations comprise changes and/or alterations to a protein and/or nucleic acid sequence. Such changes and/or alterations may comprise the addition, substitution and/or deletion of one or more amino acids (in the case of proteins and/or peptides) and/or nucleotides (in the case of nucleic acids and/or polynucleic acids). In embodiments wherein mutations comprise the addition and/or substitution of amino acids and/or nucleotides, such additions and/or substitutions may comprise 1 or more amino acid and/or nucleotide residues and may include modified amino acids and/or nucleotides. One or more mutations may result in a “mutant,” “derivative,” or “variant,” e.g., of a nucleic acid sequence or polypeptide or protein sequence.
[0236] Naturally occurring: As used herein, “naturally occurring” or “wild-type” means existing in nature without artificial aid, or involvement of the hand of man. “Naturally occurring” or “wild-type” may refer to a native form of a biomolecule, sequence, or entity.
[0237] Nucleic acid: As used herein, the terms “nucleic acid,” “polynucleotide,” and “oligonucleotide” refer to any nucleic acid polymers composed of either polydeoxyribonucleotides (containing 2-deoxy-D-ribose), or polyribonucleotides (containing D-ribose), or any other type of polynucleotide that is an N-glycoside of a purine or pyrimidine base, or modified purine or pyrimidine bases. There is no intended distinction in length between the terms “nucleic acid,” “polynucleotide,” and “oligonucleotide,” and these terms will be used interchangeably. These terms refer only to the primary structure of the molecule. Thus, these terms include double- and single-stranded DNA, as well as double- and single-stranded RNA.
[0238] Operably linked: As used herein, the phrase “operably linked” refers to a functional connection between two or more molecules, constructs, transcripts, entities, moieties or the like.
[0239] Particle: As used herein, a “particle” is a virus comprised of at least two components, a protein capsid and a polynucleotide sequence enclosed within the capsid.
[0240] Payload: As used herein, “payload” or “payload region” or “transgene” refers to one or more polynucleotides or polynucleotide regions encoded by or within a viral genome or an expression product of such polynucleotide or polynucleotide region, e.g., a transgene, a polynucleotide encoding a polypeptide.
[0241] Peptide: As used herein, the term “peptide” refers to a chain of amino acids that is less than or equal to about 50 amino acids long, e.g., about 5, 10, 15, 20, 25, 30, 35, 40, 45, or 50 amino acids long.
[0242] Pharmaceutically acceptable: The phrase “pharmaceutically acceptable” is employed herein to refer to those compounds, materials, compositions, and/or dosage forms which are, within the scope of sound medical judgment, suitable for use in contact with the tissues of human beings and animals without excessive toxicity, irritation, allergic response, or other problem or complication, commensurate with a reasonable benefit/risk ratio.
[0243] Pharmaceutically acceptable excipients: As used herein, the term “pharmaceutically acceptable excipient” refers to any ingredient other than active agents (e.g., as described herein) present in pharmaceutical compositions and having the properties of being substantially nontoxic and non-inflammatory in subjects. In some embodiments, pharmaceutically acceptable excipients are vehicles capable of suspending and/or dissolving active agents. Excipients may include, for example: antiadherents, antioxidants, binders, coatings, compression aids, disintegrants, dyes (colors), emollients, emulsifiers, fillers (diluents), film formers or coatings, flavors, fragrances, glidants (flow enhancers), lubricants, preservatives, printing inks, sorbents, suspending or dispersing agents, sweeteners, and waters of hydration. Excipients include, but are not limited to: butylated hydroxytoluene (BHT), calcium carbonate, calcium phosphate (dibasic), calcium stearate, croscarmellose, cross-linked polyvinyl pyrrolidone, citric acid, crospovidone, cysteine, ethylcellulose, gelatin, hydroxypropyl cellulose, hydroxypropyl methylcellulose, lactose, magnesium stearate, maltitol, mannitol, methionine, methylcellulose, methyl paraben, microcrystalline cellulose, polyethylene glycol, polyvinyl pyrrolidone, povidone, pregelatinized starch, propyl paraben, retinyl palmitate, shellac, silicon dioxide, sodium carboxymethyl cellulose, sodium citrate, sodium starch glycolate, sorbitol, starch (corn), stearic acid, sucrose, talc, titanium dioxide, vitamin A, vitamin E, vitamin C, and/or xylitol.
[0244] Polypeptide: As used herein, the term “polypeptide” refers to an organic polymer consisting of a large number of amino-acid residues bonded together in a chain. A monomeric protein molecule is a polypeptide.
[0245] Preventing: As used herein, the term “preventing” refers to partially or completely delaying onset of an infection, disease, disorder and/or condition; partially or completely delaying onset of one or more symptoms, features, or clinical manifestations of a particular infection, disease, disorder, and/or condition; partially or completely delaying progression from an infection, a particular disease, disorder and/or condition; and/or decreasing the risk of developing pathology associated with the infection, the disease, disorder, and/or condition.
[0246] Promoter: As used herein, the term “promoter” refers to a nucleic acid site to which a polymerase enzyme will bind to initiate transcription (DNA to RNA) or reverse transcription (RNA to DNA).
[0247] Recombinant nucleic acid molecule: As used herein, the term “recombinant nucleic acid molecule” or “recombinant polynucleotide” refers to a nucleic acid molecule or a polynucleotide having sequences that are not naturally joined together. An amplified or assembled recombinant polynucleotide may be included in a suitable vector, and the vector can be used to transform a suitable host cell. A recombinant polynucleotide may serve a non-coding function (e.g., promoter, origin of replication, ribosome-binding site, etc.) as well.
[0248] Regulatory sequence: As used herein, the term “regulatory sequence” is intended to include promoters, enhancers and other expression control elements (e.g., polyadenylation signals). Such regulatory sequences are described, for example, in Goeddel; Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. (1990). Regulatory sequences include those which direct constitutive expression of a nucleotide sequence in many types of host cells, those which are constitutively active, those which are inducible, and those which direct expression of the nucleotide sequence only in certain host cells (e.g., tissue-specific regulatory sequences). The expression vectors of the disclosure can be introduced into host cells to thereby produce proteins or portions thereof, including fusion proteins or portions thereof, encoded by nucleic acids as described herein.
[0249] Ribozyme: As used herein, the term “ribozyme” refers to an RNA molecule capable of acting as an enzyme. For example, some ribozymes are capable of cleaving RNA molecules. RNA-cleaving ribozymes typically consist at least of a catalytic domain and a recognition sequence that is recognized by the catalytic domain. The catalytic domain can be a part of the same RNA molecule as the recognition sequence, and thus mediate cis-cleavage. Alternatively, the catalytic domain can be a separate RNA molecule from the RNA molecule comprising the recognition sequence, and thus mediate trans-cleavage.
[0250] Sequence identity: As used herein, the term “sequence identity” refers to the overall relatedness between polymeric molecules, e.g., between nucleic acid molecules (e.g., DNA molecules and/or RNA molecules) and/or between polypeptide molecules. Sequence identity of polymeric molecules to one another can be calculated as the percentage of nucleotides or amino acid residues in a candidate sequence that are identical to the nucleotides or amino acid residues in a given polymeric molecule, after aligning the sequences and introducing gaps, if necessary, to achieve the maximum percent sequence identity, and not considering any conservative substitutions as part of the sequence identity. Alignment for purposes of determining percent sequence identity can be achieved in various ways that are within the skill in the art, for instance, using publicly available computer software such as BLAST, BLAST-2, or ALIGN. Those skilled in the art can determine appropriate parameters for measuring alignment, including any algorithms needed to achieve maximal alignment over the full length of the sequences being compared.
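The calculation described in the definition of sequence identity can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned by an external tool (so both strings have equal length, with gaps written as “-”), and it uses the number of residues in the candidate sequence as the denominator, which is one reading of the definition; alignment programs may report identity over a different denominator. The function name and the toy sequences are hypothetical and are not taken from the disclosure.

```python
def percent_identity(aligned_candidate: str, aligned_reference: str, gap: str = "-") -> float:
    """Percent identity for two sequences that are ALREADY aligned.

    Both inputs must have the same length; gap columns are marked with `gap`.
    Conservative substitutions are not counted as identical.
    """
    if len(aligned_candidate) != len(aligned_reference):
        raise ValueError("aligned sequences must have equal length")

    # Residues of the candidate sequence (gap columns in the candidate are not residues).
    candidate_residues = sum(1 for c in aligned_candidate if c != gap)
    if candidate_residues == 0:
        raise ValueError("candidate sequence has no residues")

    # Count columns where the candidate residue is identical to the reference residue.
    identical = sum(
        1
        for c, r in zip(aligned_candidate, aligned_reference)
        if c != gap and c == r
    )
    return 100.0 * identical / candidate_residues


# Toy, hypothetical aligned peptide fragments (not sequences from this disclosure).
candidate = "MLWWEE-VEDCY"
reference = "MLWWEEAVEDSY"
print(f"{percent_identity(candidate, reference):.1f}% identity")
```

Because the denominator is an assumption, any threshold comparison (for example, “at least 90% identical”) should state which denominator and alignment parameters were used.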
[0251] Similarity: As used herein, the term “similarity” refers to the overall relatedness between polymeric molecules, e.g. between polynucleotide molecules (e.g. DNA molecules and/or RNA molecules) and/or between polypeptide molecules. Calculation of percent similarity of polymeric molecules to one another can be performed in the same manner as a calculation of percent identity, except that calculation of percent similarity takes into account conservative substitutions as are understood in the art.
[0252] Split Site: As used herein, the term “Split Site” refers to the location in the nucleic acid sequence where a truncated midi-Dystrophin transgenic sequence is divided into two portions, which will then be delivered to cells by an N-terminal (Nt) vector (or plasmid) and a C-terminal (Ct) vector or plasmid. As the ribozyme sequences and adjoining splice donor sequences in the Nt vector (or plasmid) or splice acceptor sequences in the Ct vector (or plasmid) are influenced by the 5’ and 3’ protein coding sequences (in the Nt and Ct vectors, respectively), Split Sites must be selected carefully based on their sequence properties and can be empirically optimized for enhanced ability to join the Nt and Ct sequences for midi-Dys protein expression.
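The partitioning step in the Split Site definition can be sketched as follows. This is only an illustration of dividing one coding sequence into an N-terminal (Nt) portion and a C-terminal (Ct) portion at a codon boundary; it does not model the ribozyme, splice donor, or splice acceptor sequences, and it does not attempt the empirical optimization the definition mentions. The function name, the choice to cut on a codon boundary, and the toy sequence are all assumptions made for illustration.

```python
def split_at_codon(coding_sequence: str, codon_index: int) -> tuple[str, str]:
    """Divide a coding sequence into Nt and Ct portions before codon `codon_index`.

    `codon_index` is 0-based; both portions must be non-empty.
    """
    if len(coding_sequence) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    n_codons = len(coding_sequence) // 3
    if not 0 < codon_index < n_codons:
        raise ValueError("split must fall strictly inside the coding sequence")

    cut = codon_index * 3  # nucleotide position of the candidate split site
    return coding_sequence[:cut], coding_sequence[cut:]


# Toy, hypothetical 6-codon sequence (not a dystrophin sequence).
toy_cds = "ATGGCTGAAGTTCTGTAA"
nt_part, ct_part = split_at_codon(toy_cds, 3)
print(nt_part, ct_part)

# Rejoining the two portions restores the original coding sequence.
assert nt_part + ct_part == toy_cds
```

In practice each candidate position would still need to be evaluated for its flanking sequence properties, as the definition notes; the sketch only enumerates where a cut falls.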
[0253] Subject: As used herein, the term “subject” or “patient” refers to any organism to which a composition in accordance with the disclosure may be administered, e.g., for experimental, diagnostic, prophylactic, and/or therapeutic purposes. Similarly, “subject” or “patient” refers to an organism who may seek, who may require, who is receiving, or who will receive treatment or who is under care by a trained professional for a particular disease or condition. Typical subjects include animals (e.g., mammals such as mice, rats, rabbits, dogs, non-human primates, and humans). In some embodiments, the subject is a mammal, e.g., a primate, e.g., a human. In certain embodiments, the subject is a human. In some embodiments, the subject is a child. In other embodiments, the subject is an adult. In certain embodiments, a subject or patient may be susceptible to or suspected of having a dystrophin-associated disorder, e.g., muscular dystrophy, e.g., DMD. In certain embodiments, a subject or patient may be diagnosed with a dystrophin-associated disorder, e.g., muscular dystrophy, e.g., DMD.
[0254] Substantially: As used herein, the term “substantially” refers to the qualitative condition of exhibiting total or near-total extent or degree of a characteristic or property of interest. One of ordinary skill in the biological arts will understand that biological and chemical phenomena rarely, if ever, go to completion and/or proceed to completeness or achieve or avoid an absolute result. The term “substantially” is therefore used herein to capture the potential lack of completeness inherent in many biological and chemical phenomena.
[0255] Therapeutic Agent: The term “therapeutic agent” refers to any agent that, when administered to a subject has a therapeutic, diagnostic, and/or prophylactic effect and/or elicits a desired biological and/or pharmacological effect.
[0256] Therapeutically effective amount: As used herein, the term “therapeutically effective amount” means an amount of an agent to be delivered (e.g., nucleic acid, drug, therapeutic agent, diagnostic agent, prophylactic agent, etc.) that is sufficient, when administered to a subject suffering from or susceptible to an infection, disease, disorder, and/or condition, to treat, improve symptoms of, diagnose, prevent, and/or delay the onset of the infection, disease, disorder, and/or condition. In some embodiments, a therapeutically effective amount is provided in a single dose. In some embodiments, a therapeutically effective amount is administered in a dosage regimen comprising a plurality of doses. Those skilled in the art will appreciate that in some embodiments, a unit dosage form may be considered to comprise a therapeutically effective amount of a particular agent or entity if it comprises an amount that is effective when administered as part of such a dosage regimen.
[0257] Treating: As used herein, the term “treating” refers to partially or completely alleviating, ameliorating, improving, relieving, reversing, delaying onset of, inhibiting progression of, reducing severity of, and/or reducing incidence of one or more symptoms or features of a particular infection, disease, disorder, and/or condition. Treatment may be administered to a subject who does not exhibit signs of a disease, disorder, and/or condition and/or to a subject who exhibits only early signs of a disease, disorder, and/or condition for the purpose of decreasing the risk of developing pathology associated with the disease, disorder, and/or condition.
[0258] Truncated Dystrophin: As used herein, a “truncated dystrophin” is any protein shorter than the full length 427 kDa Dystrophin protein isoform. A truncated dystrophin refers to any transgenic Dystrophin protein delivered by either a single AAV vector or reconstituted by a dual vector system, since in both cases, these delivery technologies are incapable of delivering the full-length 427 kDa dystrophin protein. In some embodiments, a truncated dystrophin protein, as described herein, is functional, i.e., the truncated dystrophin protein is capable of restoring, partially or completely, a function of a full-length dystrophin protein in vitro (e.g., supporting a link between the extracellular matrix and the cytoskeleton) or in vivo, e.g., in a disease model in an animal, or in a human.
[0259] Vector: As used herein, a “vector” is any molecule or moiety which transports, transduces or otherwise acts as a carrier of a heterologous molecule. Vectors of the present disclosure may be produced recombinantly and may be based on and/or may comprise adeno-associated virus (AAV) parent or reference sequence(s). Such parent or reference AAV sequences may serve as an original, second, third or subsequent sequence for engineering vectors. In non-limiting examples, such parent or reference AAV sequences may comprise any one or more of the following sequences: a polynucleotide sequence encoding a polypeptide or multi-polypeptide, having a sequence that may be wild-type or modified from wild-type and which sequence may encode full-length or partial sequence of a protein, protein domain, or one or more subunits of dystrophin protein and variants thereof; a polynucleotide encoding dystrophin protein and variants thereof, having a sequence that may be wild-type or modified from wild-type; and a transgene encoding dystrophin protein and variants thereof that may or may not be modified from wild-type sequence.
[0260] Viral genome: As used herein, a “viral genome” or “vector genome” is a polynucleotide comprising at least one inverted terminal repeat (ITR) and at least one encoded payload. A viral genome encodes at least one copy of the payload.
[0261] Wild-type: As used herein, “wild-type” is a native form of a biomolecule, sequence, or entity. The term “wild type” is a term of the art understood by skilled persons and means the typical form of an organism, strain, gene or characteristic as it occurs in nature as distinguished from mutant or variant forms.
II. Truncated Dystrophin
[0262] The present disclosure provides isolated recombinant nucleic acid molecules encoding a dystrophin protein or portion thereof, e.g., a truncated dystrophin protein or portion thereof.
[0263] Dystrophin is a cytoplasmic protein encoded by the DMD gene, which is a part of a protein complex that connects the cytoskeleton of a muscle fiber to the surrounding extracellular matrix through the cell membrane. Normally, the dystrophin protein, located primarily in skeletal and cardiac muscles, with smaller amounts expressed in the brain, acts as a shock absorber during muscle fiber contraction by linking the actin of the contractile apparatus to the layer of connective tissue that surrounds each muscle fiber. In muscle, dystrophin is localized at the cytoplasmic face of the sarcolemma membrane.
[0264] The full-length dystrophin muscle isoform (Dp427m) is a large (427 kDa) protein comprising a number of subdomains that contribute to its function. These subdomains include, in order from the amino-terminus toward the carboxy-terminus, the N-terminal actin-binding domain (ACBD), a central so-called “rod” domain, a cysteine-rich (CR) domain and lastly a carboxy-terminal (CT) domain. The rod domain is comprised of 4 proline-rich hinge domains (abbreviated H) and 24 spectrin-like repeats (abbreviated R) in the following order: a first hinge domain (H1), 3 spectrin-like repeats (R1, R2, R3), a second hinge domain (H2), 16 more spectrin-like repeats (R4, R5, R6, R7, R8, R9, R10, R11, R12, R13, R14, R15, R16, R17, R18, R19), a third hinge domain (H3), 5 more spectrin-like repeats (R20, R21, R22, R23, R24), and finally a fourth hinge domain (H4). Subdomains toward the carboxy-terminus of the protein are involved in connecting to the dystrophin-associated glycoprotein complex (DGC), a large protein complex that forms a critical link between the cytoskeleton and the extracellular matrix. The amino acid sequences of the various domains are provided in Table 2 below.
Table 2. Domains of Human Dystrophin
[0265] The DMD gene is one of the largest known human genes at approximately 2.2 Mb. The gene is located on the X chromosome at position Xp21 and contains 79 exons. The most common mutations that cause Duchenne muscular dystrophy (DMD) or Becker muscular dystrophy (BMD) are large deletion mutations of one or more exons (60-70%), but duplication mutations (5-10%) and single nucleotide variants (including small deletions or insertions, single-base changes, and splice site changes, accounting for approximately 25%-35% of pathogenic variants in males with DMD and about 10%-20% of males with BMD) can also cause pathogenic dystrophin variants.
[0266] In DMD, mutations often lead to a frame shift resulting in a premature stop codon and a truncated, non-functional or unstable protein. Nonsense point mutations can also result in premature termination codons with the same result. The BMD genotype is similar to DMD in that deletions are present in the dystrophin gene. However, these deletions leave the reading frame intact. Thus an internally truncated but partially functional dystrophin protein is created. Thus, changing a DMD genotype to a BMD genotype is a common strategy to correct dystrophin. There are many strategies to correct dystrophin, many of which rely on restoring the reading frame of the endogenous dystrophin. This shifts the disease genotype from DMD to BMD.
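The distinction drawn above between frame-shifting deletions and deletions that leave the reading frame intact can be illustrated with a small sketch. It assumes the standard rule that a deletion within a coding sequence preserves the reading frame only when the number of deleted nucleotides is a multiple of three. The function names and the toy sequence are hypothetical; no real exon coordinates or lengths from the DMD gene are used.

```python
from typing import Optional

STOP_CODONS = {"TAA", "TAG", "TGA"}


def delete_span(coding_sequence: str, start: int, length: int) -> str:
    """Return the sequence with `length` nucleotides removed beginning at `start`."""
    return coding_sequence[:start] + coding_sequence[start + length:]


def preserves_reading_frame(deleted_length: int) -> bool:
    """A deletion keeps the downstream frame only if its length is a multiple of 3."""
    return deleted_length % 3 == 0


def first_stop_codon(coding_sequence: str) -> Optional[int]:
    """0-based index of the first in-frame stop codon, or None if there is none."""
    for i in range(0, len(coding_sequence) - 2, 3):
        if coding_sequence[i:i + 3] in STOP_CODONS:
            return i // 3
    return None


# Toy, hypothetical coding sequence: ATG AAA CTG ACC GGT TAA
toy_cds = "ATGAAACTGACCGGTTAA"

in_frame = delete_span(toy_cds, 3, 3)     # removes one whole codon
frameshift = delete_span(toy_cds, 3, 1)   # removes a single nucleotide

for label, seq, n in (("in-frame", in_frame, 3), ("frameshift", frameshift, 1)):
    print(label, preserves_reading_frame(n), first_stop_codon(seq), len(seq) // 3)
```

In this toy case the three-nucleotide deletion leaves the stop codon at the end of a shortened sequence, while the one-nucleotide deletion shifts the frame and exposes an earlier stop codon, mirroring the internally truncated versus prematurely terminated outcomes described in the paragraph.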
[0267] The present disclosure provides truncated dystrophin gene sequences and expression vectors containing the same. Such genes and expression vectors are useful in gene therapy to prevent or treat dystrophin-associated disorders, e.g., muscular dystrophies, e.g., Duchenne muscular dystrophy (DMD), in subjects in need thereof. Expression of functional truncated proteins in transduced muscle cells is able to replicate and replace at least some of the function normally attributable to full-length dystrophin, such as supporting a mechanically strong link between the extracellular matrix and the cytoskeleton.
[0268] In one aspect, the present disclosure provides an isolated, recombinant nucleic acid molecule encoding a truncated human dystrophin protein, wherein the truncated dystrophin protein comprises an ABCD domain (SEQ ID NO: 21), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0269] In some embodiments, the truncated dystrophin protein further comprises an H1 Domain (SEQ ID NO: 22). In some embodiments, the truncated dystrophin protein comprises an amino acid sequence selected from the group consisting of: a. midi-Dys ΔR1-R15 (SEQ ID NO: 83), b. midi-Dys ΔR2-R15 (SEQ ID NO: 84), c. midi-Dys ΔR3-R15 (SEQ ID NO: 85), d. midi-Dys ΔH2-R15 (SEQ ID NO: 86), e. midi-Dys ΔR4-R15 (SEQ ID NO: 87), f. midi-Dys ΔR5-R15 (SEQ ID NO: 88), g. midi-Dys Δ exon 13-33 (SEQ ID NO: 93), h. midi-Dys Δ exon 13-39 (SEQ ID NO: 94), i. midi-Dys Δ exon 13-41 (SEQ ID NO: 95), j. midi-Dys Δ exon 13-48 (SEQ ID NO: 96), k. midi-Dys Δ exon 15-39 (SEQ ID NO: 97), l. midi-Dys Δ exon 15-41 (SEQ ID NO: 98), m. midi-Dys Δ exon 15-48 (SEQ ID NO: 99), n. midi-Dys Δ exon 17-39 (SEQ ID NO: 100), o. midi-Dys Δ exon 17-41 (SEQ ID NO: 101), p. midi-Dys Δ exon 17-48 (SEQ ID NO: 102), q. midi-Dys Δ exon 18-39 (SEQ ID NO: 220), r. midi-Dys Δ exon 18-41 (SEQ ID NO: 221), s. midi-Dys Δ exon 18-48 (SEQ ID NO: 222), t. midi-Dys Δ exon 19-39 (SEQ ID NO: 103), u. midi-Dys Δ exon 19-41 (SEQ ID NO: 104), v. midi-Dys Δ exon 19-48 (SEQ ID NO: 105), w. midi-Dys Δ exon 21-41 (SEQ ID NO: 106), x. midi-Dys Δ exon 21-42 (SEQ ID NO: 223), and y. midi-Dys Δ exon 21-48 (SEQ ID NO: 107).
[0270] In some embodiments, the truncated dystrophin protein further comprises an R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), and R19 Domain (SEQ ID NO: 42).
[0271] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence selected from the group consisting of: a. midi-Dys ΔR1-R15 (SEQ ID NO: 83), b. midi-Dys ΔR2-R15 (SEQ ID NO: 84), c. midi-Dys ΔR3-R15 (SEQ ID NO: 85), d. midi-Dys ΔH2-R15 (SEQ ID NO: 86), e. midi-Dys ΔR4-R15 (SEQ ID NO: 87), f. midi-Dys ΔR5-R15 (SEQ ID NO: 88), g. midi-Dys Δ exon 10-33 (SEQ ID NO: 89), h. midi-Dys Δ exon 10-39 (SEQ ID NO: 90), i. midi-Dys Δ exon 10-41 (SEQ ID NO: 91), j. midi-Dys Δ exon 11-33 (SEQ ID NO: 216), k. midi-Dys Δ exon 11-39 (SEQ ID NO: 217), l. midi-Dys Δ exon 11-41 (SEQ ID NO: 218), m. midi-Dys Δ exon 13-33 (SEQ ID NO: 93), n. midi-Dys Δ exon 13-39 (SEQ ID NO: 94), o. midi-Dys Δ exon 13-41 (SEQ ID NO: 95), p. midi-Dys Δ exon 15-39 (SEQ ID NO: 97), q. midi-Dys Δ exon 15-41 (SEQ ID NO: 98), r. midi-Dys Δ exon 17-39 (SEQ ID NO: 100), s. midi-Dys Δ exon 17-41 (SEQ ID NO: 101), t. midi-Dys Δ exon 18-39 (SEQ ID NO: 220), u. midi-Dys Δ exon 18-41 (SEQ ID NO: 221), v. midi-Dys Δ exon 19-39 (SEQ ID NO: 103), w. midi-Dys Δ exon 19-41 (SEQ ID NO: 104), x. midi-Dys Δ exon 21-41 (SEQ ID NO: 106), and y. midi-Dys Δ exon 21-42 (SEQ ID NO: 223).
[0272] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), Hl domain (SEQ ID NO: 22), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO:42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0273] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), Hl domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO:42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0274] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), Hl domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO:24), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO:42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0275] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), Hl domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO:24), R3 domain (SEQ ID NO:25), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO:42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0276] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), Hl domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO:24), R3 domain (SEQ ID NO:25), H2 domain (SEQ ID NO:26), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), RI8 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO:42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0277] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), Hl domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO:24), R3 domain (SEQ ID NO:25), H2 domain (SEQ ID NO:26), R4 domain (SEQ ID NO:27), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO:42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
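Each domain list in paragraphs [0272] to [0277] reads as the full-length domain order of paragraph [0264] with one contiguous run of rod-domain elements removed. The sketch below makes that relationship explicit. It is an illustrative bookkeeping aid only: the function name is hypothetical, the label “ABCD” simply follows the domain listings above, domains are treated as opaque labels rather than sequences, and partial domains are not modeled.

```python
# Full-length domain order as described in paragraph [0264]:
# actin-binding domain, H1, R1-R3, H2, R4-R19, H3, R20-R24, H4, CR, CT.
FULL_LENGTH_ORDER = (
    ["ABCD", "H1", "R1", "R2", "R3", "H2"]
    + [f"R{i}" for i in range(4, 20)]
    + ["H3"]
    + [f"R{i}" for i in range(20, 25)]
    + ["H4", "CR", "CT"]
)


def delete_domains(first: str, last: str, order=FULL_LENGTH_ORDER) -> list:
    """Remove the contiguous run of domains from `first` through `last` (inclusive)."""
    i, j = order.index(first), order.index(last)
    if i > j:
        raise ValueError("`first` must not come after `last` in the domain order")
    return order[:i] + order[j + 1:]


# Removing R1 through R15 (which also removes H2, located between R3 and R4)
# yields the composition listed in paragraph [0272].
print(delete_domains("R1", "R15"))

# Removing R5 through R15 retains R4 and yields the composition in paragraph [0277].
print(delete_domains("R5", "R15"))
```

Treating a construct as an ordered list of labels makes it straightforward to check that a written domain listing is consistent with the intended deletion before any sequence-level work is done.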
[0278] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), a partial Hl domain (SEQ ID NO: 404), a partial R11 domain (SEQ ID NO:412), R12 Domain (SEQ ID NO: 35), R13 Domain (SEQ ID NO: 36), R14 domain (SEQ ID NO: 37), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO:42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0279] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), a partial Hl domain (SEQ ID NO: 404), a partial R14 Domain (SEQ ID NO: 413), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO:42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0280] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), a partial Hl domain (SEQ ID NO: 404), a partial R15 Domain (SEQ ID NO: 414), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO:42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0281] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), a partial Hl domain (SEQ ID NO: 404), a partial R19 Domain (SEQ ID NO:415), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0282] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), Hl domain (SEQ ID NO: 22), a partial R1 domain (SEQ ID NO: 405), a partial R11 domain (SEQ ID NO: 412), R12 Domain (SEQ ID NO: 35), R13 Domain (SEQ ID NO: 36), R14 Domain (SEQ ID NO: 37), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO:42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51). [0283] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), Hl domain (SEQ ID NO: 22), a partial R1 domain (SEQ ID NO: 405), a partial R14 Domain (SEQ ID NO: 413), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO:42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51 ). [0284] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), Hl domain (SEQ ID NO: 22), a partial R1 domain (SEQ ID NO: 405), a partial R15 Domain (SEQ ID NO: 414), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO:42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0285] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), a partial R1 domain (SEQ ID NO: 405), a partial R19 Domain (SEQ ID NO: 415), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0286] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), a partial R2 domain (SEQ ID NO: 406), a partial R11 domain (SEQ ID NO: 412), R12 Domain (SEQ ID NO: 35), R13 Domain (SEQ ID NO: 36), R14 Domain (SEQ ID NO: 37), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0287] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), a partial R2 domain (SEQ ID NO: 406), a partial R14 Domain (SEQ ID NO: 413), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0288] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), a partial R2 domain (SEQ ID NO: 406), a partial R15 Domain (SEQ ID NO: 414), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0289] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), a partial R2 domain (SEQ ID NO: 406), a partial R19 Domain (SEQ ID NO: 415), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0290] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), a partial R3 domain (SEQ ID NO: 407), a partial R14 Domain (SEQ ID NO: 413), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0291] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), a partial R3 domain (SEQ ID NO: 407), a partial R15 Domain (SEQ ID NO: 414), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0292] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), a partial R3 domain (SEQ ID NO: 407), a partial R19 Domain (SEQ ID NO: 415), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0293] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), a partial R3 domain (SEQ ID NO: 408), a partial R14 Domain (SEQ ID NO: 413), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0294] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), a partial R3 domain (SEQ ID NO: 408), a partial R15 Domain (SEQ ID NO: 414), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0295] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), a partial R3 domain (SEQ ID NO: 408), a partial R19 Domain (SEQ ID NO: 415), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0296] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), a partial R4 domain (SEQ ID NO: 409), a partial R14 Domain (SEQ ID NO: 413), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0297] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), a partial R4 domain (SEQ ID NO: 409), a partial R15 Domain (SEQ ID NO: 414), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0298] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), a partial R4 domain (SEQ ID NO: 409), a partial R19 Domain (SEQ ID NO: 415), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0299] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), a partial R4 domain (SEQ ID NO: 410), a partial R14 Domain (SEQ ID NO: 413), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0300] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), a partial R4 domain (SEQ ID NO: 410), a partial R15 Domain (SEQ ID NO: 414), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0301] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), a partial R4 domain (SEQ ID NO: 410), a partial R19 Domain (SEQ ID NO: 415), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0302] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), R4 domain (SEQ ID NO: 27), a partial R5 domain (SEQ ID NO: 411), a partial R15 Domain (SEQ ID NO: 414), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0303] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), R4 domain (SEQ ID NO: 27), a partial R5 domain (SEQ ID NO: 411), a partial R16 Domain (SEQ ID NO: 416), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0304] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), R4 domain (SEQ ID NO: 27), a partial R5 domain (SEQ ID NO: 411), a partial R19 Domain (SEQ ID NO: 415), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0305] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence selected from the group consisting of SEQ ID NOs: 83-107 and 216-223, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0306] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 83, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0307] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 84, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0308] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 85, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0309] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 86, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0310] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 87, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0311] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 88, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0312] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 89, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0313] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 90, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0314] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 91, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0315] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 92, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0316] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 93, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0317] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 94, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0318] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 95, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0319] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 96, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0320] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 97, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0321] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 98, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0322] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 99, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0323] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 100, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0324] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 101, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0325] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 102, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0326] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 103, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0327] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 104, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0328] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 105, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0329] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 106, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0330] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 107, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0331] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 216, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0332] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 217, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0333] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 218, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0334] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 219, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0335] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 220, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0336] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 221, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0337] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 222, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0338] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 223, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
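By way of illustration only, the "at least 90% ... identical" language above can be checked computationally. The following is a minimal sketch that assumes percent identity is taken as the number of identical aligned residues divided by the length of a global (Needleman-Wunsch) alignment. The scoring values and the function names `global_align`, `percent_identity`, and `meets_threshold` are hypothetical choices made for this example and are not taken from the specification; other alignment tools and identity definitions may give different values.

```python
def global_align(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -1):
    """Needleman-Wunsch global alignment with linear gap cost (illustrative scoring)."""
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a: str, b: str) -> float:
    """Identical aligned residues as a percentage of alignment length (assumed definition)."""
    al_a, al_b = global_align(a, b)
    if not al_a:  # both inputs empty
        return 0.0
    matches = sum(1 for x, y in zip(al_a, al_b) if x == y and x != "-")
    return 100.0 * matches / len(al_a)


def meets_threshold(candidate: str, reference: str, threshold: float = 90.0) -> bool:
    """True if the candidate is at least `threshold` percent identical to the reference."""
    return percent_identity(candidate, reference) >= threshold
```

In use, `reference` would be the amino acid sequence of one of the recited SEQ ID NOs and `candidate` the sequence under evaluation; the quadratic memory use of this sketch makes it suitable for illustration rather than for routine analysis of long sequences.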
[0339] In some embodiments, the amino acid sequence of the truncated dystrophin protein is not identical to the amino acid sequence of SEQ ID NO: 143. In some embodiments, the truncated dystrophin protein is not a polypeptide of 2361 amino acids. In some embodiments, the truncated dystrophin protein is less than 2361 amino acids in length. In some embodiments, the truncated dystrophin protein is greater than 2361 amino acids in length.
[0340] In some embodiments, the recombinant nucleic acid molecule encoding the truncated dystrophin protein comprises a nucleotide sequence selected from the group consisting of SEQ ID NOs: 108-132, 224-231, 260-280, and 396-403, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0341] In some embodiments, the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 108, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0342] In some embodiments, the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 109, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0343] In some embodiments, the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 110, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0344] In some embodiments, the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 111, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0345] In some embodiments, the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 112, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0346] In some embodiments, the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 113, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0347] In some embodiments, the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 114, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0348] In some embodiments, the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 115, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0349] In some embodiments, the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 116, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0350] In some embodiments, the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 117, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0351] In some embodiments, the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 118, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0352] In some embodiments, the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 119, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0353] In some embodiments, the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 120, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0354] In some embodiments, the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 121, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0355] In some embodiments, the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 122, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0356] In some embodiments, the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 123, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0357] In some embodiments, the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 124, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0358] In some embodiments, the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 125, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0359] In some embodiments, the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 126, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0360] In some embodiments, the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 127, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0361] In some embodiments, the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 128, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0362] In some embodiments, the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 129, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0363] In some embodiments, the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 130, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0364] In some embodiments, the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 131, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0365] In some embodiments, the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 132, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0366] In some embodiments, the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 224, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0367] In some embodiments, the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 225, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0368] In some embodiments, the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 226, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0369] In some embodiments, the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 227, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0370] In some embodiments, the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 228, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0371] In some embodiments, the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 229, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0372] In some embodiments, the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 230, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0373] In some embodiments, the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 231, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0374] In some embodiments, the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 260, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0375] In some embodiments, the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 261, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0376] In some embodiments, the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 262, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0377] In some embodiments, the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 263, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0378] In some embodiments, the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 264, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0379] In some embodiments, the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 265, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0380] In some embodiments, the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 266, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0381] In some embodiments, the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 267, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0382] In some embodiments, the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 268, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0383] In some embodiments, the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 269, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0384] In some embodiments, the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 270, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0385] In some embodiments, the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 271, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0386] In some embodiments, the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 272, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0387] In some embodiments, the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 273, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0388] In some embodiments, the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 274, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0389] In some embodiments, the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 275, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0390] In some embodiments, the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 276, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0391] In some embodiments, the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 277, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0392] In some embodiments, the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 278, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0393] In some embodiments, the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 279, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0394] In some embodiments, the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 280, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0395] In some embodiments, the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 396, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0396] In some embodiments, the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 397, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0397] In some embodiments, the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 398, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0398] In some embodiments, the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 399, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0399] In some embodiments, the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 400, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0400] In some embodiments, the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 401, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0401] In some embodiments, the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 402, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0402] In some embodiments, the recombinant nucleic acid molecule comprises a nucleotide sequence of SEQ ID NO: 403, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
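The "at least 90% ... identical" language in the embodiments above presupposes a way of scoring identity between two sequences. The following is a minimal sketch only. It assumes the two nucleotide sequences have already been aligned to equal length with `-` marking gaps; the alignment method is not specified in this passage, and the function names and the gap convention (a column with a gap in one sequence counts as a mismatch) are illustrative assumptions rather than definitions from the disclosure.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over the columns of a pre-computed pairwise alignment.

    Assumption: columns where exactly one sequence has a gap count as
    mismatches; columns where both have a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # uninformative column, skip it
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns in the alignment")
    return 100.0 * matches / compared


def meets_identity_threshold(aligned_a, aligned_b, threshold_percent):
    """True if the aligned pair is at least `threshold_percent` identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold_percent
```

For example, two aligned 10-nucleotide sequences that differ at one position score 90% and would satisfy a 90% threshold but not a 95% threshold.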
[0403] The present disclosure also provides expression vectors and cells comprising the isolated nucleic acid molecules encoding the truncated dystrophin proteins, as described herein in Tables 3 and 4, or a portion (e.g., a 3’ portion or 5’ portion) of the isolated nucleic acid molecules. In some embodiments, the host cell is a mammalian cell, an insect cell, or a bacterial cell.
Table 3. Truncated Dystrophin Proteins - Subdomain Components and Organization
Table 4. Truncated Dystrophin Sequences
III. Systems of the Disclosure
[0404] The present disclosure also provides systems for efficiently and reliably generating a large single nucleic acid molecule that encodes a protein of interest, e.g., a truncated dystrophin protein, as described herein, whose coding sequence is too large to package into a single expression vector, e.g., an adeno-associated virus (AAV) vector.
[0405] The present disclosure also provides systems for delivery and expression of a protein of interest, e.g., a truncated dystrophin protein, as described herein. Specifically, the invention utilizes ribozyme-mediated trans-ligation of two or more nucleic acid molecules to assemble a single nucleic acid molecule encoding a protein of interest.
[0406] Ribozymes are small catalytic RNA sequences capable of nucleotide-specific self-cleavage that are widespread in nature. Ribozyme cleavage generates unique 2',3'-phosphate and 5'-hydroxyl termini, and mammalian cells have an inherent capacity to catalyze the trans-ligation of independent RNAs that have been cleaved by ribozymes. The efficient and precise nature of ribozyme cleavage, which produces precise and unique nucleotide termini, allows a trans-ligated RNA to be scarless and to maintain a protein-coding open reading frame. The ligated mRNAs can behave essentially indistinguishably from their natural full-length counterparts, in that they can be spliced using conventional introns and translated into functional proteins.
[0407] For example, provided herein are a first nucleic acid molecule comprising a first coding region encoding a first portion of the dystrophin protein (e.g., an N-terminal portion of the truncated dystrophin protein), and a second nucleic acid molecule comprising a second coding region encoding a second portion of the dystrophin protein (e.g., a C-terminal portion of the truncated dystrophin protein). Upon ribozyme-mediated ligation of the two nucleic acid molecules, a single nucleic acid molecule is assembled comprising a third (i.e., combined) coding region which encodes the truncated dystrophin protein. In some embodiments, the first nucleic acid molecule and the second nucleic acid molecule are included in separate viral vectors for delivery to target cells, e.g., muscle cells. In one embodiment, the viral vector is an adeno-associated viral vector (AAV). This dual AAV strategy allows for efficient delivery and expression of large therapeutic proteins, e.g., dystrophin, in order to correct disease pathology and treat a subject having a dystrophin-associated disease or disorder, e.g., muscular dystrophy, e.g., Duchenne muscular dystrophy (DMD).
[0408] Accordingly, in one aspect, the present disclosure provides a system for generating a truncated human dystrophin protein, comprising a first recombinant nucleic acid molecule and a second recombinant nucleic acid molecule, wherein the first nucleic acid molecule comprises a first coding region encoding an N-terminal portion of the truncated dystrophin protein and a 3’ ribozyme, where the first coding region is operably linked to the 3’ ribozyme at its 3’ end, wherein the second nucleic acid molecule comprises a second coding region encoding a C-terminal portion of the truncated dystrophin protein and a 5’ ribozyme, where the second coding region is operably linked to the 5’ ribozyme at its 5’ end, and wherein, upon ribozyme-mediated catalytic ligation, the first coding region and the second coding region form a third coding region encoding the truncated human dystrophin protein. In some embodiments, the systems of the disclosure are capable of producing a significant level of functional truncated dystrophin in transduced cells.
[0409] In some embodiments, the first coding region is operably linked to two or more 3’ ribozymes at its 3’ end. In some embodiments, the second coding region is operably linked to two or more 5’ ribozymes at its 5’ end.
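The two-molecule arrangement described above can be pictured with a toy string model. This is an illustrative sketch only: the class name, function names, and the short placeholder sequences are hypothetical, the ribozymes are represented by opaque labels rather than real sequences, and self-cleavage and ligation are reduced to string operations rather than any biochemical simulation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PrecursorRNA:
    """Toy model of one precursor molecule (all fields are placeholders)."""
    five_prime_ribozyme: str   # empty string if absent
    coding_region: str
    three_prime_ribozyme: str  # empty string if absent


def cleaved_coding_region(rna):
    """Model ribozyme self-cleavage: the ribozyme is removed, the coding region remains."""
    return rna.coding_region


def trans_ligate(first, second):
    """Join the cleaved first (N-terminal) and second (C-terminal) coding regions."""
    if not first.three_prime_ribozyme:
        raise ValueError("first molecule needs a 3' ribozyme at its 3' end")
    if not second.five_prime_ribozyme:
        raise ValueError("second molecule needs a 5' ribozyme at its 5' end")
    return cleaved_coding_region(first) + cleaved_coding_region(second)


# Hypothetical placeholder sequences, not taken from the sequence listing.
first = PrecursorRNA("", "AUGGCUCUG", "<3'-ribozyme>")
second = PrecursorRNA("<5'-ribozyme>", "GAAACCUAA", "")
third_coding_region = trans_ligate(first, second)
```

In this model the combined ("third") coding region is simply the first coding region followed by the second, so the reading frame is preserved whenever the split point falls on a codon boundary of the combined sequence.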
[0410] In some embodiments, the 3’ ribozyme in the first nucleic acid molecule is able to catalyze itself out of the nucleic acid molecule leaving a 3’P or 2’,3’ cyclic phosphate (cP) end. In some embodiments, the 5’ ribozyme in the second nucleic acid molecule is able to catalyze itself out of the nucleic acid molecule leaving a 5’ OH end. The 3’P or 2’,3’ cP end and the 5’ OH end of nucleic acid molecules that have undergone ribozyme-mediated cleavage can be ligated together. As such, the coding region of the first nucleic acid molecule, which encodes the N-terminal portion of the truncated dystrophin protein, can be ligated to the coding region of the second nucleic acid molecule, which encodes the C-terminal portion of the truncated dystrophin protein, to form a longer nucleic acid molecule encoding the functional truncated dystrophin protein. In one embodiment, the functional truncated dystrophin is able to restore the expression and function, partially or completely, of an endogenously expressed full-length dystrophin protein.
[0411] In some embodiments, the 3’ ribozyme is selected from the group consisting of Twister (Osa), Twister (Dre), Twister (Nvi), Twister (Sbi), Twister (Envl), Twister (Spu), Twister (Cpa), Twister Sister, Hammerhead (HH), Hepatitis Delta Virus (HDV), Pistol, Varkud Satellite (VS), Hatchet, Hairpin, and Hovlinc (Hov), or a variant or fragment thereof.
[0412] In some embodiments, the 3’ ribozyme comprises a sequence selected from the group consisting of SEQ ID NOs: 6 - 20.
[0413] In some embodiments, the 3’ ribozyme comprises a sequence of SEQ ID NO: 6. In some embodiments, the 3’ ribozyme comprises a sequence of SEQ ID NO: 7. In some embodiments, the 3’ ribozyme comprises a sequence of SEQ ID NO: 8. In some embodiments, the 3’ ribozyme comprises a sequence of SEQ ID NO: 9. In some embodiments, the 3’ ribozyme comprises a sequence of SEQ ID NO: 10. In some embodiments, the 3’ ribozyme comprises a sequence of SEQ ID NO: 11. In some embodiments, the 3’ ribozyme comprises a sequence of SEQ ID NO: 12. In some embodiments, the 3’ ribozyme comprises a sequence of SEQ ID NO: 13. In some embodiments, the 3’ ribozyme comprises a sequence of SEQ ID NO: 14. In some embodiments, the 3’ ribozyme comprises a sequence of SEQ ID NO: 15. In some embodiments, the 3’ ribozyme comprises a sequence of SEQ ID NO: 16. In some embodiments, the 3’ ribozyme comprises a sequence of SEQ ID NO: 17. In some embodiments, the 3’ ribozyme comprises a sequence of SEQ ID NO: 18. In some embodiments, the 3’ ribozyme comprises a sequence of SEQ ID NO: 19. In some embodiments, the 3’ ribozyme comprises a sequence of SEQ ID NO: 20.
[0414] In some embodiments, the 5’ ribozyme is selected from the group consisting of Twister (Osa), Twister (Dre), Twister (Nvi), Twister (Sbi), Twister (Envl), Twister (Spu), Twister (Cpa), Twister Sister, Hammerhead (HH), Hepatitis Delta Virus (HDV), Pistol, Varkud Satellite (VS), Hatchet, Hairpin, and Hovlinc (Hov), or a variant or fragment thereof.
[0415] In some embodiments, the 5’ ribozyme comprises a sequence selected from the group consisting of SEQ ID NOs: 6 - 20.
[0416] In some embodiments, the 5’ ribozyme comprises a sequence of SEQ ID NO: 6. In some embodiments, the 5’ ribozyme comprises a sequence of SEQ ID NO:7. In some embodiments, the 5’ ribozyme comprises a sequence of SEQ ID NO:8. In some embodiments, the 5’ ribozyme comprises a sequence of SEQ ID NO:9. In some embodiments, the 5’ ribozyme comprises a sequence of SEQ ID NO: 10. In some embodiments, the 5’ ribozyme comprises a sequence of SEQ ID NO: 11. In some embodiments, the 5’ ribozyme comprises a sequence of SEQ ID NO: 12. In some embodiments, the 5’ ribozyme comprises a sequence of SEQ ID NO: 13. In some embodiments, the 5’ ribozyme comprises a sequence of SEQ ID NO: 14. In some embodiments, the 5’ ribozyme comprises a sequence of SEQ ID NO: 15. In some embodiments, the 5’ ribozyme comprises a sequence of SEQ ID NO: 16. In some embodiments, the 5’ ribozyme comprises a sequence of SEQ ID NO: 17. In some embodiments, the 5’ ribozyme comprises a sequence of SEQ ID NO: 18. In some embodiments, the 5’ ribozyme comprises a sequence of SEQ ID NO: 19. In some embodiments, the 5’ ribozyme comprises a sequence of SEQ ID NO:20.
[0417] In some embodiments, the 3’ ribozyme comprises an HDV ribozyme. In other embodiments, the 5’ ribozyme comprises an HH ribozyme. In some embodiments, the HDV ribozyme is selected from the group consisting of HDV, HDV68, HDV67, HDV56, genHDV, and antiHDV, or a variant or fragment thereof. In some embodiments, the HH ribozyme is an RzB ribozyme.
[0418] In some embodiments, the 3’ ribozyme comprises a Twister ribozyme. In other embodiments, the 5’ ribozyme comprises an HH ribozyme. In some embodiments, the Twister ribozyme is selected from the group consisting of Twister (Osa), Twister (Dre), Twister (Nvi), Twister (Sbi), Twister (Envl), Twister (Spu), and Twister (Cpa). In some embodiments, the HH ribozyme is an RzB ribozyme.
[0419] In some embodiments, the 3’ ribozyme comprises a Twister ribozyme. In other embodiments, the 5’ ribozyme comprises a Twister ribozyme. In some embodiments, the Twister ribozyme is selected from the group consisting of Twister (Osa), Twister (Dre), Twister (Nvi), Twister (Sbi), Twister (Envl), Twister (Spu), and Twister (Cpa). In some embodiments, the Twister ribozyme is Twister (Osa).
[0420] Pre-mRNA splicing by the spliceosome has been shown to enhance mRNA translation, either through deposition of factors which promote a pioneer round of translation or through promoting RNA processing and export to the cytoplasm. The addition of a chimeric cis-splicing intron within a transgene has also been shown to promote transgene protein expression. Thus, in some embodiments, the addition of intron splice donor and intron splice acceptor sites that are recognized and cis-spliced by the spliceosome may enhance protein expression from split precursor RNA molecules. Moreover, the inclusion of an intron splice donor and an intron splice acceptor sequence allows for the use of ribozymes that are not completely scarless, since any remaining ribozyme sequences will be removed through splicing of the intron.
[0421] In some embodiments, the first nucleic acid molecule further comprises an intron splice donor sequence, and the second nucleic acid molecule further comprises an intron splice acceptor sequence.
[0422] In some embodiments, the first nucleic acid molecule comprises a first coding region encoding an N-terminal portion of the truncated dystrophin protein, an intron splice donor sequence, and a 3’ ribozyme, and the second nucleic acid molecule comprises a second coding region encoding a C-terminal portion of the truncated dystrophin protein, an intron splice acceptor sequence, and a 5’ ribozyme.
[0423] In some embodiments, the splice donor sequence is positioned between the first coding region and the 3’ ribozyme. In some embodiments, the splice donor sequence is selected from the group consisting of SEQ ID NOs: 133 - 136. In some embodiments, the splice donor sequence comprises a sequence of SEQ ID NO: 133. In some embodiments, the splice donor sequence comprises a sequence of SEQ ID NO: 134. In some embodiments, the splice donor sequence comprises a sequence of SEQ ID NO: 135. In some embodiments, the splice donor sequence comprises a sequence of SEQ ID NO: 136. In some embodiments, the splice donor sequence is positioned at least 10 nucleotides, at least 15 nucleotides, at least 20 nucleotides, at least 30 nucleotides, at least 40 nucleotides, at least 50 nucleotides, or more from the 3’ ribozyme.
[0424] In some embodiments, the splice donor sequence is positioned within a region of the truncated dystrophin protein coding for a region selected from the group consisting of the R7 domain, the R8 domain, the R9 domain, the R10 domain, the R11 domain, the R12 domain, the R13 domain, the R14 domain, the R15 domain, the R16 domain, the R17 domain, the R18 domain, the R19 domain, the H3 domain, the R20 domain, the R21 domain, and the R22 domain. In some embodiments, the splice donor sequence is positioned within a region of the truncated dystrophin protein coding for a region selected from the group consisting of the R8 domain, the R19 domain, the H3 domain, the R20 domain, and the R21 domain. In some embodiments, the splice donor sequence is not positioned within the R21 domain.
[0425] In some embodiments, the splice acceptor sequence is positioned between the 5’ ribozyme and the second coding region. In some embodiments, the splice acceptor sequence is selected from the group consisting of SEQ ID NOs: 137 - 141. In some embodiments, the splice acceptor sequence comprises a sequence of SEQ ID NO: 137. In some embodiments, the splice acceptor sequence comprises a sequence of SEQ ID NO: 138. In some embodiments, the splice acceptor sequence comprises a sequence of SEQ ID NO: 139. In some embodiments, the splice acceptor sequence comprises a sequence of SEQ ID NO: 140. In some embodiments, the splice acceptor sequence comprises a sequence of SEQ ID NO: 141. In some embodiments, the splice acceptor sequence is positioned at least 10 nucleotides, at least 15 nucleotides, at least 20 nucleotides, at least 30 nucleotides, at least 40 nucleotides, at least 50 nucleotides, or more from the 5’ ribozyme.
[0426] In some embodiments, the splice donor sequence and the splice acceptor sequence are positioned such that the resulting spliced intron is between 50 - 200 bp in length. In some embodiments, the splice donor sequence and splice acceptor sequence are positioned such that the resulting spliced intron encodes a single predominant reading frame. In some embodiments, a stop codon sequence is introduced into the splice donor sequence or the splice acceptor sequence.
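The 50 - 200 bp intron window described above can be checked with simple arithmetic. The sketch below is illustrative only. It assumes that, after ribozyme cleavage and ligation, the spliced intron consists of the nucleotides running from the splice donor to the ligation junction on the first molecule plus those running from the junction to the splice acceptor on the second molecule; that decomposition, the function names, and the parameter names are assumptions made for illustration, not definitions from the disclosure.

```python
MIN_INTRON_BP = 50   # lower bound of the window stated in the text
MAX_INTRON_BP = 200  # upper bound of the window stated in the text


def spliced_intron_length(donor_to_junction_nt, junction_to_acceptor_nt):
    """Length of the intron formed once the two molecules are ligated.

    Both arguments are hypothetical design parameters: the number of
    nucleotides retained on each side of the ligation junction.
    """
    if donor_to_junction_nt < 0 or junction_to_acceptor_nt < 0:
        raise ValueError("segment lengths must be non-negative")
    return donor_to_junction_nt + junction_to_acceptor_nt


def intron_length_ok(donor_to_junction_nt, junction_to_acceptor_nt):
    """True if the resulting intron falls inside the 50 - 200 bp window."""
    length = spliced_intron_length(donor_to_junction_nt, junction_to_acceptor_nt)
    return MIN_INTRON_BP <= length <= MAX_INTRON_BP
```

Under these assumptions, a donor-side segment of 40 nucleotides and an acceptor-side segment of 35 nucleotides give a 75-nucleotide intron, which lies inside the window.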
[0427] In some embodiments, the first nucleic acid molecule and the second nucleic acid molecule are ligated together by an endogenous ligase that exists in the native cell or tissue in which the nucleic acid assembly is taking place. In some embodiments, the systems of the present invention comprise an exogenous ligase to induce the ligation of the processed nucleic acid molecules together. In one embodiment, the ligase is RNA 2',3'-Cyclic Phosphate and 5'-OH (RtcB) ligase.
[0428] The coding region of the first nucleic acid molecule, which encodes the N-terminal portion of the truncated dystrophin protein, and the coding region of the second nucleic acid molecule, which encodes the C-terminal portion of the truncated dystrophin protein, form a longer nucleic acid molecule comprising a third coding region which encodes the truncated dystrophin protein.
[0429] In some embodiments, at least one of the first coding region and the second coding region is at least 2000 nucleotides in length. In some embodiments, at least one of the first coding region and the second coding region is at least 2100 nucleotides in length. In some embodiments, at least one of the first coding region and the second coding region is at least 2200 nucleotides in length. In some embodiments, at least one of the first coding region and the second coding region is at least 2300 nucleotides in length. In some embodiments, at least one of the first coding region and the second coding region is at least 2400 nucleotides in length. In some embodiments, at least one of the first coding region and the second coding region is at least 2500 nucleotides in length. In some embodiments, at least one of the first coding region and the second coding region is at least 2600 nucleotides in length.
[0430] In some embodiments, the first coding region and the second coding region are each at least 2000 nucleotides in length. In some embodiments, the first coding region and the second coding region are each at least 2100 nucleotides in length. In some embodiments, the first coding region and the second coding region are each at least 2200 nucleotides in length. In some embodiments, the first coding region and the second coding region are each at least 2300 nucleotides in length. In some embodiments, the first coding region and the second coding region are each at least 2400 nucleotides in length. In some embodiments, the first coding region and the second coding region are each at least 2500 nucleotides in length. In some embodiments, the first coding region and the second coding region are each at least 2600 nucleotides in length.
[0431] In some embodiments, the first coding region and the second coding region do not share a region of substantial sequence identity, i.e., the sequence identity of the first coding region and the second coding region is less than 90%, less than 85%, less than 80%, less than 75%, less than 70%, less than 65%, less than 60%, less than 55%, less than 50%, less than 45%, less than 40%, less than 35%, or less than 30%. In some embodiments, the 3’ end of the first coding region does not have a sequence identity to the 5’ end of the second coding region.
[0432] In some embodiments, the third coding region, i.e., the combination of the first coding region and the second coding region, is at least 4920 nucleotides in length. In some embodiments, the third coding region is at least 5000 nucleotides in length. In some embodiments, the third coding region is at least 5100 nucleotides in length. In some embodiments, the third coding region is at least 5200 nucleotides in length. In some embodiments, the third coding region is at least 5300 nucleotides in length.
[0433] In some embodiments, the truncated human dystrophin proteins are functional.
[0434] In some embodiments, the truncated human dystrophin protein comprises at least 1640 amino acids.
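The 4920-nucleotide and 1640-amino-acid minimums given above are consistent with one another under the standard three-nucleotides-per-codon relationship, since 4920 / 3 = 1640. The sketch below illustrates that arithmetic. Whether the stated nucleotide length counts a stop codon is not specified in this passage, so the `includes_stop_codon` flag and the function name are hypothetical.

```python
CODON_LENGTH = 3  # nucleotides per codon


def encoded_amino_acids(coding_nt, includes_stop_codon=False):
    """Number of amino acids encoded by an in-frame coding region.

    `includes_stop_codon` is an illustrative assumption: set it to True
    when the nucleotide count contains a terminal stop codon, which does
    not contribute an amino acid.
    """
    if coding_nt <= 0 or coding_nt % CODON_LENGTH != 0:
        raise ValueError("coding region must be a positive multiple of 3 nt")
    codons = coding_nt // CODON_LENGTH
    return codons - 1 if includes_stop_codon else codons
```

With the default setting, a 4920-nucleotide coding region corresponds to 1640 amino acids; if a stop codon were counted in the length, 4923 nucleotides would be needed for the same protein length.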
[0435] In some embodiments, the truncated human dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0436] In some embodiments, the truncated dystrophin protein further comprises H1 Domain (SEQ ID NO: 22). In some embodiments, the truncated dystrophin protein comprises an amino acid sequence selected from the group consisting of: a. midi-Dys ΔR1-R15 (SEQ ID NO: 83), b. midi-Dys ΔR2-R15 (SEQ ID NO: 84), c. midi-Dys ΔR3-R15 (SEQ ID NO: 85), d. midi-Dys ΔH2-R15 (SEQ ID NO: 86), e. midi-Dys ΔR4-R15 (SEQ ID NO: 87), f. midi-Dys ΔR5-R15 (SEQ ID NO: 88), g. midi-Dys Δ exon 13-33 (SEQ ID NO: 93), h. midi-Dys Δ exon 13-39 (SEQ ID NO: 94), i. midi-Dys Δ exon 13-41 (SEQ ID NO: 95), j. midi-Dys Δ exon 13-48 (SEQ ID NO: 96), k. midi-Dys Δ exon 15-39 (SEQ ID NO: 97), l. midi-Dys Δ exon 15-41 (SEQ ID NO: 98), m. midi-Dys Δ exon 15-48 (SEQ ID NO: 99), n. midi-Dys Δ exon 17-39 (SEQ ID NO: 100), o. midi-Dys Δ exon 17-41 (SEQ ID NO: 101), p. midi-Dys Δ exon 17-48 (SEQ ID NO: 102), q. midi-Dys Δ exon 18-39 (SEQ ID NO: 220), r. midi-Dys Δ exon 18-41 (SEQ ID NO: 221), s. midi-Dys Δ exon 18-48 (SEQ ID NO: 222), t. midi-Dys Δ exon 19-39 (SEQ ID NO: 103), u. midi-Dys Δ exon 19-41 (SEQ ID NO: 104), v. midi-Dys Δ exon 19-48 (SEQ ID NO: 105), w. midi-Dys Δ exon 21-41 (SEQ ID NO: 106), x. midi-Dys Δ exon 21-42 (SEQ ID NO: 223), and y. midi-Dys Δ exon 21-48 (SEQ ID NO: 107).
[0437] In some embodiments, the truncated dystrophin protein further comprises R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), and R19 Domain (SEQ ID NO: 42).
[0438] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence selected from the group consisting of: a. midi-Dys ΔR1-R15 (SEQ ID NO: 83), b. midi-Dys ΔR2-R15 (SEQ ID NO: 84), c. midi-Dys ΔR3-R15 (SEQ ID NO: 85), d. midi-Dys ΔH2-R15 (SEQ ID NO: 86), e. midi-Dys ΔR4-R15 (SEQ ID NO: 87), f. midi-Dys ΔR5-R15 (SEQ ID NO: 88), g. midi-Dys Δ exon 10-33 (SEQ ID NO: 89), h. midi-Dys Δ exon 10-39 (SEQ ID NO: 90), i. midi-Dys Δ exon 10-41 (SEQ ID NO: 91), j. midi-Dys Δ exon 11-33 (SEQ ID NO: 216), k. midi-Dys Δ exon 11-39 (SEQ ID NO: 217), l. midi-Dys Δ exon 11-41 (SEQ ID NO: 218), m. midi-Dys Δ exon 13-33 (SEQ ID NO: 93), n. midi-Dys Δ exon 13-39 (SEQ ID NO: 94), o. midi-Dys Δ exon 13-41 (SEQ ID NO: 95), p. midi-Dys Δ exon 15-39 (SEQ ID NO: 97), q. midi-Dys Δ exon 15-41 (SEQ ID NO: 98), r. midi-Dys Δ exon 17-39 (SEQ ID NO: 100), s. midi-Dys Δ exon 17-41 (SEQ ID NO: 101), t. midi-Dys Δ exon 18-39 (SEQ ID NO: 220), u. midi-Dys Δ exon 18-41 (SEQ ID NO: 221), v. midi-Dys Δ exon 19-39 (SEQ ID NO: 103), w. midi-Dys Δ exon 19-41 (SEQ ID NO: 104), x. midi-Dys Δ exon 21-41 (SEQ ID NO: 106), and y. midi-Dys Δ exon 21-42 (SEQ ID NO: 223).
[0439] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0440] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0441] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0442] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0443] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0444] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), R4 domain (SEQ ID NO: 27), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
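The domain compositions recited in paragraphs [0439] to [0444] follow a regular pattern: a fixed N-terminal pair, zero to five additional N-terminal rod or hinge domains added in order, and a fixed C-terminal run beginning at R16. The sketch below restates that pattern as a lookup. It is illustrative only: the variable and function names are hypothetical, the SEQ ID numbers are copied from the paragraphs above, and the code enumerates domain names rather than assembling any actual amino acid sequence.

```python
# SEQ ID NOs as recited in paragraphs [0439]-[0444].
SEQ_ID = {
    "ABCD": 21, "H1": 22, "R1": 23, "R2": 24, "R3": 25, "H2": 26, "R4": 27,
    "R16": 39, "R17": 40, "R18": 41, "R19": 42, "H3": 43, "R20": 44,
    "R21": 45, "R22": 46, "R23": 47, "R24": 48, "H4": 49, "CR": 50, "CT": 51,
}

FIXED_HEAD = ["ABCD", "H1"]
OPTIONAL_N_TERMINAL = ["R1", "R2", "R3", "H2", "R4"]  # added in this order
FIXED_TAIL = ["R16", "R17", "R18", "R19", "H3", "R20", "R21",
              "R22", "R23", "R24", "H4", "CR", "CT"]


def composition(n_extra):
    """Ordered domain names for the embodiment with `n_extra` optional domains."""
    if not 0 <= n_extra <= len(OPTIONAL_N_TERMINAL):
        raise ValueError("n_extra must be between 0 and 5")
    return FIXED_HEAD + OPTIONAL_N_TERMINAL[:n_extra] + FIXED_TAIL


def describe(domains):
    """Render a domain list in the 'name (SEQ ID NO: n)' style used above."""
    return ", ".join(f"{name} (SEQ ID NO: {SEQ_ID[name]})" for name in domains)
```

Here `composition(0)` corresponds to the domain list of paragraph [0439] and `composition(5)` to that of paragraph [0444].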
[0445] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), a partial H1 domain (SEQ ID NO: 404), a partial R11 domain (SEQ ID NO: 412), R12 Domain (SEQ ID NO: 35), R13 Domain (SEQ ID NO: 36), R14 domain (SEQ ID NO: 37), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0446] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), a partial H1 domain (SEQ ID NO: 404), a partial R14 Domain (SEQ ID NO: 413), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0447] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), a partial H1 domain (SEQ ID NO: 404), a partial R15 Domain (SEQ ID NO: 414), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0448] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), a partial H1 domain (SEQ ID NO: 404), a partial R19 Domain (SEQ ID NO: 415), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0449] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), a partial R1 domain (SEQ ID NO: 405), a partial R11 domain (SEQ ID NO: 412), R12 Domain (SEQ ID NO: 35), R13 Domain (SEQ ID NO: 36), R14 Domain (SEQ ID NO: 37), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0450] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), a partial R1 domain (SEQ ID NO: 405), a partial R14 Domain (SEQ ID NO: 413), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0451] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), a partial R1 domain (SEQ ID NO: 405), a partial R15 Domain (SEQ ID NO: 414), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0452] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), a partial R1 domain (SEQ ID NO: 405), a partial R19 Domain (SEQ ID NO: 415), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0453] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), a partial R2 domain (SEQ ID NO: 406), a partial R11 domain (SEQ ID NO: 412), R12 Domain (SEQ ID NO: 35), R13 Domain (SEQ ID NO: 36), R14 Domain (SEQ ID NO: 37), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0454] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), a partial R2 domain (SEQ ID NO: 406), a partial R14 Domain (SEQ ID NO: 413), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0455] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), a partial R2 domain (SEQ ID NO: 406), a partial R15 Domain (SEQ ID NO: 414), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0456] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), a partial R2 domain (SEQ ID NO: 406), a partial R19 Domain (SEQ ID NO: 415), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0457] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), a partial R3 domain (SEQ ID NO: 407), a partial R14 Domain (SEQ ID NO: 413), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0458] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), a partial R3 domain (SEQ ID NO: 407), a partial R15 Domain (SEQ ID NO: 414), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0459] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), a partial R3 domain (SEQ ID NO: 407), a partial R19 Domain (SEQ ID NO: 415), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0460] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), a partial R3 domain (SEQ ID NO: 408), a partial R14 Domain (SEQ ID NO: 413), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0461] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), a partial R3 domain (SEQ ID NO: 408), a partial R15 Domain (SEQ ID NO: 414), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0462] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), a partial R3 domain (SEQ ID NO: 408), a partial R19 Domain (SEQ ID NO: 415), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0463] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), a partial R4 domain (SEQ ID NO: 409), a partial R14 Domain (SEQ ID NO: 413), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0464] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), a partial R4 domain (SEQ ID NO: 409), a partial R15 Domain (SEQ ID NO: 414), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0465] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), a partial R4 domain (SEQ ID NO: 409), a partial R19 Domain (SEQ ID NO: 415), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0466] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), a partial R4 domain (SEQ ID NO: 410), a partial R14 Domain (SEQ ID NO: 413), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0467] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), a partial R4 domain (SEQ ID NO: 410), a partial R15 Domain (SEQ ID NO: 414), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0468] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), a partial R4 domain (SEQ ID NO: 410), a partial R19 Domain (SEQ ID NO: 415), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0469] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), R4 domain (SEQ ID NO: 27), a partial R5 domain (SEQ ID NO: 411), a partial R15 Domain (SEQ ID NO: 414), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0470] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), R4 domain (SEQ ID NO: 27), a partial R5 domain (SEQ ID NO: 411), a partial R16 Domain (SEQ ID NO: 416), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0471] In some embodiments, the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), R4 domain (SEQ ID NO: 27), a partial R5 domain (SEQ ID NO: 411), a partial R19 Domain (SEQ ID NO: 415), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
[0472] In some embodiments, the truncated human dystrophin protein comprises an amino acid sequence selected from the group consisting of SEQ ID NOs: 83-107 and 216-223, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
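By way of non-limiting illustration of the "at least 90% ... identical" language used in these embodiments, the following Python sketch computes percent identity between two sequences that are assumed to have already been aligned to equal length. The function names, the treatment of gap characters ("-") as mismatches, the choice of alignment length as the denominator, and the toy sequences are all assumptions made for this example; they are not taken from the sequence listing and do not replace any definition of sequence identity or alignment method given elsewhere in the specification.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over two pre-aligned, equal-length sequences.

    Assumption for this sketch: identity is the number of matching,
    non-gap positions divided by the alignment length, times 100.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_a:
        raise ValueError("sequences must be non-empty")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b)
        if x == y and x != "-"  # a gap never counts as a match
    )
    return 100.0 * matches / len(aligned_a)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 90.0) -> bool:
    """True if the aligned pair is at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy ten-residue example (hypothetical sequences): one mismatch in ten.
reference = "MLWWEEVEDC"
candidate = "MLWWEEVEDA"
print(percent_identity(reference, candidate))
print(meets_threshold(reference, candidate, 90.0))
```

In practice the alignment itself would be produced by an established alignment tool; the sketch only shows how a threshold such as 90% could be checked once an alignment is available.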
[0473] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 83, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0474] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 84, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0475] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 85, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0476] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 86, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0477] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 87, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0478] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 88, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0479] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 89, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0480] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 90, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0481] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 91, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0482] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 92, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0483] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 93, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0484] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 94, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0485] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 95, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0486] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 96, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0487] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 97, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0488] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 98, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0489] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 99, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0490] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 100, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0491] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 101, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0492] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 102, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0493] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 103, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0494] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 104, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0495] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 105, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0496] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 106, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0497] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 107, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0498] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 216, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0499] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 217, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0500] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 218, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0501] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 219, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0502] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 220, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0503] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 221, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0504] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 222, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0505] In some embodiments, the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 223, or an amino acid sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0506] In some embodiments, the truncated human dystrophin protein does not comprise 2361 amino acids. In some embodiments, the truncated human dystrophin protein is greater than 2361 amino acids in length. In some embodiments, the truncated human dystrophin protein is less than 2361 amino acids in length. In some embodiments, the truncated human dystrophin protein is not identical to the sequence of SEQ ID NO: 143.
[0507] In some embodiments, the third coding region (created by trans-ligation of the first and second coding regions) comprises a nucleotide sequence selected from the group consisting of SEQ ID NOS: 108-132, 224-231, 260-280, and 396-403, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0508] In some embodiments, the third coding region comprises a nucleotide sequence of SEQ ID NO: 108, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto. In some embodiments, the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO: 108, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto, and the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO: 108, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
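By way of non-limiting illustration of the relationship between the first, second, and third coding regions, the following Python sketch divides a full-length coding sequence into a 5’ portion and a 3’ portion and shows that joining the two portions restores the full-length sequence. The function names, the split position, and the toy sequence are hypothetical and are not drawn from SEQ ID NO: 108 or any other sequence in the listing. String concatenation here stands in only for the end result of joining the two portions; it does not model the trans-ligation mechanism itself.

```python
from typing import Tuple


def split_coding_region(full_cds: str, split_index: int) -> Tuple[str, str]:
    """Divide a coding sequence into a 5' portion and a 3' portion.

    `split_index` is a hypothetical split position chosen for this
    example; both portions must be non-empty.
    """
    if not 0 < split_index < len(full_cds):
        raise ValueError("split_index must fall strictly inside the sequence")
    return full_cds[:split_index], full_cds[split_index:]


def rejoin(first_portion: str, second_portion: str) -> str:
    """Join the 5' and 3' portions into one contiguous coding sequence."""
    return first_portion + second_portion


# Toy 12-nucleotide sequence (hypothetical), split after the second codon.
third_coding_region = "ATGGCCTTTAAA"
first_region, second_region = split_coding_region(third_coding_region, 6)
print(first_region, second_region)
print(rejoin(first_region, second_region) == third_coding_region)
```

Splitting on a codon boundary, as in the toy example, is a choice made for readability in this sketch and is not stated as a requirement of the embodiments.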
[0509] In some embodiments, the third coding region comprises a nucleotide sequence of SEQ ID NO: 109, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto. In some embodiments, the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO: 109, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto, and the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO: 109, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0510] In some embodiments, the third coding region comprises a nucleotide sequence of SEQ ID NO: 110, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto. In some embodiments, the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:110, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto, and the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:110, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0511] In some embodiments, the third coding region comprises a nucleotide sequence of SEQ ID NO: 111, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto. In some embodiments, the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO: 111, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto, and the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:111, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0512] In some embodiments, the third coding region comprises a nucleotide sequence of SEQ ID NO: 112, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto. In some embodiments, the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO: 112, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto, and the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO: 112, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0513] In some embodiments, the third coding region comprises a nucleotide sequence of SEQ ID NO: 113, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto. In some embodiments, the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:113, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto, and the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:113, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0514] In some embodiments, the third coding region comprises a nucleotide sequence of SEQ ID NO: 114, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto. In some embodiments, the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO: 114, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto, and the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO: 114, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0515] In some embodiments, the third coding region comprises a nucleotide sequence of SEQ ID NO: 115, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto. In some embodiments, the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:115, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto, and the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:115, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0516] In some embodiments, the third coding region comprises a nucleotide sequence of SEQ ID NO: 116, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto. In some embodiments, the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:116, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto, and the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:116, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0517] In some embodiments, the third coding region comprises a nucleotide sequence of SEQ ID NO: 117, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto. In some embodiments, the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:117, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto, and the second coding region comprises a second portion of the nucleotide sequence of SEQ ID NO: 117, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0518] In some embodiments, the third coding region comprises a nucleotide sequence of SEQ ID NO: 118, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto. In some embodiments, the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO: 118, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto, and the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO: 118, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0519] In some embodiments, the third coding region comprises a nucleotide sequence of SEQ ID NO: 119, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto. In some embodiments, the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:119, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto, and the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:119, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0520] In some embodiments, the third coding region comprises a nucleotide sequence of SEQ ID NO: 120, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto. In some embodiments, the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:120, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto, and the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO: 120, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0521] In some embodiments, the third coding region comprises a nucleotide sequence of SEQ ID NO: 121, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto. In some embodiments, the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:121, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto, and the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:121, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0522] In some embodiments, the third coding region comprises a nucleotide sequence of SEQ ID NO: 122, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto. In some embodiments, the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO: 122, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto, and the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:122, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0523] In some embodiments, the third coding region comprises a nucleotide sequence of SEQ ID NO: 123, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto. In some embodiments, the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO: 123, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto, and the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO: 123, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0524] In some embodiments, the third coding region comprises a nucleotide sequence of SEQ ID NO: 124, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto. In some embodiments, the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO: 124, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto, and the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO: 124, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0525] In some embodiments, the third coding region comprises a nucleotide sequence of SEQ ID NO: 125, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto. In some embodiments, the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO: 125, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto, and the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO: 125, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0526] In some embodiments, the third coding region comprises a nucleotide sequence of SEQ ID NO: 126, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto. In some embodiments, the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO: 126, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto, and the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO: 126, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0527] In some embodiments, the third coding region comprises a nucleotide sequence of SEQ ID NO: 127, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto. In some embodiments, the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO: 127, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto, and the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO: 127, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0528] In some embodiments, the third coding region comprises a nucleotide sequence of SEQ ID NO: 128, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto. In some embodiments, the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:128, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto, and the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO: 128, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0529] In some embodiments, the third coding region comprises a nucleotide sequence of SEQ ID NO: 129, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto. In some embodiments, the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO: 129, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto, and the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO: 129, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0530] In some embodiments, the third coding region comprises a nucleotide sequence of SEQ ID NO: 130, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto. In some embodiments, the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO: 130, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto, and the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO: 130, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0531] In some embodiments, the third coding region comprises a nucleotide sequence of SEQ ID NO: 131, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto. In some embodiments, the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO: 131, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto, and the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO: 131, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0532] In some embodiments, the third coding region comprises a nucleotide sequence of SEQ ID NO: 132, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto. In some embodiments, the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO: 132, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto, and the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO: 132, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0533] In some embodiments, the third coding region comprises a nucleotide sequence of SEQ ID NO: 224, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto. In some embodiments, the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:224, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto, and the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:224, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0534] In some embodiments, the third coding region comprises a nucleotide sequence of SEQ ID NO: 225, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto. In some embodiments, the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:225, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto, and the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:225, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0535] In some embodiments, the third coding region comprises a nucleotide sequence of SEQ ID NO: 226, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto. In some embodiments, the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO: 226, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto, and the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO: 226, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0536] In some embodiments, the third coding region comprises a nucleotide sequence of SEQ ID NO: 227, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto. In some embodiments, the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:227, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto, and the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:227, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0537] In some embodiments, the third coding region comprises a nucleotide sequence of SEQ ID NO: 228, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto. In some embodiments, the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO: 228, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto, and the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO: 228, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0538] In some embodiments, the third coding region comprises a nucleotide sequence of SEQ ID NO: 229, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto. In some embodiments, the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:229, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto, and the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:229, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0539] In some embodiments, the third coding region comprises a nucleotide sequence of SEQ ID NO: 230, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto. In some embodiments, the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:230, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto, and the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:230, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0540] In some embodiments, the third coding region comprises a nucleotide sequence of SEQ ID NO: 231, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto. In some embodiments, the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:231, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto, and the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:231, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0541] In some embodiments, the third coding region comprises a nucleotide sequence of SEQ ID NO: 260, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto. In some embodiments, the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:260, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto, and the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:260, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0542] In some embodiments, the third coding region comprises a nucleotide sequence of SEQ ID NO: 261, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto. In some embodiments, the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:261, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto, and the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:261, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0543] In some embodiments, the third coding region comprises a nucleotide sequence of SEQ ID NO: 262, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto. In some embodiments, the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:262, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto, and the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:262, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0544] In some embodiments, the third coding region comprises a nucleotide sequence of SEQ ID NO: 263, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto. In some embodiments, the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:263, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto, and the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:263, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0545] In some embodiments, the third coding region comprises a nucleotide sequence of SEQ ID NO: 264, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto. In some embodiments, the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:264, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto, and the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:264, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0546] In some embodiments, the third coding region comprises a nucleotide sequence of SEQ ID NO: 265, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto. In some embodiments, the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO: 265, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto, and the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO: 265, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0547] In some embodiments, the third coding region comprises a nucleotide sequence of SEQ ID NO: 266, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto. In some embodiments, the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:266, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto, and the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:266, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0548] In some embodiments, the third coding region comprises a nucleotide sequence of SEQ ID NO: 267, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto. In some embodiments, the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:267, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto, and the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:267, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0549] In some embodiments, the third coding region comprises a nucleotide sequence of SEQ ID NO: 268, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto. In some embodiments, the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO: 268, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto, and the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO: 268, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0550] In some embodiments, the third coding region comprises a nucleotide sequence of SEQ ID NO: 269, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto. In some embodiments, the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:269, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto, and the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:269, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0551] In some embodiments, the third coding region comprises a nucleotide sequence of SEQ ID NO: 270, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto. In some embodiments, the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:270, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto, and the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:270, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0552] In some embodiments, the third coding region comprises a nucleotide sequence of SEQ ID NO: 271, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto. In some embodiments, the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO: 271, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto, and the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO: 271, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0553] In some embodiments, the third coding region comprises a nucleotide sequence of SEQ ID NO: 272, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto. In some embodiments, the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO: 272, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto, and the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO: 272, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0554] In some embodiments, the third coding region comprises a nucleotide sequence of SEQ ID NO: 273, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto. In some embodiments, the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:273, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto, and the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:273, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0555] In some embodiments, the third coding region comprises a nucleotide sequence of SEQ ID NO: 274, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto. In some embodiments, the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:274, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto, and the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:274, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0556] In some embodiments, the third coding region comprises a nucleotide sequence of SEQ ID NO: 275, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto. In some embodiments, the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:275, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto, and the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:275, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0557] In some embodiments, the third coding region comprises a nucleotide sequence of SEQ ID NO: 276, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto. In some embodiments, the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:276, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto, and the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:276, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0558] In some embodiments, the third coding region comprises a nucleotide sequence of SEQ ID NO: 277, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto. In some embodiments, the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO: 277, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto, and the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO: 277, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0559] In some embodiments, the third coding region comprises a nucleotide sequence of SEQ ID NO: 278, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto. In some embodiments, the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:278, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto, and the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:278, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0560] In some embodiments, the third coding region comprises a nucleotide sequence of SEQ ID NO: 279, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto. In some embodiments, the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO: 279, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto, and the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO: 279, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0561] In some embodiments, the third coding region comprises a nucleotide sequence of SEQ ID NO: 280, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto. In some embodiments, the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:280, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto, and the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:280, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0562] In some embodiments, the third coding region comprises a nucleotide sequence of SEQ ID NO: 396, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto. In some embodiments, the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:396, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto, and the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:396, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0563] In some embodiments, the third coding region comprises a nucleotide sequence of SEQ ID NO: 397, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto. In some embodiments, the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:397, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto, and the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:397, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0564] In some embodiments, the third coding region comprises a nucleotide sequence of SEQ ID NO: 398, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto. In some embodiments, the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:398, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto, and the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:398, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0565] In some embodiments, the third coding region comprises a nucleotide sequence of SEQ ID NO: 399, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto. In some embodiments, the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:399, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto, and the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:399, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0566] In some embodiments, the third coding region comprises a nucleotide sequence of SEQ ID NO: 400, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto. In some embodiments, the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:400, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto, and the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:400, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0567] In some embodiments, the third coding region comprises a nucleotide sequence of SEQ ID NO: 401, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto. In some embodiments, the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:401, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto, and the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:401, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0568] In some embodiments, the third coding region comprises a nucleotide sequence of SEQ ID NO: 402, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto. In some embodiments, the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO:402, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto, and the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO:402, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
[0569] In some embodiments, the third coding region comprises a nucleotide sequence of SEQ ID NO: 403, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto. In some embodiments, the first coding region comprises a first, e.g., 5’, portion of the nucleotide sequence of SEQ ID NO: 403, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto, and the second coding region comprises a second, e.g., 3’, portion of the nucleotide sequence of SEQ ID NO: 403, or a nucleotide sequence at least 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical thereto.
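By way of illustration only, the recited identity thresholds can be checked with a short Python sketch. The sketch assumes the two sequences have already been aligned to equal length by a separate alignment tool, and it assumes one common definition of percent identity (identical, non-gap positions divided by alignment length); the disclosure does not prescribe this definition here. The names `percent_identity` and `thresholds_met` are hypothetical.

```python
# Illustrative sketch: percent identity over a PRE-ALIGNED pair of sequences.
# Assumption: identity = identical non-gap columns / alignment length.
THRESHOLDS = (90, 91, 92, 93, 94, 95, 96, 97, 98, 99)


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Return percent identity of two equal-length aligned sequences ('-' = gap)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_a:
        raise ValueError("sequences must be non-empty")
    # Count columns where both sequences carry the same residue (gaps never match).
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


def thresholds_met(identity: float) -> list:
    """List every recited threshold ("at least 90%" ... "at least 99%") satisfied."""
    return [t for t in THRESHOLDS if identity >= t]
```

For example, a ten-nucleotide toy sequence differing from a reference at one position is 90% identical and satisfies only the "at least 90%" threshold. Other conventions, such as excluding gapped columns from the denominator, would give different values.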
[0570] Provided herein, in Table 5, are exemplary first, second, and third coding regions of different truncated dystrophin proteins.
Table 5. Exemplary First, Second and Third Coding Regions
[0571] The first isolated nucleic acid molecule and the second isolated nucleic acid molecule of the system can also be introduced into a vector. In some embodiments, the first isolated nucleic acid molecule and the second isolated nucleic acid molecule are encoded in separate vectors.
[0572] The isolated nucleic acid molecules of the invention can be cloned into a number of types of vectors. For example, the nucleic acid molecules can be cloned into a vector including, but not limited to, a plasmid, a phagemid, a phage derivative, an animal virus, and a cosmid. Vectors of particular interest include expression vectors, replication vectors, probe generation vectors, and sequencing vectors.
[0573] Further, the vector may be provided to a cell in the form of a viral vector. Viral vector technology is well known in the art and is described, for example, in Sambrook et al. (2012, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, New York), and in other virology and molecular biology manuals. Viruses that are useful as vectors include, but are not limited to, retroviruses, adenoviruses, adeno-associated viruses, herpes viruses, and lentiviruses.
[0574] In one embodiment, the isolated nucleic acid molecule is introduced into a vector derived from an adeno-associated virus (AAV) particle. AAV belongs to the genus Dependovirus of the Parvoviridae family and, as used herein, includes any serotype of the over 100 serotypes of AAV known. In general, serotypes of AAV have genomic sequences with a significant homology at the level of amino acids and nucleic acids, provide an identical series of genetic functions, produce virions that are essentially equivalent in physical and functional terms, and replicate and assemble through practically identical mechanisms. Peptide insertions into any of these serotypes may also enhance tissue-specific tropism, and the resulting capsids may therefore also be used to introduce isolated nucleic acid molecules (for examples, see: Yu CY, Yuan Z, Cao Z, et al. A muscle-targeting peptide displayed on AAV2 improves muscle tropism on systemic delivery. Gene Ther 16, 953-962 (2009). https://doi.org/10.1038/gt.2009.59; Weinmann J, Weis S, Sippel J, Tulalamba W, Remes A, El Andari J, Herrmann AK, et al. Identification of a myotropic AAV by massively parallel in vivo evaluation of barcoded capsid variants. Nat Commun. 2020 Oct 28;11(1):5432. doi: 10.1038/s41467-020-19230-w; El Andari J, et al. Semirational bioengineering of AAV vectors with increased potency and specificity for systemic gene therapy of muscle disorders. Sci. Adv. 8, eabn4704 (2022). DOI: 10.1126/sciadv.abn4704; Tabebordbar M, Lagerborg KA, Stanton A, King EM, et al. Directed evolution of a family of AAV capsid variants enabling potent muscle-directed gene delivery across species. Cell. 2021 Sep 16;184(19):4919-4938.e22. doi: 10.1016/j.cell.2021.08.028).
[0575] The AAV genome is approximately 4.7 kilobases long and is composed of single-stranded deoxyribonucleic acid (ssDNA), which may be either positive- or negative-sensed. The genome comprises two open reading frames (ORFs) encoding the proteins responsible for replication (Rep) and the structural protein of the capsid (Cap). The open reading frames are flanked by two inverted terminal repeats (ITRs), which serve as the origin of replication of the viral genome. The rep frame is made of four overlapping genes encoding Rep proteins (Rep78, Rep68, Rep52, Rep40). The cap frame contains overlapping nucleotide sequences of three capsid proteins: VP1, VP2 and VP3. The Rep proteins are important for replication and packaging, while the capsid proteins are assembled to create the protein shell of the AAV, or AAV capsid. See Carter B, Adeno-associated virus and adeno-associated virus vectors for gene delivery, Lassie D, et al., Eds., "Gene Therapy: Therapeutic Mechanisms and Strategies" (Marcel Dekker, Inc., New York, NY, US, 2000) and Gao G, et al., J. Virol. 2004; 78(12):6381-6388.
[0576] AAVs have been explored as vectors for delivery of gene therapeutics because of several unique features. Non-limiting examples of the features include (i) the ability to infect both dividing and non-dividing cells; (ii) a broad host range for infectivity, including human cells; (iii) wild-type AAV has not been associated with any disease and has not been shown to replicate in infected cells; (iv) the lack of cell-mediated immune response against the vector; and (v) the non-integrative nature in a host chromosome, thereby reducing potential for long-term genetic alterations. Moreover, infection with AAV vectors has minimal influence on changing the pattern of cellular gene expression (Stilwell and Samulski et al., Biotechniques, 2003, 34, 148, the contents of which are herein incorporated by reference in their entirety).
[0577] Typically, AAV vectors for protein delivery may be recombinant viral vectors which are replication defective as they lack sequences encoding functional Rep and Cap proteins within the viral genome. In some cases, the defective AAV vectors may lack most or all coding sequences and essentially only contain one or two AAV ITR sequences and a payload sequence.
[0578] AAV vectors may be modified to enhance the efficiency of delivery. Such modified AAV vectors of the present disclosure can be packaged efficiently and can be used to successfully infect the target cells at high frequency and with minimal toxicity.
[0579] The term "AAV vector" means a vector derived from an adeno-associated virus serotype, including without limitation, serotype 1 (AAV1), serotype 2 (AAV2), serotype 3 (AAV3), serotype 4 (AAV4), serotype 5 (AAV5), serotype 6 (AAV6), serotype 7 (AAV7), serotype 8 (AAV8), serotype 9 (AAV9), serotype 10 (AAV10), serotype 11 (AAV11), serotype 12 (AAV12), serotype 13 (AAV13), AAVrh74, AAV-rh10, AAV-DJ, AAV-LK03, AAV-MYO, AAV-MYO2, AAV-MYO3, MYO3A-AAV, MYO4A-AAV, or MYO4E-AAV. Expression of a particular gene contained within an AAV vector can be specifically targeted to one or more types of cells by choosing the appropriate combination of AAV serotype, promoter, and delivery method.
Dual-Vector System
[0580] The present disclosure provides a dual-vector system, where a transgene, e.g., one encoding a truncated dystrophin protein, is split between two separate vectors, e.g., AAV vectors. Co-infection of a cell with these two AAV vectors results in the transcription of an assembled RNA that could not be encoded by a single AAV vector because of the packaging limits of AAV.
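The packaging constraint that motivates the dual-vector system can be illustrated with a minimal Python sketch. It is offered only as an illustration under stated assumptions: the 4,700-nucleotide limit is taken from the approximate AAV genome size given above, and the function names and the per-vector overhead values (ITRs, promoter, ribozyme, polyA and the like) are hypothetical and are not taken from the present disclosure.

```python
# Assumption: packaging limit set to the ~4.7 kb AAV genome size cited in the text.
PACKAGING_LIMIT_NT = 4700


def needs_dual_vector(cds_len_nt, overhead_nt):
    """Return True if the coding sequence plus non-coding elements exceeds one vector."""
    return cds_len_nt + overhead_nt > PACKAGING_LIMIT_NT


def split_window(cds_len_nt, overhead_first_nt, overhead_second_nt):
    """Return (lowest, highest) split position such that both halves fit, or None.

    A split at position p places nucleotides [0, p) in the first vector and
    [p, cds_len_nt) in the second vector. Overheads are hypothetical inputs.
    """
    lowest = max(1, cds_len_nt - PACKAGING_LIMIT_NT + overhead_second_nt)
    highest = min(cds_len_nt - 1, PACKAGING_LIMIT_NT - overhead_first_nt)
    if lowest > highest:
        return None  # no single split point lets both halves fit
    return (lowest, highest)
```

For example, with a hypothetical 6,000-nucleotide coding sequence and 1,500 nucleotides of non-coding elements per vector, `needs_dual_vector(6000, 1500)` is true and `split_window(6000, 1500, 1500)` gives the range of split positions at which each half still fits in its own vector.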
[0581] In one aspect, the present disclosure provides a vector system for expressing a truncated human dystrophin protein, comprising a first AAV vector and a second AAV vector, wherein the first AAV vector comprises a first nucleic acid molecule comprising a first coding region encoding an N-terminal portion of the truncated dystrophin protein and a 3’ ribozyme, and the second AAV vector comprises a second nucleic acid molecule comprising a second coding region encoding a C-terminal portion of the truncated dystrophin protein and a 5’ ribozyme. Upon ribozyme-mediated catalytic ligation, the first coding region and the second coding region form a third coding region encoding the truncated human dystrophin protein. Upon delivery into a target cell, the truncated dystrophin protein is able to restore the expression and function of an endogenously expressed full-length dystrophin protein. It will be understood that, in various embodiments, the first coding region and second coding region comprised in the first AAV vector and second AAV vector may be any of the first coding regions and second coding regions described herein.
[0582] In some embodiments, the first nucleic acid molecule within the first AAV vector further comprises an intron splice donor sequence, and the second nucleic acid molecule within the second AAV vector further comprises an intron splice acceptor sequence.
[0583] In some embodiments, the splice donor sequence is positioned between the first coding region and the 3’ ribozyme. In some embodiments, the splice acceptor sequence is positioned between the 5’ ribozyme and the second coding region.
[0584] In certain embodiments, the vectors also include conventional control elements which are operably linked to the transgene in a manner which permits its transcription, translation and/or expression in a cell transfected with the plasmid vector or infected with the virus produced by the invention. As used herein, “operably linked” sequences include both expression control sequences that are contiguous with the gene of interest (e.g., a truncated dystrophin protein or portion thereof) and expression control sequences that act in trans or at a distance to control the gene of interest.
[0585] Expression control sequences include appropriate transcription initiation, termination, promoter and enhancer sequences; efficient RNA processing signals such as splicing and polyadenylation (polyA) signals; sequences that stabilize cytoplasmic mRNA; sequences that enhance translation efficiency (i.e., Kozak consensus sequence); sequences that enhance protein stability; and when desired, sequences that enhance secretion of the encoded product. A great number of expression control sequences, including promoters which are native, constitutive, inducible and/or tissue-specific, are known in the art and may be utilized.
[0586] In some embodiments, the first AAV vector comprises a first promoter operably linked to the first nucleic acid molecule comprising a first coding region encoding an N-terminal portion of the truncated dystrophin protein. In some embodiments, the second AAV vector comprises a second promoter operably linked to the second nucleic acid molecule comprising a second coding region encoding a C-terminal portion of the truncated dystrophin protein. In some embodiments the first and second promoters are identical. In some embodiments the first and second promoters are different.
[0587] Promoters may be naturally occurring or non-naturally occurring. Non-limiting examples of promoters include viral promoters, plant promoters and mammalian promoters. In some embodiments, the promoters may be human promoters. In some embodiments, the first promoter and/or the second promoter is a ubiquitous promoter or a tissue specific promoter.
[0588] In some embodiments, the promoter is a ubiquitous promoter that results in expression in one or more, e.g., multiple, cells and/or tissues. In some embodiments, a promoter which drives or promotes expression in most mammalian tissues includes, but is not limited to, human elongation factor 1α-subunit (EF1α), cytomegalovirus (CMV) immediate-early enhancer and/or promoter, chicken β-actin (CBA) and its derivative CAG, β-glucuronidase (GUSB), and ubiquitin C (UBC).
[0589] In some embodiments, the promoter is a tissue specific promoter, e.g., a muscle specific promoter, e.g., an actin promoter, a myosin promoter, and a creatine kinase promoter. In some embodiments, the promoter is a CK8 promoter, an MHCK7 promoter, an SPC5-12 promoter, a MCK promoter, a desmin promoter, or a Calpain3 promoter.
[0590] In some embodiments, the promoter comprises a nucleotide sequence selected from the group consisting of SEQ ID NOs: 144-150, or a nucleotide sequence at least 95% identical thereto.
[0591] In some embodiments, the promoter comprises a nucleotide sequence of SEQ ID NO: 144, or a nucleotide sequence at least 95% identical thereto. In some embodiments, the promoter comprises a nucleotide sequence of SEQ ID NO: 145, or a nucleotide sequence at least 95% identical thereto. In some embodiments, the promoter comprises a nucleotide sequence of SEQ ID NO: 146, or a nucleotide sequence at least 95% identical thereto. In some embodiments, the promoter comprises a nucleotide sequence of SEQ ID NO: 147, or a nucleotide sequence at least 95% identical thereto. In some embodiments, the promoter comprises a nucleotide sequence of SEQ ID NO: 148, or a nucleotide sequence at least 95% identical thereto. In some embodiments, the promoter comprises a nucleotide sequence of SEQ ID NO: 149, or a nucleotide sequence at least 95% identical thereto. In some embodiments, the promoter comprises a nucleotide sequence of SEQ ID NO: 150, or a nucleotide sequence at least 95% identical thereto.
[0592] In some embodiments, the first AAV vector and/or the second AAV vector further comprise an enhancer. Enhancer sequences found on a vector regulate the expression of the gene contained therein. Typically, enhancers are bound by protein factors to enhance the transcription of a gene. An enhancer may be located upstream or downstream of the gene it regulates. Enhancers may also be tissue-specific to enhance transcription in a specific cell or tissue type. In one embodiment, the vector of the present invention comprises one or more enhancers to boost transcription of the gene present within the vector.
[0593] In some embodiments, the first AAV vector and/or the second AAV vector further comprise an intron or a fragment or derivative thereof. In some embodiments, the intron may enhance expression of a truncated dystrophin protein or portion thereof, as described herein.
[0594] In some embodiments, the first AAV vector and/or the second AAV vector may comprise a human beta-globin intron or a fragment or variant thereof. In some embodiments, the intron comprises one or more human beta-globin sequences (e.g., including fragments/variants thereof). In some embodiments, the first AAV vector and/or the second AAV vector may comprise an SV40 intron or others known in the art.
[0595] In some embodiments, the intron region comprises a nucleotide sequence of SEQ ID NO: 156, or a nucleotide sequence at least 95% identical thereto. In some embodiments, the intron region comprises a nucleotide sequence of SEQ ID NO: 157, or a nucleotide sequence at least 95% identical thereto.
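The recurring phrase "at least 95% identical thereto" can be illustrated with a minimal Python sketch. The sketch assumes that the two sequences have already been aligned to equal length (with "-" marking gaps) and that identity is counted as matching positions divided by alignment length. That convention, and the function names, are assumptions made for illustration only; the definition of percent identity that governs the present disclosure, and the alignment tool used to obtain the alignment, may differ.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns holding the same non-gap residue.

    Assumption: both inputs are pre-aligned strings of equal length,
    with '-' used as the gap character.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    a, b = aligned_a.upper(), aligned_b.upper()
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return 100.0 * matches / len(a)


def meets_identity_threshold(aligned_a, aligned_b, threshold=95.0):
    """Return True if the aligned sequences are at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

Under this convention, a 20-nucleotide alignment with a single mismatch is 95% identical and therefore meets a 95% threshold, whereas a 10-nucleotide alignment with a single mismatch is 90% identical and meets a 90% threshold but not a 95% threshold.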
[0596] In some embodiments, the first AAV vector and/or the second AAV vector further comprise an inverted terminal repeat (ITR) sequence. The ITR sequence is positioned either 5’ or 3’ relative to the transgene, e.g., the truncated dystrophin protein or portion thereof. In some embodiments, the first AAV vector and/or the second AAV vector have two ITRs. These two ITRs flank the payload region, e.g., the truncated dystrophin protein or portion thereof, at the 5’ and 3’ ends. In some embodiments, the ITR functions as an origin of replication comprising a recognition site for replication. In some embodiments, the ITR comprises a sequence region which can be complementary and symmetrically arranged. In some embodiments, the ITR incorporated into a viral vector described herein may be comprised of a naturally occurring polynucleotide sequence or a recombinantly derived polynucleotide sequence.
[0597] In a non-limiting example, the AAV vector comprises two ITRs. In some embodiments, the ITRs are of the same serotype as one another. In another embodiment, the ITRs are of different serotypes. Non-limiting examples include zero, one or both of the ITRs having the same serotype as the capsid. In some embodiments both ITRs of the AAV vectors are AAV2 ITRs.
[0598] Independently, each ITR may be about 100 to about 150 nucleotides in length. In some embodiments, the ITR is 100-180 nucleotides in length, e.g., about 100-115, about 100-120, about 100-130, about 100-140, about 100-150, about 100-160, about 100-170, about 100-180, about 110-120, about 110-130, about 110-140, about 110-150, about 110-160, about 110-170, about 110-180, about 120-130, about 120-140, about 120-150, about 120-160, about 120-170, about 120-180, about 130-140, about 130-150, about 130-160, about 130-170, about 130-180, about 140-150, about 140-160, about 140-170, about 140-180, about 150-160, about 150-170, about 150-180, about 160-170, about 160-180, or about 170-180 nucleotides in length. Non-limiting examples of ITR length are 120, 130, 140, 141, 142, and 145 nucleotides.
[0599] In some embodiments, the ITR sequence comprises a nucleotide sequence of SEQ ID NO: 202, or a nucleotide sequence at least 95% identical thereto. In some embodiments, the ITR sequence comprises a nucleotide sequence of SEQ ID NO: 203, or a nucleotide sequence at least 95% identical thereto.
[0600] In some embodiments, the first AAV vector and/or the second AAV vector further comprise a polyadenylation (polyA) sequence.
[0601] In some embodiments, the polyA sequence has a length of about 40-600 nucleotides, e.g., about 40-300 nucleotides, about 40-250 nucleotides, about 100-400 nucleotides, about 100-300 nucleotides, about 100-200 nucleotides, about 200-600 nucleotides, about 200-500 nucleotides, about 200-400 nucleotides, about 200-300 nucleotides, about 300-600 nucleotides, about 300-500 nucleotides, about 300-400 nucleotides, about 400-600 nucleotides, about 400-500 nucleotides, or about 500-600 nucleotides.
[0602] In some embodiments, the polyadenylation sequence is a bovine growth hormone (bGH) polyA sequence. In some embodiments, the polyadenylation sequence comprises a nucleotide sequence of SEQ ID NO: 151, or a nucleotide sequence at least 95% identical thereto.
[0603] In some embodiments, the polyadenylation sequence is a synthetic bovine growth hormone (bGH) polyA sequence. In some embodiments, the polyadenylation sequence comprises a nucleotide sequence of SEQ ID NO: 152, or a nucleotide sequence at least 95% identical thereto.
[0604] In some embodiments, the first AAV vector and/or the second AAV vector further comprise an untranslated region (UTR). Generally, the 5’ UTR starts at the transcription start site and ends at the start codon and the 3’ UTR starts immediately following the stop codon and continues until the termination signal for transcription. Features typically found in abundantly expressed genes of specific target organs may be engineered into UTRs to enhance the stability and protein production.
[0605] Any UTR from any gene known in the art may be incorporated into the AAV vectors. These UTRs, or portions thereof, may be placed in the same orientation as in the gene from which they were selected or they may be altered in orientation or location. In some embodiments, the UTR used in the AAV vector may be inverted, shortened, lengthened, or made with one or more other 5' UTRs or 3' UTRs known in the art. As used herein, the term “altered,” as it relates to a UTR, means that the UTR has been changed in some way in relation to a reference sequence. For example, a 3' or 5' UTR may be altered relative to a wild type or native UTR by the change in orientation or location as taught above or may be altered by the inclusion of additional nucleotides, deletion of nucleotides, swapping or transposition of nucleotides.
[0606] In some embodiments, the first AAV vector and/or the second AAV vector further comprise a Kozak sequence. Kozak sequences, which are commonly known to be involved in the process by which the ribosome initiates translation of many genes, are usually included in 5’ UTRs.
[0607] In some embodiments, the first AAV vector and/or the second AAV vector further comprise a post transcriptional regulatory element, e.g., a Woodchuck Hepatitis Virus Posttranscriptional Regulatory Element (WPRE), e.g., in the 3’ UTR.
[0608] In some embodiments, the WPRE comprises a nucleotide sequence of SEQ ID NO: 153, or a nucleotide sequence at least 95% identical thereto. In some embodiments, the WPRE comprises a nucleotide sequence of SEQ ID NO: 154, or a nucleotide sequence at least 95% identical thereto. In some embodiments, the WPRE comprises a nucleotide sequence of SEQ ID NO: 155, or a nucleotide sequence at least 95% identical thereto.
[0609] In some embodiments, the first AAV vector and/or the second AAV vector further comprise one or more filler sequences. The filler sequence may be a wild-type sequence or an engineered sequence. A filler sequence may be a variant of a wild-type sequence.
[0610] The AAV vector may comprise one or more filler sequences in order to have the optimal length for packaging. In some embodiments, the AAV vector comprises any portion of a filler sequence, e.g., 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, or 99% of a filler sequence. The filler sequences can be located at any position within the AAV vector, for example, 3’ to the 5’ ITR sequence, 5’ to the promoter sequence, 3’ to the polyadenylation sequence, or 5’ to the 3’ ITR sequence.
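The role of a filler sequence in reaching a desired packaging length can be illustrated with a short Python sketch. The element names, the element lengths and the target length in the example are hypothetical values chosen for illustration; they are not taken from the present disclosure.

```python
def filler_length(element_lengths, target_length_nt):
    """Return the number of filler nucleotides needed to reach the target length.

    element_lengths maps each vector element (hypothetical names) to its
    length in nucleotides. Raises ValueError if the elements already exceed
    the target, since a filler cannot shorten the vector.
    """
    used = sum(element_lengths.values())
    if used > target_length_nt:
        raise ValueError("elements already exceed the target length")
    return target_length_nt - used


# Hypothetical element lengths, for illustration only.
example_elements = {
    "5' ITR": 145,
    "promoter": 500,
    "coding region": 3000,
    "polyA": 200,
    "3' ITR": 145,
}
```

With these hypothetical values the elements total 3,990 nucleotides, so `filler_length(example_elements, 4700)` indicates that 710 nucleotides of filler would bring the vector to a 4,700-nucleotide target.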
[0611] In order to assess the expression of a truncated dystrophin protein of the invention, the vectors to be introduced into a cell may also contain either a selectable marker gene or a reporter gene or both to facilitate identification and selection of expressing cells from the population of cells sought to be transfected or infected through viral vectors. In some embodiments, the selectable marker may be carried on a separate piece of DNA and used in a co-transfection procedure. Both selectable markers and reporter genes may be flanked with appropriate regulatory sequences to enable expression in the host cells. Useful selectable markers include, for example, antibiotic-resistance genes, such as neo and the like. Reporter genes are used for identifying potentially transfected cells and for evaluating the functionality of regulatory sequences. In general, a reporter gene is a gene that is not present in or expressed by the recipient organism or tissue and that encodes a polypeptide whose expression is manifested by some easily detectable property, e.g., enzymatic activity. Expression of the reporter gene is assayed at a suitable time after the DNA has been introduced into the recipient cells. Suitable reporter genes may include genes encoding luciferase, beta-galactosidase, chloramphenicol acetyl transferase, secreted alkaline phosphatase, or the green fluorescent protein gene. Suitable expression systems are well known and may be prepared using known techniques or obtained commercially.
Exemplary AAV vectors
[0612] In some embodiments, the first AAV vector comprises, in 5’ to 3’ order, a 5’ ITR sequence, a promoter sequence, a first coding region encoding an N-terminal portion of the truncated dystrophin protein, an intron splice donor sequence, a 3’ ribozyme, and a 3’ ITR sequence.
[0613] In some embodiments, the first AAV vector comprises, in 5’ to 3’ order, a 5’ ITR sequence, a promoter sequence, a first coding region encoding an N-terminal portion of the truncated dystrophin protein, an intron splice donor sequence, a 3’ ribozyme, a polyadenylation sequence, and a 3’ ITR sequence.
[0614] In some embodiments, the first AAV vector comprises, in 5’ to 3’ order, a 5’ ITR sequence, a promoter sequence, an intron region, a first coding region encoding an N-terminal portion of the truncated dystrophin protein, an intron splice donor sequence, a 3’ ribozyme, a polyadenylation sequence, and a 3’ ITR sequence.
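The 5’ to 3’ element orders recited for the first AAV vector in paragraphs [0612] to [0614] can be represented as ordered lists, as in the following minimal Python sketch. The short element labels and the function name are hypothetical conveniences introduced for illustration; only the order of the elements is taken from the paragraphs above.

```python
# Element orders for the first AAV vector, keyed by the paragraph reciting them.
# Labels are shortened, hypothetical names for the recited elements.
FIRST_VECTOR_LAYOUTS = {
    "[0612]": ["5' ITR", "promoter", "first coding region",
               "splice donor", "3' ribozyme", "3' ITR"],
    "[0613]": ["5' ITR", "promoter", "first coding region",
               "splice donor", "3' ribozyme", "polyA", "3' ITR"],
    "[0614]": ["5' ITR", "promoter", "intron region", "first coding region",
               "splice donor", "3' ribozyme", "polyA", "3' ITR"],
}


def matching_layouts(elements):
    """Return the paragraph labels whose recited 5'-to-3' order equals `elements`."""
    candidate = list(elements)
    return [label for label, layout in FIRST_VECTOR_LAYOUTS.items()
            if layout == candidate]
```

A candidate construct listed in 5’ to 3’ order can then be compared against the recited layouts; an order that matches none of them, such as one placing the 3’ ribozyme upstream of the first coding region, returns an empty list.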
[0615] In some embodiments, the first AAV vector comprises a nucleotide sequence selected from the group consisting of SEQ ID NO: 160, 162, 164, 166, 168, 170, 172, 174, 176, 178, 180, 182, 184, 186, 188, 190, 192, 194, 196, 198, 200, 301, 303, 305, 307, 308, 310, 312, 314, 394, and 395, or a nucleotide sequence at least 90% identical thereto.
[0616] In some embodiments, the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 160, or a nucleotide sequence at least 90% identical thereto.
[0617] In some embodiments, the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 162, or a nucleotide sequence at least 90% identical thereto.
[0618] In some embodiments, the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 164, or a nucleotide sequence at least 90% identical thereto.
[0619] In some embodiments, the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 166, or a nucleotide sequence at least 90% identical thereto.
[0620] In some embodiments, the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 168, or a nucleotide sequence at least 90% identical thereto.
[0621] In some embodiments, the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 170, or a nucleotide sequence at least 90% identical thereto.
[0622] In some embodiments, the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 172, or a nucleotide sequence at least 90% identical thereto.
[0623] In some embodiments, the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 174, or a nucleotide sequence at least 90% identical thereto.
[0624] In some embodiments, the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 176, or a nucleotide sequence at least 90% identical thereto.
[0625] In some embodiments, the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 178, or a nucleotide sequence at least 90% identical thereto.
[0626] In some embodiments, the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 180, or a nucleotide sequence at least 90% identical thereto.
[0627] In some embodiments, the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 182, or a nucleotide sequence at least 90% identical thereto.
[0628] In some embodiments, the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 184, or a nucleotide sequence at least 90% identical thereto.
[0629] In some embodiments, the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 186, or a nucleotide sequence at least 90% identical thereto.
[0630] In some embodiments, the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 188, or a nucleotide sequence at least 90% identical thereto.
[0631] In some embodiments, the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 190, or a nucleotide sequence at least 90% identical thereto.
[0632] In some embodiments, the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 192, or a nucleotide sequence at least 90% identical thereto.
[0633] In some embodiments, the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 194, or a nucleotide sequence at least 90% identical thereto.
[0634] In some embodiments, the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 196, or a nucleotide sequence at least 90% identical thereto.
[0635] In some embodiments, the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 198, or a nucleotide sequence at least 90% identical thereto.
[0636] In some embodiments, the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 200, or a nucleotide sequence at least 90% identical thereto.
[0637] In some embodiments, the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 301, or a nucleotide sequence at least 90% identical thereto.
[0638] In some embodiments, the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 303, or a nucleotide sequence at least 90% identical thereto.
[0639] In some embodiments, the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 305, or a nucleotide sequence at least 90% identical thereto.
[0640] In some embodiments, the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 307, or a nucleotide sequence at least 90% identical thereto.
[0641] In some embodiments, the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 308, or a nucleotide sequence at least 90% identical thereto.
[0642] In some embodiments, the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 310, or a nucleotide sequence at least 90% identical thereto.
[0643] In some embodiments, the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 312, or a nucleotide sequence at least 90% identical thereto.
[0644] In some embodiments, the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 314, or a nucleotide sequence at least 90% identical thereto.
[0645] In some embodiments, the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 394, or a nucleotide sequence at least 90% identical thereto.
[0646] In some embodiments, the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 395, or a nucleotide sequence at least 90% identical thereto.
[0647] In some embodiments, the second AAV vector comprises, in 5’ to 3’ order, a 5’ ITR sequence, a promoter sequence, a 5’ ribozyme, an intron splice acceptor sequence, a second coding region encoding a C-terminal portion of the truncated dystrophin protein, and a 3’ ITR sequence.
[0648] In some embodiments, the second AAV vector comprises, in 5’ to 3’ order, a 5’ ITR sequence, a promoter sequence, a 5’ ribozyme, an intron splice acceptor sequence, a second coding region encoding a C-terminal portion of the truncated dystrophin protein, a polyadenylation sequence, and a 3’ ITR sequence.
[0649] In some embodiments, the second AAV vector comprises, in 5’ to 3’ order, a 5’ ITR sequence, a promoter sequence, a 5’ ribozyme, an intron splice acceptor sequence, a second coding region encoding a C-terminal portion of the truncated dystrophin protein, a WPRE sequence, a polyadenylation sequence, and a 3’ ITR sequence.
[0650] In some embodiments, the second AAV vector comprises a nucleotide sequence selected from the group consisting of SEQ ID NO: 161, 163, 165, 167, 169, 171, 173, 175, 177, 179, 181, 183, 185, 187, 189, 191, 193, 195, 197, 199, 201, 302, 304, 306, 309, 311, and 313, or a nucleotide sequence at least 90% identical thereto.
[0651] In some embodiments, the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 161, or a nucleotide sequence at least 90% identical thereto.
[0652] In some embodiments, the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 163, or a nucleotide sequence at least 90% identical thereto.
[0653] In some embodiments, the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 165, or a nucleotide sequence at least 90% identical thereto.
[0654] In some embodiments, the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 167, or a nucleotide sequence at least 90% identical thereto.
[0655] In some embodiments, the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 169, or a nucleotide sequence at least 90% identical thereto.
[0656] In some embodiments, the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 171, or a nucleotide sequence at least 90% identical thereto.
[0657] In some embodiments, the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 173, or a nucleotide sequence at least 90% identical thereto.
[0658] In some embodiments, the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 175, or a nucleotide sequence at least 90% identical thereto.
[0659] In some embodiments, the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 177, or a nucleotide sequence at least 90% identical thereto.
[0660] In some embodiments, the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 179, or a nucleotide sequence at least 90% identical thereto.
[0661] In some embodiments, the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 181, or a nucleotide sequence at least 90% identical thereto.
[0662] In some embodiments, the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 183, or a nucleotide sequence at least 90% identical thereto.
[0663] In some embodiments, the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 185, or a nucleotide sequence at least 90% identical thereto.
[0664] In some embodiments, the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 187, or a nucleotide sequence at least 90% identical thereto.
[0665] In some embodiments, the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 189, or a nucleotide sequence at least 90% identical thereto.
[0666] In some embodiments, the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 191, or a nucleotide sequence at least 90% identical thereto.
[0667] In some embodiments, the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 193, or a nucleotide sequence at least 90% identical thereto.
[0668] In some embodiments, the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 195, or a nucleotide sequence at least 90% identical thereto.
[0669] In some embodiments, the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 197, or a nucleotide sequence at least 90% identical thereto.
[0670] In some embodiments, the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 199, or a nucleotide sequence at least 90% identical thereto.
[0671] In some embodiments, the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 201, or a nucleotide sequence at least 90% identical thereto.
[0672] In some embodiments, the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 302, or a nucleotide sequence at least 90% identical thereto.
[0673] In some embodiments, the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 304, or a nucleotide sequence at least 90% identical thereto.
[0674] In some embodiments, the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 306, or a nucleotide sequence at least 90% identical thereto.
[0675] In some embodiments, the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 309, or a nucleotide sequence at least 90% identical thereto.
[0676] In some embodiments, the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 311, or a nucleotide sequence at least 90% identical thereto.
[0677] In some embodiments, the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 313, or a nucleotide sequence at least 90% identical thereto.
[0678] In some embodiments, the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 160, or a nucleotide sequence at least 90% identical thereto, and the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 161, or a nucleotide sequence at least 90% identical thereto.
[0679] In some embodiments, the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 162, or a nucleotide sequence at least 90% identical thereto, and the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 163, or a nucleotide sequence at least 90% identical thereto.
[0680] In some embodiments, the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 164, or a nucleotide sequence at least 90% identical thereto, and the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 165, or a nucleotide sequence at least 90% identical thereto.
[0681] In some embodiments, the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 166, or a nucleotide sequence at least 90% identical thereto, and the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 167, or a nucleotide sequence at least 90% identical thereto.
[0682] In some embodiments, the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 168, or a nucleotide sequence at least 90% identical thereto, and the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 169, or a nucleotide sequence at least 90% identical thereto.
[0683] In some embodiments, the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 170, or a nucleotide sequence at least 90% identical thereto, and the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 171, or a nucleotide sequence at least 90% identical thereto.
[0684] In some embodiments, the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 172, or a nucleotide sequence at least 90% identical thereto, and the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 173, or a nucleotide sequence at least 90% identical thereto.
[0685] In some embodiments, the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 174, or a nucleotide sequence at least 90% identical thereto, and the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 175, or a nucleotide sequence at least 90% identical thereto.
[0686] In some embodiments, the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 176, or a nucleotide sequence at least 90% identical thereto, and the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 177, or a nucleotide sequence at least 90% identical thereto.
[0687] In some embodiments, the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 178, or a nucleotide sequence at least 90% identical thereto, and the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 179, or a nucleotide sequence at least 90% identical thereto.
[0688] In some embodiments, the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 180, or a nucleotide sequence at least 90% identical thereto, and the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 181, or a nucleotide sequence at least 90% identical thereto.
[0689] In some embodiments, the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 182, or a nucleotide sequence at least 90% identical thereto, and the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 183, or a nucleotide sequence at least 90% identical thereto.
[0690] In some embodiments, the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 184, or a nucleotide sequence at least 90% identical thereto, and the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 185, or a nucleotide sequence at least 90% identical thereto.
[0691] In some embodiments, the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 186, or a nucleotide sequence at least 90% identical thereto, and the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 187, or a nucleotide sequence at least 90% identical thereto.
[0692] In some embodiments, the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 188, or a nucleotide sequence at least 90% identical thereto, and the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 189, or a nucleotide sequence at least 90% identical thereto.
[0693] In some embodiments, the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 190, or a nucleotide sequence at least 90% identical thereto, and the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 191, or a nucleotide sequence at least 90% identical thereto.
[0694] In some embodiments, the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 192, or a nucleotide sequence at least 90% identical thereto, and the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 193, or a nucleotide sequence at least 90% identical thereto.
[0695] In some embodiments, the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 194, or a nucleotide sequence at least 90% identical thereto, and the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 195, or a nucleotide sequence at least 90% identical thereto.
[0696] In some embodiments, the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 196, or a nucleotide sequence at least 90% identical thereto, and the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 197, or a nucleotide sequence at least 90% identical thereto.
[0697] In some embodiments, the first AAV vector comprises a nucleotide sequence of SEQ ID NO: 198, or a nucleotide sequence at least 90% identical thereto, and the second AAV vector comprises a nucleotide sequence of SEQ ID NO: 199, or a nucleotide sequence at least 90% identical thereto.
[0698] In some embodiments, the first AAV vector comprises a nucleotide sequence of SEQ ID NO:200, or a nucleotide sequence at least 90% identical thereto, and the second AAV vector comprises a nucleotide sequence of SEQ ID NO:201, or a nucleotide sequence at least 90% identical thereto.
[0699] In some embodiments, the first AAV vector comprises a nucleotide sequence of SEQ ID NO:301, or a nucleotide sequence at least 90% identical thereto, and the second AAV vector comprises a nucleotide sequence of SEQ ID NO:302, or a nucleotide sequence at least 90% identical thereto.
[0700] In some embodiments, the first AAV vector comprises a nucleotide sequence of SEQ ID NO:303, or a nucleotide sequence at least 90% identical thereto, and the second AAV vector comprises a nucleotide sequence of SEQ ID NO:304, or a nucleotide sequence at least 90% identical thereto.
[0701] In some embodiments, the first AAV vector comprises a nucleotide sequence of SEQ ID NO:305, or a nucleotide sequence at least 90% identical thereto, and the second AAV vector comprises a nucleotide sequence of SEQ ID NO:306, or a nucleotide sequence at least 90% identical thereto.
[0702] In some embodiments, the first AAV vector comprises a nucleotide sequence of SEQ ID NO:307, or a nucleotide sequence at least 90% identical thereto, and the second AAV vector comprises a nucleotide sequence of SEQ ID NO:306, or a nucleotide sequence at least 90% identical thereto.
[0703] In some embodiments, the first AAV vector comprises a nucleotide sequence of SEQ ID NO:308, or a nucleotide sequence at least 90% identical thereto, and the second AAV vector comprises a nucleotide sequence of SEQ ID NO:309, or a nucleotide sequence at least 90% identical thereto.
[0704] In some embodiments, the first AAV vector comprises a nucleotide sequence of SEQ ID NO:310, or a nucleotide sequence at least 90% identical thereto, and the second AAV vector comprises a nucleotide sequence of SEQ ID NO:311, or a nucleotide sequence at least 90% identical thereto.
[0705] In some embodiments, the first AAV vector comprises a nucleotide sequence of SEQ ID NO:312, or a nucleotide sequence at least 90% identical thereto, and the second AAV vector comprises a nucleotide sequence of SEQ ID NO:313, or a nucleotide sequence at least 90% identical thereto.
[0706] In some embodiments, the first AAV vector comprises a nucleotide sequence of SEQ ID NO:314, or a nucleotide sequence at least 90% identical thereto, and the second AAV vector comprises a nucleotide sequence of SEQ ID NO:309, or a nucleotide sequence at least 90% identical thereto.
[0707] In some embodiments, the first AAV vector comprises a nucleotide sequence of SEQ ID NO:394, or a nucleotide sequence at least 90% identical thereto, and the second AAV vector comprises a nucleotide sequence of SEQ ID NO:309, or a nucleotide sequence at least 90% identical thereto.
[0708] In some embodiments, the first AAV vector comprises a nucleotide sequence of SEQ ID NO:395, or a nucleotide sequence at least 90% identical thereto, and the second AAV vector comprises a nucleotide sequence of SEQ ID NO:306, or a nucleotide sequence at least 90% identical thereto.
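By way of non-limiting illustration only, the "at least 90% identical" criterion recited in the foregoing embodiments can be sketched as a simple check over two sequences that have already been aligned. The function names, the use of "-" as the gap character, and the choice of the full alignment length as the denominator are assumptions made for this example; they are not a definition of percent identity taken from the present disclosure, and other conventions (for example, excluding gapped columns) would give different values.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences (hypothetical helper).

    Assumptions: both strings come from the same pairwise alignment,
    gaps are written as '-', and the denominator is the alignment length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    # Count columns where both residues match and neither is a gap.
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 90.0) -> bool:
    """True when the aligned pair is at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

In this sketch, two aligned ten-nucleotide sequences that differ at a single position score 90% and therefore satisfy the threshold; in practice the alignment itself would be produced by a dedicated alignment tool.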
IV. Viral Production
[0709] Adeno-associated viral (AAV) production includes processes and methods for producing AAV particles and vectors which can contact a target cell to deliver a payload, e.g. a recombinant viral construct, which includes a nucleic acid molecule encoding a payload molecule, e.g., a truncated dystrophin protein or portion thereof.
[0710] In some embodiments, disclosed herein is a method of making a recombinant AAV particle of the present disclosure, the method comprising (i) providing a host cell comprising a viral genome described herein, e.g., a nucleic acid comprising a coding region encoding a truncated dystrophin protein or portion thereof, and incubating the host cell under conditions suitable to encapsulate the viral genome in a capsid protein, thereby making the recombinant AAV particle. In some embodiments, the method comprises prior to step (i), introducing a first nucleic acid comprising the viral genome into a cell. In some embodiments, the host cell comprises a second nucleic acid encoding the capsid protein. In some embodiments, the second nucleic acid is introduced into the host cell prior to, concurrently with, or after the first nucleic acid molecule. In some embodiments, the host cell is a bacterial cell, a mammalian cell (e.g., a HEK293 cell), or an insect cell (e.g., an Sf9 cell).
[0711] In some embodiments, disclosed herein is a method for making a first recombinant AAV particle, the method comprises providing a host cell comprising a first nucleic acid molecule encoding an N-terminal portion of a truncated dystrophin protein, and incubating the host cell under conditions suitable to encapsulate the first nucleic acid in an AAV capsid protein; thereby making the first recombinant AAV particle.
[0712] In some embodiments, disclosed herein is a method for making a second recombinant AAV particle, the method comprises providing a host cell comprising a second nucleic acid molecule encoding a C-terminal portion of a truncated dystrophin protein, and incubating the host cell under conditions suitable to encapsulate the second nucleic acid in an AAV capsid protein; thereby making the second recombinant AAV particle.
[0713] In various embodiments, methods are provided herein of producing AAV particles or vectors by (a) contacting a viral production cell with one or more viral expression constructs encoding at least one AAV capsid protein, and one or more payload constructs encoding a payload molecule, e.g., a truncated dystrophin protein or portion thereof, e.g., an N-terminal portion of a truncated dystrophin protein, or a C-terminal portion of a truncated dystrophin protein; (b) culturing the viral production cell under conditions such that at least one AAV particle or vector is produced, and (c) isolating the AAV particle or vector from the production stream.
[0714] In these methods, a viral expression construct may encode at least one structural protein and/or at least one non-structural protein. The structural protein may include any of the native or wild type capsid proteins VP1, VP2, and/or VP3, or a chimeric protein thereof. In some embodiments, the VP1 capsid protein may be an sL65 VP1 capsid protein. The non-structural protein may include any of the native or wild type Rep78, Rep68, Rep52, and/or Rep40 proteins or a chimeric protein thereof.
[0715] In certain embodiments, contacting occurs via transient transfection, viral transduction, and/or electroporation.
[0716] In certain embodiments, the viral production cell is selected from a mammalian cell and an insect cell. In certain embodiments, the insect cell includes a Spodoptera frugiperda insect cell. In certain embodiments, the insect cell includes an Sf9 insect cell. In certain embodiments, the insect cell includes an Sf21 insect cell.
[0717] The payload construct vector of the present disclosure may include, in various embodiments, at least one inverted terminal repeat (ITR) and may include mammalian DNA.
[0718] In various embodiments, the AAV particles or viral vectors may be formulated as a pharmaceutical composition with one or more acceptable excipients.
[0719] In certain embodiments, the AAV particles are produced in an insect cell (e.g., Spodoptera frugiperda (Sf9) cell) using a method described herein. As a non-limiting example, the insect cell is contacted using viral transduction which may include baculoviral transduction.
[0720] In certain embodiments, the AAV particles are produced in a mammalian cell (e.g., HEK293 cell) using a method described herein. As a non-limiting example, the mammalian cell is contacted using viral transduction which may include multiplasmid transient transfection (such as triple plasmid transient transfection).
[0721] In certain embodiments, the AAV particle production method described herein produces greater than 10^1, greater than 10^2, greater than 10^3, greater than 10^4, or greater than 10^5 AAV particles in a viral production cell.
[0722] In certain embodiments, a process of the present disclosure includes production of viral particles in a viral production cell using a viral production system which includes at least one viral expression construct and at least one payload construct. The at least one viral expression construct and at least one payload construct can be co-transfected (e.g. dual transfection, triple transfection) into a viral production cell. The transfection is completed using standard molecular biology techniques known and routinely performed by a person skilled in the art. The viral production cell provides the cellular machinery necessary for expression of the proteins and other biomaterials necessary for producing the AAV particles, including Rep proteins which replicate the payload construct and Cap proteins which assemble to form a capsid that encloses the replicated payload constructs. The resulting AAV particle is extracted from the viral production cells and processed into a pharmaceutical preparation for administration.
[0723] In various embodiments, once administered, an AAV particle disclosed herein may, without being bound by theory, contact a target cell and enter the cell. The AAV particles may subsequently contact the nucleus of the target cell to deliver the payload construct. The payload construct, e.g. recombinant viral construct, may be delivered to the nucleus of the target cell wherein the payload molecule encoded by the payload construct may be expressed.
[0724] In certain embodiments, the process for production of viral particles utilizes seed cultures of viral production cells that include one or more baculoviruses (e.g., a Baculoviral Expression Vector (BEV) or a baculovirus infected insect cell (BIIC) that has been transfected with a viral expression construct and a payload construct vector). In certain embodiments, the seed cultures are harvested, divided into aliquots and frozen, and may be used at a later time point to initiate an infection of a naive population of production cells.
[0725] In some embodiments, large scale production of AAV particles utilizes a bioreactor. Without being bound by theory, the use of a bioreactor may allow for the precise measurement and/or control of variables that support the growth and activity of viral production cells such as mass, temperature, mixing conditions (impeller RPM or wave oscillation), CO2 concentration, O2 concentration, gas sparge rates and volumes, gas overlay rates and volumes, pH, Viable Cell Density (VCD), cell viability, cell diameter, and/or optical density (OD). In certain embodiments, the bioreactor is used for batch production in which the entire culture is harvested at an experimentally determined time point and AAV particles are purified. In some embodiments, the bioreactor is used for continuous production in which a portion of the culture is harvested at an experimentally determined time point for purification of AAV particles, and the remaining culture in the bioreactor is refreshed with additional growth media components.
[0726] In various embodiments, AAV viral particles can be extracted from viral production cells in a process which includes cell lysis, clarification, sterilization and purification. Cell lysis includes any process that disrupts the structure of the viral production cell, thereby releasing AAV particles. In certain embodiments, cell lysis may include thermal shock, chemical, or mechanical lysis methods. Clarification can include the gross purification of the mixture of lysed cells, media components, and AAV particles. In certain embodiments, clarification includes centrifugation and/or filtration, including but not limited to depth end, tangential flow, and/or hollow fiber filtration.
[0727] In various embodiments, the end result of viral production is a purified collection of AAV particles which include two components: (1) a payload construct (e.g. a recombinant AAV vector genome construct) and (2) a viral capsid.
[0728] The viral production cell may be selected from any biological organism, including prokaryotic (e.g., bacterial) cells, and eukaryotic cells, including, insect cells, yeast cells and mammalian cells.
[0729] In certain embodiments, the AAV particles of the present disclosure may be produced in a viral production cell that includes a mammalian cell. Viral production cells may comprise mammalian cells such as A549, WEH1, 3T3, 10T1/2, BHK, MDCK, COS 1, COS 7, BSC 1, BSC 40, BMT 10, VERO, W138, HeLa, HEK293, HEK293T (293T), Saos, C2C12, L cells, HT1080, Huh7, HepG2, C127, 3T3, CHO, HeLa cells, KB cells, BHK and primary fibroblast, hepatocyte, and myoblast cells derived from mammals. Viral production cells can include cells derived from any mammalian species including, but not limited to, human, monkey, mouse, rat, rabbit, and hamster or cell type, including but not limited to fibroblast, hepatocyte, tumor cell, cell line transformed cell, etc.
[0730] In certain embodiments, AAV particles are produced in mammalian cells using a multiplasmid transient transfection method (such as triple plasmid transient transfection). In certain embodiments, the multiplasmid transient transfection method includes transfection of the following three different constructs: (i) a payload construct, (ii) a Rep/Cap construct (parvoviral Rep and parvoviral Cap), and (iii) a helper construct. In certain embodiments, the triple transfection method of the three components of AAV particle production may be utilized to produce small lots of virus for assays including transduction efficiency, target tissue (tropism) evaluation, and stability. In certain embodiments, the triple transfection method of the three components of AAV particle production may be utilized to produce large lots of materials for clinical or commercial applications.
[0731] In certain embodiments, mammalian viral production cells (e.g. 293T cells) can be in an adhesion/adherent state (e.g. with calcium phosphate) or a suspension state (e.g. with polyethyleneimine (PEI)). The mammalian viral production cell is transfected with plasmids required for production of AAV, (i.e., AAV rep/cap construct, an adenoviral helper construct, and/or ITR flanked payload construct). In certain embodiments, the transfection process can include optional medium changes (e.g. medium changes for cells in adhesion form, no medium changes for cells in suspension form, medium changes for cells in suspension form if desired). In certain embodiments, the transfection process can include transfection mediums such as DMEM or F17. In certain embodiments, the transfection medium can include serum or can be serum-free (e.g. cells in adhesion state with calcium phosphate and with serum, cells in suspension state with PEI and without serum).
[0732] Cells can subsequently be collected by scraping (adherent form) and/or pelleting (suspension form and scraped adherent form) and transferred into a receptacle. Collection steps can be repeated as necessary for full collection of produced cells. Next, cell lysis can be achieved by consecutive freeze-thaw cycles (-80°C to 37°C), chemical lysis (such as adding detergent triton), mechanical lysis, or by allowing the cell culture to degrade after reaching ~0% viability. Cellular debris is removed by centrifugation and/or depth filtration. The samples are quantified for AAV particles by DNase resistant genome titration by DNA qPCR or digital PCR.
[0733] AAV particle titers are measured according to genome copy number (genome particles per milliliter). Genome particle concentrations are based on DNA qPCR of the vector DNA as previously reported (Clark et al. (1999) Hum. Gene Ther., 10:1031-1039; Veldwijk et al. (2002) Mol. Ther., 6:272-278, the contents of which are each incorporated by reference in their entireties as related to the measurement of particle concentrations).
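By way of non-limiting illustration only, the qPCR-based titer measurement described above can be sketched as a standard-curve calculation: threshold cycle (Ct) values of standards with known copy numbers are fit to a line, a sample Ct is converted to copies per reaction, and the result is scaled to genome copies per milliliter. The function names, the linear Ct versus log10(copies) model, and the template volume and dilution factor parameters are assumptions made for this example and are not taken from the cited references. Corrections such as those for single-stranded genomes measured against double-stranded standards are not modeled.

```python
def fit_standard_curve(log10_copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept.

    Hypothetical helper: inputs are the standards' log10 copy numbers
    and their measured Ct values.
    """
    n = len(log10_copies)
    if n < 2 or n != len(ct_values):
        raise ValueError("need at least two matched standard points")
    mean_x = sum(log10_copies) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_copies)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log10_copies, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def genome_copies_per_ml(ct, slope, intercept,
                         template_volume_ul, dilution_factor):
    """Convert a sample Ct to genome copies per milliliter.

    Assumptions: `template_volume_ul` is the volume of diluted sample
    added to the reaction and `dilution_factor` is the total dilution
    of the original sample before it was added.
    """
    copies_in_reaction = 10 ** ((ct - intercept) / slope)
    copies_per_ul = copies_in_reaction / template_volume_ul
    return copies_per_ul * dilution_factor * 1000.0  # 1000 uL per mL
```

For instance, with standards that follow Ct = 40 - 3.3 x log10(copies), a sample Ct of 23.5 corresponds to 1e5 copies in the reaction; with 5 µL of a 1000-fold diluted sample this works out to 2e10 genome copies per milliliter.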
[0734] In certain embodiments, the AAV particles or viral vectors of the present disclosure may be produced in a viral production cell that includes an insect cell. Any insect cell which allows for replication of parvovirus and which can be maintained in culture can be used in accordance with the present disclosure. AAV viral production cells commonly used for production of recombinant AAV particles include, but are not limited to, Spodoptera frugiperda, including, but not limited to the Sf9 or Sf21 cell lines, Drosophila cell lines, or mosquito cell lines, such as Aedes albopictus derived cell lines. Use of insect cells for expression of heterologous proteins is well documented, as are methods of introducing nucleic acids, such as vectors, e.g., insect-cell compatible vectors, into such cells and methods of maintaining such cells in culture.
[0735] Expansion, culturing, transfection, infection and storage of insect cells can be carried out in any cell culture media, cell transfection media or storage media known in the art, including Hyclone™ SFX-Insect™ Cell Culture Media, Expression System ESF AF™ Insect Cell Culture Medium, ThermoFisher Sf-900II™ media, ThermoFisher Sf-900III™ media, or ThermoFisher Grace’s Insect Media. Insect cell mixtures of the present disclosure can also include any of the formulation additives or elements described in the present disclosure, including (but not limited to) salts, acids, bases, buffers, surfactants (such as Poloxamer 188/Pluronic F-68), and other known culture media elements. Formulation additives can be incorporated gradually or as “spikes” (incorporation of large volumes in a short time).
[0736] In certain embodiments, the AAV particles or viral vectors of the present disclosure may be produced in a baculoviral system using a viral expression construct and a payload construct vector. In certain embodiments, the baculoviral system includes Baculovirus expression vectors (BEVs) and/or baculovirus infected insect cells (BIICs). In certain embodiments, a viral expression construct or a payload construct of the present disclosure can be a bacmid, also known as a baculovirus plasmid or recombinant baculovirus genome. In certain embodiments, a viral expression construct or a payload construct of the present disclosure can be polynucleotide incorporated by homologous recombination (transposon donor/acceptor system) into a bacmid by standard molecular biology techniques known and performed by a person skilled in the art.
Transfection of separate viral replication cell populations produces two or more groups (e.g. two, three) of baculoviruses (BEVs), one or more group which can include the viral expression construct (Expression BEV), and one or more group which can include the payload construct (Payload BEV). The baculoviruses may be used to infect a viral production cell for production of AAV particles or viral vector.
[0737] In certain embodiments, the process includes transfection of a single viral replication cell population to produce a single baculovirus (BEV) group which includes both the viral expression construct and the payload construct. These baculoviruses may be used to infect a viral production cell for production of AAV particles or viral vector.
[0738] In certain embodiments, BEVs are produced using a Bacmid Transfection agent, such as Promega FuGENE® HD, WFI water, or ThermoFisher Cellfectin® II Reagent. In certain embodiments, BEVs are produced and expanded in viral production cells, such as an insect cell.
[0739] In some embodiments, the AAV particles or viral vectors of the present disclosure may be produced in insect cells (e.g., Sf9 cells).
[0740] In some embodiments, the AAV particles or viral vectors of the present disclosure may be produced using triple transfection.
[0741] In some embodiments, the AAV particles or viral vectors of the present disclosure may be produced in mammalian cells.
[0742] In some embodiments, the AAV particles or viral vectors of the present disclosure may be produced by triple transfection in mammalian cells.
[0743] In some embodiments, the AAV particles or viral vectors of the present disclosure may be produced by triple transfection in HEK293 cells.
[0744] The AAV particles or vectors encoding the truncated dystrophin protein or portion thereof, as described herein, may be useful in the fields of human disease, veterinary applications and a variety of in vivo and in vitro settings. The AAV particles or vectors of the present disclosure may be useful in the field of medicine for the treatment, prophylaxis, palliation, or amelioration of dystrophin-associated diseases and/or disorders, e.g., muscular dystrophy, e.g., DMD. In some embodiments, the AAV particles or vectors of the disclosure are used for the prevention and/or treatment of dystrophin-associated disorders, e.g., muscular dystrophy, e.g., DMD.
V. Pharmaceutical Composition
[0745] The present disclosure provides compositions comprising the isolated nucleic acid molecules encoding the truncated dystrophin protein or portion thereof of the invention, and one or more excipients. The present disclosure also provides pharmaceutical compositions comprising the isolated nucleic acid molecules encoding the truncated dystrophin protein or portion thereof of the invention, and one or more pharmaceutically acceptable excipients. The present disclosure also provides pharmaceutical compositions comprising the AAV vectors comprising the isolated nucleic acid molecules encoding the truncated dystrophin protein or portion thereof, and one or more pharmaceutically acceptable excipients.
[0746] In some embodiments, the pharmaceutical composition comprises a first nucleic acid molecule comprising a first coding region encoding an N-terminal portion of the truncated dystrophin protein, and a second nucleic acid molecule comprising a second coding region encoding a C-terminal portion of the truncated dystrophin protein.
[0747] In some embodiments, the first nucleic acid molecule and the second nucleic acid molecule are presented at a ratio of 1:1, 1:2, 1:3, 1:4, 1:5, 2:1, 3:1, 4:1, or 5:1.
[0748] In some embodiments, the pharmaceutical composition comprises a first AAV vector comprising a first nucleic acid molecule comprising a first coding region encoding an N-terminal portion of the truncated dystrophin protein, and a second AAV vector comprising a second nucleic acid molecule comprising a second coding region encoding a C-terminal portion of the truncated dystrophin protein.
[0749] In some embodiments, the first AAV vector and the second AAV vector are presented at a ratio of 1:1, 1:2, 1:3, 1:4, 1:5, 2:1, 3:1, 4:1, or 5:1.
[0750] In some embodiments, the present disclosure provides a first pharmaceutical composition comprising a first AAV vector comprising a first nucleic acid molecule comprising a first coding region encoding an N-terminal portion of the truncated dystrophin protein, and a second pharmaceutical composition comprising a second AAV vector comprising a second nucleic acid molecule comprising a second coding region encoding a C-terminal portion of the truncated dystrophin protein.
[0751] In some embodiments, the first pharmaceutical composition and the second pharmaceutical composition are administered at a ratio of 1:1, 1:2, 1:3, 1:4, 1:5, 2:1, 3:1, 4:1, or 5:1.
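By way of non-limiting illustration only, the ratios recited above can be sketched as a split of a combined amount into the portions contributed by the first and second vectors. The function name, the list name, and the choice of vector genomes as the unit are assumptions made for this example; the present disclosure does not specify here the basis (for example, vector genomes, particles, mass, or volume) on which the ratio is expressed.

```python
from typing import Tuple

# Ratios recited in the text, written as (first, second).
RECITED_RATIOS = [(1, 1), (1, 2), (1, 3), (1, 4), (1, 5),
                  (2, 1), (3, 1), (4, 1), (5, 1)]


def split_by_ratio(total_amount: float,
                   ratio: Tuple[int, int]) -> Tuple[float, float]:
    """Split a combined amount between the first and second vectors.

    Hypothetical helper: `total_amount` is in whatever unit the ratio
    is expressed in (assumed here to be vector genomes).
    """
    if ratio not in RECITED_RATIOS:
        raise ValueError(f"ratio {ratio} is not one of the recited ratios")
    if total_amount < 0:
        raise ValueError("total amount must be non-negative")
    first_parts, second_parts = ratio
    total_parts = first_parts + second_parts
    first = total_amount * first_parts / total_parts
    second = total_amount * second_parts / total_parts
    return first, second
```

For instance, a combined amount of 6e13 split at 1:2 gives 2e13 for the first vector and 4e13 for the second.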
[0752] The pharmaceutically acceptable excipient may be any functional molecules as vehicles, adjuvants, carriers, or diluents known in the art. The pharmaceutically acceptable excipient may be a transfection facilitating agent, which may include surface active agents, such as immune-stimulating complexes (ISCOMS), Freund’s incomplete adjuvant, LPS analog including monophosphoryl lipid A, muramyl peptides, quinone analogs, vesicles such as squalene and squalene, hyaluronic acid, lipids, liposomes, calcium ions, viral proteins, polyanions, polycations, or nanoparticles, or other known transfection facilitating agents.
[0753] Although the descriptions of pharmaceutical compositions provided herein are principally directed to pharmaceutical compositions which are suitable for administration to humans, it will be understood by the skilled artisan that such compositions are generally suitable for administration to any other animal, e.g., to non-human animals, e.g. non-human mammals. Modification of pharmaceutical compositions suitable for administration to humans in order to render the compositions suitable for administration to various animals is well understood, and the ordinarily skilled veterinary pharmacologist can design and/or perform such modification with merely ordinary, if any, experimentation.
[0754] Subjects to which administration of the pharmaceutical compositions is contemplated include, but are not limited to, humans and/or other primates; mammals, including commercially relevant mammals such as cattle, pigs, horses, sheep, cats, dogs, mice, and/or rats; and/or birds, including commercially relevant birds such as poultry, chickens, ducks, geese, and/or turkeys. In some embodiments, the compositions are administered to humans, human patients, or subjects.
[0755] A pharmaceutical composition in accordance with the present disclosure may be prepared, packaged, and/or sold in bulk, as a single unit dose, and/or as a plurality of single unit doses. As used herein, a “unit dose” refers to a discrete amount of the pharmaceutical composition comprising a predetermined amount of the active ingredient. The amount of the active ingredient is generally equal to the dosage of the active ingredient which would be administered to a subject and/or a convenient fraction of such a dosage such as, for example, one-half or one-third of such a dosage.
[0756] As used herein "pharmaceutically acceptable carrier" refers to any substantially non-toxic carrier conventionally useable for administration of pharmaceuticals in which the isolated nucleic acid molecules or AAV vectors of the present disclosure will remain stable and bioavailable. The pharmaceutically acceptable carrier must be of sufficiently high purity and of sufficiently low toxicity to render it suitable for administration to the mammal being treated. It further should maintain the stability and bioavailability of an active agent. The pharmaceutically acceptable carrier can be liquid or solid and is selected, with the planned manner of administration in mind, to provide for the desired bulk, consistency, etc., when combined with an active agent and other components of a given composition.
[0757] Suitable pharmaceutically acceptable carriers include any and all solvents, dispersion media, coatings, antibacterial and antifungal agents, isotonic and absorption delaying agents, and the like that are physiologically compatible. Pharmaceutically acceptable carriers also include sterile aqueous solutions or dispersions and sterile powders for the extemporaneous preparation of sterile injectable solutions or dispersion. The use of such media and agents for pharmaceutically active substances is well known in the art. Except insofar as any conventional media or agent is incompatible with the gene therapy vector, use thereof in the pharmaceutical compositions of the disclosure is contemplated. Supplementary active compounds can also be incorporated into the compositions.
[0758] Pharmaceutical compositions of the disclosure may be formulated for delivery to animals for veterinary purposes (e.g., livestock (cattle, pigs, dogs, mice, rats)) and other non-human mammalian subjects, as well as to human subjects.
[0759] In one embodiment, the pharmaceutical compositions of the present disclosure are in the form of injectable compositions. The compositions can be prepared as an injectable, either as liquid solutions or suspensions. The preparation may also be emulsified. Suitable excipients are, for example, water, saline, dextrose, glycerol, ethanol, phosphate buffered saline or the like and combinations thereof. In addition, if desired, the preparation may contain minor amounts of auxiliary substances such as wetting or emulsifying agents, pH-buffering agents, adjuvants, surfactant or immunopotentiators.
[0760] Sterile injectable solutions can be prepared by incorporating the compositions of the disclosure in the required amount in an appropriate solvent with one or a combination of ingredients enumerated above, as required, followed by filtered sterilization. Generally, dispersions are prepared by incorporating the active compound into a sterile vehicle which contains a basic dispersion medium and the required other ingredients from those enumerated above. In the case of sterile powders for the preparation of sterile injectable solutions, the preferred methods of preparation include vacuum drying and freeze-drying which yields a powder of the active ingredient plus any additional desired ingredient from a previously sterile-filtered solution thereof.
[0761] Toxicity and therapeutic efficacy of nucleic acid molecules described herein can be determined by standard pharmaceutical procedures in cell cultures or experimental animals, e.g., for determining the ED50 (the dose therapeutically effective in 50% of the population). Data obtained from cell culture assays and/or animal studies can be used in formulating a range of dosage for use in humans. The dosage typically will lie within a range of concentrations that include the ED50 with little or no toxicity. The dosage may vary within this range depending upon the dosage form employed and the route of administration utilized. For any compound used in the method of the disclosure, the therapeutically effective dose can be estimated initially from cell culture assays.
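By way of non-limiting illustration only, an ED50 of the kind referred to above can be estimated from dose-response data by interpolating, on a log10 dose scale, between the two tested doses whose response fractions bracket 50%. The function name and the interpolation scheme are assumptions made for this example; the present disclosure does not prescribe a particular estimation method, and a fitted sigmoidal model would commonly be used instead.

```python
import math


def estimate_ed50(doses, fraction_responding):
    """Estimate ED50 by log-linear interpolation (hypothetical helper).

    Assumptions: doses are positive, `fraction_responding` holds values
    between 0 and 1 for each dose, and the response crosses 0.5 between
    two adjacent tested doses.
    """
    if len(doses) != len(fraction_responding) or len(doses) < 2:
        raise ValueError("need at least two matched dose/response points")
    points = sorted(zip(doses, fraction_responding))
    # Walk adjacent dose pairs and find the pair that brackets 50%.
    for (d_lo, r_lo), (d_hi, r_hi) in zip(points, points[1:]):
        if r_lo <= 0.5 <= r_hi and r_hi > r_lo:
            t = (0.5 - r_lo) / (r_hi - r_lo)
            log_lo, log_hi = math.log10(d_lo), math.log10(d_hi)
            return 10 ** (log_lo + t * (log_hi - log_lo))
    raise ValueError("response does not cross 50% within the tested doses")
```

For instance, if 30% of subjects respond at a dose of 10 and 70% respond at a dose of 100 (arbitrary units), the estimate falls halfway between the two doses on the log scale, at approximately 31.6.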
[0762] Formulations of the pharmaceutical compositions described herein may be prepared by any method known or hereafter developed in the art of pharmacology. In general, such preparatory methods include the step of bringing the active ingredient into association with an excipient and/or one or more other accessory ingredients, and then, if necessary and/or desirable, dividing, shaping and/or packaging the product into a desired single- or multi-dose unit.
[0763] Relative amounts of the active ingredient, the pharmaceutically acceptable excipient, and/or any additional ingredients in a pharmaceutical composition in accordance with the disclosure will vary, depending upon the identity, size, and/or condition of the subject treated and further depending upon the route by which the composition is to be administered.
[0764] For example, the composition may comprise between 0.1% and 99% (w/w) of the active ingredient. By way of example, the composition may comprise between 0.1% and 100%, e.g., between 0.5% and 50%, between 1-30%, between 5-80%, or at least 80% (w/w) active ingredient.
[0765] The pharmaceutical composition of the disclosure can be formulated using one or more excipients to: (1) increase stability; (2) increase cell transfection or transduction; (3) permit sustained or delayed release; (4) alter the biodistribution (e.g., targeting specific tissues or cell types); (5) increase the translation of encoded protein in vivo; (6) alter the release profile of encoded protein in vivo; and/or (7) allow for regulatable expression of the payload.
[0766] Formulations of the present disclosure can include, without limitation, saline, lipidoids, liposomes, lipid nanoparticles, polymers, lipoplexes, core-shell nanoparticles, peptides, proteins, cells transfected with viral vectors (e.g., for transplantation into a subject), nanoparticle mimics and combinations thereof. Further, the viral vectors of the present disclosure may be formulated using self-assembled nucleic acid nanoparticles.
[0767] In some embodiments, the viral vectors encoding the truncated dystrophin protein or portion thereof may be formulated to optimize baricity and/or osmolality. In some embodiments, the baricity and/or osmolality of the formulation may be optimized to ensure optimal distribution in the muscle tissues or cells.
[0768] The formulations of the disclosure can include one or more excipients, each in an amount that together increases the stability of the AAV particle, increases cell transfection or transduction by the viral particle, increases the expression of viral particle encoded protein, and/or alters the release profile of AAV particle encoded proteins. In some embodiments, a pharmaceutically acceptable excipient may be at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% pure. In some embodiments, an excipient is approved for use in humans and for veterinary use. In some embodiments, an excipient may be approved by the United States Food and Drug Administration. In some embodiments, an excipient may be of pharmaceutical grade. In some embodiments, an excipient may meet the standards of the United States Pharmacopoeia (USP), the European Pharmacopoeia (EP), the British Pharmacopoeia, and/or the International Pharmacopoeia.
[0769] Excipients, as used herein, include, but are not limited to, any and all solvents, dispersion media, diluents, or other liquid vehicles, dispersion or suspension aids, surface active agents, isotonic agents, thickening or emulsifying agents, preservatives, and the like, as suited to the particular dosage form desired. Various excipients for formulating pharmaceutical compositions and techniques for preparing the composition are known in the art (see Remington: The Science and Practice of Pharmacy, 21st Edition, A. R. Gennaro, Lippincott, Williams & Wilkins, Baltimore, MD, 2006; the contents of which are herein incorporated by reference in their entirety). The use of a conventional excipient medium may be contemplated within the scope of the present disclosure, except insofar as any conventional excipient medium may be incompatible with a substance or its derivatives, such as by producing any undesirable biological effect or otherwise interacting in a deleterious manner with any other component(s) of the pharmaceutical composition.
[0770] In some embodiments, formulations may comprise at least one excipient which is an inactive ingredient. As used herein, the term “inactive ingredient” refers to one or more agents that do not contribute to the activity of the pharmaceutical composition included in formulations. In some embodiments, all, none, or some of the inactive ingredients which may be used in the formulations of the present disclosure may be approved by the US Food and Drug Administration (FDA).
VI. Methods of Disclosure
[0771] The present disclosure provides methods of use of the systems and compositions of the disclosure, which generally include administering the isolated nucleic acid molecules encoding the truncated dystrophin protein or portion thereof, the AAV vectors comprising the isolated nucleic acid molecules, or the compositions or pharmaceutical compositions of the disclosure.
[0772] In one aspect, the present disclosure provides methods for delivering a truncated dystrophin protein or increasing expression of a functional dystrophin (e.g., truncated dystrophin protein) in a subject having or diagnosed with having a dystrophin-associated disorder. In another aspect, the present disclosure provides methods for treating a dystrophin-associated disorder in a subject in need thereof. The present disclosure further provides methods for increasing muscle mass or muscle strength and/or preventing fibrosis in a subject having or diagnosed with having a dystrophin-associated disorder.
[0773] The methods generally comprise administering a therapeutically effective amount of the isolated nucleic acid molecules encoding the truncated dystrophin protein or portion thereof of the disclosure, the AAV vectors comprising the isolated nucleic acid molecules, or the pharmaceutical composition of the disclosure.
[0774] In some embodiments, the dystrophin-associated disorder is muscular dystrophy. Muscular dystrophies include, but are not limited to, Duchenne muscular dystrophy, Becker muscular dystrophy, myotonic dystrophy, congenital muscular dystrophy, distal muscular dystrophy, Emery-Dreifuss muscular dystrophy, facioscapulohumeral muscular dystrophy, limb girdle muscular dystrophy, and oculopharyngeal muscular dystrophy.
[0775] The truncated dystrophin protein of the present disclosure may be any truncated dystrophin protein described herein. In embodiments, the truncated dystrophin protein comprises an amino acid sequence selected from the group consisting of SEQ ID NOs: 83-107 and 216-223. The nucleic acid molecules encoding the truncated dystrophin protein include those having a nucleic acid sequence selected from the group consisting of SEQ ID NOs: 108-132, 224-231, 260-280, and 396-403, in certain embodiments, operably linked to regulatory elements for constitutive, muscle-specific (including skeletal, smooth muscle and cardiac muscle-specific) expression, and other regulatory elements such as poly A sites. Such nucleic acids may be in the context of a recombinant AAV genome vector, for example, flanked by ITR sequences, particularly, AAV2 ITR sequences.
[0776] In certain embodiments, the methods comprise administering to a subject in need thereof a first nucleic acid molecule comprising a first coding region encoding an N-terminal portion of the truncated dystrophin protein, and a second nucleic acid molecule comprising a second coding region encoding a C-terminal portion of the truncated dystrophin protein.
[0777] In certain embodiments, the methods comprise administering to a subject in need thereof a first AAV vector comprising a first nucleic acid molecule comprising a first coding region encoding an N-terminal portion of the truncated dystrophin protein, and a second AAV vector comprising a second nucleic acid molecule comprising a second coding region encoding a C-terminal portion of the truncated dystrophin protein.
[0778] In certain embodiments, the methods comprise administering to a subject in need thereof a first AAV vector comprising a nucleotide sequence selected from the group consisting of SEQ ID NOs: 160, 162, 164, 166, 168, 170, 172, 174, 176, 178, 180, 182, 184, 186, 188, 190, 192, 194, 196, 198, 200, 301, 303, 305, 307, 308, 310, 312, 314, 394, and 395, or a nucleotide sequence at least 90% identical thereto; and a second AAV vector comprising a nucleotide sequence selected from the group consisting of SEQ ID NOs: 161, 163, 165, 167, 169, 171, 173, 175, 177, 179, 181, 183, 185, 187, 189, 191, 193, 195, 197, 199, 201, 302, 304, 306, 309, 311, and 313, or a nucleotide sequence at least 90% identical thereto.
[0779] In embodiments, the subject has been diagnosed with and/or has symptom(s) associated with muscular dystrophy, e.g., DMD.
[0780] The isolated nucleic acid molecules encoding the truncated dystrophin protein or portion thereof, the AAV vectors, or the pharmaceutical composition of the disclosure may be administered by any suitable route of administration. A route of administration may refer to any administration pathway known in the art, including but not limited to aerosol, enteral, nasal, ophthalmic, oral, parenteral, rectal, transdermal (e.g., topical cream or ointment, patch), or vaginal. “Parenteral” refers to a route of administration that is generally associated with injection, including infraorbital, infusion, intraarterial, intracapsular, intracardiac, intradermal, intramuscular, intraperitoneal, intrapulmonary, intraspinal, intrasternal, intrathecal, intrauterine, intravenous, subarachnoid, subcapsular, subcutaneous, transmucosal, or transtracheal. In some embodiments, the route of administration in accordance with the methods described herein includes local or regional muscle injection to improve local muscle function in patients, systemic delivery (such as intravenous, intra-arterial, intraperitoneal) to all muscles in a region or in the whole body in patients, or in vitro infection of myogenic stem cells with an AAV or lentiviral vector followed by local and/or systemic delivery. In some embodiments, the nucleic acid molecules or vectors are administered intramuscularly, subcutaneously, or intravenously. Intramuscular, subcutaneous, or intravenous administration should result in expression of the transgene product in cells of the muscle (including skeletal muscle, cardiac muscle, and/or smooth muscle).
[0781] In some embodiments, the first nucleic acid molecule and the second nucleic acid molecule, or the pharmaceutical composition comprising the same, are administered together. In some embodiments, the first nucleic acid molecule and the second nucleic acid molecule, or the pharmaceutical composition comprising the same, are administered separately.
[0782] In some embodiments, the first AAV vector and the second AAV vector, or the pharmaceutical composition comprising the same, are administered together. In some embodiments, the first AAV vector and the second AAV vector, or the pharmaceutical composition comprising the same, are administered separately.
[0783] The term “therapeutically effective amount” as used herein refers to an amount of a truncated dystrophin protein, peptide, or fragment thereof, or an isolated nucleic acid molecule encoding the same, or an AAV vector comprising the same, that produces a desired therapeutic effect in a subject, such as preventing or treating a target condition, alleviating symptoms associated with the condition, or producing a desired physiological effect. The precise amount will vary depending upon a variety of factors, including but not limited to the physiological condition of the subject (including age, sex, disease type and stage, general physical condition, responsiveness to a given dosage, and type of medication), the nature of a pharmaceutically acceptable carrier or carriers in the formulation, and the route of administration. Further, an effective or therapeutically effective amount may vary depending on whether the truncated dystrophin protein, peptide, or fragment thereof is administered alone or in combination with a compound, drug, therapy or other therapeutic method or modality. One skilled in the clinical and pharmacological arts will be able to determine an effective amount or therapeutically effective amount through routine experimentation.
[0784] The isolated nucleic acid molecules and/or the vectors of the disclosure are provided in a therapeutically effective amount to elicit the desired effect, e.g., increase dystrophin expression and/or activity. The quantity of the viral particle to be administered, both according to number of treatments and amount, will also depend on factors such as the clinical status, previous treatments, the general health and/or age of the subject, other diseases present, and the severity of the disorder. Precise amounts of active ingredient required to be administered depend on the judgment of the gene therapist and will be particular to each individual patient. Moreover, treatment of a subject with a therapeutically effective amount of the nucleic acid molecules and/or the vectors of the disclosure can include a single treatment or, preferably, can include a series of treatments. It will also be appreciated that the effective dosage used for treatment may increase or decrease over the course of a particular treatment. Changes in dosage may follow from the results of diagnostic assays as described herein. The pharmaceutical compositions can be included in a container, pack, or dispenser together with instructions for administration.
[0785] In some embodiments, a therapeutically effective amount of a viral particle of the disclosure (or pharmaceutical composition of the disclosure) is in titers ranging from about 1x10^5, about 1.5x10^5, about 2x10^5, about 2.5x10^5, about 3x10^5, about 3.5x10^5, about 4x10^5, about 4.5x10^5, about 5x10^5, about 5.5x10^5, about 6x10^5, about 6.5x10^5, about 7x10^5, about 7.5x10^5, about 8x10^5, about 8.5x10^5, about 9x10^5, about 9.5x10^5, about 1x10^6, about 1.5x10^6, about 2x10^6, about 2.5x10^6, about 3x10^6, about 3.5x10^6, about 4x10^6, about 4.5x10^6, about 5x10^6, about 5.5x10^6, about 6x10^6, about 6.5x10^6, about 7x10^6, about 7.5x10^6, about 8x10^6, about 8.5x10^6, about 9x10^6, about 9.5x10^6, about 1x10^7, about 1.5x10^7, about 2x10^7, about 2.5x10^7, about 3x10^7, about 3.5x10^7, about 4x10^7, about 4.5x10^7, about 5x10^7, about 5.5x10^7, about 6x10^7, about 6.5x10^7, about 7x10^7, about 7.5x10^7, about 8x10^7, about 8.5x10^7, about 9x10^7, about 9.5x10^7, about 1x10^8, about 1.5x10^8, about 2x10^8, about 2.5x10^8, about 3x10^8, about 3.5x10^8, about 4x10^8, about 4.5x10^8, about 5x10^8, about 5.5x10^8, about 6x10^8, about 6.5x10^8, about 7x10^8, about 7.5x10^8, about 8x10^8, about 8.5x10^8, about 9x10^8, about 9.5x10^8, about 1x10^9, about 1.5x10^9, about 2x10^9, about 2.5x10^9, about 3x10^9, about 3.5x10^9, about 4x10^9, about 4.5x10^9, about 5x10^9, about 5.5x10^9, about 6x10^9, about 6.5x10^9, about 7x10^9, about 7.5x10^9, about 8x10^9, about 8.5x10^9, about 9x10^9, about 9.5x10^9, about 1x10^10, about 1.5x10^10, about 2x10^10, about 2.5x10^10, about 3x10^10, about 3.5x10^10, about 4x10^10, about 4.5x10^10, about 5x10^10, about 5.5x10^10, about 6x10^10, about 6.5x10^10, about 7x10^10, about 7.5x10^10, about 8x10^10, about 8.5x10^10, about 9x10^10, about 9.5x10^10, about 1x10^11, about 1.5x10^11, about 2x10^11, about 2.5x10^11, about 3x10^11, about 3.5x10^11, about 4x10^11, about 4.5x10^11, about 5x10^11, about 5.5x10^11, about 6x10^11, about 6.5x10^11, about 7x10^11, about 7.5x10^11, about 8x10^11, about 8.5x10^11, about 9x10^11, about 9.5x10^11, about 1x10^12, about 1.5x10^12, about 2x10^12, about 2.5x10^12, about 3x10^12, about 3.5x10^12, about 4x10^12, about 4.5x10^12, about 5x10^12, about 5.5x10^12, about 6x10^12, about 6.5x10^12, about 7x10^12, about 7.5x10^12, about 8x10^12, about 8.5x10^12, about 9x10^12, about 9.5x10^12, about 1x10^13, about 1.5x10^13, about 2x10^13, about 2.5x10^13, about 3x10^13, about 3.5x10^13, about 4x10^13, about 4.5x10^13, about 5x10^13, about 5.5x10^13, about 6x10^13, about 6.5x10^13, about 7x10^13, about 7.5x10^13, about 8x10^13, about 8.5x10^13, about 9x10^13, about 9.5x10^13, about 1x10^14, about 1.5x10^14, about 2x10^14, about 2.5x10^14, about 3x10^14, about 3.5x10^14, about 4x10^14, about 4.5x10^14, about 5x10^14, about 5.5x10^14, about 6x10^14, about 6.5x10^14, about 7x10^14, about 7.5x10^14, about 8x10^14, about 8.5x10^14, about 9x10^14, about 9.5x10^14, or about 1x10^15 viral particles (vp).
[0786] In some embodiments, a therapeutically effective amount of a viral particle of the disclosure (or pharmaceutical composition of the disclosure) is in genome copies (“GC”), also referred to as “viral genomes” (“vg”), ranging from: about 1x10^5, about 1.5x10^5, about 2x10^5, about 2.5x10^5, about 3x10^5, about 3.5x10^5, about 4x10^5, about 4.5x10^5, about 5x10^5, about 5.5x10^5, about 6x10^5, about 6.5x10^5, about 7x10^5, about 7.5x10^5, about 8x10^5, about 8.5x10^5, about 9x10^5, about 9.5x10^5, about 1x10^6, about 1.5x10^6, about 2x10^6, about 2.5x10^6, about 3x10^6, about 3.5x10^6, about 4x10^6, about 4.5x10^6, about 5x10^6, about 5.5x10^6, about 6x10^6, about 6.5x10^6, about 7x10^6, about 7.5x10^6, about 8x10^6, about 8.5x10^6, about 9x10^6, about 9.5x10^6, about 1x10^7, about 1.5x10^7, about 2x10^7, about 2.5x10^7, about 3x10^7, about 3.5x10^7, about 4x10^7, about 4.5x10^7, about 5x10^7, about 5.5x10^7, about 6x10^7, about 6.5x10^7, about 7x10^7, about 7.5x10^7, about 8x10^7, about 8.5x10^7, about 9x10^7, about 9.5x10^7, about 1x10^8, about 1.5x10^8, about 2x10^8, about 2.5x10^8, about 3x10^8, about 3.5x10^8, about 4x10^8, about 4.5x10^8, about 5x10^8, about 5.5x10^8, about 6x10^8, about 6.5x10^8, about 7x10^8, about 7.5x10^8, about 8x10^8, about 8.5x10^8, about 9x10^8, about 9.5x10^8, about 1x10^9, about 1.5x10^9, about 2x10^9, about 2.5x10^9, about 3x10^9, about 3.5x10^9, about 4x10^9, about 4.5x10^9, about 5x10^9, about 5.5x10^9, about 6x10^9, about 6.5x10^9, about 7x10^9, about 7.5x10^9, about 8x10^9, about 8.5x10^9, about 9x10^9, about 9.5x10^9, about 1x10^10, about 1.5x10^10, about 2x10^10, about 2.5x10^10, about 3x10^10, about 3.5x10^10, about 4x10^10, about 4.5x10^10, about 5x10^10, about 5.5x10^10, about 6x10^10, about 6.5x10^10, about 7x10^10, about 7.5x10^10, about 8x10^10, about 8.5x10^10, about 9x10^10, about 9.5x10^10, about 1x10^11, about 1.5x10^11, about 2x10^11, about 2.5x10^11, about 3x10^11, about 3.5x10^11, about 4x10^11, about 4.5x10^11, about 5x10^11, about 5.5x10^11, about 6x10^11, about 6.5x10^11, about 7x10^11, about 7.5x10^11, about 8x10^11, about 8.5x10^11, about 9x10^11, about 9.5x10^11, about 1x10^12, about 1.5x10^12, about 2x10^12, about 2.5x10^12, about 3x10^12, about 3.5x10^12, about 4x10^12, about 4.5x10^12, about 5x10^12, about 5.5x10^12, about 6x10^12, about 6.5x10^12, about 7x10^12, about 7.5x10^12, about 8x10^12, about 8.5x10^12, about 9x10^12, about 9.5x10^12, about 1x10^13, about 1.5x10^13, about 2x10^13, about 2.5x10^13, about 3x10^13, about 3.5x10^13, about 4x10^13, about 4.5x10^13, about 5x10^13, about 5.5x10^13, about 6x10^13, about 6.5x10^13, about 7x10^13, about 7.5x10^13, about 8x10^13, about 8.5x10^13, about 9x10^13, about 9.5x10^13, about 1x10^14, about 1.5x10^14, about 2x10^14, about 2.5x10^14, about 3x10^14, about 3.5x10^14, about 4x10^14, about 4.5x10^14, about 5x10^14, about 5.5x10^14, about 6x10^14, about 6.5x10^14, about 7x10^14, about 7.5x10^14, about 8x10^14, about 8.5x10^14, about 9x10^14, about 9.5x10^14, about 1x10^15, about 1.5x10^15, about 2x10^15, about 2.5x10^15, about 3x10^15, about 3.5x10^15, about 4x10^15, about 4.5x10^15, about 5x10^15, about 5.5x10^15, about 6x10^15, about 6.5x10^15, about 7x10^15, about 7.5x10^15, about 8x10^15, about 8.5x10^15, about 9x10^15, about 9.5x10^15, or about 1x10^16 vg.
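The values enumerated in the two preceding paragraphs follow a regular pattern: half-step mantissas from 1 to 9.5 at each power of ten, closed by a final whole power of ten. As an illustrative sketch only (the function name `dose_grid` and its parameters are hypothetical and not part of the disclosure), the same grid can be generated programmatically:

```python
def dose_grid(min_exp: int, max_exp: int) -> list[str]:
    """Return labels such as '1.5x10^7' for mantissas 1, 1.5, ..., 9.5 at each
    exponent from min_exp up to (but excluding) max_exp, then '1x10^max_exp'."""
    labels = []
    for exp in range(min_exp, max_exp):
        for half_steps in range(2, 20):  # 2/2 = 1 ... 19/2 = 9.5
            whole, remainder = divmod(half_steps, 2)
            mantissa = f"{whole}.5" if remainder else str(whole)
            labels.append(f"{mantissa}x10^{exp}")
    labels.append(f"1x10^{max_exp}")
    return labels


vp_grid = dose_grid(5, 15)  # viral particles: 1x10^5 through 1x10^15
vg_grid = dose_grid(5, 16)  # genome copies:   1x10^5 through 1x10^16
```

Under these assumptions the vp grid holds 181 values and the vg grid 199 values.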
[0787] Any method known in the art can be used to determine the genome copy (GC) number of the viral compositions of the disclosure. One method for performing AAV GC number titration is as follows: purified AAV viral particle samples are first treated with DNase to eliminate unencapsulated AAV genome DNA or contaminating plasmid DNA from the production process. The DNase-resistant particles are then subjected to heat treatment to release the genome from the capsid. The released genomes are then quantitated by real-time PCR or digital PCR using primer/probe sets targeting a specific region of the viral genome.
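The final quantitation step can be illustrated with a short calculation. The sketch below assumes a real-time PCR standard curve of the form Ct = slope x log10(copies) + intercept and a simple back-calculation to the undiluted sample; the function names, parameters, and numeric values are hypothetical and are not taken from the disclosure.

```python
def copies_from_ct(ct: float, slope: float, intercept: float) -> float:
    """Invert a standard curve Ct = slope * log10(copies) + intercept."""
    return 10 ** ((ct - intercept) / slope)


def titer_vg_per_ml(copies_per_reaction: float,
                    template_ul: float,
                    dilution_factor: float) -> float:
    """Scale copies measured in one reaction back to vg per mL of undiluted sample."""
    copies_per_ul = copies_per_reaction / template_ul
    return copies_per_ul * dilution_factor * 1000.0  # 1000 uL per mL


# Illustrative inputs only: a Ct of 20 on an assumed standard curve,
# 5 uL of template per reaction, and a 1:1000 pre-dilution of the sample.
copies = copies_from_ct(ct=20.0, slope=-3.32, intercept=40.0)
titer = titer_vg_per_ml(copies, template_ul=5.0, dilution_factor=1000.0)
```

In practice the standard curve parameters would come from a dilution series of a reference standard run on the same plate.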
[0788] The nucleic acid molecules or gene therapy vectors provided herein may be administered in combination with other treatments for muscular dystrophy, including corticosteroids, beta blockers and ACE inhibitors.
[0789] The methods may alleviate or reduce symptoms of muscular dystrophy. Deletion of dystrophin results in mechanical instability causing myofibers to weaken and eventually break during contraction. Patients with DMD first display skeletal muscle weakness in early childhood, which progresses rapidly to loss of muscle mass, spinal curvature known as kyphosis, paralysis and ultimately death from cardiorespiratory failure before 30 years of age. Skeletal muscles of DMD patients also develop hypertrophy, particularly of the calf, and show evidence of focal necrotic myofibers, abnormal variation in myofiber diameter, increased fat deposition and fibrosis, as well as a lack of dystrophin staining in immunohistological sections.
[0790] The methods of treatment provided herein are intended to slow or arrest the progression of DMD, or other muscular dystrophy disease, or to reduce the severity of one or more symptoms associated with DMD, or other muscular dystrophy disease. In particular, the methods provided herein are intended to reduce muscle degeneration, induce/improve muscle regeneration, and/or prevent/reduce downstream pathologies including inflammation and fibrosis that interfere with muscle regeneration and cause loss of movement, orthopedic complications, and, ultimately, respiratory and cardiac failure.
[0791] Efficacy may be monitored by measuring changes from baseline in gross motor function using the North Star Ambulatory Assessment (NSAA) (an ordinal scale with a maximum score of 34 indicating fully independent function) or an age-appropriate modified assessment; by assessing changes in ambulatory function (e.g., 6-minute walk distance of <300 m, between 300 and 400 m, or >400 m); by performing a timed function test to measure changes from baseline in the time taken to stand from a supine position (1 to 8 s (good), 8 to 20 s (moderate), and 20 to 35 s (poor)); by performing time-to-climb (4 steps) and time-to-run/walk (10 meters) assessments; and by myometry to evaluate changes from baseline in the strength of the upper and lower extremities.
[0792] Efficacy may also be monitored by measuring changes (reduction) from baseline in serum levels of creatine kinase (CK), an enzyme that is found in abnormally high levels when muscle is damaged (normal: 35-175 U/L; DMD: 500-20,000 U/L), as well as changes in serum or urine creatinine levels (DMD: 10-25 µmol/L; mild BMD: 20-30 µmol/L; normal: >53 µmol/L) and in truncated dystrophin protein levels in muscle biopsies. The percentage of myofibers positive for truncated dystrophin protein expression (via immunofluorescence staining of muscle biopsies or similar methods) is also a method to establish efficacy of treatment. Magnetic Resonance Imaging (MRI) may also be performed to assess fatty tissue infiltration in skeletal muscle (fat fraction).
[0793] Although skeletal muscle symptoms are considered the defining characteristic of DMD, patients most commonly die of respiratory or cardiac failure. DMD patients develop dilated cardiomyopathy (DCM) due to the absence of dystrophin in cardiomyocytes, which is required for contractile function. This leads to an influx of extracellular calcium, triggering protease activation, cardiomyocyte death, tissue necrosis, and inflammation, ultimately leading to accumulation of fat and fibrosis. This process first affects the left ventricle (LV), which is responsible for pumping blood to most of the body and is thicker and therefore experiences a greater workload. Atrophic cardiomyocytes exhibit a loss of striations, vacuolization, fragmentation, and nuclear degeneration. Functionally, atrophy and scarring lead to structural instability and hypokinesis of the LV, ultimately progressing to general DCM. DMD may be associated with various electrocardiographic changes such as sinus tachycardia, reduction of circadian index, decreased heart rate variability, short PR interval, right ventricular hypertrophy, S-T segment depression and prolonged QTc.
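The functional bands quoted above for the timed rise-from-supine test and the 6-minute walk distance, together with the change-from-baseline measure, can be expressed as a small scoring helper. This is a sketch under stated assumptions: the function names are hypothetical, and because the quoted ranges share endpoints (8 s and 20 s), assigning each shared endpoint to the better band is an arbitrary choice made here for illustration.

```python
NSAA_MAX_SCORE = 34  # maximum NSAA score quoted in the text


def classify_rise_time(seconds: float) -> str:
    """Band the time to stand from supine using the ranges quoted in the text.
    Assumption: shared endpoints (8 s, 20 s) fall into the better band."""
    if 1 <= seconds <= 8:
        return "good"
    if 8 < seconds <= 20:
        return "moderate"
    if 20 < seconds <= 35:
        return "poor"
    return "out of quoted range"


def classify_walk_distance(meters: float) -> str:
    """Band the 6-minute walk distance into the three quoted categories."""
    if meters < 300:
        return "<300 m"
    if meters <= 400:
        return "300-400 m"
    return ">400 m"


def change_from_baseline(baseline: float, follow_up: float) -> float:
    """Positive values mean the follow-up measurement is higher than baseline."""
    return follow_up - baseline


# Hypothetical subject: NSAA improves from 20 to 24, rise time is 12 s.
nsaa_change = change_from_baseline(20, 24)
rise_band = classify_rise_time(12.0)
walk_band = classify_walk_distance(350.0)
```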
[0794] The methods provided herein can slow or arrest the progression of DMD and other muscular dystrophy, particularly to reduce the progression of or attenuate cardiac dysfunction and/or maintain or improve cardiac function. Efficacy may be monitored by periodic evaluation of signs and symptoms of cardiac involvement or heart failure that are appropriate for the age and disease stage of the trial population, using serial electrocardiograms (ECG), and serial noninvasive imaging studies (e.g., echocardiography or cardiac magnetic resonance imaging (CMR)). CMR may be used to monitor changes from baseline in forced vital capacity (FVC), forced expiratory volume (FEV1), maximum inspiratory pressure (MIP), maximum expiratory pressure (MEP), peak expiratory flow (PEF), peak cough flow, left ventricular ejection fraction (LVEF), left ventricular fractional shortening (LVFS), inflammation, and fibrosis. ECG may be used to monitor conduction abnormalities and arrhythmias. In particular, ECG may be used to assess normalization of the PR interval, R waves in V1, Q waves in V6, ventricular repolarization, QS waves in inferior and/or upper lateral wall, conduction disturbances in right bundle branch, QTc, and QRS.
[0795] The present disclosure also provides a system or a pharmaceutical composition for use in the treatment of a dystrophin-associated disorder. In another aspect, the present disclosure provides a first nucleic acid molecule and a second nucleic acid molecule, or a first AAV vector and a second AAV vector, for use in the treatment of a dystrophin-associated disorder. In yet another aspect, the present disclosure provides an isolated nucleic acid molecule for use in the treatment of a dystrophin-associated disorder.
VII. Kits
[0796] The present disclosure also provides kits for conveniently and/or effectively carrying out methods of the present disclosure. Typically, kits will comprise sufficient amounts and/or numbers of components to allow a user to perform multiple treatments of a subject(s) and/or to perform multiple experiments.
[0797] Any of the vectors, constructs, or truncated dystrophin proteins, or portion thereof, of the present disclosure may be comprised in a kit. In some embodiments, kits may further include reagents and/or instructions for creating and/or synthesizing compounds and/or compositions of the present disclosure. In some embodiments, kits may also include one or more buffers. In some embodiments, kits of the disclosure may include components for making protein or nucleic acid arrays or libraries and thus, may include, for example, solid supports.
[0798] In some embodiments, kit components may be packaged either in aqueous media or in lyophilized form. The container means of the kits will generally include at least one vial, test tube, flask, bottle, syringe or other container means, into which a component may be placed, and suitably aliquoted. Where there is more than one kit component (labeling reagent and label may be packaged together), kits may also generally contain second, third or other additional containers into which additional components may be separately placed. In some embodiments, kits may also comprise second container means for containing sterile, pharmaceutically acceptable buffers and/or other diluents. In some embodiments, various combinations of components may be comprised in one or more vials. Kits of the present disclosure may also typically include means for containing compounds and/or compositions of the present disclosure, e.g., proteins, nucleic acids, and any other reagent containers in close confinement for commercial sale. Such containers may include injection or blow-molded plastic containers into which desired vials are retained.
[0799] In some embodiments, kit components are provided in one or more liquid solutions. In some embodiments, liquid solutions are aqueous solutions, with sterile aqueous solutions being particularly used. In some embodiments, kit components may be provided as dried powder(s). When reagents and/or components are provided as dry powders, such powders may be reconstituted by the addition of suitable volumes of solvent. In some embodiments, it is envisioned that solvents may also be provided in another container means.
[0800] In some embodiments, kits may include instructions for employing kit components as well as the use of any other reagent not included in the kit. Instructions may include variations that may be implemented.
[0801] While the invention has been described in connection with specific embodiments thereof, it will be understood that the inventive materials are capable of further modifications. This patent application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the invention and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains and as may be applied to the essential features herein before set forth.
[0802] The following examples are intended to illustrate various embodiments of the invention. As such, the specific embodiments discussed are not to be construed as limitations on the scope of the invention. It will be apparent to one skilled in the art that various equivalents, changes, and modifications may be made without departing from the scope of invention, and it is understood that such equivalent embodiments are to be included herein. Further, all references cited in the disclosure are hereby incorporated by reference in their entirety, as if fully set forth herein.
EXAMPLES
Example 1. Design of Truncated Dystrophin Protein and Codon Optimization
1A: Design of Dystrophin Protein Constructs
[0803] Various truncated dystrophin protein constructs were designed having deletions of different domains or deletions of regions encoded by specific exons (FIG. 2 and Table 6). The constructs incorporate the subdomains of dystrophin described in Table 6.
Table 6. Truncated Dystrophin Proteins
[0804] In order to test the various single and dual payload configurations described in the examples above, DNA expression plasmids were utilized as a cost-effective and higher-throughput method of assessing the relative protein coding potential of various truncated Dystrophin construct designs.
[0805] For example, the pcDNA3.1(+) expression plasmid was used for delivery of various truncated Dystrophin designs described in Table 6 above. The pcDNA3.1(+) expression plasmid includes a CMV promoter and bovine growth hormone (bGH) polyA sequence, and a multicloning site located between these two elements where protein coding sequences and various regulatory elements can be inserted. Once sub-cloned using standard molecular biology techniques, these expression plasmids were transfected into cells, including primary human skeletal muscle cells, and assessed for their protein coding ability via Western blotting (or similar techniques). Those DNA expression plasmids that resulted in noteworthy or desirable levels of midi-Dystrophin protein production were then incorporated into the designs for AAV vectors and production (see Example 5: AAV Design and Production).
[0806] Various truncated midi-Dys expression plasmid combinations were delivered to primary human skeletal muscle cells cultured in vitro to test for expression by Western blot analysis. In these experiments, midi-Dys sequences are divided into an N-terminal and C-terminal portion, linked to splice donor or acceptor and 3’ ribozyme or 5’ ribozyme sequences, and cloned into pcDNA3.1(+) expression plasmids which puts expression under the control of a CMV promoter. The Ct plasmid also includes a 3xFLAG tag on the C-terminal end of the encoded midi-Dys, and an anti-FLAG antibody is used for detection of midi-Dys protein expression. Shown below in Table 7 are the results of densitometry analysis (using ImageJ) of the Western blot results depicted in FIG. 4. The relative densitometry intensities are normalized to the midi-Dys A H2- R15 protein expression from SEQ ID NOs 4 + 5 (bold). Certain truncated midi-Dys are expressed more than 3x greater than the starting midi-Dys A H2-R15 sequence constructs (SEQ ID NOs 4 + 5).
[0807] As shown in Figure 4, lysates of cells collected 24 to 48 hours after transfection of DNA expression plasmids demonstrate that dual plasmid delivery of truncated midi-Dys sequences results in varying levels of truncated midi-Dys protein expression. Densitometric quantification of the blot is provided in Table 7 below:
Table 7: Densitometry data from Figure 4.
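By way of illustration only, the normalization applied to the relative densitometry values can be sketched as follows. The lane labels and raw band intensities in this sketch are hypothetical placeholders and are not data from FIG. 4 or Table 7; the only step taken from the description above is the division of each band intensity by the intensity of the reference construct.

```python
def normalize_to_reference(intensities, reference):
    """Return each band intensity divided by the reference lane's intensity."""
    ref_value = intensities[reference]
    if ref_value <= 0:
        raise ValueError("reference band intensity must be positive")
    return {lane: value / ref_value for lane, value in intensities.items()}


# Hypothetical raw band intensities (arbitrary units), for illustration only.
raw = {"reference_construct": 1200.0, "design_X": 3900.0, "design_Y": 600.0}
relative = normalize_to_reference(raw, "reference_construct")
```

In such a scheme the reference lane always has a relative intensity of 1, and every other lane is read as a fold change against it.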
1B. Codon and CpG Optimization
The nucleic acid sequences used to encode a given truncated midi-Dys protein were optimized based on codon usage properties (so-called “codon optimization”). Multiple codon optimized versions of the gene encoding a midi-Dys Δ exon 18-41 truncated Dystrophin construct (SEQ ID NO: 221) were designed for further testing and characterization. For the codon optimization, we started with the human reference genome (hg19) version of a truncated Dys sequence (SEQ ID NO:260). We then applied various human codon usage statistics to the sequence, based either on codon usage across the entire human genome or on codon usage in a highly expressed muscle transcript (the Titin gene). We then excluded various undesired motifs (such as "CG", "ACGT", "TCGT", and/or "CCGT") and silently converted these to alternate codons based on either the codon usage of the entire human genome or the Titin gene codon usage. In addition, in some versions, we added predicted "immunosilent" CpG motifs ("GCGC", "GCGG", and/or "CCGC"), based on the theory of immune surveillance of CpG motifs published by J.F. Wright (Wright JF. Quantification of CpG Motifs in rAAV Genomes: Avoiding the Toll. Mol Ther. 2020 Aug 5;28(8):1756-1758. doi: 10.1016/j.ymthe.2020.07.006. Epub 2020 Jul 14. PMID: 32710825; PMCID: PMC7403467). Table 8 below lists the total number of CpG dinucleotides in each of the midi-Dys codon optimized sequences.
Table 8. Codon-optimized and CpG modified constructs based on midi-Dys Δ exon 18-41.
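As a non-limiting sketch of how the CpG content and the motifs named above might be tallied for a candidate sequence, the following counts CpG dinucleotides together with the undesired and predicted "immunosilent" motifs. The function names and the example sequence are hypothetical and are not taken from any SEQ ID NO; overlapping occurrences are counted, which is an assumption about how such motifs would be scored.

```python
import re

AVOIDED_MOTIFS = ("ACGT", "TCGT", "CCGT")        # motifs named in the text as undesired
IMMUNOSILENT_MOTIFS = ("GCGC", "GCGG", "CCGC")   # motifs named in the text as predicted "immunosilent"


def count_motif(seq, motif):
    """Count occurrences of motif in seq, including overlapping occurrences."""
    return len(re.findall(f"(?={motif})", seq.upper()))


def cpg_report(seq):
    """Tally CpG dinucleotides and each named motif for one sequence."""
    report = {"CpG": count_motif(seq, "CG")}
    for motif in AVOIDED_MOTIFS + IMMUNOSILENT_MOTIFS:
        report[motif] = count_motif(seq, motif)
    return report


example = "ATGGCGCGGACGTTCCGCTAA"  # hypothetical sequence, for illustration only
report = cpg_report(example)
```

A report of this kind can be compared across codon optimized versions to confirm that undesired motifs were removed without changing the encoded amino acids.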
[0808] These sequences were tested for expression in primary human skeletal muscle cells in vitro by transfection-based delivery as single ORFs contained in a DNA expression plasmid. FIG. 6 shows Western blot results of cell lysates 24 to 48 hours after transfection of various codon optimized truncated midi-Dystrophins. The MANDRA antibody was used to detect midi-Dystrophin protein expression; it detects an epitope included in all truncated midi-Dys designs as well as in the larger full-length 427 kDa DMD protein (which is endogenously expressed by primary human skeletal muscle cells).
Table 9 below shows the results of densitometry analysis (using ImageJ) of the Western blot results depicted in FIG. 6. The relative densitometry intensities are normalized to the midi-Dys sequence derived from the reference human genome build hg19, SEQ ID NO: 260 (bold). Certain truncated midi-Dys codon optimized sequences result in greater than 100-fold higher protein expression compared to the hg19 sequence.
Table 9. Densitometry data from Figure 6.
Example 2. Design of Coding Regions for N-Terminal and C-Terminal Portions of Truncated Dystrophin Protein
[0809] Due to the packaging limits of AAV expression vectors, the nucleic acids encoding the truncated dystrophin proteins are divided into two portions, an N-terminal (Nt) portion and a C-terminal (Ct) portion, where each portion is inserted into a separate vector for expression.
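The packaging constraint can be illustrated with a simple size check. In this sketch the capacity of about 4,700 nucleotides is the commonly cited approximate AAV genome limit and is an assumption, as are the coding length and the per-vector overhead for non-coding elements (ITRs, promoter, ribozyme, intron sequences, and so on); none of these numbers is taken from the present disclosure.

```python
AAV_CAPACITY_NT = 4700  # assumed approximate packaging limit, ITR to ITR


def split_fits(coding_len, split_pos, nt_overhead, ct_overhead, capacity=AAV_CAPACITY_NT):
    """Return (nt_total, ct_total, fits) for a coding sequence divided at split_pos."""
    if not 0 < split_pos < coding_len:
        raise ValueError("split position must fall inside the coding sequence")
    nt_total = split_pos + nt_overhead                 # Nt coding portion plus other elements
    ct_total = (coding_len - split_pos) + ct_overhead  # Ct coding portion plus other elements
    return nt_total, ct_total, nt_total <= capacity and ct_total <= capacity


# Hypothetical inputs: a 4,920-nt coding sequence and 1,500 nt of other elements per vector.
balanced = split_fits(4920, 2460, 1500, 1500)
unbalanced = split_fits(4920, 4000, 1500, 1500)
```

Under these assumed numbers a split near the middle of the coding sequence keeps both vectors under the limit, whereas a split placed far toward the 3’ end pushes the Nt vector over it.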
[0810] To enable rapid and relatively cost-effective screening of various truncated midi-Dys designs, DNA expression plasmids can be used in lieu of AAV vectors for in vitro testing. To this end, the divided Nt and Ct sequences are cloned into a suitable DNA expression plasmid and then transfected into an appropriate cell line in vitro. The resulting transfected cells can then be studied for the amount of resulting RNA and/or protein produced by the DNA plasmid(s). Those designs with desirable attributes (such as improved levels of protein production, reduced levels of undesirable products such as fragmented protein from each individual plasmid, or desirable sequence attributes such as low CpG dinucleotide content while maintaining or potentially enhancing protein production) can then be inserted into viral vectors for further study.
[0811] Split sites, the genomic locations where the nucleic acid sequence encoding the dystrophin proteins is split into the Nt and Ct portions, cannot be chosen randomly in a given nucleic acid sequence, since the technology requires that RNA splicing occur between the splice donor (SD)-containing intron of the Nt vector and the splice acceptor (SA)-containing intron encoded by the Ct vector. The protein-coding nucleic acids flanking the Nt SD intron and Ct SA intron are critical for efficient splicing to occur. As such, split sites were selected by first finding locations in the Dystrophin encoding nucleic acid sequence that either (1) had a splice site consensus nucleotide sequence, a well-established pattern found at the boundaries of exons and introns in pre-mRNA, or (2) could be changed into a splice site consensus sequence without altering the amino acid sequence, using degenerate codons for the specified amino acid sequence. In practice, this makes certain amino acid motifs more amenable to split site selection than others, because a nucleic acid sequence optimal for splicing is needed.
Based on human genome analysis of all exon/intron boundaries, consensus sequences have been identified that are implicated in efficient splicing (For example, see: Ma SL, Vega-Warner V, Gillies C, Sampson MG, Kher V, Sethi SK, Otto EA. Whole Exome Sequencing Reveals Novel PHEX Splice Site Mutations in Patients with Hypophosphatemic Rickets. PLoS One. 2015 Jun 24;10(6):e0130729. doi: 10.1371/journal.pone.0130729. PMID: 26107949; PMCID: PMC4479593). Based on that body of work, the protein-coding nucleic acid at the 3’ end of the Nt vector may be a “G”, while additional nucleic acids are likely to confer additional influence on the efficiency of splicing. For example, “CAG”, “AAG” and “TAG” are enriched at the 3’ end of the 5’ exon in the human genome. In the human genome, a 5’-G on the 3’ exon is also enriched, as are “GT” and “GTT” sequences. Thus, the protein-coding nucleic acids at the 5’ end of the Ct vector may be a “G”, “GT”, “GA”, or “GTT”. When reconstituted after splicing of the RNA encoded by the Nt vector and Ct vector, these properties restrict the potential amino acids that could be encoded by this sequence when the goal is to maintain the protein coding sequence of the human reference genome (or “wild type” gene). For example, if “CAG” is used for the 3’ protein-coding sequence in the Nt vector and “GT” is used as the 5’ protein-coding sequence of the Ct vector, the following amino acid motifs in Dystrophin can be considered as a potential split site: QV, SG, PG, TG, AG, FRF, FRL, FRS, FRY, FRC, FRW, SRF, SRL, SRS, SRY, SRC, SRW, YRF, YRL, YRS, YRY, YRC, YRW, CRF, CRL, CRS, CRY, CRC, CRW, LRF, LRL, LRS, LRY, LRC, LRW, PRF, PRL, PRS, PRY, PRC, PRW, HRF, HRL, HRS, HRY, HRC, HRW, RRF, RRL, RRS, RRY, RRC, RRW, IRF, IRL, IRS, IRY, IRC, IRW, TRF, TRL, TRS, TRY, TRC, TRW, NRF, NRL, NRS, NRY, NRC, NRW, SRF, SRL, SRS, SRY, SRC, SRW, VRF, VRL, VRS, VRY, VRC, VRW, ARF, ARL, ARS, ARY, ARC, ARW, DRF, DRL, DRS, DRY, DRC, DRW, GRF, GRL, GRS, GRY, GRC, and GRW.
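The relationship between a chosen junction sequence and the amino acid motifs listed above can be reproduced with a short enumeration. The following sketch, which is illustrative and not part of the described method, places the junction “CAG” + “GT” in each of the three reading frames, fills the unconstrained codon positions with every possible base, and translates the result with the standard genetic code. The function and variable names are hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def junction_motifs(nt_end="CAG", ct_start="GT"):
    """Amino acid motifs that can span the junction nt_end|ct_start in any reading frame."""
    core = nt_end + ct_start
    motifs = set()
    for left_pad in range(3):                      # free bases completing the first codon
        right_pad = -(left_pad + len(core)) % 3    # free bases completing the last codon
        for left in product(BASES, repeat=left_pad):
            for right in product(BASES, repeat=right_pad):
                seq = "".join(left) + core + "".join(right)
                peptide = "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))
                if "*" not in peptide:             # skip combinations containing a stop codon
                    motifs.add(peptide)
    return motifs


motifs = junction_motifs()
```

Run with these defaults, the enumeration yields QV for a split between codons, SG, PG, TG and AG for a split after the first nucleotide of the glycine codon, and the three-residue motifs centered on arginine (“AGG”) for a split after the second nucleotide of the arginine codon. Serine appears twice in the list above because two of its codons (“TCC” and “AGC”) end in “C”, so the list contains 95 distinct motifs.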
Of note, split sites may be at any nucleic acid position of a codon: before the first nucleic acid of a given codon, after the first nucleic acid of a given codon, after the second nucleic acid of a given codon, or after the third nucleic acid of a given codon. For example, arginine (“R”) and methionine (“M”), which can be encoded by “AGG” and “ATG”, respectively, can be utilized for split sites after the second nucleic acid of their codon sequences, providing a 3’ “AG” or “AT” in the Nt vector protein coding sequence as well as a 5’ “G” in the Ct vector protein coding sequence. Usage of certain nucleic acid positions of a codon can result in other favorable qualities in vector sequence design. For example, for certain Nt vector examples provided here (including SEQ ID NOs: 303, 305, 307, 308, 310, 312, 314, 394, and 395), using a split site after the second nucleic acid of a codon results in the Nt vector having a stop codon in-frame with the Nt vector protein coding sequence only one codon downstream of the split site. This stop codon, “TAA”, is provided by the Nt vector splice donor sequence (SEQ ID NO 133: GTAAGTATCAAGGTTACAAGACAGG). This is expected to reduce the possibility of in-frame translational read-through into the splice donor and ribozyme sequences that are included as part of the Nt vector sequence design.
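The in-frame stop codon described above can be checked directly against the quoted splice donor sequence. The sketch below is illustrative only; it prepends zero, one, or two placeholder bases (“N”) standing for the coding nucleotides of the codon interrupted by the split site, then reads the splice donor in that frame and reports the first stop codon found. The function name is hypothetical.

```python
SPLICE_DONOR = "GTAAGTATCAAGGTTACAAGACAGG"  # SEQ ID NO: 133 as quoted in the text
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_inframe_stop(bases_before_split, donor=SPLICE_DONOR):
    """Return the 1-based index of the first in-frame stop codon, or None.

    bases_before_split is the number of coding nucleotides (0, 1 or 2) of the
    interrupted codon that lie upstream of the split site; codon 1 is the codon
    that begins with those nucleotides.
    """
    if bases_before_split not in (0, 1, 2):
        raise ValueError("bases_before_split must be 0, 1 or 2")
    seq = "N" * bases_before_split + donor  # 'N' marks an unspecified coding base
    for i in range(0, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i // 3 + 1
    return None


stop_after_second_base = first_inframe_stop(2)
```

With two coding nucleotides upstream of the split, the interrupted codon is completed by the first “G” of the splice donor and the next codon is “TAA”. In the other two frames no stop codon occurs within the 25 nucleotides of the splice donor itself.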
[0812] Furthermore, split sites must also be selected to allow an optimized distribution of the midi-dystrophin encoding sequence between the Nt vector and the Ct vector, so that the resulting Nt and Ct sequences have sizes amenable to packaging into an AAV vector when combined with the other regulatory elements required for AAV function (e.g., promoters, polyadenylation sequences, etc.). In the present designs, it was found that split sites between the R17 and R24 subdomains of the dystrophin protein, particularly in the R20 subdomain, were ideal for distributing the nucleic acid sequence equitably between the Nt and Ct vectors.
[0813] Examples of split sites located in the Dystrophin sequence that are used in the dual plasmid and dual vector designs are presented in Table 10 below:
Table 10: Various split sites generated within truncated Dystrophin proteins
* In the amino acid sequences in the table above, the location of the split site is indicated either by an underlined amino acid, in cases where the split site is located within the tripartite nucleotide sequence for the indicated amino acid, or by a hyphen, when the split site occurs between the tripartite nucleic acid sequences for two amino acids.
** In the nucleic acid sequences provided, the location of the split site is indicated by a hyphen.
[0814] As the variable regions for the truncated Dystrophins described in Table 6 can be designed in an N-terminal (Nt) vector, all of the constructs designed to test these various truncated dystrophins can utilize the same C-terminal (Ct) vector, including the same split site for dividing the truncated Dystrophin protein into two vectors. This allows for streamlined screening of truncated Dystrophin designs that can minimize other variables in the testing.
[0815] Optimizing split sites for a given truncated Dystrophin can improve the efficiency, and therefore the protein expression, of the dual plasmid or dual vector design. For example, the truncated dystrophin protein (midi-Dys Δ exon 13-41, SEQ ID NO:95) can be divided into a Nt portion encoded by a first coding region comprising the sequence of SEQ ID NO: 204, and a Ct portion encoded by a second coding region comprising the sequence of SEQ ID NO: 205. Alternatively, the truncated dystrophin protein (midi-Dys Δ exon 13-41, SEQ ID NO:95) can be divided into a Nt portion encoded by a first coding region comprising the sequence of SEQ ID NO: 206, and a Ct portion encoded by a second coding region comprising the sequence of SEQ ID NO: 207. Furthermore, the truncated dystrophin protein (midi-Dys Δ exon 13-41, SEQ ID NO:95) can be divided into a Nt portion encoded by a first coding region comprising the sequence of SEQ ID NO: 208, and a Ct portion encoded by a second coding region comprising the sequence of SEQ ID NO: 209.
[0816] The truncated dystrophin protein (midi-Dys ΔH2-R15, SEQ ID NO:86) can be divided into a Nt portion encoded by a first coding region comprising the sequence of SEQ ID NO: 210, and a Ct portion encoded by a second coding region comprising the sequence of SEQ ID NO: 211.
[0817] The truncated dystrophin protein (midi-Dys Δ exon 17-41, SEQ ID NO: 101) was divided into a Nt portion encoded by a first coding region comprising the sequence of SEQ ID NO: 212, and a Ct portion encoded by a second coding region comprising the sequence of SEQ ID NO: 213.
[0818] FIG. 5 depicts the results of utilizing the different split sites (SS) described above in Table 10 to divide truncated midi-Dys into N-terminal (Nt) and C-terminal (Ct) plasmids, which were transfected into primary human skeletal muscle cells. FIG. 5 shows Western blot results of cell lysates 24 to 48 hours after transfection of various Nt and Ct combinations to express truncated midi-Dystrophins.
[0819] The MANDRA antibody was used to detect midi-Dystrophin protein expression; it detects an epitope included in all truncated midi-Dys designs as well as in the full-length 427 kDa DMD protein (which is endogenously expressed by primary human skeletal muscle cells). Table 11 below shows the results of densitometry analysis (using ImageJ) of the Western blot results depicted in FIG. 5. The relative densitometry intensities are normalized to the protein expression of midi-Dys ΔH2-R15 using split site #1 from SEQ ID NOs 232 + 233 (bold). Certain truncated midi-Dys Δ exon 13-41 designs utilizing alternative SSs are expressed up to 1.72x greater than the starting midi-Dys Δ exon 13-41 split site #1 constructs (SEQ ID NOs 236 + 233).
Table 11: Effect of altering the split site location on expression (densitometry data from Figure 5)
[0820] The designs described above were synthesized and sub-cloned into the pcDNA3.1(+) plasmid for testing. For example, the midi-Dys ΔH2-R15 Nt segment (SEQ ID NO 4), with a Kozak sequence (“GCCACC”) on its 5’ end as well as a splice donor sequence (SEQ ID NO: 133) and a ribozyme sequence (SEQ ID NO:133) on its 3’ end, was subcloned into pcDNA3.1(+), resulting in SEQ ID NO: 245. The corresponding Ct segment of the midi-Dys ΔH2-R15 (SEQ ID NO 5) was linked to a ribozyme (SEQ ID NO: 159) and splice acceptor sequence (SEQ ID NO: 137) on its 5’ end and to a stop codon, or a 3xFLAG tag plus a stop codon, on its 3’ end, and subcloned into the pcDNA3.1(+) plasmid, which created SEQ ID NO: 246 (SEQ ID NO: 246 includes the 3xFLAG tag to enable FLAG-based detection of truncated dystrophin expression). This same strategy is applied to any Nt and Ct segment for testing via transfection of DNA expression plasmids. This strategy is not limited to the pcDNA3.1(+) expression plasmid. Any suitable backbone plasmid can be used, which could vary the promoter, polyA sequence, and other regulatory elements. Furthermore, the plasmid can be an AAV cis-plasmid, which contains the ITR sequences necessary for DNA packaging into AAV particles during AAV production.
Example 3. Characterization of DNA expression plasmids In Vitro
[0821] DNA expression plasmids containing the entire midi-Dystrophin encoding sequence, or the N-terminal (Nt) portion and the C-terminal (Ct) portion for each of the truncated dystrophin proteins, as described in Examples 1 and 2 were designed and prepared.
[0822] Varying amounts of DNA plasmid were then transfected into cells using standard molecular biology techniques. 4-to-72 hours after transfection, the cells were lysed and protein homogenates were prepared for characterization of truncated dystrophin expression via Western blotting (or similar techniques).
[0823] Function of truncated dystrophins can also be established by determining whether proper interactions with other DAPC members occur in cultured skeletal muscle cells in vitro, for example through co-immunoprecipitation experiments that establish truncated Dystrophin interactions with alpha-Dystrobrevin, another key DAPC member.
Example 4. Design and Preparation of AAV Vectors
[0824] AAV vectors (cis-plasmids) expressing either the N-terminal (Nt) portion or the C-terminal (Ct) portion of each of the truncated dystrophin proteins, as described in Example 1, are designed.
[0825] The Nt vector comprises, in 5’ to 3’ order, a 5’ ITR sequence, a promoter sequence, a first coding region encoding an N-terminal portion of the truncated dystrophin protein, an intron splice donor sequence, a 3’ ribozyme (also referred to as “Nt Ribozyme” as shown below), and a 3’ ITR sequence. The Nt vector may also comprise a polyadenylation sequence and/or an intron region.
[0826] The Ct vector comprises, in 5’ to 3’ order, a 5’ ITR sequence, a promoter sequence, a 5’ ribozyme (also referred to as “Ct Ribozyme” as shown below), an intron splice acceptor sequence, a second coding region encoding a C-terminal portion of the truncated dystrophin protein, and a 3’ ITR sequence. The Ct vector may also comprise a polyadenylation sequence and/or a post transcriptional regulatory element (e.g., WPRE).
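The 5’ to 3’ element orders recited above for the Nt and Ct vectors can be captured in a small data structure and used to check a candidate layout. This is a minimal sketch; the element labels and the helper name are hypothetical, and the positions chosen for the optional WPRE and polyadenylation sequence in the example layout are assumptions, since the text above does not specify where those optional elements are placed.

```python
NT_VECTOR_ORDER = [
    "5' ITR", "promoter", "first coding region (Nt portion)",
    "intron splice donor", "3' ribozyme", "3' ITR",
]
CT_VECTOR_ORDER = [
    "5' ITR", "promoter", "5' ribozyme", "intron splice acceptor",
    "second coding region (Ct portion)", "3' ITR",
]


def has_required_order(elements, required):
    """True if every required element appears in elements in the same relative order.

    Additional (optional) elements may be interleaved anywhere.
    """
    remaining = iter(elements)
    return all(name in remaining for name in required)


# Hypothetical Ct layout with optional elements; their placement is an assumption.
candidate_ct = [
    "5' ITR", "promoter", "5' ribozyme", "intron splice acceptor",
    "second coding region (Ct portion)", "WPRE", "polyadenylation sequence", "3' ITR",
]
ct_ok = has_required_order(candidate_ct, CT_VECTOR_ORDER)
```

The check treats the recited order as a required subsequence, so a layout with the ribozyme and coding region swapped would be rejected while added optional elements are tolerated.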
[0827] Exemplary AAV vectors expressing the Nt portion and the Ct portion of the truncated dystrophin protein (midi-Dys ΔH2-R15, SEQ ID NO:86) are shown below.
[0828] Exemplary AAV vectors expressing the Nt portion and the Ct portion of the truncated dystrophin protein (midi-Dys Δ exon 17-41, SEQ ID NO: 101) are shown below.
[0829] Exemplary AAV vectors expressing the Nt portion and the Ct portion of the truncated dystrophin protein (midi-Dys Δ exon 13-41, SEQ ID NO:95) are shown below.
[0830] Exemplary AAV vectors expressing the Nt portion and the Ct portion of the truncated dystrophin protein utilizing codon optimized midi-Dys sequences are shown below.
[0831] The designed AAV vectors (cis-plasmids) can be directly transfected into cells in vitro as DNA plasmids, or they can be packaged into recombinant AAVs (rAAVs) of any given serotypes. These rAAV particles can then be exposed to cells (for transduction) or animals and further characterized.
Example 5. Characterization of AAV Vectors In Vitro
[0832] AAV vectors containing the N-terminal (Nt) portion and the C-terminal (Ct) portion for each of the truncated dystrophin proteins, as described in the Examples above, were designed and prepared. The cis-plasmid containing the intended AAV genome can either be directly transfected into cells in vitro as a DNA plasmid or used in the process of rAAV packaging to produce AAV vectors containing the intended AAV genome. These rAAV particles can then be used to transduce cells in vitro or administered to animals to assess the effects of transduction in on-target as well as off-target tissues.
[0833] Four AAV designs described above were delivered to primary human skeletal muscle cells in vitro via DNA-based transfection of the cis-plasmid. These designs included the starting dual vector configuration (SEQ ID NO 301 + 302) to deliver midi-Dys ΔH2-R15 protein (SEQ ID NO:86) as well as an optimized sequence version (SEQ ID NO 314 + 309) to deliver midi-Dys ΔH2-R15 protein (SEQ ID NO:86).
[0834] FIG. 7 depicts the results of DNA-based transfection of AAV cis-plasmids with one representative dual vector configuration (SEQ ID NO 314 + 309) based on the results of codon optimization, split site selection, and re-configuration of other regulatory elements presented in Examples 1, 2 and 3. This dual vector combination was compared to the starting dual vector configuration (SEQ ID NO 301 + 302). FIG. 7 shows Western blot results for the levels of midi-Dys ΔH2-R15 protein (SEQ ID NO:86) obtained in cell lysates 24 to 48 hours after transfection. The optimized dual vector combination (SEQ ID NO 314 + 309) led to significantly higher expression of truncated midi-Dystrophin compared to the starting design (SEQ ID NO 301 + 302). The relative densitometry intensities were normalized to the midi-Dys ΔH2-R15 codon VI split site #1 configuration (SEQ ID NOs 301 + 302, bold) and are presented in Table 12 below. The codon and split site optimized midi-Dys ΔH2-R15 (SEQ ID NOs 314 + 309) expressed 294x higher levels of protein than the starting midi-Dys (SEQ ID NOs 301 + 302) based on densitometry.
Table 12: Densitometry data from Figure 7.
[0835] Experiments using packaged rAAV are first carried out to characterize the RNA transcript encoding the truncated dystrophin protein and the truncated dystrophin protein produced in vitro. AAV designs detailed above (SEQ ID NOs 301-314) were packaged into AAV particles. These AAV preparations can be further used to determine the efficiency of various dual AAV combinations in reconstituting midi-Dys protein expression in vitro and in vivo.
[0836] Cells, such as HEK293T cells, primary human skeletal muscle cells (from either healthy or DMD patients), or immortalized muscle cells (such as C2C12 cells), are co-transduced with varying amounts of Nt and Ct AAV vector(s). RT-PCR and Western blotting are performed to evaluate the resulting RNA transcripts and truncated proteins, respectively. Co-transduction of the Nt and Ct vectors results in detectable protein corresponding to the expected length of the correctly combined Nt and Ct portions of the truncated dystrophin protein in the cells. Correct ligation of the nucleotide sequence encoding the Nt portion and the nucleotide sequence encoding the Ct portion is further confirmed by sequencing analysis. Western blotting confirms the correct size of the midi-Dystrophin protein reconstituted by the dual vectors. In some embodiments, a 3xFLAG tag (or similar, such as HA, His, or Myc) can be added to the N-terminus of the Nt vector and/or the C-terminus of the Ct vector to allow sensitive detection of the full-length midi-Dystrophin protein via either Western blotting or an ELISA-based detection method. Antibodies to detect dystrophin protein can be directed at epitopes originating from the Nt vector sequence (such as DysB, Dys3, DYS-3241, etc.) or the Ct vector sequence (Dys2, MANDRA1, etc.), or at portions of the dystrophin that may be encoded by either the Nt or Ct vector depending on the exact design (MANDYS106). It is expected that optimized Nt and Ct combinatorial conditions, based on optimized truncated dystrophins that further utilize the codon and split site optimizations described in the examples above, will result in high levels of truncated dystrophin expression.
[0837] Next, the function and activity of the truncated dystrophin protein are evaluated in vitro. In addition to confirming the proper length of the midi-dystrophin product, in vitro characterization of the co-localization of the midi-dystrophin in muscle cells with endogenous cellular proteins that are either members of, or otherwise associated with, the Dystrophin-associated protein complex (DAPC), such as alpha-sarcoglycan, beta-dystroglycan, alpha-dystroglycan, delta-sarcoglycan, Syntrophin, Utrophin, nNOS (neuronal nitric oxide synthase) or other factors, is performed via IF and high-resolution microscopy. Since dystrophin protein primarily serves as a structural component of the DAPC, co-localization with other DAPC members is indicative that functional activity of dystrophin has been restored. Co-immunoprecipitation experiments further establish direct interactions between truncated dystrophin and other DAPC members.
[0838] Function of midi-dystrophin proteins in vitro is further ascertained via their effect on cell stimulation or contractile properties in either myotubes or cardiomyocytes, where midi-dystrophin may enhance or restore function in healthy-donor or DMD-patient derived cells. It has been established that cells from healthy donors exhibit specific, reproducible and quantifiable stimulation and contractile properties that can be studied in vitro. These phenotypes can be further perturbed by administration of tetanus contractions or myosin ATPase inhibitors, and the impact of treatment can be characterized. In addition, cell lines from healthy donors in which the dystrophin gene has been knocked out are created to allow functional characterization of the impact of a midi-dystrophin transgene in a null background. Measures such as twitch force, contraction energy, contraction velocity, time to peak contraction, relaxation velocity, time to 50% relaxation of the peak force, time to 90% relaxation of the peak force, % force over time (“fatigue”), force-frequency relationships, tissue stiffness, tissue strain, and calcium flux have all been characterized in vitro and shown to be impacted by the presence, absence, and augmentation of dystrophin protein levels, and these measures will thus be evaluated for the midi-dystrophin proteins.
[0839] It is expected that the experiments described in the present example demonstrate that the constructs of the present disclosure produce functional dystrophin protein in vitro.
Example 6. Characterization of AAV Vectors In Vivo
[0840] AAV vectors containing the coding region for the N-terminal (Nt) portion and the C-terminal (Ct) portion for each of the truncated dystrophin proteins, as described in Example 1, are designed. The Nt and Ct AAV vector pairs are prepared to express RNAs that, when joined via the ribozyme technology and spliced, encode a truncated but highly functional human dystrophin protein. The expression of each vector is under the control of a muscle-specific promoter. The Nt and Ct AAV vectors of each pair are packaged separately in an AAV capsid, e.g., a muscle-tropic AAV capsid such as AAV6, AAV9, AAVrh74, AAV-MYO, AAV-MYO2, AAV-MYO3, MYO3A-AAV, MYO4A-AAV, or MYO4E-AAV, to generate two AAV particles. These particles are mixed and then dosed. The mixture may be a 1:1 ratio of Nt to Ct vector, but can also be, for example, 2:1, 3:1, 4:1, or 1:2, 1:3, 1:4.
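The arithmetic behind mixing the two particles at a chosen Nt:Ct ratio can be sketched as follows. The total dose value and the use of vector genomes (vg) as the unit are assumptions made for illustration; the text above specifies only the ratios.

```python
from fractions import Fraction


def split_dose(total_vg, nt_parts, ct_parts):
    """Divide a total vector dose between Nt and Ct particles at nt_parts:ct_parts."""
    if nt_parts <= 0 or ct_parts <= 0:
        raise ValueError("ratio parts must be positive")
    nt_dose = total_vg * Fraction(nt_parts, nt_parts + ct_parts)  # exact share for Nt
    return float(nt_dose), float(total_vg - nt_dose)


# Hypothetical total dose of 4e12 vg mixed at 3:1 Nt to Ct.
nt_vg, ct_vg = split_dose(4 * 10**12, 3, 1)
```

Using exact fractions avoids rounding drift for ratios such as 1:2 or 1:3, where the per-vector share is not a terminating decimal.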
[0841] Dystrophin-KO mice (D2-mdx), which display an age-dependent disease phenotype characterized by degenerating and regenerating myofibers, necrosis and fibrosis, are injected with the Nt- and Ct-AAV particle pair. Other mouse models of DMD exist and can be tested, including but not limited to C56B16 mdx (or “mdx”), mdx4cv, and mdx-/utrn-/-.
[0842] Biomarkers of muscle health are assessed via serum collection and analysis. Serum creatine kinase, ALT, AST, and an N-terminal fragment of Titin are all biomarkers used to assess muscle health/damage and are predictive of dystrophin transgene function in mdx mice.
[0843] One month post-injection, a timepoint at which severe myofiber degeneration occurs in D2-mdx mice, muscle histology is assessed. It is expected that the injected mice have preserved muscle architecture, similar to saline injected age-matched wild-type mice, and in contrast to the degeneration observed in saline injected D2-mdx control littermates. This assessment can include myofiber size, location of nuclei, and degree of fibrosis and necrosis in the tissue. Timepoints post-injection can be extended for longer periods of time (for example, 6 weeks, 8 weeks, 3 months, 6 months, 9 months, or 12 months) to measure durability of the effect of treatment.
[0844] Immunofluorescence staining using an anti-Dystrophin antibody is also performed. It is expected that normal membrane localization of the truncated dystrophin is observed in injected D2-mdx mice. The level of expressed truncated dystrophin protein can also be confirmed by Western blotting and/or ELISA-based assays. Markers of muscle damage can also be evaluated either in-life or via collection of blood/serum at the point of termination; it is expected that the markers, including serum creatine kinase and the percentage of myofibers with centralized nuclei, are more similar to wild-type levels than to levels found in saline injected D2-mdx control littermates.
[0845] It is expected that the experiments described in the present example demonstrate that constructs of the present disclosure produce mature, functional truncated dystrophin proteins in vivo. The level of expressed truncated dystrophin protein can be quantified by Western blotting, immunofluorescence and/or ELISA based assays.
[0846] In vivo characterization can be extended to other animal models, including but not limited to canine models (Golden Retriever Muscular Dystrophy [GRMD] dogs or ΔE50 muscular dystrophy [DE50-MD] dogs) or non-human primates. Similar characterizations can be performed as described above, although each model may demonstrate different genotypes and phenotypes to take into consideration for each particular assay.
[0847] For canine models of DMD, where the endogenous DMD gene does not produce functional protein (and therefore the dogs exhibit many features of disease present in humans with DMD), the human amino acid sequence of the midi-Dystrophin transgene is often exchanged for the corresponding canine sequence to reduce the immunogenicity risk of expression of the transgene (For example, see: Kodippili K, Hakim CH, Pan X, Yang HT, Yue Y, Zhang Y, Shin JH, Yang NN, Duan D. Dual AAV Gene Therapy for Duchenne Muscular Dystrophy with a 7-kb Mini-Dystrophin Gene in the Canine Model. Hum Gene Ther. 2018 Mar;29(3):299-311. doi: 10.1089/hum.2017.095. Epub 2017 Aug 4. PMID: 28793798; PMCID: PMC5865264.) After gene transfer to the canine model, muscle biopsies are used to establish the level and extent of midi-dystrophin expression over time. The function of midi-Dystrophin can be established by restoration of the missing dystrophin-associated glycoprotein complex (DAPC). Function can be further established by demonstrating reductions in muscle degeneration and fibrosis as well as improved myofiber size distribution. In addition, function can be established by testing midi-dystrophin’s ability to protect muscle from eccentric contraction-induced force loss.
[0848] For testing in non-human primates, which express the endogenous DMD gene, it can be hard to distinguish midi-dystrophin transgene expression and function because of robust endogenous gene expression (except in those assays where the size of the midi-dystrophin can be distinguished from the full-length endogenous DMD, such as Western blotting). Therefore, midi-Dystrophins are delivered that include a FLAG tag, which enables FLAG detection as a means to identify the expression of the midi-dystrophin transgene. Function of the midi-dystrophin can be established via co-immunoprecipitation of the FLAG-tagged midi-dystrophin and its association with other DAPC members. Primate testing is also important to establish the safety of many drug development programs. Primates can be used to assess the safety of a dual vector midi-dystrophin approach, including assays of biodistribution of the vector, muscle health, liver function, etc. (For example, see: Gushchina LV, Frair EC, Rohan N, Bradley AJ, Simmons TR, Chavan HD, Chou HJ, Eggers M, Waldrop MA, Wein N, Flanigan KM. Lack of Toxicity in Nonhuman Primates Receiving Clinically Relevant Doses of an AAV9.U7snRNA Vector Designed to Induce DMD Exon 2 Skipping. Hum Gene Ther. 2021 Sep;32(17-18):882-894. doi: 10.1089/hum.2020.286. Epub 2021 May 7. PMID: 33406986; PMCID: PMC10112461.)

Claims

1. A system for generating a truncated human dystrophin protein, comprising a first recombinant nucleic acid molecule and a second recombinant nucleic acid molecule, wherein the first nucleic acid molecule comprises a first coding region encoding an N-terminal portion of the truncated dystrophin protein and a 3’ ribozyme, wherein the first coding region is operably linked to the 3’ ribozyme at its 3’ end, wherein the second nucleic acid molecule comprises a second coding region encoding a C-terminal portion of the truncated dystrophin protein and a 5’ ribozyme, wherein the second coding region is operably linked to the 5’ ribozyme at its 5’ end, wherein upon ribozyme-mediated catalytic ligation, the first coding region and the second coding region form a third coding region encoding the truncated human dystrophin protein, and wherein the truncated human dystrophin protein comprises at least 1640 amino acids.
2. The system of claim 1, wherein the first coding region is operably linked to two or more 3’ ribozymes at its 3’ end.
3. The system of claim 2, wherein the two or more 3’ ribozymes are the same 3’ ribozyme.
4. The system of claim 2, wherein the two or more 3’ ribozymes are different 3’ ribozymes.
5. The system of any one of claims 1-4, wherein the second coding region is operably linked to two or more 5’ ribozymes at its 5’ end.
6. The system of claim 5, wherein the two or more 5’ ribozymes are the same 5’ ribozymes.
7. The system of claim 5, wherein the two or more 5’ ribozymes are different 5’ ribozymes.
8. The system of any one of claims 1-7, wherein the 5’ ribozyme and the 3’ ribozyme are each independently selected from the group consisting of Twister (Osa), Twister (Dre), Twister (Nvi), Twister (Sbi), Twister (Envl), Twister (Spu), Twister (Cpa), Twister Sister, Hammerhead (RzB)c, HDV, Pistol, Varkud Satellite (VS), Hatchet, Hairpin, and Hovlinc (Hov).
9. The system of any one of claims 1-8, wherein the 5’ ribozyme and the 3’ ribozyme are each independently selected from the group consisting of SEQ ID NOs: 6 - 20.
10. The system of any one of claims 1-9, wherein the first nucleic acid molecule further comprises an intron splice donor sequence, and the second nucleic acid molecule further comprises an intron splice acceptor sequence.
11. The system of claim 10, wherein the splice donor sequence is positioned between the first coding region and the 3’ ribozyme.
12. The system of claim 11, wherein the splice donor sequence is selected from the group consisting of SEQ ID NOs: 133 - 136.
13. The system of any of claims 10-12, wherein the splice donor sequence is positioned within a region of the truncated dystrophin protein coding for a region selected from the group consisting of the R7 domain, the R8 domain, the R9 domain, the R10 domain, the R11 domain, the R12 domain, the R13 domain, the R14 domain, the R15 domain, the R16 domain, the R17 domain, the R18 domain, the R19 domain, the H3 domain, the R20 domain, the R21 domain, and the R22 domain.
14. The system of any of claims 10-13, wherein the splice donor sequence is positioned within a region of the truncated dystrophin protein coding for a region selected from the group consisting of the R8 domain, the R19 domain, the H3 domain, the R20 domain, and the R21 domain.
15. The system of any one of claims 10-14, wherein the splice donor sequence is not positioned within the R21 domain.
16. The system of any one of claims 10-15, wherein the splice donor sequence is positioned at least 10 nucleotides, at least 15 nucleotides, at least 20 nucleotides, at least 30 nucleotides, at least 40 nucleotides, at least 50 nucleotides, or more from the 3’ ribozyme.
17. The system of claim 10, wherein the splice acceptor sequence is positioned between the 5’ ribozyme and the second coding region.
18. The system of claim 17, wherein the splice acceptor sequence is selected from the group consisting of SEQ ID NOs: 137-141.
19. The system of any one of claims 10, 17 and 18, wherein the splice acceptor sequence is positioned at least 10 nucleotides, at least 15 nucleotides, at least 20 nucleotides, at least 30 nucleotides, at least 40 nucleotides, at least 50 nucleotides, or more from the 5’ ribozyme.
20. The system of any one of claims 10-19, wherein the splice donor sequence and the splice acceptor sequence are positioned such that the resulting spliced intron is between 50 - 200 bp in length.
21. The system of any one of claims 10-20, wherein the splice donor sequence and splice acceptor sequence are positioned such that the resulting spliced intron encodes a single predominant reading frame.
22. The system of any one of claims 10-21, wherein a stop codon sequence is introduced into the splice donor sequence or the splice acceptor sequence.
23. The system of any one of claims 1-22, wherein at least one of the first coding region and the second coding region is at least 2000 nucleotides in length, or at least 2200 nucleotides in length, or at least 2400 nucleotides in length, or at least 2600 nucleotides in length.
24. The system of any one of claims 1-23, wherein the first coding region and the second coding region are each at least 2000 nucleotides in length, or at least 2200 nucleotides in length, or at least 2400 nucleotides in length, or at least 2600 nucleotides in length.
25. The system of any one of claims 1-24, wherein the third coding region is at least 4920 nucleotides in length, or at least 5100 nucleotides in length, or at least 5300 nucleotides in length.
26. The system of any one of claims 1-25, wherein the third coding region comprises 300 or fewer CpG motifs.
27. The system of any one of claims 1-26, wherein the third coding region comprises 290 or fewer CpG motifs.
28. The system of any one of claims 1-27, wherein the third coding region comprises 67 or fewer CpG motifs.
29. The system of any one of claims 1-28, wherein the first coding region and the second coding region do not share a region of substantial sequence identity.
30. The system of any one of claims 1-29, wherein the 3’ end of the first coding region does not have a sequence identity to the 5’ end of the second coding region.
31. The system of any one of claims 1-30, wherein the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
32. The system of any one of claims 1-31, wherein the truncated dystrophin protein further comprises H1 Domain (SEQ ID NO: 22).
33. The system of claim 32, wherein the truncated dystrophin protein comprises an amino acid sequence selected from the group consisting of: a. midi-Dys ΔR1-R15 (SEQ ID NO: 83), b. midi-Dys ΔR2-R15 (SEQ ID NO: 84), c. midi-Dys ΔR3-R15 (SEQ ID NO: 85), d. midi-Dys ΔH2-R15 (SEQ ID NO: 86), e. midi-Dys ΔR4-R15 (SEQ ID NO: 87), f. midi-Dys ΔR5-R15 (SEQ ID NO: 88), g. midi-Dys Δ exon 13-33 (SEQ ID NO: 93), h. midi-Dys Δ exon 13-39 (SEQ ID NO: 94), i. midi-Dys Δ exon 13-41 (SEQ ID NO: 95), j. midi-Dys Δ exon 13-48 (SEQ ID NO: 96), k. midi-Dys Δ exon 15-39 (SEQ ID NO: 97), l. midi-Dys Δ exon 15-41 (SEQ ID NO: 98), m. midi-Dys Δ exon 15-48 (SEQ ID NO: 99), n. midi-Dys Δ exon 17-39 (SEQ ID NO: 100), o. midi-Dys Δ exon 17-41 (SEQ ID NO: 101), p. midi-Dys Δ exon 17-48 (SEQ ID NO: 102), q. midi-Dys Δ exon 18-39 (SEQ ID NO: 220), r. midi-Dys Δ exon 18-41 (SEQ ID NO: 221), s. midi-Dys Δ exon 18-48 (SEQ ID NO: 222), t. midi-Dys Δ exon 19-39 (SEQ ID NO: 103), u. midi-Dys Δ exon 19-41 (SEQ ID NO: 104), v. midi-Dys Δ exon 19-48 (SEQ ID NO: 105), w. midi-Dys Δ exon 21-41 (SEQ ID NO: 106), x. midi-Dys Δ exon 21-42 (SEQ ID NO: 223), and y. midi-Dys Δ exon 21-48 (SEQ ID NO: 107).
34. The system of any one of claims 1-33, wherein the truncated dystrophin protein further comprises R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), and R19 Domain (SEQ ID NO: 42).
35. The system of claim 34, wherein the truncated dystrophin protein comprises an amino acid sequence selected from the group consisting of: a. midi-Dys ΔR1-R15 (SEQ ID NO: 83), b. midi-Dys ΔR2-R15 (SEQ ID NO: 84), c. midi-Dys ΔR3-R15 (SEQ ID NO: 85), d. midi-Dys ΔH2-R15 (SEQ ID NO: 86), e. midi-Dys ΔR4-R15 (SEQ ID NO: 87), f. midi-Dys ΔR5-R15 (SEQ ID NO: 88), g. midi-Dys Δ exon 10-33 (SEQ ID NO: 89), h. midi-Dys Δ exon 10-39 (SEQ ID NO: 90), i. midi-Dys Δ exon 10-41 (SEQ ID NO: 91), j. midi-Dys Δ exon 11-33 (SEQ ID NO: 216), k. midi-Dys Δ exon 11-39 (SEQ ID NO: 217), l. midi-Dys Δ exon 11-41 (SEQ ID NO: 218), m. midi-Dys Δ exon 13-33 (SEQ ID NO: 93), n. midi-Dys Δ exon 13-39 (SEQ ID NO: 94), o. midi-Dys Δ exon 13-41 (SEQ ID NO: 95), p. midi-Dys Δ exon 15-39 (SEQ ID NO: 97), q. midi-Dys Δ exon 15-41 (SEQ ID NO: 98), r. midi-Dys Δ exon 17-39 (SEQ ID NO: 100), s. midi-Dys Δ exon 17-41 (SEQ ID NO: 101), t. midi-Dys Δ exon 18-39 (SEQ ID NO: 220), u. midi-Dys Δ exon 18-41 (SEQ ID NO: 221), v. midi-Dys Δ exon 19-39 (SEQ ID NO: 103), w. midi-Dys Δ exon 19-41 (SEQ ID NO: 104), x. midi-Dys Δ exon 21-41 (SEQ ID NO: 106), and y. midi-Dys Δ exon 21-42 (SEQ ID NO: 223).
36. The system of any one of claims 1-35, wherein the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
37. The system of any one of claims 1-35, wherein the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
38. The system of any one of claims 1-35, wherein the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
39. The system of any one of claims 1-35, wherein the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
40. The system of any one of claims 1-35, wherein the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
41. The system of any one of claims 1-35, wherein the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), R4 domain (SEQ ID NO: 27), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
42. The system of any one of claims 1-35, wherein the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), a partial H1 domain (SEQ ID NO: 404), a partial R11 domain (SEQ ID NO: 412), R12 Domain (SEQ ID NO: 35), R13 Domain (SEQ ID NO: 36), R14 domain (SEQ ID NO: 37), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
43. The system of any one of claims 1-35, wherein the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), a partial H1 domain (SEQ ID NO: 404), a partial R14 Domain (SEQ ID NO: 413), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
44. The system of any one of claims 1-35, wherein the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), a partial H1 domain (SEQ ID NO: 404), a partial R15 Domain (SEQ ID NO: 414), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
45. The system of any one of claims 1-35, wherein the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), a partial H1 domain (SEQ ID NO: 404), a partial R19 Domain (SEQ ID NO: 415), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
46. The system of any one of claims 1-35, wherein the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), a partial R1 domain (SEQ ID NO: 405), a partial R11 domain (SEQ ID NO: 412), R12 Domain (SEQ ID NO: 35), R13 Domain (SEQ ID NO: 36), R14 Domain (SEQ ID NO: 37), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
47. The system of any one of claims 1-35, wherein the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), a partial R1 domain (SEQ ID NO: 405), a partial R14 Domain (SEQ ID NO: 413), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
48. The system of any one of claims 1-35, wherein the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), a partial R1 domain (SEQ ID NO: 405), a partial R15 Domain (SEQ ID NO: 414), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
49. The system of any one of claims 1-35, wherein the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), a partial R1 domain (SEQ ID NO: 405), a partial R19 Domain (SEQ ID NO: 415), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
50. The system of any one of claims 1-35, wherein the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), a partial R2 domain (SEQ ID NO: 406), a partial R11 domain (SEQ ID NO: 412), R12 Domain (SEQ ID NO: 35), R13 Domain (SEQ ID NO: 36), R14 Domain (SEQ ID NO: 37), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
51. The system of any one of claims 1-35, wherein the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), a partial R2 domain (SEQ ID NO: 406), a partial R14 Domain (SEQ ID NO: 413), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
52. The system of any one of claims 1-35, wherein the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), a partial R2 domain (SEQ ID NO: 406), a partial R15 Domain (SEQ ID NO: 414), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
53. The system of any one of claims 1-35, wherein the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), a partial R2 domain (SEQ ID NO: 406), a partial R19 Domain (SEQ ID NO: 415), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
54. The system of any one of claims 1-35, wherein the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), a partial R3 domain (SEQ ID NO: 407), a partial R14 Domain (SEQ ID NO: 413), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
55. The system of any one of claims 1-35, wherein the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), a partial R3 domain (SEQ ID NO: 407), a partial R15 Domain (SEQ ID NO: 414), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
56. The system of any one of claims 1-35, wherein the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), a partial R3 domain (SEQ ID NO: 407), a partial R19 Domain (SEQ ID NO: 415), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
57. The system of any one of claims 1-35, wherein the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), a partial R3 domain (SEQ ID NO: 408), a partial R14 Domain (SEQ ID NO: 413), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
58. The system of any one of claims 1-35, wherein the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), a partial R3 domain (SEQ ID NO: 408), a partial R15 Domain (SEQ ID NO: 414), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
59. The system of any one of claims 1-35, wherein the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), a partial R3 domain (SEQ ID NO: 408), a partial R19 Domain (SEQ ID NO: 415), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
60. The system of any one of claims 1-35, wherein the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), a partial R4 domain (SEQ ID NO: 409), a partial R14 Domain (SEQ ID NO: 413), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
61. The system of any one of claims 1-35, wherein the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), a partial R4 domain (SEQ ID NO: 409), a partial R15 Domain (SEQ ID NO: 414), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
62. The system of any one of claims 1-35, wherein the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), a partial R4 domain (SEQ ID NO: 409), a partial R19 Domain (SEQ ID NO: 415), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
63. The system of any one of claims 1-35, wherein the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), a partial R4 domain (SEQ ID NO: 410), a partial R14 Domain (SEQ ID NO: 413), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
64. The system of any one of claims 1-35, wherein the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), a partial R4 domain (SEQ ID NO: 410), a partial R15 Domain (SEQ ID NO: 414), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
65. The system of any one of claims 1-35, wherein the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), a partial R4 domain (SEQ ID NO: 410), a partial R19 Domain (SEQ ID NO: 415), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
66. The system of any one of claims 1-35, wherein the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), R4 domain (SEQ ID NO: 27), a partial R5 domain (SEQ ID NO: 411), a partial R15 Domain (SEQ ID NO: 414), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
67. The system of any one of claims 1-35, wherein the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), R4 domain (SEQ ID NO: 27), a partial R5 domain (SEQ ID NO: 411), a partial R16 Domain (SEQ ID NO: 416), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
68. The system of any one of claims 1-35, wherein the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), R4 domain (SEQ ID NO: 27), a partial R5 domain (SEQ ID NO: 411), a partial R19 Domain (SEQ ID NO: 415), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
69. The system of any one of claims 1-68, wherein the truncated dystrophin protein comprises an amino acid sequence selected from the group consisting of SEQ ID NOs: 83-107 and 216-223, or an amino acid sequence at least about 90% identical thereto.
70. The system of any one of claims 1-69, wherein the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 86, or an amino acid sequence at least about 90% identical thereto.
71. The system of any one of claims 1-69, wherein the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 95, or an amino acid sequence at least about 90% identical thereto.
72. The system of any one of claims 1-69, wherein the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 101, or an amino acid sequence at least about 90% identical thereto.
73. The system of any one of claims 1-69, wherein the amino acid sequence of the truncated dystrophin protein is not identical to an amino acid sequence of SEQ ID NO: 143.
74. The system of any one of claims 1-73, wherein the truncated dystrophin protein is not a polypeptide of 2361 amino acids.
75. The system of any one of claims 1-74, wherein the truncated dystrophin protein is less than 2361 amino acids in length.
76. The system of any one of claims 1-75, wherein the truncated dystrophin protein is greater than 2361 amino acids in length.
77. The system of any one of claims 1-76, wherein the truncated dystrophin protein is functional.
78. The system of any one of claims 1-77, wherein the first nucleic acid molecule is present in a first viral vector, and the second nucleic acid molecule is present in a second viral vector.
79. The system of claim 78, wherein the first viral vector and the second viral vector are each independently selected from the group consisting of an adenoviral vector, an adeno-associated viral vector, a lentiviral vector, a vaccinia vector, a herpes simplex viral vector, and an Epstein-Barr viral vector.
80. The system of claim 78 or 79, wherein the first viral vector is an adeno-associated viral (AAV) vector, and the second viral vector is an AAV vector.
81. The system of claim 80, wherein the AAV vector is selected from the group consisting of an AAV1, AAV2, AAV3, AAV4, AAV5, AAV6, AAV7, AAV8, AAV9, AAV10, AAV11, AAV12, AAV13, AAVrh74, AAV-rh10, AAV-DJ, AAV-LK03, AAV-MYO, AAV-MYO2, AAV-MYO3, MYO3A-AAV, MYO4A-AAV, and MYO4E-AAV.
82. The system of any one of claims 78-81, wherein the first AAV vector further comprises a first promoter operably linked to the first nucleic acid molecule.
83. The system of any one of claims 78-82, wherein the second AAV vector further comprises a second promoter operably linked to the second nucleic acid molecule.
84. The system of claim 82 or 83, wherein the promoter comprises a tissue specific promoter or a ubiquitous promoter.
85. The system of any one of claims 82-84, wherein the promoter comprises a CK8 promoter, an MHCK7 promoter, an SPC5-12 promoter, or a minimal CKM promoter.
86. The system of any one of claims 82-85, wherein the promoter comprises a nucleotide sequence selected from the group consisting of SEQ ID NOs: 144-150, or a nucleotide sequence at least 95% identical thereto.
87. The system of any one of claims 83-86, wherein the first and/or second AAV vectors further comprise an inverted terminal repeat (ITR) sequence.
88. The system of claim 87, wherein the ITR sequence comprises a nucleotide sequence of SEQ ID NO: 202 and/or 203, or a nucleotide sequence at least 95% identical thereto.
89. The system of any one of claims 83-88, wherein the first and/or second AAV vectors further comprise an intron region.
90. The system of claim 89, wherein the intron region comprises a nucleotide sequence of SEQ ID NO: 156 or 157, or a nucleotide sequence at least 95% identical thereto.
91. The system of any one of claims 83-90, wherein the first and/or second AAV vectors further comprise a polyadenylation sequence.
92. The system of claim 91, wherein the polyadenylation sequence comprises a nucleotide sequence of SEQ ID NO: 151 or 152, or a nucleotide sequence at least 95% identical thereto.
93. The system of any one of claims 83-92, wherein the first and/or second AAV vectors further comprise a Woodchuck Hepatitis Virus Posttranscriptional Regulatory Element (WPRE).
94. The system of claim 93, wherein the WPRE comprises a nucleotide sequence of SEQ ID NOs: 153-155, or a nucleotide sequence at least 95% identical thereto.
95. The system of any one of claims 83-94, wherein the first and/or second AAV vectors further comprise a Kozak sequence.
96. A vector system for expressing a truncated human dystrophin protein, comprising a first AAV vector and a second AAV vector, wherein the first AAV vector comprises a first nucleic acid molecule comprising a first coding region encoding an N-terminal portion of the truncated dystrophin protein and a 3’ ribozyme, wherein the first coding region is operably linked to the 3’ ribozyme at its 3’ end, wherein the second AAV vector comprises a second nucleic acid molecule comprising a second coding region encoding a C-terminal portion of the truncated dystrophin protein and a 5’ ribozyme, wherein the second coding region is operably linked to the 5’ ribozyme at its 5’ end, wherein upon ribozyme-mediated catalytic ligation, the first coding region and the second coding region form a third coding region encoding the truncated human dystrophin protein, and wherein the truncated human dystrophin protein comprises at least 1640 amino acids.
97. The system of claim 96, wherein the first coding region is operably linked to two or more 3’ ribozymes at its 3’ end.
98. The system of claim 97, wherein the two or more 3’ ribozymes are the same 3’ ribozyme.
99. The system of claim 97, wherein the two or more 3’ ribozymes are different 3’ ribozymes.
100. The system of any one of claims 96-99, wherein the second coding region is operably linked to two or more 5’ ribozymes at its 5’ end.
101. The system of claim 100, wherein the two or more 5’ ribozymes are the same 5’ ribozyme.
102. The system of claim 100, wherein the two or more 5’ ribozymes are different 5’ ribozymes.
103. The system of any one of claims 96-102, wherein the 5’ ribozyme and the 3’ ribozyme are each independently selected from the group consisting of Twister (Osa), Twister (Dre), Twister (Nvi), Twister (Sbi), Twister (Envl), Twister (Spu), Twister (Cpa), Twister Sister, Hammerhead (RzB), HDV, Pistol, Varkud Satellite (VS), Hatchet, Hairpin, and Hovlinc (Hov).
104. The system of any one of claims 96-103, wherein the 5’ ribozyme and the 3’ ribozyme are each independently selected from the group consisting of SEQ ID NOs: 6-20.
105. The system of any one of claims 96-104, wherein the first nucleic acid molecule further comprises an intron splice donor sequence, and the second nucleic acid molecule further comprises an intron splice acceptor sequence.
106. The system of claim 105, wherein the splice donor sequence is positioned between the first coding region and the 3’ ribozyme.
107. The system of claim 105 or 106, wherein the splice donor sequence is selected from the group consisting of SEQ ID NOs: 133-136.
108. The system of any of claims 105-107, wherein the splice donor sequence is positioned within a region of the truncated dystrophin protein coding for a region selected from the group consisting of the R7 domain, the R8 domain, the R9 domain, the R10 domain, the R11 domain, the R12 domain, the R13 domain, the R14 domain, the R15 domain, the R16 domain, the R17 domain, the R18 domain, the R19 domain, the H3 domain, the R20 domain, the R21 domain, and the R22 domain.
109. The system of any of claims 105-108, wherein the splice donor sequence is positioned within a region of the truncated dystrophin protein coding for a region selected from the group consisting of the R8 domain, the R19 domain, the H3 domain, the R20 domain, and the R21 domain.
110. The system of any one of claims 105-109, wherein the splice donor sequence is not positioned within the R21 domain.
111. The system of any one of claims 105-110, wherein the splice donor sequence is positioned at least 10 nucleotides, at least 15 nucleotides, at least 20 nucleotides, at least 30 nucleotides, at least 40 nucleotides, at least 50 nucleotides, or more from the 3’ ribozyme.
112. The system of any one of claims 105-111, wherein the splice acceptor sequence is positioned between the 5’ ribozyme and the second coding region.
113. The system of any one of claims 105-112, wherein the splice acceptor sequence is selected from the group consisting of SEQ ID NOs: 137-141.
114. The system of any one of claims 105-113, wherein the splice acceptor sequence is positioned at least 10 nucleotides, at least 15 nucleotides, at least 20 nucleotides, at least 30 nucleotides, at least 40 nucleotides, at least 50 nucleotides, or more from the 5’ ribozyme.
115. The system of any one of claims 105-114, wherein the splice donor sequence and the splice acceptor sequence are positioned such that the resulting spliced intron is between 50 and 200 bp in length.
116. The system of any one of claims 105-115, wherein the splice donor sequence and splice acceptor sequence are positioned such that the resulting spliced intron encodes a single predominant reading frame.
117. The system of any one of claims 105-116, wherein a stop codon sequence is introduced into the splice donor sequence or the splice acceptor sequence.
118. The system of any one of claims 96-117, wherein at least one of the first coding region and the second coding region is at least 2000 nucleotides in length, or at least 2200 nucleotides in length, or at least 2400 nucleotides in length, or at least 2600 nucleotides in length, and wherein the third coding region is at least 4920 nucleotides in length.
119. The system of any one of claims 96-118, wherein the first coding region and the second coding region are each at least 2000 nucleotides in length, or at least 2200 nucleotides in length, or at least 2400 nucleotides in length, or at least 2600 nucleotides in length, and wherein the third coding region is at least 5100 nucleotides in length.
120. The system of any one of claims 96-119, wherein the third coding region is at least 4920 nucleotides in length, or at least 5100 nucleotides in length, or at least 5300 nucleotides in length.
121. The system of any one of claims 96-120, wherein the third coding region comprises 300 or fewer CpG motifs.
122. The system of any one of claims 96-120, wherein the third coding region comprises 290 or fewer CpG motifs.
123. The system of any one of claims 96-120, wherein the third coding region comprises 67 or fewer CpG motifs.
124. The system of any one of claims 96-123, wherein the first coding region and the second coding region do not share a region of substantial sequence identity.
125. The system of any one of claims 96-124, wherein the 3’ end of the first coding region does not have a sequence identity to the 5’ end of the second coding region.
126. The system of any one of claims 96-125, wherein the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
127. The system of any one of claims 96-126, wherein the truncated dystrophin protein further comprises H1 Domain (SEQ ID NO: 22).
128. The system of claim 127, wherein the truncated dystrophin protein comprises an amino acid sequence selected from the group consisting of: a. midi-Dys ΔR1-R15 (SEQ ID NO: 83), b. midi-Dys ΔR2-R15 (SEQ ID NO: 84), c. midi-Dys ΔR3-R15 (SEQ ID NO: 85), d. midi-Dys ΔH2-R15 (SEQ ID NO: 86), e. midi-Dys ΔR4-R15 (SEQ ID NO: 87), f. midi-Dys ΔR5-R15 (SEQ ID NO: 88), g. midi-Dys Δ exon 13-33 (SEQ ID NO: 93), h. midi-Dys Δ exon 13-39 (SEQ ID NO: 94), i. midi-Dys Δ exon 13-41 (SEQ ID NO: 95), j. midi-Dys Δ exon 13-48 (SEQ ID NO: 96), k. midi-Dys Δ exon 15-39 (SEQ ID NO: 97), l. midi-Dys Δ exon 15-41 (SEQ ID NO: 98), m. midi-Dys Δ exon 15-48 (SEQ ID NO: 99), n. midi-Dys Δ exon 17-39 (SEQ ID NO: 100), o. midi-Dys Δ exon 17-41 (SEQ ID NO: 101), p. midi-Dys Δ exon 17-48 (SEQ ID NO: 102), q. midi-Dys Δ exon 18-39 (SEQ ID NO: 220), r. midi-Dys Δ exon 18-41 (SEQ ID NO: 221), s. midi-Dys Δ exon 18-48 (SEQ ID NO: 222), t. midi-Dys Δ exon 19-39 (SEQ ID NO: 103), u. midi-Dys Δ exon 19-41 (SEQ ID NO: 104), v. midi-Dys Δ exon 19-48 (SEQ ID NO: 105), w. midi-Dys Δ exon 21-41 (SEQ ID NO: 106), x. midi-Dys Δ exon 21-42 (SEQ ID NO: 223), and y. midi-Dys Δ exon 21-48 (SEQ ID NO: 107).
129. The system of any one of claims 96-128, wherein the truncated dystrophin protein further comprises R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), and R19 Domain (SEQ ID NO: 42).
130. The system of claim 129, wherein the truncated dystrophin protein comprises an amino acid sequence selected from the group consisting of: a. midi-Dys ΔR1-R15 (SEQ ID NO: 83), b. midi-Dys ΔR2-R15 (SEQ ID NO: 84), c. midi-Dys ΔR3-R15 (SEQ ID NO: 85), d. midi-Dys ΔH2-R15 (SEQ ID NO: 86), e. midi-Dys ΔR4-R15 (SEQ ID NO: 87), f. midi-Dys ΔR5-R15 (SEQ ID NO: 88), g. midi-Dys Δ exon 10-33 (SEQ ID NO: 89), h. midi-Dys Δ exon 10-39 (SEQ ID NO: 90), i. midi-Dys Δ exon 10-41 (SEQ ID NO: 91), j. midi-Dys Δ exon 11-33 (SEQ ID NO: 216), k. midi-Dys Δ exon 11-39 (SEQ ID NO: 217), l. midi-Dys Δ exon 11-41 (SEQ ID NO: 218), m. midi-Dys Δ exon 13-33 (SEQ ID NO: 93), n. midi-Dys Δ exon 13-39 (SEQ ID NO: 94), o. midi-Dys Δ exon 13-41 (SEQ ID NO: 95), p. midi-Dys Δ exon 15-39 (SEQ ID NO: 97), q. midi-Dys Δ exon 15-41 (SEQ ID NO: 98), r. midi-Dys Δ exon 17-39 (SEQ ID NO: 100), s. midi-Dys Δ exon 17-41 (SEQ ID NO: 101), t. midi-Dys Δ exon 18-39 (SEQ ID NO: 220), u. midi-Dys Δ exon 18-41 (SEQ ID NO: 221), v. midi-Dys Δ exon 19-39 (SEQ ID NO: 103), w. midi-Dys Δ exon 19-41 (SEQ ID NO: 104), x. midi-Dys Δ exon 21-41 (SEQ ID NO: 106), and y. midi-Dys Δ exon 21-42 (SEQ ID NO: 223).
131. The system of any one of claims 96-130, wherein the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
132. The system of any one of claims 96-130, wherein the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
133. The system of any one of claims 96-130, wherein the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
134. The system of any one of claims 96-130, wherein the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
135. The system of any one of claims 96-130, wherein the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
136. The system of any one of claims 96-130, wherein the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), R4 domain (SEQ ID NO: 27), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
137. The system of any one of claims 96-130, wherein the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), a partial H1 domain (SEQ ID NO: 404), a partial R11 domain (SEQ ID NO: 412), R12 Domain (SEQ ID NO: 35), R13 Domain (SEQ ID NO: 36), R14 domain (SEQ ID NO: 37), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
138. The system of any one of claims 96-130, wherein the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), a partial H1 domain (SEQ ID NO: 404), a partial R14 Domain (SEQ ID NO: 413), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
139. The system of any one of claims 96-130, wherein the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), a partial H1 domain (SEQ ID NO: 404), a partial R15 Domain (SEQ ID NO: 414), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
140. The system of any one of claims 96-130, wherein the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), a partial H1 domain (SEQ ID NO: 404), a partial R19 Domain (SEQ ID NO: 415), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
141. The system of any one of claims 96-130, wherein the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), a partial R1 domain (SEQ ID NO: 405), a partial R11 domain (SEQ ID NO: 412), R12 Domain (SEQ ID NO: 35), R13 Domain (SEQ ID NO: 36), R14 Domain (SEQ ID NO: 37), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
142. The system of any one of claims 96-130, wherein the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), a partial R1 domain (SEQ ID NO: 405), a partial R14 Domain (SEQ ID NO: 413), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
143. The system of any one of claims 96-130, wherein the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), a partial R1 domain (SEQ ID NO: 405), a partial R15 Domain (SEQ ID NO: 414), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
144. The system of any one of claims 96-130, wherein the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), a partial R1 domain (SEQ ID NO: 405), a partial R19 Domain (SEQ ID NO: 415), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
145. The system of any one of claims 96-130, wherein the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), a partial R2 domain (SEQ ID NO: 406), a partial R11 domain (SEQ ID NO: 412), R12 Domain (SEQ ID NO: 35), R13 Domain (SEQ ID NO: 36), R14 Domain (SEQ ID NO: 37), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
146. The system of any one of claims 96-130, wherein the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), a partial R2 domain (SEQ ID NO: 406), a partial R14 Domain (SEQ ID NO: 413), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
147. The system of any one of claims 96-130, wherein the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), a partial R2 domain (SEQ ID NO: 406), a partial R15 Domain (SEQ ID NO: 414), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
148. The system of any one of claims 96-130, wherein the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), a partial R2 domain (SEQ ID NO: 406), a partial R19 Domain (SEQ ID NO: 415), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
149. The system of any one of claims 96-130, wherein the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), a partial R3 domain (SEQ ID NO: 407), a partial R14 Domain (SEQ ID NO: 413), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
150. The system of any one of claims 96-130, wherein the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), a partial R3 domain (SEQ ID NO: 407), a partial R15 Domain (SEQ ID NO: 414), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
151. The system of any one of claims 96-130, wherein the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), a partial R3 domain (SEQ ID NO: 407), a partial R19 Domain (SEQ ID NO: 415), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
152. The system of any one of claims 96-130, wherein the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), a partial R3 domain (SEQ ID NO: 408), a partial R14 Domain (SEQ ID NO: 413), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
153. The system of any one of claims 96-130, wherein the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), a partial R3 domain (SEQ ID NO: 408), a partial R15 Domain (SEQ ID NO: 414), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
154. The system of any one of claims 96-130, wherein the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), a partial R3 domain (SEQ ID NO: 408), a partial R19 Domain (SEQ ID NO: 415), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
155. The system of any one of claims 96-130, wherein the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), a partial R4 domain (SEQ ID NO: 409), a partial R14 Domain (SEQ ID NO: 413), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
156. The system of any one of claims 96-130, wherein the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), a partial R4 domain (SEQ ID NO: 409), a partial R15 Domain (SEQ ID NO: 414), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
157. The system of any one of claims 96-130, wherein the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), a partial R4 domain (SEQ ID NO: 409), a partial R19 Domain (SEQ ID NO: 415), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
158. The system of any one of claims 96-130, wherein the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), a partial R4 domain (SEQ ID NO: 410), a partial R14 Domain (SEQ ID NO: 413), R15 Domain (SEQ ID NO: 38), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
159. The system of any one of claims 96-130, wherein the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), a partial R4 domain (SEQ ID NO: 410), a partial R15 Domain (SEQ ID NO: 414), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
160. The system of any one of claims 96-130, wherein the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), a partial R4 domain (SEQ ID NO: 410), a partial R19 Domain (SEQ ID NO: 415), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
161. The system of any one of claims 96-130, wherein the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), R4 domain (SEQ ID NO: 27), a partial R5 domain (SEQ ID NO: 411), a partial R15 Domain (SEQ ID NO: 414), R16 Domain (SEQ ID NO: 39), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
162. The system of any one of claims 96-130, wherein the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), R4 domain (SEQ ID NO: 27), a partial R5 domain (SEQ ID NO: 411), a partial R16 Domain (SEQ ID NO: 416), R17 Domain (SEQ ID NO: 40), R18 Domain (SEQ ID NO: 41), R19 Domain (SEQ ID NO: 42), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
163. The system of any one of claims 96-130, wherein the truncated dystrophin protein comprises ABCD domain (SEQ ID NO: 21), H1 domain (SEQ ID NO: 22), R1 domain (SEQ ID NO: 23), R2 domain (SEQ ID NO: 24), R3 domain (SEQ ID NO: 25), H2 domain (SEQ ID NO: 26), R4 domain (SEQ ID NO: 27), a partial R5 domain (SEQ ID NO: 411), a partial R19 Domain (SEQ ID NO: 415), H3 Domain (SEQ ID NO: 43), R20 Domain (SEQ ID NO: 44), R21 Domain (SEQ ID NO: 45), R22 Domain (SEQ ID NO: 46), R23 Domain (SEQ ID NO: 47), R24 Domain (SEQ ID NO: 48), H4 Domain (SEQ ID NO: 49), CR Domain (SEQ ID NO: 50), and CT Domain (SEQ ID NO: 51).
164. The system of any one of claims 96-163, wherein the truncated dystrophin protein comprises an amino acid sequence selected from the group consisting of SEQ ID NOs: 83-107, or an amino acid sequence at least about 90% identical thereto.
165. The system of any one of claims 96-164, wherein the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 86, or an amino acid sequence at least about 90% identical thereto.
166. The system of any one of claims 96-165, wherein the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 95, or an amino acid sequence at least about 90% identical thereto.
167. The system of any one of claims 96-166, wherein the truncated dystrophin protein comprises an amino acid sequence of SEQ ID NO: 101, or an amino acid sequence at least about 90% identical thereto.
168. The system of any one of claims 96-167, wherein the amino acid sequence of the truncated dystrophin protein is not identical to an amino acid sequence of SEQ ID NO: 143.
169. The system of any one of claims 96-168, wherein the truncated dystrophin protein is not a polypeptide of 2361 amino acids.
170. The system of any one of claims 96-169, wherein the truncated dystrophin protein is less than 2361 amino acids in length.
171. The system of any one of claims 96-170, wherein the truncated dystrophin protein is greater than 2361 amino acids in length.
172. The system of any one of claims 96-171, wherein the truncated dystrophin protein is functional.
173. The system of any one of claims 96-172, wherein the first coding region comprises the sequence selected from the group consisting of SEQ ID NOs: 284, 286, 288, 290, 291, 293, 295 and 297.
174. The system of any one of claims 96-173, wherein the second coding region comprises the sequence selected from the group consisting of SEQ ID NOs: 285, 287, 289, 292, 294 and 296.
175. The system of any one of claims 96-174, wherein the first coding sequence comprises SEQ ID NO: 286 and the second coding sequence comprises SEQ ID NO: 287.
176. The system of any one of claims 96-174, wherein the first coding sequence comprises SEQ ID NO: 288 and the second coding sequence comprises SEQ ID NO: 289.
177. The system of any one of claims 96-174, wherein the first coding sequence comprises SEQ ID NO: 290 and the second coding sequence comprises SEQ ID NO: 289.
178. The system of any one of claims 96-174, wherein the first coding sequence comprises SEQ ID NO: 291 and the second coding sequence comprises SEQ ID NO: 292.
179. The system of any one of claims 96-174, wherein the first coding sequence comprises SEQ ID NO: 293 and the second coding sequence comprises SEQ ID NO: 294.
180. The system of any one of claims 96-174, wherein the first coding sequence comprises SEQ ID NO: 295 and the second coding sequence comprises SEQ ID NO: 296.
181. The system of any one of claims 96-174, wherein the first coding sequence comprises SEQ ID NO: 297 and the second coding sequence comprises SEQ ID NO: 292.
182. The system of any one of claims 96-174, wherein the first coding sequence comprises SEQ ID NO: 288 and the second coding sequence comprises SEQ ID NO: 289.
183. The system of any one of claims 96-174, wherein the first coding sequence comprises SEQ ID NO: 288 and the second coding sequence comprises SEQ ID NO: 287.
184. The system of any one of claims 96-183, wherein the AAV vector is selected from the group consisting of an AAV1, AAV2, AAV3, AAV4, AAV5, AAV6, AAV7, AAV8, AAV9, AAV10, AAV11, AAV12, AAV13, AAVrh74, AAV-rh10, AAV-DJ, AAV-LK03, AAV-MYO, AAV-MYO2, AAV-MYO3, MYO3A-AAV, MYO4A-AAV, and MYO4E-AAV.
185. The system of any one of claims 96-184, wherein the first AAV vector further comprises a first promoter operably linked to the first nucleic acid molecule.
186. The system of any one of claims 96-185, wherein the second AAV vector further comprises a second promoter operably linked to the second nucleic acid molecule.
187. The system of claim 185 or 186, wherein the promoter comprises a tissue specific promoter or a ubiquitous promoter.
188. The system of any one of claims 185-187, wherein the promoter comprises a CK8 promoter, an MHCK7 promoter, an SPC5 promoter, or a minimal CKM promoter.
189. The system of any one of claims 185-188, wherein the promoter comprises a nucleotide sequence selected from the group consisting of SEQ ID NOs: 144-150, or a nucleotide sequence at least 95% identical thereto.
190. The system of any one of claims 96-189, wherein the first and/or second AAV vectors further comprise an inverted terminal repeat (ITR) sequence.
191. The system of claim 190, wherein the ITR sequence comprises a nucleotide sequence of SEQ ID NO: 202 and/or 203, or a nucleotide sequence at least 95% identical thereto.
192. The system of any one of claims 96-191, wherein the first and/or second AAV vectors further comprise an intron region.
193. The system of claim 192, wherein the intron region comprises a nucleotide sequence of SEQ ID NO: 156 or 157, or a nucleotide sequence at least 95% identical thereto.
194. The system of any one of claims 96-193, wherein the first and/or second AAV vectors further comprise a polyadenylation sequence.
195. The system of claim 194, wherein the polyadenylation sequence comprises a nucleotide sequence of SEQ ID NO: 151 or 152, or a nucleotide sequence at least 95% identical thereto.
196. The system of any one of claims 96-195, wherein the first and/or second AAV vectors further comprise a Woodchuck Hepatitis Virus Posttranscriptional Regulatory Element (WPRE).
197. The system of claim 196, wherein the WPRE comprises a nucleotide sequence of SEQ ID NO: 153, 154, or 155, or a nucleotide sequence at least 95% identical thereto.
198. The system of any one of claims 96-197, wherein the first and/or second AAV vectors further comprise a Kozak sequence.
199. The system of any one of claims 96-198, wherein the first AAV vector comprises, in 5’ to 3’ order, a 5’ ITR sequence, a promoter sequence, a first coding region encoding an N-terminal portion of the truncated dystrophin protein, an intron splice donor sequence, a 3’ ribozyme, and a 3’ ITR sequence.
200. The system of any one of claims 96-199, wherein the first AAV vector comprises, in 5’ to 3’ order, a 5’ ITR sequence, a promoter sequence, a first coding region encoding an N-terminal portion of the truncated dystrophin protein, an intron splice donor sequence, a 3’ ribozyme, a polyadenylation sequence, and a 3’ ITR sequence.
201. The system of any one of claims 96-200, wherein the first AAV vector comprises, in 5’ to 3’ order, a 5’ ITR sequence, a promoter sequence, an intron region, a first coding region encoding an N-terminal portion of the truncated dystrophin protein, an intron splice donor sequence, a 3’ ribozyme, a polyadenylation sequence, and a 3’ ITR sequence.
202. The system of any one of claims 96-201, wherein the first AAV vector comprises a sequence selected from the group consisting of SEQ ID NO: 160, 162, 164, 166, 168, 170, 172, 174, 176, 178, 180, 182, 184, 186, 188, 190, 192, 194, 196, 198, 200, 301, 303, 305, 307, 308, 310, 312, 314, 394, and 395.
203. The system of any one of claims 96-202, wherein the second AAV vector comprises, in 5’ to 3’ order, a 5’ ITR sequence, a promoter sequence, a 5’ ribozyme, an intron splice acceptor sequence, a second coding region encoding a C-terminal portion of the truncated dystrophin protein, and a 3’ ITR sequence.
204. The system of any one of claims 96-203, wherein the second AAV vector comprises, in 5’ to 3’ order, a 5’ ITR sequence, a promoter sequence, a 5’ ribozyme, an intron splice acceptor sequence, a second coding region encoding a C-terminal portion of the truncated dystrophin protein, a polyadenylation sequence, and a 3’ ITR sequence.
205. The system of any one of claims 96-204, wherein the second AAV vector comprises, in 5’ to 3’ order, a 5’ ITR sequence, a promoter sequence, a 5’ ribozyme, an intron splice acceptor sequence, a second coding region encoding a C-terminal portion of the truncated dystrophin protein, a WPRE sequence, a polyadenylation sequence, and a 3’ ITR sequence.
206. The system of any one of claims 96-205, wherein the second AAV vector comprises a sequence selected from the group consisting of SEQ ID NO: 161, 163, 165, 167, 169, 171, 173, 175, 177, 179, 181, 183, 185, 187, 189, 191, 193, 195, 197, 199, 201, 302, 304, 306, 309, 311, and 313.
207. The system of any one of claims 96-206, wherein the first AAV vector comprises the sequence of SEQ ID NO: 303, and the second AAV vector comprises the sequence of SEQ ID NO: 304.
208. The system of any one of claims 96-206, wherein the first AAV vector comprises the sequence of SEQ ID NO: 305, and the second AAV vector comprises the sequence of SEQ ID NO: 306.
209. The system of any one of claims 96-206, wherein the first AAV vector comprises the sequence of SEQ ID NO: 307, and the second AAV vector comprises the sequence of SEQ ID NO: 306.
210. The system of any one of claims 96-206, wherein the first AAV vector comprises the sequence of SEQ ID NO: 308, and the second AAV vector comprises the sequence of SEQ ID NO: 309.
211. The system of any one of claims 96-206, wherein the first AAV vector comprises the sequence of SEQ ID NO: 310, and the second AAV vector comprises the sequence of SEQ ID NO: 311.
212. The system of any one of claims 96-206, wherein the first AAV vector comprises the sequence of SEQ ID NO: 312, and the second AAV vector comprises the sequence of SEQ ID NO: 313.
213. The system of any one of claims 96-206, wherein the first AAV vector comprises the sequence of SEQ ID NO: 314, and the second AAV vector comprises the sequence of SEQ ID NO: 309.
214. The system of any one of claims 96-206, wherein the first AAV vector comprises the sequence of SEQ ID NO: 394, and the second AAV vector comprises the sequence of SEQ ID NO: 309.
215. The system of any one of claims 96-206, wherein the first AAV vector comprises the sequence of SEQ ID NO: 395, and the second AAV vector comprises the sequence of SEQ ID NO: 306.
216. The system of any one of claims 96-206, wherein the first AAV vector comprises the sequence of SEQ ID NO: 395, and the second AAV vector comprises the sequence of SEQ ID NO: 304.
217. An isolated truncated dystrophin protein that has at least 90% sequence identity to a sequence selected from the group consisting of SEQ ID NOs: 83, 85-87, 89-107, and 216-223.
218. An isolated nucleic acid encoding a truncated dystrophin protein that has at least 90% sequence identity to a sequence selected from the group consisting of SEQ ID NOs: 83, 85-87, 89-107, and 216-223.
219. The nucleic acid of claim 218, wherein the nucleic acid sequence is at least 90% identical to a sequence selected from the group consisting of SEQ ID NOs: 108, 110-112, 114-132, 224-231, 260-280, and 396-403.
220. An isolated recombinant vector comprising the nucleic acid of claims 218-219.
221. An isolated recombinant viral genome comprising a nucleic acid molecule encoding a truncated human dystrophin protein comprising an amino acid sequence selected from the group consisting of SEQ ID NOs: 83, 85-87, 89-107, and 216-223.
222. The isolated recombinant viral genome of claim 221, wherein the isolated nucleic acid molecule comprises a nucleotide sequence selected from the group consisting of SEQ ID NOs: 108, 110-112, 114-132, 224-231, 260-280, and 396-403, or a nucleotide sequence at least 90% identical thereto.
223. A host cell comprising the first nucleic acid molecule and/or the second nucleic acid molecule of any one of claims 1-95, the first AAV vector and/or the second AAV vector of any one of claims 96-216, the nucleic acid of claim 218 or claim 219, the vector of claim 220, or the recombinant viral genome of claim 221 or claim 222.
224. The host cell of claim 223, wherein the cell is a mammalian cell, an insect cell, or a bacterial cell.
225. A pharmaceutical composition comprising a first nucleic acid molecule and a second nucleic acid molecule of any one of claims 1-95, and a pharmaceutically acceptable excipient.
226. The pharmaceutical composition of claim 225, wherein the first nucleic acid molecule and the second nucleic acid molecule are presented at a ratio of 1:1, 1:2, 1:3, 1:4, 1:5, 2:1, 3:1, 4:1, or 5:1.
227. A pharmaceutical composition comprising a first AAV vector and a second AAV vector of any one of claims 96-216, and a pharmaceutically acceptable excipient.
228. The pharmaceutical composition of claim 227, wherein the first AAV vector and the second AAV vector are presented at a ratio of 1:1, 1:2, 1:3, 1:4, 1:5, 2:1, 3:1, 4:1, or 5:1.
229. A pharmaceutical composition comprising the protein of claim 217, the nucleic acid of claims 218-219, the vector of claim 220, the viral genome of claim 221 or 222, or the host cell of claim 223 or 224.
230. A method for treating a dystrophin-associated disorder in a subject in need thereof, comprising administering a therapeutically effective amount of the first nucleic acid molecule and the second nucleic acid molecule of any one of claims 1-95, the first AAV vector and the second AAV vector of any one of claims 96-216, the protein of claim 217, the nucleic acid of claims 218-219, the vector of claim 220, the viral genome of claim 221 or 222, or the host cell of claim 223 or 224, thereby treating the dystrophin-associated disorder in the subject.
231. A method for increasing expression of dystrophin in a subject having or diagnosed with having a dystrophin-associated disorder, comprising administering a therapeutically effective amount of the first nucleic acid molecule and the second nucleic acid molecule of any one of claims 1-95, the first AAV vector and the second AAV vector of any one of claims 96-216, the protein of claim 217, the nucleic acid of claims 218-219, the vector of claim 220, the viral genome of claim 221 or claim 222, or the host cell of claim 223 or 224, thereby increasing expression of dystrophin in the subject.
232. A method for increasing muscle mass or muscle strength and/or preventing fibrosis in a subject having or diagnosed with having a dystrophin-associated disorder, comprising administering a therapeutically effective amount of the first nucleic acid molecule and the second nucleic acid molecule of any one of claims 1-95, the first AAV vector and the second AAV vector of any one of claims 96-216, the protein of claim 217, the nucleic acid of claims 218-219, the vector of claim 220, the viral genome of claim 221 or claim 222, or the host cell of claim 223 or 224, thereby increasing muscle strength and/or preventing fibrosis in the subject.
233. The method of any one of claims 231-232, wherein the dystrophin-associated disorder is muscular dystrophy.
234. The method of any one of claims 231-233, wherein the dystrophin-associated disorder is Duchenne muscular dystrophy.
235. The method of any one of claims 231-234, wherein the pharmaceutical composition is administered by intramuscular injection, or intravenous injection.
236. The method of any one of claims 231-235, wherein the first AAV vector and the second AAV vector are administered together.
237. The method of any one of claims 231-236, wherein the first AAV vector and the second AAV vector are administered separately.
238. A method of making a first recombinant adeno-associated virus (rAAV) particle, the method comprising providing a host cell comprising the first nucleic acid molecule of any one of claims 1-95, and incubating the host cell under conditions suitable to encapsulate the first nucleic acid in an AAV capsid protein; thereby making the first rAAV particle.
239. A method of making a second recombinant adeno-associated virus (rAAV) particle, the method comprising providing a host cell comprising the second nucleic acid molecule of any one of claims 1-95, and incubating the host cell under conditions suitable to encapsulate the second nucleic acid in an AAV capsid protein; thereby making the second rAAV particle.
240. The method of claim 238 or 239, wherein the cell is a mammalian cell, an insect cell, or a bacterial cell.
241. The system of any one of claims 1-216 for use in the treatment of a dystrophin-associated disorder.
242. The first nucleic acid molecule and the second nucleic acid molecule of any one of claims 1-95 for use in the treatment of a dystrophin-associated disorder.
243. The first AAV vector and the second AAV vector of any one of claims 93-216 for use in the treatment of a dystrophin-associated disorder.
244. The pharmaceutical composition of any one of claims 242-243 for use in the treatment of a dystrophin-associated disorder.
245. The isolated nucleic acid molecule of any one of claims 242-244 for use in the treatment of a dystrophin-associated disorder.
PCT/US2025/026535 2024-04-26 2025-04-25 Dystrophin constructs and methods of use thereof Pending WO2025227131A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US202463639556P 2024-04-26 2024-04-26
US63/639,556 2024-04-26

Publications (1)

Publication Number Publication Date
WO2025227131A1 (en) 2025-10-30

Family

ID=97491087

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2025/026535 Pending WO2025227131A1 (en) 2024-04-26 2025-04-25 Dystrophin constructs and methods of use thereof

Country Status (1)

Country Link
WO (1) WO2025227131A1 (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20230073250A1 (en) * 2020-02-07 2023-03-09 University Of Rochester Ribozyme-mediated RNA Assembly and Expression
WO2024075012A1 (en) * 2022-10-06 2024-04-11 Pfizer Inc. Improved host cells for aav vector production

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 25795351

Country of ref document: EP

Kind code of ref document: A1