WO2025027136A1 - Single-cell crispr-screening of multiple gene perturbations in vivo - Google Patents
Single-cell crispr-screening of multiple gene perturbations in vivo
- Publication number
- WO2025027136A1 (PCT/EP2024/071816)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- grna
- cell
- promoter
- gene
- vector
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- C—CHEMISTRY; METALLURGY
- C40—COMBINATORIAL TECHNOLOGY
- C40B—COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
- C40B40/00—Libraries per se, e.g. arrays, mixtures
- C40B40/04—Libraries containing only organic compounds
- C40B40/06—Libraries containing nucleotides or polynucleotides, or derivatives thereof
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/10—Processes for the isolation, preparation or purification of DNA or RNA
- C12N15/1034—Isolating an individual clone by screening libraries
- C12N15/1065—Preparation or screening of tagged libraries, e.g. tagged microorganisms by STM-mutagenesis, tagged polynucleotides, gene tags
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/87—Introduction of foreign genetic material using processes not otherwise provided for, e.g. co-transformation
- C12N15/90—Stable introduction of foreign DNA into chromosome
- C12N15/902—Stable introduction of foreign DNA into chromosome using homologous recombination
- C12N15/907—Stable introduction of foreign DNA into chromosome using homologous recombination in mammalian cells
-
- C—CHEMISTRY; METALLURGY
- C40—COMBINATORIAL TECHNOLOGY
- C40B—COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
- C40B40/00—Libraries per se, e.g. arrays, mixtures
- C40B40/04—Libraries containing only organic compounds
- C40B40/06—Libraries containing nucleotides or polynucleotides, or derivatives thereof
- C40B40/08—Libraries containing RNA or DNA which encodes proteins, e.g. gene libraries
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01N—INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
- G01N33/00—Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
- G01N33/48—Biological material, e.g. blood, urine; Haemocytometers
- G01N33/50—Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
- G01N33/5005—Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing involving human or animal cells
- G01N33/5008—Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing involving human or animal cells for testing or evaluating the effect of chemical or biological compounds, e.g. drugs, cosmetics
- G01N33/5082—Supracellular entities, e.g. tissue, organisms
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/11—DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
- C12N15/113—Non-coding nucleic acids modulating the expression of genes, e.g. antisense oligonucleotides; Antisense DNA or RNA; Triplex- forming oligonucleotides; Catalytic nucleic acids, e.g. ribozymes; Nucleic acids used in co-suppression or gene silencing
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/79—Vectors or expression systems specially adapted for eukaryotic hosts
- C12N15/85—Vectors or expression systems specially adapted for eukaryotic hosts for animal cells
- C12N15/86—Viral vectors
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2310/00—Structure or type of the nucleic acid
- C12N2310/10—Type of nucleic acid
- C12N2310/20—Type of nucleic acid involving clustered regularly interspaced short palindromic repeats [CRISPR]
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2320/00—Applications; Uses
- C12N2320/10—Applications; Uses in screening processes
- C12N2320/12—Applications; Uses in screening processes in functional genomics, i.e. for the determination of gene function
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2750/00—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA ssDNA viruses
- C12N2750/00011—Details
- C12N2750/14011—Parvoviridae
- C12N2750/14111—Dependovirus, e.g. adenoassociated viruses
- C12N2750/14141—Use of virus, viral particle or viral elements as a vector
- C12N2750/14143—Use of virus, viral particle or viral elements as a vector viral genome or elements thereof as genetic vector
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2830/00—Vector systems having a special element relevant for transcription
- C12N2830/001—Vector systems having a special element relevant for transcription controllable enhancer/promoter combination
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2830/00—Vector systems having a special element relevant for transcription
- C12N2830/008—Vector systems having a special element relevant for transcription cell type or tissue specific enhancer/promoter combination
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2830/00—Vector systems having a special element relevant for transcription
- C12N2830/48—Vector systems having a special element relevant for transcription regulating transport or export of RNA, e.g. RRE, PRE, WPRE, CTE
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/14—Hydrolases (3)
- C12N9/16—Hydrolases (3) acting on ester bonds (3.1)
- C12N9/22—Ribonucleases [RNase]; Deoxyribonucleases [DNase]
Definitions
- the present invention relates to a method allowing for a single-cell-based analysis of multiple CRISPR-mediated gene perturbations in a single organism.
- the method comprises the administration of a plurality of expression vectors each encoding a gRNA into an organism expressing a Cas enzyme, and allows for analysis of the resulting phenotype on a single-cell level.
- a general framework for direct in vivo single-cell screening could broadly facilitate mechanistic studies of health and disease as well as enable the systematic interrogation of the vast and growing catalog of disease-associated risk alleles in disease-relevant cells and tissues.
- Genomics studies have identified thousands of genetic variants associated with human disease and interrogating them in in vivo models is essential to understanding their causality, function, and pathology as well as developing new diagnostics and therapeutics.
- US 2020 018 746 A1 discloses a method for perturbation of disease-implicated genes in 3D tissues composed of human induced neuronal cells and astrocytic cells.
- US 2021 172 017 A1 discloses that Perturb-seq and single-cell sequencing allow the reconstruction of a cellular network or circuit.
- WO 2019 113 499 A1 discloses the use of AAV in a Perturb-seq method.
- WO 2015 089 462 A1 discloses SpCas9-mediated in vivo genome editing in the brain. Still, high-throughput, phenotype-rich, and broadly applicable in vivo methods are urgently needed.
- AAV-Perturb-seq is an adeno-associated virus (AAV)-based single-cell or single-nuclei CRISPR screening method that is simple to implement, tunable, and broadly applicable for in vivo functional genomics studies.
- AAV adeno-associated virus
- gRNA guide RNA
- AAV-Perturb-seq using either gene editing in LSL-Cas9 mice or transcriptional inhibition in dCas9-KRAB mice, to systematically interrogate the genotype-phenotype landscape of individual genes linked to 22q11.2 deletion syndrome (22q11.2 DS, also known as DiGeorge syndrome), a complex genetic disorder affecting numerous organs including the brain, where dysfunction is typically clinically expressed as schizophrenia or autism spectrum disorder (ASD).
- the objective of the present invention is to provide means and methods to analyse multiple gRNA-mediated gene perturbations on a single-cell level in vivo in a single organism.
- a first aspect of the invention relates to a method for analyzing multiple gene perturbations in vivo in a tissue of interest; said method comprising the steps: a. providing an organoid or a non-human organism; b. administering a plurality of viral gRNA-delivering nucleic acid expression vectors to the organoid or organism, each vector comprising: i. inverted-terminal repeats (ITRs); ii. a gRNA promoter; iii. at least one guide-RNA (gRNA) under control of said gRNA promoter, wherein each vector comprises a different gRNA or gRNA combination; and iv. a terminator of transcription; wherein the organism expresses a Cas enzyme, or said vector additionally encodes a Cas enzyme under control of a promoter operable in said cell; c. isolating a sample of the tissue of interest from the organism, or a sample of the organoid; d. in a collection step, collecting cells or nuclei from the sample; e. in an analysis step, performing a single-cell or single-nucleus assay comprising an analysis of the gene perturbation and comprising gRNA sequencing of each cell or nucleus, thereby generating a plurality of assay patterns related to expression of a defined gRNA or a defined gRNA combination.
- references to “about” a value or parameter herein include (and describe) variations that are directed to that value or parameter per se. For example, description referring to “about X” includes description of “X.”
- gene refers to a polynucleotide containing at least one open reading frame (ORF) that is capable of encoding a particular polypeptide or protein after being transcribed and translated.
- ORF open reading frame
- a polynucleotide sequence can be used to identify larger fragments or full-length coding sequences of the gene with which they are associated. Methods of isolating larger fragment sequences are known to those of skill in the art.
- transgene in the context of the present specification relates to a gene or genetic material that has been transferred from one organism to another.
- the term may also refer to transfer of the natural or physiologically intact variant of a genetic sequence into tissue of a patient where it is missing. It may further refer to transfer of a natural encoded sequence the expression of which is driven by a promoter absent or silenced in the targeted tissue.
- a recombinant in the context of the present specification relates to a nucleic acid, which is the product of one or several steps of cloning, restriction and/or ligation and which is different from the naturally occurring nucleic acid.
- a recombinant virus particle comprises a recombinant nucleic acid.
- gene expression or expression may refer to either of, or both of, the processes - and products thereof - of generation of nucleic acids (RNA) or the generation of a peptide or polypeptide, also referred to as transcription and translation, respectively, or any of the intermediate processes that regulate the processing of genetic information to yield polypeptide products.
- the term gene expression may also be applied to the transcription and processing of a RNA gene product, for example a regulatory RNA or a structural (e.g. ribosomal) RNA. If an expressed polynucleotide is derived from genomic DNA, expression may include splicing of the mRNA in a eukaryotic cell. Expression may be assayed both on the level of transcription and translation, in other words mRNA and/or protein product.
- nucleotides in the context of the present specification relates to nucleic acid or nucleic acid analogue building blocks, oligomers of which are capable of forming selective hybrids with RNA or DNA oligomers on the basis of base pairing.
- nucleotides in this context includes the classic ribonucleotide building blocks adenosine, guanosine, uridine (and ribosylthymine), cytidine, the classic deoxyribonucleotides deoxyadenosine, deoxyguanosine, thymidine, deoxyuridine and deoxycytidine.
- nucleic acids such as phosphorothioates, 2’-O-methyl phosphorothioates, peptide nucleic acids (PNA; N-(2-aminoethyl)-glycine units linked by peptide linkage, with the nucleobase attached to the alpha-carbon of the glycine) or locked nucleic acids (LNA; 2’-O, 4’-C methylene bridged RNA building blocks).
- PNA peptide nucleic acids
- LNA locked nucleic acids
- hybridizing sequence may be composed of any of the above nucleotides, or mixtures thereof.
- nucleic acid expression vector in the context of the present specification relates to a plasmid, a viral genome or an RNA, which is used to transfect (in case of a plasmid or an RNA) or transduce (in case of a viral genome) a target cell with a certain gene of interest, or -in the case of an RNA construct being transfected- to translate the corresponding protein of interest from a transfected mRNA.
- the gene of interest is under control of a promoter sequence and the promoter sequence is operational inside the target cell, thus, the gene of interest is transcribed either constitutively or in response to a stimulus or dependent on the cell’s status.
- the viral genome is packaged into a capsid to become a viral vector, which is able to transduce the target cell.
- fluorescent protein in the context of the present specification may relate, but is not limited to, a fluorescent protein selected from the group comprising: green fluorescent protein (GFP) from Aequorea victoria and derivatives thereof, such as enhanced blue fluorescent protein (EBFP), enhanced blue fluorescent protein 2 (EBFP2), azurite, mKalama1, Sirius; enhanced green fluorescent protein (EGFP), emerald, superfolder avGFP, T-sapphire; yellow fluorescent protein (YFP), enhanced yellow fluorescent protein (EYFP), citrine, venus, YPet, topaz, SYFP, mAmetrine; enhanced cyan fluorescent protein (ECFP), mTurquoise, mTurquoise2, cerulean, CyPet, SCFP; fluorescent protein from Discosoma striata and derivatives thereof: mTagBFP,
- GFP green fluorescent protein
- EBFP enhanced blue fluorescent protein
- EBFP2 enhanced blue fluorescent protein 2
- EGFP enhanced green fluorescent protein
- TagCFP AmCyan, Midoriishi Cyan, mTFP1
- gRNA in the context of the present specification relates to a guide RNA which comprises a crRNA part and a tracrRNA part.
- gRNA combination in the context of the present specification relates to multiple (a plurality of) gRNAs.
- a gRNA combination consists of 2 to 1000 gRNAs.
- a gRNA combination consists of 2 to 100 gRNAs.
- a gRNA combination consists of 2 to 10 gRNAs.
- gene perturbation in the context of the present specification relates to an alteration in gene expression induced by CRISPR/Cas and a gRNA.
- a gene perturbation relates to the loss of expression of the gene targeted by the respective gRNA.
- a gene perturbation relates to the inhibition of transcription of the target gene by the respective gRNA and Cas-effector (e.g. KRAB) complex.
- a gene perturbation relates to the activation of transcription of the target gene by the respective gRNA and Cas-effector (e.g. VP64, HSF1, p65) complex.
- a gene perturbation relates to the base editing of the target gene by the respective gRNA and base editing complex.
- a gene perturbation relates to the prime editing of the target gene by the respective prime editing gRNA and base editing complex.
- Cas enzyme in the context of the present specification relates to an enzyme of the Cas family.
- the Cas enzyme is a Cas enzyme of type II.
- the Cas enzyme is a Cas9 enzyme of UniProt-ID Q99ZW2.
- polypeptide part in the context of the present specification relates to one or several domains of a polypeptide, wherein the domain(s) are covalently linked to the rest of the polypeptide.
- gRNA-delivering nucleic acid expression vector in the context of the present specification relates to a nucleic acid expression vector which encodes a gRNA, and optionally further elements.
- a first aspect of the invention relates to a method for analyzing multiple (a plurality of) gene perturbations in vivo in a tissue of interest.
- the method comprises the steps:
- Step a providing an organoid or a non-human multi-cellular organism, particularly a non-human organism.
- An organoid is a multi-cellular, three-dimensional, simplified version of an organ, which is generated in vitro.
- the organoid can be composed of human or non-human cells.
- the non-human organism is an animal or a plant or a fungus.
- the non-human organism is an animal.
- the non-human organism is an adult organism.
- the non-human organism is an adult animal.
- the non-human organism is an animal which is born (not an embryo, post-embryonic stage).
- Step b administering a plurality of gRNA-delivering nucleic acid expression vectors to the organoid or organism, wherein each vector comprises the following elements: i. a gRNA promoter allowing for transcription inside a tissue of interest (particularly a Pol III RNA polymerase promoter); ii. at least one guide-RNA (gRNA) transcribable under control of said gRNA promoter, wherein each vector of the plurality comprises a different gRNA or gRNA combination (in case more than one gRNA is encoded); and
- the organism expresses a Cas enzyme, or the gRNA-delivering nucleic acid expression vector additionally encodes a Cas enzyme under control of a promoter operable in said cell.
- the Cas enzyme is encoded in the genome of the organism or organoid.
- the Cas enzyme is delivered via a separate nucleic acid expression vector.
- the gRNA-delivering nucleic acid expression vector is a viral vector.
- each vector comprises the following elements: i. inverted-terminal repeats (ITRs) flanking the elements described below at the 3’ and the 5’ end (positioned 5’ upstream and 3’ downstream of all elements encoded by the vector, i.e. the elements described below: the gRNA and its promoter, optionally a gene encoding a Cas enzyme, and other optional elements); ii. a gRNA promoter allowing for transcription inside a tissue of interest (particularly a Pol III RNA polymerase promoter);
- each vector of the plurality comprises a different gRNA or gRNA combination (in case more than one gRNA is encoded); and iv. a terminator of transcription in 3’ direction of the gRNA;
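The vector layout described above can be sketched as a small data structure. This is only an illustration under stated assumptions: the class name, the field names, and the placeholder labels "U6", "polyT", and "gRNA_geneA" are hypothetical and do not come from the specification. Placing several gRNAs after one promoter is a simplification.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class GrnaVector:
    """One gRNA-delivering expression vector (all names are illustrative)."""
    grna_promoter: str          # assumed label for a Pol III promoter
    grnas: Tuple[str, ...]      # one gRNA or a gRNA combination
    terminator: str             # terminator 3' of the gRNA
    has_itrs: bool = True       # ITRs flank all encoded elements

    def layout(self) -> List[str]:
        """Return the 5'-to-3' order of the elements."""
        core = [self.grna_promoter, *self.grnas, self.terminator]
        return ["5' ITR", *core, "3' ITR"] if self.has_itrs else core


def check_library(vectors: List[GrnaVector]) -> None:
    """Raise if a vector has no gRNA or if two vectors share a gRNA set."""
    seen = set()
    for vector in vectors:
        if not vector.grnas:
            raise ValueError("each vector needs at least one gRNA")
        key = frozenset(vector.grnas)
        if key in seen:
            raise ValueError(f"duplicate gRNA combination: {sorted(key)}")
        seen.add(key)


# Hypothetical two-vector library: one single gRNA, one gRNA combination.
library = [
    GrnaVector("U6", ("gRNA_geneA",), "polyT"),
    GrnaVector("U6", ("gRNA_geneB", "gRNA_geneC"), "polyT"),
]
check_library(library)
print(library[1].layout())
```

The uniqueness check mirrors the requirement that each vector of the plurality carries a different gRNA or gRNA combination.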
- the expression vector is delivered via a peptide-based delivery system using cell-penetrating peptides.
- the expression vector is delivered via a lipid-based delivery system via lipid nano-particles.
- the expression vector is delivered via an inorganic delivery system using black phosphorus, graphene oxide, mesoporous silica nanoparticles, or gold nanoparticles.
- the expression vector is delivered via a polymeric delivery system, wherein the expression vector is incorporated into a polymer.
- the expression vector is delivered via electroporation.
- Lan et al. (Mol Cancer 21, 71 (2022)) review non-viral delivery methods; the review is incorporated by reference herein.
- Step b1 keeping the organism under conditions allowing the organism to live or the organoid under conditions allowing the organoid to stay intact.
- these conditions are physiological conditions.
- these conditions are pathological conditions which impose a particular burden on the organism or organoid.
- Step c isolating a sample of the tissue of interest from the organism, or a sample of the organoid. This sample may comprise a complete organ of the organism or only a subfraction of an organ.
- Step d in a collection step, collecting cells or nuclei from the sample of the tissue of interest or of the organoid.
- Step e in an analysis step, performing a single-cell or single-nucleus assay comprising an analysis of the gene perturbation and comprising gRNA sequencing of each cell or nucleus of the collected cells or nuclei, thereby generating a plurality of assay patterns related to expression of a defined gRNA or a defined gRNA combination.
- the analysis of the gene perturbation is an assay or a combination of assays which captures a certain feature of the analyzed cell.
- With this assay, it is possible to relate the perturbation of expression via the gRNA to a phenotype of the cell in vivo. There are multiple assays which can be performed on a single-cell level.
- An assay pattern reflects the phenotype of the cell for a certain parameter.
- the assay of the analysis step comprises single-cell or single-nucleus RNA sequencing.
- Each cell has a unique barcode that is present in both the mRNA and gRNA fractions.
- One uses this barcode to make connections between mRNA information and gRNA expression.
- gRNA expression is an indirect way to identify the mutated gene.
- the gRNA indicates which gene was perturbed (knocked out, inhibited, activated) in that cell.
- the mRNA information from that cell is informative about the impact of perturbing that gene.
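The barcode-based linking of the mRNA and gRNA fractions can be sketched as follows. This is a minimal illustration, not the applicants' pipeline: the toy reads, the barcodes, and the rule of assigning each cell its most abundant gRNA are assumptions made for the example. Cells carrying a gRNA combination would need a different assignment rule.

```python
from collections import Counter, defaultdict

# Hypothetical toy reads: (cell barcode, feature) pairs from the two libraries.
mrna_reads = [("AAAC", "Gene1"), ("AAAC", "Gene2"), ("AAAC", "Gene1"),
              ("CCGT", "Gene1"), ("CCGT", "Gene3")]
grna_reads = [("AAAC", "gRNA_geneA"), ("AAAC", "gRNA_geneA"),
              ("CCGT", "gRNA_geneB")]


def count_by_barcode(reads):
    """Count features separately for every cell barcode."""
    counts = defaultdict(Counter)
    for barcode, feature in reads:
        counts[barcode][feature] += 1
    return counts


mrna = count_by_barcode(mrna_reads)
grna = count_by_barcode(grna_reads)

# Link the two fractions through the barcode they share.
cells = {}
for barcode in sorted(mrna.keys() & grna.keys()):
    # Assumption: the most abundant gRNA identifies the perturbation.
    top_grna, _ = grna[barcode].most_common(1)[0]
    cells[barcode] = {"grna": top_grna, "expression": dict(mrna[barcode])}

for barcode, record in cells.items():
    print(barcode, record["grna"], record["expression"])
```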
- the assay of the analysis step comprises single-cell or single-nucleus DNA sequencing.
- the assay of the analysis step comprises single-cell quantification of surface proteins, particularly cellular indexing of transcriptomes and epitopes by sequencing (CITE-seq), and the assay patterns are protein patterns, particularly surface protein patterns.
- Surface protein quantification is performed as described in Stoeckius, M. et al. Simultaneous epitope and transcriptome measurement in single cells. Nat Methods 14, 865-868 (2017).
- the assay of the analysis step comprises single-cell or single-nucleus quantification of cytosolic and nuclear proteins.
- This assay uses antibodies labelled with short oligonucleotides to indirectly quantify proteins of interest.
- cells are treated with oligo-tagged antibodies that bind to the protein(s) of interest.
- Cells are then used to prepare single-cell RNA-seq libraries with modified protocols that capture both mRNA and oligo-tags. Quantification of the oligo tags indirectly tells the quantity of protein present in the cell. Cytoplasmic protein quantification is performed as described in Katzenelenbogen, Y. et al. Coupled scRNA-Seq and Intracellular Protein Activity Reveal an Immunosuppressive Role of TREM2 in Cancer. Cell 182, 872-885.e19 (2020).
- the assay of the analysis step comprises single-cell or single-nucleus quantification of histone marks.
- This assay uses an antibody binding to a histone modifying protein. After binding to the target protein, the antibody recruits a DNA cutting enzyme that cuts DNA around the specific histone modification. Deep sequencing of the cut DNA allows one to understand the precise original genomic localization of the histone modification. Quantification of histone marks is performed as described in Bartosovic, M., Kabbe, M. & Castelo-Branco, G. Single-cell CUT&Tag profiles histone modifications and transcription factors in complex tissues. Nat Biotechnol 39, 825-835 (2021).
- the assay of the analysis step comprises single-cell or single-nucleus mRNA sequencing, and the assay patterns are mRNA expression patterns.
- the assay of the analysis step comprises an assay for transposase-accessible chromatin with sequencing (ATAC-seq), and the assay patterns are chromatin accessibility patterns.
- the assay patterns are clustered by their cell type of origin, thereby generating an assay profile for each cell type in the tissue of interest.
- the assay patterns are clustered by their type of expressed gRNA, thereby generating an assay profile for each gene perturbed by a gRNA or a gRNA combination. Via clustering, the read-out of multiple cells of the same cell type and having the same gRNA perturbation can be combined to increase the signal-to-noise ratio.
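The clustering described above, in which cells of the same cell type carrying the same gRNA are combined, can be sketched as a simple group-and-average step. The records, the expression values, and the "control" gRNA label are hypothetical; a control group is an assumption of this example and is not named in the text above.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical per-cell records: cell type, assigned gRNA, expression of one gene.
cells = [
    {"cell_type": "neuron", "grna": "gRNA_geneA", "expr": 4.0},
    {"cell_type": "neuron", "grna": "gRNA_geneA", "expr": 6.0},
    {"cell_type": "neuron", "grna": "control", "expr": 10.0},
    {"cell_type": "neuron", "grna": "control", "expr": 12.0},
    {"cell_type": "astrocyte", "grna": "gRNA_geneA", "expr": 3.0},
]


def assay_profiles(records):
    """Average the read-out over all cells sharing (cell type, gRNA)."""
    groups = defaultdict(list)
    for record in records:
        groups[(record["cell_type"], record["grna"])].append(record["expr"])
    return {key: mean(values) for key, values in groups.items()}


profiles = assay_profiles(cells)

# Effect of the perturbation within one cell type, relative to the assumed control.
effect = profiles[("neuron", "gRNA_geneA")] - profiles[("neuron", "control")]
print(profiles)
print(effect)  # -6.0
```

Averaging over many cells of one group is what reduces per-cell noise; grouping by cell type first keeps cell-type differences from being mistaken for perturbation effects.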
- each vector of the plurality of gRNA-delivering nucleic acid expression vectors comprises a reporter gene under control of a reporter gene promoter, wherein said reporter gene encodes a reporter protein, wherein said reporter protein is detectable when expressed inside a cell and enables selective isolation of cells that express said reporter protein in the collection step.
- This reporter protein facilitates the analysis, because the cells which express a gRNA can be distinguished from non-expressing cells, e.g., via their fluorescence.
- cells or nuclei are collected selectively from the tissue of interest that exhibit expression of said reporter protein.
- the reporter protein is a fluorescent protein.
- the collection step is performed via fluorescence-activated cell sorting (FACS) or fluorescence-activated nucleus sorting (FANS).
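Selective collection of reporter-expressing cells or nuclei amounts to applying a gate on the reporter signal. The sketch below is a deliberately simplified model of that idea; the event records, the signal values, and the threshold are hypothetical, and real FACS or FANS gating uses several parameters set on the instrument.

```python
def collect_reporter_positive(events, threshold):
    """Keep events (cells or nuclei) whose reporter signal meets the gate."""
    return [event for event in events if event["reporter_signal"] >= threshold]


# Hypothetical sorter events with an arbitrary fluorescence unit.
events = [
    {"id": "n1", "reporter_signal": 1200.0},
    {"id": "n2", "reporter_signal": 35.0},
    {"id": "n3", "reporter_signal": 800.0},
]

positive = collect_reporter_positive(events, threshold=500.0)
print([event["id"] for event in positive])  # ['n1', 'n3']
```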
- FACS fluorescence-activated cell sorting
- FANS fluorescence-activated nucleus sorting
- nuclei are collected.
- the reporter protein comprises a polypeptide part localizing to a nucleus of a cell, particularly wherein the reporter protein comprises a polypeptide part interacting with a membrane of a nucleus, particularly wherein the reporter protein comprises a KASH (Klarsicht, ANC-1, Syne Homology) domain.
- the reporter protein comprises a polypeptide part localizing to a nucleus of a cell, particularly wherein the reporter protein comprises a NLS (nuclear localization sequence) domain.
- whole cells are collected.
- patterns are clustered first by their cell type.
- said organism expresses a gene (from its genome or delivered via a vector) encoding a recombinase enzyme (under control of a promoter), and wherein activation of expression of said Cas enzyme is mediated via said recombinase enzyme.
- the recombinase enzyme is a Cre enzyme.
- an organism expressing a Cas enzyme on its genome has an LSL (lox-stop-lox) sequence between the promoter and the Cas coding sequence. This LSL sequence blocks expression of the Cas enzyme. Once the Cre protein is expressed, the Cre protein removes the LSL sequence and unblocks the expression of Cas enzyme.
- Cre can be delivered via AAV, but it’s also possible to cross-breed the LSL-Cas animal to another animal that expresses Cre from its genome.
- the recombinase enzyme is a Flp enzyme.
- the promoter of the reporter/Cre is a pol-II promoter.
- the gRNA promoter is a pol-II promoter. In certain embodiments, the gRNA promoter is a pol-III promoter.
- said reporter gene promoter is an RNA-polymerase II promoter.
- the gRNA-delivering nucleic acid expression vector is a viral vector. In certain embodiments, the gRNA-delivering nucleic acid expression vector is selected from the group of an AAV vector, an adenoviral vector, a rabies vector, a Sindbis vector, and a lentiviral vector. In certain embodiments, the gRNA-delivering nucleic acid expression vector is an AAV vector.
- the organism is an animal. In certain embodiments, the organism is a vertebrate. In certain embodiments, the organism is a mammal, particularly a mammal selected from the group of a rodent, a primate, an ungulate (particularly an artiodactyla or a perissodactyla), a lagomorph, a carnivore, an insectivore, and a chiroptera.
- the Cas enzyme is Cas9.
- the gene encoding the Cas enzyme is introduced into the germline of the organism.
- the gene encoding the Cas enzyme is delivered via a (particularly viral) vector.
- the gRNA promoter and the reporter gene promoter are two distinct promoters.
- the gRNA promoter is a tissue-specific promoter.
- the gRNA promoter is an inducible or conditional promoter. In certain embodiments, the gRNA promoter is a promoter selected from the group of a Tet-ON promoter, a Tet- OFF promoter, and a Cre-dependent promoter.
- 5’ capture sequencing is performed.
- Infected nuclei are subjected to a modified 5’ single-cell library preparation protocol to capture both mRNA and gRNA information.
- a. The reaction is modified to include a gRNA-specific reverse transcription primer to capture gRNA alongside mRNA.
- b. mRNA and gRNA molecules are separated to create two independent libraries. The mRNA library is bigger than 300 bp, while the gRNA library is approximately 180 bp. The two fractions are separated using a bead-based protocol that sequesters and removes molecules bigger than 300 bp (mRNA) and keeps the gRNA molecules. Both fractions are prepared separately during the following steps.
- c. The mRNA library is processed according to the kit’s instructions.
- d. The gRNA library is PCR amplified and indexed prior to Illumina deep sequencing.
- For 5’ capture sequencing, a single-cell RNA-seq preparation kit is used that barcodes RNA at the 5’ end (e.g., the 10x Genomics Chromium 5’ kit).
- the kit’s protocol was modified to additionally include a reverse transcription primer specific to the gRNA. This primer mediates capture and barcoding of gRNA molecules from each single cell, alongside the protocol’s standard capture of mRNA.
- capture sequencing comprises the following steps:
- a scRNA-seq 5’ capturing protocol (e.g. from 10x Genomics), comprising the steps:
- a scRNA-seq platform (e.g. from 10x Genomics) that barcodes the RNA molecules from the 5’ position;
- RT reverse transcriptase
- RT reverse transcription
- the invention can be described as follows: For a given tissue, we infect a percentage of cells with a nucleic acid expression vector (not all cells in the tissue are infected, but later cell or nuclei sorting, based on fluorescent protein expression, restricts the analysis to infected cells). Each infected cell carries a perturbation in one or more genes. The number of genes tested in parallel can range from 2 to more than 20,000. We assume that a pool of single cells carrying the same perturbation is representative of the effect of that perturbation in that tissue and cell type.
- the therapy in this case could be seemingly any therapeutic modality that modifies the target in the same way as the genetic perturbation.
- the target was an enzyme
- a small molecule drug inhibiting that enzyme could have the same capacity to rescue the disease state.
- a further aspect of the invention relates to the use of a plurality of viral gRNA-delivering nucleic acid expression vectors in a method according to the first aspect; each vector comprising: i.) inverted-terminal repeats (ITRs); ii.) a gRNA promoter; iii.) at least one guide-RNA (gRNA) under control of said gRNA promoter, wherein each vector comprises a different gRNA or gRNA combination; and iv.) a terminator of transcription.
- ITRs inverted-terminal repeats
- gRNA guide-RNA
- Our invention specifically relates to a method that allows for single-cell-based analysis of multiple CRISPR-mediated gene perturbations within a single organism in vivo. This capability is significantly different from US 2020 018 746 A1, which focuses on 3D tissues and does not cover single-cell resolution in live animals.
- an AAV-based single-cell CRISPR screening method, AAV-Perturb-seq, which is broadly applicable for in vivo functional genomics studies. Unlike the methods in US 2020 018 746 A1, our approach ensures efficient gRNA expression and detection within single-cell libraries, optimized for large numbers of single nuclei from complex tissues isolated from animals.
- the systemic delivery via intravenous injections of AAV vectors in our method enables targeting a wide range of tissues and cell types in animals of any age. This systemic delivery is tunable and provides broader application compared to the approaches described in US 2020 018 746 A1.
- Our invention allows for single-cell-based analysis of multiple CRISPR-mediated gene perturbations within a single organism in vivo. This enables detailed and comprehensive mapping of gene functions at a single-cell resolution in a living animal, a capability not emphasized or developed in US 2021 172 017 A1.
- Our invention features systemic delivery of AAV vectors via intravenous injections, enabling the targeting of a wide range of tissues and cell types across the entire body in animals of any age. This broad targeting capability allows for more comprehensive studies of gene functions and interactions in various tissues simultaneously, which is not covered by US 2021 172 017 A1's approach. US 2021 172 017 A1 only covers in vitro applications.
- Our invention involves a method for analyzing multiple gene perturbations in vivo in a tissue of interest, which includes the administration of a plurality of viral gRNA-delivering nucleic acid expression vectors.
- This method allows for single-cell or single-nucleus assays to analyze gene perturbation, including gRNA sequencing of each cell or nucleus.
- the specificity lies in the detailed steps for isolating tissues, collecting cells or nuclei, and performing various assays (e.g., RNA sequencing, DNA sequencing, protein quantification).
- the invention includes detailed protocols for systemic delivery, enabling targeting of a wide range of tissues and cell types in animals of any age, and incorporates numerous specific assays (e.g., single-cell RNA-seq, CITE-seq, ATAC-seq). This broader scope is not suggested or implied by WO 2019 113 499 A1.
- the method involves administering a plurality of viral gRNA-delivering nucleic acid expression vectors and performing single-cell or single-nucleus assays to analyze gene perturbation at a high resolution.
- This broad applicability to different tissues and comprehensive analysis distinguishes our invention from WO 2015 089 462 A1, which focuses specifically on the brain and electrophysiological recording.
- Our invention facilitates the targeting of multiple genes simultaneously using a library of gRNA-delivering vectors.
- This method can perturb multiple genes in parallel, enabling high-throughput studies of complex genetic interactions across various tissues.
- This capability for parallel perturbation and analysis of multiple genes is a significant advancement over the single-gene focus described in WO 2015 089 462 A1.
- a method for analyzing multiple gene perturbations in vivo in a tissue of interest comprising the steps: a. providing an organoid or a non-human organism, particularly a non-human organism; b. administering a plurality of gRNA-delivering nucleic acid expression vectors to the organoid or organism, each vector comprising: i. a gRNA promoter ii. at least one guide-RNA (gRNA) under control of said gRNA promoter, wherein each vector comprises a different gRNA or gRNA combination; and
- the assay of the analysis step comprises transposase-accessible chromatin with sequencing (ATAC-seq), and the assay patterns are chromatin accessibility patterns.
- the assay of the analysis step comprises cellular indexing of transcriptomes and epitopes by sequencing (CITE-seq), and the assay patterns are protein patterns, particularly surface protein patterns.
- the assay patterns are clustered by their type of cell or origin, thereby generating an assay profile for each cell type in the tissue of interest.
- the assay patterns are clustered by their type of expressed gRNA, thereby generating an assay profile for each gene perturbed by a gRNA or a gRNA combination.
- each vector of the plurality of gRNA-delivering nucleic acid expression vectors comprises a reporter gene under control of a reporter gene promoter, wherein said reporter gene encodes a reporter protein, wherein said reporter protein enables selective isolation of cells that express said reporter protein in the collection step.
- the reporter protein comprises a polypeptide part localizing to a nucleus of a cell, particularly wherein the reporter protein comprises a polypeptide part interacting with a membrane of a nucleus, particularly wherein the reporter protein comprises a KASH domain.
- the reporter protein comprises a polypeptide part localizing to a nucleus of a cell, particularly wherein the reporter protein comprises a NLS (nuclear localization sequence) domain.
- the gRNA-delivering nucleic acid expression vector is a viral vector, particularly wherein the gRNA-delivering nucleic acid expression vector is an AAV vector, an adenoviral vector, or a lentiviral vector, more particularly wherein the gRNA-delivering nucleic acid expression vector is an AAV vector.
- the organism is an animal, particularly wherein the organism is a vertebrate, more particularly wherein the organism is a mammal, most particularly a mammal selected from the group of a rodent, a primate, an ungulate, a lagomorph, a carnivore, an insectivore, and a chiroptera.
- gRNA promoter is a tissue-specific promoter.
- the gRNA promoter is an inducible or conditional promoter, particularly a promoter selected from the group of a Tet-ON promoter, a Tet-OFF promoter, and a Cre-dependent promoter.
- a method for analyzing multiple gene perturbations in vivo in a tissue of interest comprising the steps: a. providing an organoid or a non-human organism, particularly a non-human organism; b. administering a plurality of viral gRNA-delivering nucleic acid expression vectors to the organoid or organism, each vector comprising: i. inverted-terminal repeats (ITRs); ii. a gRNA promoter;
- each vector comprises a different gRNA or gRNA combination; and iv. a terminator of transcription; wherein the organism expresses a Cas enzyme, or said vector additionally encodes a Cas enzyme under control of a promoter operable in said cell; c. isolating a sample of the tissue of interest from the organism, or a sample of the organoid; d. in a collection step, collecting cells or nuclei from the sample; e.
- an analysis step performing a single-cell or single-nucleus assay comprising an analysis of the gene perturbation and comprising gRNA sequencing of each cell or nucleus, thereby generating a plurality of assay patterns related to expression of a defined gRNA or a defined gRNA combination.
- the assay of the analysis step comprises a method selected from the group of
- the assay patterns are mRNA expression patterns
- the assay patterns are protein patterns, particularly surface protein patterns.
- each vector of the plurality of gRNA-delivering nucleic acid expression vectors comprises a reporter gene under control of a reporter gene promoter, wherein said reporter gene encodes a reporter protein, wherein said reporter protein enables selective isolation of cells that express said reporter protein in the collection step, particularly wherein in the collection step, cells or nuclei are collected selectively from the tissue of interest that exhibit expression of said reporter protein, more particularly wherein the reporter protein is a fluorescent protein.
- the reporter protein comprises a polypeptide part localizing to a nucleus of a cell, particularly wherein
- the reporter protein comprises a polypeptide part localizing to a nucleus of a cell, particularly wherein the reporter protein comprises a NLS (nuclear localization sequence) domain.
- NLS nuclear localization sequence
- the gRNA-delivering nucleic acid expression vector is an AAV vector, an adenoviral vector, a rabies vector, a Sindbis vector, or a lentiviral vector, more particularly wherein the gRNA-delivering nucleic acid expression vector is an AAV vector.
- the organism is an animal, particularly wherein the organism is a vertebrate, more particularly wherein the organism is a mammal, most particularly a mammal selected from the group of a rodent, a primate, an ungulate, a lagomorph, a carnivore, an insectivore, and a chiroptera.
- the Cas enzyme is Cas9.
- the gene encoding the Cas enzyme is Cas9.
- gRNA promoter and the reporter gene promoter are two distinct promoters.
- the gRNA promoter is a tissue-specific promoter.
- the gRNA promoter is an inducible or conditional promoter, particularly a promoter selected from the group of a Tet-ON promoter, a Tet-OFF promoter, and a Cre-dependent promoter.
- Fig. 2 Perturbation of 22q11.2-linked genes Dgcr8, Dgcr14, Gnb1l, and Ufd1l results in strong transcriptional changes in adult brain cell types
- a Schematic of the analysis pipeline (SH control: nuclei with control gRNAs targeting safe-harbor locus; P: perturbation; n: total number of perturbations; LFC: log fold change)
- b Number of differentially expressed genes (DEG) for all perturbations in individual cell types. Dashed line indicates 5 DEGs with an adjusted p-value (p.adj) lower than 0.05.
- Fig. 3 Perturbation of 22q11.2 genes results in the disruption of distinct sets of biological processes. a. Schematic representation of arrayed validation experiments. b. Pearson correlation and hierarchical clustering of transcriptional signatures (LFC values) mediated by Dgcr8, Dgcr14, or Gnb1l perturbation in pooled screen and arrayed confirmation experiments for each neuron type. c. Heatmap showing the six transcriptional programs (grouped rows) altered in Dgcr8, Dgcr14, and Gnb1l perturbed cells (columns) across cell types and experiments (screen or arrayed). Left: LFC values for each altered gene across neuron types and experiments.
- FIG. 6 AAV injection and nuclei isolation conditions
- a Schematic representation of AAV genomes used to deliver and express mTagBFP, Venus, or mCherry under the control of the CBh promoter
- b Schematic representation of the triple color experiment.
- An equal-ratio mix of the three AAVs was injected in LSL-Cas9 animals at different doses (Low: 2.5 × 10^9; Medium: 5.0 × 10^9; High: 2.5 × 10^10 total AAV particles).
- c Percentage of infected nuclei (i.e., nuclei expressing at least one fluorescent protein) after systemic injection of different viral doses.
- d Percentage of infected nuclei expressing one, two, or all three FPs. Data shown for injections with 5.0 × 10^9 and 2.5 × 10^10 total AAV particles. e Fluorescence imaging of brain cells expressing GFP four weeks after systemic injection of 5.0 × 10^9 AAV particles. f Flow cytometry gating strategy to sort GFP-positive nuclei.
- FIG. 7 Astrocyte-specific pooled screen. a. Schematic representation of the AAV genome engineered to express, b. UMAP embedding of ~35,000 AAV.PHP.B-infected nuclei isolated from the mouse prefrontal cortex. c. Abundance of cell types in single-nucleus datasets generated from brain cells infected with CBh and GfaABC1D AAVs. d. Percentage of gRNAs detected per nucleus in astrocytes. e. Number of DEGs for all perturbations in astrocytes.
- Fig. 8 In vivo CRISPR screening in a high-fat diet MASH model. a. Animals were injected with genetic interventions and exposed to HFD for five months. b. Bulk gRNA count analysis revealed genes involved in hepatocyte damage. c. Bulk gRNA count prioritizes interventions with therapeutic potential. Interventions highlighted in red are examples of positive controls known to have an effect and support the ability of our platform to pinpoint potential therapeutic targets.
- FIG. 9 Microglia-specific AAV capsids. a. Average number of nuclei per perturbation across cell types using AAV.PHP.B for gRNA library delivery, as reported in Santinha et al. 2023. b. Experimental design to test microglia-specific AAV capsids. Viruses were tail vein injected (10^12 particles per animal), and the number of infected microglia was assessed three weeks later. c. Flow cytometry results for CD11b-positive microglia for two controls (PBS and PHP.eB) and four microglia-specific AAV capsids (AAV M1 - M4).
- Example 1 In vivo single-nucleus pooled CRISPR screening in the adult brain enabled through systemic administration of AAV.PHP.B and 5’ gRNA capture
- AAV transfer plasmids to independently express either mTagBFP, Venus, or mCherry under the control of a ubiquitous CBh promoter (Fig. 6a).
- FP fluorescent protein
- Each fluorescent protein (FP) was additionally fused to a KASH domain, which physically attaches proteins to the nuclear membrane, thus enabling nuclei sorting.
- Example 2 AAV-Perturb-seq of 22q11.2 DS genes yields a rich single-nucleus dataset spanning genes and brain cell types from adult mice
- Example 3 Perturbation of Dgcr8, Dgcr14, Gnb1l, or Ufd1l results in strong transcriptional changes in prefrontal cortex neurons
- We developed a data analysis pipeline to associate gRNAs, and thus genetic perturbations, with cell type-specific transcriptional phenotypes (Fig. 2a).
- We create pseudobulk profiles by aggregating nuclei with the same perturbation and employ edgeR (Robinson, M. D. et al., Bioinforma. Oxf. Engl. 26, 139 (2010)) to calculate pairwise differential expression (DE) between control and each perturbation in superficial and deep layer excitatory neurons, interneurons, astrocytes, and oligodendrocytes.
- DE pairwise differential expression
- Our choice of using pseudobulk profiles is supported by recent benchmarking studies indicating that commonly used single-cell-specific DE methods tend to identify differentially expressed genes (DEG) in the absence of biological differences.
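The pseudobulk aggregation described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the edgeR workflow used in the study: counts from nuclei sharing a cell type and a gRNA label are summed, scaled to counts per million, and compared with the safe-harbor (SH) control as a log2 ratio with a pseudocount. The function names, the pseudocount, and the toy count matrix are all hypothetical.

```python
import numpy as np

def pseudobulk(counts, cell_types, perturbations):
    """Sum per-nucleus counts into one profile per (cell type, perturbation)."""
    profiles = {}
    for row, cell_type, perturbation in zip(counts, cell_types, perturbations):
        key = (cell_type, perturbation)
        profiles[key] = profiles.get(key, 0) + np.asarray(row, dtype=float)
    return profiles

def log_fold_change(perturbed, control, pseudocount=1.0):
    """log2 ratio of counts per million; a simple stand-in for edgeR's estimate."""
    cpm_perturbed = perturbed / perturbed.sum() * 1e6
    cpm_control = control / control.sum() * 1e6
    return np.log2((cpm_perturbed + pseudocount) / (cpm_control + pseudocount))

# Toy data (invented): 4 nuclei x 3 genes, one cell type, control vs. one perturbation.
counts = np.array([[10, 5, 85], [10, 5, 85], [5, 5, 90], [5, 5, 90]])
cell_types = ["neuron"] * 4
perturbations = ["SH", "SH", "Dgcr8", "Dgcr8"]

profiles = pseudobulk(counts, cell_types, perturbations)
lfc = log_fold_change(profiles[("neuron", "Dgcr8")], profiles[("neuron", "SH")])
print(lfc)  # one LFC value per gene
```

A real analysis would replace `log_fold_change` with a count-based statistical model such as edgeR, which also provides dispersion estimates and adjusted p-values.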
- Example 4 Altered transcriptional phenotypes are due to gene function and not a consequence of gene editing efficiency
- LDA linear discriminant analysis
- Example 6 Perturbation of 22q11.2-associated genes results in heterozygous and homozygous cells with similar transcriptional phenotypes
- the control of zygosity is a general challenge in CRISPR screens, as the expression of Cas9 and gRNA can lead to three potential scenarios: 1) the cells are infected but not edited and are thus wild-type (WT); 2) the cells are infected and acquire a heterozygous mutation; or 3) the cells are infected and acquire a homozygous mutation. While WT cells do not contribute to the observed transcriptional phenotypes and are removed by our filtering strategy, it is unclear whether heterozygous and homozygous mutations lead to the same transcriptional phenotype. This is especially important for modelling haploinsufficiency, as is the case for 22q11.2 DS.
- CRISPR inhibition (CRISPRi)-mediated knockdown may reduce target gene expression to levels observed in a heterozygous condition and thus be used to simulate the phenotypes generated by haploinsufficiency.
- CRISPRi CRISPR inhibition
- CRISPRi-mediated Dgcr8 mRNA reduction was comparable to the values observed for 22q11.2 DS, indicating our ability to model heterozygosity.
- CRISPRi- and CRISPR-mediated Dgcr8 perturbation led largely to analogous transcriptional phenotypes.
- Example 7 Perturbation of 22q11.2 DS genes results in the disruption of distinct sets of biological processes
- Dgcr8 encodes a component of the microprocessor complex involved in processing primary microRNA (miRNA) transcripts (pri-miRNAs) into precursor miRNAs (pre-miRNAs), which are ultimately further processed by Dicer into mature miRNAs, and has been extensively studied in the context of 22q11.2 DS. While we found that no biological pathways were disrupted in the Dgcr8 down-regulated genetic program, in the up-regulated genetic program we identified a disruption in genes related to miRNA-mediated RNA silencing (Fig. 3c), which included several long noncoding RNAs (lncRNAs) such as Mirg and Spaca6.
- lncRNAs encode pri-miRNAs and their up-regulation was previously reported in mouse models of Dgcr8 haploinsufficiency and 22q11.2 DS.
- the accumulation of these pri-miRNAs implies that there is less mature miRNA being produced.
- since mature miRNAs negatively regulate gene expression, we would expect a concordant increase in the expression of genes targeted by the disrupted miRNAs.
- miRNA-target enrichment analysis Licursi, V.
- Dgcr14 encodes for the nuclear protein DGCR14, a component of C complex spliceosomes.
- Gene ontology analysis of the down-regulated genetic program revealed the presence of genes connected with RNA binding (Fig. 3c). We found a specific enrichment for genes associated with regulation of RNA splicing and the spliceosome, supporting the involvement of Dgcr14 in RNA maturation processes.
- Gnb1l encodes a protein of unknown function that contains six WD40 repeats, which facilitate protein-protein interactions and the formation of multiprotein complexes.
- the down-regulated genetic program was enriched for genes involved in neuronal development, synaptic organization and function, and chemical transmission (Fig. 3c), including genes that encode for glutamatergic receptor subunits (Gria1, Gria4, Grik3, Grin2a, and Grin2b), regulation of a prepulse inhibition phenotype (Ctnna2 and Nrxn1), and regulation of action potential (Ank3, Cnr1, Fgf13, Foxp1, and Trpc4).
- AAV-Perturb-seq both confirms prior published work and provides new insights into the phenotypic landscape underlying 22q11.2 genes.
- our data reveal new pri-miRNA targets of Dgcr8, a disrupted balance between RNA transcription and splicing resulting from perturbation of Dgcr14, and broad dysfunction in neuronal communication linked to the synapse and glutamate signaling in Gnb1l-perturbed cells.
- these 22q11.2 genes play an active role in mature neurons in the mouse brain, which may also contribute to 22q11.2 DS symptomatology.
- Example 8 Single-nucleus prefrontal cortex atlas of a 22q11.2 DS mouse model
- Hierarchical clustering of the bulk transcriptional profiles revealed a primary clustering driven by cell type followed by a second level clustering by genotype.
- GSEA gene set enrichment analysis
- Example 9 Transcriptional changes found in LgDel model neurons are partially explained by perturbation of Dgcr8, Dgcr14, and Gnb1l
- We set out to quantify the extent to which individual perturbations explain the transcriptional signature observed in LgDel neurons.
- the Dgcr8 contribution was focused on up-regulated genes mostly associated with the accumulation of miRNA primary genes (Mirg, Spaca6, Mir9-3hg, and Mir181a-1hg).
- the smaller Dgcr14 contribution included down-regulated spliceosomal genes Srsf1, Srsf2, and Srsf6, while the Gnb1l contribution was primarily related to down-regulation of genes involved with synapse signaling (Fig. 5).
- An in vivo cell type-specific screen could in principle be achieved using cell type-specific delivery or expression as well as through physical enrichment of the cell type of interest.
- NASH is characterized by an excessive accumulation of fat in hepatocytes. At least 20% of patients progress to severe liver disease, in which the excessive fat causes cell damage and initiates a cascade of inflammatory events - mediated by Kupffer cells and hepatic stellate cells (HSC) - that lead to tissue fibrosis and scarring. If not addressed, this progression can culminate in cirrhosis and liver failure, which are major risk factors for the most common liver cancer, hepatocellular carcinoma.
- HSC hepatic stellate cells
- AAV-Perturb-seq is particularly well-positioned to identify therapeutic targets for the disease.
- NASH mouse model. There are several types of murine models of NASH, ranging from genetic to chemical to dietary. We chose to use a dietary model based on a high-fat diet (HFD). This model has been reported to better mimic the pathological features observed in human patients.
- HFD high-fat diet
- AAV-Perturb-seq can prioritize interventions with therapeutic potential and set the stage for the identification of novel targets for the treatment of human diseases.
- AAV-Perturb-seq offers, for the first time, the possibility of identifying genetic targets able to modulate microglia states and, consequently, disease progression.
- AAV.PHP.B AAV capsid specifically developed to target neuronal cells in the mouse brain.
- AAV serotype proved to infect all major brain cell types but to different extents (Figure 9a).
- cell type-specific delivery of gRNA libraries by using cell type-specific AAV capsids will permit better control over the number of cells per perturbation.
- microglia are the least abundant cell type in an AAV-Perturb-seq experiment (Figure 9a).
- we set out to identify microglia-specific AAV capsids.
- To test these capsids we produced AAV particles with four evolved capsids (M1 - M4) and included one neuron-specific capsid (AAV.PHP.eB) as a control.
- These viruses carried a GFP transgene to report successful infections.
- AAV-Perturb-seq, a direct in vivo single-cell CRISPR screening method that is tunable, scalable, and broadly applicable for systematically interrogating genetic elements in vivo in high throughput.
- AAV.PHP.B encoding a library of CRISPR gRNAs targeting genes linked to 22q11.2 DS.
- AAV-Perturb-seq offers for the first time the opportunity to directly interrogate multiple genes in several cell types at a single-cell level in the same animal without restriction to tissue or developmental time points, opening immense further possibilities for studying processes of health and disease in vivo.
- Gnb1l-perturbed neurons displayed altered gene expression related to synaptic signaling, strongly suggesting that heterozygous loss of Gnb1l may result in impaired neuronal communication throughout development and contribute to the emergence of alterations in neuronal functioning. This hypothesis is further supported by the observation of reduced expression levels of Gnb1l in postmortem brain samples of schizophrenia patients with and without 22q11.2 deletion and by the deficits in synaptic signaling and in behavior related to schizophrenia and ASD found in Gnb1l+/- mouse models.
- a promising area for further study is determining whether 22q11.2 DS-associated neuronal and cognitive phenotypes can be rescued exclusively through restoring Dgcr8, Dgcr14, Gnb1l, and/or Ufd1l expression during or after development.
- AAV-Perturb-seq will broadly enable the interrogation of genotype-phenotype landscapes directly in vivo in different tissues, cell types, developmental stages, and under different health and disease contexts.
- the ability to interrogate complex in vivo biology at scale could lead to breakthroughs in our causal understanding of biological and disease mechanisms as well as our capacity to identify genetic interventions and targets for treating disease.
- AAV compared to lentivirus (LV) for in vivo delivery. In contemporary research, AAV is the vastly preferred modality for in vivo delivery, for several reasons. These advantages, as well as the disadvantages, are thoroughly covered by many recent reviews on in vivo delivery (Asokan, A., et al., Molecular Therapy 20, 699-708 (2012); Mingozzi, F. et al., Nat Rev Genet 12, 341-355 (2011)). Below, we highlight the main advantages and disadvantages relevant for single-cell CRISPR screening.
- AAVs can be injected systemically and infect seemingly any organ and cell type in a tunable way. If LV is injected systemically, it is only capable of infecting a few cells in the liver. If LV is injected within a compartment (e.g., via intraperitoneal, intrathecal, or intracerebroventricular injection) it mostly infects cells along the barrier without penetrating deeply into tissue. Thus, in vivo delivery of LV almost always requires direct injection into the organ of interest.
- When injected systemically, AAVs can infect almost any tissue or cell type in a tunable way. Unlike LV, which is difficult to modify and target to specific cell types, AAV is very easy to modify and preferentially target to specific cell types thanks to natural serotypes and engineered/evolved capsid variants. If natural AAV serotypes are injected systemically, they show unique (i.e., tunable) biodistributions. AAV capsid proteins can further be engineered or evolved to enable the preferential targeting of (new) cell types of interest or the passing of physiological barriers such as the blood-brain barrier. In our study, we leveraged the evolved AAV capsid PHP.B, which enabled us to achieve brain-wide infection with a simple systemic (tail vein) injection.
- AAVs do not generally require direct injection into tissue, thus avoiding the need for surgical procedures and complicated ethics approvals.
- Direct injection into tissue often requires a surgical procedure, which in the case of brain delivery is a craniotomy.
- the drawbacks of this are plentiful.
- Surgery, compared to systemic delivery, is labor intensive, requires expert knowledge, is low throughput, leads to increased mortality in experimental mice, requires increased post-surgical monitoring, and involves a more elaborate ethics approval process (due to increased severity and stress to the animals).
- AAVs do not generally require direct injection into tissue, thus avoiding tissue damage and confounding alterations to cell states. Direct injection into tissue causes damage, resulting in altered cell states. For example, along the injection track created during cranial/brain injections, it is extremely common to find reactive astrocytes and activated microglia, which can confound phenotypes of interest, especially when using single-cell methods.
- AAV sparsely infects a large number of cells across a tissue. With direct injection of a virus into tissue, it is extremely difficult if not impossible to infect a large number of cells while controlling the MOI. Systemically injecting AAV made it simple for us to titrate the amount of virus to optimize the number of infected cells with a single infection.
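The trade-off between the number of infected cells and single infections can be illustrated with a simple model. The sketch below assumes that the number of vector genomes per cell follows a Poisson distribution whose mean is the effective MOI. That assumption is a common simplification and is not stated in the source; real tissues, with unequal tropism across cell types, will deviate from it. The function names are hypothetical.

```python
import math

def infected_fraction(moi):
    """Fraction of cells with at least one vector genome under a Poisson model."""
    if moi <= 0:
        raise ValueError("moi must be positive")
    return 1.0 - math.exp(-moi)

def single_infection_share(moi):
    """Among infected cells, the share carrying exactly one vector genome."""
    return moi * math.exp(-moi) / infected_fraction(moi)

# Lower effective MOI: fewer infected cells, but most of them singly infected.
for moi in (0.05, 0.1, 0.5, 1.0):
    print(f"MOI {moi}: infected {infected_fraction(moi):.3f}, "
          f"single among infected {single_infection_share(moi):.3f}")
```

Under this model, lowering the dose trades the total number of infected cells for a higher share of single infections, which is the quantity a titration aims to balance.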
- LV has a narrow range of utility. Beyond what has already been discussed above, the only case that we are aware of where delivering LV in vivo is commonly done and useful is in the context of development where LV is injected before or shortly after birth. When LV is injected in utero or postnatally in the ventricle of the brain, it is possible to achieve brain-wide infection. Intracerebral ventricular injection into neonates can lead to brain-wide infection due to the fact that the blood brain barrier is immature. In utero injection into the brain of embryos within a pregnant mother can also lead to brain-wide infection due to radial glial progenitor cells of the ventricle wall being readily infected and differentiating to give rise to transgene-expressing daughter cells throughout the brain.
- AAV is vastly superior to LV for in vivo delivery and AAV-Perturb-seq will therefore open new avenues for single-cell CRISPR screening in vivo.
- We created AAV transfer plasmids to test both 5’ and 3’ gRNA capture methods.
- the gRNA sequence is present within an mRNA transcript and can be captured by conventional single-cell RNA-seq (scRNA-seq) 3’ capture methods.
- the second strategy (pAS088) was designed to enable direct capture of the gRNA, wherein we designed an AAV transfer plasmid with independent gRNA and mRNA expression cassettes.
- the gRNA sequence can be directly captured via scRNA-seq 5’ capture methods, e.g., as shown by Replogle et al (Replogle, J. M.
- UMI proportion filter to correct for chimeric molecules and ambient RNA.
- the number of detected UMIs in a nucleus correlates directly with the number of reads.
- a nucleus with a higher number of UMIs is more likely to also have a higher number of UMIs coming from contaminating gRNA and mRNA.
- a gRNA associated with a small number of UMIs that represent a small proportion of the total gRNA-UMIs identified in a nucleus most likely represents a chimeric read or RNA cross contamination.
- Proportion-based filters have been used previously to address this issue. We incorporated this concept in our workflow but increased the stringency of the threshold compared to published work.
- gRNAs are filtered out from nuclei where the most abundant gRNA UMI count is 1.3x higher than the UMI counts for the second most abundant gRNA.
- Such a threshold only considers information about the two most expressed gRNAs and selects a gRNA label based on the highest expressed gRNA, even if there is a second gRNA with high counts. For instance, one cell containing 10 and 14 UMIs for two different gRNAs would be labeled as only having one gRNA.
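The difference between a top-two ratio rule and a proportion-based filter can be sketched as follows. This is an illustration only: the 1.3x ratio is taken from the text, while the 0.2 proportion cutoff and both function names are hypothetical, since the source does not state the threshold actually used.

```python
def assign_by_ratio(umi_counts, min_ratio=1.3):
    """Label a nucleus with its top gRNA if top >= min_ratio x second, else None."""
    ranked = sorted(umi_counts.items(), key=lambda kv: kv[1], reverse=True)
    if not ranked:
        return None
    if len(ranked) == 1:
        return ranked[0][0]
    (top, top_n), (_, second_n) = ranked[0], ranked[1]
    return top if top_n >= min_ratio * second_n else None

def assign_by_proportion(umi_counts, min_fraction=0.2):
    """Keep every gRNA whose share of the nucleus's gRNA UMIs reaches min_fraction.

    min_fraction is a hypothetical cutoff chosen for illustration.
    """
    total = sum(umi_counts.values())
    if total == 0:
        return []
    return [g for g, n in umi_counts.items() if n / total >= min_fraction]

# The example from the text: 14 and 10 UMIs for two different gRNAs.
nucleus = {"gRNA_A": 14, "gRNA_B": 10}
print(assign_by_ratio(nucleus))       # single label despite a strong second gRNA
print(assign_by_proportion(nucleus))  # both gRNAs retained
```

With the ratio rule the nucleus receives one label, whereas the proportion rule keeps both gRNAs, so the nucleus can be flagged as carrying more than one perturbation; a low-count gRNA such as 2 of 32 UMIs would be dropped as likely contamination.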
- Pseudobulk outperforms single-cell-specific methods in a true positive control.
- The LgDel model carries a heterozygous deletion in chromosome 16 (equivalent to the human 22q11.2 locus). We hypothesized that this can be used as a control to test DE methods, as we know that genes within the locus are being expressed from only one copy, and thus, in general, their expression should be reduced to approximately 50%.
- Our pseudobulk detects LFC values of approximately -1 (50% reduction) across deleted genes and cell types, while the logistic regression test typically applied to single-cell data presents smaller LFC values and high variance.
- An in vivo cell type-specific screen could in principle be achieved using cell type-specific delivery or expression as well as through physical enrichment of the cell type of interest.
- AAV genome plasmids (Fig. 1a, Fig. 6a) were based on the Addgene plasmid #60231 (Platt, R. J. et al., Cell 159, 440-455 (2014)). To achieve widespread transgene expression, the hSyn promoter was replaced by the ubiquitous CBh promoter (pAS088). For the triple-color experiments (Fig. 6a), the U6 expression cassette and Cre were removed, while eGFP was replaced by mTagBFP2 (Addgene plasmid #55302), Venus (Addgene plasmid #22663), or mCherry (Addgene plasmid #27970) (pAS132, pAS133, pAS134).
- the original U6 expression cassette was first removed by restriction digestion with MluI (ThermoFisher) and XbaI (ThermoFisher) from upstream of the Pol II promoter and cloned between the WPRE and poly(A) signal sequences (pAS006).
- the plasmid backbone (2.5 µg) was digested with BsmBI (ThermoFisher) for 1 hr at 37 °C followed by an inactivation step for 5 min at 80 °C.
- the Gibson Assembly reaction was set up as follows: 50 ng of digested plasmid backbone, 2 µL (200 fmol) of ssDNA oligos (stock at 100 nM), 10 µL NEBuilder® HiFi DNA Assembly Master Mix (NEB, E2621L), and H2O up to 20 µL total reaction volume. The reaction was incubated for 1 hr at 50 °C.
- Isopropanol purification was used to concentrate the cloned gRNA library by mixing the total Gibson Assembly reaction with 20 µL isopropanol, 0.2 µL GlycoBlue Coprecipitant (ThermoFisher, AM9515) and 0.4 µL NaCl solution (stock at 5 M). The precipitation reaction was incubated at room temperature (RT) for 15 min, followed by centrifugation at > 15,000 xg for 15 min at RT. The supernatant was discarded and the DNA pellet was washed with 1 mL ice-cold 80% ethanol and finally resuspended in 10 µL TE buffer.
- gRNA libraries were amplified as previously described (Joung, J. et al., Nat. Protoc. 12, 828-863 (2017)). Briefly, the plasmid library was electroporated into Endura ElectroCompetent cells (Lucigen, 60242-2) according to the manufacturer’s instructions, followed by a 1 hr recovery period at 37 °C. Bacteria were grown on a bioassay plate (Merck, D4803-1CS) for 14 hr at 37 °C. Colonies were harvested by scraping the plate surface before plasmid isolation with the QIAGEN Plasmid Maxi kit (QIAGEN, #12165) according to the manufacturer’s protocol.
- the gRNA expression cassette was PCR amplified using KAPA HiFi ReadyMix with 100 ng of the final library as template and 0.5 µM each of a custom Illumina P5 primer (AATGATACGGCGACCACCGAGATCTACAC-NNNNNNNN-ACACTCTTTCCCTACACGACGCTCTTCCGATCTGCTTTATATATCTTGTGGAAAGGACGAAACACC, SEQ ID NO 13) and a P7 primer (CAAGCAGAAGACGGCATACGAGAT-NNNNNNNN-GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCCCGACTCGGTGCCACTTTTTCAA, SEQ ID NO 14).
- PCR of the reaction mixture was performed as follows: (1) 95 °C for 3 min; (2) 98 °C for 20 s, 63 °C for 15 s, 72 °C for 20 s (18 cycles); (3) 72 °C for 2 min.
- the PCR reaction was purified with double-sided 0.6x-1.0x AMPure bead selection (A63882, Beckman Coulter). Deep sequencing libraries were sequenced using a NextSeq 550 75 cycle kit with the following cycle distribution: 75 to read 1, 8 to index 1, and 8 to index 2.
- AAVs were produced in HEK293T cells and purified by iodixanol gradient centrifugation. Briefly, HEK293T cells were expanded in DMEM (Merck) + 10% FBS (Merck) + 1% HEPES (ThermoFisher). Twenty-four hours before the beginning of AAV production, cells were seeded in 15 cm dishes (HuberLab) at a density of 0.6 M cells per mL and a total of 20 mL medium per dish.
- Cells were transiently transfected with 21 µg of an equimolar mix of the AAV genome, AAV serotype plasmid (AAV.PHP.B), and the adeno helper plasmid pAdDeltaF6 (Puresyn) using polyethyleneimine max (PEI Max).
- Harvested medium was mixed with 5x AAV precipitation buffer (400 g PEG 8000, 146.1 g NaCl in 1 L H2O) and kept at 4 °C. One day later, cells were mechanically dislodged and centrifuged at 800 xg for 15 min.
- the resulting AAV solutions were aliquoted and flash-frozen in liquid nitrogen.
- the AAV particle concentration was determined by ddPCR (BioRad). Briefly, 5 µL of isolated AAVs were diluted 10x in water and treated with DNase I (NEB, M0303S) before preparing tenfold serial dilutions with ddPCR dilution buffer [Ultrapure Water with 2 ng/µL sheared salmon sperm DNA (Thermo Fisher Scientific, AM9680) and 0.05% Pluronic F-68 (Thermo Fisher Scientific, 24040032)].
- the amplification reaction was performed as follows: (1) 95 °C for 10 min; (2) 95 °C for 30 s, 60 °C for 1 min (42 cycles); (3) 72 °C for 15 s; (4) 98 °C for 10 min. Data were collected and analyzed with the BioRad ddPCR apparatus to calculate the number of viral particles per µL.
- mice were kept under specific pathogen-free conditions on a standard light cycle.
- Six- to eight-week-old male Rosa26-LSL-Cas9 mice 1 were used unless otherwise indicated below.
- Six- to eight-week-old male dCas9-KRAB mice (JAX stock #030000) were used.
- Eight-week-old male LgDel+/+ and LgDel+/- mice 5 were used for the 22q11.2 DS model snRNA-seq cell atlas.
- Triple-color experiment: We developed the triple-color experiment to fine-tune AAV injection conditions (Fig. 6a).
- the three AAV genomes were individually packaged into the AAV.PHP.B capsid and purified as indicated in “AAV production and purification”.
- Different viral particle doses (low: 2.5 x 10^9; medium: 5.0 x 10^9; and high: 2.5 x 10^10, total number of particles) were generated by pooling equal portions of the three viruses. Animals were split into cages according to their experimental groups. After tail vein injection of 100 µL of the AAV mixtures into LSL-Cas9 mice, animals were kept for three weeks under standard conditions before tissue extraction and processing.
- AAV particles carrying gRNAs to target 22q11.2 locus genes were generated as indicated in “AAV production and purification”.
- a single dose of 5.0 x 10^9 viral particles in 100 µL total volume was injected per mouse. Animals were kept for four weeks under normal conditions before brain tissue extraction and processing.
- Viruses carrying gRNAs to target validation genes were individually prepared as in “AAV production and purification”. Animals were split into cages according to their experimental groups before tail vein injection (100 µL) of 5.0 x 10^9 viral particles carrying unique gRNAs. Animals were kept for four weeks under standard conditions before tissue extraction and processing. Animals injected with Ufd1l-targeting gRNAs presented comorbidities three weeks after injection and had to be sacrificed at that time point.
- mice were intravenously injected with a lethal dose of pentobarbital (100 mg/kg body weight) before transcardial perfusion with 15 mL of ice-cold 1x PBS followed by 15 mL of ice-cold artificial cerebrospinal fluid (aCSF, in mM: 87 NaCl, 2.5 KCl, 1.25 NaH2PO4, 26 NaHCO3, 75 sucrose, 20 glucose, 1 CaCl2, 7 MgSO4).
- the brain was removed, placed into a mouse brain matrix slicer (Zivic Instruments, BSMAS001-1), 1 mm slices were immediately snap-frozen, and the region of interest manually dissected into a frozen Eppendorf tube. Tissue samples were kept at -80 °C.
- Nuclei isolation was performed with mechanical and chemical tissue dissociation procedures.
- a tissue grinder (Sigma-Aldrich, D8938) was filled with 2 mL of ice-cold nuclei isolation buffer (NIB) (Sigma-Aldrich, NUC101-1KT) and frozen pieces of tissue were directly placed inside the grinder.
- nuclei from different animals were isolated in individual grinders, except for the 22q11.2 pooled screen, in which tissue of 15 animals was joined into 3 grinders to reduce the number of isolations and the waiting time before subsequent procedures.
- the tissue was mechanically disrupted with 25 strokes with pestle A followed by 25 strokes with pestle B.
- the homogenized solution was transferred to a protein low-binding tube (Eppendorf, 0030122216), mixed with an additional 2 mL of NIB, incubated for 5 min, and immediately centrifuged at 500 xg for 5 min at 4 °C. Supernatant was discarded and the pellet was resuspended in 4 mL NIB, incubated for 5 min and centrifuged at 500 xg for 5 min at 4 °C.
- the pellet was resuspended in 4 mL of nuclei wash buffer [NWF: 1% BSA in 1x PBS, 50 U/mL SUPERase•In RNase inhibitor (ThermoFisher, AM2694), and 50 U/mL Enzymatics RNase inhibitor (Enzymatics, Y9240L)] and centrifuged at 500 xg for 5 min at 4 °C. Finally, the nuclei pellet was resuspended in 1 mL NWF and filtered through a 30 µm cell strainer (Sysmex) into a new protein low-binding tube.
- Fluorescence-Activated Nucleus Sorting was performed to: 1) quantify infected nuclei in the triple-color experiment; 2) purify nuclei from debris to ensure a clean nuclei solution before snRNA-seq library preparation; 3) isolate GFP+ nuclei to prepare snRNA-seq libraries with nuclei from infected cells. Briefly, isolated nuclei solutions were spiked with 2 µL/mL of Vybrant DyeCycle Ruby Stain (ThermoFisher, V10273) and sorted in a MA900 apparatus (Sony). Singlet nuclei were gated based on the DNA dye signal as illustrated in Fig.
- mice were sacrificed by intravenous injection with a lethal dose of pentobarbital (100 mg/kg body weight), followed by perfusion with 15 mL of ice-cold 1x PBS and 15 mL of 4% PFA in 1x PBS. Brain tissue was incubated in 4% PFA in 1x PBS overnight and subsequently transferred to 1x PBS with 30% sucrose, where it was left until it sank. Brains were then embedded in OCT and sections of 20 µm were cut on a cryotome. Imaging was performed in a LSM 900 apparatus (Zeiss).
- Nuclei infected with AAV carrying the 3’ capture design were sequenced with a Chromium Single Cell 3' Reagent Kit v3 (10x Genomics). The nuclei suspension was diluted to 1,000 nuclei/µL and processed according to the kit’s protocol with 13 cycles of cDNA amplification and 14 cycles of sample indexing PCR.
- PCR 1 with primers targeting the U6 promoter sequence (TTTCCCATGATTCCTTCATATTTGC, SEQ ID NO 3) and read 1 sequence (ACACTCTTTCCCTACACGACG, SEQ ID NO 4) was performed with 20 ng of full-length single-cell cDNA library as DNA template.
- PCR 2 was performed with a forward primer targeting the U6 sequence immediately before the gRNA and containing the P7 adapter (GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGcTTGTGGAAAGGACGAAACAC, SEQ ID NO 5), a reverse P5 primer (AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACG, SEQ ID NO 6), and 2 pL of PCR 1 reaction as template.
- a third PCR to index samples for deep sequencing used 2 µL of the PCR 2 reaction as template and was performed with a forward P7 index primer (CAAGCAGAAGACGGCATACGAGATNNNNNNNNGTCTCGTGGGCTCGG, SEQ ID NO 7) and the P5 primer as reverse (same primer used in PCR 2). All primers were used at a final concentration of 0.3 µM.
- Amplification reactions were performed as follows: (1) 95 °C for 3 min; (2) 98 °C for 20 s, 65 °C for 15 s (72 °C for PCR 3), 72 °C for 20 s (number of cycles up to qPCR saturation); (3) 72 °C for 2 min.
- Nuclei infected with the 5’ capture design (pAS088, preliminary experiments, pooled screen, and confirmation experiments) were sequenced with a Chromium Single Cell 5' Reagent Kit v1 (10x Genomics).
- gRNA-constant-region-targeting RT primer (0.15 µM, AAGCAGTGGTATCAACGCAGAGTACCAAGTTGATAACGGACTAGCC, SEQ ID NO 8) (Mimitou, E. P. et al., Nat. Methods 16, 409-412 (2019); Replogle, J. M. et al., Nat. Biotechnol. 38, 954-961 (2020)).
- the reaction was purified with 0.6x SPRI beads (Beckman Coulter). At this point, longer cDNAs (more than 300 bp) from mRNA molecules bind to the beads, while the shorter cDNAs (approximately 200 bp) from gRNA sequences are free in the supernatant.
- the preparation of gene expression libraries was performed as indicated by the kit’s protocol, with 14 cycles of sample indexing PCR. To recover the gRNA-cDNA sequences, the supernatant from the above step was purified with 1.4x SPRI beads and eluted in 30 µL of ultra-pure water (ThermoFisher).
- a 1:10 diluted aliquot was loaded into an Agilent Bioanalyzer High Sensitivity chip (Agilent) to confirm the presence of a gRNA band of ~180 bp.
- the gRNA-cDNA library (30 ng) was subjected to a sample indexing PCR using KAPA HiFi ReadyMix and 1 µM of P5 primer (AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTC, SEQ ID NO 18) and P7 indexing primer binding to the gRNA constant region directly downstream of the spacer sequence (CAAGCAGAAGACGGCATACGAGATNNNNNNNNGTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGTATTTCTAGCTCTAAAAC, SEQ ID NO 10).
- Amplification reactions were performed as follows: (1) 95 °C for 3 min; (2) 98 °C for 20 s, 54 °C for 30 s, 72 °C for 20 s (15 cycles); (3) 72 °C for 5 min.
- the final PCR reaction was cleaned and purified with double-sided 0.6x-1.2x SPRI bead selection.
- Gene expression and gRNA libraries (5% of flow cell) were sequenced with a NextSeq 550 75 cycle kit or a NovaSeq 100 cycle kit with the following cycle distribution: 26 to read 1, 8 to index 1, and 56 (NextSeq) or 91 (NovaSeq) to read 2.
- PCR 1 was performed to specifically amplify ~150 bp around the Cas9 cut site (keeping the cut site central in the amplicon) from genomic DNA (5 µL) with gene-specific primers containing adapters (0.5 µM, fwd: ACACTCTTTCCCTACACGACGCTCTTCCGATCT (SEQ ID NO 19) + forward gene-specific sequence; rev: GTGACTGGAGTTCAGACGTGTGCTCTTCCGATC (SEQ ID NO 20) + reverse gene-specific sequence).
- PCR 1 amplification was performed as follows: (1) 95 °C for 3 min; (2) 98 °C for 20 s, primer-set-specific annealing temperature for 15 s, 72 °C for 20 s (15 cycles); (3) 72 °C for 2 min.
- CAAGCAGAAGACGGCATACGAGAT-NNNNNNNN-GTGACTGGAGTTCAGACGTGTGCTCTTCCGATC, SEQ ID NO 21) was performed as follows: (1) 95 °C for 3 min; (2) 98 °C for 20 s, 70 °C for 15 s, 72 °C for 20 s (15 cycles); (3) 72 °C for 2 min. Indexed samples were pooled, purified with a PCR purification & concentration kit (Zymo Research, D4013), and loaded on a 2% E-Gel (Thermo Fisher Scientific, G402022).
- the PCR product (~250 bp) was extracted from the agarose gel with the QIAquick Gel Extraction Kit (QIAGEN, 28706X4) and sequenced using a NextSeq 550 150 cycle kit with the following cycle distribution: 150 to read 1, 8 to index 1, and 8 to index 2.
- pseudobulk profiles were generated by aggregating raw UMI counts of nuclei from the same sample (i.e., same animal) and cell type. Differential gene expression analysis of pseudobulk profiles was performed with the R package edgeR v3.36.0 (Robinson, M. D. et al., Bioinformatics 26, 139-140 (2010)). For each cell type, we used the likelihood ratio test (edgeR-LRT) to calculate LFC and FDR values for each perturbation against SH control. The same process was used to compare LgDel+/- against LgDel+/+ samples.
- Deep sequencing libraries for indel analysis were generated as described in “Deep sequencing quantification of Indels” and analyzed with CRISPResso2 (Xie et al., Gene Set Knowledge Discovery with Enrichr, Current Protocols (2021)).
- the top 1000 up-regulated genes from each perturbation and cell type were uploaded as input to the online tool MIENTURNET (http://userver.bio.uniroma1.it/apps/mienturnet/; Licursi, V. et al., BMC Bioinformatics 20, 1-10 (2019)) using the miRTarBase reference dataset (Huang, H.-Y. et al., Nucleic Acids Res. 50, D222-D230 (2022)).
- Licursi, V., Conte, F., Fiscon, G. & Paci, P. MIENTURNET: an interactive web tool for microRNA-target enrichment and network-based analysis.
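The gRNA assignment rules discussed in the UMI proportion filter passage above can be sketched in a few lines of Python. This is a minimal illustration, not the applicants' pipeline: the function names, the `min_fraction` and `min_umis` parameters, and their default values are assumptions made for this example, since the text states only that the proportion threshold was made more stringent than in published work. The first function shows the top-two ratio rule (1.3x) as it is described in the text; the second shows a generic proportion-based filter.

```python
from typing import Dict, List, Optional


def ratio_label(umis: Dict[str, int], min_ratio: float = 1.3) -> Optional[str]:
    """Top-two ratio rule: return the most abundant gRNA if its UMI count is
    at least `min_ratio` times the runner-up, otherwise return None."""
    ranked = sorted(umis.items(), key=lambda kv: kv[1], reverse=True)
    if not ranked:
        return None
    if len(ranked) == 1:
        return ranked[0][0]
    (top, top_n), (_, second_n) = ranked[0], ranked[1]
    return top if top_n >= min_ratio * second_n else None


def proportion_filter(
    umis: Dict[str, int],
    min_fraction: float = 0.2,  # hypothetical threshold, not from the source
    min_umis: int = 2,          # hypothetical threshold, not from the source
) -> List[str]:
    """Keep every gRNA whose UMIs make up at least `min_fraction` of all
    gRNA UMIs in the nucleus; low-proportion gRNAs are treated as chimeric
    reads or ambient contamination and dropped."""
    total = sum(umis.values())
    if total == 0:
        return []
    return sorted(
        g for g, n in umis.items() if n >= min_umis and n / total >= min_fraction
    )


# The example from the text: 10 and 14 UMIs for two different gRNAs.
nucleus = {"gRNA_A": 14, "gRNA_B": 10}
single_label = ratio_label(nucleus)       # ratio rule reports one gRNA only
kept = proportion_filter(nucleus)         # proportion rule keeps both gRNAs
contaminated = proportion_filter({"gRNA_A": 30, "gRNA_B": 1})
```

With the ratio rule, the nucleus carrying 14 and 10 UMIs is labelled with a single gRNA even though the second gRNA is well supported, which is the limitation the text points out. A proportion-based rule looks at every gRNA in the nucleus, so the same nucleus is recognised as carrying two gRNAs, while a gRNA supported by a single UMI out of 31 is discarded.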
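The pseudobulk step described above (summing raw UMI counts per animal and cell type, then comparing conditions) can be illustrated with the following sketch. It is an assumption-laden toy: the source performs the statistical test with edgeR's likelihood ratio test in R, whereas this example only shows the aggregation and a naive counts-per-million log2 ratio, with no dispersion modelling and no FDR. The gene names, sample names, and the `pseudocount` parameter are hypothetical.

```python
import math
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Counts = Dict[str, int]


def pseudobulk(
    nuclei: Iterable[Tuple[str, str, Counts]]
) -> Dict[Tuple[str, str], Counts]:
    """Sum raw UMI counts over all nuclei sharing (sample, cell_type)."""
    profiles = defaultdict(lambda: defaultdict(int))
    for sample, cell_type, counts in nuclei:
        for gene, n in counts.items():
            profiles[(sample, cell_type)][gene] += n
    return {key: dict(genes) for key, genes in profiles.items()}


def cpm(profile: Counts, gene: str) -> float:
    """Counts per million for one gene in one pseudobulk profile."""
    total = sum(profile.values())
    return 1e6 * profile.get(gene, 0) / total if total else 0.0


def naive_lfc(
    test: List[Counts], control: List[Counts], gene: str, pseudocount: float = 1.0
) -> float:
    """log2 ratio of mean CPM (test over control); a stand-in for the
    model-based LFC that edgeR would estimate."""
    mean_test = sum(cpm(p, gene) for p in test) / len(test)
    mean_ctrl = sum(cpm(p, gene) for p in control) / len(control)
    return math.log2((mean_test + pseudocount) / (mean_ctrl + pseudocount))


# Toy data: a gene inside the deleted locus is expressed from one copy
# in the heterozygous animal, so its counts are halved.
nuclei = [
    ("wt_1", "neuron", {"GeneDel": 20, "GeneRef": 980}),
    ("wt_1", "neuron", {"GeneDel": 20, "GeneRef": 980}),
    ("het_1", "neuron", {"GeneDel": 10, "GeneRef": 990}),
    ("het_1", "neuron", {"GeneDel": 10, "GeneRef": 990}),
]
pb = pseudobulk(nuclei)
lfc = naive_lfc([pb[("het_1", "neuron")]], [pb[("wt_1", "neuron")]], "GeneDel")
```

In this toy case the computed value is close to -1, matching the statement that a 50% reduction corresponds to an LFC of approximately -1, since log2(0.5) = -1. Real data would need replicate animals per group and a count model such as the one edgeR fits.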
Landscapes
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Engineering & Computer Science (AREA)
- Chemical & Material Sciences (AREA)
- Genetics & Genomics (AREA)
- Molecular Biology (AREA)
- Biomedical Technology (AREA)
- Organic Chemistry (AREA)
- Biochemistry (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Biotechnology (AREA)
- Wood Science & Technology (AREA)
- General Health & Medical Sciences (AREA)
- Zoology (AREA)
- General Engineering & Computer Science (AREA)
- Microbiology (AREA)
- Medicinal Chemistry (AREA)
- Physics & Mathematics (AREA)
- Immunology (AREA)
- Cell Biology (AREA)
- Plant Pathology (AREA)
- Urology & Nephrology (AREA)
- General Chemical & Material Sciences (AREA)
- Chemical Kinetics & Catalysis (AREA)
- Hematology (AREA)
- Biophysics (AREA)
- Mycology (AREA)
- Food Science & Technology (AREA)
- General Physics & Mathematics (AREA)
- Tropical Medicine & Parasitology (AREA)
- Toxicology (AREA)
- Analytical Chemistry (AREA)
- Pathology (AREA)
- Bioinformatics & Computational Biology (AREA)
- Crystallography & Structural Chemistry (AREA)
- Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
Abstract
The present invention relates to a method allowing for a single-cell-based analysis of multiple CRISPR-mediated gene perturbations in a single organism. The method comprises the administration of a plurality of viral expression vectors each encoding a gRNA into an organism expressing a Cas enzyme, and allows for analysis of the resulting phenotype on a single-cell level.
Description
Single-Cell CRISPR-Screening of Multiple Gene Perturbations in vivo
This application claims the right of priority of European Patent Application EP23188905.6 filed 01 August 2023, incorporated by reference herein.
Field
The present invention relates to a method allowing for a single-cell-based analysis of multiple CRISPR-mediated gene perturbations in a single organism. The method comprises the administration of a plurality of expression vectors each encoding a gRNA into an organism expressing a Cas enzyme, and allows for analysis of the resulting phenotype on a single-cell level.
Background
Advances in single-cell CRISPR screening methods are making it possible to interrogate complex genotype-phenotype landscapes in high throughput. The combination of pooled CRISPR libraries, lentiviral delivery, and single-cell omics has been applied in vitro to study protein misfolding, gene regulation, and immunity as well as in vivo to study mouse neurodevelopment. While these efforts have fundamentally changed our ability to investigate the genetic networks underlying complex cellular processes, current methods are restricted to in vitro applications or a very narrow range of developmental time points, tissues, and cell types conducive to lentiviral infection in vivo. A general framework for direct in vivo single-cell screening could broadly facilitate mechanistic studies of health and disease as well as enable the systematic interrogation of the vast and growing catalog of disease-associated risk alleles in disease-relevant cells and tissues. Genomics studies have identified thousands of genetic variants associated with human disease, and interrogating them in in vivo models is essential to understanding their causality, function, and pathology as well as developing new diagnostics and therapeutics.
US 2020 018 746 A1 discloses a method for perturbation of disease-implicated genes in 3D tissues composed of human induced neuronal cells and astrocytic cells. US 2021 172 017 A1 discloses that Perturb-seq and single-cell sequencing allow the reconstruction of a cellular network or circuit. WO 2019 113 499 A1 discloses the use of AAV in a Perturb-seq method. WO 2015 089 462 A1 discloses SpCas9-mediated in vivo genome editing in the brain. Still, high-throughput, phenotype-rich, and broadly applicable in vivo methods are urgently needed.
To address this challenge, we developed AAV-Perturb-seq, an adeno-associated virus (AAV)-based single-cell or -nuclei CRISPR screening method that is simple to implement, tunable, and broadly applicable for in vivo functional genomics studies. We achieved this by creating a recombinant AAV vector for efficient guide RNA (gRNA) expression and detection within single-cell libraries as well as optimizing delivery and transgene expression for obtaining large numbers of single nuclei infected by single viruses from complex tissues. The use of AAV for in vivo delivery offers many advantages over previous lentivirus-based screening approaches commonly used in vitro, including the possibility of systemic delivery via intravenous injections leading to the targeting of a wide range of tissues and cell types in animals of any age in a tunable way. We applied AAV-Perturb-seq, using either gene editing in LSL-Cas9 mice or transcriptional inhibition in dCas9-KRAB mice, to systematically interrogate the genotype-phenotype landscape of individual genes linked to 22q11.2 deletion syndrome (22q11.2 DS, also known as DiGeorge syndrome), a complex genetic disorder affecting numerous organs including the brain, where dysfunction is typically clinically expressed as schizophrenia or autism spectrum disorder (ASD). Using our data analysis pipeline, we extracted high-quality transcriptomes spanning perturbations and brain cell types, enabling us to highlight previously underappreciated genetic contributions and identify new cellular phenotypes that may contribute to 22q11.2 DS pathology. Our results establish AAV-Perturb-seq as a robust and broadly applicable methodology to systematically map genotype-phenotype landscapes in vivo.
Based on the above-mentioned state of the art, the objective of the present invention is to provide means and methods to analyse multiple gRNA-mediated gene perturbations on a single-cell level in vivo in a single organism. This objective is attained by the subject-matter of the independent claims of the present specification, with further advantageous embodiments described in the dependent claims, examples, figures and general description of this specification.
Summary of the Invention
A first aspect of the invention relates to a method for analyzing multiple gene perturbations in vivo in a tissue of interest; said method comprising the steps:
a. providing an organoid or a non-human organism;
b. administering a plurality of viral gRNA-delivering nucleic acid expression vectors to the organoid or organism, each vector comprising:
i. inverted-terminal repeats (ITRs);
ii. a gRNA promoter;
iii. at least one guide-RNA (gRNA) under control of said gRNA promoter, wherein each vector comprises a different gRNA or gRNA combination; and
iv. a terminator of transcription;
wherein the organism expresses a Cas enzyme, or said vector additionally encodes a Cas enzyme under control of a promoter operable in said cell;
c. isolating a sample of the tissue of interest from the organism, or a sample of the organoid;
d. in a collection step, collecting cells or nuclei from the sample;
e. in an analysis step, performing a single-cell or single-nucleus assay comprising an analysis of the gene perturbation and comprising gRNA sequencing of each cell or nucleus, thereby generating a plurality of assay patterns related to expression of a defined gRNA or a defined gRNA combination.
Terms and definitions
For purposes of interpreting this specification, the following definitions will apply and whenever appropriate, terms used in the singular will also include the plural and vice versa. In the event that any definition set forth below conflicts with any document incorporated herein by reference, the definition set forth shall control.
The terms “comprising”, “having”, “containing”, and “including”, and other similar forms, and grammatical equivalents thereof, as used herein, are intended to be equivalent in meaning and to be open-ended in that an item or items following any one of these words is not meant to be an exhaustive listing of such item or items, or meant to be limited to only the listed item or items. For example, an article “comprising” components A, B, and C can consist of (i.e., contain only) components A, B, and C, or can contain not only components A, B, and C but also one or more other components. As such, it is intended and understood that “comprises” and similar forms thereof, and grammatical equivalents thereof, include disclosure of embodiments of “consisting essentially of” or “consisting of.”
Where a range of values is provided, it is understood that each intervening value, to the tenth of the unit of the lower limit, unless the context clearly dictates otherwise, between the upper and lower limit of that range and any other stated or intervening value in that stated range, is encompassed within the disclosure, subject to any specifically excluded limit in the stated range. Where the stated range includes one or both of the limits, ranges excluding either or both of those included limits are also included in the disclosure.
Reference to “about” a value or parameter herein includes (and describes) variations that are directed to that value or parameter per se. For example, description referring to “about X” includes description of “X.”
As used herein, including in the appended claims, the singular forms “a”, “or” and “the” include plural referents unless the context clearly dictates otherwise.
"And/or" where used herein is to be taken as specific recitation of each of the two specified features or components with or without the other. Thus, the term "and/or" as used in a phrase such as "A and/or B" herein is intended to include "A and B," "A or B," "A" (alone), and "B" (alone). Likewise, the term "and/or" as used in a phrase such as "A, B, and/or C" is intended to encompass each of the following aspects: A, B, and C; A, B, or C; A or C; A or B; B or C; A and C; A and B; B and C; A (alone); B (alone); and C (alone).
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art (e.g., in cell culture, molecular genetics, nucleic acid chemistry, hybridization techniques and biochemistry, organic synthesis). Standard techniques are used for molecular, genetic, and biochemical methods (see generally, Sambrook et al., Molecular Cloning: A Laboratory Manual, 4th ed. (2012) Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. and Ausubel et al., Short Protocols in Molecular Biology (2002) 5th Ed, John Wiley & Sons, Inc.) and chemical methods.
Any patent document cited herein shall be deemed incorporated by reference herein in its entirety.
General Molecular Biology: Nucleic Acid Sequences, Expression
The term gene refers to a polynucleotide containing at least one open reading frame (ORF) that is capable of encoding a particular polypeptide or protein after being transcribed and translated. A polynucleotide sequence can be used to identify larger fragments or full-length coding sequences of the gene with which they are associated. Methods of isolating larger fragment sequences are known to those of skill in the art.
The term transgene in the context of the present specification relates to a gene or genetic material that has been transferred from one organism to another. In the present context, the term may also refer to transfer of the natural or physiologically intact variant of a genetic sequence into tissue of a patient where it is missing. It may further refer to transfer of a natural encoded sequence the expression of which is driven by a promoter absent or silenced in the targeted tissue.
The term recombinant in the context of the present specification relates to a nucleic acid, which is the product of one or several steps of cloning, restriction and/or ligation and which is different from the naturally occurring nucleic acid. A recombinant virus particle comprises a recombinant nucleic acid.
The terms gene expression or expression, or alternatively the term gene product, may refer to either of, or both of, the processes - and products thereof - of generation of nucleic acids (RNA) or the generation of a peptide or polypeptide, also referred to as transcription and translation, respectively, or any of the intermediate processes that regulate the processing of genetic information to yield polypeptide products. The term gene expression may also be applied to the transcription and processing of an RNA gene product, for example a regulatory RNA or a structural (e.g. ribosomal) RNA. If an expressed polynucleotide is derived from genomic DNA, expression may include splicing of the mRNA in a eukaryotic cell. Expression may be assayed both on the level of transcription and translation, in other words mRNA and/or protein product.
The term nucleotides in the context of the present specification relates to nucleic acid or nucleic acid analogue building blocks, oligomers of which are capable of forming selective hybrids with RNA or DNA oligomers on the basis of base pairing. The term nucleotides in this context includes the classic ribonucleotide building blocks adenosine, guanosine, uridine (and ribosylthymine), cytidine, the classic deoxyribonucleotides deoxyadenosine, deoxyguanosine, thymidine, deoxyuridine and deoxycytidine. It further includes analogues of nucleic acids such as phosphothioates, 2’-O-methylphosphothioates, peptide nucleic acids (PNA; N-(2-aminoethyl)-glycine units linked by peptide linkage, with the nucleobase attached to the alpha-carbon of the glycine) or locked nucleic acids (LNA; 2’-O,4’-C-methylene bridged RNA building blocks). Wherever reference is made herein to a hybridizing sequence, such hybridizing sequence may be composed of any of the above nucleotides, or mixtures thereof.
The term nucleic acid expression vector in the context of the present specification relates to a plasmid, a viral genome or an RNA, which is used to transfect (in case of a plasmid or an RNA) or transduce (in case of a viral genome) a target cell with a certain gene of interest, or - in the case of an RNA construct being transfected - to translate the corresponding protein of interest from a transfected mRNA. For vectors operating on the level of transcription and subsequent translation, the gene of interest is under control of a promoter sequence and the promoter sequence is operational inside the target cell, thus, the gene of interest is transcribed either constitutively or in response to a stimulus or dependent on the cell’s status. In certain embodiments, the viral genome is packaged into a capsid to become a viral vector, which is able to transduce the target cell.
The term fluorescent protein in the context of the present specification may relate, but is not limited to, a fluorescent protein selected from the group comprising: green fluorescent protein (GFP) from Aequorea victoria and derivatives thereof, such as enhanced blue fluorescent protein (EBFP), enhanced blue fluorescent protein 2 (EBFP2), azurite, mKalama1, Sirius; enhanced green fluorescent protein (EGFP), emerald, superfolder avGFP, T-sapphire; yellow fluorescent protein (YFP), enhanced yellow fluorescent protein (EYFP), citrine, venus, YPet, topaz, SYFP, mAmetrine; enhanced cyan fluorescent protein (ECFP), mTurquoise, mTurquoise2, cerulean, CyPet, SCFP; fluorescent protein from Discosoma striata and derivatives thereof: mTagBFP, TagCFP, AmCyan, Midoriishi Cyan, mTFP1; Azami Green, mWasabi, ZsGreen, TagGFP, TagGFP2, TurboGFP, CopCFP, AceGFP; TagYFP, TurboYFP, ZsYellow, PhiYFP; Kusabira Orange, Kusabira Orange2, mOrange, mOrange2, dTomato, dTomato-Tandem, DsRed, DsRed2, DsRed-Express (T1), DsRed-Express2, DsRed-Max, DsRed-Monomer, TurboRFP, TagRFP, TagRFP-T; mRuby, mApple, mStrawberry, AsRed2, mRFP1, JRed, mCherry, eqFP611, tdRFP611, HcRed1, mRaspberry; tdRFP639, mKate, mKate2, katushka, tdKatushka, HcRed-Tandem, mPlum, AQ143; and proteins derived from alpha-allophycocyanin from the cyanobacterium Trichodesmium erythraeum such as small ultra-red fluorescent protein (smURFP).
The term gRNA in the context of the present specification relates to a guide RNA which comprises a crRNA part and a tracrRNA part.
The term gRNA combination in the context of the present specification relates to multiple (a plurality of) gRNAs. In certain embodiments, a gRNA combination consists of 2 to 1000 gRNAs. In certain embodiments, a gRNA combination consists of 2 to 100 gRNAs. In certain embodiments, a gRNA combination consists of 2 to 10 gRNAs.
The term gene perturbation in the context of the present specification relates to an alteration in gene expression induced by CRISPR/Cas and a gRNA. In certain embodiments, a gene perturbation relates to the loss of expression of the gene targeted by the respective gRNA. In certain embodiments, a gene perturbation relates to the inhibition of transcription of the target gene by the respective gRNA and Cas-effector (e.g. KRAB) complex. In certain embodiments, a gene perturbation relates to the activation of transcription of the target gene by the respective gRNA and Cas-effector (e.g. VP64, HSF1, p65) complex. In certain embodiments, a gene perturbation relates to the base editing of the target gene by the respective gRNA and base editing complex. In certain embodiments, a gene perturbation relates to the prime editing of the target gene by the respective prime editing gRNA and base editing complex.
The term Cas enzyme in the context of the present specification relates to an enzyme of the Cas family. In certain embodiments, the Cas enzyme is a Cas enzyme of type II. In certain embodiments, the Cas enzyme is a Cas9 enzyme of UniProt-ID Q99ZW2.
The term polypeptide part in the context of the present specification relates to one or several domains of a polypeptide, wherein the domain(s) are covalently linked to the rest of the polypeptide.
The term gRNA-delivering nucleic acid expression vector in the context of the present specification relates to a nucleic acid expression vector which encodes a gRNA, and optionally further elements.
Detailed Description of the Invention
A first aspect of the invention relates to a method for analyzing multiple (a plurality of) gene perturbations in vivo in a tissue of interest. The method comprises the steps:
Step a: providing an organoid or a non-human multi-cellular organism, particularly a non-human organism. An organoid is a multi-cellular, three-dimensional, simplified version of an organ, which is generated in vitro. The organoid can be composed of human or non-human cells. The non-human organism is an animal or a plant or a fungus. In certain embodiments, the non-human organism is an animal. In certain embodiments, the non-human organism is an adult organism. In certain embodiments, the non-human organism is an adult animal. In certain embodiments, the non-human organism is an animal which is born (not an embryo, post-embryonic stage).
Step b: administering a plurality of gRNA-delivering nucleic acid expression vectors to the organoid or organism, wherein each vector comprises the following elements: i. a gRNA promoter allowing for transcription inside a tissue of interest (particularly a Pol III RNA polymerase promoter); ii. at least one guide-RNA (gRNA) transcribable under control of said gRNA promoter, wherein each vector of the plurality comprises a different gRNA or gRNA combination (in case more than one gRNA is encoded); and
iii. a terminator of transcription in 3’ direction of the gRNA.
The organism expresses a Cas enzyme, or the gRNA-delivering nucleic acid expression vector additionally encodes a Cas enzyme under control of a promoter operable in said cell. In certain embodiments, the Cas enzyme is encoded in the genome of the organism or organoid. In certain embodiments, the Cas enzyme is delivered via a separate nucleic acid expression vector.
In certain embodiments, the gRNA-delivering nucleic acid expression vector is a viral vector. In this case, each vector comprises the following elements:
i. inverted-terminal repeats (ITRs) flanking the elements described below at the 3’ and the 5’ end (positioned 5’ upstream and 3’ downstream of all elements encoded by the vector, i.e. the elements described below: the gRNA and its promoter, optionally a gene encoding a Cas enzyme, and other optional elements); ii. a gRNA promoter allowing for transcription inside a tissue of interest (particularly a Pol III RNA polymerase promoter);
iii. at least one guide-RNA (gRNA) transcribable under control of said gRNA promoter, wherein each vector of the plurality comprises a different gRNA or gRNA combination (in case more than one gRNA is encoded); and iv. a terminator of transcription in 3’ direction of the gRNA.
The skilled person understands that there are ways of delivering the gRNA-delivering nucleic acid expression vectors other than viral delivery. In certain embodiments, the expression vector is delivered via a peptide-based delivery system using cell-penetrating peptides. In certain embodiments, the expression vector is delivered via a lipid-based delivery system using lipid nanoparticles. In certain embodiments, the expression vector is delivered via an inorganic delivery system using black phosphorus, graphene oxide, mesoporous silica nanoparticles, or gold nanoparticles. In certain embodiments, the expression vector is delivered via a polymeric delivery system, wherein the expression vector is incorporated into a polymer. In certain embodiments, the expression vector is delivered via electroporation. Lan et al. (Mol Cancer 21, 71 (2022)) review non-viral delivery methods; this reference is incorporated by reference herein.
Step b1: keeping the organism under conditions allowing the organism to live or the organoid under conditions allowing the organoid to stay intact. In certain embodiments, these conditions are physiological conditions. In certain embodiments, these conditions are pathological conditions which impose a particular burden on the organism or organoid.
Step c: isolating a sample of the tissue of interest from the organism, or a sample of the organoid. This sample may comprise a complete organ of the organism or only a subfraction of an organ.
Step d: in a collection step, collecting cells or nuclei from the sample of the tissue of interest or of the organoid.
Step e: in an analysis step, performing a single-cell or single-nucleus assay comprising an analysis of the gene perturbation and comprising gRNA sequencing of each cell or nucleus of the collected cells or nuclei, thereby generating a plurality of assay patterns related to expression of a defined gRNA or a defined gRNA combination. The analysis of the gene perturbation is an assay or a combination of assays which captures a certain feature of the analyzed cell. Using this assay, it is possible to relate the perturbation of expression via the gRNA to a phenotype of the cell in vivo. There are multiple assays which can be performed on a single-cell level. An assay pattern reflects the phenotype of the cell for a certain parameter.
In certain embodiments, the assay of the analysis step comprises single-cell or single-nucleus RNA sequencing. Each cell has a unique barcode that is present in both the mRNA and gRNA fractions. This barcode is used to connect mRNA information with gRNA expression (gRNA expression is an indirect way to identify the mutated gene). To put it another way, the gRNA indicates which gene was perturbed (knocked out, inhibited, activated) in that cell. The mRNA information from that cell is informative about the impact of perturbing that gene.
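The barcode-based link between the two fractions can be illustrated with a minimal Python sketch. It assumes that both libraries have already been reduced to per-barcode count tables; the function name `link_by_barcode`, the barcodes, and the gene and gRNA names are hypothetical and serve only as an illustration, not as the implementation used in the specification.

```python
def link_by_barcode(mrna_counts, grna_counts):
    """Join per-cell mRNA and gRNA tables on the shared cell barcode.

    mrna_counts: {cell_barcode: {gene: count}}
    grna_counts: {cell_barcode: {grna_id: count}}
    Returns {cell_barcode: {"grnas": [...], "mrna": {...}}} for barcodes
    that carry at least one detected gRNA.
    """
    linked = {}
    for barcode, expression in mrna_counts.items():
        guides = grna_counts.get(barcode)
        if not guides:
            continue  # no gRNA detected for this barcode, perturbation unknown
        linked[barcode] = {"grnas": sorted(guides), "mrna": expression}
    return linked


# Toy input (hypothetical barcodes, genes and gRNA identifiers).
mrna = {
    "AAAC": {"GeneA": 12, "GeneB": 0},
    "TTTG": {"GeneA": 9, "GeneB": 4},
    "GGGA": {"GeneA": 7},
}
grna = {
    "AAAC": {"gGeneB_1": 5},
    "TTTG": {"gControl_1": 3},
}
linked = link_by_barcode(mrna, grna)
```

In this toy input, barcode `GGGA` has mRNA data but no gRNA record, so it is left out of the linked table because its perturbation cannot be identified.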
In certain embodiments, the assay of the analysis step comprises single-cell or single-nucleus DNA sequencing.
In certain embodiments, the assay of the analysis step comprises single-cell quantification of surface proteins, particularly cellular indexing of transcriptomes and epitopes by sequencing (CITE-seq), and the assay patterns are protein patterns, particularly surface protein patterns. Surface protein quantification is performed as described in Stoeckius, M. et al. Simultaneous epitope and transcriptome measurement in single cells. Nat Methods 14, 865-868 (2017).
In certain embodiments, the assay of the analysis step comprises single-cell or single-nucleus quantification of cytosolic and nuclear proteins. This assay uses antibodies labelled with short oligonucleotides to indirectly quantify proteins of interest. First, cells are treated with oligo-tagged antibodies that bind to the protein(s) of interest. Cells are then used to prepare single-cell RNA-seq libraries with modified protocols that capture both mRNA and oligo-tags. Quantification of the oligo-tags indirectly indicates the quantity of protein present in the cell. Cytoplasmic protein quantification is performed as described in Katzenelenbogen, Y. et al. Coupled scRNA-Seq and Intracellular Protein Activity Reveal an Immunosuppressive Role of TREM2 in Cancer. Cell 182, 872-885.e19 (2020).
In certain embodiments, the assay of the analysis step comprises single-cell or single-nucleus quantification of histone marks. This assay uses an antibody binding to a histone modifying protein. After binding to the target protein, the antibody recruits a DNA cutting enzyme that cuts DNA around the specific histone modification. Deep sequencing of the cut DNA allows one to understand the precise original genomic localization of the histone modification. Histone mark quantification is performed as described in Bartosovic, M., Kabbe, M. & Castelo-Branco, G. Single-cell CUT&Tag profiles histone modifications and transcription factors in complex tissues. Nat Biotechnol 39, 825-835 (2021).
In certain embodiments, the assay of the analysis step comprises single-cell or single-nucleus mRNA sequencing, and the assay patterns are mRNA expression patterns.
In certain embodiments, the assay of the analysis step comprises transposase-accessible chromatin with sequencing (ATAC-seq), and the assay patterns are chromatin accessibility patterns.
In certain embodiments, after the analysis step, the assay patterns are clustered by their type of cell or origin, thereby generating an assay profile for each cell type in the tissue of interest. In certain embodiments, after the analysis step, the assay patterns are clustered by their type of expressed gRNA, thereby generating an assay profile for each gene perturbed by a gRNA or a gRNA combination. Via clustering, the read-out of multiple cells of the same cell type and having the same gRNA perturbation can be combined to increase the signal-to-noise ratio.
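As a minimal sketch of how such clustering can combine the read-out of multiple cells, the following Python example averages per-cell assay patterns within each (cell type, gRNA) group. The record layout, the function name `assay_profiles`, and all values are assumptions chosen for illustration; cell type labels are taken as given rather than inferred.

```python
from collections import defaultdict
from statistics import mean


def assay_profiles(cells):
    """Average assay patterns over cells sharing cell type and gRNA.

    cells: list of dicts with keys 'cell_type', 'grna' and
           'expression' ({gene: value}).
    Returns {(cell_type, grna): {gene: mean value across the group}}.
    """
    groups = defaultdict(list)
    for cell in cells:
        groups[(cell["cell_type"], cell["grna"])].append(cell["expression"])

    profiles = {}
    for key, expressions in groups.items():
        genes = set().union(*expressions)  # all genes seen in this group
        profiles[key] = {
            gene: mean(e.get(gene, 0) for e in expressions)  # missing gene counts as 0
            for gene in sorted(genes)
        }
    return profiles


# Toy input (hypothetical cell types, gRNA identifiers and values).
cells = [
    {"cell_type": "neuron", "grna": "gA", "expression": {"GeneX": 4}},
    {"cell_type": "neuron", "grna": "gA", "expression": {"GeneX": 6}},
    {"cell_type": "astrocyte", "grna": "gA", "expression": {"GeneX": 1}},
]
profiles = assay_profiles(cells)
```

Averaging is only one possible way of combining cells; it is used here because the mean of several noisy single-cell measurements is less variable than any single measurement.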
In certain embodiments, each vector of the plurality of gRNA-delivering nucleic acid expression vectors comprises a reporter gene under control of a reporter gene promoter, wherein said reporter gene encodes a reporter protein, wherein said reporter protein is detectable when expressed inside a cell and enables selective isolation of cells that express said reporter protein in the collection step. This reporter protein facilitates the analysis, because the cells which express a gRNA can be distinguished from non-expressing cells, e.g., via their fluorescence.
In certain embodiments, in the collection step, cells or nuclei are collected selectively from the tissue of interest that exhibit expression of said reporter protein.
In certain embodiments, the reporter protein is a fluorescent protein.
In certain embodiments, the collection step is performed via fluorescence-activated cell sorting (FACS) or fluorescence-activated nucleus sorting (FANS).
In certain embodiments, in the collection step, nuclei are collected. In certain embodiments, the reporter protein comprises a polypeptide part localizing to a nucleus of a cell, particularly wherein the reporter protein comprises a polypeptide part interacting with a membrane of a nucleus, particularly wherein the reporter protein comprises a KASH (Klarsicht, ANC-1, Syne Homology) domain. In certain embodiments, the reporter protein comprises a polypeptide part localizing to a nucleus of a cell, particularly wherein the reporter protein comprises an NLS (nuclear localization sequence) domain.
In certain embodiments, in the collection step, whole cells are collected.
In certain embodiments, after the analysis step, patterns are clustered first by their cell type.
In certain embodiments, said organism expresses a gene (from its genome or delivered via a vector) encoding a recombinase enzyme (under control of a promoter), and wherein activation of expression of said Cas enzyme is mediated via said recombinase enzyme. In certain embodiments, the recombinase enzyme is a Cre enzyme. In certain embodiments, an organism expressing a Cas enzyme from its genome has an LSL (lox-stop-lox) sequence between the promoter and the Cas coding sequence. This LSL sequence blocks expression of the Cas enzyme. Once the Cre protein is expressed, it removes the LSL sequence and unblocks expression of the Cas enzyme. Cre can be delivered via AAV, but it is also possible to cross-breed the LSL-Cas animal with another animal that expresses Cre from its genome. In certain embodiments, the recombinase enzyme is a Flp enzyme. The promoter of the reporter/Cre is a pol-II promoter.
In certain embodiments, the gRNA promoter is a pol-II promoter. In certain embodiments, the gRNA promoter is a pol-III promoter.
In certain embodiments, said reporter gene promoter is an RNA-polymerase II promoter.
In certain embodiments, the gRNA-delivering nucleic acid expression vector is a viral vector. In certain embodiments, the gRNA-delivering nucleic acid expression vector is selected from the group of an AAV vector, an adenoviral vector, a rabies vector, a sindbis vector, and a lentiviral vector. In certain embodiments, the gRNA-delivering nucleic acid expression vector is an AAV vector.
In certain embodiments, the organism is an animal. In certain embodiments, the organism is a vertebrate. In certain embodiments, the organism is a mammal, particularly a mammal selected from the group of a rodent, a primate, an ungulate (particularly an artiodactyl or a perissodactyl), a lagomorph, a carnivore, an insectivore, and a chiropteran.
In certain embodiments, the Cas enzyme is Cas9.
In certain embodiments, the gene encoding the Cas enzyme is introduced into the germline of the organism.
In certain embodiments, the gene encoding the Cas enzyme is delivered via a (particularly viral) vector.
In certain embodiments, the gRNA promoter and the reporter gene promoter are two distinct promoters.
In certain embodiments, the gRNA promoter is a tissue-specific promoter.
In certain embodiments, the gRNA promoter is an inducible or conditional promoter. In certain embodiments, the gRNA promoter is a promoter selected from the group of a Tet-ON promoter, a Tet-OFF promoter, and a Cre-dependent promoter.
In certain embodiments, 5’ capture sequencing is performed. Infected nuclei are subjected to a modified 5’ single-cell library preparation protocol to capture both mRNA and gRNA information:
a. The reaction is modified to include a gRNA-specific reverse transcription primer to capture gRNA alongside mRNA.
b. mRNA and gRNA molecules are separated to create two independent libraries. The mRNA library fragments are larger than 300 bp, while the gRNA library fragments are approximately 180 bp. The two fractions are separated using a bead-based protocol that sequesters and removes molecules larger than 300 bp (mRNA) and keeps the gRNA molecules. Both fractions are prepared separately during the following steps.
c. The mRNA library is processed according to the kit’s instructions.
d. The gRNA library is PCR-amplified and indexed prior to Illumina deep sequencing.
For 5’ capture sequencing, a single-cell RNA-seq preparation kit is used that barcodes RNA at the 5’ end (e.g. the 10x Genomics Chromium 5’ kit). The kit’s protocol was modified to additionally include a reverse transcription primer specific to the gRNA. This primer mediates capture and barcoding of gRNA molecules from each single cell, alongside the protocol’s standard capture of mRNA.
5’ capture sequencing comprises the following steps:
- isolating cells or nuclei from the tissue of the non-human organism that has previously been infected with the viral delivery system (these cells or nuclei contain one or more gRNAs);
- using a scRNA-seq 5’ capture protocol on a scRNA-seq platform (e.g. from 10x Genomics) that barcodes the RNA molecules from the 5’ end, comprising the steps:
• loading cells or nuclei into a microfluidic chip;
• loading a reaction mix comprising a poly-A primer (to capture mRNAs) and a gRNA-specific primer (to capture the gRNA) into the chip;
• mixing cells or nuclei with the reaction mix and individually encapsulating single cells or nuclei into droplets;
• performing the RT (reverse transcriptase) reaction inside each droplet, allowing for cell- or nucleus-specific barcoding of mRNAs and gRNAs;
- adding a gRNA-specific capturing reverse transcription (RT) primer to specifically bind and amplify gRNA molecules;
- after cDNA amplification, using DNA fragments larger than 300 bp for gene expression and fragments smaller than 300 bp to isolate gRNA molecules;
- sequencing gRNA molecules in a separate next-generation sequencing library.
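Once the separate gRNA library has been sequenced, each cell or nucleus has to be assigned the gRNA or gRNAs it contains. The following Python sketch shows one possible way to do this. It assumes that reads have already been parsed into (cell barcode, UMI, gRNA) triples, that the barcoding chemistry attaches a unique molecular identifier (UMI) to each captured molecule, and that a gRNA is called when it is supported by at least `min_umis` distinct UMIs. The function name, the threshold, and the toy reads are hypothetical and are not taken from the specification.

```python
from collections import defaultdict


def call_grnas(reads, min_umis=2):
    """Assign gRNAs to cells from parsed gRNA-library reads.

    reads: iterable of (cell_barcode, umi, grna_id) tuples.
    Returns {cell_barcode: {grna_id: number of distinct UMIs}}, keeping
    only gRNAs supported by at least `min_umis` distinct UMIs.
    """
    umis = defaultdict(set)
    for barcode, umi, grna_id in reads:
        # PCR duplicates share the same UMI and collapse into one set entry.
        umis[(barcode, grna_id)].add(umi)

    calls = defaultdict(dict)
    for (barcode, grna_id), umi_set in umis.items():
        if len(umi_set) >= min_umis:
            calls[barcode][grna_id] = len(umi_set)
    return dict(calls)


# Toy reads (hypothetical barcodes, UMIs and gRNA identifiers).
reads = [
    ("AAAC", "u1", "g1"), ("AAAC", "u1", "g1"), ("AAAC", "u2", "g1"),
    ("AAAC", "u3", "g2"),
    ("TTTG", "u1", "g2"), ("TTTG", "u2", "g2"),
    ("TTTG", "u3", "g3"), ("TTTG", "u4", "g3"),
]
calls = call_grnas(reads)
single_grna_cells = [bc for bc, guides in calls.items() if len(guides) == 1]
```

In this toy input, cell `AAAC` is called with a single gRNA, because its lone `g2` molecule falls below the threshold, while cell `TTTG` carries a combination of two gRNAs.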
The invention can be described as follows: For a given tissue, we infect a percentage of cells with a nucleic acid expression vector (not all cells from the tissue are infected, but later cell or nuclei sorting permits focus only on infected cells with fluorescent protein expression). Each infected cell carries a perturbation in one or more genes. The number of genes tested in parallel can be between 2 and more than 20 000. We assume that a pool of single cells carrying the same perturbation is representative of the effect of that perturbation in that tissue and cell type.
The goal is to understand how a particular gene perturbation (knockout, inhibition, or activation) affects mRNA expression profiles. Ultimately, these mRNA expression profiles indicate the function of that gene along with other translationally relevant diagnostic and/or therapeutic features.
In terms of therapeutics, we could, for example, use our approach in a mouse model of disease and identify the genetic perturbation that corrects the disease-associated transcriptional state. The therapy in this case could be seemingly any therapeutic modality that modifies the target in the same way as the genetic perturbation. For example, if the target was an enzyme, a small molecule drug inhibiting that enzyme could have the same capacity to rescue the disease state.
Combining mRNA and gRNA data and focusing on cells with one gRNA can be described as follows:
■ For each cell type, group cells by perturbation.
■ For each cell type, learn which perturbations lead to transcriptional changes.
■ For each cell type and perturbation leading to changes, remove cells that do not contribute to the observed changes.
■ For each cell type and perturbation leading to changes, calculate mRNA expression changes between perturbation and control.
■ For each cell type and perturbation leading to changes, understand the biological processes associated with changes.
■ For each perturbation, compare changes across cell types and understand if changes are cell type-specific or shared across cell types.
■ Repeat the above steps for cells with more than one gRNA.
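The grouping and comparison steps listed above can be sketched in Python as follows. This is a simplified illustration under stated assumptions: expression values are already normalized, control cells are labelled with the hypothetical perturbation name `"control"`, changes are summarized as a log2 fold change of group means with a pseudocount, and the step of removing non-contributing cells is omitted. It is not a statistical test and not the analysis pipeline of the specification.

```python
import math
from collections import defaultdict


def log2_fold_changes(cells, control="control", pseudocount=1.0):
    """Compare each perturbation with control cells of the same cell type.

    cells: list of (cell_type, perturbation, {gene: normalized expression}).
    Returns {(cell_type, perturbation): {gene: log2 fold change}}.
    """
    sums = defaultdict(lambda: defaultdict(float))
    n_cells = defaultdict(int)
    for cell_type, perturbation, expression in cells:
        key = (cell_type, perturbation)
        n_cells[key] += 1
        for gene, value in expression.items():
            sums[key][gene] += value

    result = {}
    for (cell_type, perturbation), count in n_cells.items():
        ctrl_key = (cell_type, control)
        if perturbation == control or ctrl_key not in n_cells:
            continue  # nothing to compare against in this cell type
        pert_key = (cell_type, perturbation)
        genes = set(sums[pert_key]) | set(sums[ctrl_key])
        result[pert_key] = {
            gene: math.log2(
                (sums[pert_key].get(gene, 0.0) / count + pseudocount)
                / (sums[ctrl_key].get(gene, 0.0) / n_cells[ctrl_key] + pseudocount)
            )
            for gene in genes
        }
    return result


# Toy input (hypothetical cell types, perturbations and values).
cells = [
    ("neuron", "control", {"GeneX": 3.0}),
    ("neuron", "control", {"GeneX": 3.0}),
    ("neuron", "gA", {"GeneX": 1.0}),
    ("neuron", "gA", {"GeneX": 1.0}),
    ("astrocyte", "gA", {"GeneX": 1.0}),
]
changes = log2_fold_changes(cells)
```

Because the result is keyed by (cell type, perturbation), the same perturbation can be compared across cell types by looking up its entries for each cell type. In the toy input, the astrocyte group is skipped because no control cells of that cell type are available.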
A further aspect of the invention relates to the use of a plurality of viral gRNA-delivering nucleic acid expression vectors in a method according to the first aspect; each vector comprising: i.) inverted-terminal repeats (ITRs); ii.) a gRNA promoter; iii.) at least one guide-RNA (gRNA) under control of said gRNA promoter, wherein each vector comprises a different gRNA or gRNA combination; and iv.) a terminator of transcription.
Favorable aspects of the invention
Our invention specifically relates to a method that allows for single-cell-based analysis of multiple CRISPR-mediated gene perturbations within a single organism in vivo. This capability is significantly different from US 2020 018 746 A1, which focuses on 3D tissues and does not cover single-cell resolution in live animals.
We developed an AAV-based single-cell CRISPR screening method, AAV-Perturb-seq, which is broadly applicable for in vivo functional genomics studies. Unlike the methods in US 2020 018 746 A1, our approach ensures efficient gRNA expression and detection within single-cell libraries, optimized for large numbers of single nuclei from complex tissues isolated from animals.
Our method incorporates the use of the Cre-lox recombination system and Tet-ON/OFF promoters, providing a mechanism for controlling gene expression spatially and temporally.
The systemic delivery via intravenous injections of AAV vectors in our method enables targeting a wide range of tissues and cell types in animals of any age. This systemic delivery is tunable and provides broader application compared to the approaches described in US 2020 018 746 A1.
Our method is designed for high-throughput screening with a phenotype-rich readout, facilitating the systematic interrogation of numerous genetic variants in disease-relevant cells and tissues. US 2020 018 746 A1 does not provide a framework for in vivo single-cell screening.
Our invention allows for single-cell-based analysis of multiple CRISPR-mediated gene perturbations within a single organism in vivo. This enables detailed and comprehensive mapping of gene functions at a single-cell resolution in a living animal, a capability not emphasized or developed in US 2021 172 017 A1.
Our invention features systemic delivery of AAV vectors via intravenous injections, enabling the targeting of a wide range of tissues and cell types across the entire body in animals of any age. This broad targeting capability allows for more comprehensive studies of gene functions and interactions in various tissues simultaneously, which is not covered by US 2021 172 017 A1's approach. US 2021 172 017 A1 only covers in vitro applications.
We have specifically applied our AAV-Perturb-seq method to study genes linked to the 22q11.2 deletion syndrome and other neurological diseases in the adult mouse brain. This application demonstrates the ability to reveal novel genetic contributions and identify new cellular phenotypes associated with complex genetic disorders. US 2021 172 017 A1 does not address specific applications of the use of the technology to study neurological disorders.
Our invention involves a method for analyzing multiple gene perturbations in vivo in a tissue of interest, which includes the administration of a plurality of viral gRNA-delivering nucleic acid expression vectors. This method allows for single-cell or single-nucleus assays to analyze gene perturbation, including gRNA sequencing of each cell or nucleus. The specificity lies in the detailed steps for isolating tissues, collecting cells or nuclei, and performing various assays (e.g., RNA sequencing, DNA sequencing, protein quantification).
We provide a broadly applicable methodology for systematically mapping genotype-phenotype landscapes in vivo across various tissues and cell types, not limited to immune cells or tumors. The invention includes detailed protocols for systemic delivery, enabling targeting of a wide range of tissues and cell types in animals of any age, and incorporates numerous specific assays (e.g., single-cell RNA-seq, CITE-seq, ATAC-seq). This broader scope is not suggested or implied by WO 2019 113 499 A1.
We describe a method for analyzing multiple gene perturbations in vivo across various tissues, not limited to the brain. The method involves administering a plurality of viral gRNA-delivering nucleic acid expression vectors and performing single-cell or single-nucleus assays to analyze gene perturbation at a high resolution. This broad applicability to different tissues and comprehensive analysis distinguishes our invention from WO 2015 089 462 A1, which focuses specifically on the brain and electrophysiological recording.
We employ a variety of assays including single-cell or single-nucleus RNA sequencing, DNA sequencing, protein quantification, and other omics techniques. This multi-faceted approach allows for a deeper understanding of the genotype-phenotype relationship across different cell types and tissues, which is not suggested or implied by WO 2015 089 462 A1.
Our invention facilitates the targeting of multiple genes simultaneously using a library of gRNA-delivering vectors. This method can perturb multiple genes in parallel, enabling high-throughput studies of complex genetic interactions across various tissues. This capability for parallel perturbation and analysis of multiple genes is a significant advancement over the single-gene focus described in WO 2015 089 462 A1.
Wherever alternatives for single separable features such as, for example, an assay step or a type of protein or gRNA are laid out herein as “embodiments”, it is to be understood that such alternatives may be combined freely to form discrete embodiments of the invention disclosed herein. Thus, any of the alternative embodiments for an assay step may be combined with any type of protein or gRNA mentioned herein.
The specification further comprises the following items:
Items:
1. A method for analyzing multiple gene perturbations in vivo in a tissue of interest; said method comprising the steps:
a. providing an organoid or a non-human organism, particularly a non-human organism; b. administering a plurality of gRNA-delivering nucleic acid expression vectors to the organoid or organism, each vector comprising: i. a gRNA promoter; ii. at least one guide-RNA (gRNA) under control of said gRNA promoter, wherein each vector comprises a different gRNA or gRNA combination; and
iii. a terminator of transcription; wherein the organism expresses a Cas enzyme, or said vector additionally encodes a Cas enzyme under control of a promoter operable in said cell; c. isolating a sample of the tissue of interest from the organism, or a sample of the organoid; d. in a collection step, collecting cells or nuclei from the sample; e. in an analysis step, performing a single-cell or single-nucleus assay comprising an analysis of the gene perturbation and comprising gRNA sequencing of each cell or nucleus, thereby generating a plurality of assay patterns related to expression of a defined gRNA or a defined gRNA combination.
2. The method according to item 1, wherein the assay of the analysis step comprises single-cell or single-nucleus RNA sequencing.
3. The method according to item 1, wherein the assay of the analysis step comprises single-cell or single-nucleus DNA sequencing.
4. The method according to item 1, wherein the assay of the analysis step comprises single-cell quantification of surface proteins.
5. The method according to item 1, wherein the assay of the analysis step comprises single-cell or single-nucleus quantification of cytosolic and nuclear proteins.
6. The method according to item 1, wherein the assay of the analysis step comprises single-cell or single-nucleus quantification of histone marks.
7. The method according to item 1, wherein the assay of the analysis step comprises single-cell or single-nucleus mRNA sequencing, and the assay patterns are mRNA expression patterns.
8. The method according to item 1, wherein the assay of the analysis step comprises transposase-accessible chromatin with sequencing (ATAC-seq), and the assay patterns are chromatin accessibility patterns.
9. The method according to item 1, wherein the assay of the analysis step comprises cellular indexing of transcriptomes and epitopes by sequencing (CITE-seq), and the assay patterns are protein patterns, particularly surface protein patterns.
10. The method according to any one of the preceding items, wherein after the analysis step, the assay patterns are clustered by their type of cell or origin, thereby generating an assay profile for each cell type in the tissue of interest.
11. The method according to any one of the preceding items, wherein after the analysis step, the assay patterns are clustered by their type of expressed gRNA, thereby generating an assay profile for each gene perturbed by a gRNA or a gRNA combination.
12. The method according to any one of the preceding items, wherein each vector of the plurality of gRNA-delivering nucleic acid expression vectors comprises a reporter gene under control of a reporter gene promoter, wherein said reporter gene encodes a reporter protein, wherein said reporter protein enables selective isolation of cells that express said reporter protein in the collection step.
13. The method according to item 12, wherein in the collection step, cells or nuclei are collected selectively from the tissue of interest that exhibit expression of said reporter protein.
14. The method according to item 12 or 13, wherein the reporter protein is a fluorescent protein.
15. The method according to item 14, wherein the collection step is performed via fluorescence-activated cell sorting (FACS) or fluorescence-activated nucleus sorting (FANS).
16. The method according to any one of the preceding items, wherein in the collection step, nuclei are collected.
17. The method according to item 16, wherein the reporter protein comprises a polypeptide part localizing to a nucleus of a cell, particularly wherein the reporter protein comprises a polypeptide part interacting with a membrane of a nucleus, particularly wherein the reporter protein comprises a KASH domain.
18. The method according to item 16, wherein the reporter protein comprises a polypeptide part localizing to a nucleus of a cell, particularly wherein the reporter protein comprises an NLS (nuclear localization sequence) domain.
19. The method according to any one of the preceding items 1 to 15, wherein in the collection step, whole cells are collected.
20. The method according to any one of the preceding items, wherein after the analysis step, patterns are clustered first by their cell type.
21. The method according to any one of the preceding items, wherein said organism expresses a gene encoding a recombinase enzyme, and wherein activation of expression of said Cas enzyme is mediated via said recombinase enzyme.
22. The method according to item 21, wherein the recombinase enzyme is a Cre enzyme.
23. The method according to any one of the preceding items 12 to 22, wherein said reporter gene promoter is an RNA-polymerase II promoter.
24. The method according to any one of the preceding items, wherein the gRNA-delivering nucleic acid expression vector is a viral vector, particularly wherein the gRNA-delivering nucleic acid expression vector is an AAV vector, an adenoviral vector, or a lentiviral vector, more particularly wherein the gRNA-delivering nucleic acid expression vector is an AAV vector.
25. The method according to any one of the preceding items, wherein the organism is an animal, particularly wherein the organism is a vertebrate, more particularly wherein the organism is a mammal, most particularly a mammal selected from the group of a rodent, a primate, an ungulate, a lagomorph, a carnivore, an insectivore, and a chiroptera.
26. The method according to any one of the preceding items, wherein the Cas enzyme is Cas9.
27. The method according to any one of the preceding items, wherein the gene encoding the Cas enzyme is introduced into the germline of the organism.
28. The method according to any one of the preceding items, wherein the gene encoding the Cas enzyme is delivered via a vector.
29. The method according to any one of the preceding items, wherein the gRNA promoter and the reporter gene promoter are two distinct promoters.
30. The method according to any one of the preceding items, wherein the gRNA promoter is a tissue-specific promoter.
31. The method according to any one of the preceding items, wherein the gRNA promoter is an inducible or conditional promoter, particularly a promoter selected from the group of a Tet-ON promoter, a Tet-OFF promoter, and a Cre-dependent promoter.
32. The method according to any one of the preceding items, wherein 5’ capture sequencing is performed.
33. A method for analyzing multiple gene perturbations in vivo in a tissue of interest; said method comprising the steps: a. providing an organoid or a non-human organism, particularly a non-human organism; b. administering a plurality of viral gRNA-delivering nucleic acid expression vectors to the organoid or organism, each vector comprising: i. inverted-terminal repeats (ITRs); ii. a gRNA promoter;
iii. at least one guide-RNA (gRNA) under control of said gRNA promoter, wherein each vector comprises a different gRNA or gRNA combination; and iv. a terminator of transcription; wherein the organism expresses a Cas enzyme, or said vector additionally encodes a Cas enzyme under control of a promoter operable in said cell;
c. isolating a sample of the tissue of interest from the organism, or a sample of the organoid; d. in a collection step, collecting cells or nuclei from the sample; e. in an analysis step, performing a single-cell or single-nucleus assay comprising an analysis of the gene perturbation and comprising gRNA sequencing of each cell or nucleus, thereby generating a plurality of assay patterns related to expression of a defined gRNA or a defined gRNA combination.
34. The method according to item 33, wherein the assay of the analysis step comprises a method selected from the group of
- single-cell or single-nucleus RNA sequencing;
- single-cell or single-nucleus DNA sequencing;
- single-cell quantification of surface proteins;
- single-cell or single-nucleus quantification of cytosolic and nuclear proteins;
- single-cell or single-nucleus quantification of histone marks;
- single-cell or single-nucleus mRNA sequencing, and the assay patterns are mRNA expression patterns;
- transposase-accessible chromatin with sequencing (ATAC-seq), and the assay patterns are chromatin accessibility patterns;
- cellular indexing of transcriptomes and epitopes by sequencing (CITE-seq), and the assay patterns are protein patterns, particularly surface protein patterns.
35. The method according to any one of the preceding items 33 to 34, wherein after the analysis step, a. the assay patterns are clustered by their type of cell or origin, thereby generating an assay profile for each cell type in the tissue of interest; and/or b. the assay patterns are clustered by their type of expressed gRNA, thereby generating an assay profile for each gene perturbed by a gRNA or a gRNA combination.
36. The method according to any one of the preceding items 33 to 35, wherein each vector of the plurality of gRNA-delivering nucleic acid expression vectors comprises a reporter gene under control of a reporter gene promoter, wherein said reporter gene encodes a reporter protein, wherein said reporter protein enables selective isolation of cells that express said reporter protein in the collection step, particularly wherein in the collection step, cells or nuclei are collected selectively from the tissue of interest that exhibit expression of said reporter protein, more particularly wherein the reporter protein is a fluorescent protein.
37. The method according to any one of the preceding items 33 to 36, wherein in the collection step, nuclei are collected.
38. The method according to item 37, wherein the reporter protein comprises a polypeptide part localizing to a nucleus of a cell, particularly wherein
- the reporter protein comprises a polypeptide part interacting with a membrane of a nucleus, particularly wherein the reporter protein comprises a KASH domain, or
- the reporter protein comprises a polypeptide part localizing to a nucleus of a cell, particularly wherein the reporter protein comprises a NLS (nuclear localization sequence) domain.
39. The method according to any one of the preceding items 33 to 36, wherein in the collection step, whole cells are collected.
40. The method according to any one of the preceding items 33 to 39, wherein said organism expresses a gene encoding a recombinase enzyme, and wherein activation of expression of said Cas enzyme is mediated via said recombinase enzyme, particularly wherein the recombinase enzyme is a Cre enzyme.
41. The method according to any one of the preceding items 33 to 40, wherein the gRNA-delivering nucleic acid expression vector is an AAV vector, an adenoviral vector, a rabies vector, a sindbis vector, or a lentiviral vector, more particularly wherein the gRNA-delivering nucleic acid expression vector is an AAV vector.
42. The method according to any one of the preceding items 33 to 41, wherein the organism is an animal, particularly wherein the organism is a vertebrate, more particularly wherein the organism is a mammal, most particularly a mammal selected from the group of a rodent, a primate, an ungulate, a lagomorph, a carnivore, an insectivore, and a chiroptera.
43. The method according to any one of the preceding items 33 to 42, wherein the Cas enzyme is Cas9.
44. The method according to any one of the preceding items 33 to 43, wherein the gene encoding the Cas enzyme
- is introduced into the germline of the organism, or
- is delivered via a vector.
45. The method according to any one of the preceding items 33 to 44, wherein the gRNA promoter and the reporter gene promoter are two distinct promoters.
46. The method according to any one of the preceding items 33 to 45, wherein the gRNA promoter is a tissue-specific promoter.
47. The method according to any one of the preceding items 33 to 46, wherein the gRNA promoter is an inducible or conditional promoter, particularly a promoter selected from the group of a Tet-ON promoter, a Tet-OFF promoter, and a Cre-dependent promoter.
The invention is further illustrated by the following examples and figures, from which further embodiments and advantages can be drawn. These examples are meant to illustrate the invention but not to limit its scope.
Fig. 1 In vivo single-nucleus pooled CRISPR screening in the adult brain enabled through systemic administration of AAV.PHP.B and 5’ gRNA capture. a. AAV-Perturb-seq experimental pipeline. b. Expression of mTagBFP, Venus, and mCherry in the prefrontal cortex after systemic injection of an equal mixture of 5.0 x 10^9 total AAV particles (scale bar = 100 µm). c. Representation of the 22q11.2 locus where the genes expressed in the adult mouse prefrontal cortex are indicated. The human 22q11.2 locus is conserved in mouse chromosome 16. d. UMAP embedding of ~150,000 AAV.PHP.B-infected nuclei isolated from the mouse prefrontal cortex. e. Number of nuclei with a unique gRNA for each perturbation across cell types.
Fig. 2 Perturbation of 22q11.2-linked genes Dgcr8, Dgcr14, Gnb1l, and Ufd1l results in strong transcriptional changes in adult brain cell types. a. Schematic of the analysis pipeline (SH control: nuclei with control gRNAs targeting safe-harbor locus; P: perturbation; n: total number of perturbations; LFC: log fold change). b. Number of differentially expressed genes (DEG) for all perturbations in individual cell types. Dashed line indicates 5 DEGs with an adjusted p-value (p.adj) lower than 0.05. c. Heatmap and hierarchical clustering of the 20 top up-regulated genes (columns) in Dgcr8, Dgcr14, Gnb1l, and Ufd1l perturbations in neuron types (rows). d. UMAP embedding of control nuclei and nuclei passing filter perturbed in Dgcr8, Dgcr14, Gnb1l, and Ufd1l for each neuron type using DEGs as variables (LFC > 0.5 and FDR < 0.01). e. Deep sequencing-based gene editing (indel) analysis for four gRNAs targeting 22q11.2 genes with strong transcriptional phenotypes (Dgcr8, Dgcr14, Gnb1l, and Ufd1l) and four gRNAs targeting genes without apparent transcriptional phenotypes (Comt, Med15, Ranbp1, and Pi4ka).
Fig. 3 Perturbation of 22q11.2 genes results in the disruption of distinct sets of biological processes. a. Schematic representation of arrayed validation experiments. b. Pearson correlation and hierarchical clustering of transcriptional signatures (LFC values) mediated by Dgcr8, Dgcr14, or Gnb1l perturbation in pooled screen and arrayed confirmation experiments for each neuron type. c. Heatmap showing the six transcriptional programs (grouped rows) altered in Dgcr8, Dgcr14, and Gnb1l perturbed cells (columns) across cell types and experiments (screen or arrayed). Left: LFC values for each altered gene across neuron types and experiments. Right: disrupted biological process for each genetic program (Top Biological Processes), direction of expression change (Dir.), and representative genes (Genes). d-f. Gene program scores for up-regulated (UP) and down-regulated (DOWN) genes in Dgcr8, Dgcr14, and Gnb1l perturbed Interneurons from the screen dataset.
Fig. 4 Transcriptional changes found in LgDel model neurons are partially explained by perturbation of Dgcr8, Dgcr14, and Gnb1l. a. Schematic representation of the LgDel single-nucleus cortex atlas experimental design. snRNA-seq of 10-week-old LgDel+/- (LgDel) and WT control (LgDel+/+) mouse brain prefrontal cortex (n=3 males for each condition). b. Left: UMAP embedding depicting cell types identified in WT and LgDel samples. Right: Individual UMAP representations of WT and LgDel nuclei. c. LFC values calculated with pseudobulk DE analysis when comparing LgDel against WT control for genes targeted in the pooled screen across cell types. Dgcr2 and Rimbp3 were omitted due to their low expression levels and thus inaccuracy in calculating LFC. d. Biological processes enriched in LgDel transcriptional profiles from each cell type (NES: normalized enrichment score; p.adj: adjusted p-value). e. Cosine similarity of LFC profiles between individual perturbations and LgDel for each cell type. f. Heatmap showing the LFC values for the top 100 predicted genes in individual perturbations, LgDel, and the model (LgDel = 0.21 Dgcr8 + 0.18 Gnb1l + (-0.11) Dgcr14, dcor = 0.40) prediction based on individual perturbation profiles. g. Gene program score in WT control and LgDel nuclei for the up-regulated program in Dgcr8-perturbed nuclei (Student’s t-test, FDR values Supp. Layer Neurons < 0.01, Deep Layer Neurons < 0.01, and Interneurons < 0.01). h. Gene program score in WT control and LgDel nuclei for the down-regulated program in Gnb1l-perturbed nuclei (Student’s t-test, FDR values Supp. Layer Neurons < 0.01, Deep Layer Neurons < 0.01, and Interneurons < 0.01).
Fig. 5 LgDel and individual 22q11.2 gene perturbations alter the expression of disease-associated risk genes. Heatmap highlighting genes (rows) commonly dysregulated in individual perturbations and LgDel transcriptional profiles (right) and their association with neurodevelopmental disorders (left side) (SCZ: Schizophrenia; BP: Bipolar Disorder; ADHD: Attention Deficit Hyperactivity Disorder; ASD: Autism Spectrum Disorder).
Fig. 6 AAV injection and nuclei isolation conditions. a. Schematic representation of AAV genomes used to deliver and express mTagBFP, Venus, or mCherry under the control of the CBh promoter. b. Schematic representation of the triple color experiment. An equal-ratio mix of the three AAVs was injected in LSL-Cas9 animals with different doses (Low: 2.5 x 10^9; Medium: 5.0 x 10^9; High: 2.5 x 10^10, total AAV particles). c. Percentage of infected nuclei (i.e., nuclei expressing at least one fluorescent protein) after systemic injection of different viral doses. d. Percentage of infected nuclei expressing one, two, or the three FPs. Data shown for injections with 5.0 x 10^9 and 2.5 x 10^10 total AAV particles. e. Fluorescence imaging of brain cells expressing GFP four weeks after systemic injection of 5.0 x 10^9 AAV particles. f. Flow cytometry gating strategy to sort GFP-positive nuclei.
Fig. 7 Astrocyte-specific pooled screen. a. Schematic representation of the AAV genome engineered to express. b. UMAP embedding of ~35,000 AAV.PHP.B-infected nuclei isolated from the mouse prefrontal cortex. c. Abundance of cell types in single-nucleus datasets generated from brain cells infected with CBh and GfaABC1D AAVs. d. Percentage of gRNAs detected per nucleus in Astrocytes. e. Number of DEGs for all perturbations in Astrocytes.
Fig. 8 In vivo CRISPR screening in a high-fat diet (HFD) MASH model. a. Animals were injected with genetic interventions and exposed to HFD for five months. b. Bulk gRNA count analysis revealed genes involved in hepatocyte damage. c. Bulk gRNA count prioritizes interventions with therapeutic potential. Interventions highlighted in red are examples of positive controls known to have an effect and support the ability of our platform to pinpoint potential therapeutic targets.
Fig. 9 Microglia-specific AAV capsids. a. Average number of nuclei per perturbation across cell types using AAV.PHP.B for gRNA library delivery, as reported in Santinha et al. 2023. b. Experimental design to test microglia-specific AAV capsids. Viruses were tail vein injected (10^12 particles per animal), and the number of infected microglia was assessed three weeks later. c. Flow cytometry results for CD11b-positive microglia for two controls (PBS and PHP.eB) and four microglia-specific AAV capsids (AAV M1-M4).
Fig. 10 Stereotaxic injection of microglia-specific AAV capsids. a. Experimental design to test stereotaxic injections of microglia-specific AAV capsids. Viruses were injected (10^9 particles per animal), and the number of infected microglia was assessed three weeks later. b. Flow cytometry results for CD11b-positive microglia for animals injected with PBS, AAV.MG1.1, and AAV.MG1.2.
Examples
Example 1: In vivo single-nucleus pooled CRISPR screening in the adult brain enabled through systemic administration of AAV.PHP.B and 5’ gRNA capture
Towards creating a robust and broadly applicable direct in vivo single-cell CRISPR screening platform, we reasoned that it must have the following features: 1) simple to apply in mouse models, 2) relevant to a broad range of tissues and cell types yet also tunable to subsets of interest, 3) capable of inducing efficient genetic perturbations and recovering this information with a transcriptomic readout, and 4) delivery enabling low multiplicity of infection such that single cells receive single perturbations (Fig. 1a). We hypothesized that systemic AAV-mediated delivery may enable each of these features and therefore set out to establish and characterize this approach in vivo in the mouse brain.
To test whether AAV permits infection of a large number of cells at low multiplicity of infection, we performed an in vivo titration experiment. We prepared three AAV transfer plasmids to independently express either mTagBFP, Venus, or mCherry under the control of a ubiquitous CBh promoter (Fig. 6a). Each fluorescent protein (FP) was additionally fused to a KASH domain, which physically attaches proteins to the nuclear membrane, thus enabling nuclei sorting. In each case, we used the AAV.PHP.B capsid (ref. 17) to achieve brain-wide infection after systemic delivery in C57BL/6 mice. We injected an equal mixture of the three viruses via the tail vein with a low (2.5 x 10^9), medium (5.0 x 10^9), or high (2.5 x 10^10) dose of total viral particles (Fig. 6b). Flow cytometry analysis of nuclei isolated from brain tissue revealed a direct correlation between the viral dose, number of infected cells, and multiplicity of infection (Fig. 6c-d). The medium dose infected 13.5% of total nuclei, with the expression of a single FP detected in 70% of those (Fig. 1b and Fig. 6d). The higher viral dose led to an increased percentage of total infected nuclei (~34%), but only 55% of these presented a unique FP (i.e., almost half received more than one AAV particle). For subsequent experiments, we selected the dose of 5.0 x 10^9 AAV particles per animal as it maximized the total number of cells infected with a single AAV (Fig. 6e-f).
Next, we focused on establishing a method to capture both mRNA and CRISPR gRNA molecules from the same AAV-infected nucleus. The use of nuclei rather than cells permits the study of complex, mature tissues from which good-quality single-cell suspensions are challenging to obtain (ref. 18). We designed two different strategies where the gRNA expression cassette was either embedded within an mRNA (pAS006) or expressed independently (pAS088), enabling either 3’ (CROP-seq, ref. 1) or 5’ (ECCITE-seq, ref. 19) capture sequencing methods, respectively. We injected AAV.PHP.B containing small pools of ten distinct gRNAs for each construct. Four weeks after injection, we isolated single nuclei and prepared single-nucleus RNA sequencing (snRNA-seq) libraries using either 5’ or 3’ capture methods for cells infected with pAS088 or pAS006, respectively. Analysis of infected nuclei revealed that the percentage of total nuclei with a gRNA detected was 65% and 20% for the 5’- and 3’-based approaches, respectively, and that most infected nuclei contained a unique gRNA. For each nucleus, we found that the 5’-based approach captured an average of ~25 unique UMIs per gRNA, compared to ~3 unique UMIs for the 3’-based approach. Taken together, we established that the 5’-based approach, combining independent gRNA expression (pAS088) with 5’ capture sequencing, best captures mRNA and gRNA information in AAV-infected nuclei, and we therefore proceeded with this method for subsequent experiments (Fig. 1a).
Example 2: AAV-Perturb-seq of 22q11.2 DS genes yields a rich single-nucleus dataset spanning genes and brain cell types from adult mice
To test the ability of our in vivo screening method to interrogate complex genetic disorders in adult animals, we applied AAV-Perturb-seq to individually perturb 22q11.2 genes in mature somatic cells in the prefrontal cortex of adult mice. Deletion of the 22q11.2 locus is one of the most common chromosomal deletions in humans and results in a complex spectrum of phenotypes, including altered neuronal development and function. While the developmental phenotypes of a few single genes within the locus have been characterized (ref. 15), the function(s) of individual 22q11.2 genes in the adult brain are poorly understood and have not been systematically examined. To identify candidate genes important for brain function in adult animals, we analyzed DropViz data to measure the expression of the mouse homologs of 22q11.2 genes. This analysis revealed that 29 of the 37 genes in the locus are expressed in the adult mouse cortex (Fig. 1c). We focused our attention on the prefrontal cortex as dysfunction of this region is thought to underlie many of the 22q11.2 DS neuropsychiatric manifestations (ref. 16). We designed a CRISPR gRNA library to target each of the 29 adult-expressed genes with two independent gRNAs and included five control gRNAs targeting mouse safe-harbor (SH) loci. The use of SH-targeting gRNAs controls for potential transcriptional changes induced by Cas9-mediated DNA double-strand breaks and not related to the function of the target gene (ref. 20). We cloned the gRNA library into our 5’ capture AAV transfer plasmid (pAS088), produced AAV.PHP.B, and injected 5.0 x 10^9 particles per animal. Four weeks after injection, we sequenced ~150,000 sorted GFP-positive infected nuclei from the prefrontal cortex of 15 male mice using our modified 5’ capture method.
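By way of a non-limiting illustration, the composition of the library described above (29 target genes, two gRNAs per gene, five SH controls) can be tallied as follows. The gene and guide labels are hypothetical placeholders, not the identifiers used in the actual library.

```python
def build_library(target_genes, guides_per_gene=2, n_controls=5):
    """Return guide labels for a pooled library (all labels are hypothetical)."""
    guides = [
        f"{gene}_g{i}"
        for gene in target_genes
        for i in range(1, guides_per_gene + 1)
    ]
    # Safe-harbor (SH) control guides are appended after the targeting guides.
    guides += [f"SH_ctrl_{i}" for i in range(1, n_controls + 1)]
    return guides


# Placeholder names stand in for the 29 adult-expressed 22q11.2 homologs.
targets = [f"gene_{i:02d}" for i in range(1, 30)]
library = build_library(targets)
print(len(library))  # 29 * 2 + 5 = 63
```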
We first focused our analysis on cell type identification and perturbation assignment. Clustering analysis with Seurat (Stuart, T. et al., Cell 177, 1888-1902.e21 (2019)) identified expected neuronal and nonneuronal brain cells, highlighting our ability to infect and recover transcriptional information from a broad range of cell types (Fig. 1d). Expression of gRNA molecules was detected in all cell types, with 70% of nuclei showing at least one gRNA and 45% (~60,000 nuclei) containing a unique gRNA (i.e., only one perturbed gene). Furthermore, we detected all gRNAs in the library, with average numbers of nuclei per gRNA ranging from ~10 in microglia to 400 in interneurons (Fig. 1e). The average number of nuclei per gRNA in any given cell type directly correlates with the total number of nuclei, indicating that genetic perturbations did not grossly alter the composition of cell types in the brain. Taken together, our direct in vivo screen experiment resulted in a single-nucleus transcriptomic dataset containing ~60,000 nuclei spanning 6 brain cell types and perturbation of all 22q11.2 genes expressed in the adult prefrontal cortex.
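As a minimal sketch of the perturbation assignment step, the following classifies each nucleus as carrying no gRNA, a unique gRNA, or multiple gRNAs from per-nucleus gRNA UMI counts. The UMI threshold (`min_umis`), the data layout, and the toy barcodes are assumptions made for illustration; the text does not specify the assignment rule actually used.

```python
from collections import Counter


def assign_grnas(umi_counts, min_umis=3):
    """Classify each nucleus by the gRNAs detected above a UMI threshold.

    umi_counts: dict mapping nucleus barcode -> {gRNA name: UMI count}.
    min_umis:   hypothetical detection threshold (not taken from the text).
    Returns a dict mapping barcode -> (status, detected gRNAs), where status
    is "none", "unique", or "multiple".
    """
    calls = {}
    for barcode, counts in umi_counts.items():
        detected = sorted(g for g, n in counts.items() if n >= min_umis)
        if not detected:
            status = "none"
        elif len(detected) == 1:
            status = "unique"
        else:
            status = "multiple"
        calls[barcode] = (status, detected)
    return calls


def summarize(calls):
    """Fraction of nuclei with at least one gRNA and with exactly one gRNA."""
    tally = Counter(status for status, _ in calls.values())
    total = len(calls)
    return {
        "frac_with_grna": (tally["unique"] + tally["multiple"]) / total,
        "frac_unique": tally["unique"] / total,
    }


# Toy input with made-up barcodes and counts.
toy = {
    "N1": {"Dgcr8_g1": 25},
    "N2": {"Dgcr8_g1": 12, "Comt_g2": 9},
    "N3": {"SH_g1": 1},  # below threshold, treated as undetected
    "N4": {},
}
calls = assign_grnas(toy)
summary = summarize(calls)
```

Only nuclei with status "unique" would then be carried forward as single-perturbation nuclei.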
Example 3: Perturbation of Dgcr8, Dgcr14, Gnb1l, or Ufd1l results in strong transcriptional changes in prefrontal cortex neurons
We developed a data analysis pipeline to associate gRNAs, and thus genetic perturbations, with cell type-specific transcriptional phenotypes (Fig. 2a). We created pseudobulk profiles by aggregating nuclei with the same perturbation and employed edgeR (Robinson, M. D. et al., Bioinforma. Oxf. Engl. 26, 139-140 (2010)) to calculate pairwise differential expression (DE) between control and each perturbation in superficial and deep layer excitatory neurons, interneurons, astrocytes, and oligodendrocytes. Our choice of using pseudobulk profiles is supported by recent benchmarking studies indicating that commonly used single-cell-specific DE methods tend to identify differentially expressed genes (DEG) in the absence of biological differences. Indeed, we evaluated single-cell-specific DE methods and found that these lead to biases towards highly expressed genes. Using our pseudobulk approach, we found significant transcriptional phenotypes in four perturbations across all neuron types (Dgcr8, Dgcr14, Gnb1l, and Ufd1l), as measured by the number of DEGs (Fig. 2b). Transcriptional phenotype scoring analysis with Hotelling’s T-squared statistics (Ursu, O. et al., Nat. Biotechnol. 40, 896-905 (2022)) confirmed the identity of the genes with strong transcriptional phenotypes when perturbed in neurons. We also observed that all four genes are present within the 1.5 Mb minimal region believed to be critical in 22q11.2-related disorders (Fig. 1c).
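The pseudobulk aggregation described above can be sketched as follows. This is a simplified illustration only: it sums raw counts per perturbation label and reports a log2 ratio of counts-per-million, whereas edgeR additionally models dispersion and performs statistical testing. The function names, the pseudocount, and the toy data are assumptions.

```python
import math


def pseudobulk(counts_by_nucleus, labels):
    """Sum raw counts over all nuclei that share a perturbation label."""
    profiles = {}
    for barcode, gene_counts in counts_by_nucleus.items():
        profile = profiles.setdefault(labels[barcode], {})
        for gene, n in gene_counts.items():
            profile[gene] = profile.get(gene, 0) + n
    return profiles


def log2_fold_change(perturbed, control, pseudocount=1.0):
    """log2 ratio of counts-per-million between two pseudobulk profiles."""
    def cpm(profile):
        total = sum(profile.values())
        return {g: 1e6 * n / total for g, n in profile.items()}

    p, c = cpm(perturbed), cpm(control)
    return {
        g: math.log2((p.get(g, 0.0) + pseudocount) / (c.get(g, 0.0) + pseudocount))
        for g in set(p) | set(c)
    }


# Toy counts for four nuclei and two genes (values are made up).
counts = {
    "N1": {"GeneA": 10, "GeneB": 90},
    "N2": {"GeneA": 30, "GeneB": 70},
    "N3": {"GeneA": 5, "GeneB": 95},
    "N4": {"GeneA": 5, "GeneB": 95},
}
labels = {"N1": "Dgcr8", "N2": "Dgcr8", "N3": "SH", "N4": "SH"}

profiles = pseudobulk(counts, labels)
lfc = log2_fold_change(profiles["Dgcr8"], profiles["SH"])
```

In practice the aggregation would be done per cell type and per replicate so that a tool such as edgeR can estimate variability between replicates.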
Next, we characterized the transcriptional phenotypes resulting from perturbation of Dgcr8, Dgcr14, Gnb1l, and Ufd1l across neuron types. The overarching result was that perturbation of each gene led to a largely distinct transcriptional phenotype that was mostly shared across neuron types. Support for this came from: 1) clustering the top 20 (Fig. 2c) or all up-regulated genes for each perturbation; 2) Augur score analysis, which scores cells based on their dissimilarity to the control condition; 3) correlation analysis using all DEGs; and 4) two-dimensional UMAP embeddings, which directly segregated nuclei with different perturbations from each other and from SH control cells (Fig. 2d). Taken together, these observations demonstrate that AAV-Perturb-seq retrieves both mutation and cell type-specific signatures and indicate that perturbation of Dgcr8, Dgcr14, Gnb1l, and Ufd1l affects specific subsets of unique genes across neuron types.
Example 4: Altered transcriptional phenotypes are due to gene function and not a consequence of gene editing efficiency
A limitation of CRISPR screens is that the efficiency of each gRNA in a library is only inferred or predicted and not known for all cell types (Bock, C. et al., Nat. Rev. Methods Primers 2, 1-23 (2022)), and therefore underperforming gRNAs may confound our ability to robustly identify perturbed cells. This could explain, for example, why only a fraction of the perturbed 22q11.2 genes showed a strong transcriptional phenotype. To assess this possibility, we prepared eight individual AAV.PHP.B viruses expressing gRNAs targeting the four genes with a strong transcriptional change (Dgcr8, Dgcr14, Gnb1l, and Ufd1l) and four randomly chosen genes with no apparent transcriptional phenotype (Comt, Med15, Ranbp1, and Pi4ka), which were then individually injected into distinct mice. Analysis of Cas9-mediated indels revealed that the percentage of mutated cells was similar across all tested gRNAs, with the majority of edited cells harboring frame-shifting loss-of-function mutations in the targeted gene (Fig. 2e).
Whilst our analysis revealed efficient gene editing and DEGs across perturbations and neuron types, we set out to refine our pipeline further considering gene editing mosaicism. As not all nuclei expressing a gRNA necessarily carry a loss-of-function mutation, and merging perturbed and non-perturbed transcriptomes into a pseudobulk profile could dampen and/or confound transcriptional phenotypes, we focused on identifying and filtering non-perturbed nuclei from the analysis. It is not sufficient to measure transcript levels of the target gene as indels created by Cas9 do not always result in mRNA degradation. Therefore, using the previously detected DEGs for each perturbation as variables (Fig. 2b), we used linear discriminant analysis (LDA) to identify gRNA-containing nuclei with a transcriptional phenotype significantly distinct from SH control nuclei. This analysis revealed that, on average, ~50% of nuclei containing a particular gRNA were perturbed, in line with our observed gene editing efficiency (Fig. 2e) and expected non-loss-of-function genotypes. After discarding non-perturbed nuclei and repeating the pseudobulk differential expression analysis, we observed that nuclei filtering increased our sensitivity to detect DEGs without biasing the transcriptional phenotype.
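The filtering idea can be illustrated with a two-class Fisher linear discriminant restricted to two features. This is a sketch under stated assumptions, not the pipeline used here: the two features stand in for the expression of two DEGs, the discriminant is fitted on SH control versus gRNA-containing nuclei, and a gRNA-containing nucleus is kept when its projection falls on the perturbed side of the midpoint between class means. All values are made up.

```python
def mean_vec(rows):
    """Column means of a list of 2-feature rows."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]


def pooled_cov(groups, means):
    """Pooled within-class 2x2 covariance matrix."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    dof = 0
    for rows, m in zip(groups, means):
        for r in rows:
            d = [r[0] - m[0], r[1] - m[1]]
            for i in range(2):
                for j in range(2):
                    s[i][j] += d[i] * d[j]
        dof += len(rows) - 1
    return [[v / dof for v in row] for row in s]


def fit_discriminant(control, grna_nuclei):
    """Return the Fisher direction w and a midpoint decision threshold."""
    m0, m1 = mean_vec(control), mean_vec(grna_nuclei)
    (a, b), (c, d) = pooled_cov([control, grna_nuclei], [m0, m1])
    det = a * d - b * c
    diff = [m1[0] - m0[0], m1[1] - m0[1]]
    # w = S^-1 (m1 - m0), using the explicit inverse of a 2x2 matrix.
    w = [(d * diff[0] - b * diff[1]) / det, (-c * diff[0] + a * diff[1]) / det]
    threshold = 0.5 * sum(wi * (x + y) for wi, x, y in zip(w, m0, m1))
    return w, threshold


def is_perturbed(nucleus, w, threshold):
    """True if the nucleus projects onto the perturbed side of the threshold."""
    return sum(wi * xi for wi, xi in zip(w, nucleus)) > threshold


# Toy data: the last two gRNA-containing nuclei resemble controls.
control = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (1.1, 1.2)]
grna_nuclei = [(3.0, 3.1), (3.2, 2.9), (2.9, 3.3), (1.0, 0.9), (1.1, 1.1)]

w, threshold = fit_discriminant(control, grna_nuclei)
kept = [is_perturbed(n, w, threshold) for n in grna_nuclei]
```

Nuclei flagged `False` would be discarded before repeating the pseudobulk analysis. A real implementation would use all DEGs as features and a library routine for the discriminant.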
These results reveal the robustness of Cas9-mediated gene editing in vivo but also show that not all mutated genes lead to transcriptional phenotypes, which could be explained by subtle transcriptional changes in lowly expressed genes not detected by snRNA-seq, genetic compensation mechanisms, or a lack of transcriptional consequences upon these perturbations in the cell type or state examined. Our approach therefore focuses on genes that result in substantial transcriptional changes when perturbed under homeostatic conditions.
Example 5: Arrayed perturbations confirm AAV-Perturb-seq results
To confirm the fidelity of our pooled screening method, we performed validation experiments by perturbing selected genes individually in vivo followed by snRNA-seq (Fig. 3a). In individual LSL-Cas9 mice, we injected AAVs expressing one gRNA targeting either Dgcr8, Dgcr14, Gnb1l, or a SH control. We excluded Ufd1l from this and subsequent analyses after confirming its transcriptional response was enriched for terms associated with apoptosis, in line with its predicted role as an essential gene.
Additionally, we focused our attention on neurons by exchanging the ubiquitous CBh promoter for the neuron-specific promoter hSyn. After sequencing ~6,000 nuclei per condition (3 animals each), dataset integration and clustering revealed the presence of mostly neurons and only a residual level of nonneuronal cells, with individual perturbations detected in all cell types. Similar to our findings in the pooled screen, pseudobulk analysis results from the arrayed experiments highlighted strong transcriptional changes induced by all gRNAs, which were distinct from one another and led to condition-specific phenotypes. A direct comparison of pooled and arrayed experiments revealed a high correlation between transcriptional phenotypes for all perturbations and neuron types, indicating that AAV-Perturb-seq is capable of faithfully capturing single-cell transcriptomes from pooled perturbation experiments (Fig. 3b).
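As an illustration of how pooled and arrayed signatures may be compared, the following computes the Pearson correlation between two LFC profiles over their shared genes. The dictionaries and gene labels are toy placeholders; only the use of Pearson correlation on LFC values is taken from the description of Fig. 3b.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def compare_signatures(lfc_screen, lfc_arrayed):
    """Correlate two LFC profiles using only the genes present in both."""
    shared = sorted(set(lfc_screen) & set(lfc_arrayed))
    return pearson(
        [lfc_screen[g] for g in shared],
        [lfc_arrayed[g] for g in shared],
    )


# Toy LFC profiles (made-up values); GeneD is absent from the screen profile.
lfc_screen = {"GeneA": 1.0, "GeneB": 2.0, "GeneC": 3.0}
lfc_arrayed = {"GeneA": 2.0, "GeneB": 4.0, "GeneC": 6.0, "GeneD": 0.0}
r = compare_signatures(lfc_screen, lfc_arrayed)
```

Restricting the comparison to shared genes avoids treating genes missing from one experiment as unchanged.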
Example 6: Perturbation of 22q11.2-associated genes results in heterozygous and homozygous cells with similar transcriptional phenotypes
The control of zygosity is a general challenge in CRISPR screens, as the expression of Cas9 and gRNA can lead to three potential scenarios: 1) the cells are infected but not edited and are thus wild-type (WT); 2) the cells are infected and acquire a heterozygous mutation; or 3) the cells are infected and acquire a homozygous mutation. While WT cells do not contribute to the observed transcriptional phenotypes and are removed by our filtering strategy, it is unclear whether heterozygous and homozygous mutations lead to the same transcriptional phenotype. This is especially important for modelling haploinsufficiency, as is the case for 22q11.2 DS. To explore potential differences between zygosities, we stratified SH control and perturbed nuclei in a pseudotemporal space using diffusion maps (Haghverdi, L. et al., Bioinformatics 31, 2989-2998 (2015)). This analysis revealed that perturbed nuclei disperse continuously, suggesting that, for Dgcr8, Dgcr14, and Gnb1l, the transition from heterozygous to homozygous likely reflects a change in the magnitude of differential expression rather than a change in the transcriptional phenotype. While we do not know the ground truth genotypes for the sequenced nuclei, it is possible to reasonably stratify the data leveraging the apparent bimodal distribution in the diffusion space that likely captures heterozygous and homozygous nuclei. Considering these new zygosity stratifications, heterozygous and homozygous groups are indistinguishable in terms of the dysregulated genes that define the transcriptional phenotypes but are distinguishable in terms of the expression levels of those genes, with LFC values calculated using only heterozygous or homozygous nuclei highly correlating to LFC values obtained previously using all nuclei.
To further support these findings, we hypothesized that CRISPR inhibition (CRISPRi)-mediated knockdown may reduce target gene expression to levels observed in a heterozygous condition and thus be used to simulate the phenotypes generated by haploinsufficiency. We prepared AAVs carrying SH control or Dgcr8-targeting gRNAs and injected them into a dCas9-KRAB mouse model. Across neuron types, CRISPRi-mediated Dgcr8 mRNA reduction was comparable to the values observed for 22q11.2 DS, indicating our ability to model heterozygosity. We also confirmed that CRISPRi- and CRISPR-mediated Dgcr8 perturbation led largely to analogous transcriptional phenotypes. Support for this came from: 1) high similarity between LFC values; 2) genes considered DE by both experiments; and 3) discovery of DE genes known to be involved with the biological function of Dgcr8. These results strongly suggest that heterozygous and homozygous mutations in Dgcr8, Dgcr14, and Gnb1l result in a continuous phenotype and that the assessment of both or either genotype captures the impact of the perturbation (and is thus relevant to haploinsufficiency).
Example 7: Perturbation of 22q11.2 DS genes results in the disruption of distinct sets of biological processes
Next, we focused on characterizing the transcriptional phenotypes and disrupted biological processes resulting from perturbation of individual 22q11.2 DS genes. For each perturbation we divided dysregulated genes into two genetic programs to represent the up-regulated (LFC > 0.5; FDR < 0.01) and down-regulated (LFC < -0.5; FDR < 0.01) groups (Fig. 3c). Gene ontology analysis of each program revealed dozens of disrupted biological processes associated with each perturbation (Fig. 3c). To confirm that genes identified by DE analysis had altered expression, we calculated their gene program score - average normalized expression of all genes in a program - for each nucleus in both screen and array datasets (Fig. 3d-f). Across all neuron types, this analysis confirmed that programs are perturbation specific and their expression changes coincide with LFC values calculated with pseudobulk DE analysis - encouraging us to go deeper into functionally interpreting the data.
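The program definition and scoring described above can be sketched as follows. The thresholds (LFC > 0.5 or < -0.5; FDR < 0.01) and the definition of the score as the average normalized expression of all program genes come from the text; the data layout, function names, and toy values are assumptions.

```python
def split_programs(de_table, lfc_cut=0.5, fdr_cut=0.01):
    """Split DE results into up- and down-regulated gene programs.

    de_table: dict mapping gene -> (LFC, FDR).
    """
    up = sorted(
        g for g, (lfc, fdr) in de_table.items() if lfc > lfc_cut and fdr < fdr_cut
    )
    down = sorted(
        g for g, (lfc, fdr) in de_table.items() if lfc < -lfc_cut and fdr < fdr_cut
    )
    return up, down


def program_score(normalized_expr, program):
    """Average normalized expression of the program genes in one nucleus.

    Genes absent from the nucleus profile are counted as zero expression.
    """
    return sum(normalized_expr.get(g, 0.0) for g in program) / len(program)


# Toy DE table and one toy nucleus (values are made up).
de_table = {
    "GeneA": (1.2, 0.001),
    "GeneB": (0.8, 0.2),     # fails the FDR cut
    "GeneC": (-0.9, 0.005),
    "GeneD": (0.3, 0.0001),  # fails the LFC cut
    "GeneE": (0.7, 0.002),
}
up, down = split_programs(de_table)

nucleus = {"GeneA": 2.0, "GeneE": 1.0, "GeneC": 0.5}
up_score = program_score(nucleus, up)
down_score = program_score(nucleus, down)
```

Scores computed per nucleus in this way can then be compared between perturbed and SH control nuclei within each neuron type.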
Dgcr8 encodes a component of the microprocessor complex involved in processing primary microRNA (miRNA) transcripts (pri-miRNAs) into precursor miRNAs (pre-miRNAs), which are ultimately further processed by Dicer into mature miRNAs, and has been extensively studied in the context of 22q11.2 DS. While we found that no biological pathways were disrupted in the Dgcr8 down-regulated genetic program, in the up-regulated genetic program we identified a disruption in genes related to miRNA-mediated RNA silencing (Fig. 3c), which included several long noncoding RNAs (lncRNA) such as Mirg and Spaca6. These lncRNAs encode pri-miRNAs and their up-regulation was previously reported in mouse models of Dgcr8 haploinsufficiency and 22q11.2 DS. In addition to what was previously described, we identified novel up-regulation of the pri-miRNAs Mir181a-1hg, Mir9-3hg, and Mir124a-1hg, whose miRNA products have been associated with cortical development and neuron physiology. The accumulation of these pri-miRNAs implies that there is less mature miRNA being produced. As mature miRNAs negatively regulate gene expression, we would expect a concordant increase in the expression of genes targeted by the disrupted miRNAs. To test this hypothesis, we applied miRNA-target enrichment analysis (Licursi, V. et al., BMC Bioinformatics 20, 1-10 (2019)) to up-regulated genes in all perturbations and cell types. While no enrichment was found for targets of miR-9 and miR-124a, we observed a strong accumulation of miR-181a targets among up-regulated genes in Dgcr8-perturbed cells (FDR < 0.1). This enrichment was not observed in Dgcr14- or Gnb1l-perturbed neurons, indicating that the miRNA-associated phenotype is unique to Dgcr8.
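The logic of a target enrichment test can be illustrated with a generic hypergeometric over-representation test. This is an assumption made for illustration and is not necessarily the statistic implemented by the cited tool; the gene counts are toy numbers.

```python
from math import comb


def hypergeom_sf(k, population, successes, draws):
    """P(X >= k) when drawing `draws` items without replacement from a
    population containing `successes` items of interest."""
    upper = min(successes, draws)
    total = comb(population, draws)
    favourable = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return favourable / total


# Toy setting: 20 expressed genes, 5 of them predicted targets of a miRNA,
# 4 up-regulated genes, 3 of which are predicted targets.
p_value = hypergeom_sf(k=3, population=20, successes=5, draws=4)
```

A small probability indicates that predicted targets are over-represented among the up-regulated genes relative to chance; across many miRNAs the resulting p-values would be corrected for multiple testing to obtain FDR values.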
Dgcr14 encodes the nuclear protein DGCR14, a component of C complex spliceosomes. Gene ontology analysis of the down-regulated genetic program revealed the presence of genes connected with RNA binding (Fig. 3c). We found a specific enrichment for genes associated with regulation of RNA splicing and the spliceosome, supporting the involvement of Dgcr14 in RNA maturation processes. Among splicing-related genes, we found constituents of the serine and arginine protein family (Srrm2, Srsf1, Srsf2, Srsf5, Srsf6, and Srsf11) that are essential for spliceosome assembly as well as constituents of the heterogeneous nuclear ribonucleoprotein family (Hnrnph3, Hnrnpm, and Hnrnpu) that are involved in pre-mRNA processing, mRNA transport, and metabolism. Analysis of the up-regulated genetic program showed the disruption of chromatin binding and organization (Fig. 3c), which included dysregulation of genes from the chromodomain-helicase-DNA binding family (Chd1, Chd3, Chd6, and Chd8), topoisomerases (Top1 and Top2b), and Setd5, a gene that regulates chromatin structures to control RNA elongation and splicing. Many of these genes and their respective pathways are associated with neurodevelopmental disorders.
Gnb1l encodes a protein of unknown function that contains six WD40 repeats, which facilitate protein-protein interactions and the formation of multiprotein complexes. The down-regulated genetic program was enriched for genes involved in neuronal development, synaptic organization and function, and chemical transmission (Fig. 3c), including genes that encode glutamatergic receptor subunits (Gria1, Gria4, Grik3, Grin2a, and Grin2b), genes involved in regulation of a prepulse inhibition phenotype (Ctnna2 and Nrxn1), and genes involved in regulation of action potential (Ank3, Cnr1, Fgf13, Foxp1, and Trpc4). In the up-regulated genetic program, we found genes related to peptide translation and metabolic processes. Overall, this suggests that perturbation of Gnb1l results in impaired neuronal communication that is distinct from the phenotypes shown for other 22q11.2-linked genes.
These results show that AAV-Perturb-seq both confirms prior published work and provides new insights into the phenotypic landscape underlying 22q11.2 genes. For example, our data reveal new pri-miRNA targets of Dgcr8, a disrupted balance between RNA transcription and splicing resulting from perturbation of Dgcr14, and broad dysfunction in neuronal communication linked to the synapse and glutamate signaling in Gnb1l-perturbed cells. Taken together, these 22q11.2 genes play an active role in mature neurons in the mouse brain, which may also contribute to 22q11.2 DS symptomatology.
Example 8: Single-nucleus prefrontal cortex atlas of a 22q11.2 DS mouse model
Our pooled direct in vivo perturbation screen revealed several mechanistic connections between 22q11.2 locus genes and biological processes in the adult mouse brain. We next set out to characterize the relationship between those findings and the transcriptional phenotypes observed in prefrontal cortex cells from a 22q11.2 DS mouse model. We chose the LgDel model, which carries a deletion of 25 genes, analogous to the 1.5 Mb minimal critical deletion observed in 22q11.2 DS patients, and exhibits behavioral phenotypes reminiscent of ASD and schizophrenia. Importantly, in the LgDel mice, the genetic defect is present in the germline and affects all cells from the onset of embryogenesis, allowing us to address the function of each gene in mature neurons as well as which fraction of the transcriptional phenotype could be explained by single gene perturbations in adult animals.
We first generated a prefrontal cortex cell atlas from LgDel+/- (LgDel) and LgDel+/+ (WT) adult male mice and dissected cell type-specific phenotypes induced by the deletion (Fig. 4a). After recovering ~30,000 high-quality single nuclei using 3 animals per genotype, we observed similar average numbers of UMIs and genes for both conditions. Unbiased clustering and UMAP embedding of nuclei profiles from WT and LgDel samples revealed the presence of superficial and deep layer excitatory neurons (layers 2/3, 5, and 6), inhibitory neurons (Interneurons CGE and MGE), and non-neuronal brain cells, mainly oligodendrocytes and astrocytes (Fig. 4b). Clustering was unaffected by individual samples or experimental condition. We did not detect significant differences in the proportion of cell populations between LgDel and WT, indicating that the deletion does not alter the gross cellular landscape of the
adult brain. Hierarchical clustering of the bulk transcriptional profiles revealed a primary clustering driven by cell type followed by a second level clustering by genotype.
We applied our pseudobulk DE analysis pipeline to identify genes dysregulated in LgDel. First, focusing on genes previously targeted in our pooled screen, we found that only those within the 1.5 Mb deleted locus exhibited a significant negative LFC, while adjacent genes had minimal expression changes (Fig. 4c). This observation was consistent across cell types and confirms that locus heterozygosity leads to approximately 50% reduction in the expression of affected genes. Next, we explored 22q11.2 deletion-mediated transcriptional changes in all cell types and asked whether there are cell type-specific signals or commonly dysregulated processes. Our DE analysis revealed DEGs (abs(LFC) > 0.5; FDR < 0.01) in all cell types. Excitatory neurons presented the highest number of dysregulated genes, with superficial and deep layer neurons showing 168 and 138 DEGs, respectively, while interneurons showed a substantially lower number of DEGs (23 genes). We asked whether the lower number of DEGs in interneurons meant that these cells are perturbed differently by the deletion or are affected to a reduced extent. Correlation analysis using calculated LFC values for each cell type highlighted a strong similarity between neuron types, indicating that the deletion leads to similar signatures in all neuron types, but with a smaller amplitude in interneurons.
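As a short worked example of the dosage argument above: if expression scales linearly with gene copy number and LFC is expressed in log2 units (both are assumptions; the text does not state the base of the logarithm), losing one of two copies gives an expected LFC of -1, which lies beyond the abs(LFC) > 0.5 cutoff. The function name below is hypothetical.

```python
import math


def expected_lfc(remaining_copies, normal_copies=2):
    """Expected log2 fold change if expression is proportional to copy number."""
    if remaining_copies <= 0 or normal_copies <= 0:
        raise ValueError("copy numbers must be positive")
    return math.log2(remaining_copies / normal_copies)


# Heterozygous deletion: one remaining copy out of two.
heterozygous_lfc = expected_lfc(1)
```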
To further support these findings and avoid a potential bias introduced by arbitrary DEG thresholds, we applied gene set enrichment analysis (GSEA) to identify biological processes dysregulated by the deletion. This analysis confirmed the similarity between the phenotypes in neuron types, with a strong overlap in the identified GO terms (Fig. 4d). While we identified an average of 68 DEGs in non-neuronal cells, we found substantially fewer GO terms significantly affected by the deletion in these cells. Among the down-regulated biological processes altered by the deletion in neurons, we found terms related to neuronal communication (ion transmembrane transport and regulation of synaptic plasticity) and neuronal development (neurogenesis, cell projection organization, and axonogenesis). These findings from adult mouse neurons also recapitulate recent findings from human cerebral spheroids derived from 22q11.2 patient cells. These results also reiterate many of the biological functions identified following perturbation of individual genes (Fig. 3c), suggesting that the 22q11.2 DS phenotype in neurons may arise due to both dysfunctional development and the additive effects of reduced expression from a specific subset of single genes.
Example 9: Transcriptional changes found in LgDel model neurons are partially explained by perturbation of Dgcr8, Dgcr14, and Gnb1l
We set out to quantify the extent to which individual perturbations explain the transcriptional signature observed in LgDel neurons. We started by considering the top DEGs for each perturbation and asked how their LFC values correlated across models. We chose a defined number of top genes, rather than all DEGs, to avoid biases introduced by arbitrary DEG thresholds and differential gene set sizes when calculating similarities. For all neuron types, we found that transcriptional changes mediated by Dgcr8 and Gnb1l perturbation have a high cosine similarity to the LgDel model (Dgcr8 = 0.2 and Gnb1l = 0.26), while the same is not observed for Dgcr14 (Dgcr14 = -0.11) (Fig. 4e).
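The comparison described above can be sketched as a cosine similarity between two LFC vectors restricted to a fixed number of top genes. In this illustration the ranking of top genes by absolute LFC in the perturbation, the dictionary layout, and the gene names are assumptions; the text does not specify how the top genes were ranked. The numbers are toy values.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for an all-zero vector
    return dot / (norm_a * norm_b)


def top_gene_similarity(perturbation_lfc, model_lfc, n_top):
    """Cosine similarity over the n_top genes with the largest |LFC|
    in the perturbation (ranking rule is an assumption)."""
    shared = [g for g in perturbation_lfc if g in model_lfc]
    top = sorted(shared, key=lambda g: abs(perturbation_lfc[g]),
                 reverse=True)[:n_top]
    return cosine_similarity([perturbation_lfc[g] for g in top],
                             [model_lfc[g] for g in top])


# Toy values with hypothetical gene names.
toy_perturbation = {"g1": 2.0, "g2": -1.0, "g3": 0.1}
toy_model = {"g1": 1.0, "g2": -0.5, "g3": -3.0}
similarity = top_gene_similarity(toy_perturbation, toy_model, n_top=2)
```

Using a fixed `n_top` for every perturbation keeps the vectors the same length across comparisons, which mirrors the stated motivation of avoiding differential gene set sizes.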
Next, we used a linear regression model (Norman, T. M. et al. Science 365, 786-793 (2019)) to assess the extent to which the LFC observations in LgDel are explained by individual perturbations. The model tries to decompose the signature induced by 22q11.2 deletion in terms of each perturbation alone. As a result, the model coefficients highlight how much the signal of each single perturbation is concordant with the deletion. For all neurons, we observed that Dgcr8 and Gnb1l perturbations have larger coefficients (cDgcr8 = 0.21; cGnb1l = 0.18), while Dgcr14 showed the smallest contribution (cDgcr14 = -0.11) (Fig. 4f). The linear model was capable of predicting ~40% (dcor = 0.40) of the variance observed in the LgDel. Of the transcriptional changes correctly predicted by individual perturbations (Fig. 4g-h), the Dgcr8 contribution was focused on up-regulated genes mostly associated with the accumulation of miRNA primary genes (Mirg, Spaca6, Mir9-3hg, and Mir181a-1hg). The smaller Dgcr14 contribution included down-regulated spliceosomal genes Srsf1, Srsf2, and Srsf6, while the Gnb1l contribution was primarily related to down-regulation of genes involved with synapse signaling (Fig. 5). Together, these results indicate that perturbation of just 3 genes of the 22q11.2 locus directly in the adult brain prefrontal cortex explains 40% of the transcriptional changes observed in the LgDel model. This indicates that the 22q11.2 DS phenotype observed in adult neurons may be partially explained by active disruption of cellular processes, and is not exclusively a result of developmental defects.
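The decomposition described above can be illustrated with a small least-squares fit in which the LgDel LFC of each gene is modeled as a weighted sum of the LFCs measured for each single perturbation. This is only a sketch under stated assumptions: it uses ordinary least squares without an intercept, solved through the normal equations, whereas the text does not specify the estimator, and it does not compute the dcor statistic. The function name and the toy data are hypothetical.

```python
def fit_coefficients(X, y):
    """Least-squares coefficients c minimizing ||X c - y||^2 (no intercept).

    X: one row per gene, one column per perturbation (LFC values).
    y: LFC of the same genes in the deletion model.
    """
    k = len(X[0])
    # Normal equations: (X^T X) c = X^T y
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         for i in range(k)]
    b = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    M = [A[i] + [b[i]] for i in range(k)]  # augmented matrix
    # Gaussian elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("perturbation signatures are linearly dependent")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, k):
            factor = M[r][col] / M[col][col]
            for c in range(col, k + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution
    coef = [0.0] * k
    for i in reversed(range(k)):
        rest = sum(M[i][j] * coef[j] for j in range(i + 1, k))
        coef[i] = (M[i][k] - rest) / M[i][i]
    return coef


# Toy data: four genes, three perturbations; y is built from known weights.
toy_X = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 1]]
toy_y = [0.2, -0.1, 0.3, 0.4]
toy_coef = fit_coefficients(toy_X, toy_y)
```

A positive coefficient means the perturbation's signature points in the same direction as the deletion signature, and a negative one means it points the opposite way, which is how the reported coefficients are read in the text.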
Example 10: LgDel and individual gene perturbations share disease-associated risk alleles
Patients diagnosed with 22q11.2 DS typically present brain functional and behavioral deficits that associate with ASD, schizophrenia, and other neurodevelopmental and psychiatric disorders. Given these connections, we next asked whether the transcriptional signatures observed for individual perturbations and detected in LgDel neurons may increase risk for those conditions through dysregulation of disease susceptibility genes (Fig. 5). To answer this question, we analyzed the intersection between our curated list of genes commonly dysregulated in LgDel and in individual perturbations with genes previously associated with ASD, schizophrenia, attention deficit hyperactivity disorder, and bipolar disorder from the DisGeNET dataset (Pinero, J. et al., Nucleic Acids Res. 48, D845-D855 (2020)). We found the strongest overlap between genes down-regulated by Gnb1l perturbation and the schizophrenia list (FDR < 0.001, hypergeometric test), with gene ontology analysis of those overlapping genes highlighting a strong presence of synaptic signaling genes (22 out of 44 genes). These results are in line with recent studies indicating that ASD- and schizophrenia-associated proteins are strongly concentrated in pre- and post-synaptic locations and are involved in functions related to synaptic organization, differentiation, and transmission. Overall, our results indicate that Dgcr8, Dgcr14, and Gnb1l may contribute to 22q11.2 DS through broadly altering the expression of disease susceptibility genes in vivo, emerging after development and through a mechanism that involves RNA regulation in mature neurons.
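The overlap test mentioned above can be sketched as a one-sided hypergeometric test: given a background of expressed genes, a disease gene list, and a set of dysregulated genes, it returns the probability of seeing at least the observed overlap by chance. The function name and the toy numbers are hypothetical, the choice of background is an assumption, and the multiple-testing (FDR) correction applied in the text is not shown.

```python
from math import comb


def hypergeom_upper_tail(overlap, population, successes, draws):
    """P(X >= overlap) when `draws` genes are sampled without replacement
    from `population` genes, of which `successes` are on the disease list."""
    if not (0 <= successes <= population and 0 <= draws <= population):
        raise ValueError("inconsistent set sizes")
    upper = min(successes, draws)
    total = comb(population, draws)
    tail = sum(comb(successes, k) * comb(population - successes, draws - k)
               for k in range(max(overlap, 0), upper + 1))
    return tail / total


# Toy example: 10 background genes, 4 disease genes, 3 dysregulated genes,
# 2 of which are disease genes.
toy_p = hypergeom_upper_tail(overlap=2, population=10, successes=4, draws=3)
```

Exact integer binomials from `math.comb` are used so that the tail probability does not depend on an external statistics package.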
Example 11: Astrocyte-specific screen
An in vivo cell type-specific screen could in principle be achieved using cell type-specific delivery or expression as well as through physical enrichment of the cell type of interest. We chose to use a cell type-specific promoter that should lead to higher levels of expression of eGFP-KASH in astrocytes, thus allowing for enrichment of infected astrocyte nuclei using FACS. In a new set of experiments, we swapped the CBh promoter in the AAV genome for GfaABC1D, a promoter that was documented to have high levels of expression in astrocytes (Figure 7a). We cloned our original gRNA library used in the unbiased screen (CBh screen), produced AAV.PHP.B and followed AAV-Perturb-seq’s experimental pipeline. Single-nucleus sequencing of ~35,000 infected cells revealed the presence of astrocytes as well as other brain cells (Figure 7b). Although we expected more of an enrichment in astrocytes based on prior published findings describing the GfaABC1D promoter, the presence of other cell types is not abnormal when using such promoters. For example, it was shown previously that GfaABC1D can also be expressed in neurons. Nevertheless, when looking at cell type composition in the datasets, we observed that 17% of all cells are astrocytes in the GfaABC1D experiment, as opposed to 6% in the CBh screen (Figure 7c), resulting in a sufficient number of nuclei to proceed with an astrocyte-focused AAV-Perturb-seq analysis.
Directing our attention to astrocytes, we detected a gRNA in 80% of nuclei, with ~45% expressing a unique gRNA (Figure 7d). This was comparable to our previous results for the CBh screen shown in Extended Data Figure 4a. Analysis of cells with MOI 1 (without filtering unperturbed cells) indicated the presence of a transcriptional phenotype in astrocytes perturbed in Dgcr8 (3 DEGs) and Dgcr14 (16 DEGs) (Figure 7e). These findings confirm our original result indicating that the perturbed genes do not elicit an overtly strong transcriptional response in astrocytes.
Although the number of DEGs for Dgcr8-perturbed astrocytes did not meet our standards for LDA filtering and signal enrichment, and therefore we could not stratify perturbed and non-perturbed cells, we reasoned that we could proceed with unfiltered data. As noted previously, the transcriptional signature before and after filtering tends to correlate, making this approach sensible with the caveat of increased noise due to the inclusion of unperturbed cells. With this approach, we investigated whether the transcriptional phenotype observed in astrocytes was similar to our previous results in neurons, and therefore calculated the correlation between astrocytes and neuron types perturbed in Dgcr8 and Dgcr14. We demonstrate that astrocytes have a perturbation-specific transcriptional signal that is comparable with our previous observations in neuron types. Support for this comes from high perturbation-specific correlation and overlap of DEGs between cell-type pseudobulk profiles. Overall, the combination of the results from the unbiased CBh screen and the astrocyte-specific GfaABC1D screen suggests that mutations in Dgcr8 and Dgcr14 lead to perturbation-specific transcriptional phenotypes that are shared across brain cells beyond neuron types.
Example 12: NASH
The goal was to implement our platform in a NASH model and identify genetic targets with therapeutic potential. NASH is characterized by an excessive accumulation of fat in hepatocytes. At least 20% of patients progress to severe liver disease, in which the excessive fat causes cell damage and initiates a cascade of inflammatory events, mediated by Kupffer cells and hepatic stellate cells (HSC), that lead to tissue fibrosis and scarring. If not addressed, this progression can culminate in cirrhosis and liver failure, which are major risk factors for the most common liver cancer, hepatocellular carcinoma. The involvement of various cell types throughout the disease's progression explains the complexity of finding effective treatments. Such complexity of cellular interactions is impossible to decipher in vitro. By permitting high-throughput screening directly in vivo, AAV-Perturb-seq is particularly well-positioned to identify therapeutic targets for the disease.
First, we worked on implementing a NASH mouse model. There are several types of murine models of NASH, ranging from genetic to chemical to dietary. We chose to utilize a dietary model based on a high-fat diet (HFD). This model has been reported to better mimic the pathological features observed in human patients.
To demonstrate the ability of AAV-Perturb-seq to identify genetic targets of interest in the HFD model, we constructed a CRISPR-Cas9 gRNA library targeting genes previously associated with NASH. The library was packaged into AAV.rh10 (an AAV capsid with high tropism to liver cells) and intraperitoneally injected into 12-week-old animals fed a standard diet. Three weeks post-injection, animals were transferred to an HFD for five weeks. After this period, hepatocytes were isolated from liver tissues and separated based on the extent of lipid accumulation, which represents a hallmark of NASH (Figure 8a). Targets enriched in the high lipid content fraction are involved with an increased accumulation of fat molecules inside cells and contribute to disease progression (Figure 8b). Conversely, perturbation of targets enriched in the low lipid content fraction leads to reduced accumulation of fat in hepatocytes (Figure 8c). These targets have a protective effect and have the potential to be further explored as therapeutic targets for NASH.
Our results prove that AAV-Perturb-seq can prioritize interventions with therapeutic potential and set the stage for the identification of novel targets for the treatment of human diseases.
Example 13: Alzheimer’s disease
Part of our platform implementation plan involves utilizing AAV-Perturb-seq to discover therapeutic targets for Alzheimer’s disease (AD). Recent studies have revealed the critical role of microglia in AD pathogenesis. Microglia react to the accumulation of Aβ and tau proteins in the brain (processes thought to have a significant role in AD pathogenesis), which induce microglia state changes associated with inflammation, phagocytosis, and neurodegeneration. AAV-Perturb-seq offers, for the first time, the possibility of identifying genetic targets able to modulate microglia states and, consequently, disease progression.
Previously, we utilized an AAV capsid (AAV.PHP.B) specifically developed to target neuronal cells in the mouse brain. In our hands, this AAV serotype proved to infect all major brain cell types but to different extents (Figure 9a). Ideally, one would want to recover approximately the same number of cells per perturbation for each cell type to maintain a good representation of each perturbation in all cell types. Thus, cell type-specific delivery of gRNA libraries by using cell type-specific AAV capsids will permit better control over the number of cells per perturbation.
Given that microglia are the least abundant cell type in an AAV-Perturb-seq experiment (Figure 9a), we set out to identify microglia-specific AAV capsids. Recently, newly evolved AAV capsids were reported that could be used to deliver our gRNA libraries directly and specifically to microglia upon tail vein injection of viral particles. To test these capsids, we produced AAV particles with four evolved capsids (M1 - M4) and included one neuron-specific capsid (AAV.PHP.eB) as a control. These viruses carried a GFP transgene to report successful infections. Three weeks after the injection of 10^12 viral particles into LSL-Cas9 mice, microglia were isolated from brain tissue and subjected to flow cytometry to evaluate the presence of GFP-positive (GFP+), and thus infected, microglia (Figure 9b).
Unexpectedly, none of the AAV capsids tested proved to be able to infect microglia, as illustrated by the absence of events in the GFP+ group (Figure 9c). We then pivoted our attention to two new AAV capsids that were developed to infect microglia upon stereotaxic injection (Figure 10a). Three weeks after the injection of AAV.MG1.1 and AAV.MG1.2 into the somatosensory cortex of LSL-Cas9 mice, we observed 30.2% and 56.9% infected microglia, respectively (Figure 10b). These results prove our ability to infect microglia and create the conditions to implement AAV-Perturb-seq screens in AD models.
Example 14: Sequences
Example 15: Discussion
We describe AAV-Perturb-seq, a direct in vivo single-cell CRISPR screening method that is tunable, scalable, and broadly applicable for systematically interrogating genetic elements in vivo in high throughput. We demonstrate the potential of AAV-Perturb-seq in the brain using a single systemic injection of AAV.PHP.B encoding a library of CRISPR gRNAs targeting genes linked to 22q11.2 DS. Using gene editing in LSL-Cas9 mice or transcriptional inhibition in dCas9-KRAB mice along with either constitutive or cell type-specific promoters, this enables flexible perturbation of numerous disease-associated genes in all or specific brain cell types, respectively. AAV-Perturb-seq offers for the first time
the opportunity to directly interrogate multiple genes in several cell types at a single-cell level in the same animal without restriction to tissue or developmental time points, opening immense further possibilities for studying processes of health and disease in vivo.
We applied AAV-Perturb-seq to interrogate the genotype-phenotype landscape underpinning 22q11.2 DS. Unlike other deletion syndromes where the observed phenotype can be explained by single genes, none of the genes within the 22q11.2 locus can largely explain the predisposition it confers for neurodevelopmental and psychiatric disorders. Additionally, the function of each gene in mature brain cells is poorly understood and has never been systematically investigated. We therefore perturbed all 29 of the 22q11.2 DS-linked genes expressed in the adult mouse brain and found significant transcriptional phenotypes associated with Dgcr8, Dgcr14, Gnb1l, and Ufd1l. In addition to providing in vivo evidence to support previous findings, such as Ufd1l cellular essentiality and Dgcr8 mediating pri-miRNA processing, we also discovered previously unreported connections between Dgcr14 and Gnb1l and adult neuron physiology relevant to 22q11.2 DS pathology. In Dgcr14-perturbed neurons, we identified the dysregulation of numerous genes involved in splicing previously associated with ASD and schizophrenia (e.g., Srrm2, Srsf5, Srsf11, Snrnp70, Fus), which suggests that splicing defects mediated by heterozygous loss of Dgcr14 may contribute to the emergence of those disorders in 22q11.2 DS patients. Gnb1l-perturbed neurons displayed altered gene expression related to synaptic signaling, strongly suggesting that heterozygous loss of Gnb1l may result in impaired neuronal communication throughout development and contribute to the emergence of alterations in neuronal functioning. This hypothesis is further supported by the observation of reduced expression levels of Gnb1l in postmortem brain samples of schizophrenia patients with and without 22q11.2 deletion and by the deficits in synaptic signaling and behavior related to schizophrenia and ASD found in Gnb1l+/- mouse models.
Approximately 40% of the transcriptional changes observed in LgDel model mouse neurons could be recapitulated by the perturbation of three genes (individually) in adult animals. We suspect that the remaining transcriptional phenotype may be due to disruptions during development and/or genetic interactions among LgDel genes or their downstream networks, as well as distinct non-cell autonomous interactions between the mosaic setting of the AAV-Perturb-seq experiments and the germline setting of the LgDel model, all of which represent promising areas for further study. Overall, our findings suggest that the 22q11.2 DS transcriptional phenotype found in mature neurons may to some extent be due to the continuous reduction in gene expression of a specific subset of 22q11.2 genes following development. A promising area for further study is determining whether 22q11.2 DS-associated neuronal and cognitive phenotypes can be rescued exclusively through restoring Dgcr8, Dgcr14, Gnb1l, and/or Ufd1l expression during or after development.
We envision that AAV-Perturb-seq will broadly enable the interrogation of genotype-phenotype landscapes directly in vivo in different tissues, cell types, developmental stages, and under different health and disease contexts. The ability to interrogate complex in vivo biology at scale could lead to breakthroughs in our causal understanding of biological and disease mechanisms as well as our capacity to identify genetic interventions and targets for treating disease.
Advantages of AAV compared to lentivirus for in vivo delivery
In contemporary research, AAV is by far the preferred modality for in vivo delivery, for several reasons. These advantages, as well as disadvantages, are thoroughly covered by many recent reviews on in vivo delivery (Asokan, A. et al., Molecular Therapy 20, 699-708 (2012); Mingozzi, F. et al., Nat Rev Genet 12, 341-355 (2011)). Below, we highlight the main advantages and disadvantages relevant for single-cell CRISPR screening.
The main advantages of AAV over LV are as follows:
1) Unlike LV, AAVs can be injected systemically and infect seemingly any organ and cell type in a tunable way. If LV is injected systemically, it is only capable of infecting a few cells in the liver. If LV is injected within a compartment (e.g., via intraperitoneal, intrathecal, or intracerebroventricular injection) it mostly infects cells along the barrier without penetrating deeply into tissue. Thus, in vivo delivery of LV almost always requires direct injection into the organ of interest.
When injected systemically, AAVs can infect almost any tissue or cell type in a tunable way. Unlike LV, which is difficult to modify and target to specific cell types, AAV is very easy to modify and preferentially target to specific cell types thanks to natural serotypes and engineered/evolved capsid variants. If natural AAV serotypes are injected systemically, they show unique (i.e., tunable) biodistributions. AAV capsid proteins can further be engineered or evolved to enable the preferential targeting of (new) cell types of interest or passing of physiological barriers such as the blood brain barrier. In our study, we leveraged the evolved AAV capsid PHP.B, which enabled us to achieve brain-wide infection with a simple systemic (tail vein) injection.
2) Unlike LV, AAVs do not generally require direct injection into tissue, thus avoiding the need for surgical procedures and complicated ethics approvals. Direct injection into tissue often requires a surgical procedure, which in the case of brain delivery is a craniotomy. The drawbacks of this are plentiful. Surgery, compared to systemic delivery, is labor intensive, requires expert knowledge, is low throughput, leads to increased mortality in experimental mice, requires increased post-surgical monitoring, and involves a more elaborate ethics approval process (due to increased severity and stress to the animals).
3) Unlike LV, AAVs do not generally require direct injection into tissue, thus avoiding tissue damage and confounding alterations to cell states. Direct injection into tissue causes damage, resulting in altered cell states. For example, along the injection track created during cranial/brain injections, it is extremely common to find reactive astrocytes and activated microglia, which can confound phenotypes of interest especially when using single cell methods.
4) Unlike LV, AAV sparsely infects a large number of cells across a tissue. With direct injection of a virus into tissue, it is extremely difficult if not impossible to infect a large number of cells while controlling the MOI. Systemically injecting AAV made it simple for us to titrate the amount of virus to optimize the number of infected cells with a single infection.
5) Transgene expression from AAV is known to be higher compared to LV, likely leading to higher rates of gRNA detection. While we did not test AAV and LV head-to-head for gRNA capture, it is well known that AAV is superior to LV in terms of transgene expression. We speculate that
this contributes to our gRNA capture/calling rates being higher than what has been described previously with LV.
6) LV has a narrow range of utility. Beyond what has already been discussed above, the only case that we are aware of where delivering LV in vivo is commonly done and useful is in the context of development, where LV is injected before or shortly after birth. When LV is injected in utero or postnatally in the ventricle of the brain, it is possible to achieve brain-wide infection. Intracerebroventricular injection into neonates can lead to brain-wide infection due to the fact that the blood brain barrier is immature. In utero injection into the brain of embryos within a pregnant mother can also lead to brain-wide infection due to radial glial progenitor cells of the ventricle wall being readily infected and differentiating to give rise to transgene-expressing daughter cells throughout the brain. However, the concern with such an approach is that large gRNA libraries could be bottlenecked due to only a small number of cells being amenable to infection. This outcome may not be apparent when examining the number of infected cells at the endpoint because these could represent daughters of the originally infected progenitor cells. Outside of these developmental settings, it is not possible to achieve brain-wide infection with lentivirus. Taken together, while there is some utility of LV, we think it is fair to say that the diversity of options is severely limited.
The main disadvantages of AAV compared to LV are as follows:
1) While the packaging capacity of AAV (~4.7kb) is smaller than that of LV (~10kb), dual/split vector approaches offer efficient mitigation strategies. The main limitation of AAV compared to LV is packaging capacity. However, this can be overcome by using dual/split vector systems (Lai, Y. et al., Nat Biotechnol 23, 1435-1439 (2005)). The concept here is that payloads can be split over multiple viruses, achieved either by splitting independent elements (e.g., gRNA from Cas9) or using strategies that stitch (via split proteins, trans-splicing, split-inteins, etc.) the biomolecules (e.g., Cas proteins) back together again. These methods are widely used in the context of gene editing and other fields (Yang, Y. et al., Nat Biotechnol 34, 334-338 (2016); Koblan, L. W. et al., Nature 589, 608-614 (2021)). While this is not necessary in our experimental setup where we are using Cas9 transgenic animals, it should be straightforward to combine our approach with split vectors to enable AAV-Perturb-seq in non-Cas9 transgenic animals.
Taken together, AAV is vastly superior to LV for in vivo delivery and AAV-Perturb-seq will therefore open new avenues for single-cell CRISPR screening in vivo.
5’ and 3’ gRNA molecules capture
We created AAV transfer plasmids to test both 5’ and 3’ gRNA capture methods. In the 3’ construct (pAS006), the gRNA sequence is present within an mRNA transcript and can be captured by conventional single-cell RNA-seq (scRNA-seq) 3’ capture methods. The second strategy (pAS088) was designed to enable direct capture of the gRNA, wherein we designed an AAV transfer plasmid with independent gRNA and mRNA expression cassettes. Here, the gRNA sequence can be directly captured via scRNA-seq 5’ capture methods, e.g., as shown by Replogle et al. (Replogle, J. M. et al., Nat Biotechnol 38, 954-961 (2020)). Our results indicate a higher capture efficiency
when using 5’ capture (combination of pAS088 with 5’ capture single-cell library preparation). We believe this comes from two factors: 1) the hU6 pol-III promoter leads to high RNA expression when compared with pol-II promoters, and consequently more RNA molecules are available to be captured during single-nucleus library preparation; and 2) our 3’ capture approach construct (pAS006) expresses pol-II transcripts containing a WPRE sequence that mediates mRNA nuclear export, leading to reduced numbers of mRNA molecules containing the gRNA in the nucleus available for capture (this should not occur with our 5’ capture approach construct (pAS088), which uses the pol-III U6 promoter to express gRNAs that should not be exported from the nucleus).
gRNA filtering strategy
Our strategy to assign gRNAs to cells considers three important details of single-cell data generation and introduces a filtering step to minimize the risk associated with each of them. No filter alone can perfectly separate true from false signal, but collectively they result in a robust enrichment of signal over noise:
1) Reads per UMI coverage filter to correct for chimeric molecules. As first reported by Dixit et al (Dixit, A. et al., Cell 167, 1853 (2016)), chimeric sequences are created during PCR amplification of gRNA molecules. Such events may create molecules composed of a combination of cell barcode, UMI and gRNA that was not originally present in the cDNA library. Therefore, chimeric sequences could only emerge in our dataset from template swapping during PCR amplification, which we mitigate computationally. As these molecules appear later in the single-cell library preparation protocol, they are less abundant and generate fewer reads. To remove chimeric molecules, we calculate coverage, namely the number of reads per UMI (READ_count / UMI_count), and remove molecules with low coverage by applying a threshold based on the coverage distribution. The coverage follows a bimodal distribution, which supports the removal of molecules with coverage lower than 60 (fewer than 60 reads per molecule). It should be noted that coverage depends on sequencing depth and that the threshold should be re-calculated for every new sequencing dataset (not shown).
2) Total UMI count filter to correct for chimeric molecules and ambient RNA. For all gRNAs, we found a high number of events with only one UMI count within and across many cells. Such a phenomenon would not be expected given the low multiplicity of infection at which we infect the brain (Figure 1b). We and others therefore accept that these events result from ambient RNA contamination or chimeric sequences created during library preparation. In such a scenario, a possible contaminating gRNA molecule is expected to have very few UMI counts, while an expressed gRNA is more likely to have many UMIs (as there are multiple RNA copies inside the nucleus). There is a disproportionately high number of molecules with 1 UMI count, which supports this filtering step (data not shown).
3) UMI proportion filter to correct for chimeric molecules and ambient RNA. The number of detected UMIs in a nucleus correlates directly with the number of reads. A nucleus with a higher number of UMIs is more likely to also have a higher number of UMIs coming from contaminating gRNA and mRNA. A gRNA associated with a small number of UMIs that represent a small proportion of the total gRNA-UMIs identified in a nucleus most likely represents a chimeric read or RNA cross contamination. Proportion-based filters have been used previously to address this issue. We incorporate this concept in our workflow but increased the stringency of the threshold compared to published work. For instance, in the in vivo Perturb-seq, gRNAs are filtered out from nuclei where the most abundant gRNA UMI count is 1.3x higher than the UMI count for the second most abundant gRNA. Such a threshold only considers information about the two most expressed gRNAs and selects a gRNA label based on the highest expressed gRNA, even if there is a second gRNA with high counts. For instance, one cell containing 10 and 14 UMIs for two different gRNAs would be labeled as only having one gRNA. In contrast, we remove events only from nuclei where the most abundant gRNA UMI count is at least 10x higher. This extra stringency gives the confidence that we are removing events from nuclei where there is a large difference in the number of UMIs per gRNA. For instance, one cell containing 10 and 14 UMIs for two different gRNAs would be labeled as having both gRNAs, whereas another cell containing 2 and 20 UMIs for two different gRNAs would be labeled as only having one gRNA. With gRNAs not having the same capture efficiency, we strongly believe that this extra stringency is important for confidently identifying cells with single infections/perturbations.
4) We can assess the impact of these filtering steps through modifying their thresholds and comparing how the outputs of the analysis change.
4.1) We applied different thresholds to the number of UMI- and proportion-based filters and compared gRNA-nuclei assignment and transcriptional signal. For each combination of the two filtering steps with different thresholds, we identified putative expressed gRNAs for each nucleus. Then, we assigned the gRNA label to nuclei expressing unique gRNAs (MOI = 1). All remaining nuclei were excluded from the analysis. We then calculated the pairwise label accuracy between filtering threshold combinations. We observed a high accuracy between thresholds, indicating that nuclei tend to be labelled with the same gRNA across threshold strategies.
4.2) As the accuracy was not 100%, we investigated which nuclei had different assignments. We repeated the analysis from point 4.1 keeping only nuclei that are labelled with a gRNA for both thresholds (i.e., removing nuclei that were not assigned to a gRNA). This analysis highlights whether thresholding changes the gRNA associated with a given nucleus or serves to remove nuclei with low-confidence assignments. We observed an accuracy of 100% between any pair of thresholds, which strongly suggests that thresholding removes nuclei with low-confidence assignments without changing gRNA assignment itself for high-confidence nuclei.
4.3) We next investigated how thresholding affects the transcriptional signal associated with a perturbation by performing DE analysis between SH control and Dgcr14-perturbed nuclei using each threshold. We observed that all thresholds lead to similar expression changes (LFC directionality) and mainly affect the LFC magnitude and consequently the statistical power (FDR) to call a gene differentially expressed.
These results indicate that our gRNA thresholding removes background noise from subsequent analyses (increasing our power to detect biological pathways) without affecting gRNA association to truly informative cells.
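The three filters described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the function name `assign_grnas`, the input layout (one record per cell barcode and gRNA), and the default `min_umis` value are hypothetical. The coverage cutoff of 60 reads per UMI and the 10x ratio are taken from the text, and the coverage cutoff would need to be re-derived for each dataset.

```python
from collections import defaultdict

def assign_grnas(records, min_coverage=60.0, min_umis=2, dominance_ratio=10.0):
    """Assign gRNAs to cells after three filters.

    records: iterable of (cell, grna, read_count, umi_count) tuples.
    min_umis is an assumed value; the text only notes that single-UMI
    events are disproportionately frequent.
    """
    per_cell = defaultdict(dict)
    for cell, grna, reads, umis in records:
        # Filter 1: reads-per-UMI coverage (READ_count / UMI_count).
        if umis == 0 or reads / umis < min_coverage:
            continue
        # Filter 2: total UMI count.
        if umis < min_umis:
            continue
        per_cell[cell][grna] = umis

    assignments = {}
    for cell, counts in per_cell.items():
        top = max(counts.values())
        # Filter 3: drop a gRNA when the most abundant gRNA in the same
        # cell has at least `dominance_ratio` times more UMIs.
        assignments[cell] = sorted(
            g for g, n in counts.items() if top < dominance_ratio * n
        )
    return assignments

if __name__ == "__main__":
    example = [
        ("A", "g1", 1000, 10), ("A", "g2", 1400, 14),  # 10 vs 14 UMIs
        ("B", "g1", 200, 2), ("B", "g2", 2000, 20),    # 2 vs 20 UMIs
    ]
    result = assign_grnas(example)
    print(result)
    # Cells carrying exactly one gRNA after filtering (MOI = 1).
    print({c: g[0] for c, g in result.items() if len(g) == 1})
```

With the two examples from the text, the cell with 10 and 14 UMIs keeps both gRNAs, while the cell with 2 and 20 UMIs keeps only the more abundant one.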
On analysis approaches for single-cell CRISPR screens
We believe that there are two main steps that should be carefully considered while analyzing single-cell CRISPR screening data: 1) filtering of gRNA-labelled cells; and 2) identifying gene expression changes mediated by individual perturbations. After exploring pipelines published previously, we designed an approach that we believe best reflects current knowledge in the field. We aimed to apply strict metrics and thresholds to increase confidence in our findings, with the tradeoff that we may miss tenuous signals that can be confused with noise. Below, we discuss the two critical steps and clarify differences between our approach and previous methods.
1) Filter non-perturbed but gRNA-labelled cells. On average, Cas9-gRNA complexes generate an indel in approximately 80% of infected cells (Fig. 2e), meaning that approximately 20% of cells were not gene edited. Furthermore, a fraction of the edited cells will contain approximately 33% of in-frame deletions that should not lead to loss of function. The transcriptome of these “non-perturbed” cells should remain unchanged and should be indistinguishable from the control group. Different methods have been applied to remove non-perturbed cells, and we now describe them and explain why we are confident our method is robust. Specifically, in one of the publications indicated by the reviewer (Adamson et al., Cell 167, 1867-1882.e21 (2016)), the authors did not perform any filtering to remove non-perturbed cells. In later publications from the same labs, they used CRISPR inhibition or activation to regulate the mRNA expression of target genes without the introduction of mutations. Such methods lead to direct down- or up-regulation of target transcripts, which can be used to remove cells that do not present the desired regulation. In our case, the introduction of indels into a target gene typically leads to loss-of-function mutations without necessarily affecting the gene’s mRNA level, and thus we cannot use an mRNA quantification-based method. As a side point, we now show that AAV-Perturb-seq is compatible with CRISPRi.
Other publications using Cas9 to induce mutations have explored the perturbation inference problem using linear models, cosine similarity, and Local Outlier Factor. While we have no reasons to criticize these methods, we think there are improvements that can be made when selecting the most informative genes for subsequent analysis. Ideally, one wants to use a subset of features large enough to distinguish perturbation from control, but not so large that it prompts overfitting by including more variables (genes) than observations (single cells). Across these publications, authors use a linear model, Kolmogorov-Smirnov test, and Student’s t-test with a cut-off at p-value < 0.05 to identify DEGs between control and all gRNA X-labelled cells, and later use this DEG subset as input for the filtering algorithms. Importantly, note that the authors use p-values and not adjusted p-values. We believe that this way of feature selection is influenced by two factors that reduce its power. First, given the large scale of single-cell data (i.e., the high number of cells in each group), tests looking at probability distributions, such as the Kolmogorov-Smirnov test, tend to find many genes with p < 0.05 that have no biologically significant change in expression, because high sample counts make small differences detectable. Second, scRNA-seq is affected by high dropout rates, especially among lowly expressed genes, which gives all detected genes a non-zero fold change value that reflects technical limitations rather than biological signal. By using a relaxed threshold (no threshold on fold change values and non-adjusted p-value of 0.05), these methods use features that may not be truly differentially expressed, thus introducing noise into the filtering process.
Considering this computational step of perturbation inference, the fundamental difference between us and other publications is that we only rely on confidently assigned DEGs by thresholding by FDR rather than p-value. Moreover, after we implemented our approach, a publication by Papalexi et al (Papalexi et al., Nature Genetics 53, 322-331 (2021)) corroborated our assumptions, whereby they demonstrate that MIMOSCA and MUSIC (which use p-value thresholding) fail to properly filter unperturbed control cells.
We believe that our approach, which is restricted to truly differentially expressed genes, is superior at realistically identifying perturbed cells.
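To illustrate the difference between thresholding on raw p-values and on FDR, the following sketch implements the Benjamini-Hochberg adjustment in plain Python. The choice of Benjamini-Hochberg is an assumption made for illustration (the text only states that FDR is used), and the function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest to the smallest p-value so that the adjusted
    # values stay monotonic in the rank.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

if __name__ == "__main__":
    raw = [0.01, 0.04, 0.03, 0.20]  # hypothetical per-gene p-values
    fdr = benjamini_hochberg(raw)
    print("genes with raw p < 0.05:", sum(p < 0.05 for p in raw))
    print("genes with FDR < 0.05:", sum(q < 0.05 for q in fdr))
```

In this toy example fewer genes pass the FDR cut-off than the raw p-value cut-off, which is the effect the feature selection relies on.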
2) Identify significant and robust gene expression changes. After removing the noise introduced by non-perturbed cells, one should focus on identifying transcriptional changes that explain the consequences of perturbations in the target gene and its biological relevance. Previous work has employed different techniques that can be broadly grouped into dimensional reduction methods (WGCNA, PCA, NMF) and single-cell specific differential expression analysis. For instance, Adamson et al, ibid used a combination of Kolmogorov-Smirnov (KS) test and random forest. We will comment on the KS test below. Regarding the random forest test, its main advantage is also its limitation. Specifically, random forest allows for the identification of the most important features (genes) used by the model to classify distinct perturbations. However, it assumes that all perturbations lead to distinct transcriptional phenotypes as it associates a specific set of informative genes to each perturbation. Tian et al (Tian, R. et al., Nat Neurosci 24, 1020-1034 (2021)), Jin et al (Jin, X. et al., Science 370, eaaz6063 (2020)), and others have used dimensional reduction methods to group the entire transcriptional space (around 5000 genes in single-cell experiments) into smaller groups, or components. These typically cluster genes with similar expression patterns. However, it is not obvious which genes are the most relevant ones inside the groups.
The most used approach to identify informative genes relies on methods commonly used to find DEGs from single-cell data, such as linear models (Frangieh, C. J. et al., Nature Genetics 53, 332-341 (2021)), logistic regression (Papalexi, E. et al., ibid), KS test (Adamson, B. et al., ibid), and Mann-Whitney U test (Replogle, J. M. et al., ibid). A recent publication by Squair et al (Squair, J. W. et al., Nature Communications 12, 1-15 (2021)) presents an exceptional and exhaustive work to compare the performance of different DE methods across multiple scRNA-seq datasets. The authors’ two most prominent findings are: 1) commonly used methods tend to have high false discovery rates and overestimate the number of DEGs and 2) the generation of artificial pseudobulks of cells from the same experimental group to use as input for bulk RNA-seq DE methods outperforms approaches that compare groups using individual cells (single-cell common methods mentioned above). Thus, we chose to implement a pseudobulk approach. We also evaluated a single-cell specific DE method (logistic regression). However, as reported before by others, this method has a bias towards highly expressed genes.
3) Pseudobulk performs better than single-cell specific methods in a true positive control. The LgDel model carries a heterozygous deletion in chromosome 16 (equivalent to the human 22q11.2 locus). We hypothesized that this can be used as a control to test DE methods, as we know that genes within the locus are being expressed from only one copy, and thus, in general, their expression should be reduced to approximately 50%. Our pseudobulk detects LFC values of approximately -1 (50% reduction) across deleted genes and cell types, while the logistic regression test typically applied to single-cell data presents smaller LFC values and high variance.
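The relation between a 50% reduction and an LFC of -1 can be made concrete with a small pseudobulk sketch. It only shows the aggregation and the fold-change arithmetic: the function names, the counts-per-million normalization, and the toy counts are assumptions for illustration, and the statistical testing performed by bulk RNA-seq DE methods is omitted.

```python
import math
from collections import defaultdict

def pseudobulk(cell_counts, groups):
    """Sum per-cell gene counts within each group.

    cell_counts: list of dicts (gene -> count), one dict per cell.
    groups: list of group labels, one per cell.
    """
    bulk = defaultdict(lambda: defaultdict(int))
    for counts, group in zip(cell_counts, groups):
        for gene, n in counts.items():
            bulk[group][gene] += n
    return {g: dict(genes) for g, genes in bulk.items()}

def log2_fold_change(bulk_test, bulk_ctrl, gene):
    """log2 ratio of counts-per-million between two pseudobulk profiles."""
    cpm_test = bulk_test.get(gene, 0) * 1e6 / sum(bulk_test.values())
    cpm_ctrl = bulk_ctrl.get(gene, 0) * 1e6 / sum(bulk_ctrl.values())
    return math.log2(cpm_test / cpm_ctrl)

if __name__ == "__main__":
    # Toy data: "GeneX" stands for a gene inside the deleted locus.
    cells = [
        {"GeneX": 4, "Other": 96}, {"GeneX": 6, "Other": 94},  # wild type
        {"GeneX": 2, "Other": 98}, {"GeneX": 3, "Other": 97},  # deletion
    ]
    labels = ["wt", "wt", "del", "del"]
    bulk = pseudobulk(cells, labels)
    print(log2_fold_change(bulk["del"], bulk["wt"], "GeneX"))
```

Because the deletion group has half the normalized expression of the wild-type group, the ratio is 0.5 and its base-2 logarithm is -1.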
Use of AAV-Perturb-seq in non-genetically modified animals
In our study we applied AAV-Perturb-seq in Cas9 transgenic animals. However, some applications may require the use of disease-specific models. Performing gene editing outside of Cas9 knock-in models would require either a smaller Cas9 (e.g. SaCas9) to ensure all of the necessary elements fit in one vector or a dual or split AAV approach. These alternative approaches are well established and there are several studies applying them in vivo in multiple tissues: smaller Cas9s, dual AAVs, and split AAVs. The only unique consideration for performing AAV-Perturb-seq in these contexts is controlling the multiplicity of infection. This could be achieved by delivering a limiting amount of the vector containing the gRNA (as we demonstrated in this study) together with a much higher amount of the second vector, or dual/split vectors, containing everything else. Given the extensive literature support of in vivo gene editing in non-genetically modified animals and the compatibility between AAV-Perturb-seq and these vectors, we think it is fair to conclude that AAV-Perturb-seq would work outside of Cas9 knock-in models.
Astrocyte-specific screen
An in vivo cell type-specific screen could in principle be achieved using cell type-specific delivery or expression as well as through physical enrichment of the cell type of interest. We chose to use a cell type-specific promoter that should lead to higher levels of expression of eGFP-KASH in astrocytes, thus allowing for enrichment of infected astrocyte nuclei using FACS. We swapped the CBh promoter in the AAV genome for GfaABC1D, a promoter that was documented to have high levels of expression in astrocytes. We cloned our original gRNA library used in the unbiased screen (CBh screen), produced AAV.PHP.B and followed AAV-Perturb-seq’s experimental pipeline. Single-nucleus sequencing of ~35,000 infected cells revealed the presence of astrocytes as well as other brain cells. Although we expected more of an enrichment in astrocytes based on prior published findings describing the GfaABC1D promoter, the presence of other cell types is not unexpected when using such promoters. For example, it was shown previously that GfaABC1D can also be expressed in neurons. Nevertheless, when looking at cell type composition in the datasets, we observed that 17% of all cells are astrocytes in the GfaABC1D experiment, as opposed to 6% in the CBh screen, resulting in a sufficient number of nuclei to proceed with an astrocyte-focused AAV-Perturb-seq analysis.
Directing our attention to astrocytes, we detected a gRNA in 80% of nuclei, with ~45% expressing a unique gRNA. This was comparable to our previous results for the CBh screen. Analysis of cells with MOI 1 (without filtering unperturbed cells) indicated the presence of a transcriptional phenotype in astrocytes perturbed in Dgcr8 (3 DEGs) and Dgcr14 (16 DEGs). These findings confirm our original result indicating that the perturbed genes do not elicit an overtly strong transcriptional response in astrocytes.
Although the number of DEGs for Dgcr8-perturbed astrocytes did not meet our standards for LDA filtering and signal enrichment, and therefore we could not stratify perturbed and non-perturbed cells, we thought we may be able to proceed with unfiltered data. As noted previously, the transcriptional signature before and after filtering tends to correlate, making this approach sensible with the caveat of increased noise due to the inclusion of unperturbed cells. Considering this approach, we were curious to investigate whether the transcriptional phenotype observed in astrocytes was similar to our previous results in neurons, and therefore calculated the correlation between astrocytes and neuron types perturbed in Dgcr8 and Dgcr14. We demonstrate that astrocytes have a perturbation-specific transcriptional signal that is comparable with our previous observations in neuron types. Support for this comes from high perturbation-specific correlation and overlap of DEGs between cell type pseudobulk profiles. Overall, the combination of the results from the unbiased CBh screen and the astrocyte-specific GfaABC1D screen suggests that mutations in Dgcr8 and Dgcr14 lead to perturbation-specific transcriptional phenotypes that are shared across brain cells beyond neuron types.
In sum, setting aside the lack of reproducibility of prior published findings regarding GfaABC1D in our hands, we have shown that focusing AAV-Perturb-seq on less abundant cell types through the use of cell type-specific promoters is in principle possible.
Example 16: Material and Methods
Plasmid design and cloning
AAV genome plasmids (Fig. 1a, Fig. 6a) were based on the Addgene plasmid #60231 (Platt, R. J. et al., Cell 159, 440-455 (2014)). To achieve widespread transgene expression, the hSyn promoter was replaced by the ubiquitous CBh promoter (pAS088). For the triple color experiments (Fig. 6a), the U6 expression cassette and Cre were removed, while eGFP was replaced by mTagBFP2 (Addgene plasmid #55302), Venus (Addgene plasmid #22663), or mCherry (Addgene plasmid #27970) (pAS132, pAS133, pAS134). To prepare the 3’ capture AAV genome, the original U6 expression cassette was first removed by restriction digestion with MluI (ThermoFisher) and XbaI (ThermoFisher) from upstream of the pol-II promoter and cloned between the WPRE and poly(A) signal sequences (pAS006).
gRNA library design
We focused on a set of genes located within the human 22q11.2 locus and conserved in the mouse genome. Using BrainSpan data (Miller, J. A. et al., Nature 508, 199-206 (2014)), we identified 29 genes with detectable expression in the adult mouse cortex. Individual gRNA sequences targeting those genes were designed with the online tool GUIDES (http://guides.sanjanalab.org/#/). The two best scoring gRNAs for each target were selected. As a control, we used safe harbor (SH) targeting gRNAs established previously (Morgens, D. W. et al., Nat. Commun. 8, 1-8 (2017)). The use of SH-targeting rather than non-targeting gRNAs allows us to control for transcriptional changes induced by CRISPR-Cas9 DNA double strand breaks that are not directly related to the target gene of interest. To facilitate Gibson Assembly cloning, we appended a 5’ arm (TGGAAAGGACGAAACACCG, SEQ ID NO 11) and a 3’ arm (GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC, SEQ ID NO 12) to the gRNA sequences. Sequences were ordered individually as single-strand oligo DNA nucleotides (ssODNs) and pooled at a final concentration of 100 mM.
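Appending the Gibson Assembly arms to each gRNA sequence can be sketched as below. The two arm sequences are those given above (SEQ ID NO 11 and 12); the function name, the validation step, and the example spacer are hypothetical and serve only as an illustration.

```python
FIVE_PRIME_ARM = "TGGAAAGGACGAAACACCG"                    # SEQ ID NO 11
THREE_PRIME_ARM = "GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC"   # SEQ ID NO 12

def build_oligo(spacer):
    """Return the ssODN sequence: 5' arm + gRNA spacer + 3' arm."""
    spacer = spacer.upper()
    invalid = set(spacer) - set("ACGT")
    if not spacer or invalid:
        raise ValueError(f"spacer must be a non-empty A/C/G/T string: {spacer!r}")
    return FIVE_PRIME_ARM + spacer + THREE_PRIME_ARM

if __name__ == "__main__":
    # Placeholder 20-nt spacer, not a sequence from the library.
    print(build_oligo("ACGTACGTACGTACGTACGT"))
```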
gRNA library cloning
The plasmid backbone (2.5 µg) was digested with BsmBI (ThermoFisher) for 1 hr at 37 °C followed by an inactivation step for 5 min at 80 °C. The Gibson Assembly reaction was set as follows: 50 ng of digested plasmid backbone, 2 µL (200 fmoles) of ssDNA oligos (stock at 100 mM), 10 µL NEBuilder® HiFi DNA Assembly Master Mix (NEB, E2621L), and H2O up to 20 µL total reaction. The reaction was incubated for 1 hr at 50 °C. Isopropanol purification was used to concentrate the cloned gRNA library by mixing the total Gibson Assembly reaction with 20 µL isopropanol, 0.2 µL GlycoBlue Coprecipitant (ThermoFisher, AM9515) and 0.4 µL NaCl solution (stock at 5 M). The precipitation reaction was incubated at room temperature (RT) for 15 min, followed by centrifugation at > 15,000 xg for 15 min at RT. The supernatant was discarded and the DNA pellet was washed with 1 mL ice-cold 80% ethanol and finally resuspended in 10 µL TE buffer.
Plasmid amplification of pooled gRNA libraries
Pooled gRNA libraries were amplified as previously described (Joung, J. et al., Nat. Protoc. 12, 828-863 (2017)). Briefly, the plasmid library was electroporated into Endura ElectroCompetent cells (Lucigen, 60242-2) according to the manufacturer’s instructions, followed by a 1 hr recovery period at 37 °C. Bacteria were grown on a bioassay plate (Merck, D4803-1CS) for 14 hr at 37 °C. Colonies were harvested by scraping the plate surface before plasmid isolation with the QIAGEN Plasmid Maxi kit (QIAGEN, #12165) according to the manufacturer’s protocol. To confirm the distribution of gRNAs, the gRNA expression cassette was PCR amplified using KAPA HiFi ReadyMix with 100 ng of the final library as template and 0.5 µM of both custom Illumina P5 primer (AATGATACGGCGACCACCGAGATCTACAC-NNNNNNNN-ACACTCTTTCCCTACACGACGCTCTTCCGATCTGCTTTATATATCTTGTGGAAAGGACGAAACACC, SEQ ID NO 13) and P7 primer (CAAGCAGAAGACGGCATACGAGAT-NNNNNNNN-GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCCCGACTCGGTGCCACTTTTTCAA, SEQ ID NO 14). PCR of the reaction mixture was performed as follows: (1) 95 °C for 3 min; (2) 98 °C for 20 s, 63 °C for 15 s, 72 °C for 20 s (18 cycles); (3) 72 °C for 2 min. The PCR reaction was purified with double-sided 0.6x - 1.0x AMPURE bead selection (A63882, Beckman Coulter). Deep sequencing libraries were sequenced using a NextSeq 550 75 cycle kit with the following cycle distribution: 75 to read 1, 8 to index 1, and 8 to index 2.
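One way to check the gRNA distribution from such amplicon reads is to extract the sequence between the constant flanks and tally it. The sketch below is an assumption about how this could be done, not the analysis used in the study: the flank strings are the last 12 nt of the 5’ arm (SEQ ID NO 11) and the first 12 nt of the 3’ arm (SEQ ID NO 12), while the function name and the fixed spacer length of 20 nt are hypothetical.

```python
import re
from collections import Counter

UPSTREAM = "GACGAAACACCG"    # 3' end of the 5' arm (SEQ ID NO 11)
DOWNSTREAM = "GTTTTAGAGCTA"  # 5' end of the 3' arm (SEQ ID NO 12)

def count_spacers(reads, spacer_len=20):
    """Count gRNA spacers found between the two constant flanks."""
    pattern = re.compile(
        UPSTREAM + "([ACGT]{" + str(spacer_len) + "})" + DOWNSTREAM
    )
    counts = Counter()
    for read in reads:
        match = pattern.search(read.upper())
        if match:  # reads without both flanks are ignored
            counts[match.group(1)] += 1
    return counts

if __name__ == "__main__":
    spacer = "ACGTACGTACGTACGTACGT"  # placeholder, not a library sequence
    read = "CTTGTGGAAAGGACGAAACACCG" + spacer + "GTTTTAGAGCTAGAAATAGC"
    counts = count_spacers([read, read, "TTTTTTTT"])
    total = sum(counts.values())
    for seq, n in counts.most_common():
        print(seq, n, n / total)
```

Comparing the resulting fractions with the expected uniform share per gRNA gives a quick view of library skew.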
AAV production and purification
AAVs were produced in HEK293T cells and purified by iodixanol gradient centrifugation. Briefly, HEK293T cells were expanded in DMEM (Merck) + 10% FBS (Merck) + 1% HEPES (ThermoFisher). Twenty-four hours before the beginning of AAV production, cells were seeded in 15 cm dishes (HuberLab) at a density of 0.6 M cells per mL and a total of 20 mL medium per dish. Cells were transiently transfected with 21 µg of an equal molar-ratio mix of the AAV genome, AAV serotype plasmid (AAV.PHP.B), and the adeno helper plasmid pAdDeltaF6 (Puresyn) using polyethyleneimine max (PEI Max). At 48 hours post-transfection, the medium was replaced with fresh medium without FBS. Harvested medium was mixed with 5x AAV precipitation buffer (400 g PEG 8000, 146.1 g NaCl in 1 L H2O) and kept at 4 °C. One day later, cells were mechanically dislodged and centrifuged at 800 xg for 15 min. The cell pellet was resuspended in 12 mL AAV lysis buffer (50 mL of 1 M Tris-HCl (pH 8.5), 58.44 g NaCl, 5 mL of 2 M MgCl2 in 1 L) and flash frozen in liquid nitrogen. The supernatant was mixed with 5x AAV precipitation buffer, combined with the medium harvested previously, incubated for 2 hr at 4 °C, and centrifuged at 3000 xg for 1 hr at 4 °C. The resulting pellet was resuspended in 3 mL AAV lysis buffer and added to the first cell pellet. The pellet was subjected to three freeze-thaw cycles and incubated with SAN (Merck) (50 U per 15 cm dish) for 1 h at 37 °C. After two centrifugation steps (saving the supernatant) at 3000 xg for 15 min at 4 °C, 14.5 mL of the supernatant containing AAV particles were poured into an ultracentrifuge-ready tube (Beckman Coulter). Gradients were prepared by sequential pipetting of the following iodixanol solutions: 9 mL (15%), 6 mL (25%), 5 mL (40%), and 5 mL (54%). Gradients were ultracentrifuged using a Beckman type 70 Ti rotor at 63,000 rpm for 2 hr at 4 °C. To recover the AAV particles, the tubes were pierced at the bottom, 4 mL of gradient (mainly 54% phase) were allowed to pass through and discarded, and the next 3.5 mL (containing isolated AAV) were kept. To remove the iodixanol and concentrate the AAV, the solution was diluted with PBS + 10% glycerol and centrifuged through a 15 mL Amicon 100 kDa MWCO filter unit (Amicon) at 1000 xg for 10 min. The dilution and centrifugation steps were repeated for three rounds. The resulting AAV solutions were aliquoted and flash-frozen in liquid nitrogen. The AAV particle concentration was determined by ddPCR (BioRad). Briefly, 5 µL of isolated AAVs were diluted 10x in water and treated with DNAse I (NEB, M0303S) before preparing tenfold serial dilutions with ddPCR dilution buffer [Ultrapure Water with 2 ng/µL sheared salmon sperm DNA (Thermo Fisher Scientific, AM9680) and 0.05% Pluronic F-68 (Thermo Fisher Scientific, 24040032)].
ddPCR reactions with primers targeting the WPRE sequence (WPRE_fwd: CTTTCCCCCTCCCTATTG (SEQ ID NO 15); WPRE_rev: CAACACCACGGAATTGTC (SEQ ID NO 16); WPRE_probe: CACGGCGGAACTCATCG (SEQ ID NO 17)) were performed with 5.5 µL of the diluted AAV template, 11 µL ddPCR supermix for probes (BioRad, 1863024), 0.9 µM of both primers, and 0.25 µM probe in a total of 22 µL. Droplets were generated with a BioRad ddPCR apparatus according to the manufacturer’s instructions. The amplification reaction was performed as follows: (1) 95 °C for 10 min; (2) 95 °C for 30 s, 60 °C for 1 min (42 cycles); (3) 72 °C for 15 s; (4) 98 °C for 10 min. Data were collected and analyzed with a BioRad ddPCR apparatus to calculate the number of viral particles per µL.
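The back-calculation from a ddPCR reading to the stock titer can be sketched as follows. This is an illustrative calculation under stated assumptions: the instrument is assumed to report copies per µL of reaction, any extra dilution introduced by the DNAse I treatment is ignored, and the function and parameter names are hypothetical. The 22 µL reaction volume, 5.5 µL template volume, initial 10x dilution, and tenfold serial dilutions are taken from the text.

```python
def aav_titer_per_ul(copies_per_ul_reaction, serial_dilution_steps,
                     reaction_volume_ul=22.0, template_volume_ul=5.5,
                     initial_dilution=10.0, step_dilution=10.0):
    """Estimate viral genomes per µL of undiluted AAV stock.

    copies_per_ul_reaction: concentration measured in the ddPCR reaction.
    serial_dilution_steps: number of tenfold steps after the initial dilution.
    """
    # Template is diluted when added to the reaction mix (22 / 5.5 = 4x).
    in_reaction_dilution = reaction_volume_ul / template_volume_ul
    # Cumulative dilution of the stock before it was used as template.
    pre_dilution = initial_dilution * step_dilution ** serial_dilution_steps
    return copies_per_ul_reaction * in_reaction_dilution * pre_dilution

if __name__ == "__main__":
    # Hypothetical reading: 500 copies/µL at the fourth tenfold dilution.
    print(f"{aav_titer_per_ul(500, 4):.2e} genomes per µL")
```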
Mice
All animal work was performed under the guidelines of the ETH Animal Welfare Office, the University Basel Veterinary Office, and the Basel-Stadt Cantonal Veterinary Office. Mice were kept under specific pathogen-free conditions on a standard light cycle. Six- to eight-week-old male Rosa26-LSL-Cas9 mice1 were used unless otherwise indicated below. Six- to eight-week-old male dCas9-KRAB mice (JAX stock #030000) were used. Eight-week-old male LgDel+/+ and LgDel+/- mice5 were used for the 22q11.2 DS model snRNA-seq cell atlas.
Mouse injections
Triple-color experiment: We developed the triple color experiment to fine-tune AAV injection conditions (Fig. 6a). The three AAV genomes were individually packaged into the AAV.PHP.B capsid and purified as indicated in “AAV production and purification”. Different viral particle doses (low: 2.5 x 10^9; medium: 5.0 x 10^9; and high: 2.5 x 10^10, total number of particles) were generated by pooling equal portions of the three viruses. Animals were split into cages according to their experimental groups. After tail vein injection of 100 µL of the AAV mixtures into LSL-Cas9 mice, animals were kept for three weeks under standard conditions before tissue extraction and processing.
Pooled screen: AAV particles carrying gRNAs to target 22q11.2 locus genes were generated as indicated in “AAV production and purification”. A single dose of 5.0 x 10^9 viral particles in 100 µL total volume was injected per mouse. Animals were kept for four weeks under normal conditions before brain tissue extraction and processing.
Arrayed confirmation experiments and CRISPRi experiment: Viruses carrying gRNAs to target validation genes were individually prepared as in “AAV production and purification”. Animals were split into cages according to their experimental groups before tail vein injection (100 µL) of 5.0 x 10^9 viral particles carrying unique gRNAs. Animals were kept for four weeks under standard conditions before tissue extraction and processing. Animals injected with Ufd1l-targeting gRNAs presented comorbidities three weeks after injection and had to be sacrificed at that time point.
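Preparing an injection mix from titered stocks is a short piece of arithmetic, sketched below. The dose of 5.0 x 10^9 particles and the 100 µL injection volume come from the text; the function name, the example titers, and the assumption that each pooled virus contributes an equal share of the dose are illustrative only. The diluent itself is not specified here.

```python
def injection_mix(titers_per_ul, total_dose=5.0e9, final_volume_ul=100.0):
    """Return (stock volumes in µL, diluent volume in µL) for one injection.

    titers_per_ul: titer of each virus stock in particles per µL.
    Each stock is assumed to contribute an equal share of total_dose.
    """
    share = total_dose / len(titers_per_ul)
    volumes = [share / titer for titer in titers_per_ul]
    diluent = final_volume_ul - sum(volumes)
    if diluent < 0:
        raise ValueError("stocks are too dilute for the requested dose and volume")
    return volumes, diluent

if __name__ == "__main__":
    # Single virus at a hypothetical titer of 2e8 particles per µL.
    print(injection_mix([2.0e8]))
    # Three pooled viruses, as in a triple-color mix, with hypothetical titers.
    print(injection_mix([1.0e8, 1.0e8, 1.0e8], total_dose=3.0e9))
```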
Brain tissue harvesting for nuclei preparations
Animals were intravenously injected with a lethal dose of pentobarbital (100 mg/kg body weight) before transcardial perfusion with 15 mL of ice cold 1x PBS followed by 15 mL of ice cold artificial cerebrospinal fluid (aCSF, in mM: 87 NaCl, 2.5 KCl, 1.25 NaH2PO4, 26 NaHCO3, 75 sucrose, 20 glucose, 1 CaCl2, 7 MgSO4). The brain was removed and placed into a mouse brain matrix slicer (Zivic Instruments, BSMAS001-1), 1 mm slices were immediately snap-frozen, and the region of interest was manually dissected into a frozen Eppendorf tube. Tissue samples were kept at -80 °C.
Nuclei Isolation
Nuclei isolation was performed with mechanical and chemical tissue dissociation procedures. A tissue grinder (Sigma-Aldrich, D8938) was filled with 2 mL of ice-cold nuclei isolation buffer (NIB) (Sigma-Aldrich, NUC101-1KT) and frozen pieces of tissue were directly placed inside the grinder. For all experiments, nuclei from different animals were isolated in individual grinders, except for the 22q11.2 pooled screen, in which tissue of 15 animals was pooled into 3 grinders to reduce the number of isolations and the waiting time before subsequent procedures. The tissue was mechanically disrupted with 25 strokes with pestle A followed by 25 strokes with pestle B. The homogenized solution was transferred to a protein low-binding tube (Eppendorf, 0030122216), mixed with an additional 2 mL of NIB, incubated for 5 min, and immediately centrifuged at 500 xg for 5 min at 4 °C. Supernatant was discarded and the pellet was resuspended in 4 mL NIB, incubated for 5 min and centrifuged at 500 xg for 5 min at 4 °C. The pellet was resuspended in 4 mL of nuclei wash buffer [NWF: 1% BSA in 1x PBS, 50 U/mL SUPERase-In RNA inhibitor (ThermoFisher, AM2694), and 50 U/mL Enzymatics RNA inhibitor (Enzymatics, Y9240L)] and centrifuged at 500 xg for 5 min at 4 °C. Finally, the nuclei pellet was resuspended in 1 mL NWF and filtered through a 30 µm cell strainer (Sysmex) into a new protein low-binding tube.
Fluorescence-Activated Nucleus Sorting
Fluorescence-Activated Nucleus Sorting (FANS) was performed to: 1) quantify infected nuclei in the triple-color experiment; 2) purify nuclei from debris to ensure a clean nuclei solution before snRNA-seq library preparation; 3) isolate GFP+ nuclei to prepare snRNA-seq libraries with nuclei from infected cells. Briefly, isolated nuclei solutions were spiked with 2 µL/mL of Vybrant DyeCycle Ruby Stain (ThermoFisher, V10273) and sorted in a MA900 apparatus (Sony). Singlet nuclei were gated based on the DNA dye signal as illustrated in Fig. 6f and GFP+ nuclei sorted into ice-cold NWF. Nuclei were centrifuged at 500 xg for 7 min at 4 °C, resuspended in NWF, and counted on a Spectrum Cell Counter (Cellometer) apparatus.
Fluorescence imaging
For fluorescence imaging analyses, animals were sacrificed by intravenous injection with a lethal dose of pentobarbital (100 mg/kg body weight), followed by perfusion with 15 mL of ice cold 1x PBS and 15 mL of 4% PFA in 1x PBS. Brain tissue was incubated in 4% PFA in 1x PBS overnight and subsequently transferred to 1x PBS with 30% sucrose, where it was left until it sank. Brains were then embedded in OCT and sections of 20 µm were cut on a cryotome. Imaging was performed in a LSM 900 apparatus (Zeiss).
Single nucleus library preparation for gene expression and gRNA capture
Nuclei infected with AAV carrying the 3’ capture design (pAS006) were sequenced with a Chromium Single Cell 3' Reagent Kit v3 (10x Genomics). The nuclei suspension was diluted to 1,000 nuclei/µL and processed according to the kit’s protocol with 13 cycles of cDNA amplification and 14 cycles of sample indexing PCR. To amplify gRNA sequences from the RNA polymerase II-driven transcript, we performed a three-step hemi-nested PCR reaction with KAPA HiFi ReadyMix. To avoid overamplification, all PCRs were spiked with EvaGreen (Biotium), monitored by qPCR, and stopped before reaching saturation (exit from the exponential phase). PCR 1 with primers targeting the U6 promoter sequence (TTTCCCATGATTCCTTCATATTTGC, SEQ ID NO 3) and read 1 sequence (ACACTCTTTCCCTACACGACG, SEQ ID NO 4) was performed with 20 ng of full-length single-cell cDNA library as DNA template. PCR 2 was performed with a forward primer targeting the U6 sequence immediately before the gRNA and containing the P7 adapter (GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGcTTGTGGAAAGGACGAAACAC, SEQ ID NO 5), a reverse P5 primer (AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACG, SEQ ID NO 6), and 2 µL of the PCR 1 reaction as template. Finally, a third PCR to index samples for deep sequencing used 2 µL of the PCR 2 reaction as template and was performed with a forward P7 index primer (CAAGCAGAAGACGGCATACGAGATNNNNNNNNGTCTCGTGGGCTCGG, SEQ ID NO 7) and the P5 primer as reverse (same primer used in PCR 2). All primers were used at a final concentration of 0.3 µM. Amplification reactions were performed as follows: (1) 95 °C for 3 min; (2) 98 °C for 20 s, 65 °C for 15 s (72 °C for PCR 3), 72 °C for 20 s (number of cycles up to qPCR saturation); (3) 72 °C for 2 min. The final PCR reaction was cleaned and purified with double-sided 0.6x-1.2x AMPure bead selection (Beckman Coulter).
Gene expression and gRNA libraries (5% of flow cell) were sequenced on a NextSeq 550 75 cycle kit with the following cycle distribution: 28 to read 1, 8 to index 1, and 56 to read 2.
Nuclei infected with the 5’ capture design (pAS088, preliminary experiments, pooled screen, and confirmation experiments) were sequenced with a Chromium Single Cell 5' Reagent Kit v1 (10x Genomics). To capture gRNA molecules, we altered the reverse transcription (RT) reaction to additionally include a gRNA-constant-region-targeting RT primer (0.15 µM, AAGCAGTGGTATCAACGCAGAGTACCAAGTTGATAACGGACTAGCC, SEQ ID NO 8) (Mimitou, E. P. et al., Nat. Methods 16, 409-412 (2019); Replogle, J. M. et al., Nat. Biotechnol. 38, 954-961 (2020)). After cDNA amplification (16 cycles), the reaction was purified with 0.6x SPRI beads (Beckman Coulter). At this point, longer cDNAs (more than 300 bp) from mRNA molecules bind to the beads, while the shorter cDNAs (approximately 200 bp) from gRNA sequences remain free in the supernatant. The preparation of gene expression libraries was performed as indicated by the kit’s protocol, with 14 cycles of sample indexing PCR. To recover the gRNA-cDNA sequences, the supernatant from the above step was purified with 1.4x SPRI beads and eluted in 30 µL of ultra-pure water (ThermoFisher). A 1:10 diluted aliquot was loaded onto an Agilent Bioanalyzer High Sensitivity chip (Agilent) to confirm the presence of a gRNA band of ~180 bp. The gRNA-cDNA library (30 ng) was subjected to a sample indexing PCR using KAPA HiFi ReadyMix and 1 µM of P5 primer (AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTC, SEQ ID NO 18) and P7 indexing primer binding to the gRNA constant region directly downstream of the spacer sequence (CAAGCAGAAGACGGCATACGAGATNNNNNNNNGTCTCGTGGGCTCGGAGATGTGTATAAGAGACAGTATTTCTAGCTCTAAAAC, SEQ ID NO 10). Amplification reactions were performed as follows: (1) 95 °C for 3 min; (2) 98 °C for 20 s, 54 °C for 30 s, 72 °C for 20 s (15 cycles); (3) 72 °C for 5 min. The final PCR reaction was cleaned and purified with double-sided 0.6x-1.2x SPRI bead selection. Gene expression and gRNA libraries (5% of flow cell) were sequenced with a NextSeq 550 75 cycle kit or a NovaSeq 100 cycle kit with the following cycle distribution: 26 to read 1, 8 to index 1, and 56 (NextSeq) or 91 (NovaSeq) to read 2.
Deep sequencing quantification of Cas9 induced indels
To quantify the efficiency of Cas9 gene editing, 10,000 GFP positive nuclei were sorted into quick extraction buffer (QE, in mM: 1 CaCl2, 3 MgCl2, 1 EDTA, 10 Tris pH 7.5; 1% Triton X-100, and 0.2 mg/mL proteinase K freshly added before use) and subjected to DNA extraction at 65 °C for 10 min, 68 °C for 10 min, and 98 °C for 10 min. Genomic DNA was used as template for a first PCR reaction followed by a second PCR to index individual samples and attach P5 and P7 sequences. All reactions were performed with KAPA HiFi ReadyMix. Briefly, PCR 1 was performed to specifically amplify ~150 bp around the Cas9 cut site (keeping the cut site central in the amplicon) from genomic DNA (5 µL) with gene specific primers containing adapters (0.5 µM, fwd: ACACTCTTTCCCTACACGACGCTCTTCCGATCT (SEQ ID NO 19) + forward gene specific sequence; rev: GTGACTGGAGTTCAGACGTGTGCTCTTCCGATC (SEQ ID NO 20) + reverse gene specific sequence). PCR 1 amplification was performed as follows: (1) 95 °C for 3 min; (2) 98 °C for 20 s, primer set specific annealing temperature for 15 s, 72 °C for 20 s (15 cycles); (3) 72 °C for 2 min. A second PCR to index samples with P5 and P7 primers (0.25 µM, P5: AATGATACGGCGACCACCGAGATCTACAC-NNNNNNNN-ACACTCTTTCCCTACACGACGCTCTTCCGATCT (SEQ ID NO 21); P7: CAAGCAGAAGACGGCATACGAGAT-NNNNNNNN-GTGACTGGAGTTCAGACGTGTGCTCTTCCGATC, SEQ ID NO 21) was performed as follows: (1) 95 °C for 3 min; (2) 98 °C for 20 s, 70 °C for 15 s, 72 °C for 20 s (15 cycles); (3) 72 °C for 2 min. Indexed samples were pooled, purified with a PCR purification & concentration kit (Zymo Research, D4013), and loaded on a 2% E-Gel (Thermo Fisher Scientific, G402022). The PCR product (~250 bp) was extracted from the agarose gel with the QIAquick Gel Extraction Kit (QIAGEN, 28706X4) and sequenced using a NextSeq 550 150 cycle kit with the following cycle distribution: 150 to read 1, 8 to index 1, and 8 to index 2.
Single nucleus data processing
Raw reads of snRNA-seq gene expression libraries were initially analyzed with CellRanger v4.0 software (10x Genomics) using a mouse reference genome (Ensembl Mouse GRCm38) to generate UMI count matrices and to join data resulting from different library preparation lanes. Normalization and cell-type clustering were performed with the Seurat v3.0 package in R. Briefly, UMI counts were scaled to 10,000 molecules per nucleus and log-normalized with the NormalizeData function, followed by selection of the top 2,000 variable genes with FindVariableFeatures. The normalized expression of top genes was scaled and standardized across all nuclei (z-score transformation) with the function ScaleData and used for dimensional reduction with principal component analysis (PCA) implemented in RunPCA. We used the first 10 principal components (PCs) to cluster nuclei based on expression of top variable genes (FindNeighbors and FindClusters). Nuclei were projected to two dimensions with UMAP embedding (RunUMAP and DimPlot) and colored by cluster to evaluate clustering performance. To assign clusters to specific cell types, we first identified gene markers (FindAllMarkers) for each cluster and investigated their expression in the brain cell reference dataset https://DropViz.org (Saunders, A. et al., Cell 174, 1015-1030.e16 (2018)).
gRNA assignment to individual nucleus
To assign gRNA identities to nuclei, we first analyzed raw deep sequencing reads (from gRNA-specific enrichment libraries) with CellRanger. While these libraries do not align to the mouse reference genome, CellRanger outputs a BAM file containing reads tagged with corrected cell barcodes and UMIs. Cell barcode correction is important to increase the agreement between barcodes found in both gene and gRNA expression datasets. Custom scripts were used to extract gRNA count tables (from the corrected BAM files) containing information about cell barcode, gRNA sequence, UMI counts, and read counts. The gRNA sequences were aligned to a reference list using Bowtie 2, permitting a maximum edit distance of 2 bp. For each nucleus, we removed every gRNA that had a coverage (READ_counts / UMI_counts) of less than 60 (the coverage should be calculated for each new deep sequencing run, as it depends on the total number of reads attributed to the library), that was supported by only 1 captured molecule, or whose UMI counts represented less than 10% of all gRNA counts identified in the nucleus. The gRNA counts, as well as the number of gRNAs detected in each nucleus and their identity, were appended to the nuclei metadata in the Seurat object.
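The three per-nucleus filters described above can be sketched in Python as follows. This is a minimal illustration, not the custom scripts referred to in the text: the function name, the record layout, and the choice to compute the 10% fraction over all gRNA UMIs detected in a nucleus before filtering are assumptions made here.

```python
from collections import defaultdict


def filter_grna_calls(records, min_coverage=60.0, min_umis=2, min_fraction=0.10):
    """Keep gRNA calls that pass the coverage, molecule-count and fraction filters.

    records: iterable of (cell_barcode, grna, umi_count, read_count) tuples
             (hypothetical layout of the gRNA count table).
    Returns {cell_barcode: {grna: umi_count}} for the surviving calls.
    """
    records = list(records)

    # Total gRNA UMIs per nucleus, needed for the fraction filter.
    totals = defaultdict(int)
    for barcode, _grna, umis, _reads in records:
        totals[barcode] += umis

    kept = defaultdict(dict)
    for barcode, grna, umis, reads in records:
        if umis < min_umis:                        # only 1 captured molecule
            continue
        if reads / umis < min_coverage:            # coverage = reads per UMI
            continue
        if umis / totals[barcode] < min_fraction:  # minor gRNA in this nucleus
            continue
        kept[barcode][grna] = umis
    return dict(kept)
```

The coverage threshold of 60 is specific to one sequencing run according to the text, so it is exposed as a parameter rather than hard-coded.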
Differential expression analysis
Recent studies have highlighted that pseudobulk analysis of single-cell gene expression data better recapitulates true differences between conditions. Thus, we generated pseudobulk profiles and applied a bulk RNA-seq statistical method to calculate LFC and FDR. Pseudobulk profiles in pooled screen datasets were generated as follows: for each cell type and perturbation, we summed raw UMI counts within each single-nucleus library lane (9 lanes in total). This step transforms the data from a matrix where each column represents one nucleus, to a matrix where each column is the sum of all nuclei from the same lane that contain a given perturbation. For LgDel samples (LgDel+/- and LgDel+/+), pseudobulk profiles were generated by aggregating raw UMI counts of nuclei from the same sample (i.e., same animal) and cell type. Differential gene expression of pseudobulk profiles was performed with the R package edgeR v3.36.0 (Robinson, M. D. et al., Bioinforma. Oxf. Engl. 26, 139-140 (2010)). For each cell type, we used the likelihood ratio test (edgeR-LRT) to calculate LFC and FDR values for each perturbation against SH control. The same process was used to compare LgDel+/- against LgDel+/+ samples. Differential expression analysis with a scRNA-seq method was performed with the Seurat function FindMarkers (parameter test.use = “LR”) to compare each perturbation against SH control, for each cell type individually. For all analyses based on LFC values, we focused on genes with average expression higher than 0.25 UMI per nucleus in the control group (SH control or LgDel+/+), which typically resulted in a list of ~4,000 genes.
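The aggregation step and the expression filter can be illustrated with a short Python sketch. It only shows how single-nucleus columns collapse into pseudobulk columns; the dictionary layout, the function names, and the reading of "average expression" as mean raw UMI count per control nucleus are assumptions, and the edgeR testing itself is not reproduced.

```python
from collections import defaultdict


def pseudobulk(counts, groups):
    """Sum raw UMI counts over nuclei that share a group label.

    counts: {nucleus_id: {gene: raw_umi_count}}
    groups: {nucleus_id: (lane, perturbation)}   # hypothetical grouping key
    Returns {(lane, perturbation): {gene: summed_count}}.
    """
    profiles = defaultdict(lambda: defaultdict(int))
    for nucleus, gene_counts in counts.items():
        key = groups[nucleus]
        for gene, n in gene_counts.items():
            profiles[key][gene] += n
    return {key: dict(genes) for key, genes in profiles.items()}


def expressed_genes(counts, control_nuclei, min_mean_umi=0.25):
    """Genes whose mean raw UMI count per control nucleus exceeds the cutoff."""
    totals = defaultdict(int)
    for nucleus in control_nuclei:
        for gene, n in counts[nucleus].items():
            totals[gene] += n
    n_ctrl = len(control_nuclei)
    return {gene for gene, total in totals.items() if total / n_ctrl > min_mean_umi}
```

For the LgDel comparison, the same `pseudobulk` helper would apply with the grouping key set to (animal, genotype) instead of (lane, perturbation).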
Perturbation and nuclei filtering
To identify perturbations leading to a strong transcriptional signature, we proceeded as follows (for each cell type C and perturbation P):
1. Considering all nuclei belonging to C and P, we calculated pseudobulk differential expression against SH control nuclei from cell type C. If fewer than 5 DEGs were detected (FDR < 0.05), we assumed that P did not lead to a significant transcriptional phenotype in cell type C and the perturbation was considered non-significant.
2. As CRISPR-Cas9 induced mutations are typically not observed in all cells carrying a gRNA, we implemented a filtering step. Here, the goal is to identify and remove nuclei with a transcriptional signature closer to control than to perturbed nuclei. For all perturbations identified as significant (DEG > 5, FDR < 0.05), we applied Linear Discriminant Analysis (LDA) with the lda function in the R package MASS (Venables, W. N. & Ripley, B. D., Modern Applied Statistics with S (Springer, 2002). doi:10.1007/978-0-387-21706-2). For each P, we trained an LDA model with a single-nucleus matrix containing nuclei belonging to SH control and P as observations and P-specific DEGs as variables, and used the model to predict nuclei labels (SH control or P). We removed all nuclei belonging to a perturbation group whose predicted label did not agree with the true experimental label and kept all SH control nuclei.
Hotelling’s T-squared (T²) statistic
To orthogonally identify perturbations leading to strong transcriptional phenotypes and confirm the results from DEG-based filtering, we used Hotelling’s T² statistic (multivariate t-test) (Hotelling, H. The Generalization of Student’s Ratio, in Breakthroughs in Statistics: Foundations and Basic Theory (eds. Kotz, S. & Johnson, N. L.) 54-65 (Springer, 1992). doi:10.1007/978-1-4612-0919-5_4; Ursu, O. et al., Nat. Biotechnol. 40, 896-905 (2022)). Briefly, for each cell type we performed dimensional reduction with PCA to reduce the multivariate space from ~5,000 genes to 20 principal components and performed pairwise comparison of each perturbation to SH control nuclei.
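For reference, the textbook two-sample form of the statistic is given below. The symbols are introduced here for illustration: $n_P$ and $n_C$ are the numbers of perturbed and SH control nuclei, $\bar{x}_P$ and $\bar{x}_C$ their mean vectors in the $p = 20$ principal components, and $S_P$, $S_C$ the group covariance matrices. Whether the analysis used exactly this pooled-covariance form is an assumption.

```latex
T^2 = \frac{n_P\, n_C}{n_P + n_C}\,
      (\bar{x}_P - \bar{x}_C)^{\top} S^{-1} (\bar{x}_P - \bar{x}_C),
\qquad
S = \frac{(n_P - 1)\, S_P + (n_C - 1)\, S_C}{n_P + n_C - 2}

F = \frac{n_P + n_C - p - 1}{p\,(n_P + n_C - 2)}\; T^2
    \;\sim\; F_{p,\; n_P + n_C - p - 1}
```

Under the null hypothesis of equal mean vectors, the rescaled statistic follows an F distribution, which yields a p-value for each perturbation against SH control.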
Nuclei UMAP embedding based on perturbation transcriptional signature
For each cell type individually, we evaluated whether UMAP embedding was able to separate nuclei based on their perturbation transcriptional phenotype (Fig. 2d). First, DE genes (LFC > 0.5, FDR < 0.01) were selected from all perturbations. This process yielded a matrix with nuclei from all perturbations and SH control as observations and DE genes from all perturbations as variables. Normalized UMI counts were centered and scaled with the Seurat function ScaleData and used for UMAP embedding with the R package uwot (McInnes, L. et al., http://arxiv.org/abs/1802.03426 (2020) doi:10.48550/arXiv.1802.03426). The following parameters were used: metric = “cosine”; n_neighbors = 10; min_dist = 5; and spread = 10.
Augur scoring analysis
The Augur R package (Skinnider, M. A. et al., Nat. Biotechnol. 39, 30-34 (2020)) was created to identify cell types that exhibit a high degree of transcriptional changes when comparing control and perturbed cells. The same rationale can be applied to identify perturbations that lead to a transcriptional phenotype. Briefly, for each cell type, we used the function calculate_auc() with recommended parameters to calculate Augur scores for each perturbation. We focused on genes with average expression higher than 0.25 UMI per nucleus in the control group and used the entire group of nuclei for each perturbation and cell type combination.
Identifying perturbation specific transcriptional phenotypes
For all perturbations with a strong transcriptional phenotype, we performed pseudobulk differential expression using all nuclei passing LDA filtering against the SH control group (Fig. 2c and Fig. 3c). This step was repeated for each cell type individually. The top 20 up-regulated genes (LFC > 0.5 and FDR < 0.01) from each perturbation were used for the heatmap in Fig. 2d. To create genetic programs relevant for each perturbation, we selected all genes with an absolute LFC above 0.5 and FDR < 0.01 and split them into two programs: up-regulated genes (LFC > 0.5) and down-regulated genes (LFC < -0.5). To reveal biological processes associated with dysregulated genes, the entire list of genes served as input for functional enrichment analysis with the R package g:Profiler (g:Profiler: a web server for functional interpretation of gene lists (2016 update), Nucleic Acids Research) functions g:GOSt and g:SCS. A multiple hypothesis testing correction method applying a significance threshold of 0.05 was used. The top biological processes (GO:BP) for each gene program were selected as representative terms in Fig. 3c.
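Splitting a differential expression table into the two gene programs can be sketched as below. The function name and the `{gene: (lfc, fdr)}` layout are hypothetical; down-regulated genes are taken as those with LFC below -0.5, in line with the absolute-LFC criterion stated above.

```python
def split_programs(de_results, lfc_cutoff=0.5, fdr_cutoff=0.01):
    """Split DE results into up- and down-regulated gene programs.

    de_results: {gene: (lfc, fdr)} for one perturbation and cell type.
    Returns (up_genes, down_genes) as sorted lists.
    """
    up = sorted(
        gene for gene, (lfc, fdr) in de_results.items()
        if fdr < fdr_cutoff and lfc > lfc_cutoff
    )
    down = sorted(
        gene for gene, (lfc, fdr) in de_results.items()
        if fdr < fdr_cutoff and lfc < -lfc_cutoff
    )
    return up, down
```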
Computational dissection of zygosity in perturbed nuclei
We applied the R package destiny (Angerer, P. et al., Bioinformatics 32, 1241-1243 (2016)) to align nuclei along a pseudotemporal space with diffusion maps (Haghverdi, L. et al., Bioinformatics 31, 2989-2998 (2015)). Briefly, for each perturbation we extracted gene expression data (from the arrayed perturbations experiment) from SH control and perturbed nuclei, used it as input to the function DiffusionMap(), and extracted the first two diffusion components (DC) for plotting. To calculate artificial zygosity labels, we performed k-means clustering of DC1 with k = 3. Differential expression was performed as indicated in “Differential expression analysis”.
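The clustering of DC1 into three groups can be illustrated with a one-dimensional k-means (Lloyd's algorithm) in plain Python. This sketch is not the R routine used in the analysis: the quantile-based initialisation is an assumption chosen to make the result deterministic, and the labels are simply ordered by cluster center, without implying which cluster corresponds to which zygosity.

```python
def kmeans_1d(values, k=3, max_iter=100):
    """Cluster scalar values with Lloyd's algorithm.

    Returns one label per value, with labels ordered by cluster center
    (0 = lowest center, k - 1 = highest center).
    """
    ordered = sorted(values)
    n = len(ordered)
    # Deterministic initialisation: spread the centers over the quantiles.
    centers = [ordered[(2 * i + 1) * n // (2 * k)] for i in range(k)]

    def assign(current):
        return [min(range(k), key=lambda c: abs(v - current[c])) for v in values]

    for _ in range(max_iter):
        labels = assign(centers)
        updated = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            updated.append(sum(members) / len(members) if members else centers[c])
        if updated == centers:  # converged
            break
        centers = updated

    labels = assign(centers)
    rank = {c: r for r, c in enumerate(sorted(range(k), key=lambda c: centers[c]))}
    return [rank[lab] for lab in labels]
```

Here `values` would be the DC1 coordinate of each nucleus from one perturbation together with its SH control nuclei.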
Gene program scores
Gene programs (Fig. 3d-f) were identified as indicated in “Identifying perturbation specific transcriptional phenotypes”. To calculate scores (i.e., the average expression of all genes belonging to a program across nuclei), we first normalized and center-scaled raw UMI counts for all nuclei. Then, for each nucleus, we averaged the expression of genes in the program and divided nuclei by perturbation before visual representation with ridge plots.
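The score calculation can be sketched as follows, assuming that expression values have already been normalized and that centering and scaling means a per-gene z-score across nuclei. The function name and input layout are hypothetical.

```python
import math


def program_scores(expr, program):
    """Average z-scored expression of the program genes for each nucleus.

    expr:    {gene: [normalized expression, one value per nucleus]}
    program: list of gene names belonging to the program
    Returns a list with one score per nucleus.
    """
    n = len(next(iter(expr.values())))
    scaled = {}
    for gene in program:
        vals = expr[gene]
        mu = sum(vals) / n
        sd = math.sqrt(sum((v - mu) ** 2 for v in vals) / (n - 1))
        # Genes without variance contribute zero to every nucleus.
        scaled[gene] = [(v - mu) / sd if sd > 0 else 0.0 for v in vals]
    return [sum(scaled[g][i] for g in program) / len(program) for i in range(n)]
```

Grouping the returned scores by the perturbation label of each nucleus then gives the distributions shown as ridge plots.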
Pearson correlation analyses
Pearson correlations between individual perturbations and cell types were calculated using the LFC values of all genes differentially expressed in at least one condition (abs(LFC) > 0.5 and FDR < 0.01) as variables. To calculate correlations between screen and array experiments (Fig. 3b), we selected all genes differentially expressed (abs(LFC) > 0.5 and FDR < 0.01) in at least one condition and experimental group.
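A minimal sketch of the correlation between two conditions is shown below. It selects genes that pass the thresholds in at least one of the two conditions and correlates their LFC values; the function names and the `{gene: (lfc, fdr)}` layout are assumptions, and restricting the comparison to genes present in both tables is a simplification made here.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def lfc_correlation(de_a, de_b, lfc_cutoff=0.5, fdr_cutoff=0.01):
    """Correlate LFC values over genes DE in at least one of two conditions.

    de_a, de_b: {gene: (lfc, fdr)} for the two conditions being compared.
    """
    def is_de(entry):
        lfc, fdr = entry
        return abs(lfc) > lfc_cutoff and fdr < fdr_cutoff

    genes = sorted(
        g for g in de_a
        if g in de_b and (is_de(de_a[g]) or is_de(de_b[g]))
    )
    return pearson([de_a[g][0] for g in genes], [de_b[g][0] for g in genes])
```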
Indel analysis
Deep sequencing libraries for indel analysis were generated as described in “Deep sequencing quantification of Cas9 induced indels” and analyzed with CRISPResso2 (Clement, K. et al., Nat. Biotechnol. 37, 224-226 (2019)) with the following parameters: -r1 “fastq file name”; -a “amplicon sequence”; -c “amplicon sequence”; -g “gRNA sequence”; --default_min_aln_score 60; --plot_window_size 20; --min_bp_quality_or_N 0; --exclude_bp_from_left 15; --exclude_bp_from_right 15; -w 1; and -wc -3.
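For scripted runs over many samples, the parameter list above could be assembled into a command line as sketched below. The helper name and the placeholder arguments are hypothetical, and the double-dash spelling of the long options reflects the CRISPResso2 command-line interface as we understand it rather than anything stated in the text. The function only builds the argument list and does not execute anything.

```python
def crispresso_args(fastq, amplicon, guide):
    """Build the CRISPResso argument list for one sample (not executed here)."""
    return [
        "CRISPResso",
        "-r1", fastq,          # single-end FASTQ file
        "-a", amplicon,        # amplicon sequence
        "-c", amplicon,        # same sequence passed to -c, as listed above
        "-g", guide,           # gRNA spacer sequence
        "--default_min_aln_score", "60",
        "--plot_window_size", "20",
        "--min_bp_quality_or_N", "0",
        "--exclude_bp_from_left", "15",
        "--exclude_bp_from_right", "15",
        "-w", "1",             # quantification window size
        "-wc", "-3",           # window center relative to the 3' end of the guide
    ]
```

The resulting list can be passed to `subprocess.run(args, check=True)` on a machine where CRISPResso2 is installed.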
Disease, gene set, and miRNA-target enrichment analysis
Disease enrichment analysis was performed with the list of genes commonly dysregulated in individual perturbations and the deletion model using the R package Enrichr and the DisGeNET database (Pinero, J. et al., Nucleic Acids Res. 48, D845-D855 (2020)). To investigate biological processes associated with LgDel transcriptional signatures (Fig. 4d), the list of expressed genes ranked from highest to lowest LFC value was used as input to the R package fgsea (Korotkevich, G. et al., Preprint at https://doi.org/10.1101/060012 (2021)) to run GSEA using the mouse GO:BP dataset. This step was repeated for each cell type individually. To study miRNA-target enrichment, the top 1000 up-regulated genes from each perturbation and cell type were uploaded as input to the online tool MIENTURNET (http://userver.bio.uniroma1.it/apps/mienturnet/; Licursi, V. et al., BMC Bioinformatics 20, 1-10 (2019)) using the miRTarBase reference dataset (Huang, H.-Y. et al., Nucleic Acids Res. 50, D222-D230 (2022)).
Robust regression model
The use of robust regression to model the LgDel transcription profile using individual perturbations follows the assumption that the LgDel expression profile is a combination of each individual perturbation (Norman, T. M. et al., Science 365, 786-793 (2019)): $\mathrm{LgDel} = c_{Dgcr8}\,\mathrm{Dgcr8} + c_{Dgcr14}\,\mathrm{Dgcr14} + c_{Gnb1l}\,\mathrm{Gnb1l}$. The expression profile of each condition (i.e., LgDel, Dgcr8, Dgcr14, and Gnb1l) is the change induced by the deletion or each perturbation (all nuclei from a given perturbation) relative to WT control nuclei (LgDel) or safe harbor control nuclei (screen). Our pseudobulk DE analysis approach calculates, for each expressed gene, LFC values that quantify the difference of a given group to the control condition. Thus, DE analysis results in a vector of LFC values that can be directly used in the model. To fit the model, we used the R package MASS function rlm and focused on genes with average expression higher than 0.25 UMI per nucleus in the control group (SH control or LgDel+/+). We used distance correlation (dcor) with the R package energy to evaluate the model fit: $d = \mathrm{dcor}(\mathrm{LgDel},\ c_{Dgcr8}\,\mathrm{Dgcr8} + c_{Dgcr14}\,\mathrm{Dgcr14} + c_{Gnb1l}\,\mathrm{Gnb1l})$.
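Written gene by gene, the fit can be sketched as an M-estimation problem. The notation is introduced here for illustration: $y_g$ is the LgDel LFC of gene $g$, $x_{g,k}$ the LFC of the same gene under perturbation $k$, $s$ a robust scale estimate of the residuals, and $\rho$ a robust loss. The Huber loss shown is the default of rlm in MASS; whether the default was used, and whether an intercept was included, is not stated in the text and is an assumption.

```latex
y_g = \sum_{k \in \{Dgcr8,\, Dgcr14,\, Gnb1l\}} c_k\, x_{g,k} + \varepsilon_g,
\qquad
\hat{c} = \arg\min_{c} \sum_{g} \rho\!\left(\frac{y_g - \sum_k c_k\, x_{g,k}}{s}\right)

\rho(u) =
\begin{cases}
\tfrac{1}{2} u^2, & |u| \le \delta \\
\delta\,|u| - \tfrac{1}{2}\delta^2, & |u| > \delta
\end{cases}
```

Compared with ordinary least squares, the linear growth of $\rho$ for large residuals limits the influence of genes whose LgDel response is poorly explained by the three perturbations.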
Cited prior art documents:
All scientific publications and patent documents cited in the present specification are incorporated by reference herein.
Stuart, T. et al. Comprehensive Integration of Single-Cell Data. Cell 177, 1888-1902.e21 (2019)
Robinson, M. D., McCarthy, D. J. & Smyth, G. K. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinforma. Oxf. Engl. 26, 139-140 (2010)
Ursu, O. et al. Massively parallel phenotyping of coding variants in cancer with Perturb-seq. Nat. Biotechnol. 40, 896-905 (2022)
Bock, C. et al. High-content CRISPR screening. Nat. Rev. Methods Primers 2, 1-23 (2022)
Haghverdi, L., Buettner, F. & Theis, F. J. Diffusion maps for high-dimensional single-cell analysis of differentiation data. Bioinformatics 31, 2989-2998 (2015)
Licursi, V., Conte, F., Fiscon, G. & Paci, P. MIENTURNET: An interactive web tool for microRNA-target enrichment and network-based analysis. BMC Bioinformatics 20, 1-10 (2019)
Pinero, J. et al. The DisGeNET knowledge platform for disease genomics: 2019 update. Nucleic Acids Res. 48, D845-D855 (2020)
Asokan, A., Schaffer, D. V. & Jude Samulski, R. The AAV Vector Toolkit: Poised at the Clinical Crossroads. Molecular Therapy 20, 699-708 (2012)
Mingozzi, F. & High, K. A. Therapeutic in vivo gene transfer for genetic disease using AAV: progress and challenges. Nat Rev Genet 12, 341-355 (2011)
Lai, Y. et al. Efficient in vivo gene expression by trans-splicing adeno-associated viral vectors. Nat Biotechnol 23, 1435-1439 (2005)
Yang, Y. et al. A dual AAV system enables the Cas9-mediated correction of a metabolic liver disease in newborn mice. Nat Biotechnol 34, 334-338 (2016)
Koblan, L. W. et al. In vivo base editing rescues Hutchinson-Gilford progeria syndrome in mice. Nature 589, 608-614 (2021)
Jin, X. et al. In vivo Perturb-Seq reveals neuronal and glial abnormalities associated with autism risk genes. Science 370, eaaz6063 (2020)
Papalexi, E. et al. Characterizing the molecular regulation of inhibitory immune checkpoints with multimodal single-cell screens. Nature Genetics 53, 322-331 (2021)
Replogle, J. M. et al. Combinatorial single-cell CRISPR screens by direct guide RNA capture and targeted sequencing. Nature Biotechnology 38, 954-961 (2020)
Dixit, A. et al. Perturb-seq: Dissecting molecular circuits with scalable single cell RNA profiling of pooled genetic screens. Cell 167, 1853 (2016)
Adamson, B. et al. A Multiplexed Single-Cell CRISPR Screening Platform Enables Systematic Dissection of the Unfolded Protein Response. Cell 167, 1867-1882.e21 (2016)
Datlinger, P. et al. Pooled CRISPR screening with single-cell transcriptome readout. Nature Methods 14, 297-301 (2017)
Jaitin, D. A. et al. Dissecting Immune Circuits by Linking CRISPR-Pooled Screens with Single-Cell RNA-Seq. Cell 167, 1883-1896.e15 (2016)
Frangieh, C. J. et al. Multimodal pooled Perturb-CITE-seq screens in patient models define mechanisms of cancer immune evasion. Nature Genetics 53, 332-341 (2021)
Griffin, J. M. et al. Astrocyte-selective AAV gene therapy through the endogenous GFAP promoter results in robust transduction in the rat spinal cord following injury. Gene Ther 26, 198-210 (2019)
Taschenberger, G., Tereshchenko, J. & Kugler, S. A MicroRNA124 Target Sequence Restores Astrocyte Specificity of gfaABC1D-Driven Transgene Expression in AAV-Mediated Gene Transfer. Molecular Therapy - Nucleic Acids 8, 13-25 (2017)
Pizzuti, A. et al. UFD1L, a Developmentally Expressed Ubiquitination Gene, is Deleted in CATCH 22 Syndrome. Human Molecular Genetics 6, 259-265 (1997)
Meechan, D. W. et al. Modeling a model: Mouse genetics, 22q11.2 Deletion Syndrome, and disorders of cortical circuit development. Progress in Neurobiology 130, 1-28 (2015)
Meechan, D. W. et al. Gene dosage in the developing and adult brain in a mouse model of 22q11 deletion syndrome. Molecular and Cellular Neuroscience 33, 412-428 (2006)
Ran, F. A. et al. In vivo genome editing using Staphylococcus aureus Cas9. Nature 520, 186-191 (2015)
Tian, R. et al. Genome-wide CRISPRi/a screens in human neurons link lysosomal failure to ferroptosis. Nat Neurosci 24, 1020-1034 (2021)
Squair, J. W. et al. Confronting false discoveries in single-cell differential expression. Nature Communications 12, 1-15 (2021)
Platt, R. J. et al. CRISPR-Cas9 knockin mice for genome editing and cancer modeling. Cell 159, 440-455 (2014)
Miller, J. A. et al. Transcriptional landscape of the prenatal human brain. Nature 508, 199-206 (2014)
Morgens, D. W. et al. Genome-scale measurement of off-target activity using Cas9 toxicity in high-throughput screens. Nat. Commun. 8, 1-8 (2017)
Joung, J. et al. Genome-scale CRISPR-Cas9 knockout and transcriptional activation screening. Nat. Protoc. 12, 828-863 (2017)
Merscher, S. et al. TBX1 is responsible for cardiovascular defects in velo-cardio-facial/DiGeorge syndrome. Cell 104, 619-629 (2001)
Mimitou, E. P. et al. Multiplexed detection of proteins, transcriptomes, clonotypes and CRISPR perturbations in single cells. Nat. Methods 16, 409-412 (2019)
Saunders, A. et al. Molecular Diversity and Specializations among the Cells of the Adult Mouse Brain. Cell 174, 1015-1030.e16 (2018)
Venables, W. N. & Ripley, B. D. Modern Applied Statistics with S. (Springer, 2002). doi:10.1007/978-0-387-21706-2
Hotelling, H. The Generalization of Student’s Ratio, in Breakthroughs in Statistics: Foundations and Basic Theory (eds. Kotz, S. & Johnson, N. L.) 54-65 (Springer, 1992). doi:10.1007/978-1-4612-0919-5_4
McInnes, L., Healy, J. & Melville, J. UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction. http://arxiv.org/abs/1802.03426 (2020) doi:10.48550/arXiv.1802.03426
Skinnider, M. A. et al. Cell type prioritization in single-cell data. Nat. Biotechnol. 39, 30-34 (2020)
g:Profiler: a web server for functional interpretation of gene lists (2016 update). Nucleic Acids Research
Angerer, P. et al. destiny: diffusion maps for large-scale single-cell data in R. Bioinformatics 32, 1241-1243 (2016)
Clement, K. et al. CRISPResso2 provides accurate and rapid genome editing sequence analysis. Nat. Biotechnol. 37, 224-226 (2019)
Xie et al. Gene Set Knowledge Discovery with Enrichr. Current Protocols (2021). https://currentprotocols.onlinelibrary.wiley.com/doi/10.1002/cpz1.90
Korotkevich, G. et al. Fast gene set enrichment analysis. Preprint at https://doi.org/10.1101/060012 (2021)
Huang, H.-Y. et al. miRTarBase update 2022: an informative resource for experimentally validated miRNA-target interactions. Nucleic Acids Res. 50, D222-D230 (2022)
Norman, T. M. et al. Exploring genetic interaction manifolds constructed from rich single-cell phenotypes. Science 365, 786-793 (2019)
Santinha, A.J., Klingler, E., Kuhn, M. et al. Transcriptional linkage analysis with in vivo AAV-Perturb- seq. Nature 622, 367-375 (2023).
Claims
1. A method for analyzing multiple gene perturbations in vivo in a tissue of interest; said method comprising the steps:
a. providing a non-human organism;
b. administering a plurality of viral gRNA-delivering nucleic acid expression vectors to the organism, each vector comprising:
i. inverted-terminal repeats (ITRs);
ii. a gRNA promoter;
iii. at least one guide-RNA (gRNA) under control of said gRNA promoter, wherein each vector comprises a different gRNA or gRNA combination; and
iv. a terminator of transcription;
wherein the organism expresses a Cas enzyme, or said vector additionally encodes a Cas enzyme under control of a promoter operable in said cell;
c. isolating a sample of the tissue of interest from the organism;
d. in a collection step, collecting cells or nuclei from the sample;
e. in an analysis step, performing a single-cell or single-nucleus assay comprising an analysis of the gene perturbation and comprising gRNA sequencing of each cell or nucleus, thereby generating a plurality of assay patterns related to expression of a defined gRNA or a defined gRNA combination;
wherein the assay comprises 5’ capture sequencing.
2. The method according to claim 1, wherein the assay of the analysis step comprises a method selected from the group of
- single-cell or single-nucleus RNA sequencing;
- single-cell or single-nucleus DNA sequencing;
- single-cell quantification of surface proteins;
- single-cell or single-nucleus quantification of cytosolic and nuclear proteins;
- single-cell or single-nucleus quantification of histone marks;
- single-cell or single-nucleus mRNA sequencing, and the assay patterns are mRNA expression patterns;
- transposase-accessible chromatin with sequencing (ATAC-seq), and the assay patterns are chromatin accessibility patterns;
- cellular indexing of transcriptomes and epitopes by sequencing (CITE-seq), and the assay patterns are protein patterns, particularly surface protein patterns.
3. The method according to any one of the preceding claims, wherein after the analysis step, a. the assay patterns are clustered by their type of cell or origin, thereby generating an assay profile for each cell type in the tissue of interest; and/or b. the assay patterns are clustered by their type of expressed gRNA, thereby generating an assay profile for each gene perturbed by a gRNA or a gRNA combination.
4. The method according to any one of the preceding claims, wherein each vector of the plurality of gRNA-delivering nucleic acid expression vectors comprises a reporter gene under control of a reporter gene promoter, wherein said reporter gene encodes a reporter protein, wherein said reporter protein enables selective isolation of cells that express said reporter protein in the collection step, particularly wherein in the collection step, cells or nuclei are collected selectively from the tissue of interest that exhibit expression of said reporter protein, more particularly wherein the reporter protein is a fluorescent protein.
5. The method according to any one of the preceding claims, wherein in the collection step, nuclei are collected.
6. The method according to claim 5, wherein the reporter protein comprises a polypeptide part localizing to a nucleus of a cell, particularly wherein
- the reporter protein comprises a polypeptide part interacting with a membrane of a nucleus, particularly wherein the reporter protein comprises a KASH domain, or
- the reporter protein comprises a polypeptide part localizing to a nucleus of a cell, particularly wherein the reporter protein comprises a NLS (nuclear localization sequence) domain.
7. The method according to any one of the preceding claims 1 to 4, wherein in the collection step, whole cells are collected.
8. The method according to any one of the preceding claims, wherein said organism expresses a gene encoding a recombinase enzyme, and wherein activation of expression of said Cas enzyme is mediated via said recombinase enzyme, particularly wherein the recombinase enzyme is a Cre enzyme.
9. The method according to any one of the preceding claims, wherein the gRNA-delivering nucleic acid expression vector is an AAV vector, an adenoviral vector, a rabies vector, a sindbis vector, or a lentiviral vector, more particularly wherein the gRNA-delivering nucleic acid expression vector is an AAV vector.
10. The method according to any one of the preceding claims, wherein the organism is an animal, particularly wherein the organism is a vertebrate, more particularly wherein the organism is a mammal, most particularly a mammal selected from the group of a rodent, a primate, an ungulate, a lagomorph, a carnivore, an insectivore, and a chiroptera.
11. The method according to any one of the preceding claims, wherein the Cas enzyme is Cas9.
12. The method according to any one of the preceding claims, wherein the gene encoding the Cas enzyme
- is introduced into the germline of the organism, or
- is delivered via a vector.
13. The method according to any one of the preceding claims, wherein the gRNA promoter and the reporter gene promoter are two distinct promoters.
14. The method according to any one of the preceding claims, wherein the gRNA promoter is a tissue-specific promoter.
15. The method according to any one of the preceding claims, wherein the gRNA promoter is an inducible or conditional promoter, particularly a promoter selected from the group of a Tet-ON promoter, a Tet-OFF promoter, and a Cre-dependent promoter.
16. Use of a plurality of viral gRNA-delivering nucleic acid expression vectors in a method according to any one of the preceding claims; each vector comprising: i. inverted-terminal repeats (ITRs); ii. a gRNA promoter; iii. at least one guide-RNA (gRNA) under control of said gRNA promoter, wherein each vector comprises a different gRNA or gRNA combination; and iv. a terminator of transcription.
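The grouping recited in claim 3 can be illustrated with a minimal sketch. It assumes each collected cell or nucleus has already been annotated with a cell type, a detected gRNA, and an assay pattern (here a small gene-to-value mapping). The field names `cell_type`, `grna`, and `pattern`, the helper `assay_profiles`, and the example values are hypothetical and not taken from the application, and simple per-group averaging stands in for whatever clustering procedure is actually used.

```python
from collections import defaultdict
from statistics import mean


def assay_profiles(cells, key):
    """Average the per-gene assay values over all cells sharing the same `key` label."""
    groups = defaultdict(list)
    for cell in cells:
        groups[cell[key]].append(cell["pattern"])

    profiles = {}
    for label, patterns in groups.items():
        # Union of all genes seen in this group; a missing gene counts as 0.0.
        genes = sorted({gene for pattern in patterns for gene in pattern})
        profiles[label] = {
            gene: mean(pattern.get(gene, 0.0) for pattern in patterns)
            for gene in genes
        }
    return profiles


# Hypothetical annotated cells (illustrative values only).
cells = [
    {"cell_type": "neuron", "grna": "gRNA_A", "pattern": {"gene1": 2.0, "gene2": 0.0}},
    {"cell_type": "neuron", "grna": "gRNA_B", "pattern": {"gene1": 4.0, "gene2": 2.0}},
    {"cell_type": "astrocyte", "grna": "gRNA_A", "pattern": {"gene1": 0.0, "gene2": 6.0}},
]

by_cell_type = assay_profiles(cells, "cell_type")  # option a. of claim 3
by_grna = assay_profiles(cells, "grna")            # option b. of claim 3
```

Grouping by `cell_type` corresponds to option a. (one profile per cell type in the tissue of interest), and grouping by `grna` corresponds to option b. (one profile per perturbed gene or gRNA combination). Both can be computed from the same annotated cells, which matches the "and/or" wording of the claim.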
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| EP23188905 | 2023-08-01 | ||
| EP23188905.6 | 2023-08-01 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2025027136A1 (en) | 2025-02-06 |
Family
ID=87567281
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/EP2024/071816 (Pending), WO2025027136A1 (en) | 2023-08-01 | 2024-08-01 | Single-cell crispr-screening of multiple gene perturbations in vivo |
Country Status (1)
| Country | Link |
|---|---|
| WO (1) | WO2025027136A1 (en) |
Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2015089462A1 (en) | 2013-12-12 | 2015-06-18 | The Broad Institute Inc. | Delivery, use and therapeutic applications of the crispr-cas systems and compositions for genome editing |
| WO2019113499A1 (en) | 2017-12-07 | 2019-06-13 | The Broad Institute, Inc. | High-throughput methods for identifying gene interactions and networks |
| US20200018746A1 (en) | 2018-03-14 | 2020-01-16 | The Broad Institute, Inc. | Three-Dimensional Human Neural Tissues for CRISPR-Mediated Perturbation of Disease Genes |
| US20210172017A1 (en) | 2019-09-19 | 2021-06-10 | The Broad Institute, Inc. | Methods of in vivo evaluation of gene function |
| US20220304285A1 (en) * | 2018-10-02 | 2022-09-29 | The Board Of Trustees Of The Leland Stanford Junior University | Compositions and methods for multiplexed quantitative analysis of cell lineages |
Non-Patent Citations (58)
| Title |
|---|
| "Current Protocols", 2021, WILEY ONLINE LIBRARY, article "Gene Set Knowledge Discovery with Enrichr - Xie" |
| "Hotelling, H. The Generalization of Student's Ratio", 1992, SPRINGER, article "Breakthroughs in Statistics: Foundations and Basic Theory", pages: 54 - 65 |
| ADAMSON, B. ET AL.: "A Multiplexed Single-Cell CRISPR Screening Platform Enables Systematic Dissection of the Unfolded Protein Response", CELL, vol. 167, 2016, pages 1867 - 1882 |
| ANGERER, P. ET AL., BIOINFORMATICS, vol. 32, 2016, pages 1241 - 1243 |
| ANGERER, P. ET AL.: "destiny: diffusion maps for large-scale single-cell data", R. BIOINFORMATICS, vol. 32, 2016, pages 1241 - 1243 |
| ASOKAN, A.SCHAFFER, D. V.JUDE SAMULSKI, R.: "The AAV Vector Toolkit: Poised at the Clinical Crossroads", MOLECULAR THERAPY, vol. 20, 2012, pages 699 - 708, XP055193366, DOI: 10.1038/mt.2011.287 |
| BARTOSOVIC, M.KABBE, M.CASTELO-BRANCO: "G. Single-cell CUT&Tag profiles histone modifications and transcription factors in complex tissues", NAT BIOTECHNOL, vol. 39, 2021, pages 825 - 835, XP037505677, DOI: 10.1038/s41587-021-00869-9 |
| BOCK, C. ET AL.: "High-content CRISPR screening", NAT. REV. METHODS PRIMER, vol. 21, no. 2, 2022, pages 1 - 23 |
| CLEMENT, K. ET AL.: "CRISPResso2 provides accurate and rapid genome editing sequence analysis", NAT. BIOTECHNOL., vol. 37, 2019, pages 224 - 226, XP036900605, DOI: 10.1038/s41587-019-0032-3 |
| DATLINGER, P. ET AL.: "Pooled CRISPR screening with single-cell transcriptome readout", NATURE METHODS, vol. 14, no. 3, 2017, pages 297 - 301, XP055460183, DOI: 10.1038/nmeth.4177 |
| DIXIT, A. ET AL.: "Perturb-seq: Dissecting molecular circuits with scalable single cell RNA profiling of pooled genetic screens", CELL, vol. 167, 2016, pages 1853 - 1882 |
| FRANGIEH, C. J. ET AL.: "Characterizing the molecular regulation of inhibitory immune checkpoints with multimodal single-cell screens", NATURE GENETICS, vol. 53, no. 3, 2021, pages 322 - 331 |
| FRANGIEH, C. J. ET AL.: "Multimodal pooled Perturb-CITE-seq screens in patient models define mechanisms of cancer immune evasion", NATURE GENETICS, vol. 53, no. 3, 2021, pages 332 - 341 |
| GRIFFIN, J. M. ET AL.: "Astrocyte-selective AAV gene therapy through the endogenous GFAP promoter results in robust transduction in the rat spinal cord following injury", GENE THER, vol. 26, 2019, pages 198 - 210, XP036791130, DOI: 10.1038/s41434-019-0075-6 |
| HAGHVERDI, L.BUETTNER, F.THEIS, F. J.: "Diffusion maps for high-dimensional single-cell analysis of differentiation data", BIOINFORMATICS, vol. 31, 2015, pages 2989 - 2998 |
| HOTELLING, H.: "The Generalization of Student's Ratio. in Breakthroughs in Statistics: Foundations and Basic Theory", 1992, SPRINGER, pages: 54 - 65 |
| HUANG, H.-Y. ET AL.: "miRTarBase update", NUCLEIC ACIDS RES., vol. 50, 2022, pages D222 - D230 |
| JAITIN, D. A. ET AL.: "Dissecting Immune Circuits by Linking CRISPR-Pooled Screens with Single-Cell RNA-Seq", CELL, vol. 167, 2016, pages 1883 - 1896 |
| JIN, X. ET AL.: "In vivo Perturb-Seq reveals neuronal and glial abnormalities associated with autism risk genes", SCIENCE, vol. 370, 2020, pages eaaz6063 |
| JOUNG, J. ET AL., NAT. PROTOC., vol. 124, no. 12, 2017, pages 828 - 863 |
| JOUNG, J. ET AL.: "Genome-scale CRISPR-Cas9 knockout and transcriptional activation screening", NAT., vol. 124, no. 12, 2017, pages 828 - 863 |
| KATZENELENBOGEN, Y. ET AL.: "Coupled scRNA-Seq and Intracellular Protein Activity Reveal an Immunosuppressive Role of TREM2 in Cancer", CELL, vol. 182, 2020, pages 872 - 885 |
| KOBLAN, L. W. ET AL.: "In vivo base editing rescues Hutchinson-Gilford progeria syndrome in mice.", NATURE, vol. 589, 2021, pages 608 - 614, XP037351694, DOI: 10.1038/s41586-020-03086-7 |
| KOROTKEVICH, G. ET AL., FAST GENE SET ENRICHMENT ANALYSIS, 2021, Retrieved from the Internet <URL:https://doi.org/10.1101/060012> |
| LAI, Y. ET AL.: "Efficient in vivo gene expression by trans-splicing adeno-associated viral vectors", NAT BIOTECHNOL, vol. 23, 2005, pages 1435 - 1439 |
| LAN ET AL., MOL CANCER, vol. 21, 2022, pages 71 |
| LICURSI, V.CONTE, F.FISCON, G.PACI, P.: "MIENTURNET: An interactive web tool for microRNA-target enrichment and network-based analysis", BMC BIOINFORMATICS, vol. 20, 2019, pages 1 - 10 |
| MCINNES, L.HEALY, J.MELVILLE, J., UMAP: UNIFORM MANIFOLD APPROXIMATION AND PROJECTION FOR DIMENSION REDUCTION, 2020, Retrieved from the Internet <URL:http://arxiv.org/abs/1802.03426> |
| MEECHAN, D. W. ET AL.: "Gene dosage in the developing and adult brain in a mouse model of 22q11 deletion syndrome", MOLECULAR AND CELLULAR NEUROSCIENCE, vol. 33, 2006, pages 412 - 428, XP024908106, DOI: 10.1016/j.mcn.2006.09.001 |
| MEECHAN, D. W. ET AL.: "Modeling a model: Mouse genetics, 22q11.2 Deletion Syndrome, and disorders of cortical circuit development", PROGRESS IN NEUROBIOLOGY, vol. 130, 2015, pages 1 - 28, XP029175306, DOI: 10.1016/j.pneurobio.2015.03.004 |
| MERSCHER, S. ET AL.: "TBX1 is responsible for cardiovascular defects in velo-cardio-facial/DiGeorge syndrome", CELL, vol. 104, 2001, pages 619 - 629 |
| MILLER, J. A. ET AL.: "Transcriptional landscape of the prenatal human brain", NAT., vol. 5087495, no. 508, 2014, pages 199 - 206 |
| MIMITOU, E. P. ET AL.: "Multiplexed detection of proteins, transcriptomes, clonotypes and CRISPR perturbations in single cells", NAT. METHODS, vol. 165, no. 16, 2019, pages 409 - 412 |
| MINGOZZI, F.HIGH, K. A.: "Therapeutic in vivo gene transfer for genetic disease using AAV: progress and challenges", NAT REV GENET, vol. 12, 2011, pages 341 - 355, XP055155351, DOI: 10.1038/nrg2988 |
| MORGENS, D. W. ET AL.: "Genome-scale measurement of off-target activity using Cas9 toxicity in high-throughput screens", NAT. COMMUN., vol. 81, no. 8, 2017, pages 1 - 8 |
| NORMAN, T. M. ET AL.: "Exploring genetic interaction manifolds constructed from rich single-cell phenotypes", SCIENCE, vol. 365, 2019, pages 786 - 793 |
| PINERO, J. ET AL.: "The DisGeNET knowledge platform for disease genomics", NUCLEIC ACIDS RES., vol. 48, 2019, pages D845 - D855 |
| PINERO, J. ET AL.: "The DisGeNET knowledge platform for disease genomics", NUCLEIC ACIDS RES., vol. 48, 2020, pages D845 - D855 |
| PIZZUTI, A. ET AL.: "UFD1L, a Developmentally Expressed Ubiquitination Gene, is Deleted in CATCH 22 Syndrome", HUMAN MOLECULAR GENETICS, vol. 6, 1997, pages 259 - 265, XP002222092, DOI: 10.1093/hmg/6.2.259 |
| PLATT, R. J. ET AL.: "CRISPR-Cas9 knockin mice for genome editing and cancer modeling", CELL, vol. 159, 2014, pages 440 - 455, XP055523070, DOI: 10.1016/j.cell.2014.09.014 |
| RAMANI BISWARATHAN ET AL: "Scalable, cell type-selective, AAV-based in vivo CRISPR screening in the mouse brain", BIORXIV, 27 June 2023 (2023-06-27), pages 1 - 21, XP093217047, Retrieved from the Internet <URL:https://pmc.ncbi.nlm.nih.gov/articles/PMC10312723/pdf/nihpp-2023.06.13.544831v2.pdf> [retrieved on 20241022], DOI: 10.1101/2023.06.13.544831 * |
| RAN, F. A. ET AL.: "In vivo genome editing using Staphylococcus aureus Cas9", NATURE, vol. 520, 2015, pages 186 - 191, XP055484527, DOI: 10.1038/nature14299 |
| REPLOGLE, J. M. ET AL.: "Combinatorial single-cell CRISPR screens by direct guide RNA capture and targeted sequencing", NATURE BIOTECHNOLOGY, vol. 38, no. 8, 2020, pages 954 - 961, XP037211717, DOI: 10.1038/s41587-020-0470-y |
| ROBINSON, M. D.MCCARTHY, D. J.SMYTH, G. K.: "edgeR: a Bioconductor package for differential expression analysis of digital gene expression data", BIOINFORMA. OXF. ENGL., vol. 26, 2010, pages 139 - 140, XP055750957, DOI: 10.1093/bioinformatics/btp616 |
| SANTINHA, A.J.KLINGLER, E.KUHN, M. ET AL.: "Transcriptional linkage analysis with in vivo AAV-Perturb-seq", NATURE, vol. 622, 2023, pages 367 - 375 |
| SAUNDERS, A. ET AL.: "Molecular Diversity and Specializations among the Cells of the Adult Mouse Brain.", CELL, vol. 174, 2018, pages 1015 - 1030 |
| SKINNIDER, M. A. ET AL.: "Cell type prioritization in single-cell data", NAT. BIOTECHNOL., vol. 391, no. 39, 2020, pages 30 - 34 |
| SKINNIDER, M. A. ET AL.: "Combinatorial single-cell CRISPR screens by direct guide RNA capture and targeted sequencing", NAT. BIOTECHNOL., vol. 388, no. 38, 2020, pages 954 - 961 |
| SQUAIR, J. W. ET AL.: "Confronting false discoveries in single-cell differential expression", NATURE COMMUNICATIONS, vol. 12, no. 1, 2021, pages 1 - 15 |
| STOECKIUS, M. ET AL.: "Simultaneous epitope and transcriptome measurement in single cells", NAT METHODS, vol. 14, 2017, pages 865 - 868, XP055547724, DOI: 10.1038/nmeth.4380 |
| STUART, T. ET AL.: "Comprehensive Integration of Single-Cell Data", CELL, vol. 177, 2019, pages 1888 - 1902 |
| TASCHENBERGER, G., TERESHCHENKO, J. & KÜGLER, S.: "A MicroRNA124 Target Sequence Restores Astrocyte Specificity of gfaABC1D-Driven Transgene Expression in AAV-Mediated Gene Transfer", MOLECULAR THERAPY - NUCLEIC ACIDS, vol. 8, 2017, pages 13 - 25 |
| TIAN, R. ET AL.: "Genome-wide CRISPRi/a screens in human neurons link lysosomal failure to ferroptosis", NAT NEUROSCI, vol. 24, 2021, pages 1020 - 1034, XP037496274, DOI: 10.1038/s41593-021-00862-0 |
| URSU, O. ET AL., NAT. BIOTECHNOL., vol. 40, 2022, pages 896 - 905 |
| URSU, O. ET AL.: "Massively parallel phenotyping of coding variants in cancer with Perturb-seq", NAT., vol. 40, 2022, pages 896 - 905, XP037897839, DOI: 10.1038/s41587-021-01160-7 |
| VANDUSEN NATHAN J. ET AL: "Massively parallel in vivo CRISPR screening identifies RNF20/40 as epigenetic regulators of cardiomyocyte maturation", NATURE COMMUNICATIONS, vol. 12, no. 1, 21 July 2021 (2021-07-21), UK, XP093217000, ISSN: 2041-1723, DOI: 10.1038/s41467-021-24743-z * |
| VENABLES, W. N.RIPLEY, B. D.: "Short Protocols in Molecular Biology", 2002, JOHN WILEY & SONS, INC. |
| YANG, Y. ET AL.: "A dual AAV system enables the Cas9-mediated correction of a metabolic liver disease in newborn mice", NAT BIOTECHNOL, vol. 34, 2016, pages 334 - 338, XP055569763, DOI: 10.1038/nbt.3469 |
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| Santinha et al. | Transcriptional linkage analysis with in vivo AAV-Perturb-seq | |
| Li et al. | Single-cell brain organoid screening identifies developmental defects in autism | |
| Graybuck et al. | Enhancer viruses for combinatorial cell-subclass-specific labeling | |
| Mich et al. | Functional enhancer elements drive subclass-selective expression from mouse to primate neocortex | |
| Werling et al. | An analytical framework for whole-genome sequence association studies and its implications for autism spectrum disorder | |
| Zheng et al. | Massively parallel in vivo Perturb-seq reveals cell-type-specific transcriptional networks in cortical development | |
| Contreras et al. | A genome-wide library of MADM mice for single-cell genetic mosaic analysis | |
| Nardone et al. | Dysregulation of cortical neuron DNA methylation profile in autism spectrum disorder | |
| US20210395821A1 (en) | Methods for determining spatial and temporal gene expression dynamics during adult neurogenesis in single cells | |
| Kratz et al. | Digital expression profiling of the compartmentalized translatome of Purkinje neurons | |
| US20200018746A1 (en) | Three-Dimensional Human Neural Tissues for CRISPR-Mediated Perturbation of Disease Genes | |
| WO2016205745A2 (en) | Cell sorting | |
| US12264367B2 (en) | Methods of in vivo evaluation of gene function | |
| Zhang et al. | txci-ATAC-seq: a massive-scale single-cell technique to profile chromatin accessibility | |
| Werling et al. | Limited contribution of rare, noncoding variation to autism spectrum disorder from sequencing of 2,076 genomes in quartet families | |
| US20240384260A1 (en) | Methods of in situ total rna-based transcriptome profiling for large-scale subcellular structure profiling | |
| Bermúdez-Barrientos et al. | Disentangling sRNA-Seq data to study RNA communication between species | |
| Ross et al. | Modeling neuronal consequences of autism-associated gene regulatory variants with human induced pluripotent stem cells | |
| Licastro et al. | Promiscuity of enhancer, coding and non-coding transcription functions in ultraconserved elements | |
| Graybuck et al. | Enhancer viruses and a transgenic platform for combinatorial cell subclass-specific labeling | |
| WO2025027136A1 (en) | Single-cell crispr-screening of multiple gene perturbations in vivo | |
| WO2016149684A2 (en) | Haplotype based generalizable allele specific silencing for therapy of cardiovascular disease | |
| Zheng et al. | Massively parallel in vivo Perturb-seq screening | |
| US20210254053A1 (en) | Methods and uses of high-throughput inference of synaptic connectivity relationships among cell types | |
| WO2025102397A1 (en) | Single-cell exon sequencing method and use thereof |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 24752019; Country of ref document: EP; Kind code of ref document: A1 |