WO2025166152A1 - Multiplexed perturbation and decoding in pooled genetic screens - Google Patents

Multiplexed perturbation and decoding in pooled genetic screens

Info

Publication number
WO2025166152A1
WO2025166152A1 PCT/US2025/014013 US2025014013W WO2025166152A1 WO 2025166152 A1 WO2025166152 A1 WO 2025166152A1 US 2025014013 W US2025014013 W US 2025014013W WO 2025166152 A1 WO2025166152 A1 WO 2025166152A1
Authority
WO
WIPO (PCT)
Prior art keywords
viral vector
promoter
sequence
crispr
guide
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
PCT/US2025/014013
Other languages
French (fr)
Inventor
Paul BLAINEY
Yue QIN
Russell WALTON
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Massachusetts Institute of Technology
Broad Institute Inc
Original Assignee
Massachusetts Institute of Technology
Broad Institute Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Massachusetts Institute of Technology and Broad Institute Inc
Publication of WO2025166152A1
Legal status: Pending

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/10 Processes for the isolation, preparation or purification of DNA or RNA
    • C12N15/1034 Isolating an individual clone by screening libraries
    • C12N15/1065 Preparation or screening of tagged libraries, e.g. tagged microorganisms by STM-mutagenesis, tagged polynucleotides, gene tags
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6806 Preparing nucleic acids for analysis, e.g. for polymerase chain reaction [PCR] assay

Definitions

  • the subject matter disclosed herein is generally directed to lentiviral vectors for use in multiplexing genetic perturbations and decoding the multiplexed genetic perturbations in pooled single cell assays.
  • Combinatorial screens seek to identify genetic interactions, or phenotypes that arise from an interaction of genetic components. Most pooled screens, including combinatorial screens, achieve perturbation of genetic components through lentiviral delivery of Cas9 single guide RNA (sgRNA). Lentiviral delivery systems are near-ubiquitous across in vitro pooled screens conducted today for their ability to efficiently deliver perturbations to a wide range of cell types and retain a record of the perturbation identity as the lentiviral genome integrated in the host-cell genome. Combinatorial screens can leverage either standard single-perturbation or multi-perturbation (multiplexing) lentiviral vectors.
  • sgRNA Cas9 single guide RNA
  • single-perturbation vectors enable only random sampling in pooled screens, via either high multiplicity of infection (with limited control of multiplicity) or serial transduction and selection.
  • random sampling is limited by an immense combinatorial space, which scales as (n choose m), for n targets and combinations of m perturbations per cell.
  • pairwise interactions of only 200 gene targets result in 19,900 unique combinations, roughly equivalent in scale to a standard genome-wide screen.
  • multiplexing vectors enable the delivery of multiple perturbations on a single vector.
  • multiplexing vectors allow the perturbation of a biologically informed subset of target combinations. In such screens, the size of multiplexed vector libraries scales linearly with the number of selected target combinations.
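The scaling argument in the preceding passages can be checked numerically. The following is a minimal Python sketch, not part of the disclosure: the function names are illustrative, and the gene names in the usage note are hypothetical placeholders. It computes the (n choose m) size of the random-pairing space and contrasts it with a programmed library, whose size is simply the number of selected combinations.

```python
from math import comb
from typing import Iterable, Tuple


def random_pairing_space(n_targets: int, m_per_cell: int) -> int:
    """Number of unordered combinations of m perturbations drawn from n targets."""
    return comb(n_targets, m_per_cell)


def programmed_library_size(selected: Iterable[Tuple[str, ...]]) -> int:
    """One vector design per selected combination; target order is ignored here."""
    return len({tuple(sorted(combo)) for combo in selected})


print(random_pairing_space(200, 2))  # 19900, the pairwise figure quoted in the text
```

For instance, `programmed_library_size([("GENE_A", "GENE_B"), ("GENE_B", "GENE_A"), ("GENE_A", "GENE_C")])` counts two designs, because the sketch treats the first two entries as the same unordered pair.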
  • Multiplexed perturbation lentiviral vectors additionally offer several advantages over single-plex solutions for single-target screens.
  • the use of multiple guides targeting the same gene has been shown to improve on-target activity in both knockout and interference screens. As both guides must target the same gene, these approaches require programmed spacer combinations, not random pairings.
  • the selection of active and specific guides remains challenging and pairing guides can increase on-target performance. Pairing guides also combines the off-target risk of multiple guides, necessitating the use of multiple vectors with distinct guides for a given target.
  • the disclosure provides a viral vector for multiplexed perturbation screens, including: a cassette operably connected to a pol II promoter, the cassette comprising at least two guide molecules; a cleavable sequence; a pol III promoter; and a 3' long terminal repeat (LTR), where the at least two guide molecules are separated by the cleavable sequence, the at least two guide molecules, the cleavable sequence, and the pol III promoter are located within the 3' LTR and encoded on the minus strand, and the at least two guide molecules are transcribed in a single transcript, and where the pol II promoter is positioned upstream of the 3’ LTR.
  • the cleavable sequence is a tRNA leader sequence.
  • the cassette comprises one or more additional cleavable sequences.
  • the at least two guide molecules are present in an array cleavable by a CRISPR-Cas polypeptide.
  • the CRISPR-Cas polypeptide is Cas12.
  • the CRISPR-Cas polypeptide is Cas13.
  • the cassette further comprises an internal barcode (iBAR) and spacer within each of the at least two guide molecules, wherein each iBAR is unique to each of the at least two guide molecules.
  • iBAR internal barcode
  • the iBAR is within a loop joining a crRNA and a tracrRNA of each guide molecule.
  • the cassette further comprises an exogenous promoter.
  • the exogenous promoter of the cassette is upstream of the at least two guide molecules, the cleavable sequence, and the pol III promoter.
  • the orientation of the exogenous promoter is antisense relative to the Pol III promoter.
  • the exogenous promoter is a T7 promoter, a T3 promoter, or a SP6 promoter.
  • the viral vector comprises a 5’ LTR promoter.
  • the 5’ LTR promoter is a CMV promoter.
  • orthogonal tRNA sequences are used in the viral vector.
  • the viral vector further includes orthogonal guide molecule scaffolds.
  • the at least two guide molecules target at least two or more different sequences of a gene.
  • the viral vector further encodes a CRISPR-Cas polypeptide.
  • the CRISPR-Cas polypeptide is a Cas9.
  • the CRISPR-Cas polypeptide is a Cas12.
  • the CRISPR-Cas polypeptide is a Cas13.
  • the viral vector is a lentiviral vector.
  • the disclosure provides a method of performing multiplexed pooled perturbation screening, including the steps of: a) introducing one or more perturbation vectors according to any one of claims 1 to 22 to a population of cells to obtain perturbed cells; and b) performing single cell RNA sequencing on the perturbed cells, whereby pol II transcripts comprising the at least two guide molecules encoded for in each vector are sequenced with cellular RNAs from the perturbed cells.
  • the disclosure provides a method of performing multiplexed pooled optical perturbation screening, including the steps of: a) introducing one or more perturbation vectors according to any one of claims 1 to 22 to a population of cells to obtain perturbed cells; b) fixing and permeabilizing the perturbed cells; c) reverse transcribing mRNA transcribed from the pol II promoter encoded within the viral vector in the perturbed cells using primers specific for each of the at least two guide molecules, thereby generating cDNA comprising at least the spacer sequence and iBAR of each of the at least two guide molecules; d) fixing the cDNA in the perturbed cells; and e) in situ sequencing (ISS) the sequences comprising the spacer and/or iBAR of each of the at least two guide molecules.
  • ISS in situ sequencing
  • the disclosure provides a method of performing multiplexed pooled optical perturbation screening, including the steps of: a) introducing one or more viral vector(s) according to any one of claims 9-12 to a population of cells to obtain perturbed cells; b) fixing and permeabilizing the perturbed cells, transcribing mRNA of the viral vector(s) in the perturbed cells from the exogenous promoter of the cassette(s) of the viral vector(s), thereby generating barcoded RNA; c) reverse transcribing mRNA transcribed from the pol II promoter encoded within the viral vector in the perturbed cells using primers specific for each of the at least two guide molecules, thereby generating cDNA comprising at least the spacer sequence and iBAR of each of the at least two guide molecules; d) fixing the cDNA in the perturbed cells; and e) in situ sequencing (ISS) the sequences comprising the spacer and/or iBAR of each of the at least two guide molecules.
  • the fixing in step (b) is performed using about 0.007% glutaraldehyde in about 4% paraformaldehyde.
  • the primers are biotinylated and wherein the method further comprises a streptavidin incubation between the reverse transcription and fixing step.
  • the in situ sequencing comprises: a) contacting the perturbed cells with padlock probes that flank the spacer and/or iBAR of each of the at least two guide molecules; b) gap filling the padlock probes to capture each spacer and iBAR within the padlock probe; c) ligating the padlock probes into circular ssDNA templates; d) performing rolling circle amplification on the circular ssDNA templates to generate amplified cDNA; and e) sequencing the amplified cDNA using sequencing by synthesis to decode perturbations.
  • the in situ sequencing step comprises in situ sequencing of a combination of the at least two spacers and at least two iBARs simultaneously.
  • the methods further include treating the population of cells with a lithium borohydride (LiBH4) solution.
  • LiBH4 lithium borohydride
  • the LiBH4 treatment step occurs prior to in situ sequencing.
  • the LiBH4 treatment step occurs subsequent to in situ sequencing.
  • the one or more additional cleavable sequences are tRNA leader sequences.
  • the one or more additional cleavable sequences are positioned upstream of the at least two guide molecules.
  • the present disclosure provides a viral vector for multiplexed perturbation screens, said viral vector including: a cassette including two or more guide molecules encoded for in a single transcript within a 3’ long terminal repeat (LTR) of a viral genome, wherein the two or more guide molecules are separated by a cleavable sequence, wherein the two or more guide molecules are operably linked to a pol III promoter encoded for within the 3’ LTR; wherein the cassette is operably linked to a pol II promoter encoded upstream of the 3’ LTR; and wherein the orientation of the cassette is in an antisense orientation relative to the lentiviral genome.
  • LTR long terminal repeat
  • the cleavable sequence of the cassette is a tRNA leader sequence, wherein the two or more guide molecules are separated by a tRNA leader sequence.
  • each guide molecule of the cassette comprises a tRNA leader sequence.
  • the two or more guide molecules are present in an array cleavable by a CRISPR-Cas polypeptide.
  • the CRISPR-Cas polypeptide is Cas12.
  • the CRISPR-Cas polypeptide is Cas13.
  • the cassette of the viral vector further includes an internal barcode (iBAR) within each guide molecule, wherein each iBAR is unique to each guide molecule.
  • the iBAR is within a loop joining a spacer and scaffold of each guide molecule. In some embodiments, the iBAR is within a loop joining a crRNA and a tracrRNA of each guide molecule.
  • the cassette further includes an exogenous promoter.
  • the exogenous promoter is upstream of the two or more guide molecules, the cleavable sequence, and the pol III promoter.
  • the orientation of the exogenous promoter is sense relative to the lentiviral genome. In further embodiments, the orientation of the exogenous promoter is antisense relative to the Pol III promoter.
  • the exogenous promoter is a T7 promoter, a T3 promoter, or a SP6 promoter.
  • the viral vector includes a 5’ LTR promoter.
  • the 5’ LTR promoter of the viral vector is a CMV promoter.
  • orthogonal tRNA sequences are used in the viral vector.
  • orthogonal guide molecule scaffolds are used in the viral vector.
  • the two or more guide molecules target a different sequence of a same gene.
  • the viral vector encodes for a CRISPR-Cas polypeptide.
  • the CRISPR-Cas polypeptide is a Cas9.
  • the CRISPR-Cas polypeptide is a Cas12. In certain embodiments, the CRISPR-Cas polypeptide is a Cas13. In certain embodiments, the viral vector is a lentiviral vector.
  • In another aspect, the present disclosure provides a method of performing multiplexed pooled perturbation screening involving introducing one or more perturbation vectors according to any embodiment herein to a population of cells to obtain perturbed cells, optionally wherein the population of cells expresses a CRISPR-Cas polypeptide; and performing single cell RNA sequencing on the perturbed cells, whereby pol II transcripts comprising the two or more guide molecules encoded for in each vector are sequenced with cellular RNAs from the perturbed cells.
  • the present disclosure provides a method of performing multiplexed pooled optical perturbation screening involving introducing one or more perturbation vectors according to any embodiment herein to a population of cells to obtain perturbed cells, optionally the population of cells expresses a CRISPR-Cas polypeptide; fixing and permeabilizing the perturbed cells; reverse transcribing mRNA transcribed from the pol II promoter encoded within the viral vector in the perturbed cells using primers specific for each of the two guide molecules, thereby generating cDNA comprising at least the spacer sequence and iBAR of each guide molecule; fixing the cDNA in the perturbed cells; and in situ sequencing (ISS) the sequences of each guide molecule spacer and/or iBAR.
  • ISS in situ sequencing
  • Another aspect of the disclosure provides a method for performing multiplexed pooled optical perturbation screening involving: introducing one or more viral vector(s) according to the disclosure to a population of cells to obtain perturbed cells, optionally wherein the population of cells expresses a CRISPR-Cas polypeptide; fixing and permeabilizing the perturbed cells; transcribing mRNA of the viral vector(s) in the perturbed cells from the exogenous promoter of the viral vector(s), thereby generating barcoded RNA; reverse transcribing mRNA transcribed from the pol II promoter encoded within the viral vector in the perturbed cells using primers specific for each of the two guide molecules, thereby generating cDNA comprising at least the spacer sequence and iBAR of each guide molecule; fixing the cDNA in the perturbed cells; and in situ sequencing (ISS) the sequences of each guide molecule spacer and/or iBAR, thereby performing multiplexed pooled optical perturbation screening.
  • the fixing is performed via use of about 0.007% glutaraldehyde in about 4% paraformaldehyde.
  • the primers for reverse transcription are biotinylated and the method further involves a streptavidin incubation between reverse transcription and fixing the cDNA in the perturbed cells.
  • in situ sequencing involves: contacting the perturbed cells with padlock probes flanking each sgRNA spacer and/or iBAR; gap filling the padlock probes to capture each spacer and iBAR within the padlock probe; ligating the padlock probes into circular ssDNA templates; performing rolling circle amplification on the circular ssDNA templates; and sequencing the amplified cDNA using sequencing by synthesis to decode perturbations.
  • the in situ sequencing involves sequencing a combination of the two spacers and two iBARs simultaneously.
  • the term “about” is understood as within a range of normal tolerance in the art, for example within 2 standard deviations of the mean. About can be understood as within 50%, 45%, 40%, 35%, 30%, 25%, 20%, 15%, 10%, 9%, 8%, 7%, 6%, 5%, 4%, 3%, 2%, 1%, 0.5%, 0.1%, 0.05%, or 0.01% of the stated value. Unless otherwise clear from context, all numerical values provided herein are modified by the term about.
  • control is meant a standard or reference condition.
  • fragment is meant a portion of a polypeptide or nucleic acid molecule. This portion contains, preferably, at least 5%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, or 90% of the entire length of the reference nucleic acid molecule or polypeptide.
  • a fragment may contain 5, 10, 20, 30, 40, 50, 60, 70, 80, 90, or 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000 or more nucleotides or amino acids.
  • a subject refers to an animal, or tissues or cells thereof, which is the object of treatment, observation, or experiment.
  • a subject includes, but is not limited to, a mammal, including, but not limited to, a human or a nonhuman mammal, such as a non-human primate, bovine, equine, canine, ovine, or feline.
  • the terms “individual,” “patient” or “subject” are used interchangeably herein. Mammals may also include, but are not limited to, murines, simians, humans, farm animals, sport animals, and pets. Tissues, cells and their progeny of a subject obtained in vivo or cultured in vitro are also encompassed.
  • Ranges provided herein are understood to be shorthand for all of the values within the range.
  • a range of 1 to 50 is understood to include any number, combination of numbers, or sub-range from the group consisting of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, or 50, as well as all intervening decimal values between the aforementioned integers such as, for example, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, and 1.9.
  • nested sub-ranges that extend from either end point of the range are specifically contemplated.
  • a nested sub-range of an exemplary range of 1 to 50 may comprise 1 to 10, 1 to 20, 1 to 30, and 1 to 40 in one direction, or 50 to 40, 50 to 30, 50 to 20, and 50 to 10 in the other direction.
  • the recitation of numerical ranges by endpoints includes all numbers and fractions subsumed within the respective ranges, as well as the recited endpoints.
  • a “biological sample” may contain whole cells and/or live cells and/or cell debris.
  • the biological sample may contain (or be derived from) a “bodily fluid”.
  • the bodily fluid is selected from amniotic fluid, aqueous humor, vitreous humor, bile, blood serum, breast milk, cerebrospinal fluid, cerumen (earwax), chyle, chyme, endolymph, perilymph, exudates, feces, female ejaculate, gastric acid, gastric juice, lymph, mucus (including nasal drainage and phlegm), pericardial fluid, peritoneal fluid, pleural fluid, pus, rheum, saliva, sebum (skin oil), semen, sputum, synovial fluid, sweat, tears, urine, vaginal secretion, vomit and mixtures of one or more thereof.
  • Biological samples include cell cultures, bodily fluids, cell cultures from bodily fluid
  • FIG. 1A shows a schematic of LentiGuide-BC and similar approaches which use a secondary barcode distant from the sgRNA.
  • FIG. 1B shows CROPseq encodes the sgRNA within the lentiviral 3’ long terminal repeat (LTR).
  • FIG. 1C shows CROPseq-multi encodes two sgRNAs with internal barcodes (iBARs), multiplexed using tRNAs, and encoded in the lentiviral 3’ LTR.
  • FIG. 1D shows iBARs place secondary barcodes within the loop joining the crRNA and tracrRNA into the sgRNA.
  • FIG. 1E shows lentiviral titers with CROPseq-multi relative to CROPseq.
  • FIG. 2A shows lentiviral barcode swapping rates as a function of the distance separating barcode elements (spacer-spacer, spacer-secondary barcode, etc.). Data points are from published barcoding systems and CROPseq-multi (see also, Table 1).
  • FIG. 2B shows assaying lentiviral recombination with pooled lentiviral production.
  • FIG. 2C shows observed lentiviral recombination of CROPseq-multi barcode elements. (T-test: *p < 0.05, **p < 0.001, ***p < 0.0001)
  • FIG. 2D shows reducing recombination frequencies with orthogonal tRNAs.
  • FIG. 3A provides an overview of the in situ sequencing workflow.
  • FIG. 3B shows optimizing the in situ detection protocol to improve the detection efficiency of CROPseq-multi.
  • FIG. 3C shows detection efficiency of CROPseq-multi iBARs with individual or multiplexed detection.
  • FIG. 3D shows a representative image of in situ sequencing reads with multiplexed detection.
  • FIG. 3E shows quantifying recombination in a 3-vector pool with ISS and NGS.
  • FIG. 3F shows sequencing cycles necessary to uniquely identify library members with different decoding methods. For decoding via the spacer (mean and standard deviation shown), libraries were simulated by randomly sampling guides from the Dolcetto genome-wide CRISPRi library.
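The quantity described for FIG. 3F, the number of sequencing cycles needed to uniquely identify each library member, can be illustrated with a short Python sketch. This is an illustration under stated assumptions, not the analysis used in the disclosure: it assumes one base is read per cycle from the start of each barcode, that every read is error-free, and, for the paired case, that reads from the two barcodes of a vector can be told apart. The function names and the toy sequences are hypothetical.

```python
from typing import Optional, Sequence, Tuple


def min_cycles_to_unique(barcodes: Sequence[str]) -> Optional[int]:
    """Smallest number of cycles after which all barcode prefixes are distinct.

    Returns None if the barcodes never become unique within their length.
    """
    if not barcodes:
        return 0
    max_cycles = min(len(b) for b in barcodes)
    for k in range(1, max_cycles + 1):
        if len({b[:k] for b in barcodes}) == len(barcodes):
            return k
    return None


def min_cycles_paired(pairs: Sequence[Tuple[str, str]]) -> Optional[int]:
    """Same idea when two barcodes per vector are read in the same cycles.

    After k cycles the decoding key is the pair of k-base prefixes.
    """
    if not pairs:
        return 0
    max_cycles = min(min(len(a), len(b)) for a, b in pairs)
    for k in range(1, max_cycles + 1):
        if len({(a[:k], b[:k]) for a, b in pairs}) == len(pairs):
            return k
    return None


# Hypothetical toy libraries, not sequences from the disclosure.
toy_single = ["ACGT", "ACGA", "TCGA"]
toy_pairs = [("AC", "GT"), ("AC", "TT"), ("TC", "GT")]
print(min_cycles_to_unique(toy_single))  # 4
print(min_cycles_paired(toy_pairs))      # 1
```

In the toy data, reading two barcodes per cycle resolves the library in fewer cycles because each cycle contributes two bases of information per vector, which is the intuition behind comparing decoding methods by cycle count.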
  • FIG. 4A shows a schematic representation of the pLentiGuide vector.
  • FIG. 4B shows a schematic representation of derivative multiplexing systems.
  • FIG. 4C shows processing of the lentiviral RNA genome into double stranded DNA for genome integration to illustrate steps vulnerable to recombination. Illustration inspired by Adamson et al. 22
  • FIG. 4D shows a schematic representation of a LentiGuideBC vector for pairing guide RNAs with mRNA barcodes.
  • FIG. 4E shows a schematic representation of a CROPseq vector for pairing guide RNAs with mRNA barcodes.
  • LTR long terminal repeat
  • pbs primer binding site
  • PPT polypurine tract
  • cPPT central polypurine tract.
  • FIG. 5A shows a detailed illustration of the CROPseq-multi 3’LTR design.
  • the top line of the figure shows the orientation of the viral vector relative to the lentiviral genome, from right to left in 5'-to-3' orientation. Underneath, the middle section of the figure shows the reverse complement orientation. Beneath this, at the bottom of the figure, the bounded box blow-out shows greater detail of the 3' LTR cassette (located between flanking 3' LTR regions; the 3' LTR cassette is also indicated by hashed lines).
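Because the cassette is encoded on the minus strand, reading its elements in their own sense orientation means taking the reverse complement of the genome-sense sequence, as the middle section of FIG. 5A depicts. A minimal Python sketch of that operation follows; the helper name and the example string are illustrative and are not sequences from the disclosure.

```python
# Translation table for complementing DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


print(reverse_complement("ATGC"))  # GCAT
```

Applying the function twice returns the original sequence, which is a convenient sanity check when switching between genome-sense and cassette-sense views.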
  • FIG. 5B shows lentiviral titers of CROPseq variants relative to CROPseq.
  • FIG. 5C shows a sequence alignment of orthogonal tRNAs tested in CROPseq-multi.
  • FIG. 5D shows a sequence alignment of orthogonal sgRNA scaffolds used in CROPseq-multi.
  • FIG. 6A shows an illustration of oligo reagent design for in situ detection of CROPseq-multi barcodes.
  • FIG. 6B shows a representative image showing the identification of iBAR 1 reads in multiplexed detection, together with DAPI-stained nuclei and the four sequencing bases in separate fluorescent channels.
  • FIG. 6C shows a precision-recall curve for assignment of individual reads to either iBAR 1 or iBAR 2 based on signal from the iBAR 1-specific probe.
  • FIG. 7A shows CROPseq-multi-T7 encodes two sgRNAs with internal barcodes (iBARs), multiplexed using tRNAs, and encoded in the lentiviral 3' LTR. Like CROPseq-multi and CROPseq, the 3' LTR is duplicated during lentiviral integration, producing a second copy of the sgRNAs.
  • the CROPseq-multi-T7 further encodes a T7 promoter, which, without wishing to be bound by theory, enables use of in vitro transcription to generate barcoded RNAs independent of endogenous transcription activity.
  • FIG. 7B provides an overview of the in situ sequencing workflow.
  • FIG. 8 shows a sequence alignment of CROPseq-multi (top) and CROPseq-multi-T7 (bottom).
  • FIG. 9A shows genome editing activity of CROPseq-multi-T7 vectors in SpCas9 nuclease-expressing cells, quantified by next generation sequencing.
  • the table underneath the x-axis indicates the respective target for sgRNA 1 and sgRNA 2; i.e., the first and fourth columns show negative control results for respective cassettes having non-targeting (scrambled) guide RNAs joined by a tRNA for glutamine (Q), with the fourth column cassette further including a T7 promoter - no editing was observed; the second and fifth columns show results for cassettes having guide RNAs targeting AAVS1 as sgRNA 1 and targeting HPRT1 as sgRNA 2, with the guide RNAs joined by a tRNA for alanine (A), where the fifth column cassette further included a T7 promoter - editing of both AAVS1 and HPRT1 was consistently observed at about 40-60% levels for each tested cassette; the third and sixth columns show results for cassettes having guide RNA
  • FIG. 9B shows a comparison of mRNA detection efficiency of CROPseq and CROPseq-T7 vectors (left two violin plots) and CROPseq-multi and CROPseq-multi-T7 vectors (right two violin plots).
  • FIG. 9C shows representative images of in situ sequencing reads with multiplexed detection of CROPseq-T7 and CROPseq-multi-T7 vectors using two protocols.
  • PerturbView uses PFA fixation, ethanol permeabilization, and a reverse crosslinking step (top row) whereas NIS-seq uses methanol and acetic acid for fixation and permeabilization (bottom row).
  • FIG. 9D shows representative images of in situ sequencing reads with multiplexed detection, comparing CROPseq-T7 and CROPseq-multi-T7.
  • FIG. 10 shows detection efficiency via in vitro transcription and in situ sequencing of the CROPseq-multi-T7 vector.
  • FIG. 11 shows recombination detection of the CROPseq-multi-T7 vector using a simultaneous sequencing approach to reading barcode sequences using in vitro transcription. A "mixed basecall score" was used to separate recombined reads from correct pairs.
  • FIG. 12 shows the effect of LiBH4 on fluorescent signal in RPE1-hTERT cells.
  • FIG. 12 demonstrates that LiBH4 treatment is effective in removing fluorescent signals from in situ sequencing reagents and is also compatible with subsequent sequencing.
  • FIG. 12, left panel, shows cycle 1 signal from in situ sequencing (ISS) of RPE1-hTERT cells before LiBH4 treatment (i.e., pre-treatment).
  • FIG. 12, middle panel, shows the same cells after treatment with 1 mg/mL LiBH4 solution for 30 minutes at room temperature (i.e., post-treatment).
  • FIG. 12, right panel, shows the same cells after cleavage and incorporation in a subsequent round of sequencing (e.g., cycle 2); the produced signal readout indicates that the cells were compatible with subsequent sequencing reactions.
  • Grey, BFP signal showing cell body; Green, nucleotide G; Red, nucleotide T; Magenta, nucleotide A; Cyan, nucleotide C.
  • CROPseq-multi provides an excellent multiplexing solution for single-target and combinatorial Cas9-based CRISPR screens that includes robust guide activity, low lentiviral recombination, and compatibility with mRNA-barcoding.
  • the techniques herein demonstrate that CROPseq-multi enables superior detection and improved decoding efficiency for optical pooled screens, and readily enables combinatorial screens.
  • CROPseq-multi is a versatile multiplexing platform for diverse CRISPR screening methodologies.
  • the techniques herein address existing challenges in multiplexing perturbations while maintaining design compatibility across enrichment, single-cell sequencing, and optical pooled screens.
  • This versatility provides the opportunity for direct comparison and integration of different screening modalities, for example scRNA-seq and imaging-based approaches.
  • CROPseq-multi enables single-target screens with smaller libraries and improved performance.
  • the compatibility of CROPseq-multi with high-content screening techniques enables new directions and methodologies for combinatorial screens.
  • combinatorial optical pooled screens may be a powerful approach to interrogate genetic interactions at high throughput and with rich, single-cell-resolved phenotypic measurements.
  • Embodiments disclosed herein provide lentiviral vectors for use in multiplexed pooled genetic perturbation assays.
  • Forward genetic screens seek to dissect complex biological systems by systematically perturbing genetic elements and observing the resulting phenotypes.
  • In combinatorial screens, genetic perturbations are multiplexed within individual cells to reveal genetic interactions, or phenotypes that result from combinations of perturbations.
  • the rich phenotypic readout and high cellular throughput of optical pooled screens make the approach an attractive strategy to study genetic interactions; however, current multiplexing approaches are incompatible with this screening method.
  • Disclosed herein is a lentiviral system, termed CROPseq-multi, able to multiplex Cas-based perturbations with mRNA-embedded barcodes.
  • CROPseq-multi has equivalent per-guide activity to CROPseq, minimal positional bias, and low lentiviral recombination rates.
  • An optimized and multiplexed in situ detection protocol improves detection efficiency 10-fold and increases decoding efficiency 3-fold relative to CROPseq.
  • CROPseq-multi is a general multiplexing solution for Cas-based genetic screening approaches, including optical pooled screens.
  • the CROPseq and CROPseq-multi vectors of the disclosure further include a promoter useful for in vitro transcription detection.
  • a promoter useful for in vitro transcription does not: (i) hinder guide activity; (ii) rely on endogenous expression of the mRNA barcode (i.e., cell line agnostic); or (iii) depend on transcriptional silencing.
  • Embodiments disclosed herein provide an enhanced (brighter) signal, which enables detection in models with high background (e.g., tissue samples), supports lower-magnification imaging for higher throughput, and is compatible with sequencing chemistries that use two colors rather than four, as in certain newer NGS kits.
  • a further promoter (e.g., a T7 promoter)
  • T7 promoter also promotes complex phenotypic measurements, as mRNA is unstable and requires immediate conversion to a DNA form (e.g., cDNA) in most assessment protocols, whereas in the protocols disclosed herein, manipulation, fixing and even multiple rounds of immunofluorescence can be performed, prior to in vitro transcription, where the RNA is regenerated.
  • any given expressing cell has multiple copies, so if there is a failure to detect a copy of an mRNA, it tends not to raise an issue, meaning that per molecule detection rate for mRNA tends to be less important than for detection of, e.g., genomic DNA.
  • When using a promoter useful for in vitro transcription, numerous RNAs can be generated from a single genomic integration, whereas a single copy of a barcode within a genome can mean that if there is a failure to in vitro transcribe that one copy, there will be no detection.
  • An additional advantage of positioning the promoter useful for in vitro transcription in the 3' LTR is that LTR duplication events double the number of barcode copies in a cell that are potentially capable of being detected, lowering the false negative failure rate.
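  • The copy-number argument above can be illustrated with a simple probability model. The sketch below assumes that each barcode copy is detected independently and with the same per-copy probability; that independence assumption, the function name `false_negative_rate`, and the example value of 0.6 are all hypothetical and are not taken from this disclosure.

```python
def false_negative_rate(per_copy_detection: float, copies: int) -> float:
    """Probability that none of `copies` barcode copies is detected.

    Assumes every copy is detected independently with the same probability.
    """
    if not 0.0 <= per_copy_detection <= 1.0:
        raise ValueError("per_copy_detection must be between 0 and 1")
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return (1.0 - per_copy_detection) ** copies


# Illustrative numbers only (not measured values).
single = false_negative_rate(0.6, 1)   # one detectable barcode copy
doubled = false_negative_rate(0.6, 2)  # barcode present in both LTR copies
print(f"one copy: {single:.2f}, two copies: {doubled:.2f}")
```

  • Under this simplified model, doubling the number of detectable copies squares the false-negative rate, which is the intuition behind placing the promoter in the 3' LTR.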
  • the promoter useful for in vitro transcription is a T7 promoter.
  • embodiments disclosed herein provide single vectors capable of introducing two or more perturbations to single cells from the single vector.
  • the vector encodes at least two guide molecules.
  • the vector encodes two guide molecules.
  • the guide molecules may be encoded in a single transcript within a 3’ long terminal repeat (LTR) of a viral genome.
  • Each guide molecule may be separated by a tRNA leader sequence.
  • the transcript encoding the guide molecule may comprise a barcode that identifies the sequence targeted (i.e., the sequences to be modified or perturbed) by each guide molecule.
  • This barcode sequence may be an internal barcode (iBAR) located within a loop of the scaffold portion of each guide molecule.
  • embodiments disclosed herein provide single vectors comprising a promoter that enables in vitro transcription detection. In some aspects, embodiments of the disclosure provide for in vitro transcription detection of RNA comprising an internal barcode. In some aspects, the vectors and methods disclosed herein are employed in decoding in pooled genetic screens.
  • the viral vector may be any viral vector suitable for use in multiplex perturbation screens.
  • the vector is a lentivirus vector comprising at least two guide molecules (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10 or more guide molecules, 2-3 guide molecules, 2-4 guide molecules, 2-5 guide molecules, 2-6 guide molecules, 2-7 guide molecules, 2-8 guide molecules, and the like) encoded for in a single transcript within the 3’ long terminal repeat (LTR) of the lentiviral genome, wherein the guide molecules are separated by a tRNA leader sequence, and wherein the two guide molecules are operably linked to a pol III promoter encoded for within the 3’ LTR and a pol II promoter encoded upstream of the 3’ LTR.
  • the guide molecules and pol III promoter are encoded in the 3’ LTR of the lentivirus vector.
  • the 3’ LTR is duplicated to the 5’ end of the lentiviral genome during transduction.
  • the guide molecules are transcribed as a single pol III transcript from the 5’ end of the integrated lentiviral genome.
  • the guide molecules are separated from the pol III transcript by cleavage of the tRNA sequences by endogenous RNases.
  • leader tRNA sequence refers to a tRNA sequence 5’ of the guide molecule sequence.
  • Transfer RNA (abbreviated tRNA) is a small RNA molecule that plays a key role in protein synthesis.
  • Transfer RNA serves as a link (or adaptor) between the messenger RNA (mRNA) molecule and the growing chain of amino acids that make up a protein.
  • the vector includes a selection marker outside of the 3’ LTR.
  • the selection marker is under control of the pol II promoter for being expressed as a poly(A) tailed mRNA with the guide molecule sequences.
  • the mRNA can include the sequences encoded for in the 3’ LTR.
  • the orientation of the leader tRNA sequences is reversed, such that the mRNA is not cleaved.
  • the leader tRNA sequences and guide molecule scaffolds are orthogonal to prevent recombination.
  • the term “orthogonal” refers to the inability of two or more biomolecules, similar in composition and/or function, to interact with one another or affect their respective substrates.
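  • One way to screen candidate scaffold or tRNA pairs for recombination risk is to measure how much contiguous sequence they share. The sketch below is a crude proxy only: it assumes that the longest run of identical bases between two elements is a reasonable indicator of homology, which is an illustrative assumption and not a criterion stated in this disclosure. The function name `longest_shared_run` and the toy sequences are hypothetical.

```python
def longest_shared_run(a: str, b: str) -> int:
    """Length of the longest contiguous stretch present in both sequences."""
    a, b = a.upper(), b.upper()
    best = 0
    # prev[j] holds the length of the shared run ending at a[i-2], b[j-1].
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


# Toy sequences for illustration; real scaffolds would be much longer.
print(longest_shared_run("AAAACCCC", "GGGGCCCC"))
```

  • A pair of elements with a short longest shared run has less contiguous homology available for template switching; any acceptance threshold would have to be chosen empirically.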
  • the viral vector encodes a sequence for expression of any effector proteins required for the perturbation, such as a programmable nuclease or specific CRISPR system, as described further herein.
  • the viral vector encodes a Cas polypeptide in the vector itself.
  • the viral vector can be used in systems that do not already express a Cas polypeptide and/or are not compatible with serial genetic manipulation, such as primary human cells.
  • vector refers to a nucleic acid molecule capable of transporting another nucleic acid to which it has been linked.
  • Vectors include, but are not limited to, nucleic acid molecules that are single-stranded, double-stranded, or partially double-stranded; nucleic acid molecules that comprise one or more free ends, no free ends (e.g., circular); nucleic acid molecules that comprise DNA, RNA, or both; and other varieties of polynucleotides known in the art.
  • plasmid refers to a circular double stranded DNA loop into which additional DNA segments can be inserted, such as by standard molecular cloning techniques.
  • Another type of vector is a viral vector, wherein virally-derived DNA or RNA sequences are present in the vector for packaging into a virus (e.g., retroviruses, replication defective retroviruses, adenoviruses, replication defective adenoviruses, and adeno-associated viruses).
  • Viral vectors also include polynucleotides carried by a virus for transfection into a host cell. Certain vectors are capable of autonomous replication in a host cell into which they are introduced (e.g., episomal mammalian vectors).
  • vectors (e.g., non-episomal mammalian vectors)
  • Other vectors are integrated into the genome of a host cell upon introduction into the host cell, and thereby are replicated along with the host genome (e.g., lentivirus).
  • certain vectors are capable of directing the expression of genes to which they are operatively-linked (i.e., operably linked to a regulatory element).
  • Such vectors are referred to herein as "expression vectors.”
  • Vectors for and that result in expression in a eukaryotic cell can be referred to herein as "eukaryotic expression vectors.”
  • Common expression vectors of utility in recombinant DNA techniques are often in the form of plasmids.
  • a vector comprises one or more pol III promoters (e.g., 1, 2, 3, 4, 5, or more pol III promoters), one or more pol II promoters (e.g., 1, 2, 3, 4, 5, or more pol II promoters), one or more pol I promoters (e.g., 1, 2, 3, 4, 5, or more pol I promoters), or combinations thereof.
  • pol III promoters include, but are not limited to, U6 and H1 promoters.
  • pol II promoters include, but are not limited to, the retroviral Rous sarcoma virus (RSV) LTR promoter (optionally with the RSV enhancer), the cytomegalovirus (CMV) promoter (optionally with the CMV enhancer) (see, e.g., Boshart et al, Cell, 41:521-530 (1985)), the SV40 promoter, the dihydrofolate reductase promoter, the β-actin promoter, the phosphoglycerol kinase (PGK) promoter, and the EF1α promoter.
  • the vector can be transcribed and translated. In some embodiments, the vector can be transcribed and translated in vitro. In some embodiments, the vector is transcribed and translated using T7 promoter regulatory sequences and T7 polymerase. In some embodiments, the vector is transcribed and translated using T3 promoter regulatory sequences and T3 polymerase. In still further embodiments, the vector is transcribed and translated using SP6 promoter regulatory sequences and SP6 RNA polymerase.
  • promoters/RNA polymerases from the bacteriophages T3, T7, and SP6 are believed to be equivalent (Askary, Amjad, et al. "In situ readout of DNA barcodes and single base edits facilitated by in vitro transcription.” Nature biotechnology 38.1 (2020): 66-75.)
  • enhancer elements such as WPRE; CMV enhancers; the R-U5’ segment in LTR of HTLV-I (Mol. Cell. Biol., Vol. 8(1), p. 466-472, 1988); SV40 enhancer; and the intron sequence between exons 2 and 3 of rabbit β-globin (Proc. Natl. Acad. Sci. USA., Vol. 78(3), p. 1527-31, 1981).
  • a vector can be introduced into host cells to thereby produce transcripts, proteins, or peptides, including fusion proteins or peptides, encoded by nucleic acids as described herein (e.g., clustered regularly interspersed short palindromic repeats (CRISPR) transcripts, proteins, enzymes, mutant forms thereof, fusion proteins thereof, etc.).
  • the vector is a lentivirus vector.
  • lentivirus vector refers to a viral vector derived from complex retroviruses such as the human immunodeficiency virus (HIV).
  • lentiviral vectors derived from any strain and subtype can be used.
  • the lentiviral vector may be based on a human or primate lentivirus such as HIV or a non-human lentivirus such as feline immunodeficiency virus, simian immunodeficiency virus and equine infectious anemia virus (EIAV).
  • the lentiviral vector is a HIV-based vector and especially a HIV-1-based vector (see, e.g., Dull T, Zufferey R, Kelly M, et al. A third-generation lentivirus vector with a conditional packaging system. J Virol. 1998;72(11):8463-8471; and Zufferey R, Dull T, Mandel RJ, et al. Self-inactivating lentivirus vector for safe and efficient in vivo gene delivery. J Virol. 1998;72(12):9873-9880).
  • the HIV 5’ LTR comprises the viral promoter for transcribing the viral genome RNA.
  • the LTR viral promoter is partially deleted and fused to a heterologous enhancer/promoter such as CMV or RSV.
  • the viral vector may encode two or more guide molecules.
  • the viral vector encodes, two, three, four, or five guide molecules.
  • the viral vector encodes two guide molecules.
  • the following include general design principles that may be applied to the guide molecule.
  • the terms "guide molecule,” “guide sequence” and “guide polynucleotide” refer to polynucleotides capable of guiding Cas to a target genomic locus and are used interchangeably as in foregoing cited documents such as International Patent Publication No. WO 2014/093622 (PCT/US2013/074667).
  • a guide sequence is any polynucleotide sequence having sufficient complementarity with a target polynucleotide sequence to hybridize with the target sequence and direct sequence-specific binding of a CRISPR complex to the target sequence.
  • the guide molecule can be a polynucleotide.
  • a guide sequence within a nucleic acid-targeting guide RNA
  • a guide sequence may direct sequence-specific binding of a nucleic acid-targeting complex to a target nucleic acid sequence
  • the components of a nucleic acid-targeting CRISPR system sufficient to form a nucleic acid-targeting complex, including the guide sequence to be tested, may be provided to a host cell having the corresponding target nucleic acid sequence, such as by transfection with vectors encoding the components of the nucleic acid-targeting complex, followed by an assessment of preferential targeting (e.g., cleavage) within the target nucleic acid sequence, such as by Surveyor assay (Qui et al. 2004.
  • cleavage of a target nucleic acid sequence may be evaluated in a test tube by providing the target nucleic acid sequence, components of a nucleic acid-targeting complex, including the guide sequence to be tested and a control guide sequence different from the test guide sequence, and comparing binding or rate of cleavage at the target sequence between the test and control guide sequence reactions.
  • Other assays are possible and will occur to those skilled in the art.
  • the guide molecule is an RNA.
  • the guide molecule(s) (also referred to interchangeably herein as guide polynucleotide and guide sequence) that are included in the CRISPR-Cas or Cas based system can be any polynucleotide sequence having sufficient complementarity with a target nucleic acid sequence to hybridize with the target nucleic acid sequence and direct sequence-specific binding of a nucleic acid-targeting complex to the target nucleic acid sequence.
  • the degree of complementarity when optimally aligned using a suitable alignment algorithm, can be about or more than about 50%, 60%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 97.5%, 98%, 98.5%, 99%, 99.5%, or more (e.g., 100%).
  • Optimal alignment may be determined with the use of any suitable algorithm for aligning sequences, non-limiting examples of which include the Smith-Waterman algorithm, the Needleman-Wunsch algorithm, algorithms based on the Burrows-Wheeler Transform (e.g., the Burrows Wheeler Aligner), ClustalW, Clustal X, BLAT, Novoalign (Novocraft Technologies; available at www.novocraft.com), ELAND (Illumina, San Diego, CA), SOAP (available at soap.genomics.org.cn), and Maq (available at maq.sourceforge.net).
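  • As a minimal illustration of a "degree of complementarity" figure, the sketch below scores a guide against a target of equal length without gaps. This is a deliberate simplification and is not one of the alignment algorithms listed above (such as Smith-Waterman or Needleman-Wunsch), which also handle gaps and local alignment. The function name `percent_complementarity` and the short example sequences are hypothetical.

```python
# Watson-Crick partners; U is converted to T so RNA and DNA can be mixed.
PAIRS = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G")}


def percent_complementarity(guide: str, target: str) -> float:
    """Percent of guide positions that base-pair with the target.

    Both sequences are written 5'->3'. Pairing is antiparallel, so the first
    guide base is compared with the last target base, and so on.
    """
    g = guide.upper().replace("U", "T")
    t = target.upper().replace("U", "T")
    if not g or len(g) != len(t):
        raise ValueError("sequences must be non-empty and of equal length")
    paired = sum((a, b) in PAIRS for a, b in zip(g, reversed(t)))
    return 100.0 * paired / len(g)


perfect = percent_complementarity("GACGU", "ACGTC")       # fully paired
one_mismatch = percent_complementarity("GACGU", "ACGTA")  # one mismatch
print(perfect, one_mismatch)
```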
  • a guide sequence and hence a nucleic acid-targeting guide, may be selected to target any target nucleic acid sequence.
  • the target sequence may be DNA.
  • the target sequence may be any RNA sequence.
  • the target sequence may be a sequence within an RNA molecule selected from the group consisting of messenger RNA (mRNA), pre-mRNA, ribosomal RNA (rRNA), transfer RNA (tRNA), micro-RNA (miRNA), small interfering RNA (siRNA), small nuclear RNA (snRNA), small nucleolar RNA (snoRNA), double stranded RNA (dsRNA), non-coding RNA (ncRNA), long non-coding RNA (lncRNA), and small cytoplasmatic RNA (scRNA).
  • the target sequence may be a sequence within an RNA molecule selected from the group consisting of mRNA, pre-mRNA, and rRNA. In some embodiments, the target sequence may be a sequence within an RNA molecule selected from the group consisting of ncRNA and lncRNA. In some more embodiments, the target sequence may be a sequence within an mRNA molecule or a pre-mRNA molecule.
  • a nucleic acid-targeting guide is selected to reduce the degree of secondary structure within the nucleic acid-targeting guide. In some embodiments, about or less than about 75%, 50%, 40%, 30%, 25%, 20%, 15%, 10%, 5%, 1%, or fewer of the nucleotides of the nucleic acid-targeting guide participate in self-complementary base pairing when optimally folded. Optimal folding may be determined by any suitable polynucleotide folding algorithm. Some programs are based on calculating the minimal Gibbs free energy. An example of one such algorithm is mFold, as described by Zuker and Stiegler (Nucleic Acids Res. 9 (1981), 133-148).
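  • The fraction of nucleotides involved in self-complementary pairing can be estimated with a simple dynamic program. The sketch below uses Nussinov-style base-pair maximization, which is a different and much cruder technique than the free-energy minimization performed by mFold; it reports an upper bound on pairing rather than a thermodynamically optimal fold. The function names, the minimum hairpin loop of 3 nt, and the inclusion of G-U wobble pairs are assumptions made for illustration.

```python
ALLOWED = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def max_base_pairs(seq: str, min_loop: int = 3) -> int:
    """Maximum number of nested base pairs (Nussinov-style recursion)."""
    s = seq.upper().replace("T", "U")
    n = len(s)
    if n == 0:
        return 0
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case: position j is unpaired
            # case: j pairs with k, leaving at least min_loop bases between
            for k in range(i, j - min_loop):
                if (s[k], s[j]) in ALLOWED:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1]


def paired_fraction(seq: str, min_loop: int = 3) -> float:
    """Fraction of nucleotides that are paired in the maximal pairing."""
    return 2 * max_base_pairs(seq, min_loop) / len(seq) if seq else 0.0


# Toy hairpin: three G-C pairs closing a four-base loop.
print(max_base_pairs("GGGAAAUCCC"), paired_fraction("GGGAAAUCCC"))
```

  • A guide whose `paired_fraction` falls below a chosen cutoff (for example one of the percentages listed above) would pass this coarse filter; a free-energy-based tool would still be needed for a realistic structure prediction.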
  • a guide RNA or crRNA may comprise, consist essentially of, or consist of a direct repeat (DR) sequence and a guide sequence or spacer sequence.
  • the guide RNA or crRNA may comprise, consist essentially of, or consist of a direct repeat sequence fused or linked to a guide sequence or spacer sequence.
  • the direct repeat sequence may be located upstream (i.e., 5’) from the guide sequence or spacer sequence. In other embodiments, the direct repeat sequence may be located downstream (i.e., 3’) from the guide sequence or spacer sequence.
  • the crRNA comprises a stem loop, preferably a single stem loop.
  • the direct repeat sequence forms a stem loop, preferably a single stem loop.
  • the spacer length of the guide RNA is from 15 to 35 nucleotides (nt). In another example embodiment, the spacer length of the guide RNA is at least 15 nucleotides (nt). In another example embodiment, the spacer length is from 15 to 17 nt, e.g., 15, 16, or 17 nt, from 17 to 20 nt, e.g., 17, 18, 19, or 20 nt, from 20 to 24 nt, e.g., 20, 21, 22, 23, or 24 nt, from 23 to 25 nt, e.g., 23, 24, or 25 nt, from 24 to 27 nt, e.g., 24, 25, 26, or 27 nt, from 27 to 30 nt, e.g., 27, 28, 29, or 30 nt, from 30 to 35 nt, e.g., 30, 31, 32, 33, 34, or 35 nt, or 35 nt or longer.
  • the "tracrRNA” sequence or analogous terms includes any polynucleotide sequence that has sufficient complementarity with a crRNA sequence to hybridize.
  • the degree of complementarity between the tracrRNA sequence and crRNA sequence along the length of the shorter of the two when optimally aligned is about or more than about 25%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 97.5%, 98%, 98.5%, 99%, 99.5%, or more (e.g., 100%).
  • the tracr sequence is about or more than about 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 25, 30, 40, 50, or more nucleotides in length.
  • the tracr sequence and crRNA sequence are contained within a single transcript, such that hybridization between the two produces a transcript having a secondary structure, such as a hairpin.
  • degree of complementarity is with reference to the optimal alignment of the spacer sequence and tracr sequence, along the length of the shorter of the two sequences.
  • Optimal alignment may be determined by any suitable alignment algorithm and may further account for secondary structures, such as self-complementarity within either the spacer sequence or tracr sequence.
  • the degree of complementarity between the tracr sequence and spacer sequence along the length of the shorter of the two when optimally aligned is about or more than about 25%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 97.5%, 98%, 98.5%, 99%, 99.5%, or more (e.g., 100%).
  • the degree of complementarity between a guide sequence and its corresponding target sequence can be about or more than about 50%, 60%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 97.5%, 98%, 98.5%, 99%, 99.5%, or more (e.g., 100%);
  • a guide or RNA or sgRNA can be about or more than about 5, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, 75, or more nucleotides in length; or guide or RNA or sgRNA can be less than about 75, 50, 45, 40, 35, 30, 25, 20, 15, 12, or fewer nucleotides in length; and tracr RNA can be 30 or 50 nucleotides in length.
  • the degree of complementarity between a guide sequence and its corresponding target sequence is greater than 94.5% or 95% or 95.5% or 96% or 96.5% or 97% or 97.5% or 98% or 98.5% or 99% or 99.5% or 99.9%, or 100%.
  • Off target is less than 100% or 99.9% or 99.5% or 99% or 99% or 98.5% or 98% or 97.5% or 97% or 96.5% or 96% or 95.5% or 95% or 94.5% or 94% or 93% or 92% or 91% or 90% or 89% or 88% or 87% or 86% or 85% or 84% or 83% or 82% or 81% or 80% complementarity between the sequence and the guide, with it being advantageous that off target is 100% or 99.9% or 99.5% or 99% or 99% or 98.5% or 98% or 97.5% or 97% or 96.5% or 96% or 95.5% or 95% or 94.5% complementarity between the sequence and the guide.
  • the guide RNA (capable of guiding Cas to a target locus) may comprise (1) a guide sequence capable of hybridizing to a genomic target locus in the eukaryotic cell; (2) a tracr sequence; and (3) a tracr mate sequence. All of (1) to (3) may reside in a single RNA, i.e., an sgRNA (arranged in a 5’ to 3’ orientation), or the tracr RNA may be a different RNA than the RNA containing the guide and tracr sequence. The tracr hybridizes to the tracr mate sequence and directs the CRISPR/Cas complex to the target sequence.
  • each RNA may be optimized to be shortened from their respective native lengths, and each may be independently chemically modified to protect from degradation by cellular RNase or otherwise increase stability.
  • target sequence refers to a sequence to which a guide sequence is designed to have complementarity, where hybridization between a target sequence and a guide sequence promotes the formation of a CRISPR complex.
  • the target polynucleotide can be a polynucleotide or a part of a polynucleotide to which a part of the guide sequence is designed to have complementarity with and to which the effector function mediated by the complex comprising the CRISPR effector protein and a guide molecule is to be directed.
  • a target sequence is located in the nucleus or cytoplasm of a cell.
  • PAM elements are sequences that can be recognized and bound by Cas proteins. Cas proteins/effector complexes can then unwind the dsDNA at a position adjacent to the PAM element. It will be appreciated that Cas proteins and systems that target RNA do not require PAM sequences (Marraffini et al. 2010. Nature. 463:568-571). Instead, many rely on PFSs, which are discussed elsewhere herein.
  • the target sequence should be associated with a PAM (protospacer adjacent motif) or PFS (protospacer flanking sequence or site), that is, a short sequence recognized by the CRISPR complex.
  • the target sequence should be selected, such that its complementary sequence in the DNA duplex (also referred to herein as the non-target sequence) is upstream or downstream of the PAM.
  • the complementary sequence of the target sequence is downstream or 3’ of the PAM or upstream or 5’ of the PAM.
  • the precise sequence and length requirements for the PAM differ depending on the Cas protein used, but PAMs are typically 2-5 base pair sequences adjacent the protospacer (that is, the target sequence). Examples of the natural PAM sequences for different Cas proteins are provided herein below and the skilled person will be able to identify further PAM sequences for use with a given Cas protein.
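  • Candidate target sites adjacent to a PAM can be enumerated with a short scan. The sketch below assumes a 3' PAM of the form NGG (the canonical SpCas9 PAM) and a 20-nt protospacer, and it scans only the strand supplied; the PAM pattern, the spacer length, and the function name `find_protospacers` are illustrative choices and should be replaced with the requirements of the Cas protein actually used.

```python
import re


def find_protospacers(dna: str, spacer_len: int = 20, pam_regex: str = "[ACGT]GG"):
    """Return (start, protospacer, pam) for every site followed by a 3' PAM.

    A lookahead is used so that overlapping sites are all reported.
    """
    seq = dna.upper()
    pattern = re.compile(rf"(?=([ACGT]{{{spacer_len}}})({pam_regex}))")
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]


# Toy input: a 20-nt protospacer followed by the PAM "AGG".
for start, spacer, pam in find_protospacers("ACGT" * 5 + "AGG"):
    print(start, spacer, pam)
```

  • Scanning the reverse complement with the same function would cover sites on the opposite strand.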
  • the CRISPR effector protein may recognize a 3’ PAM.
  • the CRISPR effector protein may recognize a 3’ PAM which is 5’H, wherein H is A, C or U.
  • engineering of the PAM Interacting (PI) domain on the Cas protein may allow programming of PAM specificity, improve target site recognition fidelity, and increase the versatility of the CRISPR-Cas protein, for example as described for Cas9 in Kleinstiver BP et al. Engineered CRISPR-Cas9 nucleases with altered PAM specificities. Nature. 2015 Jul 23;523(7561):481-5. doi: 10.1038/nature14592. As further detailed herein, the skilled person will understand that Cas13 proteins may be modified analogously.
  • Gao et al "Engineered Cpf1 Enzymes with Altered PAM Specificities," bioRxiv 091611; doi: http://dx.doi.org/10.1101/091611 (Dec. 4, 2016).
  • Doench et al. created a pool of sgRNAs, tiling across all possible target sites of a panel of six endogenous mouse and three endogenous human genes and quantitatively assessed their ability to produce null alleles of their target gene by antibody staining and flow cytometry. The authors showed that optimization of the PAM improved activity and also provided an on-line tool for designing sgRNAs.
  • PAM sequences can be identified in a polynucleotide using an appropriate design tool, which are commercially available as well as online.
  • Such freely available tools include, but are not limited to, CRISPRFinder and CRISPRTarget. Mojica et al. 2009. Microbiol. 155(Pt. 3):733-740; Altschul et al. 1990. J. Mol. Biol. 215:403-410; Biswas et al. 2013 RNA Biol. 10:817-827; and Grissa et al. 2007. Nucleic Acid Res. 35:W52-57.
  • Experimental approaches to PAM identification can include, but are not limited to, plasmid depletion assays (Jiang et al. 2013. Nat.
  • CRISPR-Cas systems that target RNA do not typically rely on PAM sequences. Instead, such systems typically recognize protospacer flanking sites (PFSs) instead of PAMs.
  • Type VI CRISPR-Cas systems typically recognize protospacer flanking sites (PFSs) instead of PAMs.
  • PFSs represent an analogue to PAMs for RNA targets.
  • Type VI CRISPR-Cas systems employ a Cas13.
  • Some Cas13 proteins analyzed to date, such as Cas13a (C2c2) identified from Leptotrichia shahii (LshCas13a), have a specific discrimination against G at the 3’ end of the target RNA.
  • Some Type VI proteins, such as subtype B, have 5'-recognition of D (G, T, A) and a 3'-motif requirement of NAN or NNA.
  • One example is the Cas13b protein identified in Bergeyella zoohelcum (BzCas13b). See, e.g., Gleditzsch et al. 2019. RNA Biology. 16(4):504-517.
  • Type VI CRISPR-Cas systems appear to have less restrictive rules for substrate (e.g., target sequence) recognition than those that target DNA (e.g., Type V and type II).
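  • The 3' PFS preference noted above for LshCas13a (discrimination against G, i.e., a 3' flanking base of H = A, C or U) can be expressed as a one-base check. The sketch below is a minimal illustration; the function name `has_permissive_3prime_pfs`, its coordinate convention (0-based start of the protospacer within the target RNA), and the toy sequences are hypothetical.

```python
def has_permissive_3prime_pfs(target_rna: str, protospacer_start: int,
                              protospacer_len: int) -> bool:
    """True if the base immediately 3' of the protospacer is A, C or U (not G)."""
    rna = target_rna.upper().replace("T", "U")
    flank_index = protospacer_start + protospacer_len
    if protospacer_start < 0 or flank_index >= len(rna):
        raise ValueError("protospacer must be followed by at least one base")
    return rna[flank_index] in {"A", "C", "U"}


# Toy targets: a 5-nt protospacer followed by C (permissive) or G (disfavored).
print(has_permissive_3prime_pfs("AAAAAC", 0, 5))
print(has_permissive_3prime_pfs("AAAAAG", 0, 5))
```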
  • one or more components (e.g., the Cas protein) in the composition for engineering cells may comprise one or more sequences related to nucleus targeting and transportation. Such sequences may facilitate the one or more components in the composition for targeting a sequence within a cell.
  • nuclear localization sequences (NLSs)
  • the NLSs used in the context of the present disclosure are heterologous to the proteins.
  • Non-limiting examples of NLSs include an NLS sequence derived from: the NLS of the SV40 virus large T-antigen, having the amino acid sequence PKKKRKV (SEQ ID NO: 1) or PKKKRKVEAS (SEQ ID NO: 2); the NLS from nucleoplasmin (e.g., the nucleoplasmin bipartite NLS with the sequence KRPAATKKAGQAKKKK (SEQ ID NO: 3)); the c-myc NLS having the amino acid sequence PAAKRVKLD (SEQ ID NO: 4) or RQRRNELKRSP (SEQ ID NO: 5); the hRNPA1 M9 NLS having the sequence NQSSNFGPMKGGNFGGRSSGPYGGGGQYFAKPRNQGGY (SEQ ID NO: 6); the sequence RMRIZFKNKGKDTAELRRRRVEVSVELRKAKKDEQIL
  • the one or more NLSs are of sufficient strength to drive accumulation of the DNA-targeting Cas protein in a detectable amount in the nucleus of a eukaryotic cell.
  • strength of nuclear localization activity may derive from the number of NLSs in the CRISPR-Cas protein, the particular NLS(s) used, or a combination of these factors.
  • Detection of accumulation in the nucleus may be performed by any suitable technique.
  • a detectable marker may be fused to the nucleic acid-targeting protein, such that location within a cell may be visualized, such as in combination with a means for detecting the location of the nucleus (e.g., a stain specific for the nucleus such as DAPI).
  • Cell nuclei may also be isolated from cells, the contents of which may then be analyzed by any suitable process for detecting protein, such as immunohistochemistry, Western blot, or enzyme activity assay. Accumulation in the nucleus may also be determined indirectly, such as by an assay for the effect of nucleic acid-targeting complex formation (e.g., assay for deaminase activity at the target sequence, or assay for altered gene expression activity affected by DNA-targeting complex formation and/or DNA-targeting), as compared to a control not exposed to the Cas protein, or exposed to a Cas protein lacking the one or more NLSs.
  • the Cas proteins may be provided with 1 or more, such as with 2, 3, 4, 5, 6, 7, 8, 9, 10, or more heterologous NLSs.
  • the protein comprises about or more than about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more NLSs at or near the amino-terminus, about or more than about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more NLSs at or near the carboxy-terminus, or a combination of these (e.g., zero or at least one or more NLS at the amino-terminus and zero or at least one or more NLS at the carboxy-terminus).
  • each NLS may be selected independently of the others, such that a single NLS may be present in more than one copy and/or in combination with one or more other NLSs present in one or more copies.
  • an NLS is considered near the N- or C-terminus when the nearest amino acid of the NLS is within about 1, 2, 3, 4, 5, 10, 15, 20, 25, 30, 40, 50, or more amino acids along the polypeptide chain from the N- or C-terminus.
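  • The "near the terminus" criterion above can be checked directly on a protein sequence. The sketch below locates copies of the SV40 large T-antigen NLS (PKKKRKV, SEQ ID NO: 1) and reports how many residues separate each copy from the N- and C-termini. The function names, the default window of 50 residues (one of the values listed above), and the toy protein sequences are illustrative assumptions.

```python
SV40_NLS = "PKKKRKV"  # SEQ ID NO: 1


def nls_terminal_distances(protein: str, nls: str = SV40_NLS):
    """Return (start, residues_before, residues_after) for each NLS copy."""
    results = []
    start = protein.find(nls)
    while start != -1:
        end = start + len(nls)  # index just past the last NLS residue
        results.append((start, start, len(protein) - end))
        start = protein.find(nls, start + 1)
    return results


def is_near_terminus(protein: str, nls: str = SV40_NLS, window: int = 50) -> bool:
    """True if any NLS copy lies within `window` residues of either terminus."""
    return any(min(before, after) <= window
               for _, before, after in nls_terminal_distances(protein, nls))


# Toy proteins: an N-terminal NLS, and an NLS buried 60 residues from each end.
n_terminal = "M" + SV40_NLS + "G" * 100
internal = "G" * 60 + SV40_NLS + "G" * 60
print(is_near_terminus(n_terminal), is_near_terminus(internal))
```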
  • an NLS is attached to the C-terminus of the protein.
  • the CRISPR-Cas protein and a functional domain protein are delivered to the cell or expressed within the cell as separate proteins.
  • each of the CRISPR-Cas and functional domain protein can be provided with one or more NLSs as described herein.
  • the CRISPR-Cas and functional domain protein are delivered to the cell or expressed within the cell as a fusion protein.
  • one or both of the CRISPR-Cas and functional domain protein is provided with one or more NLSs.
  • where the functional domain protein is fused to an adaptor protein (such as MS2) as described above, the one or more NLS can be provided on the adaptor protein, provided that this does not interfere with aptamer binding.
  • the one or more NLS sequences may also function as linker sequences between the functional domain protein and the CRISPR-Cas protein.
  • guides of the disclosure comprise specific binding sites (e.g., aptamers) for adapter proteins, which may be linked to or fused to a functional domain protein or catalytic domain thereof.
  • a guide forms a CRISPR complex (e.g., CRISPR-Cas protein binding to guide and target)
  • the adapter proteins bind and the functional domain protein or catalytic domain thereof associated with the adapter protein is positioned in a spatial orientation which is advantageous for the attributed function to be effective.
  • the one or more modified guide may be modified at the tetra loop, the stem loop 1, stem loop 2, or stem loop 3, as described herein, preferably at either the tetra loop or stem loop 2, and in some cases at both the tetra loop and stem loop 2.
  • a component in the systems may comprise one or more nuclear export signals (NES), one or more nuclear localization signals (NLS), or any combinations thereof.
  • the NES may be an HIV Rev NES.
  • the NES may be MAPK NES.
  • where the component is a protein, the NES or NLS may be at the C terminus of the component. Alternatively, or additionally, the NES or NLS may be at the N terminus of the component.
  • the Cas protein and optionally said functional domain protein or catalytic domain thereof comprise one or more heterologous nuclear export signal(s) (NES(s)) or nuclear localization signal(s) (NLS(s)), preferably an HIV Rev NES or MAPK NES, preferably C-terminal.
  • the guide molecule may further comprise a barcode that is unique to each perturbation and thereby identifies which guide molecule(s) and perturbation(s) were introduced into a given cell when analyzed by sequencing.
  • the barcode is incorporated into the sequence encoding the perturbation or is a sequence that is only encoded on a vector encoding the perturbation.
  • the barcode identifying a perturbation can be the perturbation, such as a guide sequence encoding a specific perturbation.
  • the guide sequence can also include one or more internal barcode sequences (iBAR) that are inserted in the guide sequence.
  • the one or more iBARs are inserted in a guide sequence in such a way as to not interfere with the guide sequence's ability to be directed to a target sequence.
  • an iBAR is inserted within the loop of a sgRNA joining the crRNA and tracrRNA sequence of the sgRNA (described further herein).
  • an iBAR can identify the perturbation.
  • additional iBARs can identify replicates, subpopulations, clones, and/or subclones.
  • barcode refers to a short sequence of nucleotides (for example, DNA or RNA) that is used as an identifier for an associated molecule, such as a target molecule and/or target nucleic acid, or as an identifier of the source of an associated molecule, such as a cell-of-origin, sample of origin, or individual transcript.
  • a barcode may also refer to any unique, non-naturally occurring, nucleic acid sequence that may be used to identify the originating source of a nucleic acid fragment.
  • the barcode sequence provides a high-quality individual read of a barcode associated with a perturbation, single cell, single nuclei, a viral vector, labeling ligand (e.g., antibody or aptamer), protein, shRNA, sgRNA or cDNA such that multiple species can be sequenced together.
  • Barcoding may be performed based on any of the compositions or methods disclosed in patent publication WO 2014047561 A1, Compositions and methods for labeling of agents, incorporated herein in its entirety.
  • barcoding uses an error correcting scheme (T. K. Moon, Error Correction Coding: Mathematical Methods and Algorithms (Wiley, New York, ed. 1, 2005)).
  • a nucleic acid barcode can have a length of at least, for example, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, 60, 70, 80, 90, or 100 nucleotides, and can be in single- or double-stranded form.
  • Target molecule and/or target nucleic acids can be labeled with multiple nucleic acid barcodes in combinatorial fashion, such as a nucleic acid barcode concatemer.
  • a nucleic acid barcode is used to identify a target molecule and/or target nucleic acid, or as being from a particular discrete volume, having a particular physical property (for example, affinity, length, sequence, etc.), or having been subject to certain treatment conditions.
  • Target molecules and/or target nucleic acids can be associated with multiple nucleic acid barcodes to provide information about all of these features (and more).
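The error-correcting barcode idea mentioned above can be illustrated with a minimal sketch. It assumes a simple minimum-Hamming-distance scheme, which is only one possible error-correcting approach and is not stated to be the one used in the disclosure; the barcode sequences and function names are hypothetical.

```python
from __future__ import annotations
from itertools import combinations

def hamming(a: str, b: str) -> int:
    """Number of mismatched positions between two equal-length barcodes."""
    if len(a) != len(b):
        raise ValueError("barcodes must have equal length")
    return sum(x != y for x, y in zip(a, b))

def min_pairwise_distance(barcodes: list[str]) -> int:
    """Smallest Hamming distance between any two barcodes in the set."""
    return min(hamming(a, b) for a, b in combinations(barcodes, 2))

def correct(read: str, barcodes: list[str], max_errors: int = 1) -> str | None:
    """Return the unique barcode within max_errors of the read, else None."""
    hits = [bc for bc in barcodes if hamming(read, bc) <= max_errors]
    return hits[0] if len(hits) == 1 else None

# Hypothetical 6-nt barcodes, chosen only for illustration.
BARCODES = ["AAAAAA", "CCCAAA", "AAACCC", "GGGGGG"]

if __name__ == "__main__":
    print(min_pairwise_distance(BARCODES))
    print(correct("AAAAAT", BARCODES))
```

A set with minimum pairwise distance d allows up to (d - 1) // 2 substitution errors per read to be corrected unambiguously, which is why a distance of 3 supports single-error correction here.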
  • the vector may further encode one or more tRNA sequences.
  • a tRNA sequence is encoded between the sgRNAs.
  • the tRNA sequences may recruit endogenous (i.e., of the cell transfected using the vector) RNase P and Z to cleave the tRNA sequence at the 5’ and 3’ ends, such that, when positioned between guide molecules within a single transcript, the transcript is processed into separate, functional guide molecules.
  • a tRNA sequence may be included in front of the first guide molecule in the single transcript as well as between any two guide molecules in the transcript.
  • the orientation of the tRNA leader sequences may be reversed.
  • a 3' LTR embedding the reverse complement orientation of the tRNA sequences is a key feature for making the mRNA (i.e., pol II transcript encoding the guide sequences) compatible with single cell sequencing.
  • the Pol III transcript is transcribed from the pol III promoter in the 3’ LTR of the integrated lentivirus such that the sgRNAs and tRNAs are oriented such that the functional tRNAs are recognized and cleaved.
  • the pol III promoter initiates transcription on the opposite strand as the pol II promoter in the lentivirus vector by reversing the orientation of the elements.
  • the orientation of the U6 and tRNA/sgRNA arrays are antisense relative to the lentiviral genome.
  • the tRNA sequences may be orthogonal to the cell to be transfected.
  • the two or more guide molecules may also be operably linked to a promoter encoded for or within the 3’ LTR and a pol II promoter encoded upstream of the 3’ LTR.
  • Example tRNA sequences are disclosed in the Examples section below. In example embodiments, any method of multiplexing sgRNAs can be used instead of tRNA sequences.
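The layout described above (a tRNA in front of the first guide and between guides, with the whole array placed antisense relative to the lentiviral genome) can be sketched in silico. The tRNA sequence is the one listed in this document as SEQ ID NO: 18; the spacers and the scaffold placeholder are hypothetical stand-ins, and the function names are illustrative only.

```python
# tRNA-Gly-GCC-2-1 leader sequence as listed in this document (SEQ ID NO: 18).
TRNA_GLY = (
    "GCATTGGTGGTTCAGTGGTAGAATTCTCGCCTGCCACGCGGGAGGCCCGGGTTCGAT"
    "TCCCGGCCAATGCA"
)
# Hypothetical placeholders, NOT sequences from the disclosure.
SPACERS = ["ACGTACGTACGTACGTACGT", "TTGGCCAATTGGCCAATTGG"]
SCAFFOLD = "NNNNNNNNNN"  # stand-in for an sgRNA scaffold sequence

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")

def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA string over the alphabet A, C, G, T, N."""
    return seq.translate(_COMPLEMENT)[::-1]

def build_array(spacers, scaffold, trna):
    """Sense-strand layout: tRNA-spacer-scaffold, repeated for each guide."""
    return "".join(trna + spacer + scaffold for spacer in spacers)

# Array as read along the pol III transcript.
sense_array = build_array(SPACERS, SCAFFOLD, TRNA_GLY)
# Same array as it would appear on the lentiviral genome strand (antisense).
antisense_array = reverse_complement(sense_array)

if __name__ == "__main__":
    print(len(sense_array), antisense_array[:30])
```

Because the pol II transcript carries the reverse complement, it does not contain a functional tRNA sequence, which matches the stated reason for reversing the orientation.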
  • the lentivirus vectors are delivered to a population of cells to obtain perturbed cells.
  • the population of cells can include any tissue culture cell line or any primary cell line.
  • the population of cells expresses any effector proteins required for the perturbation, such as a programmable nuclease or specific CRISPR system, as described further herein.
  • the population of cells comprises eukaryotic cells, preferably, mammalian cells, more preferably, human cells.
  • the present disclosure provides for a method of performing multiplexed pooled perturbation screening comprising introducing one or more perturbation vectors encoding two or more guide molecules in a single transcript according to any embodiment herein to a population of cells to obtain perturbed cells, optionally the population of cells expresses any effector proteins required for the perturbation; and performing single cell RNA sequencing on the perturbed cells to identify the phenotype and multiplexed perturbations in single cells.
  • the exemplary embodiments disclosed herein are directed to a method of performing multiplexed pooled optical perturbation screening comprising introducing one or more perturbations using the vectors described herein.
  • the perturbed cells are then fixed and permeabilized.
  • the mRNA is reverse transcribed from the pol II promoter of the lentivirus vector in the perturbed cells using primers specific for the guide molecules, thereby generating cDNA comprising the sequence encoding the guide molecules.
  • the cDNA is then fixed in the cells and in situ sequencing (ISS) of the cDNA sequences is performed.
  • the ISS step may comprise contacting the perturbed cell with padlock probes flanking each guide molecule and/or iBAR, gap filling the padlock probes to capture each spacer and iBAR sequence, ligating the padlock probes into circular ssDNA templates and performing rolling circle amplification to generate amplified cDNA and sequencing the amplified cDNA using sequencing by synthesis.
  • an enzyme, typically a DNA polymerase, initiates the sequencing by incorporating fluorescently labeled nucleotide bases one at a time. After each base addition, the sample is imaged to detect the fluorescent signal associated with that base, which signifies its identity. This process is cycled for multiple rounds, with each cycle adding and imaging a single base. The accumulated data, represented by a series of images, is then analyzed to deduce the complete RNA sequence. Sequencing by synthesis provides valuable insights into gene expression patterns and RNA modifications within individual cells and tissues while preserving their spatial context.
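The cycle-by-cycle readout described above can be sketched as a simple base-calling step. This is a minimal illustration that assumes one fluorescence channel per base and calls the brightest channel in each cycle; the channel order, the intensity values, and the function name are hypothetical, and real pipelines typically add steps such as channel crosstalk correction and quality filtering.

```python
# Hypothetical mapping of imaging channels to bases (assumed order).
CHANNELS = "GTAC"

def call_bases(intensities):
    """Call one base per cycle by taking the brightest of four channels.

    intensities: sequence of 4-element sequences, one per sequencing cycle,
    giving the spot intensity in each channel for a single read.
    """
    read = []
    for cycle in intensities:
        if len(cycle) != len(CHANNELS):
            raise ValueError("expected one intensity per channel")
        brightest = max(range(len(CHANNELS)), key=lambda i: cycle[i])
        read.append(CHANNELS[brightest])
    return "".join(read)

if __name__ == "__main__":
    # Three cycles of made-up intensities for one amplified spot.
    spot = [(0.9, 0.1, 0.0, 0.1), (0.1, 0.2, 0.8, 0.0), (0.0, 0.1, 0.1, 0.7)]
    print(call_bases(spot))
```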
  • embodiments disclosed herein are directed to a method of performing multiplexed pooled perturbation screening, the method involving introducing one or more perturbation vectors that includes a promoter, optionally where the promoter may be used to generate in vitro transcription barcoded RNA.
  • in vitro detection can enhance decoding strategies employed in the performance of multiplexed pooled perturbation screening.
  • the term "perturbation" refers to any alteration of the function of a biological system by external or internal means, such as alterations in gene expression, alterations by environmental stimuli, or alterations by drug treatment.
  • the perturbation used in the present disclosure is genetic.
  • a genetic perturbation refers to a perturbation that perturbs a nucleic acid, such as a genome sequence (e.g., a target gene or regulatory element) or RNA sequence (e.g., a transcript sequence).
  • a plurality of cells is perturbed with sequence specific perturbations.
  • sequence specific refers to a perturbation that targets a specific nucleotide sequence in a cell (e.g., a DNA or RNA sequence).
  • a genetic perturbation is a CRISPR mediated perturbation (e.g., INDELs, substitutions, CRISPRa (CRISPR activation), CRISPRi (CRISPR interference), prime editing, base editing), or an RNAi (RNA interference) mediated perturbation.
  • the perturbations can be identified by sequencing.
  • each perturbation can be identified by at least one barcode sequence.
  • the one or more perturbations target specific genes of interest.
  • the perturbations both target the same gene, but at a different sequence.
  • perturbations include any perturbation that can be directed to a target sequence for perturbation by a programmable system.
  • the programmable system is a programmable nuclease system, such as a CRISPR system.
  • the perturbations encoded by the vectors of the present disclosure are guide sequences capable of targeting a programmable system to a target sequence in a cell.
  • the programmable system includes an enzymatic component that is targeted to the perturbation target.
  • the cells used for the multiplexed perturbations are modified to express the programmable system for making the perturbations in the cells.
  • the programmable system is introduced to the cells concurrently or before introducing the vectors encoding the perturbations.
  • the perturbations are guide sequences specific for a CRISPR system.
  • the guide sequences are single guide RNA sequences (sgRNA), which are described further herein.
  • the population of cells may express a Cas polypeptide from a CRISPR-Cas system.
  • the cell may be genetically modified to express the Cas polypeptide, or the Cas polypeptide may be delivered prior to, with, or subsequent to delivery of the viral vector systems disclosed herein.
  • the Cas polypeptide used will align with the corresponding guide molecule i.e. a Cas9 will be used with a Cas9 guide molecule and so forth.
  • the Cas is a Type II Cas polypeptide.
  • the Type II CRISPR-Cas system is a II-A CRISPR-Cas system.
  • the Type II CRISPR-Cas system is a II-B CRISPR-Cas system. In some embodiments, the Type II CRISPR-Cas system is a II-C1 CRISPR-Cas system. In some embodiments, the Type II CRISPR-Cas system is a II-C2 CRISPR-Cas system. In some embodiments, the Type II system is a Cas9 system. In some embodiments, the Type II system includes a Cas9.
  • the Type V CRISPR-Cas system is a V-A CRISPR-Cas system. In some embodiments, the Type V CRISPR-Cas system is a V-B1 CRISPR-Cas system. In some embodiments, the Type V CRISPR-Cas system is a V-B2 CRISPR-Cas system. In some embodiments, the Type V CRISPR-Cas system is a V-C CRISPR-Cas system. In some embodiments, the Type V CRISPR-Cas system is a V-D CRISPR-Cas system. In some embodiments, the Type V CRISPR-Cas system is a V-E CRISPR-Cas system.
  • the Type V CRISPR-Cas system is a V-F1 CRISPR-Cas system. In some embodiments, the Type V CRISPR-Cas system is a V-F1 (V-U3) CRISPR-Cas system. In some embodiments, the Type V CRISPR-Cas system is a V-F2 CRISPR-Cas system. In some embodiments, the Type V CRISPR-Cas system is a V-F3 CRISPR-Cas system. In some embodiments, the Type V CRISPR-Cas system is a V-G CRISPR-Cas system.
  • the Type V CRISPR-Cas system is a V-H CRISPR-Cas system. In some embodiments, the Type V CRISPR-Cas system is a V-I CRISPR-Cas system. In some embodiments, the Type V CRISPR-Cas system is a V-K (V-U5) CRISPR-Cas system. In some embodiments, the Type V CRISPR-Cas system is a V-U1 CRISPR-Cas system. In some embodiments, the Type V CRISPR-Cas system is a V-U2 CRISPR-Cas system.
  • the Type V CRISPR-Cas system is a V-U4 CRISPR-Cas system. In some embodiments, the Type V CRISPR-Cas system includes a Cas12a (Cpf1), Cas12b (C2c1), Cas12c (C2c3), Cas12d (CasY), Cas12e (CasX), Cas14, and/or CasΦ.
  • the Class 2 system is a Type VI system.
  • the Type VI CRISPR-Cas system is a VI-A CRISPR-Cas system.
  • the Type VI CRISPR-Cas system is a VI-B1 CRISPR-Cas system.
  • the Type VI CRISPR-Cas system is a VI-B2 CRISPR-Cas system.
  • the Type VI CRISPR-Cas system is a VI-C CRISPR-Cas system.
  • the Type VI CRISPR-Cas system is a VI-D CRISPR-Cas system.
  • the Type VI CRISPR-Cas system includes a Cas13a (C2c2), Cas13b (Group 29/30), Cas13c, and/or Cas13d.
  • the programmable nuclease to modify the one or more target genes is a transposon-encoded RNA-guided nuclease system, referred to herein as OMEGA (obligate mobile element-guided activity).
  • OMEGA systems include, but are not limited to IscB, IsrB, TnpB systems.
  • the guide molecule of an OMEGA system is referred to as ωRNA and, while different in size and structure from a typical CRISPR-Cas guide, also contains a programmable spacer sequence and scaffold component. Accordingly, OMEGA systems may be used within the context of the disclosure, both in the context of viral vectors encoding two or more guide molecules that are ωRNAs, and in the use of OMEGA polypeptides to make the desired cellular perturbation.
  • Perturb-seq identifies perturbations by sequencing barcodes identifying the perturbations expressed as poly(A) tailed mRNAs.
  • the multiplexed vectors described herein can be used with any method of perturb-seq. Examples of prior perturb-seq assays have been described (see, e.g., Dixit et al., "Perturb-Seq: Dissecting Molecular Circuits with Scalable Single-Cell RNA Profiling of Pooled Genetic Screens" 2016, Cell 167, 1853-1866; Adamson et al., "A Multiplexed Single-Cell CRISPR Screening Platform Enables Systematic Dissection of the Unfolded Protein Response” 2016, Cell 167, 1867-1882; Jaitin DA, Weiner A, Yofe I, et al.
  • perturbations are identified along with transcriptome mRNA using single cell sequencing.
  • the disclosure involves single cell RNA sequencing (see, e.g., Qi Z, Barrett T, Parikh AS, Tirosh I, Puram SV. Single-cell sequencing and its applications in head and neck cancer. Oral Oncol. 2019;99: 104441; Kalisky, T., Blainey, P. & Quake, S. R. Genomic Analysis at the Single-Cell Level. Annual review of genetics 45, 431-445, (2011); Kalisky, T. & Quake, S. R. Single-cell genomics. Nature Methods 8, 311-314 (2011); Islam, S. et al.
  • the disclosure involves plate based single cell RNA sequencing (see, e.g., Picelli, S. et al., 2014, "Full-length RNA-seq from single cells using Smart-seq2" Nature protocols 9, 171-181, doi:10.1038/nprot.2014.006).
  • the disclosure involves high-throughput single-cell RNA-seq.
  • Macosko et al. 2015, "Highly Parallel Genome-wide Expression Profiling of Individual Cells Using Nanoliter Droplets" Cell 161, 1202-1214; International patent application number PCT/US2015/049178, published as WO2016/040476 on March 17, 2016; Klein et al., 2015, "Droplet Barcoding for Single-Cell Transcriptomics Applied to Embryonic Stem Cells" Cell 161, 1187-1201; International patent application number PCT/US2016/027734, published as WO2016168584A1 on October 20, 2016; Zheng, et al., 2016, "Haplotyping germline and cancer genomes with high-throughput linked-read sequencing" Nature Biotechnology 34, 303-311; Zheng, et al., 2017, "Massively parallel digital transcriptional profiling of single cells" Nat.
  • the disclosure involves single nucleus RNA sequencing.
  • Swiech et al., 2014 "In vivo interrogation of gene function in the mammalian brain using CRISPR-Cas9" Nature Biotechnology Vol. 33, pp. 102-106; Habib et al., 2016, “Div-Seq: Single-nucleus RNA-Seq reveals dynamics of rare adult newborn neurons” Science, Vol. 353, Issue 6302, pp. 925-928; Habib et al., 2017, “Massively parallel single-nucleus RNA-seq with DroNc-seq” Nat Methods. 2017 Oct;14(10):955-958; International Patent Application No.
  • optical screens are performed in cells using multiplexed perturbations.
  • the present disclosure also provides for a method of performing multiplexed pooled optical perturbation screening comprising introducing one or more perturbation vectors encoding two sgRNAs in a single transcript according to any embodiment herein to a population of cells to obtain perturbed cells (optionally, Cas-expressing); fixing and permeabilizing the perturbed cells; reverse transcribing mRNA transcribed from the pol II promoter of the lentivirus vector in the perturbed cells using primers specific for each of the two sgRNAs, thereby generating cDNA comprising the sequence encoding the two sgRNAs; fixing the cDNA in the perturbed cells; and in situ sequencing (ISS) the sequences comprising each sgRNA spacer and/or iBAR.
  • in situ sequencing comprises contacting the perturbed cells with padlock probes flanking each sgRNA spacer and/or iBAR; gap filling the padlock probes to capture each spacer and iBAR within the padlock probe; ligating the padlock probes into circular ssDNA templates; performing rolling circle amplification on the circular ssDNA templates; and sequencing the amplified cDNA using sequencing by synthesis to decode perturbations.
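The gap-fill capture step above can be sketched in silico: two padlock arms bind either side of the feature of interest, and the region between them is what gets copied into the circular template. This is a simplified string-level illustration that ignores strand orientation and hybridization chemistry; all sequences and the function name are hypothetical.

```python
def gapfill_region(template: str, upstream_flank: str, downstream_flank: str):
    """Return the sequence lying between two flanking arm-binding sites.

    Returns None if either flank is missing or the flanks are out of order,
    mimicking a padlock probe that cannot bind both arms on this template.
    """
    start = template.find(upstream_flank)
    if start == -1:
        return None
    gap_start = start + len(upstream_flank)
    gap_end = template.find(downstream_flank, gap_start)
    if gap_end == -1:
        return None
    return template[gap_start:gap_end]

# Hypothetical cDNA fragment: flank, captured feature, flank.
TEMPLATE = "TTTGGATCC" + "ACGTTGCA" + "GAATTCAAA"

if __name__ == "__main__":
    print(gapfill_region(TEMPLATE, "GGATCC", "GAATTC"))
```

With one probe per spacer or iBAR, the same function would be applied once per flank pair, giving one captured sequence per feature.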
  • the primers specific for each of the two sgRNAs for reverse transcription are biotinylated and the method further comprises a streptavidin incubation between reverse transcription and fixing the cDNA in the perturbed cells.
  • the perturbed cells are permeabilized.
  • a biological sample can be permeabilized by exposing the sample to one or more permeabilizing agents.
  • Suitable agents for this purpose include, but are not limited to, organic solvents (e.g., acetone, ethanol, and methanol), cross-linking agents (e.g., paraformaldehyde), detergents (e.g., saponin, Triton X-100™, Tween-20™, or sodium dodecyl sulfate (SDS)), and enzymes (e.g., trypsin, proteases (e.g., proteinase K)).
  • the population of cells is permeabilized with a detergent.
  • the plasma membranes of the plurality of cells are permeabilized with lower concentrations of commonly used detergents, such as saponin, Triton X-100™, Tween-20™, or sodium dodecyl sulfate (SDS). Saponin interacts with membrane cholesterol, selectively removing it and leaving holes in the membrane.
  • the detergent is non-ionic.
  • the detergent is an anionic detergent (e.g., SDS or N-lauroylsarcosine sodium salt solution).
  • the plurality of cells can be permeabilized using any of the detergents described herein (e.g., SDS and/or N-lauroylsarcosine sodium salt solution) before or after enzymatic treatment (e.g., treatment with any of the enzymes described herein, e.g., trypsin, proteases (e.g., pepsin and/or proteinase K)). Additional methods for sample permeabilization are described, for example, in Jamur et al., Methods Mol. Biol. 588:63-66, 2010, the entire contents of which are incorporated herein by reference.
  • the concentration of the detergent is sufficient to permeabilize the cells without denaturing proteins.
  • NP40, digitonin, or tween is used.
  • the concentration of detergent used herein may be from 0.005% to 1%, from 0.01% to 0.8%, from 0.01% to 0.6%, from 0.01% to 0.4%, from 0.01% to 0.2%, from 0.01% to 0.1%, from 0.005% to 0.05%, from 0.01% to 0.03%, from 0.015% to 0.025%, from 0.018% to 0.022%, from 0.015% to 0.017%, from 0.016% to 0.018%, from 0.017% to 0.019%, from 0.018% to 0.02%, from 0.019% to 0.021%, from 0.02% to 0.022%, or from 0.021% to 0.023%.
  • the concentration of the detergent may be about 0.01%, about 0.015%, about 0.02%, about 0.025%, or about 0.03%.
  • the concentration of the detergent may be about 0.02%.
  • SDS is used at concentrations below 0.5%, such as 0.1, 0.05, or less than 0.01%.
  • the perturbed cells are fixed.
  • Various fixing methods can be used.
  • fixing is accomplished by crosslinking.
  • Non-limiting methods of crosslinking are known in the art.
  • Fixation methods can be divided into two groups: additive and denaturing fixation.
  • Additive fixation solutions also called cross-linking fixations
  • Another group is the denaturing (or precipitating) fixations.
  • a cell may be fixed using chemicals such as formaldehyde, paraformaldehyde, glutaraldehyde, ethanol, methanol, acetone, acetic acid, or the like.
  • a cell may be fixed using Hepes-glutamic acid buffer-mediated organic solvent (HOPE).
  • the fixing comprises about 0.007% glutaraldehyde in about 4% paraformaldehyde.
  • perturbations encoded for in the vectors described herein are identified in single cells by in situ methods.
  • in situ sequencing is used to identify the barcodes (e.g., sgRNA spacer and/or iBARs) (see, e.g., Yue L, Liu F, Hu J, et al. A guidebook of spatial transcriptomic technologies, data resources and analysis approaches. Comput Struct Biotechnol J. 2023;21:940-955).
  • the in situ sequencing comprises any method of locally amplifying short sequences (e.g., sgRNA spacer and/or iBARS) and then imaging one nucleotide at a time.
  • rolling circle amplification comprising a 'padlock' probe that hybridizes on either side of a target sequence to form a circular template that can be copied repeatedly as a long string. Because the product is tethered to the template, it provides reliable localization and is amenable to in situ sequencing by successive rounds of ligation-based oligonucleotide probe incorporation (see, e.g., Ke R, Mignardi M, Pacureanu A, et al. In situ sequencing for RNA analysis in preserved tissue and cells. Nat Methods. 2013;10(9):857-860).
  • PLP padlock probes
  • the in situ sequencing comprises the simultaneous in situ sequencing of more than one of the two spacers and two iBARs.
  • simultaneous sequencing provides the following advantages: 1) to save sequencing cycles (i.e., read both iBARS simultaneously to get two bases of information from a cell in each in situ sequencing cycle, cutting the number of cycles needed in half) and/or 2) to detect errors in the perturbation sequences in a cell (e.g., spacer/barcode sequence accuracy problems and/or recombination).
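The cycle saving described above follows from simple counting: one base per cycle distinguishes 4 states, while two barcodes read in parallel distinguish 4 × 4 = 16 states per cycle. The sketch below computes the minimum number of cycles needed to uniquely label a library under the assumption that every possible barcode sequence is usable, with no margin reserved for error detection; the function name and library sizes are hypothetical.

```python
def cycles_needed(n_library_members: int, bases_per_cycle: int = 1) -> int:
    """Smallest number of cycles whose readout can distinguish every member.

    bases_per_cycle is 1 when a single barcode is read and 2 when two
    barcodes are read simultaneously as separate spots.
    """
    if n_library_members < 1 or bases_per_cycle < 1:
        raise ValueError("arguments must be positive")
    states_per_cycle = 4 ** bases_per_cycle
    cycles, capacity = 0, 1
    while capacity < n_library_members:
        capacity *= states_per_cycle
        cycles += 1
    return cycles

if __name__ == "__main__":
    for size in (4096, 5000):
        print(size, cycles_needed(size, 1), cycles_needed(size, 2))
```

For a library of 4,096 members this gives 6 cycles with one barcode and 3 with two. For sizes that are not exact powers of 16 the saving is close to, but not always exactly, half.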
  • cell morphological phenotypes are determined for multiplexed perturbations.
  • the methods described herein can be used to detect any phenotypes detectable by microscopy.
  • the phenotypes comprise cell morphology or biomolecule organization, including those detected by live cell markers, immunostaining, histological staining, or other similar methods.
  • the one or more additional phenotypes comprise any time resolved phenotype, such as, ion indicators, (e.g., calcium, sodium, magnesium, zinc, pH, and membrane potential indicators), voltage imaging, dynamic metabolite measurements, markers of cell stress, and/or cell migration.
  • a movie is taken of the population of cells after perturbation and before fixing.
  • live cell imaging is performed.
  • morphological features can be identified by cell painting (see, e.g., Bray MA, Singh S, Han H, et al. Cell Painting, a high-content image-based assay for morphological profiling using multiplexed fluorescent dyes. Nat Protoc. 2016;11(9):1757-1774; and Laber, et al., Discovering cellular programs of intrinsic and extrinsic drivers of metabolic traits using LipocyteProfiler, bioRxiv 2021.07.17.452050).
  • optical screens for gene expression are determined for perturbations (e.g., MERFISH; Moffitt JR, Hao J, Wang G, Chen KH, Babcock HP, Zhuang X. High-throughput single-cell gene-expression profiling with multiplexed error-robust fluorescence in situ hybridization. Proc Natl Acad Sci U S A. 2016;113(39):11046-11051).
  • perturbations are identified according to the present disclosure and combined with MERFISH.
  • kits containing any one or more of the elements discussed herein.
  • a kit may include any embodiment of multiplexed perturbation vectors, including a library of perturbation vectors capable of perturbing a plurality of gene targets.
  • a kit may include any embodiment of vectors.
  • kits may include primers and padlock probes specific to the vectors.
  • kits may be provided individually or in combinations, and may be provided in any suitable container, such as a vial, a bottle, or a tube.
  • the kit includes instructions in one or more languages, for example in more than one language.
  • a kit comprises one or more reagents for use in a process utilizing one or more of the elements described herein.
  • Reagents may be provided in any suitable container.
  • a kit may provide one or more reaction or storage buffers.
  • Reagents may be provided in a form that is usable in a particular process, or in a form that requires addition of one or more other components before use (e.g., in concentrate or lyophilized form).
  • Example 1 - CROPseq-multi for multiplexed perturbation and decoding in pooled genetic screens
  • pLentiGuide 17 -derived systems can enable multiplexing through the use of serial pol III (typically U6) promoters and SpCas9 sgRNAs 18,9,19 (FIG. 4A, FIG. 4B).
  • a major limitation of these multiplexing systems is the ~400 bp distance separating sgRNA spacers, leading to ~30% lentiviral recombination that results in unintended sgRNA combinations (Table 2) 18,9,19. If the screening methodology captures both sgRNA identities, these recombination events can be detected and filtered out.
  • lentiviruses are pseudodiploid and, in the process of infection, reverse transcription during minus-strand synthesis is prone to template switching in a homology and distance-dependent manner 20-22 (FIG. 4C, FIG. 4D).
  • the Big Papi vector seeks to minimize recombination with antiparallel orthogonal U6-sgRNAs, minimizing the distance separating spacers to under 200 bp and with intervening secondary barcodes, reducing recombination to about 7% 23,24 .
  • pLentiGuide-derived Cas12a systems capitalize on the native crRNA array processing ability of Cas12a enzymes 25,26, enabling a minimal separation of spacers by only 20 bp, likely reducing recombination to negligible levels 10,27,28,8.
  • a limitation of both Big Papi and Cas12a systems is their requirement for Cas effector enzymes other than SpCas9. Big Papi relies on the delivery of two Cas effectors, SpCas9 and SaCas9.
  • Cas12a enzymes are typically less active per guide 10,27,28, limited in guide design by relatively restrictive protospacer adjacent motifs, and either lagging in development or incompatible with applications including CRISPR-KO, CRISPRa, CRISPRi, base editing, and prime editing 29-31.
  • Designs similar to Big Papi but utilizing two orthogonal SpCas9 guide scaffolds have been implemented to eliminate the requirement for SaCas9 28 .
  • all current multiplexing solutions lack compatibility with mRNA barcoding that is required for some screening modalities, including some RNA-sequencing and in situ detection workflows.
  • NGS Fluorescence Activated Cell Sorting
  • qPCR quantitative polymerase chain reaction
  • LentiGuide-Barcode (LentiGuideBC) 1 vectors and similar designs (Perturb-seq 32, MOSAIC-seq 33, CRISP-seq 34) typically sacrifice multiplexing capability in pooled screens for the ability to express a secondary barcode in mRNA (FIG. 1A, FIG. 4E).
  • these vector designs are not fundamentally incompatible with multiplexing; however, constructing libraries with at least 3 distal designed sequence elements, e.g., two sgRNAs and a barcode, is challenging and to date has only been performed via multi-step arrayed cloning 18, which is impractical for high throughput pooled screens.
  • the secondary barcode is separated from the sgRNA spacer by at least 1,700 bp encoding the pol II promoter and resistance gene, resulting in lentiviral recombination near the theoretical maximum of 50% 20.
  • this recombination contributes to a major loss of statistical power 20 .
  • Efforts to shorten this distance by moving the U6-sgRNA downstream of the pol II promoter and resistance gene have resulted in poor guide activity 20 , potentially due to transcriptional interference by the pol II promoter 20,35 .
  • Co-packaging integration-deficient templates as a means to mitigate recombination has also been described, albeit at the cost of ~100-fold reduction in lentiviral titer 21.
  • the CROPseq vector offers a solution to mRNA-barcoding without lentiviral recombination 3? (FIG. 1B).
  • LTR long terminal repeat
  • the CROPseq design leverages the high-fidelity intramolecular duplication of the 3’ LTR to the 5’ end of the lentiviral genome during transduction (FIG. 5A). This duplication results in two copies of the sgRNA, with the 5’ duplicate expressing functional sgRNAs without transcriptional interference and the 3’ version transcribed as mRNA in the 3’ UTR of the pol II-transcribed selection gene, compatible with mRNA detection approaches.
  • tRNAs recruit endogenous RNAse P and Z to cleave the tRNA at the 5’ and 3’ ends, such that, when positioned between sgRNAs within a single transcript, the transcript is processed into separate, functional sgRNAs.
  • each tRNA eliminates the requirement for a 5’ guanine base on the following guide that is otherwise required in U6 transcription systems and is often encoded as a mismatched 20th or 21st base of the spacer. This design increased the size of the 3’ LTR insertion from 352 bp (CROPseq) to 643 bp (FIG. 5A).
  • tRNA-encoding sequences could be processed out of mRNA encoding either the lentiviral genome or the selection gene by the endogenous RNases
  • the orientation of the elements within the 3’ LTR was reversed (FIG. 1C, FIG. 5A).
  • the pol III promoter (U6) initiates transcription on the opposite strand from the pol II promoter (EF1a) in the integrated lentivirus of FIG. 1C.
  • the transcripts from the pol III promoter include functional tRNA sequences.
  • the vectors of FIG. 1A and FIG. 1B are not operable if tRNA sequences are added because the transcriptional elements are on the same strand.
  • the CROPseq-inspired multiplexing solution has now been termed CROPseq-multi herein (FIG. 1C, FIG. 5A).
  • tRNAs are not required. 3' LTR-embedded, antisense crRNA arrays have additionally been used for Cas12.
  • the mRNA can be used for barcoding because the mRNA transcribed from the pol II promoter is not cleaved (i.e., because the crRNA array is antisense).
  • Cas12 systems do not require tRNAs because Cas12 itself processes the crRNA arrays.
  • Cas13 arrays can be used for programmable perturbation of RNA. Cas13 systems do not require tRNAs because Cas13 itself processes the crRNA arrays.
  • Linked barcodes are advantageous for designing sequences with maximum orthogonality, thus minimizing the sequence length needed to uniquely identify library members, for representing pairs of guides that may not be individually unique, and for other applications such as clonal barcoding.
  • because linked barcodes are prone to distance-dependent recombination, the iBAR system is attractive for the placement of barcodes within the synthetic loop that joins the crRNA and tracrRNA into a sgRNA, or only 19 bp from the spacer in the design (FIG.
  • because iBARs are transcribed both antisense as mRNA and within the sgRNA scaffold, their detection should be compatible with both mRNA-based 1,18,32,34 and direct-capture 9 (U6 product) protocols.
  • tRNA G - tRNA-Gly-GCC-2-1 leader tRNA
  • GCATTGGTGGTTCAGTGGTAGAATTCTCGCCTGCCACGCGGGAGGCCCGGGTTCGATTCCCGGCCAATGCA (SEQ ID NO: 18)
  • in situ detection of barcodes involves fixation and permeabilization of cells, reverse transcription of barcoded mRNA to cDNA, fixation of the cDNA, copying the barcode sequence into a padlock probe, ligation of the padlock into a circular ssDNA template, rolling circle amplification, and sequencing by synthesis to decode perturbations 1,4 (FIG. 3A).
  • Robust mRNA expression is required for efficient detection.
  • Because sequencing reagent costs and imaging time impact screening throughput, minimizing the number of sequencing cycles (i.e., barcode bases) necessary to uniquely identify perturbations is desirable.
  • Padlock probes flanking each spacer and iBAR were designed such that both features are captured within the gapfill (FIG. 6A).
  • the detection efficiency of the first spacer and iBAR pair was optimized. With the standard protocol, detection efficiencies (reads per cell) were low for the CROPseq-multi iBAR, averaging 1.3 reads per cell, compared to an average of 2.9 reads per cell for the CROPseq spacer (FIG. 3B).
  • Two protocol changes were employed to improve detection efficiencies for CROPseq-multi (FIG. 3A).
  • the primary fixation was altered by adding 0.007% glutaraldehyde to the standard 4% PFA fixative.
  • cDNA retention was optimized by using a biotinylated reverse transcription primer and adding a streptavidin incubation between the reverse transcription and cDNA fixation steps (FIG. 3A).
  • the optimized cDNA retention alone improved detection efficiency to an average 7.5 reads per cell for CROPseq-multi (FIG. 3B).
  • the optimized primary fixation alone did not improve detection, but in combination with the optimized cDNA retention, further improved detection efficiency to an average 18.8 reads per cell (FIG. 3B).
  • An additional feature of the CROPseq-multi design is that the use of two barcodes could facilitate more efficient decoding of perturbations. With simultaneous detection of both barcodes as separate reads, or multiplexed detection, a total of two nucleotides of a barcode pair can be decoded per sequencing cycle (one nucleotide from each barcode per cycle). Further, the detection of both barcodes would enable identification and filtering of lentiviral recombination events. Of note, this strategy is dependent on the ability to reliably detect both barcodes in each cell, which should be facilitated by the optimized detection protocol.
  • Multiplexed detection of iBARs 1 and 2 was performed with the optimized protocol disclosed herein, and a mean of 27.7 total reads per cell was observed (FIG. 3B). Without wishing to be bound by theory, multiplexed detection could impact per-barcode detection efficiencies due to optical crowding at high read densities. Detection of iBAR 1 and iBAR 2 was performed both individually and multiplexed, and only modestly lower per-barcode detection efficiencies were observed when multiplexed (FIG. 3C, FIG. 3D).
  • the stringency of read assignment to cells was increased by varying the required minimum read counts per iBAR, by rationale that cells with few reads for either iBAR might be the result of deletion recombination events, silencing of the lentiviral transgene, or incomplete selection, and, together with imperfect cell segmentation, could appear as false-positive pair-swap events (FIG. 3E). While the in situ detected pair-swap rate was modestly higher for both arrayed and pooled lentiviral preparations compared to NGS measured pair-swap rates of the same samples, increasing the read count stringency for iBAR assignment brought the two measurements closer to agreement (FIG. 3E). In a screening context, the primary goal would likely be filtering out all such low-confidence assignments and incorrect pairings, so the distinction between pair-swap events and these potential modes of false-positives is less important.
  • a single additional sequencing cycle with multiplexed decoding is sufficient to detect >95% of recombination events, corresponding to a roughly 3-fold decrease in cycle number relative to decoding with spacer sequences (FIG. 3F).
  • decoding a genome-wide CROPseq library with 4 guides per gene (~80,000 vectors) would require sequencing all 20 cycles of the spacer.
  • With CROPseq-multi, an equivalent number of vectors, encoding twice as many guides per gene, could be decoded with only 6 cycles of sequencing while detecting recombination events.
  • multiplexed decoding reduces the sequencing reagent costs by the same factor.
  • CROPseq-multi is a generalized multiplexing solution that addresses numerous technical challenges in multiplexed screens, including robust and equal guide activity, minimized lentiviral recombination, and compatibility with mRNA-barcoding and in situ readout. Both single-perturbation and combinatorial screens can be enabled by CROPseq-multi. In single-target screens, activity with two guides against the same target per vector typically achieves superior on-target performance with smaller library sizes, and optical pooled screens will benefit from superior detection and improved decoding efficiency. For combinatorial screens, CROPseq-multi offers a solution with minimal lentiviral recombination without diverging from the most highly-developed SpCas9-based systems.
  • Example 2 - Inclusion of T7 promoter in CROPseq and CROPseq-multi for multiplexed perturbation and decoding in pooled genetic screens
  • Building upon the vectors generated in Example 1, engineering of a lentiviral system compatible with an in vitro transcription readout was pursued. Accordingly, taking the vectors generated in Example 1, a T7 promoter was further included, thereby generating a vector which has been herein termed CROPseq-multi-T7 (FIG. 7A). Inclusion of the T7 promoter is illustrated in the alignment shown in FIG. 8.
  • the 3' LTR is duplicated during lentiviral integration.
  • the duplicated multiplexing cassette generates functional sgRNAs via pol III transcription and endogenous tRNA processing (see FIG. 4E, FIG. 7A).
  • the inclusion of the T7 promoter in the CROPseq-multi-T7 vector further enables optional in vitro transcription detection that is independent of endogenous transcription activity (FIG. 7A). While in situ amplification and detection yields distinct punctate "colonies,” in vitro detection yields a single large nuclear focus (FIG. 7B).
  • multiplexed perturbation vectors compatible with in vitro transcription readouts enhance pooled genetic screening decoding strategies.
  • FIG. 9C and FIG. 9D illustrate the enhanced capabilities of CROPseq-T7 and CROPseq-multi-T7 vectors in multiplex screening.
  • the CROPseq-multi-T7 vector allows for quantification of recombination events using in vitro transcription sequencing and in situ sequencing.
  • the detection efficiency for iBAR1 was 80%, while the detection efficiency for iBAR2 was 96% (FIG. 10).
  • the signal intensity for iBAR2 is greater, as expected given proximity to the T7 promoter (FIG. 10).
  • lithium borohydride (LiBH4) treatment is effective in removing fluorescent signals from in situ sequencing (ISS) reagents and is also compatible with subsequent sequencing.
  • FIG. 12, left panel, shows cycle 1 signal from ISS of RPE1-hTERT cells before LiBH4 treatment (i.e., pre-treatment).
  • T7 in vitro transcription protocol was adapted from previous protocols (Kudo, T. et al. Highly multiplexed, image-based pooled screens in primary cells and tissues with PerturbView. 2023.12.26.573143 and Fandrey, C. I. et al. Cell Type-Agnostic Optical Perturbation Screening Using Nuclear In-Situ Sequencing (NIS-Seq). 2024.01.18.576210). Samples were fixed in 4% (v/v) formaldehyde (Electron Microscopy Sciences 15714) in 1X PBS (Ambion AM9625) for 30 minutes at room temperature, then washed twice in PBS.
  • Cells were permeabilized in 70% (v/v) ethanol for 30 minutes at room temperature. After permeabilization, the 70% ethanol solution was diluted with three 75% buffer exchanges of PBS-T, followed by two washes with PBS-T. Reverse crosslinking was performed by incubating samples at 65 °C for 4 hours in 0.1 M sodium bicarbonate and 0.3 M NaCl in water. Samples were washed three times with PBS-T. As an alternative to formaldehyde fixation and reverse crosslinking, fixation and permeabilization were validated in 3:1 (v/v) methanol and acetic acid for 20 minutes, followed by two washes with PBS-T, as previously described.
  • the reverse transcription solution was prepared with the following composition: 1X RevertAid RT buffer (Thermo Fisher Scientific EP0452), 250 µM dNTPs (New England Biolabs N0447L), 1 µM each biotinylated reverse transcription primer (Integrated DNA Technologies), 200 µg/mL molecular biology grade recombinant albumin (rAlbumin) (New England Biolabs B9200S), 0.8 U/µL RiboLock RNase inhibitor (Thermo Fisher Scientific EO0384), and 4.8 U/µL RevertAid H minus Reverse Transcriptase (Thermo Fisher Scientific EP0452). Samples were incubated in reverse transcription solution for 16 hours at 37 °C.
  • the gapfill and ligation solution was composed of 1X Ampligase buffer (Lucigen A3210K), 50 nM dNTPs (New England Biolabs N0447L), 0.1 µM each padlock probe, 200 µg/mL rAlbumin (New England Biolabs B9200S), 0.4 U/µL RNase H (Enzymatics Y9220L), 0.02 U/µL TaqIT polymerase (Enzymatics P7620L), and 0.5 U/µL Ampligase (Lucigen A3210K).
  • RCA solution was composed of 1X Phi29 buffer (Thermo Fisher Scientific EP0091), 5% (v/v) glycerol (MilliporeSigma G5516), 250 µM dNTPs (New England Biolabs N0447L), 200 µg/mL rAlbumin (New England Biolabs B9200S), and 1 U/µL Phi29 DNA polymerase (Thermo Fisher Scientific EP0091).
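The decoding figures quoted in the items above (20 spacer cycles versus 6 multiplexed cycles for roughly 80,000 vectors) can be checked with a short sketch. It assumes an idealized barcode set with no error-correcting redundancy, and it reads the sixth cycle as the "single additional sequencing cycle" used for recombination detection; both are interpretations rather than statements from the disclosure. The function names are hypothetical.

```python
def min_barcode_bases(n_members: int) -> int:
    """Smallest k such that 4**k >= n_members (four nucleotides per position)."""
    k = 0
    while 4 ** k < n_members:
        k += 1
    return k


def cycles_needed(n_members: int, bases_per_cycle: int = 1, extra_cycles: int = 0) -> int:
    """Sequencing cycles to read enough bases, plus optional extra cycles.

    bases_per_cycle=2 models multiplexed decoding of two barcodes
    (one nucleotide from each barcode per cycle).
    """
    bases = min_barcode_bases(n_members)
    cycles = -(-bases // bases_per_cycle)  # ceiling division
    return cycles + extra_cycles


library_size = 80_000                      # approximate genome-wide library size from the text
spacer_cycles = 20                         # full spacer length, as stated in the text
multiplexed = cycles_needed(library_size, bases_per_cycle=2, extra_cycles=1)

print(min_barcode_bases(library_size))     # 9 bases are the theoretical minimum
print(multiplexed)                         # 6 cycles under the assumptions above
print(round(spacer_cycles / multiplexed, 1))  # about 3.3-fold fewer cycles
```

Under these assumptions the result agrees with the 6 cycles and the roughly 3-fold reduction reported above; a real library would typically add bases for error tolerance.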

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Organic Chemistry (AREA)
  • Genetics & Genomics (AREA)
  • Engineering & Computer Science (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Biotechnology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Analytical Chemistry (AREA)
  • Microbiology (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Biochemistry (AREA)
  • Physics & Mathematics (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Immunology (AREA)
  • Plant Pathology (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The present disclosure relates to a CROPseq-multi vector for multiplexed perturbation and decoding in pooled genetic screens. The present disclosure also relates to multiplexed perturbation and decoding for optical pooled screens with the CROPseq-multi vector. The CROPseq-multi vector provides a universal solution for multiplexed perturbation and decoding in pooled screens.

Description

MULTIPLEXED PERTURBATION AND DECODING IN POOLED GENETIC SCREENS
CROSS REFERENCE TO RELATED APPLICATIONS
[0001] The present application claims the benefit of priority to U.S. Provisional Application No. 63/549,322, filed on February 2, 2024, and U.S. Provisional Application No. 63/659,465, filed June 13, 2024. The entire contents of the aforementioned patent applications are incorporated herein by this reference.
STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH
[0002] This invention was made with government support under Grant No. CA264422 awarded by the National Institutes of Health. The government has certain rights in the invention.
TECHNICAL FIELD
[0003] The subject matter disclosed herein is generally directed to lentiviral vectors for use in multiplexing genetic perturbations and decoding the multiplexed genetic perturbations in pooled single cell assays.
SEQUENCE LISTING
[0004] The instant application contains a Sequence Listing which has been filed electronically in eXtensible Markup Language format and is hereby incorporated by reference in its entirety. Said XML file, created on January 29, 2025, is named BN00007_2262_BIl 1174.xml and is 68,939 Bytes in size.
BACKGROUND
[0005] Combinatorial screens seek to identify genetic interactions, or phenotypes that arise from an interaction of genetic components. Most pooled screens, including combinatorial screens, achieve perturbation of genetic components through lentiviral delivery of Cas9 single guide RNA (sgRNA). Lentiviral delivery systems are near-ubiquitous across in vitro pooled screens conducted today for their ability to efficiently deliver perturbations to a wide range of cell types and retain a record of the perturbation identity as the lentiviral genome integrated in the host-cell genome. Combinatorial screens can leverage either standard single-perturbation or multi-perturbation (multiplexing) lentiviral vectors. In principle, single-perturbation vectors enable only random sampling in pooled screens, via either high multiplicity of infection (with limited control of multiplicity) or serial transduction and selection. However, random sampling is limited by an immense combinatorial space, which scales as (n choose m), for n targets and combinations of m perturbations per cell. For example, pairwise interactions of only 200 gene targets results in 19,900 unique combinations, roughly equivalent in scale to a standard genome-wide screen. Alternatively, multiplexing vectors enable the delivery of multiple perturbations on a single vector. In addition to enabling an exhaustive search of the combinatorial space, multiplexing vectors allow the perturbation of a biologically informed subset of target combinations. In such screens, the size of multiplexed vector libraries scales linearly with the number of selected target combinations.
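The scaling of the combinatorial space described in paragraph [0005] can be reproduced with a few lines of Python. This is only an illustrative sketch; the function name `combinatorial_space` is hypothetical and not part of the disclosure.

```python
from math import comb


def combinatorial_space(n_targets: int, m_per_cell: int) -> int:
    """Number of unordered combinations of m perturbations drawn from n targets."""
    return comb(n_targets, m_per_cell)


# Pairwise interactions among 200 gene targets, as in the example in the text.
print(combinatorial_space(200, 2))  # 19900

# Moving from pairs to triples of the same 200 targets grows the space sharply.
print(combinatorial_space(200, 3))  # 1313400
```

The second call shows why exhaustive random sampling becomes impractical as m increases, which motivates programmed multiplexing vectors for a selected subset of combinations.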
[0006] Multiplexed perturbation lentiviral vectors additionally offer several advantages over single-plex solutions for single-target screens. The use of multiple guides targeting the same gene has been shown to improve on-target activity in both knockout and interference screens. As both guides must target the same gene, these approaches require programmed spacer combinations, not random pairings. Despite improvements in algorithms for guide design, the selection of active and specific guides remains challenging and pairing guides can increase on-target performance. Pairing guides also combines the off-target risk of multiple guides, necessitating the use of multiple vectors with distinct guides for a given target.
[0007] Accordingly, a need exists for improved methods and compositions for multiplexing genetic perturbations and decoding the multiplexed genetic perturbations in pooled single cell assays.
SUMMARY
[0008] In one aspect, the disclosure provides a viral vector for multiplexed perturbation screens, including: a cassette operably connected to a pol II promoter, the cassette comprising at least two guide molecules; a cleavable sequence; a pol III promoter; and a 3' long terminal repeat (LTR), where the at least two guide molecules are separated by the cleavable sequence, the at least two guide molecules, the cleavable sequence, and the pol III promoter are located within the 3' LTR and encoded on the minus strand, and the at least two guide molecules are transcribed in a single transcript, and where the pol II promoter is positioned upstream of the 3’ LTR.
[0009] In some embodiments, the cleavable sequence is a tRNA leader sequence.
[0010] In some embodiments, the cassette comprises one or more additional cleavable sequences.
[0011] In some embodiments, the at least two guide molecules are present in an array cleavable by a CRISPR-Cas polypeptide.
[0012] In some embodiments, the CRISPR-Cas polypeptide is Cas12.
[0013] In some embodiments, the CRISPR-Cas polypeptide is Cas13.
[0014] In some embodiments, the cassette further comprises an internal barcode (iBAR) and spacer within each of the at least two guide molecules, wherein each iBAR is unique to each of the at least two guide molecules.
[0015] In some embodiments, the iBAR is within a loop joining a crRNA and a tracrRNA of each guide molecule.
[0016] In some embodiments, the cassette further comprises an exogenous promoter.
[0017] In some embodiments, the exogenous promoter of the cassette is upstream of the at least two guide molecules, the cleavable sequence, and the pol III promoter.
[0018] In some embodiments, the orientation of the exogenous promoter is antisense relative to the Pol III promoter.
[0019] In some embodiments, the exogenous promoter is a T7 promoter, a T3 promoter, or a SP6 promoter.
[0020] In some embodiments, the viral vector comprises a 5’ LTR promoter.
[0021] In some embodiments, the 5’ LTR promoter is a CMV promoter.
[0022] In some embodiments, orthogonal tRNA sequences are used in the viral vector.
[0023] In some embodiments, the viral vector further includes orthogonal guide molecule scaffolds.
[0024] In some embodiments, the at least two guide molecules target at least two or more different sequences of a gene.
[0025] In some embodiments, the viral vector further encodes a CRISPR-Cas polypeptide.
[0026] In some embodiments, the CRISPR-Cas polypeptide is a Cas9.
[0027] In some embodiments, the CRISPR-Cas polypeptide is a Cas12.
[0028] In some embodiments, the CRISPR-Cas polypeptide is a Cas13.
[0029] In some embodiments, the viral vector is a lentiviral vector.
[0030] In an aspect, the disclosure provides a method of performing multiplexed pooled perturbation screening, including the steps of: a) introducing one or more perturbation vectors according to any one of claims 1 to 22 to a population of cells to obtain perturbed cells; and b) performing single cell RNA sequencing on the perturbed cells, whereby pol II transcripts comprising the at least two guide molecules encoded for in each vector are sequenced with cellular RNAs from the perturbed cells.
[0031] In an aspect, the disclosure provides a method of performing multiplexed pooled optical perturbation screening, including the steps of: a) introducing one or more perturbation vectors according to any one of claims 1 to 22 to a population of cells to obtain perturbed cells; b) fixing and permeabilizing the perturbed cells; c) reverse transcribing mRNA transcribed from the pol II promoter encoded within the viral vector in the perturbed cells using primers specific for each of the at least two guide molecules, thereby generating cDNA comprising at least the spacer sequence and iBAR of each of the at least two guide molecules; d) fixing the cDNA in the perturbed cells; and e) in situ sequencing (ISS) the sequences comprising the spacer and/or iBAR of each of the at least two guide molecules.
[0032] In an aspect, the disclosure provides a method of performing multiplexed pooled optical perturbation screening, including the steps of: a) introducing one or more viral vector(s) according to any one of claims 9-12 to a population of cells to obtain perturbed cells; b) fixing and permeabilizing the perturbed cells, transcribing mRNA of the viral vector(s) in the perturbed cells from the exogenous promoter of the cassette(s) of the viral vector(s), thereby generating barcoded RNA; c) reverse transcribing mRNA transcribed from the pol II promoter encoded within the viral vector in the perturbed cells using primers specific for each of the at least two guide molecules, thereby generating cDNA comprising at least the spacer sequence and iBAR of each of the at least two guide molecules; d) fixing the cDNA in the perturbed cells; and e) in situ sequencing (ISS) the sequences comprising the spacer and/or iBAR of each of the at least two guide molecules, thereby performing multiplexed pooled optical perturbation screening.
[0033] In some embodiments, the fixing in step (b) is performed using about 0.007% glutaraldehyde in about 4% paraformaldehyde.
[0034] In some embodiments, the primers are biotinylated and the method further comprises a streptavidin incubation between the reverse transcription and fixing step.
[0035] In some embodiments, the in situ sequencing (ISS) comprises: a) contacting the perturbed cells with padlock probes that flank the spacer and/or iBAR of each of the at least two guide molecules; b) gap filling the padlock probes to capture each spacer and iBAR within the padlock probe; c) ligating the padlock probes into circular ssDNA templates; d) performing rolling circle amplification on the circular ssDNA templates to generate amplified cDNA; and e) sequencing the amplified cDNA using sequencing by synthesis to decode perturbations.
[0036] In some embodiments, the in situ sequencing step comprises in situ sequencing of a combination of the at least two spacers and at least two iBARs simultaneously.
[0037] In some embodiments, the methods further include treating the population of cells with a lithium borohydride (LiBH4) solution.
[0038] In some embodiments, the LiBH4 treatment step occurs prior to in situ sequencing.
[0039] In some embodiments, the LiBH4 treatment step occurs subsequent to in situ sequencing.
[0040] In some embodiments, the one or more additional cleavable sequences are tRNA leader sequences.
[0041] In some embodiments, the one or more additional cleavable sequences are positioned upstream of the at least two guide molecules.
[0042] In some embodiments, the cleavable sequence and one of the one or more additional cleavable sequences are positioned on either side of at least one of the at least two guide molecules.
[0043] In one aspect, the present disclosure provides a viral vector for multiplexed perturbation screens, said viral vector including: a cassette including two or more guide molecules encoded for in a single transcript within a 3’ long terminal repeat (LTR) of a viral genome, wherein the two or more guide molecules are separated by a cleavable sequence, wherein the two or more guide molecules are operably linked to a pol III promoter encoded for within the 3’ LTR; wherein the cassette is operably linked to a pol II promoter encoded upstream of the 3’ LTR; and wherein the orientation of the cassette is in an antisense orientation relative to the lentiviral genome. In certain embodiments, the cleavable sequence of the cassette is a tRNA leader sequence, wherein the two or more guide molecules are separated by a tRNA leader sequence. In certain embodiments, each guide molecule of the cassette comprises a tRNA leader sequence. In certain embodiments, the two or more guide molecules are present in an array cleavable by a CRISPR-Cas polypeptide. In certain embodiments, the CRISPR-Cas polypeptide is Cas12. In certain embodiments, the CRISPR-Cas polypeptide is Cas13. In certain embodiments, the cassette of the viral vector further includes an internal barcode (iBAR) within each guide molecule, wherein each iBAR is unique to each guide molecule. In certain embodiments, the iBAR is within a loop joining a spacer and scaffold of each guide molecule. In some embodiments, the iBAR is within a loop joining a crRNA and a tracrRNA of each guide molecule. In some embodiments, the cassette further includes an exogenous promoter. Optionally, the exogenous promoter is upstream of the two or more guide molecules, the cleavable sequence, and the pol III promoter. Optionally, the orientation of the exogenous promoter is sense relative to the lentiviral genome. In further embodiments, the orientation of the exogenous promoter is antisense relative to the Pol III promoter. In certain embodiments, the exogenous promoter is a T7 promoter, a T3 promoter, or a SP6 promoter. In certain embodiments, the viral vector includes a 5’ LTR promoter. In certain embodiments, the 5’ LTR promoter of the viral vector is a CMV promoter. In certain embodiments, orthogonal tRNA sequences are used in the viral vector. In certain embodiments, orthogonal guide molecule scaffolds are used in the viral vector. In certain embodiments, the two or more guide molecules target a different sequence of a same gene. In certain embodiments, the viral vector encodes for a CRISPR-Cas polypeptide. In certain embodiments, the CRISPR-Cas polypeptide is a Cas9. In certain embodiments, the CRISPR-Cas polypeptide is a Cas12. In certain embodiments, the CRISPR-Cas polypeptide is a Cas13. In certain embodiments, the viral vector is a lentiviral vector.
[0044] In another aspect, the present disclosure provides a method of performing multiplexed pooled perturbation screening involving introducing one or more perturbation vectors according to any embodiment herein to a population of cells to obtain perturbed cells, optionally wherein the population of cells expresses a CRISPR-Cas polypeptide; and performing single cell RNA sequencing on the perturbed cells, whereby pol II transcripts that include the two or more guide molecules encoded for in each vector are sequenced with cellular RNAs from the perturbed cells.
[0045] In another aspect, the present disclosure provides a method of performing multiplexed pooled optical perturbation screening involving introducing one or more perturbation vectors according to any embodiment herein to a population of cells to obtain perturbed cells, optionally the population of cells expresses a CRISPR-Cas polypeptide; fixing and permeabilizing the perturbed cells; reverse transcribing mRNA transcribed from the pol II promoter encoded within the viral vector in the perturbed cells using primers specific for each of the two guide molecules, thereby generating cDNA comprising at least the spacer sequence and iBAR of each guide molecule; fixing the cDNA in the perturbed cells; and in situ sequencing (ISS) the sequences of each guide molecule spacer and/or iBAR.
[0046] Another aspect of the disclosure provides a method for performing multiplexed pooled optical perturbation screening involving: introducing one or more viral vector(s) according to the disclosure to a population of cells to obtain perturbed cells, optionally wherein the population of cells expresses a CRISPR-Cas polypeptide; fixing and permeabilizing the perturbed cells; transcribing mRNA of the viral vector(s) in the perturbed cells from the exogenous promoter of the viral vector(s), thereby generating barcoded RNA; reverse transcribing mRNA transcribed from the pol II promoter encoded within the viral vector in the perturbed cells using primers specific for each of the two guide molecules, thereby generating cDNA comprising at least the spacer sequence and iBAR of each guide molecule; fixing the cDNA in the perturbed cells; and in situ sequencing (ISS) the sequences of each guide molecule spacer and/or iBAR, thereby performing multiplexed pooled optical perturbation screening. In certain embodiments, the fixing is performed via use of about 0.007% glutaraldehyde in about 4% paraformaldehyde. In certain embodiments, the primers for reverse transcription are biotinylated and the method further involves a streptavidin incubation between reverse transcription and fixing the cDNA in the perturbed cells. In certain embodiments, in situ sequencing (ISS) involves: contacting the perturbed cells with padlock probes flanking each sgRNA spacer and/or iBAR; gap filling the padlock probes to capture each spacer and iBAR within the padlock probe; ligating the padlock probes into circular ssDNA templates; performing rolling circle amplification on the circular ssDNA templates; and sequencing the amplified cDNA using sequencing by synthesis to decode perturbations. In certain embodiments, the in situ sequencing involves sequencing a combination of the two spacers and two iBARs simultaneously.
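When both iBARs of a vector are sequenced simultaneously, the decoded pair can be compared against the designed library to flag cells whose barcodes do not belong together. The following is a minimal sketch of that idea, assuming per-cell lists of decoded barcode reads and a known set of designed (iBAR 1, iBAR 2) pairs. The function name, the labels, the `min_reads` threshold, and the example barcodes are all hypothetical and are not taken from the disclosure.

```python
from collections import Counter


def classify_cell(reads_ibar1, reads_ibar2, library_pairs, min_reads=2):
    """Classify one cell from its decoded iBAR 1 and iBAR 2 reads.

    Returns "low_confidence" if either barcode has too few supporting reads,
    "matched" if the most common barcodes form a designed pair, and
    "pair_swap" otherwise (a candidate recombination event).
    """
    def most_common(reads):
        if not reads:
            return None, 0
        return Counter(reads).most_common(1)[0]

    barcode1, count1 = most_common(reads_ibar1)
    barcode2, count2 = most_common(reads_ibar2)

    # Require a minimum read count per barcode before trusting the call.
    if count1 < min_reads or count2 < min_reads:
        return "low_confidence"
    if (barcode1, barcode2) in library_pairs:
        return "matched"
    return "pair_swap"


# Hypothetical two-member library with made-up barcode sequences.
library = {("ACGT", "TTAG"), ("GGCA", "CATC")}

print(classify_cell(["ACGT"] * 5, ["TTAG"] * 4, library))  # matched
print(classify_cell(["ACGT"] * 5, ["CATC"] * 4, library))  # pair_swap
print(classify_cell(["ACGT"] * 5, ["TTAG"], library))      # low_confidence
```

Raising `min_reads` trades retained cells for stricter assignments; in a screening setting both the low-confidence and pair-swap classes would typically be filtered out.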
[0047] These and other aspects, objects, features, and advantages of the example embodiments will become apparent to those having ordinary skill in the art upon consideration of the following detailed description of example embodiments.
Definitions
[0048] Unless defined otherwise, technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure pertains. Definitions of common terms and techniques in molecular biology may be found in Molecular Cloning: A Laboratory Manual, 2nd edition (1989) (Sambrook, Fritsch, and Maniatis); Molecular Cloning: A Laboratory Manual, 4th edition (2012) (Green and Sambrook); Current Protocols in Molecular Biology (1987) (F.M. Ausubel et al. eds.); the series Methods in Enzymology (Academic Press, Inc.); PCR 2: A Practical Approach (1995) (M.J. MacPherson, B.D. Hames, and G.R. Taylor eds.); Antibodies, A Laboratory Manual (1988) (Harlow and Lane, eds.); Antibodies: A Laboratory Manual, 2nd edition (2013) (E.A. Greenfield ed.); Animal Cell Culture (1987) (R.I. Freshney, ed.); Benjamin Lewin, Genes IX, published by Jones and Bartlett, 2008 (ISBN 0763752223); Kendrew et al. (eds.), The Encyclopedia of Molecular Biology, published by Blackwell Science Ltd., 1994 (ISBN 0632021829); Robert A. Meyers (ed.), Molecular Biology and Biotechnology: a Comprehensive Desk Reference, published by VCH Publishers, Inc., 1995 (ISBN 9780471185710); Singleton et al., Dictionary of Microbiology and Molecular Biology 2nd ed., J. Wiley & Sons (New York, N.Y. 1994); March, Advanced Organic Chemistry Reactions, Mechanisms and Structure 4th ed., John Wiley & Sons (New York, N.Y. 1992); and Marten H. Hofker and Jan van Deursen, Transgenic Mouse Methods and Protocols, 2nd edition (2011).
[0049] As used herein, the singular forms “a,” “an,” and “the” include both singular and plural referents unless the context clearly dictates otherwise.
[0050] Unless specifically stated or obvious from context, as used herein, the term “about” is understood as within a range of normal tolerance in the art, for example within 2 standard deviations of the mean. About can be understood as within 50%, 45%, 40%, 35%, 30%, 25%, 20%, 15%, 10%, 9%, 8%, 7%, 6%, 5%, 4%, 3%, 2%, 1%, 0.5%, 0.1%, 0.05%, or 0.01% of the stated value. Unless otherwise clear from context, all numerical values provided herein are modified by the term about.
[0051] By “control” is meant a standard or reference condition.
[0052] By “fragment” is meant a portion of a polypeptide or nucleic acid molecule. This portion contains, preferably, at least 5%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, or 90% of the entire length of the reference nucleic acid molecule or polypeptide. A fragment may contain 5, 10, 20, 30, 40, 50, 60, 70, 80, 90, or 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000 or more nucleotides or amino acids.
[0053] The term “optional” or “optionally” means that the subsequent described event, circumstance or substituent may or may not occur, and that the description includes instances where the event or circumstance occurs and instances where it does not.
[0054] The terms “individual,” “patient” or “subject” refers to an animal, or tissues or cells thereof, which is the object of treatment, observation, or experiment. By way of example only, a subject includes, but is not limited to, a mammal, including, but not limited to, a human or a nonhuman mammal, such as a non-human primate, bovine, equine, canine, ovine, or feline. The terms “individual,” “patient” or “subject” are used interchangeably herein. Mammals may also include, but are not limited to, murines, simians, humans, farm animals, sport animals, and pets. Tissues, cells and their progeny of a subject obtained in vivo or cultured in vitro are also encompassed.
[0055] Ranges provided herein are understood to be shorthand for all of the values within the range. For example, a range of 1 to 50 is understood to include any number, combination of numbers, or sub-range from the group consisting of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, or 50, as well as all intervening decimal values between the aforementioned integers such as, for example, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, and 1.9. With respect to sub-ranges, “nested sub-ranges” that extend from either end point of the range are specifically contemplated. For example, a nested sub-range of an exemplary range of 1 to 50 may comprise 1 to 10, 1 to 20, 1 to 30, and 1 to 40 in one direction, or 50 to 40, 50 to 30, 50 to 20, and 50 to 10 in the other direction. The recitation of numerical ranges by endpoints includes all numbers and fractions subsumed within the respective ranges, as well as the recited endpoints.
[0056] As used herein, a “biological sample” may contain whole cells and/or live cells and/or cell debris. The biological sample may contain (or be derived from) a “bodily fluid”. The present disclosure encompasses embodiments wherein the bodily fluid is selected from amniotic fluid, aqueous humor, vitreous humor, bile, blood serum, breast milk, cerebrospinal fluid, cerumen (earwax), chyle, chyme, endolymph, perilymph, exudates, feces, female ejaculate, gastric acid, gastric juice, lymph, mucus (including nasal drainage and phlegm), pericardial fluid, peritoneal fluid, pleural fluid, pus, rheum, saliva, sebum (skin oil), semen, sputum, synovial fluid, sweat, tears, urine, vaginal secretion, vomit and mixtures of one or more thereof. Biological samples include cell cultures, bodily fluids, cell cultures from bodily fluids. Bodily fluids may be obtained from a mammal organism, for example by puncture, or other collecting or sampling procedures.
[0057] Various embodiments are described hereinafter. It should be noted that the specific embodiments are not intended as an exhaustive description or as a limitation to the broader aspects discussed herein. One aspect described in conjunction with a particular embodiment is not necessarily limited to that embodiment and can be practiced with any other embodiment(s).
[0058] Reference throughout this specification to “one embodiment,” “an embodiment,” or “an example embodiment” means that a particular feature, structure or characteristic described in connection with the embodiment is included in at least one embodiment of the present disclosure. Thus, appearances of the phrases “in one embodiment,” “in an embodiment,” or “an example embodiment” in various places throughout this specification are not necessarily all referring to the same embodiment, but may. Furthermore, the particular features, structures or characteristics may be combined in any suitable manner, as would be apparent to a person skilled in the art from this disclosure, in one or more embodiments. Furthermore, while some embodiments described herein include some but not other features included in other embodiments, combinations of features of different embodiments are meant to be within the scope of the disclosure. For example, in the appended claims, any of the claimed embodiments can be used in any combination.
[0059] All publications, published patent documents, and patent applications cited herein are hereby incorporated by reference to the same extent as though each individual publication, published patent document, or patent application was specifically and individually indicated as being incorporated by reference.
BRIEF DESCRIPTION OF THE DRAWINGS
[0060] An understanding of the features and advantages of the present disclosure will be obtained by reference to the following detailed description that sets forth illustrative embodiments, in which the principles of the disclosure may be utilized, and the accompanying drawings of which:
[0061] FIG. 1A shows a schematic of LentiGuide-BC and similar approaches which use a secondary barcode distant from the sgRNA. FIG. 1B shows CROPseq encodes the sgRNA within the lentiviral 3’ long terminal repeat (LTR). FIG. 1C shows CROPseq-multi encodes two sgRNAs with internal barcodes (iBARs), multiplexed using tRNAs, and encoded in the lentiviral 3’ LTR. Like CROPseq, the 3’ LTR is duplicated during lentiviral integration, producing a second copy of the sgRNAs. FIG. 1D shows iBARs place secondary barcodes within the loop joining the crRNA and tracrRNA into the sgRNA. FIG. 1E shows lentiviral titers with CROPseq-multi relative to CROPseq. FIG. 1F shows a comparison of genome editing activity of CROPseq and CROPseq-multi vectors in SpCas9 nuclease-expressing cells, quantified by next generation sequencing. Mean and n=3 biological replicates shown. BC, barcode.
[0062] FIG. 2A shows lentiviral barcode swapping rates as a function of the distance separating barcode elements (spacer-spacer, spacer-secondary barcode, etc.). Data points are from published barcoding systems and CROPseq-multi (see also, Table 1). FIG. 2B shows assaying lentiviral recombination with pooled lentiviral production. FIG. 2C shows observed lentiviral recombination of CROPseq-multi barcode elements. (T-test: *p<0.05, **p<0.001, ***p<0.0001) FIG. 2D shows reducing recombination frequencies with orthogonal tRNAs.
[0063] FIG. 3A provides an overview of the in situ sequencing workflow. FIG. 3B shows optimizing the in situ detection protocol to improve the detection efficiency of CROPseq-multi. FIG. 3C shows detection efficiency of CROPseq-multi iBARs with individual or multiplexed detection. FIG. 3D shows a representative image of in situ sequencing reads with multiplexed detection. FIG. 3E shows quantifying recombination in a 3-vector pool with ISS and NGS. FIG. 3F shows sequencing cycles necessary to uniquely identify library members with different decoding methods. For decoding via the spacer (mean and standard deviation shown), libraries were simulated by randomly sampling guides from the Dolcetto genome-wide CRISPRi library.
[0064] FIG. 4A shows a schematic representation of the pLentiGuide vector. FIG. 4B shows a schematic representation of derivative multiplexing systems. FIG. 4C shows processing of the lentiviral RNA genome into double stranded DNA for genome integration to illustrate steps vulnerable to recombination. Illustration inspired by Adamson et al.22 FIG. 4D shows a schematic representation of a LentiGuideBC vector for pairing guide RNAs with mRNA barcodes. FIG. 4E shows a schematic representation of a CROPseq vector for pairing guide RNAs with mRNA barcodes. LTR, long terminal repeat; pbs, primer binding site; PPT, polypurine tract; cPPT, central polypurine tract.
[0065] FIG. 5A shows a detailed illustration of the CROPseq-multi 3’ LTR design. The top line of the figure shows the orientation of the viral vector relative to the lentiviral genome, from right to left in 5'-to-3' orientation. Underneath, the middle section of the figure shows the reverse complement orientation. Beneath this, at the bottom of the figure, the bounded box blow-out shows greater detail of the 3' LTR cassette (located between flanking 3' LTR regions; the 3' LTR cassette is also indicated by hashed lines). FIG. 5B shows lentiviral titers of CROPseq variants relative to CROPseq. FIG. 5C shows a sequence alignment of orthogonal tRNAs tested in CROPseq-multi. FIG. 5D shows a sequence alignment of orthogonal sgRNA scaffolds used in CROPseq-multi.
[0066] FIG. 6A shows an illustration of oligo reagent design for in situ detection of CROPseq-multi barcodes. FIG. 6B shows a representative image showing the identification of iBAR 1 reads in multiplexed detection, together with DAPI-stained nuclei and the four sequencing bases in separate fluorescent channels. FIG. 6C shows a precision-recall curve for assignment of individual reads to either iBAR 1 or iBAR 2 based on signal from the iBAR 1-specific probe.
[0067] FIG. 7A shows CROPseq-multi-T7 encodes two sgRNAs with internal barcodes (iBARs), multiplexed using tRNAs, and encoded in the lentiviral 3' LTR. Like CROPseq-multi and CROPseq, the 3' LTR is duplicated during lentiviral integration, producing a second copy of the sgRNAs. CROPseq-multi-T7 further encodes a T7 promoter, which, without wishing to be bound by theory, enables use of in vitro transcription to generate barcoded RNAs independent of endogenous transcription activity. FIG. 7B provides an overview of the in situ sequencing workflow.
[0068] FIG. 8 shows a sequence alignment of CROPseq-multi (top) and CROPseq-multi-T7 (bottom).
[0069] FIG. 9A shows genome editing activity of CROPseq-multi-T7 vectors in SpCas9 nuclease-expressing cells, quantified by next generation sequencing. The table underneath the x-axis indicates the respective target for sgRNA 1 and sgRNA 2; i.e., the first and fourth columns show negative control results for respective cassettes having non-targeting (scrambled) guide RNAs joined by a tRNA for glutamine (Q), with the fourth column cassette further including a T7 promoter - no editing was observed; the second and fifth columns show results for cassettes having guide RNAs targeting AAVS1 as sgRNA 1 and targeting HPRT1 as sgRNA 2, with the guide RNAs joined by a tRNA for alanine (A), where the fifth column cassette further included a T7 promoter - editing of both AAVS1 and HPRT1 was consistently observed at about 40-60% levels for each tested cassette; the third and sixth columns show results for cassettes having guide RNAs targeting HPRT1 as sgRNA 1 and targeting AAVS1 as sgRNA 2, with the guide RNAs joined by a tRNA for proline (P), where the sixth column cassette further includes a T7 promoter - editing of both HPRT1 and AAVS1 was consistently observed at about 40-60% levels for each tested cassette. FIG. 9B shows a comparison of mRNA detection efficiency of CROPseq and CROPseq-T7 vectors (left two violin plots) and CROPseq-multi and CROPseq-multi-T7 vectors (right two violin plots). FIG. 9C shows representative images of in situ sequencing reads with multiplexed detection of CROPseq-T7 and CROPseq-multi-T7 vectors using two protocols. PerturbView uses PFA fixation, ethanol permeabilization, and a reverse crosslinking step (top row) whereas NIS-seq uses methanol and acetic acid for fixation and permeabilization (bottom row). FIG. 9D shows representative images of in situ sequencing reads with multiplexed detection, comparing CROPseq-T7 and CROPseq-multi-T7.
[0070] FIG. 10 shows detection efficiency via in vitro transcription and in situ sequencing of the CROPseq-multi-T7 vector.
[0071] FIG. 11 shows recombination detection of the CROPseq-multi-T7 vector using a simultaneous sequencing approach to reading barcode sequences using in vitro transcription. A "mixed basecall score" was used to separate recombined reads from correct pairs.
[0072] FIG. 12 shows the effect of LiBH4 on fluorescent signal in RPE1-hTERT cells. In particular, FIG. 12 demonstrates that LiBH4 treatment is effective in removing fluorescent signals from in situ sequencing reagents and is also compatible with subsequent sequencing. FIG. 12, left panel, shows cycle 1 signal from in situ sequencing (ISS) of RPE1-hTERT cells before LiBH4 treatment (i.e., pre-treatment). FIG. 12, middle panel, shows the same cells after treatment with 1 mg/mL LiBH4 solution for 30 minutes at room temperature (i.e., post-treatment). FIG. 12, right panel, shows the same cells after they were cleaved and incorporated into a subsequent round of sequencing (e.g., cycle 2); the resulting signal readout indicates that the cells were compatible with subsequent sequencing reactions. Grey, BFP signal showing cell body; Green, nucleotide G; Red, nucleotide T; Magenta, nucleotide A; Cyan, nucleotide C.
[0073] The figures herein are for illustrative purposes only and are not necessarily drawn to scale.
DETAILED DESCRIPTION
[0074] The present disclosure is based, at least in part, on the discovery of CROPseq-multi, a lentiviral system engineered to enable multiplexed SpCas9-based perturbations with mRNA-barcoding. As described herein, CROPseq-multi provides an excellent multiplexing solution for single-target and combinatorial Cas9-based CRISPR screens that includes robust guide activity, low lentiviral recombination, and compatibility with mRNA-barcoding. Advantageously, the techniques herein demonstrate that CROPseq-multi enables superior detection and improved decoding efficiency for optical pooled screens, and readily enables combinatorial screens. CROPseq-multi is a versatile multiplexing platform for diverse CRISPR screening methodologies. The techniques herein address existing challenges in multiplexing perturbations while maintaining design compatibility across enrichment, single-cell sequencing, and optical pooled screens. This versatility provides the opportunity for direct comparison and integration of different screening modalities, for example scRNA-seq and imaging-based approaches. CROPseq-multi enables single-target screens with smaller libraries and improved performance. Furthermore, the compatibility of CROPseq-multi with high-content screening techniques enables new directions and methodologies for combinatorial screens. In particular, combinatorial optical pooled screens may be a powerful approach to interrogate genetic interactions at high throughput and with rich, single-cell-resolved phenotypic measurements.
Overview
[0075] Most pooled screens, including combinatorial screens, achieve perturbation of genetic components through lentiviral delivery of Cas9 single guide RNA (sgRNA). Lentiviral delivery systems are near-ubiquitous across in vitro pooled screens conducted today for their ability to efficiently deliver perturbations to a wide range of cell types and retain a record of the perturbation identity as the lentiviral genome integrated in the host-cell genome. Combinatorial screens can leverage either standard single-perturbation or multi-perturbation (multiplexing) lentiviral vectors. In principle, single-perturbation vectors enable only random sampling in pooled screens, via either high multiplicity of infection (with limited control of multiplicity) or serial transduction and selection1,2. However, random sampling is limited by an immense combinatorial space, which scales as (n choose m), for n targets and combinations of m perturbations per cell. For example, pairwise interactions of only 200 gene targets result in 19,900 unique combinations, roughly equivalent in scale to a standard genome-wide screen. Alternatively, multiplexing vectors enable the delivery of multiple perturbations on a single vector. In addition to enabling an exhaustive search of the combinatorial space, multiplexing vectors allow the perturbation of a biologically informed subset of target combinations3-7. In such screens, the size of multiplexed vector libraries scales linearly with the number of selected target combinations.
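By way of illustration only, the (n choose m) scaling described above can be checked with a short calculation. The following is a minimal sketch; the function name `combination_count` is hypothetical and is not part of the disclosure.

```python
from math import comb


def combination_count(n_targets: int, m_per_cell: int) -> int:
    """Number of unordered combinations of m perturbations drawn from n targets."""
    return comb(n_targets, m_per_cell)


# Pairwise combinations of 200 gene targets, as in the example above.
pairwise_200 = combination_count(200, 2)

# Three-way combinations of the same 200 targets, showing how quickly the space grows.
triple_200 = combination_count(200, 3)

print(pairwise_200, triple_200)
```

The pairwise count reproduces the 19,900 combinations stated above, while moving to three perturbations per cell already exceeds one million combinations, which is why a programmed subset of combinations may be preferable to random sampling.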
[0076] Multiplexed perturbation lentiviral vectors additionally offer several advantages over single-plex solutions for single-target screens. The use of multiple guides targeting the same gene has been shown to improve on-target activity in both knockout and interference screens8-10. As both guides must target the same gene, these approaches require programmed spacer combinations, not random pairings. Despite improvements in algorithms for guide design11, the selection of active and specific guides remains challenging and pairing guides can increase on-target performance. Pairing guides also combines the off-target risk of multiple guides, necessitating the use of multiple vectors with distinct guides for a given target.
[0077] Embodiments disclosed herein provide lentiviral vectors for use in multiplexed pooled genetic perturbation assays. Forward genetic screens seek to dissect complex biological systems by systematically perturbing genetic elements and observing the resulting phenotypes. In combinatorial screens, genetic perturbations are multiplexed within individual cells to reveal genetic interactions, or phenotypes that result from combinations of perturbations. The rich phenotypic readout and high cellular throughput of optical pooled screens make the approach an attractive strategy to study genetic interactions; however, current multiplexing approaches are incompatible with this screening method. As described herein, the present disclosure relates to a lentiviral system, termed CROPseq-multi, able to multiplex Cas-based perturbations with mRNA-embedded barcodes. CROPseq-multi has equivalent per-guide activity to CROPseq, minimal positional bias, and low lentiviral recombination rates. An optimized and multiplexed in situ detection protocol improves detection efficiency 10-fold and increases decoding efficiency 3-fold relative to CROPseq. CROPseq-multi is a general multiplexing solution for Cas-based genetic screening approaches, including optical pooled screens. In some embodiments, the CROPseq and CROPseq-multi vectors of the disclosure further include a promoter useful for in vitro transcription detection. Significantly, the inclusion of a promoter useful for in vitro transcription does not: (i) hinder guide activity; (ii) rely on endogenous expression of the mRNA barcode (i.e., cell line agnostic); or (iii) depend on transcriptional silencing. Embodiments disclosed herein provide an enhanced (brighter) signal, which enables detection in models with high background (e.g., tissue samples), supports lower-magnification imaging for higher throughput, and accommodates sequencing chemistries that have moved from four-color to two-color detection in certain newer NGS kits. 
Inclusion of such a further promoter (e.g., T7 promoter) in the cassettes of the instant disclosure also promotes complex phenotypic measurements, as mRNA is unstable and requires immediate conversion to a DNA form (e.g., cDNA) in most assessment protocols, whereas in the protocols disclosed herein, manipulation, fixing and even multiple rounds of immunofluorescence can be performed, prior to in vitro transcription, where the RNA is regenerated. For mRNA, any given expressing cell has multiple copies, so if there is a failure to detect a copy of an mRNA, it tends not to raise an issue, meaning that per molecule detection rate for mRNA tends to be less important than for detection of, e.g., genomic DNA. When using a promoter useful for in vitro transcription, numerous RNAs can be generated from a single genomic integration, whereas a single copy of a barcode within a genome can mean that if there is a failure to in vitro transcribe that one copy, there will be no detection. An additional advantage of positioning the promoter useful for in vitro transcription in the 3' LTR is that LTR duplication events double the number of barcode copies in a cell that are potentially capable of being detected, lowering the false negative failure rate. In contrast, in a normal lentiviral vector having a promoter useful for in vitro transcription and barcode sequence between the LTRs, only a single copy of the promoter and barcode is integrated, raising the risk of a false negative/lack of detection failure. In some embodiments, the promoter useful for in vitro transcription is a T7 promoter.
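The effect of 3' LTR duplication on the false negative rate can be illustrated with a simple probability sketch. It assumes, purely for illustration, that each integrated barcode copy is detected independently with the same per-copy probability; that independence assumption, the function name `false_negative_rate`, and the value 0.5 are hypothetical and are not measurements from this disclosure.

```python
def false_negative_rate(p_detect_per_copy: float, n_copies: int) -> float:
    """Probability that none of n independent barcode copies is detected.

    Assumes each copy is detected independently with probability
    p_detect_per_copy (an illustrative simplification).
    """
    if not 0.0 <= p_detect_per_copy <= 1.0:
        raise ValueError("p_detect_per_copy must be between 0 and 1")
    if n_copies < 0:
        raise ValueError("n_copies must be non-negative")
    return (1.0 - p_detect_per_copy) ** n_copies


# Illustrative per-copy detection probability (hypothetical value).
p = 0.5

# One copy: promoter and barcode placed between the LTRs.
single_copy = false_negative_rate(p, 1)

# Two copies: promoter and barcode placed in the 3' LTR, duplicated on integration.
duplicated = false_negative_rate(p, 2)

print(single_copy, duplicated)
```

Under these assumptions, doubling the number of detectable copies squares the per-cell miss probability, which is one way to read the statement that LTR duplication lowers the false negative failure rate.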
VIRAL VECTORS FOR MULTIPLEXED PERTURBATION AND DECODING IN POOLED GENETIC SCREENS
[0078] In an aspect, embodiments disclosed herein provide single vectors capable of introducing two or more perturbations to single cells from the single vector. In an example embodiment, the vector encodes at least two guide molecules. In an embodiment, the vector encodes two guide molecules. The guide molecules may be encoded in a single transcript within a 3’ long terminal repeat (LTR) of a viral genome. Each guide molecule may be separated by a tRNA leader sequence. In an embodiment, the transcript encoding the guide molecule may comprise a barcode that identifies the sequence targeted (i.e., the sequences to be modified or perturbed) by each guide molecule. This barcode sequence may be an internal barcode (iBAR) located within a loop of the scaffold portion of each guide molecule. In an aspect, embodiments disclosed herein provide single vectors comprising a promoter that enables in vitro transcription detection. In some aspects, embodiments of the disclosure provide for in vitro transcription detection of RNA comprising an internal barcode. In some aspects, the vectors and methods disclosed herein are employed in decoding in pooled genetic screens.
Vectors
[0079] The viral vector may be any viral vector suitable for use in multiplex perturbation screens. In embodiments, the vector is a lentivirus vector comprising at least two guide molecules (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10 or more guide molecules, 2-3 guide molecules, 2-4 guide molecules, 2-5 guide molecules, 2-6 guide molecules, 2-7 guide molecules, 2-8 guide molecules, and the like) encoded in a single transcript within the 3’ long terminal repeat (LTR) of the lentiviral genome, wherein the guide molecules are separated by a tRNA leader sequence, and wherein the two guide molecules are operably linked to a pol III promoter encoded within the 3’ LTR and a pol II promoter encoded upstream of the 3’ LTR. In embodiments, the guide molecules and pol III promoter are encoded in the 3’ LTR of the lentivirus vector. In embodiments, the 3’ LTR is duplicated to the 5’ end of the lentiviral genome during transduction. In embodiments, the guide molecules are transcribed as a single pol III transcript from the 5’ end of the integrated lentiviral genome. In example embodiments, the guide molecules are separated from the pol III transcript by cleavage of the tRNA sequences by endogenous RNases. As used herein, leader tRNA sequence refers to a tRNA sequence 5’ of the guide molecule sequence. Transfer RNA (abbreviated tRNA) is a small RNA molecule that plays a key role in protein synthesis. Transfer RNA serves as a link (or adaptor) between the messenger RNA (mRNA) molecule and the growing chain of amino acids that make up a protein. In example embodiments, the vector includes a selection marker outside of the 3’ LTR. In example embodiments, the selection marker is under control of the pol II promoter for being expressed as a poly(A) tailed mRNA with the guide molecule sequences. In example embodiments, the mRNA can include the sequences encoded in the 3’ LTR. 
In example embodiments, the orientation of the leader tRNA sequences is reversed, such that the mRNA is not cleaved. In example embodiments, the leader tRNA sequences and guide molecule scaffolds are orthogonal to prevent recombination. As used herein, the term "orthogonal" refers to the inability of two or more biomolecules, similar in composition and/or function, to interact with one another or affect their respective substrates. In example embodiments, the viral vector encodes a sequence for expression of any effector proteins required for the perturbation, such as a programmable nuclease or specific CRISPR system, as described further herein. In example embodiments, the viral vector encodes a Cas polypeptide in the vector itself. In this embodiment, the viral vector can be used in systems that do not already express a Cas polypeptide and/or are not compatible with serial genetic manipulation, such as primary human cells.
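The ordering of elements within the pol III cassette described above can be sketched as a simple data structure. This is an illustrative sketch only: the `Guide` class, the `cassette_layout` function, the element labels, and the `leader_on_first` flag are hypothetical names introduced here, and whether the first guide also carries a leader tRNA is treated as an open assumption rather than a statement about the disclosed vectors. The AAVS1 and HPRT1 targets are borrowed from the FIG. 9A description as example labels; no nucleotide sequences are implied.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Guide:
    spacer: str  # label for the target of the spacer (placeholder, not a sequence)
    ibar: str    # label for the internal barcode in the scaffold loop (placeholder)


def cassette_layout(guides: List[Guide], leader_on_first: bool = False) -> List[str]:
    """Return the 5'-to-3' order of elements in a hypothetical multi-guide cassette.

    A leader tRNA is placed 5' of every guide after the first; the first guide
    receives one only if leader_on_first is True (an assumption of this sketch).
    """
    if len(guides) < 2:
        raise ValueError("a multiplexed cassette needs at least two guides")
    elements = ["pol III promoter"]
    for index, guide in enumerate(guides, start=1):
        if index > 1 or leader_on_first:
            elements.append(f"leader_tRNA_{index}")
        elements.append(f"sgRNA_{index}[spacer={guide.spacer}, iBAR={guide.ibar}]")
    return elements


layout = cassette_layout([Guide("AAVS1", "iBAR_1"), Guide("HPRT1", "iBAR_2")])
for element in layout:
    print(element)
```

Representing the cassette as an ordered list makes it straightforward to check, for example, that every pair of adjacent guides is separated by a tRNA and that each guide carries its own barcode label.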
[0080] In general, and throughout this specification, the term "vector" refers to a nucleic acid molecule capable of transporting another nucleic acid to which it has been linked. Vectors include, but are not limited to, nucleic acid molecules that are single-stranded, double-stranded, or partially double-stranded; nucleic acid molecules that comprise one or more free ends, no free ends (e.g., circular); nucleic acid molecules that comprise DNA, RNA, or both; and other varieties of polynucleotides known in the art. One type of vector is a "plasmid," which refers to a circular double stranded DNA loop into which additional DNA segments can be inserted, such as by standard molecular cloning techniques. Another type of vector is a viral vector, wherein virally-derived DNA or RNA sequences are present in the vector for packaging into a virus (e.g., retroviruses, replication defective retroviruses, adenoviruses, replication defective adenoviruses, and adeno-associated viruses). Viral vectors also include polynucleotides carried by a virus for transfection into a host cell. Certain vectors are capable of autonomous replication in a host cell into which they are introduced (e.g., episomal mammalian vectors). Other vectors (e.g., non-episomal mammalian vectors) are integrated into the genome of a host cell upon introduction into the host cell, and thereby are replicated along with the host genome (e.g., lentivirus). Moreover, certain vectors are capable of directing the expression of genes to which they are operatively-linked (i.e., operably linked to a regulatory element). Such vectors are referred to herein as "expression vectors." Vectors for and that result in expression in a eukaryotic cell can be referred to herein as "eukaryotic expression vectors." Common expression vectors of utility in recombinant DNA techniques are often in the form of plasmids. 
The term "regulatory element" is intended to include promoters, enhancers, internal ribosomal entry sites (IRES), and other expression control elements (e.g., transcription termination signals, such as polyadenylation signals and poly-U sequences). In some embodiments, a vector comprises one or more pol III promoters (e.g., 1, 2, 3, 4, 5, or more pol III promoters), one or more pol II promoters (e.g., 1, 2, 3, 4, 5, or more pol II promoters), one or more pol I promoters (e.g., 1, 2, 3, 4, 5, or more pol I promoters), or combinations thereof. Examples of pol III promoters include, but are not limited to, U6 and H1 promoters. Examples of pol II promoters include, but are not limited to, the retroviral Rous sarcoma virus (RSV) LTR promoter (optionally with the RSV enhancer), the cytomegalovirus (CMV) promoter (optionally with the CMV enhancer) (see, e.g., Boshart et al., Cell, 41:521-530 (1985)), the SV40 promoter, the dihydrofolate reductase promoter, the β-actin promoter, the phosphoglycerol kinase (PGK) promoter, and the EF1α promoter.
[0081] In some embodiments, the vector can be transcribed and translated. In some embodiments, the vector can be transcribed and translated in vitro. In some embodiments, the vector is transcribed and translated using T7 promoter regulatory sequences and T7 polymerase. In some embodiments, the vector is transcribed and translated using T3 promoter regulatory sequences and T3 polymerase. In still further embodiments, the vector is transcribed and translated using SP6 promoter regulatory sequences and SP6 RNA polymerase. Without wishing to be bound by theory, promoters/RNA polymerases from the bacteriophages T3, T7, and SP6 are believed to be equivalent (Askary, Amjad, et al. "In situ readout of DNA barcodes and single base edits facilitated by in vitro transcription." Nature biotechnology 38.1 (2020): 66-75.)
[0082] Also encompassed by the term "regulatory element" are enhancer elements, such as WPRE; CMV enhancers; the R-U5’ segment in LTR of HTLV-I (Mol. Cell. Biol., Vol. 8(1), p. 466-472, 1988); SV40 enhancer; and the intron sequence between exons 2 and 3 of rabbit β-globin (Proc. Natl. Acad. Sci. USA., Vol. 78(3), p. 1527-31, 1981). It will be appreciated by those skilled in the art that the design of the expression vector can depend on such factors as the choice of the host cell to be transformed, the level of expression desired, etc. A vector can be introduced into host cells to thereby produce transcripts, proteins, or peptides, including fusion proteins or peptides, encoded by nucleic acids as described herein (e.g., clustered regularly interspersed short palindromic repeats (CRISPR) transcripts, proteins, enzymes, mutant forms thereof, fusion proteins thereof, etc.).
Lentivirus
[0083] In example embodiments, the vector is a lentivirus vector. The term "lentivirus vector", as used herein, refers to a viral vector derived from complex retroviruses such as the human immunodeficiency virus (HIV). In the present disclosure, lentiviral vectors derived from any strain and subtype can be used. The lentiviral vector may be based on a human or primate lentivirus such as HIV or a non-human lentivirus such as feline immunodeficiency virus, simian immunodeficiency virus and equine infectious anemia virus (EIAV). In an embodiment, the lentiviral vector is an HIV-based vector and especially an HIV-1-based vector (see, e.g., Dull T, Zufferey R, Kelly M, et al. A third-generation lentivirus vector with a conditional packaging system. J Virol. 1998;72(11):8463-8471; and Zufferey R, Dull T, Mandel RJ, et al. Self-inactivating lentivirus vector for safe and efficient in vivo gene delivery. J Virol. 1998;72(12):9873-9880). The HIV 5’ LTR comprises the viral promoter for transcribing the viral genome RNA. In example embodiments, the LTR viral promoter is partially deleted and fused to a heterologous enhancer/promoter such as CMV or RSV.
Guide Molecules
[0084] In an embodiment, the viral vector may encode two or more guide molecules. In embodiments, the viral vector encodes two, three, four, or five guide molecules. In embodiments, the viral vector encodes two guide molecules. The following include general design principles that may be applied to the guide molecule. The terms "guide molecule," "guide sequence" and "guide polynucleotide" refer to polynucleotides capable of guiding Cas to a target genomic locus and are used interchangeably as in foregoing cited documents such as International Patent Publication No. WO 2014/093622 (PCT/US2013/074667). In general, a guide sequence is any polynucleotide sequence having sufficient complementarity with a target polynucleotide sequence to hybridize with the target sequence and direct sequence-specific binding of a CRISPR complex to the target sequence. The guide molecule can be a polynucleotide.
[0085] Notably, in reference to the multi-guide constructs of the instant disclosure where multiple sgRNA-encoding sequences are positioned within the 3' LTR, a number of references in the art have specifically taught away from placing multiple sgRNAs within a single 3' LTR [see, e.g., Adamson et al. bioRxiv doi.org/10.1101/298349 ("As illustrated by Datlinger et al, positioning the sgRNA expression cassette within the lentivirus U3 region may place an upper limit on cassette size. This may restrict the use of CROPseq-Guide-Puro to delivery of single sgRNAs because combinatorial sgRNA delivery would require inserting multiple cassettes into a single LTR. Moreover, because recombination can disrupt sequences within LTRs (Yu et al., 1998), CROPseq-Guide-Puro presents no obvious advantage for library designs that incorporate tandem cassettes.") and Hill et al. Nat Methods. 2018 Apr; 15(4): 271-274 ("Although CROP-seq is not subject to sgRNA-barcode swapping, it is limited by its placement of the sgRNA in the lentiviral LTR, as larger intervening sequences such as dual sgRNA designs (Gasperini et al. Am J Hum Genet. 2017; 101: 192-205) might render the LTR non-functional.")]. The functionality observed herein for those viral vectors of the instant disclosure having multiple sgRNAs located in the 3' LTR therefore constitutes a surprising result over such teachings in the art. The viral vectors of the disclosure accordingly possess unexpected advantage(s) as compared to the art.
[0086] The ability of a guide sequence (within a nucleic acid-targeting guide RNA) to direct sequence-specific binding of a nucleic acid-targeting complex to a target nucleic acid sequence may be assessed by any suitable assay. For example, the components of a nucleic acid-targeting CRISPR system sufficient to form a nucleic acid-targeting complex, including the guide sequence to be tested, may be provided to a host cell having the corresponding target nucleic acid sequence, such as by transfection with vectors encoding the components of the nucleic acid-targeting complex, followed by an assessment of preferential targeting (e.g., cleavage) within the target nucleic acid sequence, such as by Surveyor assay (Qiu et al. 2004. BioTechniques. 36(4):702-707). Similarly, cleavage of a target nucleic acid sequence may be evaluated in a test tube by providing the target nucleic acid sequence, components of a nucleic acid-targeting complex, including the guide sequence to be tested and a control guide sequence different from the test guide sequence, and comparing binding or rate of cleavage at the target sequence between the test and control guide sequence reactions. Other assays are possible and will occur to those skilled in the art.
[0087] In some embodiments, the guide molecule is an RNA. The guide molecule(s) (also referred to interchangeably herein as guide polynucleotide and guide sequence) that are included in the CRISPR-Cas or Cas based system can be any polynucleotide sequence having sufficient complementarity with a target nucleic acid sequence to hybridize with the target nucleic acid sequence and direct sequence-specific binding of a nucleic acid-targeting complex to the target nucleic acid sequence. In some embodiments, the degree of complementarity, when optimally aligned using a suitable alignment algorithm, can be about or more than about 50%, 60%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 97.5%, 98%, 98.5%, 99%, 99.5%, or more (e.g., 100%). Optimal alignment may be determined with the use of any suitable algorithm for aligning sequences, non-limiting examples of which include the Smith-Waterman algorithm, the Needleman-Wunsch algorithm, algorithms based on the Burrows-Wheeler Transform (e.g., the Burrows Wheeler Aligner), ClustalW, Clustal X, BLAT, Novoalign (Novocraft Technologies; available at www.novocraft.com), ELAND (Illumina, San Diego, CA), SOAP (available at soap.genomics.org.cn), and Maq (available at maq.sourceforge.net).
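As a rough illustration of the degree of complementarity discussed above, the following sketch scores a guide against a target strand of the same length by position-wise base pairing. It is a deliberately simplified, ungapped comparison and is not one of the alignment algorithms named above (e.g., Smith-Waterman or Needleman-Wunsch), which would be used when gaps or differing lengths must be handled. The function names and the example sequences are hypothetical.

```python
# Map each base to its Watson-Crick partner in the DNA alphabet.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def _to_dna(seq: str) -> str:
    """Upper-case a sequence and express RNA (U) in the DNA alphabet (T)."""
    return seq.upper().replace("U", "T")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA or RNA sequence, returned in the DNA alphabet."""
    return _to_dna(seq).translate(_COMPLEMENT)[::-1]


def percent_complementarity(guide: str, target: str) -> float:
    """Percent of guide positions that base-pair with an equal-length target strand.

    Both sequences are written 5'-to-3'. A fully complementary guide equals the
    reverse complement of the target, so positions are compared against that.
    """
    guide_dna = _to_dna(guide)
    paired_partner = reverse_complement(target)
    if not guide_dna or len(guide_dna) != len(paired_partner):
        raise ValueError("guide and target must be non-empty and of equal length")
    matches = sum(g == t for g, t in zip(guide_dna, paired_partner))
    return 100.0 * matches / len(guide_dna)


# Hypothetical 10-nt example: a perfectly complementary pair and a single mismatch.
perfect = percent_complementarity("ACGUACGUAC", "GTACGTACGT")
one_mismatch = percent_complementarity("ACGUACGUAC", "GTACGTACGA")

print(perfect, one_mismatch)
```

A score computed this way can be compared against the thresholds listed above (e.g., 90% or 95%), keeping in mind that a gapped optimal alignment may report a different value for sequences that differ in length.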
[0088] A guide sequence, and hence a nucleic acid-targeting guide, may be selected to target any target nucleic acid sequence. The target sequence may be DNA. The target sequence may be any RNA sequence. In some embodiments, the target sequence may be a sequence within an RNA molecule selected from the group consisting of messenger RNA (mRNA), pre-mRNA, ribosomal RNA (rRNA), transfer RNA (tRNA), micro-RNA (miRNA), small interfering RNA (siRNA), small nuclear RNA (snRNA), small nucleolar RNA (snoRNA), double stranded RNA (dsRNA), non-coding RNA (ncRNA), long non-coding RNA (lncRNA), and small cytoplasmic RNA (scRNA). In some embodiments, the target sequence may be a sequence within an RNA molecule selected from the group consisting of mRNA, pre-mRNA, and rRNA. In some embodiments, the target sequence may be a sequence within an RNA molecule selected from the group consisting of ncRNA and lncRNA. In some further embodiments, the target sequence may be a sequence within an mRNA molecule or a pre-mRNA molecule.
[0089] In some embodiments, a nucleic acid-targeting guide is selected to reduce the degree of secondary structure within the nucleic acid-targeting guide. In some embodiments, about or less than about 75%, 50%, 40%, 30%, 25%, 20%, 15%, 10%, 5%, 1%, or fewer of the nucleotides of the nucleic acid-targeting guide participate in self-complementary base pairing when optimally folded. Optimal folding may be determined by any suitable polynucleotide folding algorithm. Some programs are based on calculating the minimal Gibbs free energy. An example of one such algorithm is mFold, as described by Zuker and Stiegler (Nucleic Acids Res. 9 (1981), 133-148). Another example folding algorithm is the online webserver RNAfold, developed at the Institute for Theoretical Chemistry at the University of Vienna, using the centroid structure prediction algorithm (see, e.g., A.R. Gruber et al., 2008, Cell 106(1): 23-24; and PA Carr and GM Church, 2009, Nature Biotechnology 27(12): 1151-62).
[0090] In embodiments, a guide RNA or crRNA may comprise, consist essentially of, or consist of a direct repeat (DR) sequence and a guide sequence or spacer sequence. In embodiments, the guide RNA or crRNA may comprise, consist essentially of, or consist of a direct repeat sequence fused or linked to a guide sequence or spacer sequence. In embodiments, the direct repeat sequence may be located upstream (i.e., 5’) from the guide sequence or spacer sequence. In other embodiments, the direct repeat sequence may be located downstream (i.e., 3’) from the guide sequence or spacer sequence.
[0091] In one example embodiment, the crRNA comprises a stem loop, preferably a single stem loop. In one example embodiment, the direct repeat sequence forms a stem loop, preferably a single stem loop.
[0092] In one example embodiment, the spacer length of the guide RNA is from 15 to 35 nucleotides (nt). In another example embodiment, the spacer length of the guide RNA is at least 15 nucleotides (nt). In another example embodiment, the spacer length is from 15 to 17 nt, e.g., 15, 16, or 17 nt, from 17 to 20 nt, e.g., 17, 18, 19, or 20 nt, from 20 to 24 nt, e.g., 20, 21, 22, 23, or 24 nt, from 23 to 25 nt, e.g., 23, 24, or 25 nt, from 24 to 27 nt, e.g., 24, 25, 26, or 27 nt, from 27 to 30 nt, e.g., 27, 28, 29, or 30 nt, from 30 to 35 nt, e.g., 30, 31, 32, 33, 34, or 35 nt, or 35 nt or longer.
[0093] The "tracrRNA" sequence or analogous terms includes any polynucleotide sequence that has sufficient complementarity with a crRNA sequence to hybridize. In some embodiments, the degree of complementarity between the tracrRNA sequence and crRNA sequence along the length of the shorter of the two when optimally aligned is about or more than about 25%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 97.5%, 98%, 98.5%, 99%, 99.5%, or more (e.g., 100%). In some embodiments, the tracr sequence is about or more than about 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 25, 30, 40, 50, or more nucleotides in length. In some embodiments, the tracr sequence and crRNA sequence are contained within a single transcript, such that hybridization between the two produces a transcript having a secondary structure, such as a hairpin.
[0094] In general, degree of complementarity is with reference to the optimal alignment of the spacer sequence and tracr sequence, along the length of the shorter of the two sequences. Optimal alignment may be determined by any suitable alignment algorithm and may further account for secondary structures, such as self-complementarity within either the spacer sequence or tracr sequence. In some embodiments, the degree of complementarity between the tracr sequence and spacer sequence along the length of the shorter of the two when optimally aligned is about or more than about 25%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 97.5%, 98%, 98.5%, 99%, 99.5%, or more (e.g., 100%).
[0095] In some embodiments, the degree of complementarity between a guide sequence and its corresponding target sequence can be about or more than about 50%, 60%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 97.5%, 98%, 98.5%, 99%, 99.5%, or more (e.g., 100%); a guide or RNA or sgRNA can be about or more than about 5, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, 75, or more nucleotides in length; or guide or RNA or sgRNA can be less than about 75, 50, 45, 40, 35, 30, 25, 20, 15, 12, or fewer nucleotides in length; and tracr RNA can be 30 or 50 nucleotides in length. In some embodiments, the degree of complementarity between a guide sequence and its corresponding target sequence is greater than 94.5% or 95% or 95.5% or 96% or 96.5% or 97% or 97.5% or 98% or 98.5% or 99% or 99.5% or 99.9%, or 100%. Off target is less than 100% or 99.9% or 99.5% or 99% or 99% or 98.5% or 98% or 97.5% or 97% or 96.5% or 96% or 95.5% or 95% or 94.5% or 94% or 93% or 92% or 91% or 90% or 89% or 88% or 87% or 86% or 85% or 84% or 83% or 82% or 81% or 80% complementarity between the sequence and the guide, with it being advantageous that off target is 100% or 99.9% or 99.5% or 99% or 99% or 98.5% or 98% or 97.5% or 97% or 96.5% or 96% or 95.5% or 95% or 94.5% complementarity between the sequence and the guide.
[0096] In some embodiments according to the disclosure, the guide RNA (capable of guiding Cas to a target locus) may comprise (1) a guide sequence capable of hybridizing to a genomic target locus in the eukaryotic cell; (2) a tracr sequence; and (3) a tracr mate sequence. All of (1) to (3) may reside in a single RNA, i.e., an sgRNA (arranged in a 5’ to 3’ orientation), or the tracr RNA may be a different RNA than the RNA containing the guide and tracr sequence. The tracr hybridizes to the tracr mate sequence and directs the CRISPR/Cas complex to the target sequence. Where the tracr RNA is on a different RNA than the RNA containing the guide and tracr sequence, the length of each RNA may be optimized to be shortened from their respective native lengths, and each may be independently chemically modified to protect from degradation by cellular RNase or otherwise increase stability.
[0097] Many modifications to guide sequences are known in the art and are further contemplated within the context of this disclosure. Various modifications may be used to increase the specificity of binding to the target sequence and/or increase the activity of the Cas protein and/or reduce off-target effects. Example guide sequence modifications are described in International Patent Application No. PCT/US2019/045582, specifically paragraphs [0178]-[0333], which is incorporated herein by reference.
Target Sequences, PAMs, and PFSs
[0098] In the context of formation of a CRISPR complex, "target sequence" refers to a sequence to which a guide sequence is designed to have complementarity, where hybridization between a target sequence and a guide sequence promotes the formation of a CRISPR complex. In other words, the target polynucleotide can be a polynucleotide or a part of a polynucleotide to which a part of the guide sequence is designed to have complementarity with and to which the effector function mediated by the complex comprising the CRISPR effector protein and a guide molecule is to be directed. In some embodiments, a target sequence is located in the nucleus or cytoplasm of a cell.
[0099] PAM elements are sequences that can be recognized and bound by Cas proteins. Cas proteins/effector complexes can then unwind the dsDNA at a position adjacent to the PAM element. It will be appreciated that Cas proteins and systems that target RNA do not require PAM sequences (Marraffini et al. 2010. Nature. 463:568-571). Instead, many rely on PFSs, which are discussed elsewhere herein. In one example embodiment, the target sequence should be associated with a PAM (protospacer adjacent motif) or PFS (protospacer flanking sequence or site), that is, a short sequence recognized by the CRISPR complex. Depending on the nature of the CRISPR-Cas protein, the target sequence should be selected such that its complementary sequence in the DNA duplex (also referred to herein as the non-target sequence) is upstream or downstream of the PAM. In embodiments, the complementary sequence of the target sequence is downstream or 3’ of the PAM or upstream or 5’ of the PAM. The precise sequence and length requirements for the PAM differ depending on the Cas protein used, but PAMs are typically 2-5 base pair sequences adjacent to the protospacer (that is, the target sequence). Examples of the natural PAM sequences for different Cas proteins are provided herein below and the skilled person will be able to identify further PAM sequences for use with a given Cas protein.
[0100] The ability to recognize different PAM sequences depends on the Cas polypeptide(s) included in the system. See, e.g., Gleditzsch et al. 2019. RNA Biology. 16(4):504-517. Table A (from Gleditzsch et al. 2019) below shows several Cas polypeptides and the PAM sequence they recognize.
[0101] In embodiments, the CRISPR effector protein may recognize a 3’ PAM. In one example embodiment, the CRISPR effector protein may recognize a 3’ PAM which is 5’H, wherein H is A, C or U.
[0102] Further, engineering of the PAM Interacting (PI) domain on the Cas protein may allow programming of PAM specificity, improve target site recognition fidelity, and increase the versatility of the CRISPR-Cas protein, for example as described for Cas9 in Kleinstiver BP et al. Engineered CRISPR-Cas9 nucleases with altered PAM specificities. Nature. 2015 Jul 23;523(7561):481-5. doi: 10.1038/nature14592. As further detailed herein, the skilled person will understand that Cas13 proteins may be modified analogously. Gao et al, "Engineered Cpf1 Enzymes with Altered PAM Specificities," bioRxiv 091611; doi: http://dx.doi.org/10.1101/091611 (Dec. 4, 2016). Doench et al. created a pool of sgRNAs, tiling across all possible target sites of a panel of six endogenous mouse and three endogenous human genes and quantitatively assessed their ability to produce null alleles of their target gene by antibody staining and flow cytometry. The authors showed that optimization of the PAM improved activity and also provided an on-line tool for designing sgRNAs.
[0103] PAM sequences can be identified in a polynucleotide using appropriate design tools, which are available commercially as well as online. Such freely available tools include, but are not limited to, CRISPRFinder and CRISPRTarget. Mojica et al. 2009. Microbiol. 155(Pt. 3):733-740; Altschul et al. 1990. J. Mol. Biol. 215:403-410; Biswas et al. 2013 RNA Biol. 10:817-827; and Grissa et al. 2007. Nucleic Acid Res. 35:W52-57. Experimental approaches to PAM identification can include, but are not limited to, plasmid depletion assays (Jiang et al. 2013. Nat. Biotechnol. 31:233-239; Esvelt et al. 2013. Nat. Methods. 10:1116-1121; Kleinstiver et al. 2015. Nature. 523:481-485), screening by a high-throughput in vivo model called PAM-SCANR (Pattanayak et al. 2013. Nat. Biotechnol. 31:839-843 and Leenay et al. 2016. Mol. Cell. 16:253), and negative screening (Zetsche et al. 2015. Cell. 163:759-771).
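By way of non-limiting illustration, computational identification of PAM-adjacent target sites can be sketched as a scan of both strands of a DNA sequence. The sketch assumes an NGG PAM located 3’ of a 20-nt protospacer, as commonly used with SpCas9. The function names and return format are hypothetical and do not reproduce any of the tools named above.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_ngg_sites(dna: str, protospacer_len: int = 20):
    """Return (strand, start, protospacer, pam) for each NGG PAM that has a
    full-length protospacer immediately 5' of it.

    `start` is the 0-based protospacer index on the strand being scanned.
    """
    dna = dna.upper()
    sites = []
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        # A lookahead is used so that overlapping PAMs are all reported.
        for match in re.finditer(r"(?=([ACGT]GG))", seq):
            pam_start = match.start()
            start = pam_start - protospacer_len
            if start >= 0:
                sites.append((strand, start, seq[start:pam_start], match.group(1)))
    return sites
```

For a Cas protein with a different PAM or a 5’ PAM, the pattern and the side on which the protospacer is taken would need to change accordingly.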
[0104] As previously mentioned, CRISPR-Cas systems that target RNA do not typically rely on PAM sequences. Instead, such systems typically recognize protospacer flanking sites (PFSs). Thus, Type VI CRISPR-Cas systems typically recognize PFSs instead of PAMs. PFSs represent an analogue to PAMs for RNA targets. Type VI CRISPR-Cas systems employ a Cas13. Some Cas13 proteins analyzed to date, such as Cas13a (C2c2) identified from Leptotrichia shahii (LshCas13a), have a specific discrimination against G at the 3’ end of the target RNA. The presence of a C at the corresponding crRNA repeat site can indicate that nucleotide pairing at this position is rejected. However, some Cas13 proteins (e.g., LwaCas13a and PspCas13b) do not seem to have a PFS preference. See, e.g., Gleditzsch et al. 2019. RNA Biology. 16(4):504-517.
[0105] Some Type VI proteins, such as subtype B, have 5'-recognition of D (G, T, A) and a 3'-motif requirement of NAN or NNA. One example is the Cas13b protein identified in Bergeyella zoohelcum (BzCas13b). See, e.g., Gleditzsch et al. 2019. RNA Biology. 16(4):504-517.
[0106] Overall Type VI CRISPR-Cas systems appear to have less restrictive rules for substrate (e.g., target sequence) recognition than those that target DNA (e.g., Type V and type II).
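By way of non-limiting illustration, the 3’ PFS discrimination against G described above for certain Cas13a proteins can be expressed as a simple check on the base flanking the protospacer. The function name, the indexing convention, and the choice to treat a missing flanking base as an error are assumptions made for this sketch. Because some Cas13 proteins show no PFS preference, the disallowed bases are left as a parameter.

```python
def pfs_is_permissive(target_rna: str, protospacer_end: int, disallowed: str = "G") -> bool:
    """Return True if the base immediately 3' of the protospacer is allowed.

    `protospacer_end` is the 0-based index one past the last protospacer base,
    i.e. the index of the flanking base itself.
    """
    target_rna = target_rna.upper()
    if not 0 <= protospacer_end < len(target_rna):
        raise ValueError("no 3' flanking base at the given position")
    return target_rna[protospacer_end] not in disallowed.upper()
```

Passing an empty string for `disallowed` models an effector with no PFS preference.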
Sequences related to nucleus targeting and transportation
[0107] In some embodiments, one or more components (e.g., the Cas protein) in the composition for engineering cells may comprise one or more sequences related to nucleus targeting and transportation. Such sequences may facilitate the one or more components in the composition for targeting a sequence within a cell. In order to improve targeting of the CRISPR-Cas protein used in the methods of the present disclosure to the nucleus, it may be advantageous to provide one or both of these components with one or more nuclear localization sequences (NLSs).
[0108] In one example embodiment, the NLSs used in the context of the present disclosure are heterologous to the proteins. Non-limiting examples of NLSs include an NLS sequence derived from: the NLS of the SV40 virus large T-antigen, having the amino acid sequence PKKKRKV (SEQ ID NO: 1) or PKKKRKVEAS (SEQ ID NO: 2); the NLS from nucleoplasmin (e.g., the nucleoplasmin bipartite NLS with the sequence KRPAATKKAGQAKKKK (SEQ ID NO: 3)); the c-myc NLS having the amino acid sequence PAAKRVKLD (SEQ ID NO: 4) or RQRRNELKRSP (SEQ ID NO: 5); the hRNPA1 M9 NLS having the sequence NQSSNFGPMKGGNFGGRSSGPYGGGGQYFAKPRNQGGY (SEQ ID NO: 6); the sequence RMRIZFKNKGKDTAELRRRRVEVSVELRKAKKDEQILKRRNV (SEQ ID NO: 7) of the IBB domain from importin-alpha; the sequences VSRKRPRP (SEQ ID NO: 8) and PPKKARED (SEQ ID NO: 9) of the myoma T protein; the sequence PQPKKKPL (SEQ ID NO: 10) of human p53; the sequence SALIKKKKKMAP (SEQ ID NO: 11) of mouse c-abl IV; the sequences DRLRR (SEQ ID NO: 12) and PKQKKRK (SEQ ID NO: 13) of the influenza virus NS1; the sequence RKLKKKIKKL (SEQ ID NO: 14) of the Hepatitis virus delta antigen; the sequence REKKKFLKRR (SEQ ID NO: 15) of the mouse Mx1 protein; the sequence KRKGDEVDGVDEVAKKKSKK (SEQ ID NO: 16) of the human poly(ADP-ribose) polymerase; and the sequence RKCLQAGMNLEARKTKK (SEQ ID NO: 17) of the steroid hormone receptors (human) glucocorticoid. In general, the one or more NLSs are of sufficient strength to drive accumulation of the DNA-targeting Cas protein in a detectable amount in the nucleus of a eukaryotic cell. In general, strength of nuclear localization activity may derive from the number of NLSs in the CRISPR-Cas protein, the particular NLS(s) used, or a combination of these factors. Detection of accumulation in the nucleus may be performed by any suitable technique. 
For example, a detectable marker may be fused to the nucleic acid-targeting protein, such that location within a cell may be visualized, such as in combination with a means for detecting the location of the nucleus (e.g., a stain specific for the nucleus such as DAPI). Cell nuclei may also be isolated from cells, the contents of which may then be analyzed by any suitable process for detecting protein, such as immunohistochemistry, Western blot, or enzyme activity assay. Accumulation in the nucleus may also be determined indirectly, such as by an assay for the effect of nucleic acid-targeting complex formation (e.g., an assay for deaminase activity at the target sequence, or an assay for altered gene expression activity affected by DNA-targeting complex formation and/or DNA targeting), as compared to a control not exposed to the Cas protein, or exposed to a Cas protein lacking the one or more NLSs.
[0109] The Cas proteins may be provided with 1 or more, such as with 2, 3, 4, 5, 6, 7, 8, 9, 10, or more heterologous NLSs. In some embodiments, the protein comprises about or more than about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more NLSs at or near the amino-terminus, about or more than about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more NLSs at or near the carboxy-terminus, or a combination of these (e.g., zero or at least one or more NLS at the amino-terminus and zero or at least one or more NLS at the carboxy-terminus). When more than one NLS is present, each may be selected independently of the others, such that a single NLS may be present in more than one copy and/or in combination with one or more other NLSs present in one or more copies. In some embodiments, an NLS is considered near the N- or C-terminus when the nearest amino acid of the NLS is within about 1, 2, 3, 4, 5, 10, 15, 20, 25, 30, 40, 50, or more amino acids along the polypeptide chain from the N- or C-terminus. In embodiments of the Cas proteins, an NLS is attached to the C-terminus of the protein.
[0110] In embodiments, the CRISPR-Cas protein and a functional domain protein (described further herein) are delivered to the cell or expressed within the cell as separate proteins. In embodiments, each of the CRISPR-Cas and functional domain protein can be provided with one or more NLSs as described herein. In embodiments, the CRISPR-Cas and functional domain protein are delivered to the cell or expressed within the cell as a fusion protein. In embodiments, one or both of the CRISPR-Cas and functional domain protein is provided with one or more NLSs. Where the functional domain protein is fused to an adaptor protein (such as MS2) as described above, the one or more NLS can be provided on the adaptor protein, provided that this does not interfere with aptamer binding. In embodiments, the one or more NLS sequences may also function as linker sequences between the functional domain protein and the CRISPR-Cas protein.
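By way of non-limiting illustration, whether an NLS lies near the N- or C-terminus of a polypeptide can be checked as sketched below, using the SV40 NLS of SEQ ID NO: 1. The distance definition (the number of residues between the NLS and the respective terminus), the default window of 50 residues, and the function name are assumptions made for this sketch.

```python
SV40_NLS = "PKKKRKV"  # SEQ ID NO: 1 as recited in the text


def nls_positions(protein: str, nls: str = SV40_NLS, window: int = 50):
    """Locate every copy of `nls` and flag copies lying near either terminus."""
    hits = []
    start = protein.find(nls)
    while start != -1:
        end = start + len(nls)
        n_dist = start                # residues between the N-terminus and the NLS
        c_dist = len(protein) - end   # residues between the NLS and the C-terminus
        hits.append({
            "start": start,
            "near_N": n_dist <= window,
            "near_C": c_dist <= window,
        })
        start = protein.find(nls, start + 1)
    return hits
```

A protein carrying one NLS copy at each terminus would return two hits, one flagged `near_N` and one flagged `near_C`.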
[0111] In embodiments, guides of the disclosure comprise specific binding sites (e.g., aptamers) for adapter proteins, which may be linked to or fused to a functional domain protein or catalytic domain thereof. When such a guide forms a CRISPR complex (e.g., CRISPR-Cas protein binding to guide and target), the adapter proteins bind and the functional domain protein or catalytic domain thereof associated with the adapter protein is positioned in a spatial orientation which is advantageous for the attributed function to be effective.
[0112] The skilled person will understand that modifications to the guide which allow for binding of the adapter + nucleotide deaminase, but not proper positioning of the adapter + nucleotide deaminase (e.g., due to steric hindrance within the three-dimensional structure of the CRISPR complex) are modifications which are not intended. The one or more modified guide may be modified at the tetra loop, the stem loop 1, stem loop 2, or stem loop 3, as described herein, preferably at either the tetra loop or stem loop 2, and in some cases at both the tetra loop and stem loop 2.
[0113] In embodiments, a component (e.g., the dead Cas protein, the functional domain protein or catalytic domain thereof, or a combination thereof) in the systems may comprise one or more nuclear export signals (NES), one or more nuclear localization signals (NLS), or any combinations thereof. In some cases, the NES may be an HIV Rev NES. In certain cases, the NES may be MAPK NES. When the component is a protein, the NES or NLS may be at the C terminus of the component. Alternatively, or additionally, the NES or NLS may be at the N terminus of the component. In some examples, the Cas protein and optionally said functional domain protein or catalytic domain thereof comprise one or more heterologous nuclear export signal(s) (NES(s)) or nuclear localization signal(s) (NLS(s)), preferably an HIV Rev NES or MAPK NES, preferably C-terminal.
Barcodes
[0114] In embodiments, the guide molecule may further comprise a barcode that is unique to each perturbation and thereby identifies which guide molecule(s) and perturbation(s) were introduced into a given cell when analyzed by sequencing. For example, the barcode is incorporated into the sequence encoding the perturbation or is a sequence that is only encoded on a vector encoding the perturbation. In embodiments, the barcode identifying a perturbation can be the perturbation, such as a guide sequence encoding a specific perturbation. The guide sequence can also include one or more internal barcode sequences (iBAR) that are inserted in the guide sequence. In embodiments, the one or more iBARs are inserted in a guide sequence in such a way as to not interfere with the guide sequence's ability to be directed to a target sequence. In embodiments, an iBAR is inserted within the loop of a sgRNA joining the crRNA and tracrRNA sequence of the sgRNA (described further herein). In example embodiments, an iBAR can identify the perturbation. In example embodiments, additional iBARs can identify replicates, subpopulations, clones, and/or subclones.
[0115] The term "barcode" as used herein refers to a short sequence of nucleotides (for example, DNA or RNA) that is used as an identifier for an associated molecule, such as a target molecule and/or target nucleic acid, or as an identifier of the source of an associated molecule, such as a cell-of-origin, sample of origin, or individual transcript. A barcode may also refer to any unique, non-naturally occurring, nucleic acid sequence that may be used to identify the originating source of a nucleic acid fragment. Although it is not necessary to understand the mechanism of a disclosure, it is believed that the barcode sequence provides a high-quality individual read of a barcode associated with a perturbation, single cell, single nuclei, a viral vector, labeling ligand (e.g., antibody or aptamer), protein, shRNA, sgRNA or cDNA such that multiple species can be sequenced together. Barcoding may be performed based on any of the compositions or methods disclosed in patent publication WO 2014047561 A1, Compositions and methods for labeling of agents, incorporated herein in its entirety. In example embodiments, barcoding uses an error correcting scheme (T. K. Moon, Error Correction Coding: Mathematical Methods and Algorithms (Wiley, New York, ed. 1, 2005)).
[0116] A nucleic acid barcode can have a length of at least, for example, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, 60, 70, 80, 90, or 100 nucleotides, and can be in single- or double-stranded form. Target molecule and/or target nucleic acids can be labeled with multiple nucleic acid barcodes in combinatorial fashion, such as a nucleic acid barcode concatemer. Typically, a nucleic acid barcode is used to identify a target molecule and/or target nucleic acid, or as being from a particular discrete volume, having a particular physical property (for example, affinity, length, sequence, etc.), or having been subject to certain treatment conditions. Target molecules and/or target nucleic acids can be associated with multiple nucleic acid barcodes to provide information about all of these features (and more).
tRNA Sequences
[0117] In embodiments, to minimize the 3’ LTR insertion and guide separation needed to minimize intramolecular recombination events, the vector may further encode one or more tRNA sequences. In embodiments, to allow cleavage of a single pol III transcript comprising two sgRNAs into two separate functional sgRNAs, a tRNA sequence is encoded between the sgRNAs. In example embodiments, the tRNA sequences may recruit endogenous (i.e., of the cell transfected using the vector) RNase P and RNase Z to cleave the tRNA sequence at the 5’ and 3’ ends, such that, when positioned between guide molecules within a single transcript, the transcript is processed into separate, functional guide molecules. In embodiments, a tRNA sequence may be included in front of the first guide molecule in the single transcript as well as between any two guide molecules in the transcript. In embodiments, the orientation of the tRNA leader sequences may be reversed. In example embodiments, a 3' LTR embedding the reverse complement orientation of the tRNA sequences is a key feature for making the mRNA (i.e., pol II transcript encoding the guide sequences) compatible with single cell sequencing. For example, the Pol III transcript is transcribed from the pol III promoter in the 3’ LTR of the integrated lentivirus such that the sgRNAs and tRNAs are oriented such that the functional tRNAs are recognized and cleaved. In example embodiments, the pol III promoter initiates transcription on the opposite strand as the pol II promoter in the lentivirus vector by reversing the orientation of the elements. In example embodiments, the orientation of the U6 and tRNA/sgRNA arrays are antisense relative to the lentiviral genome. In embodiments, the tRNA sequences may be orthogonal to the cell to be transfected. The two or more guide molecules may also be operably linked to a promoter encoded for or within the 3’ LTR and a pol II promoter encoded upstream of the 3’ LTR. 
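By way of non-limiting illustration, one simple error correcting scheme selects barcodes that differ from one another at a minimum number of positions, so that a read carrying a single sequencing error can still be assigned unambiguously. The sketch below uses a greedy Hamming-distance selection, which is an assumption chosen for brevity and is not asserted to be the scheme of the cited reference. All function names are hypothetical.

```python
from itertools import product


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def build_barcode_set(length: int, min_distance: int, limit: int):
    """Greedily pick barcodes with pairwise Hamming distance >= min_distance."""
    chosen = []
    for bases in product("ACGT", repeat=length):
        candidate = "".join(bases)
        if all(hamming(candidate, c) >= min_distance for c in chosen):
            chosen.append(candidate)
            if len(chosen) == limit:
                break
    return chosen


def decode(read: str, barcodes, max_errors: int = 1):
    """Return the unique barcode within max_errors of the read, else None."""
    hits = [b for b in barcodes if hamming(read, b) <= max_errors]
    return hits[0] if len(hits) == 1 else None
```

With a minimum distance of 3, any read containing a single substitution lies within one mismatch of exactly one barcode, so `decode` corrects it. A practical design would typically add further constraints (for example, on GC content or homopolymer runs) that are omitted here.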
Example tRNA sequences are disclosed in the Examples section below. In example embodiments, any method of multiplexing sgRNAs can be used instead of tRNA sequences.
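By way of non-limiting illustration, the arrangement of tRNA sequences and guide molecules within a single transcript, and its processing into separate guide molecules, can be sketched as follows. The tRNA, scaffold, and spacer strings below are short hypothetical placeholders and are not the sequences disclosed in the Examples. Processing by RNase P and RNase Z is mimicked simply by excising each copy of the placeholder tRNA.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Hypothetical placeholder sequences, for illustration only.
TRNA = "GCGCATATGCGC"
SCAFFOLD = "GTTTTAGAGC"
SPACERS = ["ACGTACGTACGTACGTACGT", "TTGACCAGTCAGGATCCATG"]


def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def build_array(spacers, trna, scaffold) -> str:
    """Concatenate tRNA-spacer-scaffold units in order (sense orientation)."""
    return "".join(trna + spacer + scaffold for spacer in spacers)


def process_transcript(transcript: str, trna: str):
    """Mimic tRNA excision: split at every tRNA copy and keep the guide units."""
    return [unit for unit in transcript.split(trna) if unit]


array = build_array(SPACERS, TRNA, SCAFFOLD)
# Antisense embedding: the array as it would read on the vector's sense strand.
embedded = reverse_complement(array)
guides = process_transcript(array, TRNA)
```

In this sketch `guides` holds two spacer-plus-scaffold units, one per guide molecule, and `embedded` shows the reverse complement orientation discussed above. The splitting approach assumes the tRNA placeholder does not occur within any spacer or scaffold.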
Population of cells
[0118] In example embodiments, the lentivirus vectors, each comprising two perturbations (e.g., sgRNAs), are delivered to a population of cells to obtain perturbed cells. In example embodiments, the population of cells can include any tissue culture cell line or any primary cell line. In example embodiments, the population of cells expresses any effector proteins required for the perturbation, such as a programmable nuclease or specific CRISPR system, as described further herein. In example embodiments, the population of cells comprises eukaryotic cells, preferably, mammalian cells, more preferably, human cells.
METHOD OF PERFORMING MULTIPLEXED POOLED PERTURBATIONS
[0119] The present disclosure provides for a method of performing multiplexed pooled perturbation screening comprising introducing one or more perturbation vectors encoding two or more guide molecules in a single transcript according to any embodiment herein to a population of cells to obtain perturbed cells, optionally the population of cells expresses any effector proteins required for the perturbation; and performing single cell RNA sequencing on the perturbed cells to identify the phenotype and multiplexed perturbations in single cells.
[0120] The exemplary embodiments disclosed herein are directed to a method of performing multiplexed pooled optical perturbation screening comprising introducing one or more perturbations using the vectors described herein. The perturbed cells are then fixed and permeabilized. The mRNA is reverse transcribed from the pol II promoter of the lentivirus vector in the perturbed cells using primers specific for the guide molecules, thereby generating cDNA comprising the sequence encoding the guide molecules. The cDNA is then fixed in the cells and in situ sequencing (ISS) of the cDNA sequences is performed. The ISS step may comprise contacting the perturbed cell with padlock probes flanking each guide molecule and/or iBAR, gap filling the padlock probes to capture each spacer and iBAR sequence, ligating the padlock probes into circular ssDNA templates, performing rolling circle amplification to generate amplified cDNA, and sequencing the amplified cDNA using sequencing by synthesis. During the sequencing-by-synthesis process, an enzyme, typically a DNA polymerase, initiates the sequencing by incorporating fluorescently labeled nucleotide bases one at a time. After each base addition, the sample is imaged to detect the fluorescent signal associated with that base, which signifies its identity. This process is cycled for multiple rounds, with each cycle adding and imaging a single base. The accumulated data, represented by a series of images, is then analyzed to deduce the complete RNA sequence. Sequencing by synthesis provides valuable insights into gene expression patterns and RNA modifications within individual cells and tissues while preserving their spatial context.
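By way of non-limiting illustration, the cycle-by-cycle decoding described above can be sketched as calling, for each imaging cycle, the base whose fluorescence channel gives the strongest signal, and then looking up the resulting read in a library of known spacer or iBAR prefixes. The channel order, the exact-match lookup, and all names and values below are assumptions made for this sketch.

```python
CHANNELS = "ACGT"  # hypothetical mapping of channel index to base


def call_bases(intensities) -> str:
    """Call one base per cycle from per-channel signals.

    `intensities` holds one 4-tuple per cycle, ordered as in CHANNELS.
    """
    read = []
    for cycle in intensities:
        brightest = max(range(len(CHANNELS)), key=lambda i: cycle[i])
        read.append(CHANNELS[brightest])
    return "".join(read)


def assign_perturbation(read: str, library: dict):
    """Return the perturbation whose sequenced prefix matches the read exactly."""
    return library.get(read)


# Three imaging cycles for a single amplified cDNA spot (made-up values).
spot = [(0.9, 0.1, 0.0, 0.1), (0.1, 0.2, 0.8, 0.0), (0.0, 0.1, 0.1, 0.7)]
library = {"AGT": "guide_A", "TTC": "guide_B"}  # hypothetical prefixes
read = call_bases(spot)
perturbation = assign_perturbation(read, library)
```

A practical pipeline would additionally register images across cycles, correct for channel crosstalk, and tolerate read errors, none of which is modeled here.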
[0121] In another aspect, embodiments disclosed herein are directed to a method of performing multiplexed pooled perturbation screening, the method involving introducing one or more perturbation vectors that includes a promoter, optionally where the promoter may be used to generate in vitro transcription barcoded RNA. Combined with optical mRNA perturbation detection, in vitro detection can enhance decoding strategies employed in the performance of multiplexed pooled perturbation screening.
Perturbations
[0122] In example embodiments, the term "perturbation" refers to any alteration of the function of a biological system by external or internal means, such as alterations in gene expression, alterations by environmental stimuli, or alterations by drug treatment. In example embodiments, the perturbation used in the present disclosure is genetic. As used herein a genetic perturbation refers to a perturbation that perturbs a nucleic acid, such as a genome sequence (e.g., a target gene or regulatory element) or RNA sequence (e.g., a transcript sequence). In example embodiments, a plurality of cells is perturbed with sequence specific perturbations. As used herein "sequence specific" refers to a perturbation that targets a specific nucleotide sequence in a cell (e.g., a DNA or RNA sequence). In example embodiments, a genetic perturbation is a CRISPR mediated perturbation (e.g., INDELs, substitutions, CRISPRa (CRISPR activation), CRISPRi (CRISPR interference), prime editing, or base editing) or an RNAi (RNA interference) mediated perturbation. In example embodiments, the perturbations can be identified by sequencing. In embodiments, each perturbation can be identified by at least one barcode sequence. In example embodiments, the one or more perturbations target specific genes of interest. In example embodiments, the perturbations both target the same gene, but at a different sequence.
[0123] In example embodiments, perturbations include any perturbation that can be directed to a target sequence for perturbation by a programmable system. In example embodiments, the programmable system is a programmable nuclease system, such as a CRISPR system. In example embodiments, the perturbations encoded by the vectors of the present disclosure are guide sequences capable of targeting a programmable system to a target sequence in a cell. In example embodiments, the programmable system includes an enzymatic component that is targeted to the perturbation target. 
In example embodiments, the cells used for the multiplexed perturbations are modified to express the programmable system for making the perturbations in the cells. In other example embodiments, the programmable system is introduced to the cells concurrently with or before introducing the vectors encoding the perturbations. In embodiments, the perturbations are guide sequences specific for a CRISPR system. In example embodiments, the guide sequences are single guide RNA sequences (sgRNA), which are described further herein. Example CRISPR systems that can be directed to a target sequence by a guide sequence are provided below.
CRISPR-Cas
[0124] As noted above, the population of cells may express a Cas polypeptide from a CRISPR-Cas system. The cell may be genetically modified to express the Cas polypeptide, or the Cas polypeptide may be delivered prior to, with, or subsequent to delivery of the viral vector systems disclosed herein. The Cas polypeptide used will align with the corresponding guide molecule, i.e., a Cas9 will be used with a Cas9 guide molecule, and so forth. In an example embodiment, the Cas is a Type II Cas polypeptide. In an embodiment, the Type II CRISPR-Cas system is a II-A CRISPR-Cas system. In some embodiments, the Type II CRISPR-Cas system is a II-B CRISPR-Cas system. In some embodiments, the Type II CRISPR-Cas system is a II-C1 CRISPR-Cas system. In some embodiments, the Type II CRISPR-Cas system is a II-C2 CRISPR-Cas system. In some embodiments, the Type II system is a Cas9 system. In some embodiments, the Type II system includes a Cas9.
[0125] In embodiments, the Type V CRISPR-Cas system is a V-A CRISPR-Cas system. In some embodiments, the Type V CRISPR-Cas system is a V-B1 CRISPR-Cas system. In some embodiments, the Type V CRISPR-Cas system is a V-B2 CRISPR-Cas system. In some embodiments, the Type V CRISPR-Cas system is a V-C CRISPR-Cas system. In some embodiments, the Type V CRISPR-Cas system is a V-D CRISPR-Cas system. In some embodiments, the Type V CRISPR-Cas system is a V-E CRISPR-Cas system. In some embodiments, the Type V CRISPR-Cas system is a V-F1 CRISPR-Cas system. In some embodiments, the Type V CRISPR-Cas system is a V-F1 (V-U3) CRISPR-Cas system. In some embodiments, the Type V CRISPR-Cas system is a V-F2 CRISPR-Cas system. In some embodiments, the Type V CRISPR-Cas system is a V-F3 CRISPR-Cas system. In some embodiments, the Type V CRISPR-Cas system is a V-G CRISPR-Cas system. In some embodiments, the Type V CRISPR-Cas system is a V-H CRISPR-Cas system. In some embodiments, the Type V CRISPR-Cas system is a V-I CRISPR-Cas system. In some embodiments, the Type V CRISPR-Cas system is a V-K (V-U5) CRISPR-Cas system. In some embodiments, the Type V CRISPR-Cas system is a V-U1 CRISPR-Cas system. In some embodiments, the Type V CRISPR-Cas system is a V-U2 CRISPR-Cas system. In some embodiments, the Type V CRISPR-Cas system is a V-U4 CRISPR-Cas system. In some embodiments, the Type V CRISPR-Cas system includes a Cas12a (Cpf1), Cas12b (C2c1), Cas12c (C2c3), Cas12d (CasY), Cas12e (CasX), Cas14, and/or CasΦ.
[0126] In some embodiments, the Class 2 system is a Type VI system. In some embodiments, the Type VI CRISPR-Cas system is a VI-A CRISPR-Cas system. In some embodiments, the Type VI CRISPR-Cas system is a VI-B1 CRISPR-Cas system. In some embodiments, the Type VI CRISPR-Cas system is a VI-B2 CRISPR-Cas system. In some embodiments, the Type VI CRISPR-Cas system is a VI-C CRISPR-Cas system. In some embodiments, the Type VI CRISPR-Cas system is a VI-D CRISPR-Cas system. In some embodiments, the Type VI CRISPR-Cas system includes a Cas13a (C2c2), Cas13b (Group 29/30), Cas13c, and/or Cas13d.
OMEGA systems
[0127] In embodiments, the programmable nuclease to modify the one or more target genes is a transposon-encoded RNA-guided nuclease system, referred to herein as OMEGA (obligate mobile element-guided activity). See, e.g., Altae-Tran H, Kannan S, Demircioglu FE, et al. The widespread IS200/IS605 transposon family encodes diverse programmable RNA-guided endonucleases. Science. 2021;374(6563):57-65. OMEGA systems include, but are not limited to, IscB, IsrB, and TnpB systems. The guide molecule of an OMEGA system is referred to as an ωRNA and, while different in size and structure from a typical CRISPR-Cas guide, also contains a programmable spacer sequence and a scaffold component. Accordingly, OMEGA systems may be used within the context of the disclosure, both in the context of viral vectors encoding two or more guide molecules that are ωRNAs, and in the use of OMEGA polypeptides to make the desired cellular perturbation.
Sequencing
[0128] Perturb-seq identifies perturbations by sequencing barcodes identifying the perturbations expressed as poly(A) tailed mRNAs. The multiplexed vectors described herein can be used with any method of perturb-seq. Examples of prior perturb-seq assays have been described (see, e.g., Dixit et al., "Perturb-Seq: Dissecting Molecular Circuits with Scalable Single-Cell RNA Profiling of Pooled Genetic Screens" 2016, Cell 167, 1853-1866; Adamson et al., "A Multiplexed Single-Cell CRISPR Screening Platform Enables Systematic Dissection of the Unfolded Protein Response" 2016, Cell 167, 1867-1882; Jaitin DA, Weiner A, Yofe I, et al. Dissecting Immune Circuits by Linking CRISPR-Pooled Screens with Single-Cell RNA-Seq. Cell. 2016;167(7):1883-1896.e15; Feldman et al., Lentiviral co-packaging mitigates the effects of intermolecular recombination and multiple integrations in pooled genetic screens, bioRxiv 262121, doi: doi.org/10.1101/262121; Datlinger, et al., 2017, Pooled CRISPR screening with single-cell transcriptome readout. Nature Methods. Vol.14 No.3 DOI: 10.1038/nmeth.4177; Hill et al., On the design of CRISPR-based single cell molecular screens, Nat Methods. 2018 Apr;15(4):271-274; Replogle, et al., "Combinatorial single-cell CRISPR screens by direct guide RNA capture and targeted sequencing" Nat Biotechnol (2020). doi.org/10.1038/s41587-020-0470-y; Schraivogel D, Gschwind AR, Milbank JH, et al. "Targeted Perturb-seq enables genome-scale genetic screens in single cells". Nat Methods. 2020;17(6):629-635; Frangieh CJ, Melms JC, Thakore PI, et al. Multimodal pooled Perturb-CITE-seq screens in patient models define mechanisms of cancer immune evasion. Nat Genet. 2021;53(3):332-341; US patent application publication number US20200283843A1; and US Patent number US11214797B2). The present multiplexed vector can identify multiplexed perturbations in a single transcript.
[0129] In example embodiments, perturbations are identified along with transcriptome mRNA using single cell sequencing. In example embodiments, the disclosure involves single cell RNA sequencing (see, e.g., Qi Z, Barrett T, Parikh AS, Tirosh I, Puram SV. Single-cell sequencing and its applications in head and neck cancer. Oral Oncol. 2019;99:104441; Kalisky, T., Blainey, P. & Quake, S. R. Genomic Analysis at the Single-Cell Level. Annual review of genetics 45, 431-445, (2011); Kalisky, T. & Quake, S. R. Single-cell genomics. Nature Methods 8, 311-314 (2011); Islam, S. et al. Characterization of the single-cell transcriptional landscape by highly multiplex RNA-seq. Genome Research, (2011); Tang, F. et al. RNA-Seq analysis to capture the transcriptome landscape of a single cell. Nature Protocols 5, 516-535, (2010); Tang, F. et al. mRNA-Seq whole-transcriptome analysis of a single cell. Nature Methods 6, 377-382, (2009); Ramskold, D. et al. Full-length mRNA-Seq from single-cell levels of RNA and individual circulating tumor cells. Nature Biotechnology 30, 777-782, (2012); and Hashimshony, T., Wagner, F., Sher, N. & Yanai, I. CEL-Seq: Single-Cell RNA-Seq by Multiplexed Linear Amplification. Cell Reports, Volume 2, Issue 3, p666-673, 2012).
[0130] In example embodiments, the disclosure involves plate based single cell RNA sequencing (see, e.g., Picelli, S. et al., 2014, "Full-length RNA-seq from single cells using Smart-seq2" Nature protocols 9, 171-181, doi: 10.1038/nprot.2014.006).
[0131] In example embodiments, the disclosure involves high-throughput single-cell RNA-seq. In this regard reference is made to Macosko et al., 2015, "Highly Parallel Genome-wide Expression Profiling of Individual Cells Using Nanoliter Droplets" Cell 161, 1202-1214; International patent application number PCT/US2015/049178, published as WO2016/040476 on March 17, 2016; Klein et al., 2015, "Droplet Barcoding for Single-Cell Transcriptomics Applied to Embryonic Stem Cells" Cell 161, 1187-1201; International patent application number PCT/US2016/027734, published as WO2016168584A1 on October 20, 2016; Zheng, et al., 2016, "Haplotyping germline and cancer genomes with high-throughput linked-read sequencing" Nature Biotechnology 34, 303-311; Zheng, et al., 2017, "Massively parallel digital transcriptional profiling of single cells" Nat. Commun. 8, 14049 doi: 10.1038/ncomms14049; International patent publication number WO2014210353A2; Zilionis, et al., 2017, "Single-cell barcoding and sequencing using droplet microfluidics" Nat Protoc. Jan;12(1):44-73; Cao et al., 2017, "Comprehensive single cell transcriptional profiling of a multicellular organism by combinatorial indexing" bioRxiv preprint first posted online Feb. 2, 2017, doi: dx.doi.org/10.1101/104844; Rosenberg et al., 2017, "Scaling single cell transcriptomics through split pool barcoding" bioRxiv preprint first posted online Feb. 2, 2017, doi: dx.doi.org/10.1101/105163; Rosenberg et al., "Single-cell profiling of the developing mouse brain and spinal cord with split-pool barcoding" Science 15 Mar 2018; Vitak, et al., "Sequencing thousands of single-cell genomes with combinatorial indexing" Nature Methods, 14(3):302-308, 2017; Cao, et al., Comprehensive single-cell transcriptional profiling of a multicellular organism. Science, 357(6352):661-667, 2017; Gierahn et al., "Seq-Well: portable, low-cost RNA sequencing of single cells at high throughput" Nature Methods 14, 395-398 (2017); and Hughes, et al., "Highly Efficient, Massively-Parallel Single-Cell RNA-Seq Reveals Cellular States and Molecular Features of Human Skin Pathology" bioRxiv 689273; doi: doi.org/10.1101/689273, all the contents and disclosure of each of which are herein incorporated by reference in their entirety.
[0132] In example embodiments, the disclosure involves single nucleus RNA sequencing. In this regard reference is made to Swiech et al., 2014, "In vivo interrogation of gene function in the mammalian brain using CRISPR-Cas9" Nature Biotechnology Vol. 33, pp. 102-106; Habib et al., 2016, "Div-Seq: Single-nucleus RNA-Seq reveals dynamics of rare adult newborn neurons" Science, Vol. 353, Issue 6302, pp. 925-928; Habib et al., 2017, "Massively parallel single-nucleus RNA-seq with DroNc-seq" Nat Methods. 2017 Oct;14(10):955-958; International Patent Application No. PCT/US2016/059239, published as WO2017164936 on September 28, 2017; International Patent Application No. PCT/US2018/060860, published as WO/2019/094984 on May 16, 2019; International Patent Application No. PCT/US2019/055894, published as WO/2020/077236 on April 16, 2020; Drokhlyansky, et al., "The enteric nervous system of the human and mouse colon at a single-cell resolution," bioRxiv 746743; doi: doi.org/10.1101/746743; and Drokhlyansky E, Smillie CS, Van Wittenberghe N, et al. The Human and Mouse Enteric Nervous System at Single-Cell Resolution. Cell. 2020;182(6):1606-1622.e23, which are herein incorporated by reference in their entirety.
Multiplexed Optical Pooled Screens
[0133] In example embodiments, optical screens are performed in cells using multiplexed perturbations. The present disclosure also provides for a method of performing multiplexed pooled optical perturbation screening comprising introducing one or more perturbation vectors encoding two sgRNAs in a single transcript according to any embodiment herein to a population of cells to obtain perturbed cells (optionally, Cas-expressing); fixing and permeabilizing the perturbed cells; reverse transcribing mRNA transcribed from the pol II promoter of the lentivirus vector in the perturbed cells using primers specific for each of the two sgRNAs, thereby generating cDNA comprising the sequence encoding the two sgRNAs; fixing the cDNA in the perturbed cells; and in situ sequencing (ISS) the sequences comprising each sgRNA spacer and/or iBAR. In example embodiments, in situ sequencing (ISS) comprises contacting the perturbed cells with padlock probes flanking each sgRNA spacer and/or iBAR; gap filling the padlock probes to capture each spacer and iBAR within the padlock probe; ligating the padlock probes into circular ssDNA templates; performing rolling circle amplification on the circular ssDNA templates; and sequencing the amplified cDNA using sequencing by synthesis to decode perturbations. In example embodiments, the primers specific for each of the two sgRNAs for reverse transcription are biotinylated and the method further comprises a streptavidin incubation between reverse transcription and fixing the cDNA in the perturbed cells.
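As an illustrative sketch only, and not the probe designs of the present disclosure, the gap-fill padlock geometry described above can be modeled in Python. All sequences below (the flanks, the gap standing in for a spacer or iBAR, and the probe backbone) are hypothetical placeholders, and the function names are assumptions introduced for illustration. The sketch assumes the probe arms are the reverse complements of the cDNA flanks, so that extension from the probe 3' end copies the gap before ligation closes the circle.

```python
# Hypothetical sketch of gap-fill padlock probe geometry; every sequence is a placeholder.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def revcomp(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def design_padlock(upstream: str, downstream: str, backbone: str) -> str:
    """Build a padlock probe (written 5'->3') whose arms flank a gap on the cDNA.

    `upstream` and `downstream` are the cDNA flanks (5'->3') on either side of
    the barcode region. The probe 5' arm pairs with the upstream flank and the
    probe 3' arm pairs with the downstream flank, leaving the barcode as a gap.
    """
    return revcomp(upstream) + backbone + revcomp(downstream)


def gap_fill_and_ligate(probe: str, gap: str) -> str:
    """Extend the probe 3' end across the gap; ligation then closes the circle.

    The returned string is the circle written linearly from the original 5' end.
    """
    return probe + revcomp(gap)


upstream = "ACGTTGCAAGCTTGAC"      # placeholder cDNA flank 5' of the barcode
gap = "TTGACCGATCAA"               # placeholder 12-nt barcode to be captured
downstream = "GGATCCTAGCATCGTA"    # placeholder cDNA flank 3' of the barcode
backbone = "TCTCGGGAACGCTGAAGA"    # placeholder probe backbone

probe = design_padlock(upstream, downstream, backbone)
circle = gap_fill_and_ligate(probe, gap)

# The template is circular, so rotate it to read across the ligation junction.
offset = len(upstream) + len(backbone)
rotated = circle[offset:] + circle[:offset]
captured_ok = rotated.startswith(revcomp(upstream + gap + downstream))
print("barcode captured in circle:", captured_ok)
```

Rolling circle amplification of such a circle would yield tandem copies of the captured region, which is what makes the barcode readable by sequencing by synthesis in situ.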
[0134] In example embodiments, the perturbed cells are permeabilized. In general, a biological sample can be permeabilized by exposing the sample to one or more permeabilizing agents. Suitable agents for this purpose include, but are not limited to, organic solvents (e.g., acetone, ethanol, and methanol), cross-linking agents (e.g., paraformaldehyde), detergents (e.g., saponin, Triton X-100™, Tween-20™, or sodium dodecyl sulfate (SDS)), and enzymes (e.g., trypsin or proteases (e.g., proteinase K)). In example embodiments, the population of cells is permeabilized with a detergent. In example embodiments, the plasma membranes of the plurality of cells are permeabilized with lower concentrations of commonly used detergents, such as saponin, Triton X-100™, Tween-20™, or sodium dodecyl sulfate (SDS). Saponin interacts with membrane cholesterol, selectively removing it and leaving holes in the membrane. In example embodiments, the detergent is non-ionic. In some embodiments, the detergent is an anionic detergent (e.g., SDS or N-lauroylsarcosine sodium salt solution). In some embodiments, the plurality of cells can be permeabilized using any of the detergents described herein (e.g., SDS and/or N-lauroylsarcosine sodium salt solution) before or after enzymatic treatment (e.g., treatment with any of the enzymes described herein, e.g., trypsin, proteases (e.g., pepsin and/or proteinase K)). Additional methods for sample permeabilization are described, for example, in Jamur et al., Methods Mol. Biol. 588:63-66, 2010, the entire contents of which are incorporated herein by reference.
[0135] In example embodiments, the concentration of the detergent is sufficient to permeabilize the cells without denaturing proteins. In example embodiments, NP40, digitonin, or Tween is used. For example, the concentration of detergent used herein may be from 0.005% to 1%, from 0.01% to 0.8%, from 0.01% to 0.6%, from 0.01% to 0.4%, from 0.01% to 0.2%, from 0.01% to 0.1%, from 0.005% to 0.05%, from 0.01% to 0.03%, from 0.015% to 0.025%, from 0.018% to 0.022%, from 0.015% to 0.017%, from 0.016% to 0.018%, from 0.017% to 0.019%, from 0.018% to 0.02%, from 0.019% to 0.021%, from 0.02% to 0.022%, or from 0.021% to 0.023%. In some cases, the concentration of the detergent may be about 0.01%, about 0.015%, about 0.02%, about 0.025%, or about 0.03%. For example, the concentration of the detergent may be about 0.02%. In example embodiments, SDS is used at concentrations below 0.5%, such as 0.1%, 0.05%, or less than 0.01%.
[0136] In example embodiments, the perturbed cells are fixed. Various fixing methods can be used. In one example embodiment, fixing is accomplished by crosslinking. Non-limiting methods of crosslinking are known in the art. Fixation methods can be divided into two groups: additive and denaturing fixation. Additive fixation solutions (also called cross-linking fixations) contain various aldehydes, including formaldehyde, paraformaldehyde, glutaraldehyde, etc., and can create covalent chemical bonds between molecules. This method can preserve the natural structure of proteins, i.e., secondary and tertiary structures. Another group is the denaturing (or precipitating) fixations. These methods can denature proteins by reducing their solubility and/or disrupting the hydrophobic interactions, and thus modify the tertiary structures of proteins as well as inactivate enzymes. Alcohols, such as methanol and ethanol, are commonly used for denaturing fixation. However, alcohols are seldom solely applied since they can induce serious cell shrinkage. Other denaturing chemicals, like acetone and acetic acid, are usually combined with alcohols to enhance the fixation performance. Common fixation solutions include 2.5% glutaraldehyde, 4-10% formalin (formalin is an alternative name for an aqueous solution of formaldehyde), 4% paraformaldehyde, methanol/acetone (1:1), and ethanol/acetic acid (3:1). Techniques for fixing cells and tissues are known to those of ordinary skill in the art. As non-limiting examples, a cell may be fixed using chemicals such as formaldehyde, paraformaldehyde, glutaraldehyde, ethanol, methanol, acetone, acetic acid, or the like. In one embodiment, a cell may be fixed using Hepes-glutamic acid buffer-mediated organic solvent (HOPE). In example embodiments, the fixing comprises about 0.007% glutaraldehyde in about 4% paraformaldehyde.
In situ sequencing
[0137] In example embodiments, perturbations encoded in the vectors described herein are identified in single cells by in situ methods. In example embodiments, in situ sequencing (ISS) is used to identify the barcodes (e.g., sgRNA spacer and/or iBARs) (see, e.g., Yue L, Liu F, Hu J, et al. A guidebook of spatial transcriptomic technologies, data resources and analysis approaches. Comput Struct Biotechnol J. 2023;21:940-955). In example embodiments, the in situ sequencing comprises any method of locally amplifying short sequences (e.g., sgRNA spacer and/or iBARs) and then imaging one nucleotide at a time. In example embodiments, rolling circle amplification comprises a 'padlock' probe that hybridizes on either side of a target sequence to form a circular template that can be copied repeatedly as a long string. Because the product is tethered to the template, it provides reliable localization and is amenable to in situ sequencing by successive rounds of ligation-based oligonucleotide probe incorporation (see, e.g., Ke R, Mignardi M, Pacureanu A, et al. In situ sequencing for RNA analysis in preserved tissue and cells. Nat Methods. 2013;10(9):857-860). In fluorescence in situ RNA sequencing (FISSEQ), the amplicons of cDNA are generated inside the cells (see, e.g., Lee J.H., et al. Highly multiplexed subcellular RNA sequencing in situ. Science. 2014;343(6177):1360-1363). As used herein, padlock probes (PLP) refer to long oligonucleotides whose ends are complementary to adjacent target sequences. Upon hybridization to the target, the two ends are brought into contact, allowing PLP circularization by ligation.
[0138] In example embodiments, the in situ sequencing comprises simultaneously sequencing in situ more than one of the two spacers and two iBARs. In example embodiments, simultaneous sequencing provides the following advantages: 1) saving sequencing cycles (i.e., reading both iBARs simultaneously yields two bases of information from a cell in each in situ sequencing cycle, cutting the number of cycles needed in half) and/or 2) detecting errors in the perturbation sequences in a cell (e.g., spacer/barcode sequence accuracy problems and/or recombination).
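The cycle-saving argument can be illustrated with a short calculation. The following Python sketch is a simplified model, not a method taken from the disclosure: it assumes each cycle yields one error-free base call (four states) per barcode read, that the barcode pairs are designed jointly, and that no extra cycles are reserved for error detection. The function name `cycles_needed` is hypothetical.

```python
def cycles_needed(library_size: int, barcodes_read_per_cycle: int = 1) -> int:
    """Minimum number of in situ sequencing cycles that can give every
    library member a distinct base-call pattern.

    Each cycle reads one base (4 possible states) from each barcode that is
    sequenced in parallel, so one cycle distinguishes 4 ** k states.
    """
    states_per_cycle = 4 ** barcodes_read_per_cycle
    cycles, capacity = 0, 1
    while capacity < library_size:
        capacity *= states_per_cycle
        cycles += 1
    return cycles


library_size = 4096  # illustrative library size, chosen as an assumption
single = cycles_needed(library_size, barcodes_read_per_cycle=1)
dual = cycles_needed(library_size, barcodes_read_per_cycle=2)
print(f"one iBAR per cycle: {single} cycles; two iBARs per cycle: {dual} cycles")
```

In practice, additional cycles would likely be retained so that barcodes differ at more than one position, which supports the error-detection advantage noted above.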
Optical phenotypes
[0139] In example embodiments, cell morphological phenotypes are determined for multiplexed perturbations. In example embodiments, the methods described herein can be used to detect any phenotypes detectable by microscopy. In example embodiments, the phenotypes comprise cell morphology or biomolecule organization, including those detected by live cell markers, immunostaining, histological staining, or other similar methods. In example embodiments, the one or more additional phenotypes comprise any time resolved phenotype, such as ion indicators (e.g., calcium, sodium, magnesium, zinc, pH, and membrane potential indicators), voltage imaging, dynamic metabolite measurements, markers of cell stress, and/or cell migration. In example embodiments, a movie is taken of the population of cells after perturbation and before fixing. In example embodiments, live cell imaging is performed.
[0140] In example embodiments, morphological features can be identified by cell painting (see, e.g., Bray MA, Singh S, Han H, et al. Cell Painting, a high-content image-based assay for morphological profiling using multiplexed fluorescent dyes. Nat Protoc. 2016;11(9):1757-1774; and Laber, et al., Discovering cellular programs of intrinsic and extrinsic drivers of metabolic traits using LipocyteProfiler, bioRxiv 2021.07.17.452050).
[0141] In example embodiments, optical screens for gene expression are performed for perturbations (e.g., MERFISH; Moffitt JR, Hao J, Wang G, Chen KH, Babcock HP, Zhuang X. High-throughput single-cell gene-expression profiling with multiplexed error-robust fluorescence in situ hybridization. Proc Natl Acad Sci U S A. 2016;113(39):11046-11051). In example embodiments, perturbations are identified according to the present disclosure and combined with MERFISH.
Kits
[0142] In an aspect, the disclosure provides kits containing any one or more of the elements discussed herein. For example, a kit may include any embodiment of multiplexed perturbation vectors, including a library of perturbation vectors capable of perturbing a plurality of gene targets. For example, a kit may include any embodiment of vectors. Additionally, kits may include primers and padlock probes specific to the vectors.
[0143] Elements may be provided individually or in combinations, and may be provided in any suitable container, such as a vial, a bottle, or a tube. In some embodiments, the kit includes instructions in one or more languages, for example in more than one language. In some embodiments, a kit comprises one or more reagents for use in a process utilizing one or more of the elements described herein. Reagents may be provided in any suitable container. For example, a kit may provide one or more reaction or storage buffers. Reagents may be provided in a form that is usable in a particular process, or in a form that requires addition of one or more other components before use (e.g., in concentrate or lyophilized form).
[0144] Further embodiments are illustrated in the following Examples which are given for illustrative purposes only and are not intended to limit the scope of the disclosure.
EXAMPLES
Example 1 - CROPseq-multi for multiplexed perturbation and decoding in pooled genetic screens
[0145] Existing lentiviral perturbation systems offer performance and functionality tradeoffs (Table 1). pLentiGuide17-derived systems can enable multiplexing through the use of serial pol III (typically U6) promoters and SpCas9 sgRNAs18,9,19 (FIG. 4A, FIG. 4B). A major limitation of these multiplexing systems is the ~400 bp distance separating sgRNA spacers, leading to ~30% lentiviral recombination that results in unintended sgRNA combinations (Table 2)18,9,19. If the screening methodology captures both sgRNA identities, these recombination events can be detected and filtered out. However, if only one sgRNA is observed, these recombination events will be misassigned and contribute to experimental noise. These recombination events are an inherent property of lentiviral systems; lentiviruses are pseudodiploid and, in the process of infection, reverse transcription during minus-strand synthesis is prone to template switching in a homology- and distance-dependent manner20-22 (FIG. 4C, FIG. 4D). The Big Papi vector seeks to minimize recombination with antiparallel orthogonal U6-sgRNAs, reducing the distance separating spacers to under 200 bp, and with intervening secondary barcodes, reducing recombination to about 7%23,24. pLentiGuide-derived Cas12a systems capitalize on the native crRNA array processing ability of Cas12a enzymes25,26, enabling a minimal separation of spacers by only 20 bp, likely reducing recombination to negligible levels10,27,28,8. A limitation of both Big Papi and Cas12a systems is their requirement for Cas effector enzymes other than SpCas9. Big Papi relies on the delivery of two Cas effectors, SpCas9 and SaCas9. In contrast to the widely used SpCas9, Cas12a enzymes are typically less active per guide10,27,28, limited in guide design by relatively restrictive protospacer adjacent motifs, and either lagging in development or incompatible with applications including CRISPR-KO, CRISPRa, CRISPRi, base editing, and prime editing29-31. Designs similar to Big Papi but utilizing two orthogonal SpCas9 guide scaffolds have been implemented to eliminate the requirement for SaCas928. However, all current multiplexing solutions lack compatibility with mRNA barcoding that is required for some screening modalities, including some RNA-sequencing and in situ detection workflows.
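The distinction drawn above between detecting recombinants (both sgRNAs observed) and misassigning them (one sgRNA observed) can be sketched in Python. The guide names, the toy library, and the function names below are hypothetical and serve only to illustrate the bookkeeping; they are not data or methods from the disclosure.

```python
def filter_recombinants(designed_pairs, true_pairs):
    """Split per-cell (guide1, guide2) genotypes into designed and recombinant calls.

    This models a readout that observes both sgRNAs, so any pair absent from
    the designed library can be flagged and removed.
    """
    designed = set(designed_pairs)
    kept = [pair for pair in true_pairs if pair in designed]
    flagged = [pair for pair in true_pairs if pair not in designed]
    return kept, flagged


def misassigned_fraction_single_readout(designed_pairs, true_pairs):
    """Fraction of cells mislabeled when only the first sgRNA is observed.

    Each cell is assigned the designed pair that contains its first guide,
    which is wrong whenever the second guide was swapped by recombination.
    """
    inferred = {first: (first, second) for first, second in designed_pairs}
    wrong = sum(1 for pair in true_pairs if inferred[pair[0]] != pair)
    return wrong / len(true_pairs)


# Hypothetical two-member library and four cells, two of which carry swapped guides.
designed_pairs = [("gA1", "gA2"), ("gB1", "gB2")]
true_pairs = [("gA1", "gA2"), ("gA1", "gB2"), ("gB1", "gB2"), ("gB1", "gA2")]

kept, flagged = filter_recombinants(designed_pairs, true_pairs)
noise = misassigned_fraction_single_readout(designed_pairs, true_pairs)
print(len(kept), "cells kept;", len(flagged), "recombinants flagged")
print("misassigned fraction with a single-guide readout:", noise)
```

With a dual readout, the recombinant cells cost only sample size, whereas with a single-guide readout they remain in the data set under the wrong label.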
[0146] Table 1. Reporter barcode swapping rates with lentiviral barcoding systems. Next Generation Sequencing (NGS), Fluorescence Activated Cell Sorting (FACS), quantitative polymerase chain reaction (qPCR).
[0147] Table 2. Comparison of selected lentiviral gRNA delivery systems for barcoding and multiplexing. *Up to 2 for pooled library construction; up to 3 have been demonstrated for arrayed library construction. **Predicted based on distance. BC, barcode.
[0148] LentiGuide-Barcode (LentiGuideBC)1 vectors and similar designs (Perturb-seq32, MOSAIC-seq33, CRISP-seq34) typically sacrifice multiplexing capability in pooled screens for the ability to express a secondary barcode in mRNA (FIG. 1A, FIG. 4E). In principle, these vector designs are not fundamentally incompatible with multiplexing; however, constructing libraries with at least 3 distal designed sequence elements, e.g., two sgRNAs and a barcode, is challenging and to date has only been performed via multi-step arrayed cloning18, which is impractical for high throughput pooled screens. In these designs, the secondary barcode is separated from the sgRNA spacer by at least 1,700 bp encoding the pol II promoter and resistance gene, resulting in lentiviral recombination near the theoretical maximum of 50%20. As only about half of cells are correctly genotyped, this recombination contributes to a major loss of statistical power20. Efforts to shorten this distance by moving the U6-sgRNA downstream of the pol II promoter and resistance gene have resulted in poor guide activity20, potentially due to transcriptional interference by the pol II promoter20,35. Co-packaging integration-deficient templates as a means to mitigate recombination has also been described, albeit at the cost of ~100-fold reduction in lentiviral titer21.
[0149] The CROPseq vector offers a solution to mRNA-barcoding without lentiviral recombination3? (FIG. 1B). By embedding the U6 and sgRNA within the lentiviral 3’ long terminal repeat (LTR), the CROPseq design leverages the high-fidelity intramolecular duplication of the 3’ LTR to the 5’ end of the lentiviral genome during transduction (FIG. 5A). This duplication results in two copies of the sgRNA, with the 5’ duplicate expressing functional sgRNAs without transcriptional interference and the 3’ version transcribed as mRNA in the 3’ UTR of the pol II-transcribed selection gene, compatible with mRNA detection approaches. With these considerations in mind, engineering of a lentiviral system that could enable multiplexed SpCas9-based perturbation with mRNA-barcoding was pursued.
Design of a CROPseq-inspired multiplexing vector
[0150] While CROPseq enables faithful duplication of the 3’ LTR via intramolecular recombination during plus-strand synthesis, adjacent guides and/or barcode elements are still vulnerable to intermolecular recombination during minus-strand synthesis (FIG. 5A). To minimize the 3’ LTR insertion and sgRNA separation, endogenous tRNA processing was employed as the multiplexing solution, as other groups have implemented, albeit for distinct applications37-42 (FIG. 1C). Encoded in about 72 bp, tRNAs recruit endogenous RNase P and Z to cleave the tRNA at the 5’ and 3’ ends, such that, when positioned between sgRNAs within a single transcript, the transcript is processed into separate, functional sgRNAs. While only a single tRNA is required for multiplexing (U6-sgRNA-tRNA-sgRNA), it was elected herein to additionally precede the first sgRNA with a tRNA (U6-tRNA-sgRNA-tRNA-sgRNA) as pol III promoter expression can be improved when paired with tRNAs, which encode their own promoter elements37,42. Furthermore, each tRNA eliminates the requirement for a 5’ guanine base on the following guide that is otherwise required in U6 transcription systems and is often encoded as a mismatched 20th or 21st base of the spacer. This design increased the size of the 3’ LTR insertion from 352 bp (CROPseq) to 643 bp (FIG. 5A). As tRNA-encoding sequences could be processed out of mRNA encoding either the lentiviral genome or the selection gene by the endogenous RNases, the orientation of the elements within the 3’ LTR was reversed (FIG. 1C, FIG. 5A). In other words, the pol III promoter (U6) initiates transcription on the opposite strand from the pol II promoter (EF1a) in the integrated lentivirus of FIG. 1C. This is a distinguishing feature for using tRNA sequences in the vector of the present disclosure. As a result, the transcript from the pol II promoter includes the reverse complement of the tRNA sequences, and these sequences are not cleaved by the endogenous RNases. The transcripts from the pol III promoter include functional tRNA sequences. The vectors of FIG. 1A and FIG. 1B are not operable if tRNA sequences are added because the transcriptional elements are on the same strand. The CROPseq-inspired multiplexing solution has now been termed CROPseq-multi herein (FIG. 1C, FIG. 5A).
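The strand logic described above can be illustrated with a minimal Python sketch. The two tRNA sequences are those listed herein as SEQ ID NO: 18 and SEQ ID NO: 19; the spacers and the shortened scaffold are hypothetical placeholders, and the model is an assumption-laden simplification that treats a transcript as a DNA-alphabet string and tests only whether a sense tRNA sequence is present.

```python
# Illustrative strand model; spacers and scaffold are hypothetical placeholders.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def revcomp(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, preserving case."""
    return seq.translate(COMPLEMENT)[::-1]


# Leader and middle tRNAs as listed in the text (SEQ ID NO: 18 and SEQ ID NO: 19).
TRNA_G = "GCATTGGTGGTTCAGTGGTAGAATTCTCGCCTGCCACGCGGGAGGCCCGGGTTCGATTCCCGGCCAATGCA"
TRNA_P = "GGCTCGTTGGTCTAGGGGTATGATTCTCGCTTAGGGTGCGAGAGGtCCCGGGTTCAAATCCCGGACGAGCCC"

SPACER_1 = "GAGACCGATAGGATCCAAGC"   # placeholder 20-nt spacer
SPACER_2 = "CCATAGGACCGAGAACTCAG"   # placeholder 20-nt spacer
SCAFFOLD = "GTTTAAGAGCTAAGC"        # placeholder stand-in for an sgRNA scaffold

# Sense strand read by the pol III (U6) promoter: tRNA-sgRNA-tRNA-sgRNA.
u6_transcript = TRNA_G + SPACER_1 + SCAFFOLD + TRNA_P + SPACER_2 + SCAFFOLD

# The pol II promoter reads the opposite strand, so its transcript carries the
# reverse complement of the same cassette.
pol2_segment = revcomp(u6_transcript)


def contains_sense_trna(transcript: str, trnas=(TRNA_G, TRNA_P)) -> bool:
    """True if any tRNA appears in its sense (processable) orientation."""
    return any(trna in transcript for trna in trnas)


print("U6 transcript carries sense tRNAs:", contains_sense_trna(u6_transcript))
print("pol II transcript carries sense tRNAs:", contains_sense_trna(pol2_segment))
```

Under this model only the pol III product presents tRNAs in the orientation that the processing RNases would recognize, consistent with the pol II transcript remaining intact for mRNA-based barcode detection.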
[0151] In example embodiments, tRNAs are not required. 3' LTR-embedded, antisense crRNA arrays have additionally been used for Cas12. In this embodiment, the mRNA can be used for barcoding because the mRNA transcribed from the pol II promoter is not cleaved (i.e., because the crRNA array is antisense). Cas12 systems do not require tRNAs because Cas12 itself processes the crRNA arrays. In example embodiments, Cas13 arrays can be used for programmable perturbation of RNA. Cas13 systems do not require tRNAs because Cas13 itself processes the crRNA arrays.
[0152] In addition to the guide multiplexing changes, 12 bp barcodes internal to the sgRNAs (iBARs) were added as freely specified additional readout elements43 (FIG. 1C, FIG. 1D, FIG. 5A). Linked barcodes are advantageous for designing sequences with maximum orthogonality, thus minimizing the sequence length needed to uniquely identify library members, for representing pairs of guides that may not be individually unique, and for other applications such as clonal barcoding. As linked barcodes are prone to distance-dependent recombination, the iBAR system is attractive because it places barcodes within the synthetic loop that joins the crRNA and tracrRNA into an sgRNA, only 19 bp from the spacer in the design (FIG. 1D, FIG. 5A). As iBARs are transcribed both antisense as mRNA and within the sgRNA scaffold, their detection should be compatible with both mRNA-based1,18,32,34 and direct-capture9 (U6 product) protocols.
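The notion of barcodes designed for orthogonality can be illustrated with a generic greedy selection by minimum Hamming distance. This Python sketch is not the design procedure of the disclosure, which is not specified here; the function names, the barcode length of 6 (kept short so the exhaustive scan runs quickly, whereas the iBARs described above are 12 bp), and the distance threshold are all assumptions chosen for illustration.

```python
from itertools import combinations, product


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length barcodes differ."""
    return sum(x != y for x, y in zip(a, b))


def pick_barcodes(length: int, min_distance: int, count: int) -> list:
    """Greedily pick up to `count` barcodes that are pairwise at least
    `min_distance` apart, scanning candidates in lexicographic order."""
    chosen = []
    for candidate in product("ACGT", repeat=length):
        barcode = "".join(candidate)
        if all(hamming(barcode, other) >= min_distance for other in chosen):
            chosen.append(barcode)
            if len(chosen) == count:
                break
    return chosen


barcodes = pick_barcodes(length=6, min_distance=3, count=20)
closest = min(hamming(a, b) for a, b in combinations(barcodes, 2))
print(len(barcodes), "barcodes selected; closest pair differs at", closest, "positions")
```

A minimum distance of 3 means any single miscalled base still leaves a read closer to its true barcode than to any other, so such errors can be detected or corrected; a real design would likely add constraints such as GC content or homopolymer limits.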
[0153] First, whether CROPseq-multi was compatible with lentiviral production and transduction was examined. It has been suggested that the CROPseq design may be incompatible with multiplexing as the lentiviral 3’ LTR is expected to be incompatible with large insertions20,22. Insertions of up to 1.2 kb have been evaluated, albeit with reduced viral titers36. While the design was compatible with lentiviral delivery, a roughly 10-fold reduction in functional titer of CROPseq-multi was observed as compared to CROPseq (FIG. 1E, FIG. 5B). Without wishing to be bound by theory, under the rationale that increased size of the 3’ LTR insertion might explain the low titer, viral titers of CROPseq-multi with only one guide (445 bp insertion), two guides (643 bp insertion), and CROPseq (352 bp insertion) were compared. Surprisingly, the same 10-fold reduction in titer was observed with the single guide CROPseq-multi, which indicated that titer did not decrease linearly with increasing 3’ LTR insertion size and that the sequence elements (i.e., tRNA(s)) and/or orientation (i.e., U6 promoter) may contribute (FIG. 5B). It was then hypothesized that an alternate lentiviral promoter could improve titer if RNA secondary structure or transcriptional interference were limiting lentiviral genome production. Swapping the RSV promoter with a CMV promoter rescued the titer of the full two-guide CROPseq-multi system to 64% that of CROPseq (FIG. 1E). The CMV promoter was also used as the default construction for CROPseq-multi.
Editing performance of CROPseq-multi
[0154] An SpCas9-expressing A549 lung adenocarcinoma cell line was next transduced with CROPseq or CROPseq-multi vectors encoding guides targeting AAVS1 and HPRT1, and genome editing performance was evaluated by next generation sequencing (NGS) (FIG. 1F). Ideally, multiplexing systems possess per-guide activity equivalent to single-plex systems, without positional bias (i.e., equal guide activity in any position). First, iBARs were validated as not detrimental to activity by comparing standard sgRNAs to those with 12 nt iBARs, both in the CROPseq vector architecture (FIG. 1F). For CROPseq-multi, both orientations of spacers and four intervening or "middle" tRNAs were tested. Designs employing human proline, glutamine, and alanine tRNAs (tRNAP, tRNAQ, and tRNAA, respectively) displayed no significant difference in editing, for either target in either position, between CROPseq-multi and the single-guide CROPseq vectors, along with minimal positional bias within each CROPseq-multi design (FIG. 1F). One design employing a human valine tRNA (tRNAV) significantly underperformed, relative to CROPseq, with the HPRT1 guide in the second position, suggesting that tRNA selection was an important factor.
[0155] The following human tRNA sequences were employed in the CROPseq-multi designs described herein.
tRNAG - tRNA-Gly-GCC-2-1 (leader tRNA)
GCATTGGTGGTTCAGTGGTAGAATTCTCGCCTGCCACGCGGGAGGCCCGGGTTCGATTCCCGGCCAATGCA (SEQ ID NO: 18)
tRNAP - tRNA-Pro-AGG-2-4
GGCTCGTTGGTCTAGGGGTATGATTCTCGCTTAGGGTGCGAGAGGtCCCGGGTTCAAATCCCGGACGAGCCC (SEQ ID NO: 19)
tRNAQ - tRNA-Gln-CTG-1-5
GGTTCCATGGTGTAATGGTtAGCACTCTGGACTCTGAATCCAGCGaTCCGAGTTCAAATCTCGGTGGAACCT (SEQ ID NO: 20)
tRNAA - tRNA-Ala-AGC-1-1
GGGGGTATAGCTCAGTGGTAGAGCGCGTGCTTAGCATGCACGAGGtCCTGGGTTCGATCCCCAGTACCTCCA (SEQ ID NO: 21)
tRNAV - tRNA-Val-AAC-1-2 (observed positional bias)
GTTTCCGTAGTGTAGTGGTTATCACGTTCGCCTAACACGCGAAAGGTCCCCGGTTCGAAACCGGGCGGAAACA (SEQ ID NO: 22)
Lentiviral recombination with CROPseq-multi
[0156] To experimentally evaluate recombination rates, arrayed and pooled lentiviral production was performed with combinations of CROPseq-multi vectors and the identities of barcode elements after transduction and integration in genomic DNA were measured. Arrayed lentiviral preparation guarantees identical co-packaged genomes such that template switching does not impact barcode pairing and serves as a control for other sources of recombination, such as PCR co-amplification. In pooled lentiviral preparation, as is performed in high throughput pooled screens, lentiviral genome copies are essentially randomly paired within virions. In a pooled preparation of two vectors, 50% of virions will harbor two different barcode pairs that can result in observable intermolecular recombination events (FIG. 2A). In a high-complexity library, nearly all virions are expected to package different barcode pairs, so the recombination rate should approach double the observed rate in a two-vector assay. In all conditions, recombination was measured by next-generation sequencing of genomic integrants.
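The scaling between an observed and an expected true recombination rate can be sketched as follows. This is a minimal illustration, assuming equal vector abundance and random pairing of genomes within virions; the function names and the example rate are hypothetical and not part of the disclosure.

```python
def differing_fraction(n_vectors: int) -> float:
    """Fraction of virions expected to co-package two different vectors.

    Assumes equal vector abundance and random pairing of genome copies.
    """
    if n_vectors < 2:
        raise ValueError("need at least two vectors to observe recombination")
    return 1.0 - 1.0 / n_vectors


def expected_true_rate(observed_rate: float, n_vectors: int) -> float:
    """Scale an observed rate to the rate expected when essentially every
    virion packages two different genomes (high-complexity library)."""
    return observed_rate / differing_fraction(n_vectors)


# Two-vector assay: half of the virions can reveal a recombination event,
# so an illustrative observed rate of 5% corresponds to about 10%.
print(differing_fraction(2))
print(expected_true_rate(0.05, 2))
```

With two vectors the correction factor is 2, matching the doubling described above; as the number of vectors grows, the factor approaches 1.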
[0157] Intermolecular recombination events that decoupled spacers from their iBARs were detected, but at rates well below 1% and likely negligible for most applications (FIG. 2C). For two CROPseq-multi vectors with a common middle tRNA, intermolecular recombination events resulting in incorrect pairings of sgRNAs, or "pair-swaps," were observed at about 5.9% (FIG. 2C). As the three middle tRNAs that were validated are divergent in sequence (FIG. 5C), it was reasoned that the use of orthogonal middle tRNAs could effectively decrease the barcode-separating distance to only 75 bp. Correspondingly, the use of orthogonal tRNAs reduced observed pair-swap recombination events to 3.2% (FIG. 2C). Observed rates were converted to expected true rates, and recombination rates were compared with previously reported recombination frequencies from other dual barcode systems (either multiplexed guides or guides with secondary barcodes) (FIG. 2C). Barcode swapping frequency was modeled as a function of homologous intervening sequence length, and recombination frequency was found to be well explained by a fixed per-bp probability of template switching of 5.3 x 10^-4 per bp, in agreement with previously reported measurements (ref. 44) (FIG. 2C).
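One way to picture a fixed per-bp switching probability is sketched below. The functional form is an assumption made for illustration: each base of homology is treated as an independent chance to switch templates, and a barcode swap is counted only when the number of switches is odd. The source states only that a fixed per-bp probability explains the data, so this is not necessarily the model that was fitted.

```python
def swap_probability(homology_bp: int, per_bp: float = 5.3e-4) -> float:
    """Probability of an odd number of template switches across a stretch
    of homologous sequence, with independent per-bp switching.

    For n independent trials with success probability p, the probability
    of an odd number of successes is (1 - (1 - 2p)**n) / 2.
    """
    if homology_bp < 0:
        raise ValueError("homology length must be non-negative")
    return 0.5 * (1.0 - (1.0 - 2.0 * per_bp) ** homology_bp)


# Swap probability rises with the length of homology between barcodes.
for length in (19, 75, 200):
    print(length, round(swap_probability(length), 4))
```

Under this assumed form the swap probability is close to linear in length for short homologies, which is why shortening the barcode-separating distance is expected to reduce pair-swaps.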
[0158] With both arrayed and pooled lentiviral preps, deletion events were also observed in about 2% of integrations, which appeared to be driven by sgRNA scaffold homology, despite the use of two orthogonal scaffold sequences (FIG. 2B, FIG. 5D). Considering the observed deletion rates and the expected true pair-swap rates based on the observed rates, it was estimated that the fraction of CROPseq-multi integration events without any recombination events for complex libraries was 87% with the use of a single middle tRNA or 90% when implemented with three orthogonal middle tRNAs (FIG. 2D).
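The arithmetic behind these estimates can be approximated as follows. The formula is not given in the text, so this sketch assumes that deletions and pair-swaps are independent, that observed two-vector rates are doubled to obtain true rates, and that three orthogonal middle tRNAs are used equally, so a randomly co-packaged pair shares its middle tRNA one third of the time. The inputs are the rounded rates reported above, so the results land near, rather than exactly on, 87% and 90%.

```python
DELETION_RATE = 0.02          # about 2% of integrations
OBS_SWAP_SHARED = 0.059       # observed, common middle tRNA
OBS_SWAP_ORTHOGONAL = 0.032   # observed, orthogonal middle tRNAs


def true_swap(observed: float, n_vectors: int = 2) -> float:
    """Scale an observed pair-swap rate from an n-vector assay."""
    return observed / (1.0 - 1.0 / n_vectors)


def intact_fraction(swap_rate: float, deletion_rate: float = DELETION_RATE) -> float:
    """Fraction of integrations with neither a pair-swap nor a deletion,
    assuming the two event types are independent."""
    return (1.0 - swap_rate) * (1.0 - deletion_rate)


single_trna = intact_fraction(true_swap(OBS_SWAP_SHARED))

# With three tRNAs used equally, 1/3 of pairs share a tRNA, 2/3 do not.
mean_swap = (true_swap(OBS_SWAP_SHARED) + 2.0 * true_swap(OBS_SWAP_ORTHOGONAL)) / 3.0
three_trnas = intact_fraction(mean_swap)

print(round(single_trna, 3), round(three_trnas, 3))
```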
Multiplexed detection of CROPseq-multi with in situ sequencing for optical pooled screens
[0159] Detecting barcodes with in situ sequencing creates a unique set of constraints. Briefly, in situ detection of barcodes involves fixation and permeabilization of cells, reverse transcription of barcoded mRNA to cDNA, fixation of the cDNA, copying the barcode sequence into a padlock probe, ligation of the padlock into a circular ssDNA template, rolling circle amplification, and sequencing by synthesis to decode perturbations (refs. 1, 45) (FIG. 3A). Robust mRNA expression is required for efficient detection. In addition, because sequencing reagent costs and imaging time impact screening throughput, it is desirable to minimize the number of sequencing cycles (i.e., barcode bases) needed to uniquely identify perturbations.
[0160] Padlock probes flanking each spacer and iBAR were designed such that both features are captured within the gapfill (FIG. 6A). First, the detection efficiency of the first spacer and iBAR pair was optimized. With the standard protocol, detection efficiencies (reads per cell) were low for the CROPseq-multi iBAR, averaging 1.3 reads per cell, compared to an average of 2.9 reads per cell for the CROPseq spacer (FIG. 3B). Two protocol changes were employed to improve detection efficiencies for CROPseq-multi (FIG. 3A). First, the primary fixation was altered by adding 0.007% glutaraldehyde to the standard 4% PFA fixative. Second, cDNA retention was optimized by using a biotinylated reverse transcription primer and adding a streptavidin incubation between the reverse transcription and cDNA fixation steps (FIG. 3A). The optimized cDNA retention alone improved detection efficiency to an average 7.5 reads per cell for CROPseq-multi (FIG. 3B). The optimized primary fixation alone did not improve detection, but in combination with the optimized cDNA retention, further improved detection efficiency to an average 18.8 reads per cell (FIG. 3B). These modifications did not improve detection of the CROPseq vector, which indicated that an optimal detection protocol might be specific to a given barcode design.
[0161] An additional feature of the CROPseq-multi design is that the use of two barcodes could facilitate more efficient decoding of perturbations. With simultaneous detection of both barcodes as separate reads, or multiplexed detection, a total of two nucleotides of a barcode pair can be decoded per sequencing cycle (one nucleotide from each barcode per cycle). Further, the detection of both barcodes would enable identification and filtering of lentiviral recombination events. Of note, this strategy is dependent on the ability to reliably detect both barcodes in each cell, which should be facilitated by the optimized detection protocol.
[0162] Multiplexed detection of iBARs 1 and 2 was performed with the optimized protocol disclosed herein, and a mean of 27.7 total reads per cell was observed (FIG. 3B). Without wishing to be bound by theory, multiplexed detection could impact per-barcode detection efficiencies due to optical crowding at high read densities. Detection of iBAR 1 and iBAR 2 was performed both individually and multiplexed, and only modestly lower per-barcode detection efficiencies were observed when multiplexed (FIG. 3C, FIG. 3D). In this experiment, all iBAR sequences were unique and reads were assigned to iBAR 1 or iBAR 2 based on mapping sequences to the library; however, a fluorophore-conjugated oligo could be used to assign reads to the correct iBAR position (FIG. 6B, FIG. 6C).
[0163] Multiplexed in situ detection was used to decode perturbations and quantify recombination in cells transduced with 3 CROPseq-multi vectors, delivered by either arrayed or pooled lentiviral preparations. When recombination is assayed with 3 vectors, the observed recombination rate is expected to reflect 2/3 of the true recombination rate. Unlike the NGS measurements, quantification of recombination in situ is dependent on the accuracy of cell segmentation and is sensitive to technical artifacts such as transcript diffusion. The stringency of read assignment to cells was increased by varying the required minimum read counts per iBAR, on the rationale that cells with few reads for either iBAR might be the result of deletion recombination events, silencing of the lentiviral transgene, or incomplete selection, and, together with imperfect cell segmentation, could appear as false-positive pair-swap events (FIG. 3E). While the in situ detected pair-swap rate was modestly higher for both arrayed and pooled lentiviral preparations compared to NGS-measured pair-swap rates of the same samples, increasing the read count stringency for iBAR assignment brought the two measurements closer to agreement (FIG. 3E). In a screening context, the primary goal would likely be filtering out all such low-confidence assignments and incorrect pairings, so the distinction between pair-swap events and these potential modes of false positives is less important.
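The read-count stringency filter described above can be sketched as a per-cell classification. This is an illustrative sketch only: the data layout, the function name `classify_cell`, the label strings, and the default threshold are hypothetical, and the actual analysis pipeline is not described at this level of detail in the text.

```python
from collections import Counter


def classify_cell(reads, designed_pairs, min_reads=2):
    """Classify one segmented cell from its in situ reads.

    reads: iterable of (ibar_position, barcode) tuples, position 1 or 2.
    designed_pairs: set of (iBAR1, iBAR2) tuples present in the library.
    min_reads: minimum reads required for each iBAR (stringency setting).
    """
    counts = {1: Counter(), 2: Counter()}
    for position, barcode in reads:
        counts[position][barcode] += 1
    if not counts[1] or not counts[2]:
        return "unassigned"          # one barcode was never detected
    top1, n1 = counts[1].most_common(1)[0]
    top2, n2 = counts[2].most_common(1)[0]
    if n1 < min_reads or n2 < min_reads:
        return "low_confidence"      # filtered out at this stringency
    return "matched" if (top1, top2) in designed_pairs else "pair_swap"


library = {("AAC", "GGT"), ("CCA", "TTG")}
cell = [(1, "AAC"), (1, "AAC"), (2, "TTG"), (2, "TTG"), (2, "TTG")]
print(classify_cell(cell, library))               # swapped pairing
print(classify_cell(cell, library, min_reads=3))  # stricter threshold
```

Raising `min_reads` moves marginal cells from the pair-swap category into the filtered category, which is the behavior the stringency sweep is meant to probe.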
[0164] In optical pooled screens, throughput is largely dictated by imaging time, which scales linearly with sequencing cycles during sample genotyping. The use of even a single secondary barcode reduces the required cycle number for a given library size to roughly half that of decoding via the spacer sequence of a typical dual sgRNA library (FIG. 3F). With multiplexed decoding of two secondary barcodes, the required number of cycles is again halved relative to a single secondary barcode, as two bases are decoded per sequencing cycle (FIG. 3F). For most library sizes, a single additional sequencing cycle with multiplexed decoding is sufficient to detect >95% of recombination events, corresponding to a roughly 3-fold decrease in cycle number relative to decoding with spacer sequences (FIG. 3F). For example, decoding a genome-wide CROPseq library with 4 guides per gene (~80,000 vectors) would require sequencing all 20 cycles of the spacer. With CROPseq-multi, an equivalent number of vectors, encoding twice as many guides per gene, could be decoded with only 6 cycles of sequencing while detecting recombination events. Correspondingly, multiplexed decoding reduces the sequencing reagent costs by the same factor.
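The cycle counts in this example can be reproduced with a simple lower bound. The sketch below is an assumption-laden simplification: it counts only the number of distinct base combinations needed to give every library member a unique code, and ignores the extra sequence distance that practical barcode sets use for error tolerance. The function name `min_cycles` is hypothetical.

```python
def min_cycles(n_members: int, bases_per_cycle: int = 1) -> int:
    """Smallest number of sequencing cycles whose base combinations can
    uniquely label n_members, with bases_per_cycle bases read per cycle."""
    if n_members < 1 or bases_per_cycle < 1:
        raise ValueError("arguments must be positive")
    cycles = 0
    while 4 ** (cycles * bases_per_cycle) < n_members:
        cycles += 1
    return cycles


library_size = 80_000
single = min_cycles(library_size, bases_per_cycle=1)       # one barcode
multiplexed = min_cycles(library_size, bases_per_cycle=2)  # two barcodes
with_check = multiplexed + 1  # one extra cycle to flag recombination

print(single, multiplexed, with_check)
print(round(20 / with_check, 1))  # fold reduction versus a 20-cycle spacer
```

Under these assumptions, about 80,000 vectors need 5 multiplexed cycles for unique identification, and the one additional cycle for recombination detection gives the 6 cycles cited above.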
[0165] CROPseq-multi is a generalized multiplexing solution that addresses numerous technical challenges in multiplexed screens, including robust and equal guide activity, minimized lentiviral recombination, and compatibility with mRNA-barcoding and in situ readout. Both single-perturbation and combinatorial screens can be enabled by CROPseq-multi. In single-target screens, activity with two guides against the same target per vector typically achieves superior on-target performance with smaller library sizes, and optical pooled screens will benefit from superior detection and improved decoding efficiency. For combinatorial screens, CROPseq-multi offers a solution with minimal lentiviral recombination without diverging from the most highly-developed SpCas9-based systems.
Example 2 - Inclusion of T7 promoter in CROPseq and CROPseq-multi for multiplexed perturbation and decoding in pooled genetic screens
[0166] Building upon the vectors generated in Example 1, engineering of a lentiviral system compatible with an in vitro transcription readout was pursued. Accordingly, taking the vectors generated in Example 1, a T7 promoter was further included, thereby generating a vector which has been herein termed CROPseq-multi-T7 (FIG. 7A). Inclusion of the T7 promoter is illustrated in the alignment shown in FIG. 8.
[0167] As noted above in Example 1, the 3' LTR is duplicated during lentiviral integration. The duplicated multiplexing cassette generates functional sgRNAs via pol III transcription and endogenous tRNA processing (see FIG. 4E, FIG. 7A). Significantly, however, the inclusion of the T7 promoter in the CROPseq-multi-T7 vector further enables optional in vitro transcription detection that is independent of endogenous transcription activity (FIG. 7A). While in situ amplification and detection yields distinct punctate "colonies," in vitro detection yields a single large nuclear focus (FIG. 7B). Thus, used in combination with in situ amplification and screening, multiplexed perturbation vectors compatible with in vitro transcription readouts enhance pooled genetic screening decoding strategies.
Editing performance of CROPseq-T7 and CROPseq-multi-T7
[0168] An SpCas9-expressing A549 lung adenocarcinoma cell line was next transduced with CROPseq-multi-T7 vectors encoding guides targeting AAVS1 and HPRT1, to evaluate the impact of the inclusion of the T7 promoter on genome editing activity. Using next generation sequencing (NGS), inclusion of the T7 promoter was validated as not impacting genome editing activity of the CROPseq-multi vector (FIG. 9A).
[0169] These findings were further validated by showing that respective addition of the T7 promoter to the CROPseq and CROPseq-multi vectors did not impact mRNA detection (FIG. 9B).
Lentiviral recombination with CROPseq-multi-T7
[0170] Next, to experimentally evaluate recombination rates, arrayed and pooled lentiviral production was performed with CROPseq-T7 and CROPseq-multi-T7 vectors, and the identities of barcode elements after transduction and integration in genomic DNA were measured using PerturbView (Kudo, Takamasa, et al. "Highly multiplexed, image-based pooled screens in primary cells and tissues with PerturbView." bioRxiv (2023): 2023-12) and NIS-seq (Fandrey, Caroline I., et al. "Cell Type-Agnostic Optical Perturbation Screening Using Nuclear In-Situ Sequencing (NIS-Seq)." bioRxiv (2024): 2024-01) protocols.
[0171] Employing biotinylated reverse transcription primers and streptavidin (SA) binding to anchor cDNA (alternating columns with primary amine-modified primer), FIG. 9C and FIG. 9D illustrate the enhanced capabilities of CROPseq-T7 and CROPseq-multi-T7 vectors in multiplex screening.
CROPseq-multi-T7 Quantification
[0172] The CROPseq-multi-T7 vector allows for quantification of recombination events using in vitro transcription sequencing and in situ sequencing. With CROPseq-multi-T7, the detection efficiency for iBAR1 was found to be 80%, while the detection efficiency for iBAR2 was 96% (FIG. 10). When the same barcode sequence is used in both iBAR positions and both signals are read simultaneously, the signal intensity for iBAR2 is greater, as expected given its proximity to the T7 promoter (FIG. 10).
[0173] For recombination detection using in vitro transcription, it is not possible to distinguish iBAR1 from iBAR2 using spatial demultiplexing. Therefore, one approach is to read the barcodes serially, though this approach requires additional cycling. For example, a serial amplification and sequencing protocol requires enough amplification cycles and sequencing of iBAR2 to identify the construct, followed by stripping the read, and then an additional two cycles of amplification and sequencing of iBAR1. This approach is sufficient to filter out approximately 95% of recombination events.
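A back-of-envelope check on the figure for two additional cycles is sketched below. It assumes that, after recombination, the iBAR1 bases are effectively random with respect to the expected barcode and that all four bases are equally likely; both are simplifying assumptions, and the function name is hypothetical.

```python
def detected_fraction(extra_bases: int) -> float:
    """Fraction of recombinants flagged after reading extra_bases bases of
    the second barcode, if a recombined barcode matches the expected one
    by chance with probability (1/4) per base."""
    if extra_bases < 0:
        raise ValueError("extra_bases must be non-negative")
    return 1.0 - 0.25 ** extra_bases


for k in (1, 2, 3):
    print(k, detected_fraction(k))
```

Two bases leave a 1-in-16 chance that a recombinant matches by coincidence, so roughly 94% of events are caught, in line with the approximately 95% stated above.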
[0174] Alternatively, another approach is to encode the same barcode in both iBARs and read both barcodes out simultaneously. This approach relies on detection of mixed basecalls during sequencing. Using this approach, if there is no recombination, the one barcode sequence will be recovered, with both barcodes contributing to signal (the one barcode sequence will therefore show discrete basecalls across the region sequenced). In contrast, if there is recombination, different base signals of the recombined sequences will be mixed, and that signal mixing can be used as a filter. Where mixed basecalls are observed, such mixed basecalls identify a mixed underlying population, which has become mixed due to recombination (FIG. 11). This second approach enables filtering of recombination events without additional cycles (as compared to the serial amplification protocol, noted above); however, this approach still requires more amplification cycles than multiplexed mRNA detection.
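A mixed-basecall filter of the kind described here might look like the following sketch. The per-cycle intensity representation, the function names, and the ratio threshold of 0.5 are all hypothetical choices for illustration; the text does not specify how mixing is scored.

```python
def is_mixed(intensities, max_ratio=0.5):
    """Return True if one cycle shows signal from more than one base.

    intensities: dict mapping base ('A', 'C', 'G', 'T') to signal.
    A call is treated as mixed when the runner-up base carries more than
    max_ratio of the strongest base's signal (threshold is an assumption).
    """
    top, second = sorted(intensities.values(), reverse=True)[:2]
    if top <= 0:
        return False  # no usable signal in this cycle
    return second / top > max_ratio


def flag_recombinant(cycles, max_ratio=0.5):
    """Flag a cell if any sequenced cycle shows a mixed basecall."""
    return any(is_mixed(cycle, max_ratio) for cycle in cycles)


clean = [{"A": 900, "C": 40, "G": 30, "T": 20},
         {"A": 35, "C": 25, "G": 850, "T": 30}]
mixed = [{"A": 900, "C": 40, "G": 30, "T": 20},
         {"A": 30, "C": 480, "G": 520, "T": 25}]
print(flag_recombinant(clean), flag_recombinant(mixed))
```

Because iBAR2 gives the stronger signal when both positions are read together, a practical threshold would likely need tuning below an even 1:1 ratio.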
[0175] As shown in FIG. 12, lithium borohydride (LiBH4) treatment is effective in removing fluorescent signals from in situ sequencing (ISS) reagents and is also compatible with subsequent sequencing. These results are advantageous because nonspecific binding of fluorescent dyes used during ISS can lead to high cellular background, which compromises the signal-to-noise ratio for base calling and limits the number of sequencing synthesis cycles that can be implemented, because background noise accumulates to unacceptable levels as the number of cycles increases. Unfortunately, prior art solutions for removing signals from fluorescent dyes while enabling incorporation of nucleotides in subsequent sequencing cycles have not been reported. Consequently, the LiBH4 treatment techniques described herein represent a significant discovery in the field. RPE1-hTERT cells were used to demonstrate that LiBH4 treatment is effective in removing fluorescent signals from in situ sequencing reagents while also being compatible with subsequent sequencing. For example, FIG. 12, left panel, shows cycle 1 signal from ISS of RPE1-hTERT cells before LiBH4 treatment (i.e., pre-treatment). FIG. 12, middle panel, shows the same cells after treatment with 1 mg/mL LiBH4 solution for 30 minutes at room temperature (i.e., post-treatment). FIG. 12, right panel, shows the same cells after cleavage and incorporation in a subsequent round of sequencing (e.g., cycle 2); the cells produced a signal readout, indicating that they were compatible with subsequent sequencing reactions.
Example 3 - Methods employed generally in the preceding examples
[0176] The in situ amplification protocol was modified from previous studies (Feldman, D. et al. Optical Pooled Screens in Human Cells. Cell 179, 787-799.e17 (2019) and Feldman, D. et al. Pooled genetic perturbation screens with image-based phenotypes. Nat. Protoc. 1-37 (2022) doi: 10.1038/s41596-021-00653-8.). For mRNA detection, cells were fixed in 4% (v/v) formaldehyde (Electron Microscopy Sciences 15714) and 0.007% (v/v) glutaraldehyde (Electron Microscopy Sciences 16120) in 1X PBS (Ambion AM9625) for 30 minutes at room temperature, then washed twice in PBS. It has been observed that the use of glutaraldehyde in the primary fixation step can impact some immunofluorescence stains; omission or titration to lower concentrations may offer a balance of detection sensitivity and compatibility with phenotype measurements. Samples were permeabilized in 1X PBS + 0.2% (v/v) Tween-20 (VWR 100216-360) for 15 minutes at room temperature, then washed twice in 1X PBS + 0.1% (v/v) Tween-20, henceforth “PBS-T”. For phenotypic measurements prior to reverse transcription and cDNA fixation, it is recommended to use RiboLock RNase Inhibitor (Thermo Fisher Scientific EO0384) together with RNase-free reagents to preserve mRNA integrity.
[0177] The T7 in vitro transcription protocol was adapted from previous protocols (Kudo, T. et al. Highly multiplexed, image-based pooled screens in primary cells and tissues with PerturbView. 2023.12.26.573143 and Fandrey, C. I. et al. Cell Type-Agnostic Optical Perturbation Screening Using Nuclear In-Situ Sequencing (NIS-Seq). 2024.01.18.576210). Samples were fixed in 4% (v/v) formaldehyde (Electron Microscopy Sciences 15714) in 1X PBS (Ambion AM9625) for 30 minutes at room temperature, then washed twice in PBS. Cells were permeabilized in 70% (v/v) ethanol for 30 minutes at room temperature. After permeabilization, the 70% ethanol solution was diluted with three 75% buffer exchanges of PBS-T, followed by two washes with PBS-T. Reverse crosslinking was performed by incubating samples at 65 °C for 4 hours in 0.1 M sodium bicarbonate and 0.3 M NaCl in water. Samples were washed three times with PBS-T. As an alternative to formaldehyde fixation and reverse crosslinking, fixation and permeabilization in 3:1 (v/v) methanol and acetic acid for 20 minutes, followed by two washes with PBS-T, was validated, as previously described. In vitro transcription was performed as described in the PerturbView and NIS-seq preprints (Kudo, T. et al. Highly multiplexed, image-based pooled screens in primary cells and tissues with PerturbView. 2023.12.26.573143 and Fandrey, C. I. et al. Cell Type-Agnostic Optical Perturbation Screening Using Nuclear In-Situ Sequencing (NIS-Seq). 2024.01.18.576210). After in vitro transcription, samples were washed twice with PBS-T and fixed for 30 minutes at room temperature with 3% (v/v) formaldehyde (Electron Microscopy Sciences 15714) and 0.1% (v/v) glutaraldehyde (Electron Microscopy Sciences 16120). The fixation was quenched with 0.2 M Tris-HCl pH 8.0, and samples were washed three times with PBS-T. Subsequent steps were performed identically for both mRNA detection and T7 in vitro transcription detection protocols.
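The effect of the three 75% buffer exchanges after ethanol permeabilization can be checked with simple arithmetic. The sketch assumes that a "75% buffer exchange" means replacing 75% of the well volume with PBS-T and mixing completely each time; the function name is hypothetical.

```python
def residual_percent(start_percent: float, exchange_fraction: float,
                     n_exchanges: int) -> float:
    """Concentration remaining after repeated partial buffer exchanges,
    assuming complete mixing after each exchange."""
    if not 0.0 <= exchange_fraction <= 1.0:
        raise ValueError("exchange_fraction must be between 0 and 1")
    return start_percent * (1.0 - exchange_fraction) ** n_exchanges


# 70% ethanol, three exchanges of 75% of the volume each.
print(residual_percent(70.0, 0.75, 3))
```

Under this assumption the ethanol falls from 70% to about 1.1% before the two full PBS-T washes, a stepwise dilution rather than an abrupt solvent change.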
[0178] The reverse transcription solution was prepared with the following composition: 1X RevertAid RT buffer (Thermo Fisher Scientific EP0452), 250 µM dNTPs (New England Biolabs N0447L), 1 µM each biotinylated reverse transcription primer (Integrated DNA Technologies), 200 µg/mL molecular biology grade recombinant albumin (rAlbumin) (New England Biolabs B9200S), 0.8 U/µL RiboLock RNase inhibitor (Thermo Fisher Scientific EO0384), and 4.8 U/µL RevertAid H minus Reverse Transcriptase (Thermo Fisher Scientific EP0452). Samples were incubated in reverse transcription solution for 16 hours at 37 °C. Samples were then washed twice with PBS-T and incubated for 15 minutes with 20 µg/mL Streptavidin (New England Biolabs N7021S) and 100 µg/mL rAlbumin (New England Biolabs B9200S) in 1X PBS. Next, samples were washed twice with PBS-T prior to post-fixation in 3% (v/v) formaldehyde (Electron Microscopy Sciences 15714) and 0.1% (v/v) glutaraldehyde (Electron Microscopy Sciences 16120) for 30 minutes at room temperature. After fixation, samples were washed twice with PBS-T and incubated in gapfill and ligation solution at 37 °C for 5 minutes, followed by 45 °C for 90 minutes. The gapfill and ligation solution was composed of 1X Ampligase buffer (Lucigen A3210K), 50 nM dNTPs (New England Biolabs N0447L), 0.1 µM each padlock probe, 200 µg/mL rAlbumin (New England Biolabs B9200S), 0.4 U/µL RNase H (Enzymatics Y9220L), 0.02 U/µL TaqIT polymerase (Enzymatics P7620L), and 0.5 U/µL Ampligase (Lucigen A3210K). Samples were then washed twice with PBS-T and incubated in RCA solution for 16 hours at 30 °C. RCA solution was composed of 1X Phi29 buffer (Thermo Fisher Scientific EP0091), 5% (v/v) glycerol (MilliporeSigma G5516), 250 µM dNTPs (New England Biolabs N0447L), 200 µg/mL rAlbumin (New England Biolabs B9200S), and 1 U/µL Phi29 DNA polymerase (Thermo Fisher Scientific EP0091). Following RCA, samples were washed twice in PBS-T and incubated with 1 µM each sequencing primer (Integrated DNA Technologies) in 2X SSC buffer (Ambion AM9763) for 30 minutes at room temperature, followed by two PBS-T washes.
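Preparing a mix from final concentrations such as these is a dilution calculation, sketched below for the reverse transcription solution. The final concentrations are read from the composition above as micromolar, micrograms per millilitre, and units per microlitre. The stock concentrations, the single-primer simplification, and the 100 µL batch size are assumptions chosen for illustration and are not stated in the text; actual stock values should be taken from the reagent labels.

```python
def mix_volumes(total_ul: float, components: dict) -> dict:
    """Volumes (µL) of each stock needed to reach its final concentration.

    components maps a name to (final, stock) in matching units.
    Water makes up the remaining volume.
    """
    volumes = {name: total_ul * final / stock
               for name, (final, stock) in components.items()}
    water = total_ul - sum(volumes.values())
    if water < 0:
        raise ValueError("stocks are too dilute for the requested volume")
    volumes["water"] = water
    return volumes


# (final, stock); every stock value below is an assumed example.
RT_MIX = {
    "RevertAid RT buffer (X)": (1, 5),
    "dNTPs (µM)": (250, 10_000),
    "biotinylated RT primer (µM)": (1, 100),
    "rAlbumin (µg/mL)": (200, 20_000),
    "RiboLock RNase inhibitor (U/µL)": (0.8, 40),
    "RevertAid H minus RT (U/µL)": (4.8, 200),
}

for name, volume in mix_volumes(100.0, RT_MIX).items():
    print(f"{name}: {volume:.1f} µL")
```

The same helper applies to the gapfill and ligation and RCA solutions by swapping in their components.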
[0179] Sequencing by synthesis was then performed as previously described (Feldman, D. et al. Optical Pooled Screens in Human Cells. Cell 179, 787-799.e17 (2019) and Feldman, D. et al. Pooled genetic perturbation screens with image-based phenotypes. Nat. Protoc. 1-37 (2022) doi: 10.1038/s41596-021-00653-8.). Samples were incubated in incorporation mix (MiSeq Nano kit v2 reagent 1) (Illumina MS-103-1003) for 5 minutes at 60 °C on a flat-top thermal cycler, then washed six times with PR2 buffer (Illumina MS-103-1003), followed by 5 heated washes in PR2 of 5 minutes each at 60 °C. Samples were imaged in 2X SSC + 200 ng/mL DAPI (MilliporeSigma D9542) on a Nikon Ti2 Microscope at 10X magnification. To proceed to the next cycle, samples were incubated in cleavage mix (MiSeq Nano kit v2 reagent 4) (Illumina MS-103-1003) for 6 minutes at 60 °C, then washed three times in PR2 followed by three heated PR2 washes of 1 minute each at 60 °C. Samples were then ready to return to the incorporation step for the subsequent sequencing cycle. In situ sequencing images were analyzed as previously described (Feldman, D. et al. Optical Pooled Screens in Human Cells. Cell 179, 787-799.e17 (2019) and Feldman, D. et al. Pooled genetic perturbation screens with image-based phenotypes. Nat. Protoc. 1-37 (2022) doi: 10.1038/s41596-021-00653-8.).
[0180] Table 3. T7-Containing Vectors of the Disclosure.
References
1. Feldman, D. et al. Optical Pooled Screens in Human Cells. Cell 179, 787-799.e17 (2019).
2. Yao, D. et al. Scalable genetic screening for regulatory circuits using compressed Perturb-seq. Nat. Biotechnol. 1-14 (2023) doi:10.1038/s41587-023-01964-9.
3. Thompson, N. A. et al. Combinatorial CRISPR screen identifies fitness effects of gene paralogues. Nat. Commun. 12, 1302 (2021).
4. Kegel, B. D., Quinn, N., Thompson, N. A., Adams, D. J. & Ryan, C. J. Comprehensive prediction of robust synthetic lethality between paralog pairs in cancer cell lines. Cell Syst. 12, 1144-1159.e6 (2021).
5. Dede, M., McLaughlin, M., Kim, E. & Hart, T. Multiplex enCas12a screens detect functional buffering among paralogs otherwise masked in monogenic Cas9 knockout screens. Genome Biol. 21, 262 (2020).
6. Parrish, P. C. R. et al. Discovery of synthetic lethal and tumor suppressor paralog pairs in the human genome. Cell Rep. 36, 109597 (2021).
7. Ryan, C. J., Mehta, I., Kebabci, N. & Adams, D. J. Targeting synthetic lethal paralogs in cancer. Trends Cancer 9, 397-409 (2023).
8. Petiwala, S. et al. Optimization of Genomewide CRISPR Screens Using AsCas12a and MultiGuide Arrays. CRISPR J. 6, 75-82 (2023).
9. Replogle, J. M. et al. Combinatorial single-cell CRISPR screens by direct guide RNA capture and targeted sequencing. Nat. Biotechnol. 38, 954-961 (2020).
10. Liu, J. et al. Pooled library screening with multiplexed Cpf1 library. Nat. Commun. 10, 3144 (2019).
11. Wong, N., Liu, W. & Wang, X. WU-CRISPR: characteristics of functional guide RNAs for the CRISPR/Cas9 system. Genome Biol. 16, 218 (2015).
12. Xu, H. et al. Sequence determinants of improved CRISPR sgRNA design. Genome Res. 25, (2015).
13. Doench, J. G. et al. Optimized sgRNA design to maximize activity and minimize off-target effects of CRISPR-Cas9. Nat. Biotechnol. 34, 184-191 (2016).
14. Sanson, K. R. et al. Optimized libraries for CRISPR-Cas9 genetic screens with multiple modalities. Nat. Commun. 9, 5416 (2018).
15. Chuai, G. et al. DeepCRISPR: optimized CRISPR guide RNA design by deep learning. Genome Biol. 19, 80 (2018).
16. Kim, H. K. et al. SpCas9 activity prediction by DeepSpCas9, a deep learning-based model with high generalization performance. Sci. Adv. 5, eaax9249 (2019).
17. Sanjana, N. E., Shalem, O. & Zhang, F. Improved vectors and genome-wide libraries for CRISPR screening. Nat. Methods 11, 783-784 (2014).
18. Adamson, B. et al. A Multiplexed Single-Cell CRISPR Screening Platform Enables Systematic Dissection of the Unfolded Protein Response. Cell 167, 1867-1882.e21 (2016).
19. Replogle, J. M. et al. Mapping information-rich genotype-phenotype landscapes with genome-scale Perturb-seq. Cell 185, 2559-2575.e28 (2022).
20. Hill, A. J. et al. On the design of CRISPR-based single cell molecular screens. Nat. Methods 15, 271-274 (2018).
21. Feldman, D., Singh, A., Garrity, A. J. & Blainey, P. C. Lentiviral co-packaging mitigates the effects of intermolecular recombination and multiple integrations in pooled genetic screens. 262121 Preprint at doi.org/10.1101/262121 (2018).
22. Adamson, B., Norman, T. M., Jost, M. & Weissman, J. S. Approaches to maximize sgRNA-barcode coupling in Perturb-seq screens. 298349 Preprint at doi.org/10.1101/298349 (2018).
23. Najm, F. J. et al. Orthologous CRISPR-Cas9 enzymes for combinatorial genetic screens. Nat. Biotechnol. 36, 179-189 (2018).
24. Hegde, M., Strand, C., Hanna, R. E. & Doench, J. G. Uncoupling of sgRNAs from their associated barcodes during PCR amplification of combinatorial CRISPR screens. PLoS ONE 13, e0197547 (2018).
25. Zetsche, B. et al. Cpf1 Is a single RNA-guided endonuclease of a class 2 CRISPR-Cas system. Cell 163, 759-771 (2015).
26. Zetsche, B. et al. Multiplex gene editing by CRISPR-Cpf1 using a single crRNA array. Nat. Biotechnol. 35, 31-34 (2017).
27. DeWeirdt, P. C. et al. Optimization of AsCas12a for combinatorial genetic screens in human cells. Nat. Biotechnol. 39, 94-104 (2021).
28. Li, R. et al. Comparative optimization of combinatorial CRISPR screens. Nat. Commun. 13, 2469 (2022).
29. Komor, A. C., Badran, A. H. & Liu, D. R. CRISPR-Based Technologies for the Manipulation of Eukaryotic Genomes. Cell 168, 20-36 (2017).
30. Moon, S. B., Kim, D. Y., Ko, J.-H. & Kim, Y.-S. Recent advances in the CRISPR genome editing tool set. Exp. Mol. Med. 51, 1-11 (2019).
31. Anzalone, A. V., Koblan, L. W. & Liu, D. R. Genome editing with CRISPR-Cas nucleases, base editors, transposases and prime editors. Nat. Biotechnol. 38, 824-844 (2020).
32. Dixit, A. et al. Perturb-Seq: Dissecting Molecular Circuits with Scalable Single-Cell RNA Profiling of Pooled Genetic Screens. Cell 167, 1853-1866.e17 (2016).
33. Xie, S., Duan, J., Li, B., Zhou, P. & Hon, G. C. Multiplexed Engineering and Analysis of Combinatorial Enhancer Activity in Single Cells. Mol. Cell 66, 285-299.e5 (2017).
34. Jaitin, D. A. et al. Dissecting Immune Circuits by Linking CRISPR-Pooled Screens with Single-Cell RNA-Seq. Cell 167, 1883-1896.e15 (2016).
35. Datlinger, P. et al. Pooled CRISPR screening with single-cell transcriptome readout. Nat. Methods 14, 297-301 (2017).
36. Urbinati, F. et al. Mechanism of Reduction in Titers From Lentivirus Vectors Carrying Large Inserts in the 3'LTR. Mol. Ther. 17, 1527-1536 (2009).
37. Xie, K., Minkenberg, B. & Yang, Y. Boosting CRISPR/Cas9 multiplex editing capability with the endogenous tRNA-processing system. Proc. Natl. Acad. Sci. 112, 3570-3575 (2015).
38. Port, F. & Bullock, S. L. Augmenting CRISPR applications in Drosophila with tRNA-flanked sgRNAs. Nat. Methods 13, 852-854 (2016).
39. Xu, L., Zhao, L., Gao, Y., Xu, J. & Han, R. Empower multiplex cell and tissue-specific CRISPR-mediated gene manipulation with self-cleaving ribozymes and tRNA. Nucleic Acids Res. 45, e28-e28 (2017).
40. Qi, W. et al. High-efficiency CRISPR/Cas9 multiplex gene editing using the glycine tRNA-processing system-based strategy in maize. BMC Biotechnol. 16, 58 (2016).
41. Zhang, Y. et al. A gRNA-tRNA array for CRISPR-Cas9 based rapid multiplexed genome editing in Saccharomyces cerevisiae. Nat. Commun. 10, 1053 (2019).
42. Knapp, D. J. H. F. et al. Decoupling tRNA promoter and processing activities enables specific Pol-II Cas9 guide RNA expression. Nat. Commun. 10, 1490 (2019).
43. Zhu, S. et al. Guide RNAs with embedded barcodes boost CRISPR-pooled screens. Genome Biol. 20, 20 (2019).
44. Schlub, T. E., Smyth, R. P., Grimm, A. J., Mak, J. & Davenport, M. P. Accurately Measuring Recombination between Closely Related HIV-1 Genomes. PLoS Comput. Biol. 6, e1000766 (2010).
45. Feldman, D. et al. Pooled genetic perturbation screens with image-based phenotypes. Nat. Protoc. 1-37 (2022) doi: 10.1038/s41596-021-00653-8.
46. Sack, L. M., Davoli, T., Xu, Q., Li, M. Z. & Elledge, S. J. Sources of Error in Mammalian Genetic Screens. G3 GenesGenomesGenetics 6, 2781-2790 (2016).
47. Labitigan, R. L. D. et al. Mapping variation in the morphological landscape of human cells with optical pooled CRISPRi screening. 2022.12.27.522042 Preprint at doi.org/10.1101/2022.12.27.522042 (2022).
48. Replogle, J. M. et al. Maximizing CRISPRi efficacy and accessibility with dual-sgRNA libraries and optimal effectors. eLife 11, e81856 (2022).
49. Xie, S., Cooley, A., Armendariz, D., Zhou, P. & Hon, G. C. Frequent sgRNA-barcode recombination in single-cell perturbation assays. PLoS ONE 13, e0198635 (2018).
50. Yu, Hong, et al. "The nature of human immunodeficiency virus type 1 strand transfers." Journal of Biological Chemistry 273.43 (1998): 28384-28391.
51. Gasperini, Molly, et al. "CRISPR/Cas9-mediated scanning for regulatory elements required for HPRT1 expression via thousands of large, programmed genomic deletions." The American Journal of Human Genetics 101.2 (2017): 192-205.
52. Datlinger, Paul, et al. "Pooled CRISPR screening with single-cell transcriptome readout." Nature methods 14.3 (2017): 297-301.
***
[0181] Various modifications and variations of the described methods, pharmaceutical compositions, and kits of the disclosure will be apparent to those skilled in the art without departing from the scope and spirit of the disclosure. Although the disclosure has been described in connection with specific embodiments, it will be understood that it is capable of further modifications and that the disclosure as claimed should not be unduly limited to such specific embodiments. Indeed, various modifications of the described modes for carrying out the disclosure that are obvious to those skilled in the art are intended to be within the scope of the disclosure. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known customary practice within the art to which the disclosure pertains and as may be applied to the essential features herein.

Claims

CLAIMS What is claimed is:
1. A viral vector for multiplexed perturbation screens, comprising: a cassette operably connected to a pol II promoter, the cassette comprising at least two guide molecules; a cleavable sequence; a pol III promoter; and a 3' long terminal repeat (LTR), wherein the at least two guide molecules are separated by the cleavable sequence, the at least two guide molecules, the cleavable sequence, and the pol III promoter are located within the 3' LTR and encoded on the minus strand, and the at least two guide molecules are transcribed in a single transcript, wherein the pol II promoter is positioned upstream of the 3’ LTR.
2. The viral vector of claim 1, wherein the cleavable sequence is a tRNA leader sequence.
3. The viral vector of claim 1, wherein the cassette comprises one or more additional cleavable sequences.
4. The viral vector of claim 1, wherein the at least two guide molecules are present in an array cleavable by a CRISPR-Cas polypeptide.
5. The viral vector of claim 4, wherein the CRISPR-Cas polypeptide is Cas12.
6. The viral vector of claim 4, wherein the CRISPR-Cas polypeptide is Cas13.
7. The viral vector of claim 1, wherein the cassette further comprises an internal barcode (iBAR) and spacer within each of the at least two guide molecules, wherein each iBAR is unique to each of the at least two guide molecules.
8. The viral vector of claim 7, wherein the iBAR is within a loop joining a crRNA and a tracrRNA of each guide molecule.
9. The viral vector of claim 1, wherein the cassette further comprises an exogenous promoter.
10. The viral vector of claim 9, wherein the exogenous promoter of the cassette is upstream of the at least two guide molecules, the cleavable sequence, and the pol III promoter.
11. The viral vector of claim 9 or claim 10, wherein the orientation of the exogenous promoter is antisense relative to the Pol III promoter.
12. The viral vector of claim 10 or claim 11, wherein the exogenous promoter is a T7 promoter, a T3 promoter, or a SP6 promoter.
13. The viral vector of claim 1, wherein the viral vector comprises a 5’ LTR promoter.
14. The viral vector of claim 13, wherein the 5’ LTR promoter is a CMV promoter.
15. The viral vector of claim 2 or claim 3, wherein orthogonal tRNA sequences are used in the viral vector.
16. The viral vector of claim 1, further including orthogonal guide molecule scaffolds.
17. The viral vector of claim 1, wherein the at least two guide molecules target at least two or more different sequences of a gene.
18. The viral vector of claim 1, further encoding a CRISPR-Cas polypeptide.
19. The viral vector of claim 18, wherein the CRISPR-Cas polypeptide is a Cas9.
20. The viral vector of claim 18, wherein the CRISPR-Cas polypeptide is a Cas12.
21. The viral vector of claim 18, wherein the CRISPR-Cas polypeptide is a Cas13.
22. The viral vector of any one of the preceding claims, wherein the viral vector is a lentiviral vector.
23. A method of performing multiplexed pooled perturbation screening, comprising: a) introducing one or more perturbation vectors according to any one of claims 1 to 22 to a population of cells to obtain perturbed cells; and b) performing single cell RNA sequencing on the perturbed cells, whereby pol II transcripts comprising the at least two guide molecules encoded for in each vector are sequenced with cellular RNAs from the perturbed cells.
24. A method of performing multiplexed pooled optical perturbation screening, comprising: a) introducing one or more perturbation vectors according to any one of claims 1 to 22 to a population of cells to obtain perturbed cells; b) fixing and permeabilizing the perturbed cells; c) reverse transcribing mRNA transcribed from the pol II promoter encoded within the viral vector in the perturbed cells using primers specific for each of the at least two guide molecules, thereby generating cDNA comprising at least the spacer sequence and iBAR of each of the at least two guide molecules; d) fixing the cDNA in the perturbed cells; and e) in situ sequencing (ISS) the sequences comprising the spacer and/or iBAR of each of the at least two guide molecules.
25. A method for performing multiplexed pooled optical perturbation screening, comprising: a) introducing one or more viral vector(s) according to any one of claims 9-12 to a population of cells to obtain perturbed cells; b) fixing and permeabilizing the perturbed cells, transcribing mRNA of the viral vector(s) in the perturbed cells from the exogenous promoter of the cassette(s) of the viral vector(s), thereby generating barcoded RNA; c) reverse transcribing mRNA transcribed from the pol II promoter encoded within the viral vector in the perturbed cells using primers specific for each of the at least two guide molecules, thereby generating cDNA comprising at least the spacer sequence and iBAR of each of the at least two guide molecules; d) fixing the cDNA in the perturbed cells; and e) in situ sequencing (ISS) the sequences comprising the spacer and/or iBAR of each of the at least two guide molecules, thereby performing multiplexed pooled optical perturbation screening.
26. The method of claim 24 or claim 25, wherein the fixing in step (b) is performed using about 0.007% glutaraldehyde in about 4% paraformaldehyde.
27. The method of claim 24 or claim 25, wherein the primers are biotinylated and wherein the method further comprises a streptavidin incubation between the reverse transcription and fixing step.
28. The method of claim 24 or claim 25, wherein the in situ sequencing (ISS) comprises: a) contacting the perturbed cells with padlock probes that flank the spacer and/or iBAR of each of the at least two guide molecules; b) gap filling the padlock probes to capture each spacer and iBAR within the padlock probe; c) ligating the padlock probes into circular ssDNA templates; d) performing rolling circle amplification on the circular ssDNA templates to generate amplified cDNA; and e) sequencing the amplified cDNA using sequencing by synthesis to decode perturbations.
29. The method of claim 24 or claim 25, wherein the in situ sequencing step comprises in situ sequencing of a combination of the at least two spacers and at least two iBARs simultaneously.
30. The method of any one of claims 23-29, further comprising treating the population of cells with a lithium borohydride (LiBH4) solution.
31. The method of claim 30, wherein the LiBH4 treatment step occurs prior to in situ sequencing.
32. The method of claim 30 or claim 31, wherein the LiBH4 treatment step occurs subsequent to in situ sequencing.
33. The viral vector of claim 3, wherein the one or more additional cleavable sequences are tRNA leader sequences.
34. The viral vector of claim 3, wherein the one or more additional cleavable sequences are positioned upstream of the at least two guide molecules.
35. The viral vector of claim 3, wherein the cleavable sequence and one of the one or more additional cleavable sequences are positioned on either side of at least one of the at least two guide molecules.
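The decoding step recited in claims 24, 28 and 29 (reading the spacer and/or iBAR of each guide molecule in a cell and resolving them to a perturbation) can be illustrated with a minimal Python sketch. It is not the method of the disclosure: the library contents, the sequence lengths, the mismatch tolerance, the status labels, and the names `match_guide` and `decode_cell` are all assumptions made for illustration. The sketch assumes one in situ read (spacer prefix plus iBAR) per guide position and calls a cell "discordant" when the guides match different vectors, which could arise, for example, from recombination between vectors or from multiple integrations.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical library: vector ID -> ordered (spacer prefix, iBAR) pairs,
# one pair per guide molecule. All sequences are made up for illustration.
Library = Dict[str, List[Tuple[str, str]]]

LIBRARY: Library = {
    "vecA": [("ACGTAC", "GA"), ("TTGCAA", "CT")],
    "vecB": [("GGATCC", "TC"), ("CATGAT", "AG")],
}


def hamming(a: str, b: str) -> int:
    """Count mismatched positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def match_guide(read: Tuple[str, str], library: Library, position: int,
                max_mismatches: int = 1) -> Set[str]:
    """Return vector IDs whose guide at `position` matches the read
    (spacer + iBAR combined) within `max_mismatches` substitutions."""
    spacer, ibar = read
    hits: Set[str] = set()
    for vector_id, guides in library.items():
        if position >= len(guides):
            continue  # this vector has no guide at that position
        ref_spacer, ref_ibar = guides[position]
        if len(ref_spacer) != len(spacer) or len(ref_ibar) != len(ibar):
            continue  # lengths differ, cannot compare by substitution
        distance = hamming(spacer, ref_spacer) + hamming(ibar, ref_ibar)
        if distance <= max_mismatches:
            hits.add(vector_id)
    return hits


def decode_cell(reads: List[Tuple[str, str]], library: Library,
                max_mismatches: int = 1) -> Tuple[str, Optional[str]]:
    """Combine the per-guide matches of one cell into a single call.

    Returns (status, vector_id), where status is one of
    "assigned", "unassigned", "discordant" or "ambiguous".
    """
    per_guide = [match_guide(r, library, i, max_mismatches)
                 for i, r in enumerate(reads)]
    if not per_guide or any(not hits for hits in per_guide):
        return ("unassigned", None)   # at least one guide matched nothing
    common = set.intersection(*per_guide)
    if len(common) == 1:
        return ("assigned", next(iter(common)))
    if not common:
        return ("discordant", None)   # guides point to different vectors
    return ("ambiguous", None)        # several vectors fit every guide
```

For example, `decode_cell([("ACGTAC", "GA"), ("TTGCAA", "CT")], LIBRARY)` resolves to `vecA`, while a cell whose first guide matches `vecA` and whose second guide matches `vecB` is reported as discordant instead of being forced onto one vector.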

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US202463549322P 2024-02-02 2024-02-02
US63/549,322 2024-02-02
US202463659465P 2024-06-13 2024-06-13
US63/659,465 2024-06-13

Publications (1)

Publication Number Publication Date
WO2025166152A1 (en) 2025-08-07

Family

ID=94823996

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2025/014013 Pending WO2025166152A1 (en) 2024-02-02 2025-01-31 Multiplexed perturbation and decoding in pooled genetic screens

Country Status (1)

Country Link
WO (1) WO2025166152A1 (en)

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2014047561A1 (en) 2012-09-21 2014-03-27 The Broad Institute Inc. Compositions and methods for labeling of agents
WO2014093622A2 (en) 2012-12-12 2014-06-19 The Broad Institute, Inc. Delivery, engineering and optimization of systems, methods and compositions for sequence manipulation and therapeutic applications
WO2014210353A2 (en) 2013-06-27 2014-12-31 10X Technologies, Inc. Compositions and methods for sample processing
WO2016040476A1 (en) 2014-09-09 2016-03-17 The Broad Institute, Inc. A droplet-based method and apparatus for composite single-cell nucleic acid analysis
WO2016168584A1 (en) 2015-04-17 2016-10-20 President And Fellows Of Harvard College Barcoding systems and methods for gene sequencing and other applications
WO2017164936A1 (en) 2016-03-21 2017-09-28 The Broad Institute, Inc. Methods for determining spatial and temporal gene expression dynamics in single cells
WO2019094984A1 (en) 2017-11-13 2019-05-16 The Broad Institute, Inc. Methods for determining spatial and temporal gene expression dynamics during adult neurogenesis in single cells
WO2020077236A1 (en) 2018-10-12 2020-04-16 The Broad Institute, Inc. Method for extracting nuclei or whole cells from formalin-fixed paraffin-embedded tissues
US20200283843A1 (en) 2019-03-04 2020-09-10 The Broad Institute, Inc. Methods and compositions for massively parallel variant and small molecule phenotyping
WO2021208971A1 (en) * 2020-04-16 2021-10-21 The University Of Hong Kong A system for three-way combinatorial crispr screens for analysing target interactions and methods thereof
US11214797B2 (en) 2015-10-28 2022-01-04 The Broad Institute, Inc. Assays for massively combinatorial perturbation profiling and cellular circuit reconstruction

Non-Patent Citations (125)

* Cited by examiner, † Cited by third party
Title
"Current Protocols in Molecular Biology", 1987
"Molecular Biology and Biotechnology: a Comprehensive Desk Reference", 1995, VCH PUBLISHERS, INC.
A.R. GRUBER ET AL., CELL, vol. 106, no. 1, 2008, pages 23 - 24
ADAMSON ET AL.: "Comprehensive single cell transcriptional profiling of a multicellular organism by combinatorial indexing", BIORXIV, 2 February 2017 (2017-02-02)
ADAMSON, B. ET AL.: "A Multiplexed Single-Cell CRISPR Screening Platform Enables Systematic Dissection of the Unfolded Protein Response", CELL, vol. 167, 2016, pages 1867 - 1882
ADAMSON, B.NORMAN, T. M.JOST, M.WEISSMAN, J. S.: "Approaches to maximize sgRNA-barcode coupling", PERTURB-SEQ SCREENS, 2018, pages 298349
ALTAE-TRAN HKANNAN SDEMIRCIOGLU FE ET AL.: "The widespread IS200/IS605 transposon family encodes diverse programmable RNA-guided endonucleases", SCIENCE, vol. 374, no. 6563, 2021, pages 57 - 65, XP055901842, DOI: 10.1126/science.abj6856
ANZALONE, A. V.KOBLAN, L. W.LIU, D. R.: "Genome editing with CRISPR-Cas nucleases, base editors, transposases and prime editors", NAT. BIOTECHNOL., vol. 38, 2020, pages 824 - 844, XP037622140, DOI: 10.1038/s41587-020-0561-9
ASKARYAMJAD ET AL.: "In situ readout of DNA barcodes and single base edits facilitated by in vitro transcription", NATURE BIOTECHNOLOGY, vol. 38, no. 1, 2020, pages 66 - 75
ATSCHUL ET AL., J. MOL. BIOL., vol. 215, 1990, pages 403 - 410
BISWASS ET AL., RNA BIOL., vol. 10, 2013, pages 817 - 827
BOSHART ET AL., CELL, vol. 41, 1985, pages 521 - 530
BRAY MASINGH SHAN H ET AL.: "Cell Painting, a high-content image-based assay for morphological profiling using multiplexed fluorescent dyes", NAT PROTOC., vol. 11, no. 9, 2016, pages 1757 - 1774, XP055826149, DOI: 10.1038/nprot.2016.105
CAO ET AL.: "Comprehensive single-cell transcriptional profiling of a multicellular organism", SCIENCE, vol. 357, no. 6352, 2017, pages 661 - 667, XP055624798, DOI: 10.1126/science.aam8940
CHUAI, G.: "DeepCRISPR: optimized CRISPR guide RNA design by deep learning", GENOME BIOL., vol. 19, 2018, pages 80, XP055716006, DOI: 10.1186/s13059-018-1459-4
DATLINGER, PAUL ET AL.: "Pooled CRISPR screening with single-cell transcriptome readout", NATURE METHODS, vol. 14, no. 3, 2017, pages 297 - 301, XP055460183, DOI: 10.1038/nmeth.4177
DEDE, M.MCLAUGHLIN, M.KIM, E.HART, T.: "Multiplex enCas12a screens detect functional buffering among paralogs otherwise masked in monogenic Cas9 knockout screens", GENOME BIOL., vol. 21, 2020, pages 262
DEWEIRDT, P. C. ET AL.: "Optimization of AsCas12a for combinatorial genetic screens in human cells", NAT. BIOTECHNOL., vol. 39, 2021, pages 94 - 104, XP037333516, DOI: 10.1038/s41587-020-0600-6
DIXIT ET AL.: "Perturb-Seq: Dissecting Molecular Circuits with Scalable Single-Cell RNA Profiling of Pooled Genetic Screens", CELL, vol. 167, 2016, pages 1853 - 1866
DOENCH, J. G. ET AL.: "Optimized sgRNA design to maximize activity and minimize off-target effects of CRISPR-Cas9", NAT. BIOTECHNOL., vol. 34, 2016, pages 184 - 191, XP093235579, DOI: 10.1038/nbt.3437
DROKHLYANSKY ESMILLIE CSVAN WITTENBERGHE N ET AL.: "The Human and Mouse Enteric Nervous System at Single-Cell Resolution", CELL, vol. 182, no. 6, 2020, pages 1606 - 1622
DROKHLYANSKY ET AL.: "The enteric nervous system of the human and mouse colon at a single-cell resolution", BIORXIV 746743
DULL TZUFFEREY RKELLY M ET AL.: "A third-generation lentivirus vector with a conditional packaging system", J VIROL., vol. 72, no. 11, 1998, pages 8463 - 8471, XP055715204, DOI: 10.1128/JVI.72.11.8463-8471.1998
ESVELT ET AL., NAT. METHODS, vol. 10, 2013, pages 1116 - 1121
FANDREY, C. I. ET AL.: "Cell Type-Agnostic Optical Perturbation Screening Using Nuclear", SITU SEQUENCING (NIS-SEQ, 18 January 2024 (2024-01-18), pages 576210
FANDREY, CAROLINE I. ET AL.: "Cell Type-Agnostic Optical Perturbation Screening Using Nuclear In-Situ Sequencing (NIS-Seq", BIORXIV, January 2024 (2024-01-01)
FELDMAN ET AL.: "Lentiviral co-packaging mitigates the effects of intermolecular recombination and multiple integrations in pooled genetic screens", BIORXIV 262121
FELDMAN, D. ET AL.: "Optical Pooled Screens in Human Cells", CELL, vol. 179, 2019, pages 787 - 799
FELDMAN, D. ET AL.: "Pooled genetic perturbation screens with image-based phenotypes", NAT. PROTOC., 2022, pages 1 - 37
FELDMAN, D.: "Pooled genetic perturbation screens with image-based phenotypes", PROTOC., 2022, pages 1 - 37
FELDMAN, D.SINGH, A.GARRITY, A. J.BLAINEY, P. C., LENTIVIRAL CO-PACKAGING MITIGATES THE EFFECTS OF INTERMOLECULAR RECOMBINATION AND MULTIPLE INTEGRATIONS IN POOLED GENETIC SCREENS, 2018
FRANGIEH CJMELMS JCTHAKORE PI ET AL.: "Multimodal pooled Perturb-CITE-seq screens in patient models define mechanisms of cancer immune evasion", NAT GENET., vol. 53, no. 3, 2021, pages 332 - 341, XP037414653, DOI: 10.1038/s41588-021-00779-1
GAO ET AL.: "Engineered Cpfl Enzymes with Altered PAM Specificities", BIORXIV 091611, 4 December 2016 (2016-12-04)
GASPERINI ET AL., AM J HUM GENET., vol. 101, 2017, pages 192 - 205
GASPERINI, MOLLY ET AL.: "CRISPR/Cas9-mediated scanning for regulatory elements required for HPRT1 expression via thousands of large, programmed genomic deletions", THE AMERICAN, vol. 101, no. 2, pages 192 - 205, XP085148255, DOI: 10.1016/j.ajhg.2017.06.010
GIER RODRIGO A. ET AL: "High-performance CRISPR-Cas12a genome editing for combinatorial genetic screening", NATURE COMMUNICATIONS, vol. 11, no. 1, 13 July 2020 (2020-07-13), XP055792498, Retrieved from the Internet <URL:http://www.nature.com/articles/s41467-020-17209-1> DOI: 10.1038/s41467-020-17209-1 *
GIERAHN ET AL.: "Seq-Well: portable, low-cost RNA sequencing of single cells at high throughput", NATURE METHODS, vol. 14, 2017, pages 395 - 398
GLEDITZSCH ET AL., RNA BIOLOGY, vol. 16, no. 4, 2019, pages 504 - 517
GRISSA ET AL., NUCLEIC ACID RES., vol. 35, 2007, pages 52 - 57
H, X. ET AL.: "Sequence determinants of improved CRISPR sgRNA design", GENOME RES., vol. 25, 2015, XP055865738, DOI: 10.1101/gr.191452.115
HABIB ET AL.: "Div-Seq: Single-nucleus RNA-Seq reveals dynamics of rare adult newborn neurons", SCIENCE, vol. 353, 2016, pages 925 - 928, XP055608529, DOI: 10.1126/science.aad7038
HABIB ET AL.: "Massively parallel single-nucleus RNA-seq with DroNc-seq", NAT METHODS, vol. 14, no. 10, October 2017 (2017-10-01), pages 955 - 958, XP055651390, DOI: 10.1038/nmeth.4407
HASHIMSHONY, T.WAGNER, F.SHER, N.YANAI, I.: "CEL-Seq: Single-Cell RNA-Seq by Multiplexed Linear Amplification", CELL REPORTS, CELL REPORTS, vol. 2, 2012, pages 666 - 673, XP055111758, DOI: 10.1016/j.celrep.2012.08.003
HEGDE, M.STRAND, C.HANNA, R. E.DOENCH, J. G.: "Uncoupling of sgRNAs from their associated barcodes during PCR amplification of combinatorial CRISPR screens", PLOS ONE, vol. 13, 2018, pages e0197547
HILL, A. J. ET AL.: "On the design of CRISPR-based single cell molecular screens", NAT. METHODS, vol. 15, 2018, pages 271 - 274, XP055886157, DOI: 10.1038/nmeth.4604
HUGHES ET AL.: "Highly Efficient, Massively-Parallel Single-Cell RNA-Seq Reveals Cellular States and Molecular Features of Human Skin Pathology", BIORXIV 689273
ISLAM, S. ET AL.: "Characterization of the single-cell transcriptional landscape by highly multiplex RNA-seq", GENOME RESEARCH, 2011
JAITIN DAWEINER AYOFE I ET AL.: "Dissecting Immune Circuits by Linking CRISPR-Pooled Screens with Single-Cell RNA-Seq", CELL, vol. 167, no. 7, 2016, pages 1883 - 1896, XP029850714, DOI: 10.1016/j.cell.2016.11.039
JAMUR ET AL., METHOD MOL. BIOL., vol. 588, 2010, pages 63 - 66
JIANG CHAOQIAN ET AL: "Multiplexed Gene Engineering Based on dCas9 and gRNA-tRNA Array Encoded on Single Transcript", INTERNATIONAL JOURNAL OF MOLECULAR SCIENCES, vol. 24, no. 10, 10 May 2023 (2023-05-10), Basel, CH, pages 8535, XP093130116, ISSN: 1422-0067, Retrieved from the Internet <URL:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10218229/pdf/ijms-24-08535.pdf> DOI: 10.3390/ijms24108535 *
KALISKY, T.BLAINEY, P.QUAKE, S. R.: "Genomic Analysis at the Single-Cell Level", ANNUAL REVIEW OF GENETICS, vol. 45, 2011, pages 431 - 445
KALISKY, T.QUAKE, S. R.: "Single-cell genomics", NATURE METHODS, vol. 8, 2011, pages 311 - 314
KE RMIGNARDI MPACUREANU A ET AL.: "In situ sequencing for RNA analysis in preserved tissue and cells", NAT METHODS, vol. 10, no. 9, 2013, pages 857 - 860
KEGEL, B. D.QUINN, N.THOMPSON, N. A.ADAMS, D. J.RYAN, C. J.: "Comprehensive prediction of robust synthetic lethality between paralog pairs in cancer cell lines", CELL SYST., vol. 12, 2021, pages 1144 - 1159
KIM, H. K. ET AL.: "SpCas9 activity prediction by DeepSpCas9, a deep learning-based model with high generalization performance", SCI. ADV., vol. 5, 2019, pages 9249
KLEIN ET AL.: "Droplet Barcoding for Single-Cell Transcriptomics Applied to Embryonic Stem Cells", CELL, vol. 161, 2015, pages 1187 - 1201, XP055731640, DOI: 10.1016/j.cell.2015.04.044
KLEINSTIVER BP ET AL.: "Engineered CRISPR-Cas9 nucleases with altered PAM specificities", NATURE, vol. 523, no. 7561, 23 July 2015 (2015-07-23), pages 481 - 5, XP055293257, DOI: 10.1038/nature14592
KLEINSTIVER ET AL., NATURE, vol. 523, 2015, pages 481 - 485
KNAPP, D. J. H. F.: " Decoupling tRNA promoter and processing activities enables specific Pol-II Cas9 guide RNA expression.", NAT. COMMUN., vol. 10, 2019, pages 1490
KOMOR, A. C.BADRAN, A. H.LIU, D. R.: "CRISPR-Based Technologies for the Manipulation of Eukaryotic Genomes", CELL, vol. 168, 2017, pages 20 - 36, XP002781814, DOI: 10.1016/j.cell.2016.10.044
KUDO, T. ET AL.: "Highly multiplexed, image-based pooled screens in primary cells and tissues", PERTURBVIEW, 26 December 2023 (2023-12-26), pages 573143
KUDO, TAKAMASA ET AL.: "Highly multiplexed, image-based pooled screens in primary cells and tissues with Perturb View", BIORXIV, December 2023 (2023-12-01)
LABER ET AL.: "Discovering cellular programs of intrinsic and extrinsic drivers of metabolic traits using LipocyteProfiler", BIORXIV 2021.07.17.452050
LABITIGAN, R. L. D. ET AL., MAPPING VARIATION IN THE MORPHOLOGICAL LANDSCAPE OF HUMAN CELLS WITH OPTICAL POOLED CRISPRI SCREENING, 2022
LEE J.H. ET AL.: "Highly multiplexed subcellular RNA sequencing in situ", SCIENCE, vol. 343, no. 6177, 2014, pages 1360 - 1363, XP055305772, DOI: 10.1126/science.1250212
LEENAY ET AL., MOL. CELL, vol. 16, 2016, pages 253
LI, R. ET AL.: "Comparative optimization of combinatorial CRISPR screens", NAT. COMMUN., vol. 13, 2022, pages 2469
LIU, J. ET AL.: "Pooled library screening with multiplexed Cpfl library", NAT. COMMUN., vol. 10, 2019, pages 3144
MACOSKO ET AL.: "Highly Parallel Genome-wide Expression Profiling of Individual Cells Using Nanoliter Droplets", CELL, vol. 161, 2015, pages 1202 - 1214, XP029129143, DOI: 10.1016/j.cell.2015.05.002
MARCH: "Advanced Organic Chemistry Reactions, Mechanisms and Structure", 1992, JOHN WILEY & SONS
MARRAFFINI ET AL., NATURE, vol. 463, 2010, pages 568 - 571
MOFFITT JRHAO JWANG GCHEN KHBABCOCK HPZHUANG X: "High-throughput single-cell gene-expression profiling with multiplexed error-robust fluorescence in situ hybridization", PROC NATL ACAD SCI U S A., vol. 113, no. 39, 2016, pages 11046 - 11051, XP055438958, DOI: 10.1073/pnas.1612826113
MOJICA ET AL., MICROBIOL., vol. 155, no. 3, 2009, pages 733 - 740
MOL. CELL. BIOL., vol. 8, no. 1, 1988, pages 466 - 472
MOON, S. B.KIM, D. Y.KO, J.-H.KIM, Y.-S.: "Recent advances in the CRISPR genome editing tool set", EXP. MOL. MED., vol. 51, 2019, pages 1 - 11, XP055718089, DOI: 10.1038/s12276-019-0339-7
NAJM, F. J. ET AL.: "Orthologous CRISPR-Cas9 enzymes for combinatorial genetic screens", NAT., vol. 36, 2018, pages 179 - 189, XP055545881, DOI: 10.1038/nbt.4048
PA CARRGM CHURCH, NATURE BIOTECHNOLOGY, vol. 27, no. 12, 2009, pages 1151 - 62
PARRISH, P. C. R. ET AL.: "Discovery of synthetic lethal and tumor suppressor paralog pairs in the human genome", CELL REP., vol. 36, 2021, pages 109597
PATTANAYAK ET AL., NAT. BIOTECHNOL., vol. 31, 2013, pages 839 - 843
PETIWALA, S. ET AL.: "Optimization of Genomewide CRISPR Screens Using AsCas12a and Multi-Guide Arrays", CRISPR J., vol. 6, 2023, pages 75 - 82
PICELLI, S. ET AL.: "Full-length RNA-seq from single cells using Smart-seq2", NATURE PROTOCOLS, vol. 9, 2014, pages 171 - 181, XP002742134, DOI: 10.1038/nprot.2014.006
PORT, F.BULLOCK, S. L.: "Augmenting CRISPR applications in Drosophila with tRNA-flanked sgRNAs", NAT. METHODS, vol. 13, 2016, pages 852 - 854, XP055344550, DOI: 10.1038/nmeth.3972
PROC. NATL. ACAD. SCI. USA., vol. 78, no. 3, 1981, pages 1527 - 31
QI ZBARRETT TPARIKH ASTIROSH IPURAM SV: "Single-cell sequencing and its applications in head and neck cancer", ORAL ONCOL., vol. 99, 2019, pages 104441, XP085946260, DOI: 10.1016/j.oraloncology.2019.104441
QI, W. ET AL.: "High-efficiency CRISPR/Cas9 multiplex gene editing using the glycine tRNA-processing system-based strategy in maize", BMC BIOTECHNOL., vol. 16, 2016, pages 58, XP055515108, DOI: 10.1186/s12896-016-0289-2
QUI ET AL., BIOTECHNIQUES, vol. 36, no. 4, 2004, pages 702 - 707
RAMSKOLD, D. ET AL.: "Full-length mRNA-Seq from single-cell levels of RNA and individual circulating tumor cells", NATURE BIOTECHNOLOGY, vol. 30, 2012, pages 777 - 782, XP037004921, DOI: 10.1038/nbt.2282
REPLOGLE ET AL.: "Combinatorial single-cell CRISPR screens by direct guide RNA capture and targeted sequencing", NAT BIOTECHNOL, 2020
REPLOGLE, J. M. ET AL.: "Combinatorial single-cell CRISPR screens by direct guide RNA capture and targeted sequencing", NAT. BIOTECHNOL., vol. 38, 2020, pages 954 - 961, XP037211717, DOI: 10.1038/s41587-020-0470-y
REPLOGLE, J. M. ET AL.: "Mapping information-rich genotype-phenotype landscapes with genome-scale Perturb-seq", CELL, vol. 185, 2022, pages 2559 - 2575
REPLOGLE, J. M. ET AL.: "Maximizing CRISPRi efficacy and accessibility with dual-sgRNA libraries and optimal effectors", ELIFE, vol. 11, 2022, pages e81856, XP093212069, DOI: 10.7554/eLife.81856
ROSENBERG ET AL.: "Scaling single cell transcriptomics through split pool barcoding", BIORXIV, 2 February 2017 (2017-02-02)
ROSENBERG ET AL.: "Single-cell profiling of the developing mouse brain and spinal cord with split-pool barcoding", SCIENCE, 15 March 2018 (2018-03-15)
RYAN, C. J.MEHTA, I.KEBABCI, N.ADAMS, D. J.: "Targeting synthetic lethal paralogs in cancer", TRENDS CANCER, vol. 9, 2023, pages 397 - 409
SACK, L. M.DAVOLI, T.XU, Q.LI, M. Z.ELLEDGE, S. J.: "Sources of Error in Mammalian Genetic Screens", G3 GENESGENOMESGENETICS, vol. 6, 2016, pages 2781 - 2790
SANJANA, N. E., SHALEM, O. & ZHANG, F.: "Improved vectors and genome-wide libraries for CRISPR screening", NAT. METHODS, vol. 11, 2014, pages 783 - 784, XP093235581, DOI: 10.1038/nmeth.3047
SANSON, K. R. ET AL.: "Optimized libraries for CRISPR-Cas9 genetic screens with multiple modalities", NAT. COMMUN., vol. 9, 2018, pages 5416, XP093034521, DOI: 10.1038/s41467-018-07901-8
SCHLUB, T. E.SMYTH, R. P.GRIMM, A. J.MAK, J.DAVENPORT, M. P.: "Accurately Measuring Recombination between Closely Related HIV-1 Genomes", PLOS COMPUT. BIOL., vol. 6, 2010, pages e1000766
SCHRAIVOGEL DGSCHWIND ARMILBANK JH ET AL.: "Targeted Perturb-seq enables genome-scale genetic screens in single cells", NAT METHODS, vol. 17, no. 6, 2020, pages 629 - 635, XP037177159, DOI: 10.1038/s41592-020-0837-5
SINGLETON ET AL.: "Dictionary of Microbiology and Molecular Biology", 1994, BLACKWELL SCIENCE LTD.
SWIECH ET AL.: "In vivo interrogation of gene function in the mammalian brain using CRISPR-Cas9", NATURE BIOTECHNOLOGY, vol. 33, 2014, pages 102 - 106, XP055176807, DOI: 10.1038/nbt.3055
T. K. MOON: "Error Correction Coding: Mathematical Methods and Algorithms", 2005, WILEY
TANG, F. ET AL.: "mRNA-Seq whole-transcriptome analysis of a single cell", NATURE METHODS, vol. 6, 2009, pages 377 - 382, XP055037482, DOI: 10.1038/nmeth.1315
TANG, F. ET AL.: "RNA-Seq analysis to capture the transcriptome landscape of a single cell", NATURE PROTOCOLS, vol. 5, 2010, pages 516 - 535, XP009162232, DOI: 10.1038/nprot.2009.236
THOMPSON, N. A. ET AL.: "Combinatorial CRISPR screen identifies fitness effects of gene paralogues", NAT. COMMUN., vol. 12, 2021, pages 1302
URBINATI, F. ET AL.: "Mechanism of Reduction in Titers From Lentivirus Vectors Carrying Large Inserts in the 3'LTR", MOL. THER., vol. 17, 2009, pages 1527 - 1536
VITAK ET AL.: "Sequencing thousands of single-cell genomes with combinatorial indexing", NATURE METHODS, vol. 14, no. 3, 2017, pages 302 - 308
WONG, N.LIU, W.WANG, X.: "WU-CRISPR: characteristics of functional guide RNAs for the CRISPR/Cas9 system", GENOME BIOL., vol. 16, 2015, pages 218
XIE, K.MINKENBERG, B.YANG, Y.: "Boosting CRISPR/Cas9 multiplex editing capability with the endogenous tRNA-processing system", PROC. NATL. ACAD. SCI., vol. 112, 2015, pages 3570 - 3575, XP055196411, DOI: 10.1073/pnas.1420294112
XIE, S.COOLEY, A.ARMENDARIZ, D.ZHOU, P.HON, G. C.: "Frequent sgRNA-barcode recombination in single-cell perturbation assays", PLOS ONE, vol. 13, 2018, pages e0198635
XIE, S.DUAN, J.LI, B.ZHOU, P.HON, G. C.: "Multiplexed Engineering and Analysis of Combinatorial Enhancer Activity in Single Cells", MOL. CELL, vol. 66, 2017, pages 285 - 299
XU, L., ZHAO, L., GAO, Y., XU, J. & HAN, R.: "Empower multiplex cell and tissue-specific CRISPR-mediated gene manipulation with self-cleaving ribozymes and tRNA", RES., vol. 45, 2017, pages e28 - e28, XP055515116, DOI: 10.1093/nar/gkw1048
YAO, D.: "Scalable genetic screening for regulatory circuits using compressed Perturb-seq. ", NAT. BIOTECHNOL., 2023, pages 1 - 14
YU ET AL., CROPSEQ-GUIDE-PURO PRESENTS NO OBVIOUS ADVANTAGE FOR LIBRARY DESIGNS THAT INCORPORATE TANDEM CASSETTES, 1998
YU, HONG ET AL.: "The nature of human immunodeficiency virus type 1 strand transfers", JOURNAL OF BIOLOGICALCHEMISTRY, vol. 273, no. 43, 1998, pages 28384 - 28391
YUE LLIU FHU J ET AL.: "A guidebook of spatial transcriptomic technologies, data resources and analysis approaches", COMPUT STRUCT BIOTECHNOL J., vol. 21, 2023, pages 940 - 955
ZETSCHE, B.: "Cpfl Is a single RNA-guided endonuclease of a class 2 CRISPR-Cas system", CELL, vol. 163, 2015, pages 759 - 771
ZETSCHE, B.: "Multiplex gene editing by CRISPR-Cpfl using a single crRNA array", BIOTECHNOL., vol. 35, 2017, pages 31 - 34, XP055512019, DOI: 10.1038/nbt.3737
ZHANG, Y. ET AL.: "A gRNA-tRNA array for CRISPR-Cas9 based rapid multiplexed genome editing in Saccharomyces cerevisiae", NAT. COMMUN., vol. 10, 2019, pages 1053, XP093194610, DOI: 10.1038/s41467-019-09005-3
ZHENG ET AL.: "Haplotyping germline and cancer genomes with high-throughput linked-read sequencing", NATURE BIOTECHNOLOGY, vol. 34, 2016, pages 303 - 311, XP055486933, DOI: 10.1038/nbt.3432
ZHENG ET AL.: "Massively parallel digital transcriptional profiling of single cells", NAT. COMMUN., vol. 8, 2017, pages 14049
ZHU, S. ET AL.: "Guide RNAs with embedded barcodes boost CRISPR-pooled screens", GENOME BIOL., vol. 20, 2019, pages 20
ZILIONIS ET AL.: "Single-cell barcoding and sequencing using droplet microfluidics", NAT PROTOC., vol. 12, no. 1, 2017, pages 44 - 73, XP055532179, DOI: 10.1038/nprot.2016.154
ZUFFEREY R, DULL T, MANDEL RJ ET AL.: "Self-inactivating lentivirus vector for safe and efficient in vivo gene delivery", J VIROL., vol. 72, no. 12, 1998, pages 9873 - 9880
ZUKER, STIEGLER, NUCLEIC ACIDS RES., vol. 9, 1981, pages 133 - 148

Similar Documents

Publication Publication Date Title
JP7506405B2 (en) Lentiviral-Based Vectors for Eukaryotic Gene Editing and Related Systems and Methods
EP2898090B1 (en) Method and kit for preparing a target rna depleted sample
KR102382772B1 (en) High-Specificity Genome Editing Using Chemically Modified Guide RNAs
EP3234192B1 (en) Unbiased identification of double-strand breaks and genomic rearrangement by genome-wide insert capture sequencing
US11396676B2 (en) Sequencing and analysis of exosome associated nucleic acids
JP2025060959A (en) Methods and Reagents for Concentrating Nucleic Acid Material for Sequencing Applications and Other Nucleic Acid Material Interrogation - Patent application
US6080541A (en) Method for producing tagged genes, transcripts, and proteins
WO2019094984A1 (en) Methods for determining spatial and temporal gene expression dynamics during adult neurogenesis in single cells
WO2019222284A1 (en) In situ cell screening methods and systems
JP2018532419A (en) CRISPR-Cas sgRNA library
CN107488655B (en) Removal method of 5' and 3' adapter ligation by-products in sequencing library construction
CN107109698B (en) RNA STITCH sequencing: an assay for direct mapping of RNA:RNA interactions in cells
Walton et al. CROPseq-multi: a versatile solution for multiplexed perturbation and decoding in pooled CRISPR screens
CN104531874B (en) EGFR gene mutation detection method and kit
US20220017895A1 (en) Gramc: genome-scale reporter assay method for cis-regulatory modules
CN109957568B (en) gRNA for targeting HBB RNA, C2C 2-based HBB mutation detection method and detection kit
WO2025166152A1 (en) Multiplexed perturbation and decoding in pooled genetic screens
KR20220106153A (en) Compositions, Sets, and Methods Related to Targeted Assays
WO2024119461A1 (en) Compositions and methods for detecting target cleavage sites of crispr/cas nucleases and dna translocation
CN120230749A (en) TnpB-omega RNA gene editing system and application
WO2024089629A1 (en) Cas12 protein, crispr-cas system and uses thereof
WO2022187278A1 (en) Nucleic acid detection and analysis systems
WO2022197727A1 (en) Generation of novel crispr genome editing agents using combinatorial chemistry
US20240150830A1 (en) Phased genome scale epigenetic maps and methods for generating maps
Lee et al. Arabidopsis LTR retrotransposons and their regulation by epigenetically activated small RNA

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 25708960

Country of ref document: EP

Kind code of ref document: A1