WO2024168092A2 - Methods and kits for labeling cellular molecules for multiplex analysis
- Publication number
- WO2024168092A2 (application PCT/US2024/014893)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- cells
- target
- sequence
- primers
- cdna
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6806—Preparing nucleic acids for analysis, e.g. for polymerase chain reaction [PCR] assay
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/10—Processes for the isolation, preparation or purification of DNA or RNA
- C12N15/1034—Isolating an individual clone by screening libraries
- C12N15/1065—Preparation or screening of tagged libraries, e.g. tagged microorganisms by STM-mutagenesis, tagged polynucleotides, gene tags
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/10—Processes for the isolation, preparation or purification of DNA or RNA
- C12N15/1096—Processes for the isolation, preparation or purification of DNA or RNA cDNA Synthesis; Subtracted cDNA library construction, e.g. RT, RT-PCR
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N5/00—Undifferentiated human, animal or plant cells, e.g. cell lines; Tissues; Cultivation or maintenance thereof; Culture media therefor
- C12N5/06—Animal cells or tissues; Human cells or tissues
- C12N5/0602—Vertebrate cells
- C12N5/0634—Cells from the blood or the immune system
- C12N5/0636—T lymphocytes
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2521/00—Reaction characterised by the enzymatic activity
- C12Q2521/10—Nucleotidyl transfering
- C12Q2521/107—RNA dependent DNA polymerase,(i.e. reverse transcriptase)
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2527/00—Reactions demanding special reaction conditions
- C12Q2527/156—Permeability
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2535/00—Reactions characterised by the assay type for determining the identity of a nucleotide base or a sequence of oligonucleotides
- C12Q2535/122—Massive parallel sequencing
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2563/00—Nucleic acid detection characterized by the use of physical, structural and functional properties
- C12Q2563/179—Nucleic acid detection characterized by the use of physical, structural and functional properties the label being a nucleic acid
Definitions
- the present disclosure relates generally to methods of uniquely labeling or barcoding molecules within or originating from a nucleus or plurality of nuclei, a cell or plurality of cells, or one or more tissues, organs, or organisms.
- the present disclosure also relates to kits for uniquely labeling molecules within or originating from a nucleus or plurality of nuclei, a cell or plurality of cells, or one or more tissues, organs, or organisms.
- the methods and kits may relate to the labeling of RNAs and/or cDNAs within cells for the preparation of sequencing libraries for multiplex analysis.
- NGS Next Generation Sequencing
- RNA transcripts are generally purified from lysed cells (i.e., cells that have been broken apart), followed by conversion of the RNA transcripts into complementary DNA (cDNA) using reverse transcription.
- cDNA complementary DNA
- the cDNA sequences can then be sequenced using NGS. In such a procedure, all of the cDNA sequences are mixed together before sequencing, such that RNA expression is measured for a whole sample and individual sequences cannot be linked back to an individual cell.
- Methods for uniquely labeling or barcoding transcripts from individual cells can involve the manual separation of individual cells into separate reaction vessels and can require specialized equipment.
- An alternative approach to sequencing individual transcripts in cells is to use microscopy to identify individual fluorescent bases.
- this technique can be difficult to implement and is limited to sequencing a low number of cells.
- Single-cell sequencing can allow the identification of transcripts from individual cells.
- a limitation of single-cell RNA sequencing is the inability to detect every expressed gene in a given cell without exhaustive sequencing, which can be cost-prohibitive due to the sequencing depth required per cell. This is problematic for applications such as single cell clustered regularly interspaced short palindromic repeats (CRISPR) screens, where it is essential to detect the guide RNA (gRNA) in every cell in order to associate it with a transcriptome-wide perturbation effect.
- CRISPR clustered regularly interspaced short palindromic repeats
- TCRs T cell receptors
- T cell activation during disease pathogenesis and progression can assist, e.g., in the development of next-generation therapeutics with more favorable and sustainable outcomes.
- FIGs. 1A-1B provide overviews of two embodiments of the present method.
- cells or nuclei are fixed and permeabilized
- cDNA is generated within the cells or nuclei by reverse transcription using well-specific barcoded primers
- one or more additional barcodes are appended to the cDNA within the cells or nuclei by a split-pool labeling process
- the cells or nuclei are lysed
- the released barcoded cDNA is isolated and a second cDNA strand is generated by template switching
- the cDNA is amplified in a multiplex “preamplification” reaction using both non-specific primers (whole transcriptome, or WT primers) and target-specific primers
- the amplified cDNA molecules are used to prepare, in parallel, a Whole Transcriptome sequencing library and a target sequencing library (such as a CRISPR library or a TCR library).
- FIG. 1A illustrates an exemplary method for multiplex labeling of nucleic acids in cells in the context of a CRISPR screen.
- FIG. 1B illustrates an exemplary method for multiplex labeling of nucleic acids in cells in the context of TCR profiling of T cells.
- FIGS. 2A-2C provide an overview of the cDNA products obtained at different steps in one embodiment of the present methods, e.g., using a pair of nested “focal primers” to enrich for one or more target sequences when preparing parallel WT and target-specific sequencing libraries.
- FIG. 2A shows the products of the combinatorial barcoding of cDNA in cells or nuclei following template switching. Both a generic cDNA molecule representative of the whole transcriptome, and a target cDNA molecule of interest are shown.
- FIG. 2B shows primer binding sites used during multiplex cDNA amplification with both non-specific whole transcriptome primers and with spiked-in target primers (i.e., the “preamplification” step using WT and target “preamplification primers” as described elsewhere herein).
- the whole transcriptome is amplified by PCR using a pair of non-specific WT primers binding to the R2 sequence (“R2 Primer”) and the TS primer binding sequence (“PCR Primer”), and the target sequences are enriched by PCR using a pair of primers including one target-specific primer (“Focal Primer 1”) and one non-specific primer (“R2 Primer”).
- FIG. 2C shows the two types of cDNA products resulting from the multiplex amplification (i.e., preamplification) step shown in FIG. 1B: the whole transcriptome (WT) cDNA products, and the target cDNA products (“Gene-specific cDNA from first enrichment”).
- the combined sample containing the two products can then be used as input for the subsequent separate preparation of whole transcriptome (WT) sequencing libraries and target (e.g., “focal PCR”) sequencing libraries.
- FIGS. 3A-3D provide an overview of in situ cell barcoding steps (i.e., combinatorial barcoding, or split-pool labeling) performed in various embodiments of the present methods.
- FIG. 3A Round 1 Barcoding: Fixed and permeabilized cells are loaded into multiple wells (e.g., 48 wells) of a Round 1 plate. RNA is reverse transcribed to generate cDNA using reverse transcription primers comprising a well-specific barcode (“BC1”) and, e.g., a poly(dT) sequence or a random sequence.
- FIG. 3B Round 2 Barcoding: The cells containing the cDNA are pooled and loaded into a Round 2 Plate.
- FIG. 3C Round 3 Barcoding: The cells are pooled and loaded into a Round 3 Plate. A third barcode is ligated to the cDNA (indicated in red) via another adapter, which also contains an Illumina R2 sequence, and biotin.
- FIG. 3D Lysis and Sublibrary Generation: Cells are split into multiple sublibraries (or “samples”) (e.g., 8 sublibraries or samples) and lysed.
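The labeling capacity of the split-pool scheme outlined in FIGS. 3A-3D can be estimated with a short calculation. The following Python sketch is illustrative only: the 48-well Round 1 plate and the 8 sublibraries are taken from the examples above, whereas the 96-well size assumed for Rounds 2 and 3, the cell number, and the function names are hypothetical. It also assumes that every cell receives a barcode combination uniformly at random.

```python
from math import prod


def barcode_space(wells_per_round, n_sublibraries=1):
    """Number of distinct barcode combinations across all rounds.

    Splitting into sublibraries multiplies the space because each
    sublibrary later receives its own index (fourth barcode).
    """
    return prod(wells_per_round) * n_sublibraries


def expected_collision_fraction(n_cells, n_combinations):
    """Expected fraction of cells that share their barcode combination
    with at least one other cell, under uniform random assignment."""
    if n_cells < 1:
        return 0.0
    return 1.0 - (1.0 - 1.0 / n_combinations) ** (n_cells - 1)


# 48 wells in Round 1 (from the text); 96 wells in Rounds 2 and 3 are assumptions.
wells = [48, 96, 96]
combos = barcode_space(wells, n_sublibraries=8)
print("barcode combinations:", combos)
print("expected collision fraction for 100,000 cells:",
      expected_collision_fraction(100_000, combos))
```

Under these assumptions, the number of combinations grows multiplicatively with each round, which is why a few rounds of pooling and splitting can label many cells without physically isolating them.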
- FIGS. 4A-4D provide an overview of cDNA capture and amplification steps performed in various embodiments of the present methods.
- FIG. 4A cDNA Capture: Following cell lysis, biotinylated cDNA is captured (isolated) in each sublibrary via streptavidin beads.
- FIG. 4B cDNA Template Switch: A template switch (TS) reaction adds an adapter to the 3’ end of the cDNA.
- FIG. 4C WT cDNA Amplification: the whole transcriptome (WT) cDNA is amplified by PCR using primers binding to a Template Switch (TS) sequence and to the Illumina Truseq R2 sequence.
- FIG. 4D cDNA amplification with spiked-in target primers.
- target specific primers are added, or spiked-in, during the first round of cDNA amplification (i.e., the amplification step shown in FIG. 4C), so as to enrich the presence of the target cDNAs among the whole transcriptome cDNA molecules.
- FIG. 4C shows a target (hU6-sgRNA-polyA) transcript enriched due to the presence of a spiked-in human U6 specific primer, alongside two non-target cDNAs from the whole transcriptome.
- the amplified cDNA mixture comprising the WT and target cDNAs can be split and used to separately prepare the Whole Transcriptome and target (e.g., CRISPR) sequencing libraries.
- FIGS. 5A-5C show steps in the preparation of WT sequencing libraries starting from amplified WT cDNA molecules as shown, e.g., in FIG. 4C.
- FIG. 5A the cDNA molecules are fragmented, and the ends are repaired and then A-tailed.
- FIG. 5B Adapter Ligation: An Illumina Truseq R1 Adapter is ligated to the 5’ end of the DNA.
- FIG. 5C Round 4 Barcoding: The sequencing library is amplified, adding P5/P7 Adapters and a fourth barcode via the UDI - WT Plate.
- FIGS. 6A-6C show steps in the preparation of target (e.g., CRISPR) sequencing libraries starting from enriched target cDNA molecules amplified with spiked-in primers during the “preamplification” PCR step (i.e., the target cDNA molecules shown in FIG. 4D).
- FIGS. 6A-6B show the products of two additional amplification steps performed after the initial preamplification round of PCR.
- FIG. 6A CRISPR PCR: In the first additional round of amplification, a PCR reaction using a second hU6 specific primer further amplifies and enriches the target (sgRNA) cDNA molecules. This reaction also adds an adaptor (e.g., with an Illumina R1 sequence).
- FIG. 6B CRISPR Index PCR: In the second additional round of amplification, the CRISPR Sequencing Library is amplified, adding P5/P7 adaptors and a fourth barcode via the Illumina indexes in the UDI Plate - EC.
- FIG. 6C another representation of the products of the two additional rounds of amplification.
- FIGS. 7A-7C show steps in the preparation of target (e.g., TCR) sequencing libraries.
- FIGS. 7A-7B show the products of two additional amplification steps performed after the initial preamplification round of PCR.
- FIG. 7A TCR Amplification 1: In the first additional round of amplification, a PCR reaction using a TCR specific primer further amplifies and enriches the target (TCR) cDNA molecules, e.g., cDNA molecules from the whole transcriptome that contain V(D)J segments in the CDR3 repertoire of the T cell. This reaction also adds an adaptor (e.g., an adapter with an Illumina Nextera R1 sequence).
- FIG. 7B TCR Amplification 2: In the second additional round of amplification, the TCR Sequencing Library is amplified, adding P5/P7 adaptors and a fourth barcode via the Illumina indexes in the UDI Plate - EC.
- FIG. 7C another representation of the products of the two additional rounds of amplification.
- FIGS. 8A-8B show validation of enrichment ability of spiked-in target primers for Focal Barcoding protocol.
- FIG. 8A Percentage of cells with gene detected in HEK293 or NIH/3T3 cells, in Whole Transcriptome vs. Focal libraries for GPX4, PRDX2, CHCHD2, KDELR1, Psmd2, GAPDH, RPL5, and Actb genes.
- Whole Transcriptome libraries were sequenced at 10,000 reads/cell.
- Focal libraries were sequenced at 250 reads/cell.
- FIG. 8B Number of unique transcripts captured in HEK293 or NIH/3T3 cells, in Whole Transcriptome vs. Focal libraries for GPX4, PRDX2, CHCHD2, KDELR1, Psmd2, GAPDH, RPL5, and Actb genes.
- Whole Transcriptome libraries were sequenced at 10,000 reads/cell.
- Focal libraries were sequenced at 250 reads/cell.
- FIGS. 9A-9B illustrate enrichment of spiked-in primers for moderately expressed genes, and improvement in purity following application of a 1 read count threshold filter.
- FIG. 9A Two low- to medium-expressing genes were enriched using the herein-disclosed methods in humans (KDELR1) and mice (Psmd2), and the level of enrichment (as measured by virtue of the percentage of cells comprising the gene or reads per cell) was determined in 12k and 62k cell sublibraries.
- Sequencing libraries were prepared according to the herein-described methods using two sets of target genes (Psmd2-KDELR1 or Fn1-RPL5) with 10 ng or 50 ng of preamplified cDNA introduced into the first round of target sequence-specific amplification performed subsequent to the multiplex “preamplification” round of amplification.
- FIG. 10 shows a TapeStation image of the amount of enriched target gene transcripts (RPL5 and Fn1) in samples prepared with (lane C2) or without (lane A2) spiked-in target-specific primers during a first round of multiplex cDNA amplification (i.e., “preamplification”).
- FIG. 11 shows the fraction of sequencing reads with valid barcodes that mapped to either of the targeted genes RPL5 or Fn1, in samples prepared with (sublibrary S3) or without (sublibrary S1) spiked-in target-specific primers during a first round of multiplex cDNA amplification (i.e., “preamplification”).
- FIG. 12 shows the number of target gene transcripts (i.e., RPL5 or Fn1 transcripts) detected per cell in samples prepared with (sublibrary S3) or without (sublibrary S1) spiked-in target-specific primers during a first round of multiplex cDNA amplification (i.e., “preamplification”).
- FIGS. 13A-13B show the fraction of cells that contained an enriched transcript (i.e., an RPL5 or Fn1 transcript), either without any filtering (FIG. 13A) or following a filtering step to only consider transcripts represented by more than 2 reads (FIG. 13B), in samples prepared with (sublibrary S3) or without (sublibrary S1) spiked-in target-specific primers during a first round of multiplex cDNA amplification (i.e., “preamplification”).
- FIGS. 14A-14B show high TCR detection in primary T cells, demonstrating sensitive clonotype detection.
- FIG. 14A shows high TCR chain identification rate. Isolated T cells from healthy donor PBMCs were directly profiled (Primary). Alpha, Beta, and Paired detection are represented in percentages.
- FIG. 14B shows TCR chain assignment across 8 donors. High rate of chain assignments to both TCR alpha and beta. Among T cells with a detected TCR, paired alpha beta chain assignments ranged between 49%-66%.
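The alpha, beta, and paired detection percentages described for FIGS. 14A-14B can be derived from per-cell chain calls. The sketch below is a minimal illustration, assuming a hypothetical mapping from cell barcode to the set of chains detected in that cell; the toy input is invented purely to exercise the function and does not reproduce any data from the figures.

```python
def chain_detection_rates(cell_chains):
    """Compute TCR chain detection percentages.

    cell_chains maps a cell barcode to the set of chain names
    ("alpha", "beta") detected in that cell; an empty set means no chain.
    """
    n_cells = len(cell_chains)
    chains = list(cell_chains.values())
    n_alpha = sum(1 for c in chains if "alpha" in c)
    n_beta = sum(1 for c in chains if "beta" in c)
    n_paired = sum(1 for c in chains if {"alpha", "beta"} <= c)
    n_any = sum(1 for c in chains if c)

    def pct(numerator, denominator):
        # Guard against empty inputs rather than dividing by zero.
        return 100.0 * numerator / denominator if denominator else 0.0

    return {
        "alpha_pct": pct(n_alpha, n_cells),
        "beta_pct": pct(n_beta, n_cells),
        "paired_pct": pct(n_paired, n_cells),
        # Paired rate restricted to cells with at least one detected chain.
        "paired_among_detected_pct": pct(n_paired, n_any),
    }


# Toy input (hypothetical cell barcodes and calls).
toy = {
    "cell_1": {"alpha", "beta"},
    "cell_2": {"alpha"},
    "cell_3": {"beta"},
    "cell_4": set(),
}
print(chain_detection_rates(toy))
```

Reporting the paired rate both over all cells and over cells with a detected TCR keeps the two denominators explicit, since the 49%-66% range above is stated relative to T cells with a detected TCR.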
- FIGS. 15A-15B show comprehensive Immune Repertoire Detection measured by number of unique alpha and beta chain clonotypes across donors. Nearly four hundred thousand unique alpha chain clonotypes and five hundred thousand unique beta chain clonotypes were identified across the 8 donors, with the vast majority being classified as rare clonotypes. The rare clonotypes (darker color, lower shaded) are defined as only being detected in 1 or 2 cells and the majority of detected clonotypes are rare.
- FIG. 15A Unique Alpha Chain.
- FIG. 15B Unique Beta Chain.
- FIGS. 16A-16D show increased detection of TCR alpha and beta chains with spiking in of TCR-specific primers during first multiplex cDNA amplification step (i.e., “preamplification” step).
- FIG. 16A shows percentages of activated T cells with detected alpha chain or with no detected chain, in libraries prepared with or without TCR-specific preamplification primers.
- FIG. 16B shows percentages of resting T cells with detected alpha chain or with no detected chain, in libraries prepared with or without TCR-specific preamplification primers.
- FIG. 16C shows percentages of activated T cells with detected beta chain or with no detected chain, in libraries prepared with or without TCR-specific preamplification primers.
- FIG. 16D shows percentages of resting T cells with detected beta chain or with no detected chain, in libraries prepared with or without TCR-specific preamplification primers.
- Spike-in +: TCR-specific preamplification primers added.
- Spike-in -: TCR-specific preamplification primers not added.
- the present disclosure provides methods, compositions, kits, and systems for labeling RNA and other molecules within and originating from cells and nuclei.
- the present methods relate to single cell methods of labeling, in parallel, target transcripts of interest such as CRISPR gRNAs or T cell receptors, and the transcriptomes of single cells, e.g., whole transcriptomes or subsets of the transcriptome.
- nucleic acids for multiplex transcriptional analysis in a plurality of cells or nuclei, the method comprising:
- RNA molecules are reverse transcribed using reverse transcription (RT) primers each comprising: (i) a poly(T) sequence or a random sequence; and (ii) an RT barcode sequence, wherein the RT barcode sequences present within the RT primers are specific to each aliquot within the first plurality of aliquots;
- RT reverse transcription
- nucleic acid tags to cDNA molecules within the cells or nuclei of the additional plurality of aliquots, wherein each nucleic acid tag comprises a tag barcode sequence, and wherein the tag barcode sequences present within the nucleic acid tags are specific to each aliquot within the additional plurality of aliquots;
- the multiplex set of preamplification primers comprises: at least one pair of whole transcriptome (WT) preamplification primers configured to amplify all of the tagged cDNA molecules isolated from the lysate, and at least one pair of target-specific preamplification primers configured to specifically amplify tagged target cDNA molecules isolated from the lysate, thereby generating an enriched plurality of amplified tagged cDNA molecules that comprises the whole transcriptome and that is enriched for the one or more target cDNA molecules;
- WT whole transcriptome
- the preparation of the WT sequencing library comprises fragmenting the first portion of the enriched plurality of amplified tagged cDNA molecules and appending a first adapter comprising a first adapter sequence to the fragment ends, and amplifying the tagged cDNA molecules comprising the first adapter in an index PCR using WT index amplification primers, wherein one or more of the WT index amplification primers comprises a sequence complementary to the first adapter sequence, wherein one or more of the WT index amplification primers comprises an index sequence, and wherein one or more of the WT index amplification primers comprises a next-generation sequencing (NGS) adapter sequence, an NGS primer binding sequence, and/or an NGS flow-cell binding sequence.
- NGS next-generation sequencing
- the preparation of the target sequencing library comprises: amplifying the tagged target cDNA molecules in the second portion of the enriched plurality of amplified tagged cDNA molecules in a second target-specific amplification step, thereby generating a further enriched plurality of tagged target cDNA molecules; wherein the tagged target cDNA molecules are amplified in the second target-specific amplification step using at least one pair of target amplification primers configured to specifically amplify the tagged target cDNA molecules in the second portion; and amplifying the further enriched plurality of tagged target cDNA molecules in an index target PCR using target index amplification primers; wherein one or more of the target index amplification primers comprises an index sequence, and wherein one or more of the target index amplification primers comprises a next-generation sequencing (NGS) adapter sequence, an NGS primer binding sequence, and/or an NGS flow-cell binding sequence.
- the nucleic acid tags that are coupled to the tagged cDNA molecules during the last of the one or more times that steps (e)(i) to (e)(iii) are repeated comprise one or more elements selected from the group consisting of a random nucleotide sequence to prevent counting of PCR duplicates, a capture agent, and a second adapter sequence.
- the capture agent comprises biotin.
- the binding agent comprises streptavidin-coated magnetic beads.
- the lysing in step (f) is performed in the presence of a protease.
- the protease is proteinase K.
- a protease inhibitor is added to the lysate prior to or together with the binding agent.
- the protease inhibitor is phenylmethanesulfonyl fluoride (PMSF) or 4-(2-aminoethyl)benzenesulfonyl fluoride hydrochloride (AEBSF).
- PMSF phenylmethanesulfonyl fluoride
- AEBSF 4-(2-aminoethyl)benzenesulfonyl fluoride hydrochloride
- the second strands are generated in step (h) using a template switching oligo (TSO) comprising a third adapter sequence, such that the template switching introduces the third adapter sequence to the 3’-end of the released cDNA molecules.
- TSO template switching oligo
- prior to the lysing in step (f), the plurality of cells or nuclei comprising the tagged cDNA molecules is divided into a plurality of samples.
- the index sequence in one or more of the WT index amplification primers and/or one or more of the target index amplification primers is sample-specific.
- at least one target-specific preamplification primer and at least one target amplification primer are complementary to the same sequence within a target cDNA molecule.
- none of the target-specific preamplification primers specifically binds to the same sequence within a target cDNA molecule as a target amplification primer.
- the at least one pair of target-specific preamplification primers and the at least one pair of WT preamplification primers have at least one preamplification primer in common.
- the at least one preamplification primer common to both the at least one pair of target-specific preamplification primers and the at least one pair of WT preamplification primers comprises a sequence complementary to the second adapter sequence.
- the at least one pair of WT preamplification primers comprise a primer comprising a sequence complementary to the third adapter sequence and a primer comprising a sequence complementary to the second adapter sequence.
- the multiplex set of preamplification primers comprises a single pair of target-specific preamplification primers and/or a single pair of WT preamplification primers.
- the multiplex set of preamplification primers comprises from 1-10, 10-100, or 100-200 pairs of target-specific preamplification primers.
- the RT primers and/or the nucleic acid tags are DNA molecules.
- the method further comprising sequencing the WT sequencing library and the target sequencing library.
- the WT sequencing library and the target sequencing library are sequenced separately.
- the WT sequencing library and the target sequencing library are sequenced together.
- the method further comprising filtering the sequencing reads generated from the sequencing of the WT sequencing library and the target sequencing library to remove transcripts represented by only one read.
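The read-count filter described above can be sketched in a few lines of Python. This is an illustrative sketch rather than the pipeline used in the disclosure: it assumes each read has already been reduced to a hypothetical `(cell_barcode, gene, umi)` tuple, where the "umi" stands in for the random nucleotide sequence used to avoid counting PCR duplicates, and the function name and `min_reads` parameter are invented for the example.

```python
from collections import Counter


def filter_low_support(reads, min_reads=2):
    """Keep transcripts (molecules) supported by at least `min_reads` reads.

    reads: iterable of (cell_barcode, gene, umi) tuples, one tuple per read.
    Returns a dict mapping each retained molecule to its read count.
    """
    counts = Counter(reads)
    return {molecule: n for molecule, n in counts.items() if n >= min_reads}


# Toy reads (hypothetical barcodes and UMIs).
reads = [
    ("cell_A", "RPL5", "AACGT"),
    ("cell_A", "RPL5", "AACGT"),   # second read of the same molecule
    ("cell_A", "RPL5", "GGTCA"),   # molecule seen only once: removed
    ("cell_B", "KDELR1", "TTAGC"),
    ("cell_B", "KDELR1", "TTAGC"),
    ("cell_B", "KDELR1", "TTAGC"),
]
print(filter_low_support(reads))
```

With `min_reads=2`, molecules represented by a single read are discarded; raising the threshold trades sensitivity for purity.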
- the method further comprising grouping sequencing reads generated from the sequencing of the WT sequencing library and the target sequencing library according to one or more features selected from the group consisting of RT barcode sequences, tag barcode sequences, series or combinations of tag barcode sequences, and index sequences.
- the grouped sequencing reads are used to determine the individual cell or nucleus from among the plurality of cells or nuclei from which a given cDNA originated.
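Grouping reads by their barcode combination, as described above, amounts to using the ordered series of barcodes as a cell identifier. The sketch below shows one way this could look, assuming reads have already been parsed into dictionaries; the keys `index`, `rt_barcode`, `tag_barcodes`, and `gene`, as well as the barcode labels, are hypothetical names chosen for the example.

```python
from collections import defaultdict


def group_reads_by_cell(reads):
    """Group reads by their full barcode combination.

    Each read is a dict with hypothetical keys: 'index' (sample-specific
    index), 'rt_barcode' (Round 1), 'tag_barcodes' (tuple of barcodes from
    later rounds), and 'gene'. Reads sharing the same combination are
    assigned to the same cell or nucleus of origin.
    """
    cells = defaultdict(list)
    for read in reads:
        cell_key = (read["index"], read["rt_barcode"], *read["tag_barcodes"])
        cells[cell_key].append(read["gene"])
    return dict(cells)


# Toy reads: the first two share all barcodes, the third differs in Round 3.
reads = [
    {"index": "S1", "rt_barcode": "BC1_07",
     "tag_barcodes": ("BC2_13", "BC3_88"), "gene": "GAPDH"},
    {"index": "S1", "rt_barcode": "BC1_07",
     "tag_barcodes": ("BC2_13", "BC3_88"), "gene": "sgRNA_A"},
    {"index": "S1", "rt_barcode": "BC1_07",
     "tag_barcodes": ("BC2_13", "BC3_02"), "gene": "GAPDH"},
]
for cell, genes in group_reads_by_cell(reads).items():
    print(cell, genes)
```

Because whole transcriptome and target reads from the same cell carry the same barcode combination, the same key can link a target sequence (for example, a gRNA) to the transcriptome of that cell.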
- the method further comprising grouping sequence reads into target cDNA sequences and non-target cDNA sequences.
- the method further comprising identifying one or more cells or nuclei from among the plurality of cells or nuclei in which a target cDNA is expressed, or in which the target cDNA is expressed at an increased or decreased level relative to a reference value.
- the method further comprising relating the identity and/or expression level of a target gene or sequence in an individual cell or nucleus to the overall pattern of expression of the whole transcriptome in the same individual cell or nucleus.
- the method further comprising relating the identity and/or expression level of a target gene or sequence in an individual cell or nucleus to the expression level of one or more individual non-target genes in the same original cell or nucleus.
- the one or more target cDNAs comprise one or more CRISPR guide RNA (gRNA) sequences.
- gRNA CRISPR guide RNA
- the gRNA sequences are from a gRNA library used in a CRISPR screen.
- At least one of the target-specific preamplification primers is specific to a Pol III promoter.
- the Pol III promoter is a U6 promoter.
- At least one of the target-specific preamplification primers comprises the sequence of SEQ ID NO: 11.
- At least one pair of target-specific preamplification primers includes a primer comprising the sequence of SEQ ID NO: 12 or SEQ ID NO: 13.
- at least one of the target amplification primers is specific to a Pol III promoter.
- the Pol III promoter is a U6 promoter.
- At least one of the target amplification primers comprises the sequence of SEQ ID NO: 14.
- At least one of the target amplification primers comprises the sequence of SEQ ID NO: 12.
- the plurality of cells each comprises an expression construct encoding an RNA-guided nuclease, or inactivated form thereof, that is capable of physically interacting with the guide RNA and being directed to a target locus in the genome by the guide RNA.
- the RNA-guided nuclease is Cas9 or Cpf1.
- the target genes or sequences comprise a T cell receptor (TCR) gene or sequence.
- TCR T cell receptor
- the at least one pair of target-specific preamplification primers and/or the at least one pair of target amplification primers comprise a primer specific to a TCR alpha, beta, gamma, or delta chain.
- the at least one pair of target-specific preamplification primers and/or the at least one pair of target amplification primers comprise a primer specific to a TCR alpha chain and a primer specific to a TCR beta chain.
- At least one of the target-specific preamplification primers and/or at least one of the target amplification primers is specific to a TCR CDR3 region.
- the cells or nuclei comprise mammalian cells or nuclei.
- the cells or nuclei comprise human cells or nuclei.
- the cells or nuclei comprise mouse cells or nuclei.
- the cells or nuclei comprise T cells or nuclei derived therefrom.
- the T cells or nuclei comprise one or more cells or nuclei selected from the group consisting of chimeric antigen receptor (CAR) T cells, activated T cells, primary cells, T cells isolated from a cell line, T cells isolated from a tissue, T cells isolated from a subject, effector T cells, cytotoxic T cells, helper T cells, regulatory T cells, memory T cells, nuclei derived from any of the heretofore listed T cells, and combinations thereof.
- the first multiplex amplification of step (i) comprises amplifying the tagged cDNA molecules for from 5 to 20 cycles.
- the first multiplex amplification of step (i) is performed according to the conditions shown in Table 12.
- steps (a), (b), (d), (e)(i), (e)(iii), or (f) are carried out at a temperature of below about 8, 7, 6, 5, 4, 3, 2, 1, 0, -1, -2, -3, or -4 °C, between about -4 to 8, -4 to 0, 0 to 4, 4 to 8, or 0 to 8 °C, or at about 8, 7, 6, 5, 4, 3, 2, 1, 0, -1, -2, -3, or -4 °C.
- the cells or nuclei were fixed and/or permeabilized at 4 °C or below 4 °C.
- the nucleic acid tags are coupled to the cDNA molecules in step (e)(ii) by ligation.
- the RT primers each comprise a 5’ overhang comprising a 5’ overhang sequence.
- the nucleic acid tags each comprise a first strand comprising a 3’ hybridization sequence and/or a 5’ hybridization sequence flanking the 3’ end and/or the 5’ end of the tag barcode sequence, respectively.
- the nucleic acid tags each further comprise a second strand comprising: a first portion complementary to a 5’ hybridization sequence of a previously coupled nucleic acid tag or a 5’ overhang sequence of an RT primer; and a second portion complementary to the 3’ hybridization sequence.
- the method further comprising: size selecting the enriched plurality of amplified tagged cDNA molecules subsequent to step (h) using solid phase reversible immobilization (SPRI) beads.
- the method further comprising: size selecting the further enriched plurality of tagged target cDNA molecules using solid phase reversible immobilization (SPRI) beads.
- the size selection using SPRI beads is single-sided.
- the size selection using SPRI beads is double-sided.
- one or more of the pluralities of aliquots or samples are distributed in a multi-well plate.
- the multi-well plate is a 96-well plate.
- the nucleic acid tags used in one or more of the additional pluralities of aliquots comprise 96 distinct barcode sequences.
- each of the 96 distinct barcode sequences is present in only one of the 96 aliquots.
- At least a subset of the wells of the multiwell plate contain primers comprising well-specific unique dual indexes (UDIs).
- one or more of the UDIs comprise any of SEQ ID NOS: 13-302.
- Also provided are kits for performing the method of any one or more of claims 1-70.
- kits for preparing multiplex sequencing libraries comprising at least one set of primers for cDNA labeling and amplification of the whole transcriptome, and at least one set of target-specific preamplification primers for cDNA labeling, amplification and enrichment of one or more target genes of interest.
- the kit further comprising instructions for preparing the sequencing libraries.
- the kit comprises target-specific primers comprising SEQ ID NO: 11 or SEQ ID NO: 14.
DETAILED DESCRIPTION
- the present disclosure relates generally to methods of uniquely labeling or barcoding molecules within a nucleus, a plurality of nuclei, a cell, a plurality of cells, and/or one or more tissues, organs, organisms, or subjects.
- the present disclosure also relates to kits for uniquely labeling or barcoding molecules within a nucleus, a plurality of nuclei, a cell, a plurality of cells, and/or a tissue, organ or organism.
- the molecules to be labeled may include, but are not limited to, RNA molecules, cDNA molecules, DNA molecules, proteins, peptides, and/or antigens.
- the present disclosure provides methods and compositions for creating multiple related sequencing libraries, e.g., transcriptome sequencing libraries for multiplex analyses.
- single cell whole transcriptome libraries are created that are coupled with gene- or vector-enriched libraries.
- the present methods and compositions enable the enrichment and highly sensitive detection of individual (or a small number of, e.g., up to 10, 50, 100, 200) target transcripts of interest in parallel with the whole transcriptome (or a subset thereof) in single cells.
- the target transcripts can correspond to any sequence of interest whose presence or expression level in a cell may be associated with other properties of interest of the cell.
- the transcripts comprise CRISPR guide RNAs (gRNAs, or single-guide RNAs or sgRNAs), and the present methods can be used to associate the presence and/or expression level of a given gRNA in an individual cell (and therefore any expected effects on the genes targeted by the gRNA) with the whole transcriptome of the same cell.
- Such methods enable, for example, the profiling of up to one million or more cells, while reducing the sequencing needed to detect and assign specific expression events (e.g., the expression of individual gRNAs) to individual cells in a population of cells (e.g., in a pooled single-cell CRISPR screen such as CROP-seq or similar methods).
- the present methods can also be used in methods involving the analysis of T cell receptor (TCR) identity in individual cells; the analyses performed using the presently described methods can allow the detection or characterization of a disease process (e.g., via the detection and/or tracking of T cell clonotypes) or the development, preparation, or monitoring of cell therapies (e.g., therapies involving chimeric antigen receptor (CAR) T cells).
- the present methods can be used for other applications as well, e.g., any of a number of applications in which it is desirable to associate the presence or expression level of a target with the whole transcriptome in single cells, e.g., for the validation of biomarkers, for the validation of drug targets, for lineage tracing, or for Massively Parallel Reporter Assays (MPRAs).
- each embodiment disclosed herein can comprise, consist essentially of, or consist of its particular stated element, step, ingredient, or component.
- the transition term “comprise” or “comprises” means includes, but is not limited to, and allows for the inclusion of unspecified elements, steps, ingredients, or components, even in major amounts.
- the transitional phrase “consisting of” excludes any element, step, ingredient or component not specified.
- the transition phrase “consisting essentially of” limits the scope of the embodiment to the specified elements, steps, ingredients or components, and to those that do not materially affect the embodiment.
- the term “about” has the meaning reasonably ascribed to it by a person skilled in the art when used in conjunction with a stated numerical value or range, i.e., denoting somewhat more or somewhat less than the stated value or range, to within a range of, e.g., ±20% of the stated value; ±19% of the stated value; ±18% of the stated value; ±17% of the stated value; ±16% of the stated value; ±15% of the stated value; ±14% of the stated value; ±13% of the stated value; ±12% of the stated value; ±11% of the stated value; ±10% of the stated value; ±9% of the stated value; ±8% of the stated value; ±7% of the stated value; ±6% of the stated value; ±5% of the stated value; ±4% of the stated value; ±3% of the stated value; ±2% of the stated value; or ±1% of the stated value.
- “polynucleotide” and “nucleic acid,” used interchangeably herein, refer to a polymeric form of nucleotides of any length, either ribonucleotides or deoxynucleotides.
- this term includes, but is not limited to, single-, double-, or multi-stranded DNA or RNA, genomic DNA, cDNA, DNA-RNA hybrids, or a polymer comprising purine and pyrimidine bases or other natural, chemically or biochemically modified, non-natural, or derivatized nucleotide bases.
- the terms also encompass nucleic acids containing known analogs or derivatives of natural nucleotides, e.g., molecules that have similar binding properties as the reference nucleic acid.
- nucleic acids can comprise one or more modified nucleotides, e.g., nucleic acids modified at the base moiety, at the sugar moiety, or at the phosphate backbone (e.g., phosphorothioates).
- nucleic acids can comprise one or more moieties to allow or facilitate, e.g., detection, quantification, purification, capture, identification, or selective removal, e.g., biotin, fluorescent labels, etc.
- “gene” refers to the segment of DNA involved in producing a polypeptide chain or a non-coding transcript (e.g., mRNA). For coding sequences, it may include regions preceding and following the coding region (leader sequence and/or trailer sequence) as well as intervening sequences (introns) between individual coding segments (exons).
- a “transgene” refers to a gene that has been introduced into a cell or organism from another source (e.g., from another organism or following synthesis).
- by “hybridizable” or “complementary” or “substantially complementary” it is meant that a nucleic acid (e.g., RNA) comprises a sequence of nucleotides that enables it to non-covalently bind, i.e., form Watson-Crick base pairs and/or G/U base pairs, “anneal”, or “hybridize,” to another nucleic acid in a sequence-specific, antiparallel manner (i.e., a nucleic acid specifically binds to a complementary nucleic acid) under the appropriate in vitro and/or in vivo conditions of temperature and solution ionic strength.
- standard Watson-Crick base-pairing includes: adenine (A) pairing with thymidine (T), adenine (A) pairing with uracil (U), and guanine (G) pairing with cytosine (C) [DNA, RNA].
- G/U base-pairing is partially responsible for the degeneracy (i.e., redundancy) of the genetic code in the context of tRNA anti-codon base-pairing with codons in mRNA.
- a guanine (G) of a protein-binding segment (dsRNA duplex) of a subject DNA-targeting RNA molecule is considered complementary to a uracil (U), and vice versa.
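The pairing rules defined above can be sketched as a small check for whether two strands are complementary in antiparallel orientation. This is only an illustration: the function name `can_hybridize` and the `allow_gu` switch are hypothetical, and the check ignores temperature, ionic strength, and partial complementarity, all of which the definitions above allow for.

```python
# Illustrative sketch only; names are hypothetical and not part of the disclosure.
WATSON_CRICK = {("A", "T"), ("T", "A"), ("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
GU_PAIRS = {("G", "U"), ("U", "G")}


def can_hybridize(first, second, allow_gu=False):
    """Return True if every base of `first` pairs with `second`, antiparallel.

    Both sequences are written 5'->3', so `second` is read in reverse.
    Full-length, gap-free pairing is assumed (a simplification).
    """
    if len(first) != len(second):
        return False
    allowed = WATSON_CRICK | GU_PAIRS if allow_gu else WATSON_CRICK
    pairs = zip(first.upper(), reversed(second.upper()))
    return all(pair in allowed for pair in pairs)
```

For example, `can_hybridize("GGAU", "AUUC")` fails under strict Watson-Crick rules because of one G/U pair, but succeeds with `allow_gu=True`.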
- “hybridize” or “complementary” refer to a first nucleotide sequence capable of forming non-covalent bonds (e.g., hydrogen bonds) with at least a portion of a specified second nucleotide sequence.
- a “promoter” refers to a set of nucleic acid sequences that direct the transcription of a nucleic acid, e.g., an adjacent coding sequence. Promoters can be constitutive or inducible. As used herein, a promoter includes necessary nucleic acid sequences near the start site of transcription, such as, in the case of a polymerase II type promoter, a TATA element. Promoters as used herein can include bacterial promoters or eukaryotic promoters including RNA polymerase II (e.g., EF-1 alpha) and RNA polymerase III (e.g., U6) promoters. A promoter can also include distal enhancer or repressor elements. The promoter can be a heterologous promoter (i.e., not naturally linked to the coding sequence) or homologous (i.e., the promoter that naturally drives the expression of the transcribed sequence).
- “binding” or “coupling” is used broadly throughout this disclosure to refer to any form of attaching or coupling two or more components, entities, or objects.
- two or more components may be bound to each other via chemical bonds, covalent bonds, non-covalent bonds, ionic bonds, hydrogen bonds, electrostatic forces, Watson-Crick hybridization, nucleic acid sequence complementarity, etc.
- An “expression cassette” is a nucleic acid construct, generated recombinantly or synthetically, with a series of specified nucleic acid elements that permit transcription of a particular polynucleotide sequence in a host cell.
- An expression cassette may be part of a plasmid, viral genome, or nucleic acid fragment.
- an expression cassette includes a polynucleotide to be transcribed (e.g., a protein coding sequence or a non-coding RNA such as a guide RNA), operably linked to a promoter.
- the promoter can be a heterologous promoter, i.e., a promoter not naturally linked to the transcribed sequence.
- a “barcode” or “index” refers to a nucleotide sequence (the “barcode sequence” or “index sequence”) that is used to label an entity such as a cell, plurality of cells, cell populations, cell compartments, nucleic acids, polypeptides, or other molecules, and that varies among or between cells, cell populations, nucleic acids or other molecules, etc.
- a barcode is used to label (or tag) cDNAs generated within a given aliquot of cells, e.g., where all of the cDNAs labeled in the aliquot receive the same barcode, or receive a set of barcodes that is specific to the aliquot, i.e., that the specific set of barcodes used in the aliquot is different from the sets used in the other aliquots.
- Barcodes can be added to polynucleotides (or other molecules) in any of a number of ways.
- for example, they can be introduced, e.g., in a primer, template, template-switch oligonucleotide (TSO), or other polynucleotide used during a polymerization-based reaction such as reverse transcription, PCR, or other polymerization-based and/or amplification reaction; barcodes can also be added to polynucleotides by hybridization and/or by ligation, e.g., by ligation of an adaptor or other polynucleotide (e.g., a “nucleic acid tag”) comprising a barcode.
- Such adaptors or other barcode-comprising polynucleotides can be appended to a polynucleotide, e.g., by ligation via blunt-end ligation, ligation to compatible restriction ends, ligation to A-tailed or otherwise tailed ends, using a linker strand, etc.
- the barcode, or adapter or other polynucleotide comprising the barcode can be single-stranded, double-stranded, partially double-stranded and partially single-stranded (e.g., comprising one or more overhangs at the 3’ and/or 5’ ends), etc.
- when a polynucleotide is said to comprise a “barcode” (or the equivalent term “barcode sequence”) it means that the polynucleotide comprises a sequence of nucleotides that can be used to distinguish the polynucleotide comprising the barcode from one or more other polynucleotides, e.g., from polynucleotides originating from another cell, from polynucleotides labeled in a different aliquot or well, or from all other polynucleotides in a sample.
- the barcode alone is sufficient to distinguish the polynucleotide from other polynucleotides, whereas in other embodiments the barcode provides information that can contribute to distinguishing the polynucleotide from other polynucleotides, but is not sufficient on its own (e.g., one or more additional sequence elements, or other markers, are also needed to completely distinguish the polynucleotide).
- a “barcode” can refer to a single sequence of contiguous nucleotides, or to a combination of individual sequences of contiguous nucleotides.
- multiple rounds of tagging can be performed, e.g., multiple rounds in each of which cells are divided into aliquots, nucleic acid tags (comprising a barcode) are added to molecules (such as cDNAs) in the cells of each aliquot, and the cells are then recombined (or repooled).
- the tagged cDNAs in the cells of the aliquots can comprise two nucleic acid tags, each comprising a barcode.
- the two barcodes present on the same molecule can be referred to herein as a single “barcode,” even if there are other sequence elements (such as linker sequences, adapter sequences, primer-binding sequences, etc.) intervening between the two barcodes on the molecule.
- when a polynucleotide is said to comprise a “barcode,” this can mean that, depending on the context, the specific sequence of the barcode (or “barcode sequence”) can vary between the different polynucleotides comprising the barcode, or that the specific barcode sequence is the same between the different polynucleotides.
- all of the cDNAs in cells of a given aliquot (or sample) are tagged with nucleic acid tags comprising the same barcode sequence, whereas the cDNAs of cells in other aliquots are tagged with nucleic acid tags comprising other barcode sequences.
- a polynucleotide comprising a barcode is added to molecules in a given aliquot (or sample), wherein the specific barcode sequence differs between the different polynucleotides used in the aliquot or sample; such polynucleotides and barcodes can be used, for example, to distinguish between the different original molecules in the sample (e.g., to be able to detect errors arising during amplification of the original molecules, such as cDNAs derived from original mRNA molecules).
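One common use of such molecule-distinguishing barcodes is to collapse reads that derive from the same original molecule, so that amplification duplicates are not counted twice. The sketch below assumes reads have already been parsed into `(cell_barcode, gene, molecule_barcode)` tuples; that tuple layout, the function name, and the example values are all hypothetical, and no sequencing-error correction is attempted.

```python
from collections import defaultdict


def count_original_molecules(reads):
    """Count distinct molecule barcodes per (cell barcode, gene).

    `reads` is an iterable of (cell_barcode, gene, molecule_barcode) tuples.
    Reads sharing all three fields are treated as amplification copies of
    one original molecule (an assumption of this sketch).
    """
    seen = defaultdict(set)
    for cell_barcode, gene, molecule_barcode in reads:
        seen[(cell_barcode, gene)].add(molecule_barcode)
    return {key: len(barcodes) for key, barcodes in seen.items()}


# Hypothetical example: the first two reads are copies of the same molecule.
example_reads = [
    ("AACC-GGTT", "ACTB", "TTGCA"),
    ("AACC-GGTT", "ACTB", "TTGCA"),
    ("AACC-GGTT", "ACTB", "CCGAT"),
    ("TTAA-CCGG", "ACTB", "TTGCA"),
]
example_counts = count_original_molecules(example_reads)
```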
- one or more barcodes is included in primers used for reverse transcription (i.e., RT primers) or amplification (e.g., PCR), or in a nucleic acid tag appended to a polynucleotide, e.g., by ligation, where the cells are divided into two or more (e.g., 2, 4, 8, 12, 16, 24, 32, 48, 96, or more) aliquots or wells prior to the reverse transcription, amplification, or ligation, and the barcode sequences used in the cells are aliquot- or well-specific.
- when barcode or index sequences are said to be “aliquot-specific” or “well-specific,” this means that there is an association between the sequences used and the presence of the different cells within the two or more aliquots or wells, such that the association can be used to derive information about the location of a given cell within the aliquots or wells based upon the specific sequence. For example, in some embodiments each cell within a given aliquot or well has primers or tags with the same barcode sequence, and the barcode sequences are different between each aliquot or well.
- the barcodes are still considered aliquot-specific or well-specific, so long as some information can be derived from the barcode sequence about the aliquot or well in which a given cell or nucleus was present.
- “split-pool labeling” or “split-pool barcoding” or “combinatorial labeling” or “combinatorial barcoding” refers to a cell-specific labeling method involving the use of fixed and permeabilized cells or nuclei as containers, wherein a plurality of the cells or nuclei are first separated into multiple wells (or aliquots), followed by the labeling of RNA or other molecules within each cell or nucleus using a well-specific tag or barcode, followed by the pooling of the cells, and wherein this cycle of separation, tagging, and pooling is repeated one or more times.
- each cell or nucleus within the plurality will comprise a combination of tags or barcodes that will reflect the particular combination of wells or aliquots in which it was present throughout the multiple rounds of tagging.
- the number of potential barcode combinations increases correspondingly.
- a suitable experimental design can be prepared that will generate a high likelihood that tagged molecules, e.g., cDNAs, within each cell or nucleus will have the same combination of barcodes that is unique among the overall population of cells or nuclei.
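The effect of repeated split-pool rounds can be illustrated with a small simulation. The sketch assumes each cell lands in a uniformly random well in every round, which is an idealization and not something the disclosure states; the function names and parameters are hypothetical. With 96 wells per round, three rounds give 96 × 96 × 96 = 884,736 possible combinations.

```python
import random
from collections import Counter


def split_pool(n_cells, wells_per_round, rounds, seed=0):
    """Simulate split-pool tagging; return one well-index tuple per cell.

    Assumes uniform random distribution of cells into wells in each round.
    """
    rng = random.Random(seed)
    combinations = [[] for _ in range(n_cells)]
    for _ in range(rounds):
        # Split: every cell goes to a random well and receives that well's tag.
        for combination in combinations:
            combination.append(rng.randrange(wells_per_round))
        # Pool: cells are recombined before the next round (no state needed).
    return [tuple(combination) for combination in combinations]


def fraction_unique(combinations):
    """Fraction of cells whose barcode combination is shared with no other cell."""
    counts = Counter(combinations)
    return sum(1 for c in combinations if counts[c] == 1) / len(combinations)
```

Calling `fraction_unique(split_pool(1000, 96, 1))` and then the same with three rounds shows how quickly uniqueness improves as rounds are added.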
- split-pool labeling methods are disclosed, e.g., in US Patent Nos. 10,900,065, 11,634,751, 11,168,355, 11,427,856, 11,555,216, 11,639,519, 11,680,283, 10,633,648, 11,421,221, US Pat. App. Pub. No. US 2021/0388415 A1, in Rosenberg et al., Science 360, 176-182 (2018), Rosenberg et al., BioRxiv (2017), “Scaling single cell transcriptomics through split pool barcoding,” doi.org/10.1101/105163, Tran et al. BioRxiv (2022) “High sensitivity single cell RNA sequencing with split pool barcoding,” doi.org/10.1101/2022.08.27.505512, the entire disclosures of all of which are herein incorporated by reference (including all supplemental material).
- CRISPR/Cas9 refers to a class of bacterial systems for defense against foreign nucleic acids.
- CRISPR-Cas systems are found in a wide range of bacterial and archaeal organisms.
- CRISPR-Cas systems fall into two classes with six types, I, II, III, IV, V, and VI, with Class 1 including types I, III, and IV CRISPR systems, and Class 2 including types II, V, and VI.
- Class 2 CRISPR-Cas systems include individual Cas proteins that carry out multiple functions including spacer acquisition, RNA processing from the CRISPR locus, target identification, and cleavage of target nucleic acids, e.g., Cas9 in the case of type II systems and Cas12a (or Cpf1) in the case of type V systems.
- Any suitable Cas protein or protein assembly can be used in the present methods, i.e., any Cas that can associate with a guide RNA (gRNA), be directed to a target sequence as defined by the gRNA, and, e.g., cleave or otherwise degrade the target sequence, inhibit or activate transcription at the target sequence (in the case of CRISPRi or CRISPRa), etc.
- a “guide RNA” refers to an RNA sequence that comprises a constant, scaffold sequence (tracrRNA) that interacts with the Cas protein (e.g., Cas9), and a variable sequence (crRNA) that defines the target sequence of the nuclease.
- “tagged cDNA molecules” refers to complementary DNA (cDNA) molecules comprising one or more barcodes (e.g., well-specific barcodes), e.g., cDNA molecules generated within cells or nuclei to which a nucleic acid tag has been appended (e.g., by ligation).
- “tagged target cDNA molecules” refers to tagged cDNA molecules (or transcripts) expressed from a “target gene” or comprising a “target sequence” of interest.
- Target cDNAs can refer to any transcript whose particular identity, whose presence, or whose level of expression can potentially vary between cells or nuclei, and where it may be of interest to correlate the identity, presence, and/or expression level of the transcript with the whole transcriptome in individual cells or nuclei.
- target transcripts are guide RNAs (gRNAs), e.g., as used in the context of a CRISPR screen, or T cell receptors (TCR), where different clonotypes may display associations with, e.g., disease patterns or other physiological or biological features of interest as reflected in the whole transcriptome in single cells.
- “preamplification” refers to a round of multiplex PCR performed on the whole transcriptome (WT), using WT primers (or “WT preamplification primers”) configured to amplify the whole transcriptome, e.g., primers annealing to adapter sequences (e.g., TSO or R2 sequences) present on all tagged cDNAs generated using the present combinatorial barcoding methods (see, e.g., FIGS.
- target cDNAs e.g., one or more primers capable of annealing to target cDNA sequences and amplifying target cDNA molecules, either alone or together in combination with a generic primer (e.g., one target specific primer and one R2 primer; see, e.g., FIG. 2B), such that target cDNAs can become enriched during the multiplex amplification relative to non-target cDNAs within the whole transcriptome.
- when a primer is said to be “target-specific,” this can indicate that the primer is complementary to a sequence present in all target cDNAs (e.g., a common sequence such as a U6 primer sequence present in all gRNAs used in a CRISPR library screen) or that it is complementary to a sequence present in one target only (e.g., in a transcript expressed from a specific gene of interest).
- a “target-specific” preamplification primer is not configured to bind specifically to non-target cDNAs in the whole transcriptome.
- One aspect of the present disclosure relates to methods of labeling nucleic acids, e.g., labeling nucleic acids in a cell- or nucleus-specific manner, such that the nucleic acids within the cells or nuclei each comprise a cell-specific label (or nucleus-specific label when nuclei are used instead of cells).
- the methods comprise labeling nucleic acids in one or more cells or nuclei in order to prepare two (or more) sequencing libraries in parallel, with one sequencing library corresponding to the whole transcriptome (or a subset thereof), and one sequencing library corresponding to one or more target sequences (e.g., target genes, target transcripts, or derivatives thereof) of interest.
- both libraries comprise tagged cDNA molecules originating from the same plurality of cells, and because the tagged cDNA molecules in the two libraries collectively share the same cell- or nucleus-specific labels (i.e., each cDNA molecule has a combination of barcode sequences that can be used to identify the individual cell from which the cDNA originated), the present methods can be used to associate the presence or expression level of individual target sequences with the overall pattern of expression of the whole transcriptome in the individual cells.
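Because both libraries carry the same cell-specific barcode combinations, linking a target to the transcriptome of its cell amounts to a join on the barcode. The following is a minimal sketch under stated assumptions: reads from each library are already reduced to `(cell_barcode, feature)` pairs, and the function name, barcode strings, and target identifiers are hypothetical examples, not formats defined in the disclosure.

```python
from collections import Counter, defaultdict


def link_libraries(wt_reads, target_reads):
    """Join whole-transcriptome and target-enriched reads on the cell barcode.

    wt_reads:     iterable of (cell_barcode, gene) pairs
    target_reads: iterable of (cell_barcode, target_id) pairs
    Cells seen only in the target library are dropped in this sketch.
    """
    expression = defaultdict(Counter)
    for cell_barcode, gene in wt_reads:
        expression[cell_barcode][gene] += 1

    targets = defaultdict(set)
    for cell_barcode, target_id in target_reads:
        targets[cell_barcode].add(target_id)

    return {
        cell_barcode: {
            "targets": sorted(targets.get(cell_barcode, ())),
            "expression": dict(gene_counts),
        }
        for cell_barcode, gene_counts in expression.items()
    }


# Hypothetical example data.
wt_example = [
    ("A1-B7-C3", "GAPDH"),
    ("A1-B7-C3", "GAPDH"),
    ("A1-B7-C3", "CD3E"),
    ("A2-B1-C9", "GAPDH"),
]
target_example = [("A1-B7-C3", "gRNA_17"), ("A9-B9-C9", "gRNA_02")]
linked_example = link_libraries(wt_example, target_example)
```

Read counts are used here only for simplicity; a real analysis would typically count distinct molecules and apply quality filters before the join.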
- the target sequences can be any target of interest, e.g., any target whose particular identity, whose presence, or whose level of expression can vary between cells, and where the identity, presence, and/or expression level has the potential to affect, or be correlated with an effect on, the whole transcriptome or a subset thereof.
- the target may be a guide RNA (gRNA) used in the context of a CRISPR screen, where the method is used to determine the specific impact of each individual gRNA being evaluated in the screen on the overall expression in the cell.
- the target may be a potential biomarker or drug target, and the method is used to identify individual markers or transcripts whose alteration has an impact on the overall transcription in the cell, or on the transcription of particular subsets of genes involved, e.g., in a pathway, physiological process, or disease-related phenomenon or property of interest.
- the target may be a gene whose identity varies between cells, such as a T cell receptor (TCR), where the present methods allow the association of individual TCR clonotypes with the overall pattern of the whole transcriptome (or a subset thereof) in the same cells.
- the present disclosure provides a method of labeling nucleic acids for multiplex transcriptional analysis in a plurality of cells or nuclei, the method comprising providing a plurality of fixed and permeabilized cells or nuclei, each comprising a plurality of RNA molecules and one or more target genes or transcripts of interest; dividing the plurality of cells or nuclei into a first plurality of aliquots, wherein each aliquot comprises more than one cell or nucleus; generating cDNA molecules by reverse transcribing RNA molecules within the cells or nuclei of the first plurality of aliquots using reverse transcription (RT) primers each comprising: (i) a poly(T) sequence or a random sequence; and (ii) an aliquot-specific RT barcode sequence; pooling the cells or nuclei from the first plurality of aliquots; using one or more rounds of split-pool barcoding to cell-specifically (or nucleus-specifically) tag the cDNA molecules with one or more nucle
- one or more tagging steps such as those involving reverse transcription to generate cDNA molecules and the subsequent coupling of one or more nucleic acid tags to the cDNA molecules, may take place at the interior of the cells or nuclei, and one or more subsequent steps, such as template switching and preamplification/amplification steps, may be performed on tagged cDNA molecules isolated from the cells or nuclei following their lysis.
- the present methods involve labeling cDNA molecules with barcodes in a process called, alternatively, split-pool labeling, tagging, or barcoding, or combinatorial labeling, tagging, or barcoding.
- the split-pool barcoding step of the protocol may be repeated a number of times sufficient to generate a unique combination or series of labeling sequences for the cDNAs in each sequencing library such that all (or virtually all, or the great majority) of the cDNAs originating from a given cell (or nucleus) will have the same combination or series of labeling sequences (also referred to as barcode sequences or index sequences), and that the complexity of the combinations or series of labeling sequences is such that each combination or series is unique, or essentially unique, among all of the cells in the plurality of cells.
- the labeling could be repeated enough times to generate a sufficient number of distinct combinations or series of labeling sequences that each individual combination or series has, e.g., at least a 95%, 96%, 97%, 98%, 99%, or higher probability of being unique among all of the combinations or series in the cells of the plurality.
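As a rough guide to how many rounds reach a given uniqueness threshold, the probability that one cell's combination is shared by no other cell can be approximated as (1 - 1/C)^(N-1), where C is the number of possible combinations and N the number of cells. This formula assumes uniform, independent assignment and the same number of barcodes in every round; both are simplifying assumptions of this sketch, and the function names are hypothetical.

```python
def uniqueness_probability(n_cells, barcodes_per_round, rounds):
    """Probability that a given cell's barcode combination is unique.

    Assumes each of the other (n_cells - 1) cells independently receives one
    of barcodes_per_round ** rounds combinations uniformly at random.
    """
    n_combinations = barcodes_per_round ** rounds
    return (1.0 - 1.0 / n_combinations) ** (n_cells - 1)


def rounds_needed(n_cells, barcodes_per_round, target=0.95, max_rounds=10):
    """Smallest number of rounds whose uniqueness probability meets `target`."""
    for rounds in range(1, max_rounds + 1):
        if uniqueness_probability(n_cells, barcodes_per_round, rounds) >= target:
            return rounds
    raise ValueError("target probability not reached within max_rounds")
```

Under these assumptions, 10,000 cells with 96 barcodes per round reach a 95% threshold after three rounds, while one million cells need four.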
- the split-pool barcoding may be repeated a number of times such that the cDNAs in the first cell may have a first unique series of labeling sequences, the cDNAs in a second cell may have a second unique series of labeling sequences, the cDNAs in a third cell may have a third unique series of labeling sequences, and so on.
- the methods of the present disclosure may provide for the labeling of cDNA sequences from single cells with unique barcodes, wherein the unique barcodes may identify or aid in identifying the cell from which the cDNA originated.
- a portion, a majority, or substantially all of the cDNA from a single cell may have the same barcode, and that barcode may not be repeated in cDNA originating from one or more other cells in a sample (e.g., from a second cell, a third cell, a fourth cell, etc.).
- the barcodes used in the present methods can be added to the cDNA molecules at any of a number of steps, e.g., during reverse transcription, during one or more rounds of ligation-based tagging (e.g., appending a nucleic acid tag to a cDNA molecule), or during amplification (e.g., introduced via a PCR primer).
- the barcodes introduced at any of the steps can be any length, e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 nucleotides or longer.
- the barcodes are at least 8 nucleotides long. In some embodiments, the barcodes are 8 nucleotides long.
- the barcodes are greater than 8 nucleotides long.
- the length of any given barcode used in a given embodiment of the method (e.g., a barcode sequence in a first nucleic acid tag) can be independent of the length of a different barcode used in the embodiment (e.g., a barcode in a second nucleic acid tag).
- a RT primer barcode, a barcode in a first nucleic acid tag, a barcode in a second nucleic acid tag, and/or a barcode or index sequence in a PCR primer may all be the same length, whereas in some embodiments they may all be different lengths, while in other embodiments some barcodes may be the same length while some are of different lengths.
- barcoded (or tagged) cDNA molecules are mixed together and sequenced (e.g., using NGS), such that data can be gathered regarding RNA expression at the level of a single cell.
- cDNA molecules are barcoded (or tagged) together, but then sequenced independently. In either case, so long as the cDNA molecules originated from the same original cells (or nuclei) and were labeled together such that all of the cDNA molecules originating from a given original cell or nucleus comprise the same combination of barcodes, the sequencing data can be combined and analyzed together, even if certain cDNA molecules from the same individual cells are present in different sequencing libraries and/or sequenced independently. Certain embodiments of the methods of the present disclosure may be useful in assessing, analyzing, or studying the transcriptome (i.e., the different RNA species transcribed from the genome of a given cell) of one or more individual cells.
- an aliquot or group of cells can be separated into different reaction vessels or containers.
- Vessels or containers can also be referred to herein as receptacles, samples, and wells, and the terms vessel, container, receptacle, sample, and well may be used interchangeably herein.
- cells or nuclei may be separated into a number of different reaction vessels.
- the number of reaction vessels may include four 1.5 ml microcentrifuge tubes, a plurality of wells of a 96-well plate, or another suitable number and type of reaction vessels.
- the reaction vessels or containers include one or more 96-well plates (or, e.g., 6, 12, 24, 48, 384 well plates).
- cells or nuclei can be distributed into a plurality of aliquots, and polynucleotides within the cells or nuclei can be labeled with an aliquot-specific barcode (e.g., by reverse transcription of RNA within the cells or nuclei using primers comprising the barcode, or by appending a nucleic acid tag to polynucleotides within the cells or nuclei wherein the tags comprise aliquot-specific barcodes); the aliquots can then be repooled, washed, and separated again into a new plurality of aliquots, and a further set of barcodes can be added to the polynucleotides.
- cDNAs or other polynucleotides within each cell or nucleus may be bound to a unique combination or sequence of barcodes, or substantially unique combination or sequence of barcodes.
- all (or most, depending, e.g., on the efficiency of the tagging reactions in a given cell or nucleus) of the cDNA molecules or other polynucleotides within any individual cell or nucleus within a plurality of cells or nuclei will comprise the same combination of barcodes (or barcode sequences).
- the combination or sequence of barcodes can be used to identify, or help identify, the individual cell from which a given tagged cDNA molecule originated.
- cells or nuclei within each well or aliquot are tagged with a different barcode, i.e., all of the barcodes (or barcode sequences) used within the well or aliquot are the same, while the barcode sequences are different in each of the wells or aliquots.
- other barcoding strategies are possible as well, e.g., in which more than one barcode sequence is used within a given well or aliquot, or in which one or more barcode sequences are present in multiple wells or aliquots during the barcoding step.
- any barcoding protocol can be encompassed by the present disclosure so long as, during the protocol, the labeled molecules (e.g., RNA molecules, or cDNA molecules produced therefrom) within each cell or nucleus acquire a combination of barcodes that reflects the different wells or aliquots in which the cell or nucleus was present.
- the different labeling sequences can be introduced at one or more steps, including during reverse transcription (e.g., wherein each reverse transcription (RT) primer comprises a barcode), during one or more subsequent labeling steps (e.g., ligating, tagmentation, or otherwise coupling a nucleic acid tag comprising a barcode sequence to a cDNA), or during one or more amplification steps (e.g., using one or more primers that include a barcode or index sequence).
- RNA is labeled within cells or nuclei by generating cDNA through reverse transcription (RT) using well-specific barcode-containing primers; additional well-specific barcodes are subsequently ligated to the cDNA molecules in one or more rounds of split-pool tagging; and finally, further barcodes (or indexes) are added to the cDNA molecules during amplification using well- or sample-specific barcoded primers (e.g., unique dual indexes or UDIs).
- the number of possible barcode combinations can vary by, e.g., increasing or decreasing the number of wells used for reverse transcription, for ligation-based tagging, and/or for indexing during amplification, and/or by changing the number of total barcoding steps, e.g., by varying the number of rounds of split-pool tagging or by omitting barcodes in one or more steps (e.g., by performing RT and/or amplification using non-barcoded primers, or by omitting the ligation tagging steps and/or amplification indexing steps altogether).
- steps of the present methods in which a nucleic acid tag is appended or coupled to a cDNA or other polynucleotide within a cell or nucleus may be repeated one or more times, e.g., 1, 2, 3, 4, 5 times, or 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 20, 25, 30, 35, 40, 45, 50, 100, or more times.
- the steps are repeated a sufficient number of times such that the cDNAs of each cell or nucleus would be likely to be bound to a unique barcode (e.g., unique among the plurality of cells or nuclei, or among multiple pluralities of cells or nuclei, e.g., in situations where multiple pluralities may be sequenced together).
- the number of times may be selected to provide a greater than 50% likelihood, greater than 90% likelihood, greater than 95% likelihood, greater than 99% likelihood, or some other probability that the cDNAs in each cell are bound to a unique barcode.
- the number of total possible barcode combinations in the population will be a function of the number of barcode tagging rounds that are performed, and on the number of different barcodes/aliquots included in each round.
- the total number of possible barcode combinations can be achieved in any of a number of ways.
- the number of total possible barcode combinations is greater than the number of different cells in the population, e.g., such that the probability that a given combination of barcodes is unique among all of the cells of the plurality is, e.g., 95%, 96%, 97%, 98%, 99%, or higher.
- the present methods could include, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 25, 30, 35, 40, 45, 50, or more individual rounds of barcoding.
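The relationship between the number of rounds, the number of barcodes per round, and the likelihood of uniqueness can be illustrated with a short calculation. The sketch below assumes that each cell is assigned to a well uniformly and independently in every round, which is a simplifying assumption and not a statement from the disclosure; the function names and the 48 x 96 x 96 layout are illustrative only.

```python
import math


def total_combinations(barcodes_per_round):
    """Number of possible barcode combinations: the product over all rounds."""
    return math.prod(barcodes_per_round)


def prob_cell_unique(n_cells, n_combinations):
    """Probability that a given cell shares its combination with no other cell.

    Assumes every cell draws one of `n_combinations` uniformly and
    independently, so each of the other (n_cells - 1) cells misses the
    given combination with probability (1 - 1 / n_combinations).
    """
    return (1 - 1 / n_combinations) ** (n_cells - 1)


# Illustrative layout: 48 wells in the first round, 96 in each of two further rounds.
combos = total_combinations([48, 96, 96])
p_unique = prob_cell_unique(10_000, combos)
```

Under this model, adding a round multiplies the number of combinations by the number of barcodes in that round, so the uniqueness probability can be raised either by adding rounds or by using more wells per round.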
- the present methods can be used to specifically label molecules, e.g., nucleic acids, in any of a wide variety of cell types.
- Cells suitable for use in the present methods include, e.g., primary cells, cell lines (e.g., HEK293, HEK293T, HEK293F, NIH3T3, Jurkat cells, or others), cells isolated from an organism, organoid, or a tissue, isolated blood cells, and others.
- the cells are healthy cells (e.g., wild-type or control cells).
- the cells are disease cells (e.g., cancer cells, infected cells).
- the cells are stem cells.
- the plurality of cells may be eukaryotic cells, vertebrate cells, mammalian cells, human cells, mouse cells, insect cells, plant cells, fungal cells, yeast cells, or bacterial cells.
- the cells comprise one or more cell types such as blood cells (e.g., peripheral blood mononuclear cells or PBMCs, immune cells such as T cells, B cells, NK cells), brain cells, liver cells, gut cells, bone marrow cells, pancreatic cells, epithelial cells, endothelial cells, neuronal cells, fibroblast cells, bone cells, muscle cells, skin cells, fat cells, lymphocytes, myeloid cells, macrophages, stem cells, and others.
- the cells are all from a single source, i.e., from a single individual, organism, or tissue.
- the cells are from multiple sources, i.e., from multiple individuals, organisms, or tissues.
- the cells are autologous cells. In some embodiments, the cells are allogeneic cells. In some embodiments, the cells all or primarily comprise a single cell type (e.g., from a cell line, or a specific cell type isolated from a primary sample). In some embodiments, the cells comprise a mixture of different cell types. In some embodiments, the cells are adherent cells. In some embodiments, the cells are suspension cells.
- the cells have been previously frozen.
- the cells have been previously fixed and frozen, e.g., the methods are performed using multiple samples that have been fixed and/or frozen at different times.
- the cells have been previously fixed, permeabilized, and frozen.
- nuclei can be prepared, e.g., using standard methods such as by douncing. In some embodiments, nuclei are prepared from frozen cells, tissue samples, or tissue slices or sections.
- Nuclei can be prepared, e.g., by placing the frozen sample (cells, tissue, minced tissue sample, slice, section, etc.) into a cooled nuclei isolation (NIM) buffer solution (e.g., NIM1 or NIM2 buffer), transferring it to a dounce, and homogenizing, e.g., using a pestle (e.g., 10 strokes each with a loose and with a tight pestle). The homogenate can then be filtered (e.g., using a 40 µm or 70 µm filter) and transferred to, e.g., conical tubes.
- the tube can then be centrifuged, e.g., at 200 x g or 500 x g in a pre-cooled swinging bucket centrifuge for, e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, or more minutes at a low temperature, e.g., 4 °C, or at 1 °C, 2 °C, 3 °C, 4 °C, 5 °C, 6 °C, 7 °C, or 8 °C, or at a temperature less than 8 °C, 7 °C, 6 °C, 5 °C, 4 °C, 3 °C, 2 °C, or 1 °C.
- the cells are counted before and/or after centrifugation, e.g., to ensure an appropriate number of cells in each aliquot or tube.
- the pellets can then be resuspended in an appropriate solution or buffer, e.g., a nuclei buffer containing BSA (e.g., 0.75% BSA), and subsequently fixed and stored, e.g., at -80 °C.
- the present methods allow the labeling and multiplex analysis of cells at a broad range of scales. For example, in some embodiments, the methods are used to label up to 10,000 cells. In some embodiments, the methods are used to label up to 100,000 cells or nuclei. In some embodiments, the methods are used to label up to 1,000,000 cells or nuclei (see, e.g., FIGs. 3A-3D).
- FIGS. 3A-3D provide an overview of in situ cell barcoding steps (i.e., combinatorial barcoding, or split-pool labeling) according to embodiments of the present methods.
- FIG. 3A illustrates Round 1 Barcoding. Fixed and permeabilized cells were loaded into multiple wells of a Round 1 plate.
- the image shows cells distributed into 48 wells of the plate, e.g., for labeling up to 100,000 cells.
- different numbers of wells are used, e.g., labeling up to 10,000 cells distributed into 8 wells of the plate, or labeling of up to 1,000,000 cells distributed into all 96 wells.
- some or all of the different wells e.g., the 48 wells shown in FIGs. 3A-3C
- the wells contain cells originating from the same samples or experimental conditions.
- RNA was reverse transcribed via oligo dT and random hexamer primers with a well-specific barcode that are associated with specific samples.
- FIG. 3B illustrates Round 2 Barcoding.
- FIG. 3C illustrates Round 3 Barcoding. The cells were pooled and loaded into the Round 3 Plate. A third barcode was ligated to the cDNA, which also contains an Illumina R2 sequence, and biotin.
- FIG. 3D illustrates Lysis and Sublibrary Generation. Cells were split into multiple sublibraries (or “samples”) and lysed.
- the number of sublibraries can vary according to the number of cells used, e.g., 8 sublibraries with up to 100,000 cells, or 2 sublibraries with up to 10,000 cells, or 16 sublibraries with up to 1,000,000 cells.
- the numbers of cells labeled and analyzed are at least 10,000, 20,000, 30,000, 40,000, 50,000, 60,000, 70,000, 80,000, 90,000, 100,000, 200,000, 300,000, 400,000, 500,000, 600,000, 700,000, 800,000, 900,000, 1,000,000, 1,500,000, 2,000,000, 3,000,000, or more.
- the cells have been genetically modified, e.g., modified using CRISPR to introduce a transgene, delete a gene, locus, or genomic region, or modify the sequence of a gene, locus, or genomic region, e.g., through inducing indels (small insertions or deletions) at the gRNA target site.
- the cells have been modified to modulate the expression of one or more endogenous genes, e.g., using CRISPRi, CRISPRa, RNAi, etc.
- the modified cells express transgenes, e.g., coding sequences encoding polypeptides with potential activity of interest, e.g., candidate drug targets or biomarkers.
- the present methods are used in assays for, e.g., validating biomarkers and/or drug targets.
- the modified cells comprise candidate regulatory sequences and encode reporter genes, e.g., in massively parallel reporter assays (MPRAs).
- the present methods are used for lineage tracing.
- the present methods are used in screens, e.g., CRISPR screens, where the methods can be used both to identify the specific guide RNA (gRNA) expressed in a given cell and assess the effects of the gRNA expression on global transcriptome in the same cell.
- the present methods are used to generate a first sequencing library for a population of cells having undergone CRISPR screening that is enriched for the gRNAs, and a second sequencing library directed to the whole transcriptome (or a subset of transcripts of interest).
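One way to picture how a gRNA-enriched library and a transcriptome library are linked is as a join on the shared barcode combination. The following is a minimal sketch under that reading; the function name, the data layout, and the `min_reads` threshold are assumptions introduced for illustration and are not specified in the disclosure.

```python
from collections import Counter, defaultdict


def assign_grnas(grna_reads, transcriptome_cells, min_reads=2):
    """Assign gRNA identities to cells seen in the transcriptome library.

    `grna_reads` is a list of (barcode_combination, grna_name) tuples from
    the gRNA-enriched library. `transcriptome_cells` is an iterable of
    barcode combinations observed in the transcriptome library.
    A gRNA is assigned to a cell only if it is supported by at least
    `min_reads` reads (a hypothetical noise threshold).
    """
    counts = defaultdict(Counter)
    for combo, grna in grna_reads:
        counts[combo][grna] += 1
    return {
        combo: {g for g, n in counts[combo].items() if n >= min_reads}
        for combo in transcriptome_cells
    }


# Hypothetical toy data.
cell_a = ("AAAC", "GGTA", "TTCA")
cell_b = ("CCGT", "GGTA", "ACGA")
cell_c = ("TTAG", "CATC", "GGAT")
grna_reads = [(cell_a, "gRNA_1"), (cell_a, "gRNA_1"), (cell_a, "gRNA_2"), (cell_b, "gRNA_3")]

assignments = assign_grnas(grna_reads, [cell_a, cell_b, cell_c])
```

Cells with an empty set either received no vector or had too few gRNA reads to call, and a real analysis would need to decide how to treat them.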
- each of the cells within the population comprises an RNA-guided nuclease (e.g., Cas9 or Cpf1), and the transgenes comprise guide RNAs.
- the detection of a given guide RNA points to a particular genomic modification as directed by the RNA-guided nuclease at a particular locus specified by the guide RNA.
- modifications could comprise, e.g., deletions, insertions, nucleotide alterations, transcriptional activation (e.g., CRISPRa) or inhibition (e.g., CRISPRi), or other changes.
- the RNA-guided nuclease (e.g., Cas9) can be introduced, e.g., by stably integrating a transgene encoding the nuclease into the genome of the cells, by introducing a plasmid or vector (e.g., a viral vector) comprising an expression cassette encoding the nuclease into the cells, by introducing mRNA encoding the nuclease into the cells, or by introducing the nuclease polypeptides into the cells.
- the transgenes drive the expression of transcripts that can potentially alter the activity of one or more endogenous genes, e.g., antisense, miRNA, shRNA, or siRNA sequences.
- Expression of the transgenes can be driven using any suitable promoter, e.g., a constitutive or an inducible promoter suitable for use in the cells.
- the promoter is a constitutive promoter, e.g. a U6 promoter such as human U6 (hU6), or the EF-1 alpha promoter (EF-1 A).
- transcription of transgenes comprising small non-coding RNAs is driven by a promoter for a type III RNA polymerase, e.g., a U6, 7SK, or H1 promoter.
- the transcription of transgenes comprising protein coding sequences is driven by a promoter for a type II RNA polymerase, e.g., a CMV, EF1a, CAG, or PGK promoter.
- the transgenes have been introduced into the cells using a vector e.g., a viral vector such as a lentiviral vector, adeno associated viral (AAV) vector, or adenoviral vector.
- the vector is configured such that a transcript (e.g., gRNA) is expressed under the control of more than one promoter, e.g., an RNA Pol III promoter such as hU6, as well as an RNA Pol II promoter such as EF1a, such that the gRNA is produced in both a non-polyadenylated form (from the Pol III promoter) and a polyadenylated form (expressed from the Pol II promoter).
- the non-polyadenylated gRNA can bind to the RNA-guided nuclease (e.g., Cas9) to effect a genomic editing event (e.g., indel) at a target locus defined by the gRNA, and the polyadenylated gRNA can be detected for single-cell RNA sequencing, e.g., using a poly(T) reverse transcription primer according to the present methods.
- the gRNA sequence is present in the vector (and/or the genome after integration) within an expression cassette, i.e., a construct comprising an expressed sequence (such as a gRNA), operably linked to a promoter (e.g., hU6).
- each vector comprises a gRNA-specific barcode sequence (or a barcode specific to a different, non-gRNA polynucleotide), such that the specific gRNA (or other polynucleotide) expressed from the vector can be determined indirectly by sequencing the barcode.
- the vector is, or is derived from, a vector disclosed in Datlinger et al. (2017) Nature Methods Vol. 14(3); Jaitin et al. (2016) Cell, 167.7: 1883-1896; Xie et al., (2017) Mol. Cell. 66(2):285-299; Dixit et al. (2016) 167(7): 1853-1866; all of which are herein incorporated by reference in their entireties.
- a CRISPR screen comprising 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, 1900, 2000 or more gRNAs can be analyzed using the present methods.
- the set of gRNAs (or vectors, e.g., lentiviral vectors) comprising the gRNA sequences can target any number of endogenous genes. For example, each gene could be targeted by 1, 2, 3, 4, 5, or more different gRNAs used in the screen.
- the targets e.g., gRNA, e.g., vectors such as lentiviral vectors comprising a gRNA sequence
- gRNA multiplicity of infection
- a higher m.o.i. is used, such that cells (or nuclei) in the library are likely to have more than one gRNA vector.
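The effect of the multiplicity of infection on the number of gRNA vectors per cell is commonly approximated with a Poisson model, in which the m.o.i. is the mean number of vectors per cell. The disclosure does not state this model; it is used here only as a standard simplifying assumption, and the function name is hypothetical.

```python
import math


def prob_at_least(k, moi):
    """P(a cell carries at least k vectors), Poisson model with mean `moi`."""
    p_fewer = sum(math.exp(-moi) * moi**i / math.factorial(i) for i in range(k))
    return 1 - p_fewer


# Fraction of cells expected to carry two or more vectors at a low and a higher m.o.i.
low_moi_multi = prob_at_least(2, 0.3)
high_moi_multi = prob_at_least(2, 3.0)
```

Under this assumption, only a few percent of cells carry more than one vector at an m.o.i. of 0.3, whereas most cells do at an m.o.i. of 3.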
- the cells (or nuclei) are fixed prior to barcoding such that the components are immobilized or held in place.
- fixation is performed on single cell (or nuclei) suspensions.
- the cells (or nuclei) were previously frozen and are thawed before fixation.
- the cells (or nuclei) are used directly, e.g., from culture or after isolation from a biological source, without freezing.
- the cells (or nuclei) are assessed for quality before fixation.
- Cells (or nuclei) can be counted (e.g., using a hemocytometer) and/or otherwise assessed in any of a number of standard ways, e.g., by staining with trypan blue, acridine orange, and/or propidium iodide.
- the cell (or nuclei) suspensions show 70% or more viability, and/or the suspensions show 5% or less aggregation/debris.
- suspensions with at least about 100,000 cells or nuclei are used.
- adherent cells are first dissociated using TrypLE Express Enzyme (1X), phenol red (Thermo Fisher Scientific).
- the present methods comprise providing a plurality of fixed and permeabilized cells. In some embodiments, the present methods comprise fixing and permeabilizing the plurality of cells or nuclei prior to, e.g., generating cDNA in the cells or nuclei by reverse transcribing RNA in the cells or nuclei. In some embodiments, the cells or nuclei may be fixed and permeabilized and frozen, e.g., at -80 °C, prior to, e.g., generating cDNA. In some embodiments, the cells or nuclei are fixed and permeabilized and then directly used in the present methods, i.e., without freezing and storing them.
- the plurality of cells (or nuclei) may be fixed using any of a number of suitable reagents or conditions.
- the cells (or nuclei) can be fixed in formaldehyde in phosphate buffered saline (PBS) (e.g., in about 1-4% formaldehyde in PBS).
- the plurality of cells (or nuclei) may be fixed using methanol (e.g., 100% methanol) at about -20° C. or at about 25° C.
- the plurality of cells (or nuclei) may be fixed using methanol (e.g., 100% methanol), at between about -20° C. and about 25° C.
- the plurality of cells (or nuclei) may be fixed using ethanol (e.g., about 70-100% ethanol) at about -20° C. or at room temperature. In yet various other embodiments, the plurality of cells (or nuclei) may be fixed using ethanol (e.g., about 70-100% ethanol) at between about -20° C. and room temperature. In still various other embodiments, the plurality of cells (or nuclei) may be fixed using acetic acid, for example, at about -20° C. In still various other embodiments, the plurality of cells (or nuclei) may be fixed using acetone, for example, at about -20° C. Other suitable methods of fixing the plurality of cells (or nuclei) are also within the scope of this disclosure.
- RNases are inactivated or eliminated before, during, and/or after fixation and/or permeabilization using, e.g., RNase decontamination products such as RNaseZap RNAse Decontamination Solution (Thermo Fisher).
- BSA (e.g., 5-10%, or 7.5%) is added to the cells or nuclei, e.g., to prevent aggregation.
- the methods may include fixing and/or permeabilizing the cells or nuclei at a temperature below about 8 °C, below about 7 °C, below about 6 °C, below about 5 °C, below about 4 °C, below about 3 °C, below about 2 °C, below about 1 °C, below about 0 °C, below about -5 °C, below about -10 °C, at about 8 °C, at about 7 °C, at about 6 °C, at about 5 °C, at about 4 °C, at about 3 °C, at about 2 °C, at about 1 °C, or at another suitable temperature.
- the cells or nuclei are fixed and/or permeabilized at a temperature of below about 8, 7, 6, 5, 4, 3, 2, 1, 0, -1, -2, -3, or -4 °C, between about -4 to 8, -4 to 0, 0 to 4, 4 to 8, or 0 to 8 °C, or at about 8, 7, 6, 5, 4, 3, 2, 1, 0, -1, -2, -3, or -4 °C.
- the cells or nuclei are fixed and/or permeabilized on ice.
- the cells are adherent cells (i.e., cells that are adhered to a plate, e.g., adherent mammalian cells).
- adherent cells are fixed, permeabilized, and/or undergo reverse transcription, followed by trypsinization to detach the cells from a surface.
- the adherent cells may be detached prior to the separation and/or tagging steps.
- the adherent cells may be trypsinized prior to the fixing and/or permeabilizing steps.
- Permeabilization of the cells (or nuclei) can be achieved in any of a number of ways.
- a detergent or surfactant such as TRITON™ X-100 may be added to the plurality of cells (or nuclei), followed by the optional addition of HCl.
- about 0.2% TRITON™ X-100 is added to the plurality of cells (or nuclei), followed by the addition of about 0.1 N HCl.
- the plurality of cells (or nuclei) is permeabilized using ethanol (e.g., about 70% ethanol), methanol (e.g., about 100% methanol), Tween 20 (e.g., about 0.2% Tween 20), and/or NP-40 (e.g., about 0.1% NP-40).
- reverse transcription is conducted or performed on the plurality of cells or nuclei.
- reverse transcription may be conducted on a fixed and/or permeabilized plurality of T cells (or nuclei).
- variants of M-MuLV reverse transcriptase may be used in the reverse transcription.
- any suitable method of reverse transcription is within the scope of this disclosure.
- the reverse transcription primers may be configured to reverse transcribe all, or substantially all, RNA in a cell (e.g., a random hexamer with a 5' overhang).
- the reverse transcription primers may be configured to reverse transcribe RNA having a poly(A) tail (e.g., a poly(dT) primer, such as a dT(15) primer or anchored dT(15) primer, with a 5' overhang).
- the reverse transcription primers are configured to reverse transcribe both all, or substantially all, RNA in a cell, as well as to reverse transcribe polyadenylated RNA.
- reverse transcription primers may be included that are configured to reverse transcribe predetermined RNAs (e.g., target RNAs).
- a portion of a reverse transcription (RT) primer that is configured to bind to RNA and/or initiate reverse transcription may comprise one or more of the following: a random hexamer, random septamer, an octamer, a nonamer, a decamer, or a poly(T) (or polydT) stretch of nucleotides (e.g., comprising 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 or more consecutive thymine bases).
- the poly(T) sequence is anchored, i.e., comprises a base other than T at the 3’ end (e.g., a mixture of anchored primers is used each comprising an A, C, or G at the 3’ end of the poly(T) sequence).
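A mixture of anchored poly(T) primers can be pictured as the same primer body ending in each of the three non-T bases. The sketch below builds such a mixture; the function name, the placeholder overhang and barcode sequences, and the 5' to 3' order of overhang, barcode, and poly(T) stretch are assumptions made for illustration, since the disclosure only requires the overhang to lie 5' of the poly(T) sequence.

```python
def anchored_poly_t_primers(overhang, barcode, n_t=15):
    """Return one primer per 3' anchor base (A, C, or G), written 5' to 3'.

    Assumed layout (hypothetical): 5' overhang, then barcode, then a
    stretch of `n_t` thymines, then a single non-T anchor base.
    """
    return [overhang + barcode + "T" * n_t + anchor for anchor in "ACG"]


# Placeholder sequences, not taken from the disclosure.
primers = anchored_poly_t_primers(overhang="AGGCCAGAGC", barcode="AACGTGAT")
```

Because the final base cannot be T, each primer can only pair fully at the junction between the poly(A) tail and the transcript body, rather than anywhere along the tail.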
- RT primers are used that comprise well-specific barcodes, as described in more detail elsewhere herein.
- primers comprising a poly(T) sequence are used in the reverse transcription in the absence of primers with random sequences (such as random hexamers). In some embodiments, primers comprising random sequences (e.g., random hexamers) are used in the absence of primers with poly(T) sequences. In some embodiments, the reverse transcription is performed with both primers comprising a random sequence (e.g., random hexamer) and primers comprising a poly(T) sequence (e.g., with 15 consecutive thymidine residues). In some embodiments, RT primers are used that are specific for one or more target genes or transcripts.
- specific primers can be used to enrich for specific transcripts among the population of cells, e.g., genes or sequences (such as gRNAs) or a set of transcripts of interest (e.g., to specifically examine genes relating to a particular physiological or cellular pathway or process).
- specific primers can be used to enrich for genes relating to a process connected to the genes targeted by a set of gRNAs being assessed in a CRISPR screen.
- target genes of the gRNAs (e.g., genes whose expression is expected to be altered following the expression of individual gRNAs) can be enriched, so as to provide a control for the activity of an expressed gRNA.
- transcripts corresponding to genes encoding other targets can be used, e.g., to T cell receptors (TCRs), e.g., to one or more of TCR alpha, beta, gamma, or delta chains.
- each of the RT primers comprises a 5’ overhang comprising a 5’ overhang sequence located 5’ of the poly(T) or the random nucleotide sequence, wherein the 5’ overhang sequence is the same in all of the RT primers used in the first plurality of aliquots, and wherein following reverse transcription of the RNA, the 5’ overhang sequence is present at the 5’ end of each of the cDNA molecules.
- the RT primers can be used at any of a range of suitable concentrations.
- the concentration of each RT primer e.g., a polydT primer, a random hexamer primer, or a target-specific primer
- the concentrations are each between about 1 µM and about 7 µM, between about 1.5 µM and about 4 µM, between about 2 µM and about 3 µM, about 2.5 µM, or another suitable concentration.
- each of the RT primers comprises a barcode (i.e., a specific barcode sequence) (i.e., an “RT primer barcode”).
- the reverse transcription is performed on a population of cells (or nuclei) distributed in a plurality of aliquots or wells, and the RT primer barcodes are aliquot- or well-specific.
- all of the RT primer barcodes used in a given aliquot or well are the same, and a different RT primer barcode is used in each of the aliquots or wells.
- a first barcode may be added to the cDNA molecules in a first specific container, mixture, reaction, receptacle, sample, well, or vessel (e.g., specific to the given container, mixture, reaction, receptacle, sample, well, or vessel), and a second barcode sequence may be added to the cDNA molecules in a second container, mixture, reaction, receptacle, sample, well, or vessel (and the same for a third, fourth, etc. barcode sequence and third, fourth, etc. container, mixture, reaction, receptacle, sample, well, or vessel).
- 48 sets of different well-specific RT primers are used (e.g., in a 48-well plate).
- each sample can get a unique well-specific barcode.
- each sample could have 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, or more different sets of well-specific RT primers.
- a user can know which barcodes correspond to each sample, so the user can recover sample identities.
- Other numbers of specific barcodes (or well-specific RT primers) are also within the scope of this disclosure.
- Such a configuration may allow or provide for the multiplexing of the method.
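Recovering sample identities from well-specific RT barcodes amounts to a lookup from barcode to well to sample. The sketch below shows one way this might be done; the function name, well labels, barcode sequences, and sample names are hypothetical.

```python
def build_sample_lookup(well_barcodes, sample_wells):
    """Map each RT primer barcode to the sample loaded in its well.

    `well_barcodes` maps a well label to its well-specific barcode.
    `sample_wells` maps a sample name to the list of wells it occupies,
    so a sample may be represented by more than one barcode.
    """
    lookup = {}
    for sample, wells in sample_wells.items():
        for well in wells:
            lookup[well_barcodes[well]] = sample
    return lookup


# Hypothetical plate layout: two samples, two wells each.
well_barcodes = {"A1": "AACGTGAT", "A2": "AAACATCG", "A3": "ATGCCTAA", "A4": "AGTGGTCA"}
sample_wells = {"treated": ["A1", "A2"], "control": ["A3", "A4"]}

barcode_to_sample = build_sample_lookup(well_barcodes, sample_wells)
sample_of_read = barcode_to_sample["ATGCCTAA"]
```

Because the lookup is keyed on the RT barcode alone, sample identity survives the later pooling and splitting steps.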
- any distribution of barcode sequences that can provide some information about the aliquot or well in which a given cell was present can be used. For example, even if more than one barcode sequence is used in one or more aliquots or wells, or if a given barcode sequence is used in more than one aliquot or well, the methods are encompassed by the present disclosure.
- the barcode sequences present within the RT primers can have any of a range of lengths, e.g., 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 nucleotides or longer.
- the RT primer barcode sequences are 8 nucleotides in length. By varying 8 nucleotides, there are 65,536 possible unique sequences.
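The figure of 65,536 follows from four possible bases at each of eight positions. A short sketch of the arithmetic, with a hypothetical helper name:

```python
def n_barcode_sequences(length, alphabet_size=4):
    """Number of distinct sequences of `length` nucleotides (A, C, G, T)."""
    return alphabet_size ** length


n_for_8_nt = n_barcode_sequences(8)
```

In practice only a subset of these sequences would typically be used, since far fewer barcodes than this are needed to label, e.g., the 96 wells of a plate.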
- the RT primer barcodes comprise more than 8 nucleotides. In some other embodiments, the RT primers comprise fewer than 8 nucleotides.
- the plurality of cells can be divided prior to reverse transcription into any of a number of aliquots or wells, and using any of a number of suitable reaction vessels or containers.
- the plurality of cells or nuclei can be distributed into individual tubes or containers, or into a plurality of wells in a multi-well plate. Any multi-well plate can be used, e.g., 4, 6, 8, 12, 24, 48, 96, 384, or 1536 well plates.
- the plurality of cells or nuclei are distributed into one or more 96-well plates.
- all 96 wells of a 96-well plate are used (i.e., the plurality of cells or nuclei is divided into 96 aliquots).
- a fraction of the wells on the plate are used, e.g., the cells or nuclei are distributed into, e.g., 8, 16, 24, 32, 40, 48, 56, 64, 72, 80, 88, 12, 24, 36, 48, 60, 72, or 84 wells of a 96-well plate.
- up to 12 wells are used for up to 10,000 cells or nuclei
- up to 48 wells are used for up to 100,000 cells or nuclei
- up to 96 wells are used for up to 1,000,000 cells or nuclei.
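The three scales listed above can be expressed as a simple lookup from cell number to the maximum number of wells. This is only a sketch of those example tiers; the function name is hypothetical, and the behavior above 1,000,000 cells is an assumption because the listed tiers stop there.

```python
def max_wells_for_cells(n_cells):
    """Return the maximum number of wells for the listed cell-number tiers."""
    tiers = [(10_000, 12), (100_000, 48), (1_000_000, 96)]
    for max_cells, wells in tiers:
        if n_cells <= max_cells:
            return wells
    # Assumption: larger inputs are outside the listed tiers.
    raise ValueError("more than 1,000,000 cells: outside the listed tiers")


wells = max_wells_for_cells(50_000)
```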
- a plurality of cells or nuclei is subjected to combinatorial barcoding, such that molecules (e.g., cDNAs synthesized in cells by reverse transcription) are labeled with barcodes that when viewed in combination provide a cell- or nucleus-specific label for the labeled molecules.
- a plurality of cells is split into multiple aliquots or wells, labeled with well-specific barcodes (e.g., during reverse transcription using barcoded primers and/or by ligating barcoded nucleic acid tags), and repooled.
- This cycle of dividing the plurality of cells, well-specific labeling, and repooling can be repeated any number of times, with each round adding more tags to the cDNAs and thereby creating a set of nucleic acid tags that together can act as, e.g., a cell-specific (or nucleus-specific) barcode.
- As more and more rounds are added, the number of paths that a cell can take increases and consequently the number of possible barcodes that can be created also increases. Given enough rounds and divisions, the number of possible barcodes will be much higher than the number of cells, resulting in a high likelihood that each cell (or nucleus) in the population has a unique barcode.
- enough rounds of labeling are performed such that the number of possible barcodes is 50%, 60%, 70%, 80%, 90%, 100%, 110%, 120%, 130%, 140%, 150%, 200%, 300%, 400%, 500%, 600%, 700%, 800%, 900%, 1000%, or 1x, 2x, 3x, 4x, 5x, 6x, 7x, 8x, 9x, 10x, 20x, 30x, 40x, 50x, 60x, 70x, 80x, 90x, 100x, or more of the number of cells (or nuclei) in the plurality of cells (or nuclei).
- enough rounds of labeling are performed such that the likelihood that the tagged molecules within a given cell or nucleus have a unique barcode combination among the plurality of cells is about 95%, 96%, 97%, 98%, 99%, 99.1%, 99.2%, 99.3%, 99.4%, 99.5%, 99.6%, 99.7%, 99.8%, 99.9%, or greater.
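The split, label, and repool cycle can also be explored by simulation. The sketch below follows each cell through a series of rounds, records the well it lands in at each round as its barcode combination, and reports the fraction of cells whose combination is unique. It assumes uniform, independent random assignment to wells, which is a simplification; the function name and parameters are hypothetical.

```python
import random
from collections import Counter


def simulate_split_pool(n_cells, wells_per_round, seed=0):
    """Simulate split-pool barcoding and return the fraction of uniquely labeled cells.

    Each cell is placed in a random well in every round; the tuple of well
    indices is its barcode combination.
    """
    rng = random.Random(seed)
    combos = [
        tuple(rng.randrange(wells) for wells in wells_per_round)
        for _ in range(n_cells)
    ]
    counts = Counter(combos)
    n_unique = sum(1 for combo in combos if counts[combo] == 1)
    return n_unique / n_cells


one_round = simulate_split_pool(1_000, [8])
three_rounds = simulate_split_pool(1_000, [96, 96, 96])
```

With a single round of 8 wells, 1,000 cells cannot be told apart, while three rounds of 96 wells leave nearly every cell with its own combination.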
- the plurality of cells or nuclei can be divided into any of a number of aliquots or wells, and using any of a number of suitable reaction vessels or containers.
- the plurality of cells or nuclei can be distributed into individual tubes or containers, or into a plurality of wells in a multi-well plate. Any multi-well plate can be used, e.g., 4, 6, 8, 12, 24, 48, 96, 384, or 1536 well plates.
- the plurality of cells or nuclei are distributed into one or more 96-well plates.
- all 96 wells of a 96-well plate are used (i.e., the plurality of cells or nuclei is divided into 96 aliquots).
- a fraction of the wells on the plate are used, e.g., the cells or nuclei are distributed into, e.g., 8, 16, 24, 32, 40, 48, 56, 64, 72, 80, 88, 12, 24, 36, 48, 60, 72, or 84 wells of a 96-well plate.
- 96 distinct barcode sequences are present among the nucleic acid tags used in the plurality of aliquots.
- the plurality of aliquots comprises 96 aliquots distributed in a 96-well plate.
- each of the 96 distinct barcode sequences is present in only one of the 96 aliquots.
- FIGS. 3A-3D illustrate an exemplary embodiment of labelling or tagging nucleic tags.
- Poly T and random hexamer primers anneal to mRNA within single cells.
- Each primer contains a barcode and optionally a 5’ overhang comprising a 5’ overhang sequence.
- Reverse transcriptase extends cDNA to form a cDNA/mRNA hybrid comprising the barcode (BC1), for example, a well-specific barcode.
- the cells may be pooled and split and redistributed into individual wells.
- nucleic acid tags can be appended to the cDNA/mRNA hybrid via the 5’ overhang.
- the nucleic acid tags can contain a second barcode (e.g., BC2) and optionally a second DNA linker.
- the cells can be pooled and split and redistributed into individual wells.
- a second nucleic acid tag can be ligated to the growing cDNA.
- the adaptor contains a third barcode, an Illumina adaptor sequence (e.g., R2) or a compatible equivalent thereof, and a biotin molecule.
- combinatorial barcoding is accomplished by coupling the cDNA molecules generated in each cell during reverse transcription (or DNA fragments or adapters appended to other molecules as described elsewhere herein) with a nucleic acid tag, wherein each nucleic acid tag comprises a barcode sequence, e.g., a well-specific barcode sequence or tag barcode.
- coupling the cDNA molecule with the nucleic tag comprises ligating the nucleic tag to the cDNA molecule.
- each nucleic acid tag comprises a first strand comprising the barcode sequence, and further comprises a 3’ and/or 5’ region located 3’ and/or 5’ of the barcode.
- the first strand comprises a 3' hybridization sequence extending from a 3' end of a labeling (i.e., barcode) sequence and/or a 5' hybridization sequence extending from a 5' end of the labeling (i.e., barcode) sequence.
- Each nucleic acid tag may also comprise a second (linker) strand including an overhang sequence.
- the overhang sequence may include (i) a first portion complementary to a 5' hybridization sequence of a different nucleic acid tag (e.g., a nucleic acid tag appended in a previous round of tagging) or to a 5' overhang sequence of an RT primer.
- the second (linker) strand may also comprise a sequence complementary to the 3' hybridization sequence of the first strand.
- the first and second strands are preannealed before being added to the wells or aliquots containing the cells or nuclei.
- the nucleic acid tags comprise i) a tag barcode sequence, and ii) a 3’ hybridization sequence located 3’ of the barcode sequence and/or a 5’ hybridization sequence located 5’ of the barcode sequence, wherein multiple distinct tag barcode sequences are present among the nucleic acid tags used in the second plurality of aliquots, and wherein the tag barcode sequences present in each individual aliquot of the second plurality of aliquots are specific to the individual aliquot.
- the 3’ end of the nucleic acid tag is present within the 3’ hybridization sequence, and the 3’ end of the nucleic acid tag is brought into proximity of the 5’ end of the cDNA molecule by being preannealed to a linker nucleic acid strand that is complementary to the 3’ hybridization sequence of the nucleic acid tag and to the 5’ overhang sequence of the RT primer, or to the 3’ hybridization sequence of the nucleic acid tag and to the 5’ hybridization sequence of a previously coupled nucleic acid tag.
- the barcode sequences present within the nucleic acid tags can be any of a range of lengths, e.g., 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 nucleotides or longer.
- the barcode sequences are 8 nucleotides in length. By varying 8 nucleotides, there are 65,536 possible unique sequences. In some embodiments, more than 8 nucleotides are used. In some other embodiments, fewer than 8 nucleotides are used.
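The count of 65,536 follows from four possible bases at each of 8 positions (4^8). A minimal sketch of that arithmetic is given here; the function names are hypothetical and the enumeration is shown only for a short length to keep it cheap.

```python
from itertools import product


def n_barcodes(length, alphabet="ACGT"):
    """Number of distinct sequences of the given length over the alphabet."""
    return len(alphabet) ** length


def all_barcodes(length, alphabet="ACGT"):
    """Lazily enumerate every sequence of the given length."""
    return ("".join(bases) for bases in product(alphabet, repeat=length))


# Barcode-space size for a few of the lengths listed in the text.
sizes = {length: n_barcodes(length) for length in (4, 6, 8, 10)}
print(sizes[8])

# Enumeration agrees with the closed-form count for a short length.
print(sum(1 for _ in all_barcodes(4)) == n_barcodes(4))
```

In practice only a subset of the possible sequences (for example, 96 per round) needs to be used, so longer barcodes mainly leave room to choose sequences that differ from each other at several positions.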
- the first strand of the nucleic acid tag is preannealed to the second (linker) strand.
- the linker strand includes sequence complementary to part of the RT primer or to a 5’ region in a previously coupled nucleic acid tag (i.e., appended to the cDNA in a previous round of tagging), thereby allowing it to hybridize and bring the 3' end of the barcodes into close proximity to the 5' end of the reverse transcription primer or the previously added tag.
- the phosphate of the reverse transcription primer or previous tag can be ligated to the 3' end of the first-round barcodes by, e.g., T4 DNA ligase.
- a 5’ region of the nucleic acid tag (e.g., domain s2) can then provide an accessible binding domain for a linker oligo to be used in another round of barcoding.
- the nucleic acid tags (barcode oligos) can include a 5' phosphate that can allow ligation to the 3' end of another oligo by T4 DNA ligase.
- the nucleic acid tags are ligated to the cDNAs (or adapter molecules) during each round of labeling.
- the methods of labeling nucleic acids in the first cell may comprise ligating at least two of the nucleic acid tags that are bound to the cDNAs.
- the nucleic acid tags are hybridized to the cDNAs (or adapters) during each round, and ligation is performed for all of the hybridized tags at a later stage, e.g., following cell lysis, i.e., ligation may be conducted before or after the lysing and/or the cDNA purification steps.
- Ligation can comprise covalently linking the 5' phosphate sequences on the nucleic acid tags to the 3' end of an adjacent strand or nucleic acid tag such that individual tags are formed into a continuous, or substantially continuous, barcode sequence that is bound to the 3' end of the cDNA sequence.
- a double-stranded DNA or RNA ligase may be used with an additional linker strand that is configured to hold a nucleic acid tag together with an adjacent nucleic acid in a “nicked” double-stranded conformation.
- the double-stranded DNA or RNA ligase can then be used to seal the “nick.”
- a single-stranded DNA or RNA ligase may be used without an additional linker.
- the ligation may be performed within the plurality of cells.
- FIGS. 6 and 7 illustrate the ligation of a plurality of nucleic acid tags to form a substantially continuous label or barcode.
- one or more unbound nucleic acid tags are removed (e.g., by washing the plurality of cells).
- the methods may comprise removing a portion, a majority, or substantially all of the unbound nucleic acid tags.
- Unbound nucleic acid tags may be removed such that further rounds of the disclosed methods are not contaminated with one or more unbound nucleic acid tags from a previous round of a given method.
- unbound nucleic acid tags may be removed via centrifugation.
- the plurality of cells can be centrifuged such that a pellet of cells is formed at the bottom of a centrifuge tube.
- the supernatant (i.e., the liquid containing the unbound nucleic acid tags) can then be removed.
- the cells may then be resuspended in a buffer (e.g., a fresh buffer that is free or substantially free of unbound nucleic acid tags).
- the plurality of cells may be coupled or linked to magnetic beads that are coated with an antibody that is configured to bind the cell or nuclear membrane.
- the plurality of cells can then be pelleted using a magnet to draw them to one side of the reaction vessel.
- the plurality of cells may be placed in a cell strainer (e.g., a PLURISTRAINER® cell strainer) and washed with a wash buffer.
- wash buffer may include, e.g., a surfactant, a detergent, and/or about 5-60% formamide.
- the ligation can be stopped during each round of combinatorial labeling by adding an excess of oligo that is complementary to all or part of the linker (second) strand used during the same round of labeling.
- oligo strands that are fully complementary to the linker oligos can be added. These oligos can bind the linker strands attached to unligated barcodes and displace the unligated barcodes through a strand displacement reaction. The unligated barcodes can then be completely single-stranded.
- because T4 DNA ligase, for example, is unable to ligate single-stranded DNA to other single-stranded DNA, the ligation reaction will stop progressing.
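A stop strand that is fully complementary to a linker strand is simply the linker's reverse complement. The sketch below shows that derivation; the linker sequence is a hypothetical placeholder, not a sequence from this disclosure.

```python
# Translation table for DNA base complements (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


linker = "ACGGTCAATGCCTGA"               # hypothetical linker strand, 5' to 3'
stop_strand = reverse_complement(linker)  # fully complementary stop-ligation strand
print(stop_strand)
```

Because the stop strand pairs with the whole linker, it can displace a shorter, partially paired barcode strand, leaving the unligated barcode single-stranded as described above.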
- stop ligation strands are diluted into 10X Ligase Buffer and water, e.g., 264 µl stop ligation strand, 300 µl 10X T4 DNA Ligase Buffer, and 636 µl nuclease-free water.
- the nucleic acid tag (e.g., a final nucleic acid tag appended during the last of multiple rounds of tagging) may comprise a capture agent such as, but not limited to, biotin, e.g., a 5' biotin.
- a cDNA labeled with a 5' biotin-comprising nucleic acid tag may allow or permit the attachment or coupling of the cDNA to a streptavidin-coated magnetic bead, e.g., C1 beads.
- a plurality of beads may be coated with a capture strand (i.e., a nucleic acid sequence) that is configured to hybridize to a final sequence overhang of a barcode.
- cDNA may be purified or isolated by use of a commercially available kit (e.g., an RNEASY™ kit).
- one or more nucleic acid tags may comprise additional elements (in addition to biotin or another capture agent) such as a nucleotide sequence (e.g., comprising random and/or degenerate nucleotides) allowing the detection of PCR duplicates, primer binding sequences, adapter sequences for next-generation sequencing (NGS) (e.g., Illumina adapter sequences), and others.
- the random nucleotide sequences allow the computational removal of PCR duplicates, since these duplicates will have the same random sequence. In this way, each original transcript will only be counted once, even if multiple PCR duplicates are sequenced.
- Such sequences can contain any number of random successive nucleotides, e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10 nucleotides or more, and can be used either alone or in conjunction with other sequence elements (such as the other barcode sequences described herein) to allow the identification of the PCR duplicates.
- the number of different random sequences per cell indicates how many unique RNA molecules can be detected per cell, which is directly related to how efficiently the molecules can be barcoded and processed to enable detection by next generation sequencing.
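The computational removal of PCR duplicates can be sketched as collapsing reads that share the same cell barcode, gene, and random sequence. The example below is a simplified illustration: the read tuples and names are hypothetical, and it treats random sequences as identical only on an exact match (no sequencing-error correction), which is an assumption rather than the method of this disclosure.

```python
from collections import defaultdict


def count_unique_molecules(reads):
    """Count original molecules per (cell, gene), counting each random sequence once.

    reads: iterable of (cell_barcode, gene, random_sequence) tuples.
    """
    seen = set()
    counts = defaultdict(int)
    for cell, gene, random_seq in reads:
        key = (cell, gene, random_seq)
        if key in seen:
            continue  # same cell, gene, and random sequence: treat as a PCR duplicate
        seen.add(key)
        counts[(cell, gene)] += 1
    return dict(counts)


# Hypothetical reads: two are PCR duplicates of the same original molecule.
reads = [
    ("cell_a", "GENE1", "ACGTACGT"),
    ("cell_a", "GENE1", "ACGTACGT"),  # duplicate of the read above
    ("cell_a", "GENE1", "TTGACCAA"),
    ("cell_b", "GENE1", "ACGTACGT"),  # same random sequence, different cell
]
molecule_counts = count_unique_molecules(reads)
print(molecule_counts)
```

The number of distinct random sequences observed per cell then gives the number of unique molecules detected in that cell.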
- the methods include lysing (i.e., breaking down the cell or nuclear structure) the plurality of cells (or nuclei) to release the tagged cDNA molecules from the plurality of cells or nuclei following combinatorial barcoding, thereby forming a lysate comprising the released tagged cDNA molecules.
- the cells or nuclei comprising the tagged cDNAs (or other molecules) are divided into one or more samples or sublibraries, and the lysis is performed separately on each sample or sublibrary.
- each sample or sublibrary can be tagged with one or more index or barcode sequences during a subsequent step of the herein-described methods (e.g., using unique dual indices, or UDIs, as described elsewhere herein).
- the total number of sublibraries prepared can depend on various factors, including the number of cells in the plurality of cells or nuclei.
- the plurality of cells or nuclei comprises up to 10,000 cells or nuclei, and two sublibraries are prepared.
- the plurality of cells or nuclei comprises up to 100,000 cells or nuclei, and 8 sublibraries are prepared.
- the plurality of cells or nuclei comprises up to 1,000,000 cells or nuclei, and 16 sublibraries are prepared. Other numbers of cells or nuclei, including less than 10,000 and greater than 1,000,000, or various numbers between 10,000 and 1,000,000, can be used, and a skilled artisan will be able to determine a suitable number of sublibraries.
- the number of cells in each library can be, for example, 200, 500, 1,000, 2,000, 3,000, 4,000, 5,000, 6,000, 7,000, 8,000, 9,000, 10,000, or more.
- each sublibrary can be immediately processed, e.g., to prepare a sequencing library, or stored, e.g., at -80 °C. Further, while all of the sublibraries can be processed together, each sublibrary can be sequenced separately.
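The example pairings of cell number and sublibrary number given above can be written as a simple lookup. This is only a sketch of those three stated examples; the function names are hypothetical, and inputs beyond the listed examples return `None` because the text leaves that choice to the skilled artisan.

```python
# (maximum number of cells or nuclei, number of sublibraries), from the examples in the text.
EXAMPLE_TIERS = [(10_000, 2), (100_000, 8), (1_000_000, 16)]


def example_sublibrary_count(n_cells):
    """Return the sublibrary count from the listed examples, or None if none applies."""
    for max_cells, n_sublibraries in EXAMPLE_TIERS:
        if n_cells <= max_cells:
            return n_sublibraries
    return None  # more than 1,000,000 cells: not covered by the listed examples


def mean_cells_per_sublibrary(n_cells, n_sublibraries):
    """Average number of cells per sublibrary for an even split."""
    return n_cells / n_sublibraries


n = 50_000  # hypothetical experiment size
k = example_sublibrary_count(n)
print(k, mean_cells_per_sublibrary(n, k))
```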
- the plurality of cells is lysed in a lysis solution (e.g., a solution comprising Tris-HCl, EDTA, NaCl, and SDS, e.g., 10 mM Tris-HCl (pH 7.9), 50 mM EDTA (pH 7.9), 0.2 M NaCl, 2.2% SDS) comprising an RNase inhibitor (e.g., 0.5 mg/ml ANTI-RNase, AMBION®) and a protease such as a serine protease, e.g., Proteinase K (e.g., 1000 mg/ml proteinase K (AMBION®)).
- lysis is performed at about 55 °C for about 3 hours with shaking (e.g., vigorous shaking).
- the plurality of cells is lysed using ultrasonication and/or by being passed through an 18-25 gauge syringe needle at least once.
- the plurality of cells is lysed by being heated to about 70-90 °C.
- the plurality of cells may be lysed by being heated to about 70-90 °C for about one or more hours.
- the cDNA molecules may be isolated from the lysed cells or nuclei.
- RNase H or another RNase
- the methods may further comprise ligating at least two of the nucleic acid tags that are bound to the released cDNAs (i.e., in embodiments in which the tags were not ligated during each round of split-pool tagging).
- the methods may comprise ligating at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, or more nucleic acid tags that are bound to the cDNA molecules.
- the tagged cDNA molecules are isolated from the lysis solution with a purification or clean-up step, e.g. an SPRI bead cleanup, before binding the desired nucleic acids (i.e., cDNAs containing 5' biotin) to streptavidin beads.
- a protease inhibitor is added to lysates and then streptavidin beads are directly added (i.e., skipping the first SPRI isolation of nucleic acids).
- the protease inhibitor may include phenylmethanesulfonyl fluoride (PMSF), 4-(2-aminoethyl)benzenesulfonyl fluoride hydrochloride (AEBSF), a combination thereof, and/or another suitable protease inhibitor.
- the cDNA molecules are isolated using streptavidin beads, e.g., C1 beads.
- 20 µl of resuspended DYNABEADS® MYONE™ Streptavidin C1 beads (for each aliquot of cells) can be added to a 1.7 ml microcentrifuge tube (EPPENDORF®).
- the beads can be washed, e.g., 3 times, with, e.g., 1x phosphate buffered saline Tween 20 (PBST) and resuspended in PBST (e.g., 20 µl PBST).
- 900 µl PBST is added to the cell aliquot and 20 µl of washed C1 beads are added to the aliquot of lysed cells.
- the samples can, e.g., be placed on a gentle roller for 15 minutes at room temperature and then washed, e.g., 3 times with 800 µl PBST using a magnetic tube rack (EPPENDORF®).
- the beads can then be resuspended in, e.g., PBS such as 100 µl PBS.
- a microcentrifuge tube comprising a sample can be placed against a magnetic tube rack (EPPENDORF®) for, e.g., 2 minutes and then the liquid can be aspirated.
- the beads can be resuspended in, e.g., an RNase solution (3 µl RNase Mix (ROCHE™), 1 µl RNase H (NEW ENGLAND BIOLABS®), 5 µl RNase H 10x Buffer (NEW ENGLAND BIOLABS®), and 41 µl nuclease-free water).
- the sample can be incubated under suitable conditions, e.g., at 37 °C for 1 hour, and then removed from the conditions and placed against a magnetic tube rack (EPPENDORF®) for, e.g., 2 minutes.
- the sample can be washed with, e.g., 750 µl of nuclease-free water + 0.01% Tween 20 (H2O-T), without resuspending the beads and keeping the tube disposed against the magnetic tube rack.
- the liquid can then be aspirated.
- the sample can be washed with 750 µl H2O-T without resuspending the beads and while keeping the tube disposed against the magnetic tube rack.
- the liquid can be aspirated while keeping the tube disposed against the magnetic tube rack.
- FIGS. 4A-4B illustrate an exemplary embodiment of isolating cDNA molecules.
- the biotinylated cDNA/mRNA hybrid binds to a streptavidin binder bead (FIG. 4A). Molecules having biotin are collected and molecules lacking biotin are removed. Next, a template switch reaction is performed (FIG.
- a template switching oligonucleotide TSO
- a template switching (TS) adapter comprising, e.g., a primer binding site (e.g., a template switching primer “TS primer”) to the 3’ end of the cDNA molecule for cDNA amplification.
- two products can be created: (i) whole transcriptome and (ii) target cDNA molecules (see, e.g., FIGS. 4C-4D).
- a common adapter sequence (or NGS adapter) is added to the 3'-end of the released cDNA molecules following isolation of the released cDNA (i.e., cDNA/mRNA duplex).
- the common (or NGS) adapter sequence is the same, or substantially the same, for each of the cDNA molecules (i.e., within a given experiment).
- the addition of the common adapter may be conducted or performed in a solution including up to about 10% w/v of PEG, wherein the molecular weight of the PEG is between about 7,000 g/mol and 9,000 g/mol.
- adapters are used with a phosphate at the 5' end and dideoxycytidine (ddC) at the 3' end.
- enzymes are capable of ligating single-stranded oligo to the 3' end of single-stranded DNA, e.g., T4 RNA ligase 1 (NEW ENGLAND BIOLABS®) or thermostable 5' AppDNA/RNA Ligase (NEW ENGLAND BIOLABS®).
- the adapter sequence is added to the 3'-end of the released cDNA molecules by template switching (see Picelli, S, et al. Nature Methods 10, 1096-1098 (2013)).
- template switching can be performed on the cDNA molecules, i.e., the cDNA/RNA duplexes attached to streptavidin beads, using a template switching oligonucleotide (TSO) or template switching primer (TS) comprising the adapter sequence.
- up to 10% w/v PEG (molecular weight 7,000-9,000 g/mol) is used in the template switch reaction.
- the adapter sequence introduced by the TSO is used for the amplification of tagged cDNA molecules, e.g., as illustrated in FIGS. 4A-4D.
- FIGS. 4A-4D provide an overview of cDNA capture and amplification (multiplex preamplification and/or target amplification) steps according to certain embodiments of the present methods, e.g., in the context of CRISPR screens such as CROP-seq or similar methods.
- FIG. 4A shows cDNA Capture. Biotinylated cDNA was captured via streptavidin beads.
- FIG. 4B shows cDNA Template Switch. A template switch (TS) reaction adds an adapter to the 3’ end of the cDNA.
- FIG. 4C shows cDNA Amplification.
- the cDNA was amplified with template switch adaptor and Illumina Truseq R2 specific primers.
- in the first round of amplification (or multiplex amplification) shown here, also referred to as “preamplification” in the context of the present disclosure, additional primers specific for one or more target cDNAs (e.g., expressed from target genes or comprising target sequences) were added, or “spiked in,” so as to enrich the target cDNAs in the mixture.
- FIG. 4C shows hU6-sgRNA-polyA transcripts which had been enriched due to the addition of a human U6 specific primer.
- FIG. 4D presents another view of the products of the preamplification step, showing a preamplified cDNA representative of the whole transcriptome, and a preamplified target cDNA, in this case comprising a gRNA sequence.
- FIGS. 5A-5C provide an overview of the preparation of whole transcriptome (WT) sequencing libraries according to certain embodiments of the present methods, e.g., in the context of CRISPR screening, TCR profiling of T cells, or other multiplex applications.
- FIG. 5A shows that cDNA transcripts were fragmented, end-repaired, and A-tailed.
- FIG. 5B shows Adapter Ligation. As an example, an Illumina Truseq R1 Adapter was ligated to the 5’ end of the DNA.
- FIG. 5C shows Round 4 Barcoding.
- the sequencing library was amplified, adding P5/P7 Adapters and a fourth barcode via the UDI - WT Plate.
- FIGS. 6A-6C provide an overview of the preparation of CRISPR sequencing libraries according to certain embodiments of the present methods.
- FIGS. 6A-6B show two additional amplification steps after the initial preamplification and subsequent separation of the enriched preamplified tagged cDNAs.
- FIG. 6A shows CRISPR PCR.
- a PCR reaction enriched the sgRNA-containing cDNA with a second hU6-specific primer. This reaction could also add an adaptor, e.g., an Illumina adaptor or any compatible equivalent thereof.
- FIG. 6B shows CRISPR Index PCR.
- FIG. 6C shows another view of the products of the two additional rounds of amplification.
- the cDNA molecules are amplified in a first round of “multiplex” amplification, in which PCR is performed using both non-specific (i.e., non-target specific) WT primers (e.g., TSO and R2 primers) configured to amplify all tagged cDNAs in the transcriptome, and target-specific primers configured to amplify target sequences only.
- because this initial round of amplification is a first enrichment step (due to the spiked-in target primers) taking place before a subsequent, additional round of target-specific amplification (i.e., a target-specific “enrichment” round of PCR performed exclusively with target-specific primers during the preparation of the target sequencing library), this round is referred to herein as a “preamplification” step.
- This preamplification step is a multiplex PCR in which both the whole transcriptome and the specific target sequences to be enriched (e.g., gRNA or TCR sequences) are amplified simultaneously, but with the target sequences being amplified preferentially (due to the presence of the spiked-in target-specific primers or gene-specific primers) such that they become enriched within the overall transcriptome.
- preamplification comprises amplifying cDNA molecules using at least one pair of primers (i.e., whole transcriptome (WT) preamplification primers) configured to broadly amplify tagged cDNA molecules in the mixture but which do not specifically anneal to target cDNAs, as well as at least one pair of “spiked-in” target-specific primers (“target preamplification primers”) configured to specifically amplify one or more target sequences.
- the at least one pair of WT primers can comprise one primer complementary to an adapter sequence introduced by the TSO, and one primer complementary to an adapter sequence (e.g., R2 sequence) introduced by the last nucleic acid tag appended to the cDNA during split-pool labeling.
- a pair of spiked-in target preamplification primers comprises two target-specific primers, such that neither of the primers is configured to amplify non-target cDNAs within the whole transcriptome.
- a pair of spiked-in preamplification primers comprises one target-specific primer, and one non-target-specific primer (e.g., a WT primer such as an RT primer), such that PCR using the pair of primers will only amplify (i.e., exponentially amplify) target cDNAs.
- one or more WT preamplification primers used in the preamplification round comprise the sequence of SEQ ID NO: 12 or SEQ ID NO: 13, or a sequence comprising 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 12 or SEQ ID NO: 13, or a sequence comprising not more than 1, 2, 3, 4, or 5 mismatches relative to SEQ ID NO: 12 or SEQ ID NO: 13.
- one or more target-specific primers used in the preamplification round comprise the sequence of SEQ ID NO: 11, or a sequence comprising 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 11, or a sequence comprising not more than 1, 2, 3, 4, or 5 mismatches relative to SEQ ID NO: 11.
- one or more pairs of target preamplification primers used in the preamplification round comprise the sequence of SEQ ID NO: 12, or a sequence comprising 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 12, or a sequence comprising not more than 1, 2, 3, 4, or 5 mismatches relative to SEQ ID NO: 12.
- one or more target-specific primers comprises any of the sequences shown as SEQ ID NOS: 1-10, or a sequence comprising 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to any of SEQ ID NOS: 1-10, or a sequence comprising not more than 1, 2, 3, 4, or 5 mismatches relative to any of SEQ ID NOS: 1-10.
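The percent-identity and mismatch criteria recited above can be checked with a few lines of code. The sketch below is a simplification: it compares equal-length sequences position by position without gaps, whereas percent identity is more generally computed from an alignment. The sequences are hypothetical placeholders and are not any of the SEQ ID NOs.

```python
def count_mismatches(candidate, reference):
    """Number of positions at which two equal-length sequences differ."""
    if len(candidate) != len(reference):
        raise ValueError("this simplified comparison requires equal-length sequences")
    return sum(a != b for a, b in zip(candidate.upper(), reference.upper()))


def percent_identity(candidate, reference):
    """Ungapped percent identity between two equal-length sequences."""
    matches = len(reference) - count_mismatches(candidate, reference)
    return 100.0 * matches / len(reference)


def meets_criteria(candidate, reference, min_identity=80.0, max_mismatches=5):
    """True if either criterion is met: identity threshold or mismatch limit."""
    return (percent_identity(candidate, reference) >= min_identity
            or count_mismatches(candidate, reference) <= max_mismatches)


reference = "ACGTACGTACGTACGTACGT"  # hypothetical 20-nt primer
candidate = "ACGTACGTACCTACGTACGA"  # hypothetical variant with two substitutions
print(count_mismatches(candidate, reference), percent_identity(candidate, reference))
```

The thresholds are parameters because the text lists several alternatives (80% to 99% identity, or 1 to 5 mismatches).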
- the primers can be used at a range of concentrations.
- the primers are each added at from 1 to 10 µM, e.g., at 1.2 µM, 2.4 µM, 4.8 µM, 7.2 µM, 9.6 µM, or 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more µM, or from 100 nM to 1 µM, e.g., 100, 200, 300, 400, 500, 600, 700, 800, 900, or more nM for each target preamplification primer.
- the ratio (e.g., molar ratio) of target-specific primers to WT primers used is 1:1000, 1:900, 1:800, 1:700, 1:600, 1:500, 1:400, 1:300, 1:200, 1:100, 1:90, 1:80, 1:70, 1:60, 1:50, 1:40, 1:30, 1:20, 1:10, 1:9, 1:8, 1:7, 1:6, 1:5, 1:4, 1:3, 1:2, 1:1, 2:1, 3:1, 4:1, 5:1, 6:1, 7:1, 8:1, 9:1, 10:1, 20:1, 30:1, 40:1, 50:1, 60:1, 70:1, 80:1, 90:1, 100:1, 200:1, 300:1, 400:1, 500:1, 600:1, 700:1, 800:1, 900:1, or 1000:1.
- primers said to “non-specifically” amplify the tagged cDNAs (e.g., WT preamplification primers) in the mixture means that they are configured to broadly amplify all tagged cDNAs generated using the present methods (e.g., all cDNAs comprising a TSO adapter sequence and an R2 adapter sequence). It does not necessarily mean, however, that the primers will have identical affinity for each of the tagged cDNAs in the mixture, or that they will amplify all tagged cDNAs in the mixture at an identical rate.
- primers said to “non-specifically” amplify the tagged cDNAs simply means that the primers are not designed to specifically anneal to any particular cDNAs in the mixture, e.g., are not designed to be complementary to a target sequence, such that the primers can generally be expected to amplify the overall collection of tagged cDNA molecules in the mixture.
- a pair of primers said to “specifically” amplify a target sequence can comprise two primers with complementarity to the target sequence, or one primer with complementarity to the target sequence and one primer capable of annealing to all tagged cDNAs in the mixture (including non-target sequences).
- the primer capable of annealing to all of the tagged cDNAs could comprise complementarity to an R2 adapter sequence present in the last nucleic acid tag coupled to all tagged cDNAs during combinatorial barcoding.
- the at least one pair of target preamplification primers may comprise a primer with complementarity to one or more target sequence or sequences (e.g., to a specific gRNA in a CRISPR screen, or to a particular TCR variable region sequence), or to a sequence common to a plurality of different target sequences (such as a U6 promoter sequence found upstream of Pol III generated transcripts in the case of CRISPR screens such as CROP-seq screens, or a TCR sequence, e.g., a variable region sequence common to multiple TCR genes in the population).
- the cDNA amplification will give rise to two products: (1) the whole transcriptome, and (2) an enriched pool of tagged cDNAs corresponding to the specific target or targets (e.g., gRNA or TCR sequences) amplified by the target preamplification primers, meaning that the resulting mixture of cDNAs will be an enriched cDNA amplification product that comprises cDNAs for the whole transcriptome and that is enriched for transcripts corresponding to the specific target transcripts of interest (e.g., gRNA or TCR sequences).
- the preamplification step is performed by placing tubes comprising the (magnetic) beads with bound tagged cDNA molecules against a magnetic rack, removing and discarding the clear supernatant, and resuspending the beads in bind buffer, then removing and discarding the supernatant, then removing the tubes from the magnetic rack and resuspending the beads in amplification reaction solution (e.g., a solution comprising an amplification master buffer, WT preamplification primers for amplifying the whole transcriptome, and/or target-specific preamplification primers (e.g., TCR-specific primers)).
- the resuspended beads can then be stored on ice (or a suitable temperature, e.g., at about 0, 1, 2, 3, 4, 5, 6, 7, or 8 °C).
- the tubes can then be placed in a thermocycler and subjected to suitable conditions for PCR amplification of the cDNAs. Following amplification, the tubes can be removed and stored, e.g., at 4 °C.
- the PCR products can be cleaned up, e.g., by the addition of solid phase reversible immobilization (SPRI) beads.
- SPRI beads are used to remove polynucleotides of less than about 200 base pairs, less than about 175 base pairs, or less than about 150 base pairs (see DeAngelis, M M, et al. Nucleic Acids Research (1995) 23(22):4742). In some embodiments, SPRI beads are used to remove polynucleotides of less than about 200 base pairs.
- the ratio of SPRI bead solution to amplified cDNA molecule solution may be between about 0.9:1 and about 0.7:1, between about 0.875:1 and about 0.775:1, between about 0.85:1 and about 0.75:1, between about 0.825:1 and about 0.725:1, about 0.8:1, or another suitable ratio.
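The bead-to-sample ratio translates directly into a bead volume for a given sample volume. The short sketch below shows that conversion; the function names and the 50 µl sample volume are hypothetical, and the range check only flags ratios outside the broadest range stated above, since other ratios may also be suitable.

```python
def spri_bead_volume_ul(sample_volume_ul, ratio=0.8):
    """Volume of SPRI bead solution for a given sample volume and bead:sample ratio."""
    if sample_volume_ul <= 0 or ratio <= 0:
        raise ValueError("sample volume and ratio must be positive")
    return round(sample_volume_ul * ratio, 2)


def in_stated_range(ratio, low=0.7, high=0.9):
    """True if the ratio lies within the broadest range given in the text (about 0.7:1 to 0.9:1)."""
    return low <= ratio <= high


sample_ul = 50.0  # hypothetical volume of amplified cDNA solution
print(spri_bead_volume_ul(sample_ul), in_stated_range(0.8))
```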
- the SPRI clean-up is single-sided. In some embodiments, the SPRI clean-up is double-sided.
- the SPRI bead solution may include between about 1 M and 4 M NaCl, between about 2 M and 3 M NaCl, between about 2.25 M and 2.75 M NaCl, about 2.5 M NaCl, or another suitable amount of NaCl.
- the SPRI bead solution may also include between about 15% w/v and 25% w/v polyethylene glycol (PEG), wherein the molecular weight of the PEG is between about 7,000 g/mol and 9,000 g/mol (PEG 8000).
- the SPRI bead solution may include between about 17% w/v and 23% w/v PEG 8000, between about 18% w/v and 22% w/v PEG 8000, between about 19% w/v and 21% w/v PEG 8000, about 20% w/v PEG 8000, or another suitable % w/v PEG 8000.
- 20 µl of the RNase-treated beads can be added to a single PCR tube.
- 80 µl of ligase mix (5 µl T4 RNA Ligase 1 (NEW ENGLAND BIOLABS®), 10 µl 10X T4 RNA ligase buffer, 5 µl BC_0047 oligo at 50 µM, 50 µl 50% PEG 8000, and 10 µl 10 mM ATP) can be added to the 20 µl of beads in the PCR tube.
- 50 µl of the ligase mixed with the beads can be transferred into a new PCR tube to prevent too many beads from settling to the bottom of a single tube, and the sample can be incubated at 25 °C for 16 hours.
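The final concentrations implied by the ligase mix above can be checked with a short calculation. This sketch assumes the volumes are in microliters and the oligo stock is 50 µM (a reading of the source units, not something the text states unambiguously); the dictionary layout and the function name `final_concentrations` are hypothetical.

```python
# Each entry: (stock concentration, volume in microliters).
# The concentration unit is given in the component name.
LIGASE_MIX = {
    "T4 RNA ligase buffer (X)": (10.0, 10.0),
    "BC_0047 oligo (uM)": (50.0, 5.0),
    "PEG 8000 (% w/v)": (50.0, 50.0),
    "ATP (mM)": (10.0, 10.0),
}
ENZYME_UL = 5.0   # T4 RNA Ligase 1
BEADS_UL = 20.0   # RNase-treated beads


def final_concentrations(mix, extra_ul):
    """Return (total volume, {component: final concentration}).

    extra_ul is volume that adds to the total without contributing
    any of the listed components (here, enzyme plus beads).
    """
    total = sum(vol for _, vol in mix.values()) + extra_ul
    finals = {name: stock * vol / total for name, (stock, vol) in mix.items()}
    return total, finals


total_ul, finals = final_concentrations(LIGASE_MIX, ENZYME_UL + BEADS_UL)
for name, value in finals.items():
    print(f"{name}: {value:g}")
```

Under these assumptions the reaction totals 100 µl, with the buffer at 1X and the PEG 8000 at 25% w/v.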
- the combined products of preamplification i.e., mixture of whole transcriptome cDNA molecules and enriched target cDNA molecules
- an additional (e.g., fourth) sublibrary-specific barcode is added to the cDNA molecules.
- this additional (e.g., fourth) barcode is an Illumina Unique Dual Index (UDI).
- the preamplified WT cDNA molecules are fragmented, an adapter comprising, inter alia, a primer binding sequence is appended to the fragmented ends, and an additional amplification reaction is performed to introduce one or more index sequences (e.g., unique dual indexes, or UDIs) and sequencing primer binding sites for NGS sequencing (see, e.g., FIGS. 5A-5C).
- the WT cDNAs can be fragmented, e.g., using a fragmentation enzyme and fragmentation buffer.
- the preamplified cDNA molecules are fragmented by incubating with the fragmentation enzyme and buffer at 32 °C for 10 minutes and are then held at 65 °C for, e.g., 30 minutes.
- the fragment ends are repaired and A-tailed, and the adapter is ligated to the ends.
- an Illumina Truseq R1 Adapter is ligated to the 5’ end of the DNA.
- the ligation of the adapter to the ends of the fragments of the preamplified cDNA can be preceded and/or followed by an SPRI clean-up step, e.g., using Ampure XP or KAPA Pure Beads.
- the cleaned-up molecules are then subjected to an additional round of amplification, e.g., adding P5/P7 adapter sequences.
- a fourth barcode can also be added.
- the additional barcodes correspond to unique dual indexes (UDI), e.g., with different well-specific index primers used for each sublibrary.
- the indexing round of amplification can be preceded and/or followed by an additional SPRI clean-up step (e.g., using Ampure XP or KAPA Pure Beads).
- FIGS. 5A-5C illustrate an exemplary embodiment of preparation of whole transcriptome libraries for sequencing.
- Sublibrary cDNA is fragmented to a size compatible with a suitable sequencing platform (FIG. 5A), e.g., Illumina sequencing or other compatible sequencing platforms.
- a second adaptor e.g., an R1 adapter
- In FIG. 5C, a final PCR amplifies the fragmented cDNA and appends the fourth DNA barcode, UDIs, and P5 and P7 adaptors.
- a target sequencing library is prepared.
- the preparation of the target library comprises one or more additional rounds of PCR amplification, e.g., using the product of the first PCR amplification as a starting material for amplification.
- This second round of “enrichment” PCR uses target-specific primer pairs to specifically amplify target cDNA molecules, in the absence of WT primer pairs.
- a first additional round of amplification is performed using at least one target-specific amplification primer that is different than the target-specific preamplification primer(s) used in the previous, preamplification step.
- distinct, nested primers are used in the preamplification and amplification steps, e.g., as shown in FIGS. 2A-2C (i.e., Focal Primer 1 and Focal Primer 2).
- the same primers are used in the preamplification and amplification steps, or, alternatively, different primers are used that nevertheless bind to the same binding site on the target cDNA (i.e., the primers bind to the same sites on the target cDNA, but may comprise distinct additional elements outside of their hybridization region).
- the first additional round of amplification is followed by a second additional round of amplification.
- the first round of amplification can be performed to further enrich the target sequences and also add, e.g., an adapter sequence such as an Illumina adapter or other compatible adapters.
- the second round of amplification can further enrich the target sequences as well as introduce additional elements such as flow cell binding sites (e.g., P5 and P7) and index sequences (e.g., unique dual indexes, or UDIs).
- UDIs as used herein can be Illumina UDIs or any compatible UDIs (e.g., Zymo-Seq SwitchFree™ 3’ mRNA Library Kits, Cat. No. R3008/R3009; Zymo-Seq RiboFree® Total RNA Library Kits, Cat. No. R3000/R3003).
- the UDIs used are those shown in Table 16 (i.e., in a UDI - WT plate, for WT index PCR in the preparation of WT sequencing libraries) and Table 20 (i.e., in a UDI - EC plate, for target-specific index PCR in the preparation of target sequencing libraries).
- FIGS. 2A-2C provide an overview of one embodiment of the present methods, e.g., using a pair of nested “focal primers” to enrich for one or more target sequences when preparing multiple sequencing libraries for multiplex single-cell analysis purposes.
- FIG. 2A shows the products of the combinatorial barcoding of cDNA in cells or nuclei, e.g., using split-pool labeling as described herein. Shown are both a generic cDNA molecule representative of the entire genome, as well as a cDNA molecule representative of a subset of the whole transcriptome comprising a target cDNA of interest. Two nested primer binding sites are indicated on the target cDNA, corresponding to one embodiment of the present disclosure.
- the cDNA representative of the whole transcriptome was amplified using a primer (the “R2 primer”) specific to an adapter sequence introduced, e.g., by a nucleic acid tag during the last round of combinatorial barcoding, and a second primer (the “PCR primer”) specific to a second adapter sequence introduced, e.g., by the template switching oligonucleotide (TSO).
- the target cDNA was amplified by the same R2 primer and by a second primer (the “Focal Primer”) that binds specifically to the target cDNA.
- FIG. 2C shows the two types of cDNA products present in the same sample following preamplification: the whole transcriptome (WT) cDNA products representative of the whole transcriptome in the cell or nucleus, and the enriched target (or “gene-specific”) cDNA molecules.
- One exemplary method of preparing a target-specific sequencing library is shown in FIGS. 6A-6C, where the first additional round of amplification further enriches the sgRNA-containing cDNA using a second hU6-specific primer and also adds an Illumina adaptor, and the second additional round adds P5/P7 adaptors and a fourth barcode via the Illumina indexes in the UDI Plate - EC.
- Another exemplary method is shown in FIGS.
- the first additional round amplifies a subset of the cDNA from the whole transcriptome that contains V(D)J segments in the CDR3 repertoire of the T cell, and also adds an Illumina adaptor
- the second, final round amplifies the TCR Amplification product from the previous round and also appends UDIs from the UDI - EC Plate as well as the P5 and P7 adaptors.
- the indexing round of amplification can be preceded and/or followed by an additional SPRI clean up step (e.g., using Ampure XP or KAPA Pure Beads).
- FIGS. 7A-7C provide an overview of the preparation of TCR sequencing libraries according to certain embodiments of the present methods.
- FIGS. 7A-7B show two additional amplification steps after the initial preamplification and subsequent separation of the enriched preamplified tagged cDNAs.
- FIG. 7A illustrates TCR Amplification 1.
- a PCR reaction amplified a subset of the cDNA from the whole transcriptome that contains V(D)J segments in the CDR3 repertoire of the T cell. This reaction also added an adaptor, e.g., an Illumina adaptor or any compatible equivalent thereof.
- FIG. 7B illustrates TCR Amplification 2.
- a final PCR amplifies the TCR Amplification 1 product and appends the fourth DNA barcodes, UDIs from the UDI - EC Plate, as well as the P5 and P7 adaptors.
- FIG. 7C shows another view of the products of the two additional rounds of amplification.
- one or more primers are used in the preamplification and/or amplification steps that are specific to a Pol III promoter, e.g., a U6 promoter such as a human U6 promoter.
- one or more target primers used in the preamplification round comprise the sequence of SEQ ID NO: 1, or a sequence comprising 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO: 1, or a sequence comprising not more than 1, 2, 3, 4, or 5 mismatches relative to SEQ ID NO: 1.
- one or more target amplification primers used in the amplification round comprise the sequence of SEQ ID NO:4, or a sequence comprising 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO:4, or a sequence comprising not more than 1, 2, 3, 4, or 5 mismatches relative to SEQ ID NO:4.
- one or more pairs of target amplification primers used in the amplification round comprise the sequence of SEQ ID NO:2, or a sequence comprising 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity to SEQ ID NO:2, or a sequence comprising not more than 1, 2, 3, 4, or 5 mismatches relative to SEQ ID NO:2.
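The percent-identity and mismatch criteria above can be made concrete with a small ungapped comparison. This is a minimal sketch: the sequences below are placeholders (the actual SEQ ID NO sequences are not reproduced here), the helper names are hypothetical, and comparing equal-length strings position by position is a simplifying assumption, since identity is often computed from a gapped alignment.

```python
def count_mismatches(a: str, b: str) -> int:
    """Count positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("ungapped comparison requires equal lengths")
    return sum(x != y for x, y in zip(a.upper(), b.upper()))


def percent_identity(a: str, b: str) -> float:
    """Percentage of identical positions between equal-length sequences."""
    return 100.0 * (len(a) - count_mismatches(a, b)) / len(a)


def qualifies(candidate: str, reference: str,
              min_identity: float = 80.0, max_mismatches: int = 5) -> bool:
    """True if the candidate meets either criterion relative to the reference."""
    return (count_mismatches(candidate, reference) <= max_mismatches
            or percent_identity(candidate, reference) >= min_identity)


# Placeholder 20-nt sequences, not taken from the sequence listing.
reference = "ACGTACGT" + "ACGTACGT" + "ACGT"
candidate = "ACGTACGA" + "ACGTACGT" + "ACGA"
print(count_mismatches(candidate, reference), percent_identity(candidate, reference))
```

The thresholds are parameters because the text lists several alternative identity levels (80% to 99%) and mismatch limits (1 to 5).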
- specific cDNAs are enriched by hybridization-based methods, e.g., by gene capture using a panel specific to the cDNAs of interest.
- such methods can be used to restrict the analysis to a subset of the whole transcriptome related to the process in question, rather than on the whole transcriptome itself.
- FIGs. 7A-7C illustrate an exemplary embodiment of preparing TCR libraries for sequencing.
- a PCR reaction amplifies a subset of the cDNA from the whole transcriptome that contains V(D)J segments in the CDR3 repertoire of the T cell. This reaction also adds an Illumina adaptor.
- a final PCR amplifies the amplification product from the first amplification and appends the fourth DNA barcodes, UDIs, and P5 and P7 adaptors.
- FIGs. 6A-6C illustrate an exemplary embodiment of preparing CRISPR libraries for sequencing.
- a PCR reaction enriches the sgRNA containing cDNA. This reaction also adds an Illumina adaptor.
- a final PCR amplifies the CRISPR PCR product and appends the fourth DNA barcodes as UDI, and the P5 and P7 adaptors.
- sequencing reads from both libraries are grouped by cell barcodes (e.g., RT primer barcodes, nucleic acid tag barcodes, UDI barcodes, and combinations thereof).
- Each barcode combination should correspond to the cDNA from a single cell.
- only reads with valid barcodes are retained.
- the sequencing reads with each barcode combination can be aligned to a reference genome, e.g., to a reference human genome. Multiple reads with the same random identifier sequence are counted as a single read.
- reads with random identifier sequences with two or fewer mismatches are assumed to be generated by sequencing errors and are counted as a single read.
- transcripts represented by one read only are filtered out and not included in subsequent analysis of the libraries.
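The read-processing steps above (keep valid barcodes, collapse near-identical random identifiers, drop single-read transcripts) can be sketched as follows. This is an illustration under stated assumptions, not the analysis pipeline itself: reads are assumed to be already aligned and represented as `(cell_barcode, gene, identifier)` tuples, and the greedy merge into the most abundant identifier is one possible way to apply the two-mismatch rule.

```python
from collections import defaultdict


def hamming(a: str, b: str) -> int:
    """Number of differing positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def collapse_identifiers(counts, max_mismatches=2):
    """Greedily fold identifiers within max_mismatches of a more abundant one."""
    merged = {}
    # Most abundant first, so that rarer error variants fold into them.
    for ident, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        for kept in merged:
            if len(kept) == len(ident) and hamming(kept, ident) <= max_mismatches:
                merged[kept] += n
                break
        else:
            merged[ident] = n
    return merged


def count_transcripts(reads, valid_barcodes, min_reads=2):
    """Return {cell_barcode: {gene: number of transcripts}}.

    A transcript is one collapsed identifier; transcripts supported by
    fewer than min_reads reads are discarded.
    """
    raw = defaultdict(lambda: defaultdict(int))
    for cell, gene, ident in reads:
        if cell in valid_barcodes:          # retain valid barcodes only
            raw[(cell, gene)][ident] += 1
    result = defaultdict(dict)
    for (cell, gene), counts in raw.items():
        kept = [i for i, n in collapse_identifiers(counts).items() if n >= min_reads]
        if kept:
            result[cell][gene] = len(kept)
    return dict(result)
```

With `min_reads=2`, a transcript seen in exactly one read is removed, matching the single-read filter described above.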
- sequence reads from each library comprising the same barcode sequence combinations are then associated to correlate the expression of, e.g., a given gRNA with the whole transcriptome in the same cell or nucleus.
- specific other targets can be assessed as well, e.g., the genes specifically targeted by the different gRNAs, e.g., as a control to confirm an expected effect of a given gRNA on the expression of its target.
- Such controls can provide further support for the association between specific genetic perturbations or expression events and complex phenotypes such as altered transcriptional profiles.
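Associating the two libraries by shared barcode combination amounts to a join on the cell key. The sketch below assumes each library has already been reduced to a dictionary keyed by a barcode-combination tuple; the key layout, the field names, and the example gRNA and gene labels are hypothetical. It also assumes that the sublibrary-level index has been resolved to a common sublibrary label, since the two libraries may carry different index sequences.

```python
def associate_libraries(wt_counts, target_counts):
    """Pair whole-transcriptome and target data for cells present in both.

    wt_counts:     {barcode_combination: {gene: transcript count}}
    target_counts: {barcode_combination: {target (e.g., gRNA): count}}
    """
    shared = wt_counts.keys() & target_counts.keys()
    return {
        bc: {"wt": wt_counts[bc], "target": target_counts[bc]}
        for bc in sorted(shared)
    }


# Hypothetical example: key = (round 1, round 2, round 3, sublibrary).
wt = {
    ("A1", "B7", "C3", "sub1"): {"GENE_X": 12},
    ("A2", "B7", "C3", "sub1"): {"GENE_X": 4},
}
target = {
    ("A1", "B7", "C3", "sub1"): {"gRNA_1": 9},
    ("A9", "B1", "C2", "sub1"): {"gRNA_2": 5},
}
paired = associate_libraries(wt, target)
print(paired)
```

Only cells detected in both libraries are returned; whether to keep unmatched cells is a design choice left to the analysis.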
- kits for preparing and sequencing enriched sequencing libraries e.g., sequencing libraries for performing whole transcriptome sequencing together with enriched sequencing of a target transcript or transcripts of interest according to the herein-disclosed methods.
- the kit may comprise at least one reverse transcription (RT) primer, e.g., an RT primer as disclosed herein comprising an RT barcode, a sequence such as a poly(dT) sequence, random sequence, or a target sequence, and/or a 5' overhang sequence.
- the kit may also comprise a plurality of nucleic acid tags, e.g., nucleic acid tags with well-specific barcodes. Each first nucleic acid tag may comprise a first strand.
- the first strand may include a barcode sequence, flanked by a 3’ and/or 5’ sequence located 3’ or 5’ of the barcode.
- the first strand comprises a 3' hybridization sequence extending from a 3' end of a first labeling sequence and/or a 5' hybridization sequence extending from a 5' end of the first labeling sequence.
- Each first nucleic acid tag may further comprise a second strand.
- the kit may also comprise one or more linker strands as described herein, and/or one or more stop oligos according to the present disclosure.
- the nucleic acid tags are pre-complexed with their corresponding linker strands in the kit.
- the kit may comprise one or more sets of nucleic acid tags, e.g., sets of nucleic acid tags configured to be used in a given round of ligation-based tagging.
- each tag in a given set of nucleic acid tags may comprise the same 5’ and/or 3’ hybridization sequence, with the 5’ and/or 3’ hybridization sequences differing from the 5’ and/or 3’ hybridization sequences used in other sets of tags.
- Each set of tags may also comprise a plurality of distinct barcode sequences, e.g., 96 different barcode sequences for distinctively labeling cDNAs present in cells or nuclei within each well of a 96-well plate.
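To give a sense of scale for sets of 96 barcodes, the following sketch computes the number of barcode combinations after several rounds and a simple estimate of how often two cells would share a combination. Three rounds of 96 barcodes is used as an example configuration; the collision estimate assumes every cell receives each barcode uniformly and independently, which is an idealization.

```python
def barcode_combinations(barcodes_per_round):
    """Total number of distinct barcode combinations across rounds."""
    total = 1
    for n in barcodes_per_round:
        total *= n
    return total


def expected_collision_fraction(n_cells: int, n_combinations: int) -> float:
    """Probability that a given cell shares its combination with another cell.

    Assumes uniform, independent assignment of combinations to cells.
    """
    return 1.0 - (1.0 - 1.0 / n_combinations) ** (n_cells - 1)


combos = barcode_combinations([96, 96, 96])
print(combos)                                   # 884736
print(expected_collision_fraction(10_000, combos))
```

Adding a further sublibrary-specific index multiplies the combination count by the number of sublibraries, which lowers the collision estimate accordingly.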
- the different sets of nucleic acid tags may also differ with respect to the presence or absence of additional elements such as capture agent (e.g., biotin), a random sequence, and/or an adapter sequence such as an NGS adapter sequence.
- the kit may further comprise at least one of a reverse transcriptase, a fixation agent, a permeabilization agent, a ligation agent, and/or a lysis agent.
- the kit comprises primers for amplifying cDNA molecules according to the present disclosure.
- the kit comprises primers for amplifying transcripts being enriched according to the present disclosure, e.g., a U6 primer for amplifying gRNAs according to the present methods.
- the kit comprises a second primer for amplifying transcripts being enriched, e.g., for a second round of amplification as described herein.
- the kit comprises one or more universal primers for amplifying the whole transcriptome, and/or for amplifying target transcripts when combined with a target transcript specific primer (e.g., a U6 primer).
- the kit comprises one or more primers comprising any of the nucleotide sequences shown as SEQ ID NOS: 1-302.
- the kit comprises one or more reaction vessels or containers for any one or more of the herein-described compositions or methods.
- the kit comprises one or more multi-well plates such as 96-well plates.
- the kit comprises one or more multi-well plates pre-loaded with barcoded RT primers, with nucleic acid tags, or with indexed primers (e.g. UDI primers) according to the present disclosure.
- FIG. 8A shows the percentage of cells with gene detected in HEK293 or NIH/3T3 cells in Whole Transcriptome vs. Focal libraries.
- Whole Transcriptome libraries were sequenced at 10,000 reads/cell.
- Focal libraries were sequenced at 250 reads/cell.
- FIG. 8B shows the number of unique transcripts captured in HEK293 or NIH/3T3 cells in Whole Transcriptome vs. Focal libraries.
- Whole Transcriptome libraries were sequenced at 10,000 reads/cell.
- Focal libraries were sequenced at 250 reads/cell.
- Example 2. Enrichment of low to medium expressing genes in 12k and 62k cell human and mouse sublibraries.
- An enrichment protocol comprising a spike-in preamplification step as described herein was performed on two low- to medium-expressing genes, one in humans (KDELR1) and one in mice (Psmd2), and the level of enrichment (as measured by the percentage of cells comprising the gene) was determined in 12k and 62k cell sublibraries (FIGS. 9A-9B).
- FIGS. 9A-9B illustrate enrichment of moderately expressed genes in human and mouse sublibraries, and improvement in purity following application of a 1 read count threshold filter.
- FIG. 9A shows that two low- to medium-expressing genes were enriched using the herein-disclosed methods in humans (KDELR1) and mice (Psmd2); the level of enrichment (as measured by the percentage of cells comprising the gene or by reads per cell) was determined in 12k and 62k cell sublibraries.
- FIG. 9B shows results for sequencing libraries prepared according to the herein-described methods using two sets of target genes (Psmd2-KDELR1 or Fn1-RPL5), with 10 ng or 50 ng of preamplified cDNA introduced into the first round of target sequence specific amplification performed subsequent to the multiplex “preamplification” round of amplification.
- the number of amplification cycles performed in this round of amplification was varied from 13-21 cycles.
- a substantial increase in purity was observed when a 1-read filter was applied relative to the purity in the absence of a filter. In this experiment, the purity did not increase substantially with more stringent filters (i.e., filters requiring higher numbers of reads per transcript).
- Sequencing libraries were prepared according to the herein-described methods, using either Psmd2 with KDELR1 or Fn1 with RPL5 as target genes, and varying the amounts of preamplified cDNA (e.g., 10 ng or 50 ng) introduced into the first additional round of amplification performed during the preparation of the target-specific sequencing library (e.g., the first round of amplification subsequent to the multiplex preamplification step, analogous to the “CRISPR PCR” shown in FIG. 6C). The number of amplification cycles performed in this round of amplification was also varied (e.g., from 13-21 cycles).
- Example 4 Spiking in target primers during WT cDNA amplification improves the yield and enrichment of target cDNA molecules in a combinatorial barcoding-based multiplex scRNA-seq protocol
- cells were fixed and permeabilized, and cDNA was generated by reverse transcription (RT) of RNA within the cells using RT primers with well-specific barcodes.
- the cDNA molecules were tagged within the cells by appending nucleic acid tags (comprising barcodes) to the cDNA via ligation, with the nucleic acid tags appended in the final round also comprising biotin and an adapter sequence.
- the cells were lysed, and the tagged cDNA molecules isolated from the lysate using streptavidin-coated magnetic beads.
- Second strand synthesis was carried out using a template switching oligonucleotide (TSO) comprising an adapter sequence, and cDNA was amplified with generic WT primers (i.e., primers binding to the TSO adapter sequence and to a TruSeq R2 sequence (see, e.g., FIG. 1A)), or with generic WT primers with gene-specific primers spiked in. Primers specific to the RPL5 and Fn1 genes were used. The target primers used are shown in Table 1, and PCR cycling times and conditions are shown in Table 2.
- the amplified cDNA was used in an additional round of PCR performed using gene-specific primers in order to enrich the target genes.
- the enriched cDNA product was analyzed by Tapestation (FIG. 10).
- Lane A2 contains enriched cDNA from a sample in which no gene-specific primers were spiked in during the initial round of cDNA amplification
- lane C2 contains cDNA from a sample in which primers were spiked-in. The results show strong enrichment in the sample with spiked-in primers (lane C2) relative to the sample with no spike-in (lane A2).
- Sequencing reads obtained from libraries prepared from the different samples were analyzed to determine the fraction of reads with valid barcodes that mapped to either of the targeted genes (RPL5 or Fn1) (FIG. 11).
- Sublibrary S3, which was prepared from a sample in which target-specific primers were spiked in during cDNA amplification, showed a significantly higher fraction of mapped reads than sublibrary S1, which was prepared from a sample without spiked-in primers.
- the sequencing reads were analyzed to determine the number of target gene transcripts (i.e., RPL5 or Fn1 transcripts) detected per cell (FIG. 12).
- Sublibrary S3, which was prepared from a sample in which target-specific primers were spiked in during cDNA amplification, showed a much higher number of RPL5 or Fn1 transcripts detected per cell than sublibrary S1, which was prepared from a sample without spiked-in primers.
- the sequencing reads were analyzed to determine the fraction of cells that contained an enriched transcript (i.e., an RPL5 or Fn1 transcript), either without any filtering (FIG. 13A) or following a filtering step to only consider transcripts represented by more than 2 reads (FIG. 13B).
- the fraction of cells with a target gene transcript was higher in Sublibrary S3 (with spiked-in primers) than in S1 (no spiked-in primers) with or without filtering.
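The "fraction of cells with a target transcript" metric, with and without a read-count filter, can be sketched as below. The input layout (a list of per-transcript read counts for each cell) and the function name are assumptions made for illustration; "more than 2 reads" is expressed as `min_reads=3`.

```python
def fraction_cells_with_target(cells, min_reads=1):
    """Fraction of cells with at least one target transcript passing the filter.

    cells: {cell_id: [read count for each target transcript in that cell]}
    min_reads: minimum reads a transcript needs in order to be counted.
    """
    if not cells:
        return 0.0
    positive = sum(
        1 for reads in cells.values() if any(r >= min_reads for r in reads)
    )
    return positive / len(cells)


# Hypothetical per-cell read counts for target transcripts.
example = {"a": [1], "b": [3, 1], "c": [], "d": [2]}
print(fraction_cells_with_target(example))               # no filtering
print(fraction_cells_with_target(example, min_reads=3))  # more than 2 reads
```

Comparing the two values for a spiked-in and a non-spiked-in sublibrary gives the kind of comparison described for the two sublibraries above.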
- Example 5 Using combinatorial barcoding to simultaneously profile the transcriptome and immune repertoire of 1 million T cells.
- Isolated T cells from healthy donor PBMCs were directly profiled, and showed high levels of TCR chain detection, both for all cells (FIG. 13A) and for each of the 8 donors (FIG. 13B). Further, a number of unique alpha and beta chain clonotypes were detected across donors (FIGS. 14A-14B), including nearly four hundred thousand unique alpha chain clonotypes and five hundred thousand unique beta chain clonotypes identified across the 8 donors, with the vast majority being classified as rare clonotypes. The results are shown in FIGs. 13A-13B and 14A-14B. FIGS. 13A-13B show high TCR detection in primary T cells, indicating sensitive clonotype detection.
- FIG. 13A shows a high TCR chain identification rate. Isolated T cells from healthy donor PBMCs were directly profiled (Primary). Alpha, Beta, and Paired detection are represented as percentages.
- FIG. 13B shows TCR chain assignment across 8 donors. High rate of chain assignments to both TCR alpha and beta was observed. Among T cells with a detected TCR, paired alpha beta chain assignments ranged between 49%-66%.
- FIGS. 14A-14B show comprehensive Immune Repertoire Detection measured by number of unique alpha and beta chain clonotypes across donors. Nearly four hundred thousand (~400,000) unique alpha chain clonotypes and five hundred thousand (~500,000) unique beta chain clonotypes were identified across the 8 donors. The rare clonotypes (darker color, lower shaded) are defined as only being detected in 1 or 2 cells, and the majority of detected clonotypes are rare.
- FIG. 14A Unique Alpha Chain.
- FIG. 14B Unique Beta Chain.
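The clonotype summary described for FIGS. 14A-14B (unique clonotypes, with rare clonotypes defined as those detected in 1 or 2 cells) can be sketched as follows. The mapping from cell to clonotype label, the placeholder labels, and the term "expanded" for non-rare clonotypes are assumptions made for this illustration.

```python
from collections import Counter


def summarize_clonotypes(cell_to_clonotype, rare_max_cells=2):
    """Count unique clonotypes and classify them by number of cells.

    cell_to_clonotype: {cell_id: clonotype label} for one chain (alpha or beta).
    rare_max_cells: a clonotype seen in at most this many cells is "rare".
    """
    sizes = Counter(cell_to_clonotype.values())
    rare = sum(1 for n in sizes.values() if n <= rare_max_cells)
    return {
        "unique": len(sizes),
        "rare": rare,
        "expanded": len(sizes) - rare,
    }


# Placeholder clonotype labels, not real CDR3 sequences from the study.
cells = {
    "c1": "clone_A", "c2": "clone_A", "c3": "clone_A",
    "c4": "clone_B", "c5": "clone_B",
    "c6": "clone_C",
}
print(summarize_clonotypes(cells))
```

Running the summary per donor and per chain would give the per-donor breakdown shown in the figures.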
- WT and TCR sequencing libraries were prepared with or without the inclusion of TCR-specific preamplification primers (i.e., with or without the “spiking-in” of TCR-specific primers during preamplification).
- In the libraries prepared without spike-in, no TCR-specific primers were included during the preamplification step (i.e., only non-specific primers capable of amplifying the whole transcriptome were included).
- these are generic primers (e.g., specific to adapter sequences) introduced to all cDNA molecules via the template switching oligonucleotide (TSO) or the last-added nucleic acid tag.
- primers specific to CDR3 sequences within the variable regions of both TCR alpha and beta chains were included (or “spiked-in”) to the amplification reaction.
- FIGS. 16A-16D show increased detection of TCR alpha and beta chains with spiking in of TCR-specific primers during the first multiplex cDNA amplification step (i.e., the “preamplification” step).
- Whole transcriptome (WT) and TCR-specific sequencing libraries were prepared from activated and resting T cells with or without spiked-in primers, and the percentages of cells with detected alpha (FIGs. 16A and 16B) and/or beta (FIGs. 16C and 16D) chains were determined.
- FIG. 16A shows percentages of activated T cells with detected alpha chain or with no detected chain, in libraries prepared with or without TCR-specific preamplification primers.
- FIG. 16B shows percentages of resting T cells with detected alpha chain or with no detected chain, in libraries prepared with or without TCR-specific preamplification primers.
- FIG. 16C shows percentages of activated T cells with detected beta chain or with no detected chain, in libraries prepared with or without TCR-specific preamplification primers.
- FIG. 16D shows percentages of resting T cells with detected beta chain or with no detected chain, in libraries prepared with or without TCR-specific preamplification primers.
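The detection percentages reported in these figures can be computed from per-cell chain calls. In this minimal sketch each cell is represented as a `(has_alpha, has_beta)` pair of booleans; that representation and the key names in the returned dictionary are assumptions for illustration.

```python
def chain_detection_percentages(cells):
    """Percentages of cells with detected TCR chains.

    cells: iterable of (has_alpha, has_beta) booleans, one pair per cell.
    """
    cells = list(cells)
    if not cells:
        raise ValueError("at least one cell is required")
    n = len(cells)
    alpha = sum(1 for a, _ in cells if a)
    beta = sum(1 for _, b in cells if b)
    paired = sum(1 for a, b in cells if a and b)
    detected = sum(1 for a, b in cells if a or b)
    return {
        "alpha": 100.0 * alpha / n,
        "beta": 100.0 * beta / n,
        "paired": 100.0 * paired / n,
        "no_chain": 100.0 * (n - detected) / n,
        # Paired fraction among cells with any detected chain.
        "paired_among_detected": 100.0 * paired / detected if detected else 0.0,
    }


calls = [(True, True), (True, False), (False, True), (False, False)]
print(chain_detection_percentages(calls))
```

Computing the same dictionary for libraries prepared with and without TCR-specific preamplification primers gives the comparison plotted in each panel.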
- Example 7 Exemplary primers for use in embodiments of the present methods.
- This example provides exemplary multiplex preamplification primers for use in embodiments of the present methods, such as CRISPR, shown in Tables 3 and 4.
- Example 8 Exemplary protocols for the multiplex labeling of the whole transcriptome and target sequences of interest
- CRISPR Detect enables analysis of single guide RNAs (sgRNAs) in studies using CROP-seq or similar methods. Compatible methods generate a polyadenylated transcript containing the sgRNA sequence downstream of a human U6 promoter.
- CRISPR Detect is combined with Evercode WT Mini v2
- paired sgRNA detection and whole transcriptome expression can be analyzed in up to 10,000 cells across up to 12 different biological samples or experimental conditions.
- Evercode Cell Fixation kits convert the cells into individualized reaction compartments, thus avoiding the requirement for dedicated microfluidics hardware. Through three rounds of barcoding, the transcriptome of each fixed cell is uniquely labeled.
- pooled cells are randomly distributed into different wells, and transcripts are labeled with well-specific barcodes.
- Barcoded transcripts are amplified during cDNA amplification, and sgRNA-containing polyadenylated transcripts are enriched with human U6-specific primers.
- Amplified cDNA is split to create Whole Transcriptome and CRISPR sequencing libraries. After sequencing, the Parse Biosciences Analysis Pipeline assigns reads that share the same four-barcode combination to a single cell and associates sgRNAs to the appropriate cell.
- In situ cell barcoding
- FIGS. 3A-3D provide an overview of in situ cell barcoding steps (i.e., combinatorial barcoding, or split-pool labeling) according to embodiments of the present methods.
- FIGS. 4A-4D provide an overview of cDNA capture and amplification steps for CRISPR screens such as CROP-seq or similar methods.
- the pooled cells are added to the Ligation Master Mix, which is loaded into the Round 2 Plate.
- An in situ ligation reaction adds a well-specific barcode to the 3’ end of the cDNA.
- the ligation reaction is quenched with Round 2 Stop Mix, and the cells are pooled and strained.
- Using a P1000 pipette, add 2 mL of sample in Resuspension Buffer into the Ligation Master Mix. Mix thoroughly by pipetting 10x with a P1000 set to 1000 µL. Store on ice.
- Round 3 Ligation Enzyme is added to the pooled cells, which are then loaded into the Round 3 Plate.
- a second in situ ligation reaction adds a third well-specific barcode, the Illumina Truseq R2 sequence, and a biotin.
- the sample is then pooled and strained.
- the cell pool is centrifuged, washed, and resuspended in Dilution Buffer.
- the cells are counted and divided into sublibraries. These sublibraries are lysed and stored at -80°C.
- In the Binder Bead Preparation step, streptavidin-coated Binder Beads are washed.
- the barcoded cDNA is captured with streptavidin-coated magnetic Binder Beads and washed to remove cellular debris.
- To capture the cDNA: Remove the desired tube(s) of lysate from -80°C. Incubate the tube(s) in a heat block or thermocycler at 37°C for 5 minutes. Briefly centrifuge and store at room temperature. Briefly centrifuge Lysis Neutralizer and gently mix by pipetting 2x with a P20 set to 15 µL. Add 2.5 µL of Lysis Neutralizer to each tube of lysate. Mix 5x with a P200 pipette set to 40 µL. Briefly centrifuge. Incubate for 10 minutes at room temperature.
- Binder Beads by pipetting 3x. Add 50 µL of Binder Beads to each tube of lysate. Fully mix by pipetting 5x with a P200 set to 90 µL. Place the tube(s) into a 96 well PCR tube rack, press to secure, and ensure the caps are secured tightly. Place the lid on the rack. Place the rack onto a vortex mixer with a plate adaptor. Push to secure. Vortex on 20% power (~800-1000 RPM) for 60 minutes at room temperature. Remove the tube(s) from the vortex mixer with a plate adaptor. Briefly vortex the tube(s) on a standard vortex adaptor.
- a template switching reaction is added to the captured cDNA.
- This template switching reaction adds a 5’ adaptor to the cDNA.
- To template switch: Prepare the Template Switch Master Mix in a new 1.5 mL tube. Mix by pipetting 10x and store on ice. Place each tube of captured cDNA on the high position of the magnetic rack. Incubate until the solution clears (~2 minutes). While still on the magnetic rack, remove and discard the supernatant. While still on the magnetic rack, add 125 µL of Bind Buffer C to each tube. While still on the magnetic rack, remove and discard the Bind Buffer C. Remove the tube(s) from the magnetic rack.
- cDNA Amplification (“multiplex cDNA amplification,” or “preamplification”)
- the captured cDNA is washed and amplified with TS- and WT preamplification primers (e.g., Illumina Truseq R2- specific primers, or compatible equivalent thereof).
- the target cDNA is enriched using target pairs of primers (target specific preamplification primers).
- sgRNA transcripts are enriched with Human U6 primers.
- While still on the magnetic rack, add 125 µL of Bind Buffer C to each tube. Incubate for 1 minute at room temperature. While still on the magnetic rack, remove and discard the Bind Buffer C. Remove tube(s) from the magnetic rack. Fully resuspend each bead pellet with 100 µL of the Amplification Reaction Master Mix. Store on ice. Determine the number of PCR cycles required for cDNA amplification based on the recommendations in Table 11. Although these recommendations are appropriate for many cell types, the number of cycles may need to be optimized for your sample type.
- Amplified cDNA is purified with a 0.8x SPRI bead cleanup.
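The arithmetic behind a "0.8x" cleanup can be sketched as follows, assuming the usual convention that the ratio is the volume of SPRI beads divided by the sample volume. The function name and the 50 µL sample volume are illustrative assumptions, not values taken from this protocol.

```python
def spri_bead_volume_ul(sample_volume_ul: float, ratio: float = 0.8) -> float:
    """Return the SPRI bead volume (in µL) for a single-sided cleanup.

    Assumes ratio = bead volume / sample volume.
    """
    if sample_volume_ul <= 0 or ratio <= 0:
        raise ValueError("sample volume and ratio must be positive")
    return round(sample_volume_ul * ratio, 2)


# Hypothetical 50 µL reaction cleaned up at 0.8x
print(spri_bead_volume_ul(50.0))
```

Lower ratios generally retain only longer fragments, so the ratio should be taken from the protocol rather than chosen freely.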
- the concentration and size distribution of the cDNA are measured with fluorescent dyes and capillary electrophoresis.
- the cDNA is then stored at 4°C for up to 48 hours or at -20°C for up to 3 months.
- To quantify the cDNA: Measure the concentration of each tube of purified cDNA with, e.g., the Qubit dsDNA HS (High Sensitivity) Assay Kit according to the manufacturer’s instructions. Record the concentration(s). Assess the size distribution of each tube of purified cDNA with a High Sensitivity DNA Kit on the Agilent Bioanalyzer System or High Sensitivity D5000 ScreenTape and Reagents on the Agilent TapeStation System according to the manufacturer’s instructions.
- Samples may need to be diluted to be within the manufacturer's recommended concentration range. Typically, a dilution between 1:3 and 1:10 is appropriate.
- Purified cDNA can be stored at 4°C for up to 48 hours or at -20°C for up to 3 months. Otherwise, proceed immediately to WT Sequencing Library Preparation.
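The 1:3 to 1:10 dilution used for quantification can be worked through with a short sketch. It assumes that "1:N" means 1 part sample in N total parts. Some labs instead read it as 1 part sample plus N parts diluent, so the convention should be confirmed before use. The function names are hypothetical.

```python
def dilution_volumes_ul(sample_ul: float, fold: float) -> tuple[float, float]:
    """Return (diluent_ul, total_ul) for a 1:fold dilution.

    Assumption: "1:fold" means 1 part sample in `fold` total parts.
    """
    if sample_ul <= 0 or fold < 1:
        raise ValueError("sample volume must be positive and fold >= 1")
    total_ul = sample_ul * fold
    return total_ul - sample_ul, total_ul


def undiluted_concentration(measured_ng_per_ul: float, fold: float) -> float:
    """Back-calculate the stock concentration from a diluted reading."""
    return measured_ng_per_ul * fold


# 1 µL of cDNA diluted 1:5 needs 4 µL of diluent (5 µL total)
print(dilution_volumes_ul(1.0, 5))
# A 2 ng/µL reading on that dilution implies a 10 ng/µL stock
print(undiluted_concentration(2.0, 5))
```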
- Vortex the Fragmentation Buffer for 5 seconds. Briefly centrifuge. Prepare the Fragmentation Master Mix in a new 1.5 mL tube. Mix by pipetting 10x and store on ice. Add 15 µL of Fragmentation Master Mix to each tube of diluted cDNA. Mix by pipetting 10x with a P200 multichannel pipette set to 40 µL. Briefly centrifuge. Place the tube(s) into a cooled thermocycler and perform the program shown in Table 13.
- At step 4 of the thermal cycling program, store the tube(s) on ice and proceed immediately to Post-A-tailing Size Selection.
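Because the Fragmentation Master Mix is dispensed at 15 µL per tube, the total volume to prepare scales with the number of tubes. The sketch below shows that scaling. The 10% overage for pipetting loss is an assumption, not a value from this protocol, and the function name is hypothetical.

```python
def master_mix_total_ul(n_tubes: int, per_tube_ul: float = 15.0,
                        overage: float = 0.10) -> float:
    """Total master mix volume (µL) for n_tubes reactions.

    per_tube_ul: volume dispensed per tube (15 µL in the step above).
    overage: fractional excess for pipetting loss (assumed 10%).
    """
    if n_tubes < 1:
        raise ValueError("n_tubes must be at least 1")
    if overage < 0:
        raise ValueError("overage cannot be negative")
    return round(n_tubes * per_tube_ul * (1 + overage), 1)


# Eight sublibraries with the assumed 10% excess
print(master_mix_total_ul(8))
```

Each component of the mix would be scaled by the same factor. The component volumes are not listed in this section, so they are not shown here.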
- Fragmented and A-tailed DNA is size selected with a double sided SPRI cleanup.
- Adaptors with an Illumina TruSeq R2 sequence are ligated to the 5’ end of the fragmented DNA.
- Adaptor ligated DNA is purified with a 0.8x SPRI bead cleanup.
- Purified adaptor ligated DNA is PCR amplified with Illumina TruSeq R1 and R2 primers. This indexing PCR generates sequencing libraries and adds i5/i7 UDIs that act as a fourth cell barcode.
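To illustrate how a sublibrary index can act as a fourth cell barcode, the sketch below treats a cell's identity as the combination of three earlier barcodes and the UDI. The barcode labels, the 96 barcodes per round, and the 8 sublibraries are hypothetical placeholders, not values stated in this section.

```python
from typing import NamedTuple, Sequence


class CellKey(NamedTuple):
    """Composite cell identifier: three round barcodes plus the UDI."""
    bc1: str
    bc2: str
    bc3: str
    udi: str  # i5/i7 index pair identifying the sublibrary


def barcode_space(round_sizes: Sequence[int], n_sublibraries: int) -> int:
    """Number of distinct composite barcodes available."""
    total = n_sublibraries
    for size in round_sizes:
        total *= size
    return total


# Same three round barcodes, different sublibrary: treated as different cells
a = CellKey("R1_01", "R2_17", "R3_42", "UDI_A")
b = CellKey("R1_01", "R2_17", "R3_42", "UDI_B")
print(a != b)

# Hypothetical 96 x 96 x 96 barcodes across 8 sublibraries
print(barcode_space((96, 96, 96), 8))
```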
- Safe stopping point: Sequencing libraries can be stored at 4°C for up to 18 hours.
- the sequencing libraries are size selected with a double sided SPRI cleanup. To size select the sequencing libraries: Gather freshly prepared 85% ethanol. Gather room temperature SPRI beads (~50 µL per sublibrary). Vortex the SPRI beads until fully mixed. Add 30 µL of SPRI beads to each sequencing library tube. Vortex the tube(s) for 5 seconds. Briefly centrifuge. Incubate for 5 minutes at room temperature. Place the tube(s) on the high position of the magnetic rack for 0.2 mL tubes. Incubate until the solution clears (~2 minutes). While still on the magnetic rack, transfer 75 µL of the supernatant containing the DNA into new 0.2 mL tube(s). Discard the tube(s) with bead pellet(s).
- the concentration and size distribution of the sequencing libraries are measured with fluorescent dyes and capillary electrophoresis. Libraries are then stored at 4°C for up to 48 hours or at -20°C for up to 3 months.
- To quantify the sequencing libraries: Measure the concentration of each purified sequencing library with, e.g., a Qubit dsDNA HS (High Sensitivity) Assay Kit according to the manufacturer’s instructions. Assess the size distribution of each purified sequencing library with a High Sensitivity DNA Kit on the Agilent Bioanalyzer System or High Sensitivity D1000 ScreenTape and Reagents on the Agilent TapeStation System according to the manufacturer’s instructions. Samples may need to be diluted to be within the manufacturer's recommended concentration range. Typically, a dilution between 1:3 and 1:10 is appropriate. Safe stopping point: Sequencing libraries can be stored at -20°C for up to 3 months.
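The two measurements above (mass concentration and average fragment size) are commonly combined into a molar concentration before loading a sequencer. The sketch below uses the widely used approximation of 660 g/mol per base pair of double-stranded DNA. That constant and the example values are assumptions, not figures from this protocol.

```python
AVG_BP_MASS_G_PER_MOL = 660.0  # approximate mass of one dsDNA base pair


def library_molarity_nm(conc_ng_per_ul: float, mean_size_bp: float) -> float:
    """Convert a dsDNA library concentration from ng/µL to nM.

    nM = (ng/µL) * 1e6 / (660 * mean fragment size in bp)
    """
    if conc_ng_per_ul < 0 or mean_size_bp <= 0:
        raise ValueError("concentration must be >= 0 and size must be > 0")
    return conc_ng_per_ul * 1e6 / (AVG_BP_MASS_G_PER_MOL * mean_size_bp)


# Hypothetical library: 10 ng/µL with a 400 bp average fragment size
print(round(library_molarity_nm(10.0, 400.0), 2))
```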
- Target Sequencing Library Preparation - Target Amplification 1 (e.g., CRISPR PCR)
- a PCR reaction amplifies a subset of the cDNA enriched for target transcripts, e.g., sgRNA transcripts.
- Target (e.g., gRNA, or sgRNA) enriched cDNA sublibraries are size selected with a double sided SPRI cleanup. To size select the sublibraries: Gather freshly prepared 85% ethanol. Gather room temperature SPRI beads (~45 µL per sublibrary). Vortex the SPRI beads until fully mixed. Add 30 µL of SPRI beads to each tube of target (e.g., CRISPR/gRNA) enriched cDNA. Vortex the tube(s) for 2-3 seconds. Briefly centrifuge. Incubate for 5 minutes at room temperature. Place the tube(s) on the high position of the magnetic rack for 0.2 mL tubes. Incubate until the solution clears (~2 minutes).
- the size-selected target enriched cDNA can be stored at 4°C for up to 2 days or -20°C for up to 3 months.
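The storage limits quoted at the safe stopping points can be tracked with a small helper. This is a sketch only: it approximates "3 months" as 90 days, which is an assumption, and the dictionary keys and function name are hypothetical.

```python
from datetime import datetime, timedelta

# Storage limits for size-selected target enriched cDNA, per the step above
STORAGE_LIMITS = {
    "4C": timedelta(days=2),
    "-20C": timedelta(days=90),  # assumption: "3 months" taken as 90 days
}


def use_by(stored_at: datetime, condition: str) -> datetime:
    """Return the latest time the stored material should be used."""
    if condition not in STORAGE_LIMITS:
        raise ValueError(f"unknown storage condition: {condition}")
    return stored_at + STORAGE_LIMITS[condition]


stored = datetime(2024, 1, 1, 9, 0)
print(use_by(stored, "4C"))
print(use_by(stored, "-20C"))
```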
- Target Amplification 2 (e.g., CRISPR Index PCR)
- a final PCR adds i5/i7 UDIs that act as a fourth cell barcode.
- Orient the UDI Plate - EC with the notch on the bottom left.
- Safe stopping point: target (e.g., CRISPR) sequencing libraries can be stored at 4°C for up to 18 hours.
- Post-Target Amplification 2 (e.g., CRISPR Index PCR) Size Selection
- Target (e.g., CRISPR) sequencing libraries are size selected with a double sided SPRI cleanup. To size select the sublibraries: Gather freshly prepared 85% ethanol. Gather room temperature SPRI beads (~45 µL per sublibrary). Vortex the SPRI beads until fully mixed. Add 30 µL of SPRI beads to each tube of target sequencing library. Vortex the tube(s) for 2-3 seconds. Briefly centrifuge. Incubate for 5 minutes at room temperature. Place the tube(s) on the high position of the magnetic rack for 0.2 mL tubes. Incubate until the solution clears (~2 minutes). While still on the magnetic rack, transfer 75 µL of the supernatant containing the target sequencing library into new 0.2 mL tube(s).
- the concentration and size distribution of target sequencing libraries are measured with fluorescent dyes and capillary electrophoresis. Libraries are then stored at -20°C for up to 3 months.
- To quantify the target sequencing libraries: Measure the concentration of each purified target sequencing library with, e.g., the Qubit dsDNA HS (High Sensitivity) Assay Kit according to the manufacturer’s instructions. Assess the size distribution of each purified sequencing library with a High Sensitivity DNA Kit on the Agilent Bioanalyzer System or High Sensitivity D5000 ScreenTape and Reagents on the Agilent TapeStation System according to the manufacturer’s instructions. Samples may need to be diluted to be within the manufacturer's recommended concentration range.
- Sequencing libraries can be stored at -20°C for up to 3 months.
- Sequencing libraries should be diluted and denatured according to the instructions for the relevant sequencing instrument.
- Target sequencing libraries can be sequenced together or separately from Whole Transcriptome libraries.
Abstract
The present disclosure relates generally to multiplex methods for cell-specific or nucleus-specific labeling of molecules such as nucleic acids. The present disclosure also relates to kits for multiplex cell-specific or nucleus-specific labeling of molecules. In some embodiments, the methods and kits relate to the parallel labeling of nucleic acids corresponding to target sequences of interest and to the whole transcriptome.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| EP24754033.9A EP4662310A2 (fr) | 2023-02-07 | 2024-02-07 | Methods and kits for labeling cellular molecules for multiplex analysis |
Applications Claiming Priority (6)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US202363483741P (US63/483,741) | 2023-02-07 | 2023-02-07 | |
| US202363471951P (US63/471,951) | 2023-06-08 | 2023-06-08 | |
| US202363614344P (US63/614,344) | 2023-12-22 | 2023-12-22 | |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| WO2024168092A2 (fr) | 2024-08-15 |
| WO2024168092A3 (fr) | 2024-09-26 |
Family
ID=92263480
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/US2024/014893 Ceased WO2024168092A2 (fr) | 2023-02-07 | 2024-02-07 | Methods and kits for labeling cellular molecules for multiplex analysis |
Country Status (2)
| Country | Link |
|---|---|
| EP (1) | EP4662310A2 (fr) |
| WO (1) | WO2024168092A2 (fr) |
Family Cites Families (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CA2874343C (fr) * | 2012-05-21 | 2021-11-09 | Fluidigm Corporation | Single-particle analysis of particle populations |
| WO2019184655A1 (fr) * | 2018-03-27 | 2019-10-03 | 苏州克睿基因生物科技有限公司 | Application of CRISPR/Cas system in gene editing |
| CN111247248A (zh) * | 2018-06-04 | 2020-06-05 | 伊鲁米纳公司 | High-throughput single-cell transcriptome libraries and methods of making and using them |
2024
- 2024-02-07 EP EP24754033.9A patent/EP4662310A2/fr active Pending
- 2024-02-07 WO PCT/US2024/014893 patent/WO2024168092A2/fr not_active Ceased
Also Published As
| Publication number | Publication date |
|---|---|
| EP4662310A2 (fr) | 2025-12-17 |
| WO2024168092A3 (fr) | 2024-09-26 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US11427856B2 (en) | | Methods and kits for labeling cellular molecules |
| US12234501B2 (en) | | In situ combinatorial labeling of cellular molecules |
| CN116949132A (zh) | | A method for constructing a single-cell sequencing library |
| WO2024168092A2 (fr) | | Methods and kits for labeling cellular molecules for multiplex analysis |
| WO2025179223A1 (fr) | | Novel methods and kits for combinatorial labeling of cellular molecules |
| WO2025137724A2 (fr) | | Methods and kits for single-cell transcriptome analysis and immune profiling |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 24754033; Country of ref document: EP; Kind code of ref document: A2 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 24754033; Country of ref document: EP; Kind code of ref document: A2 |
| | WWP | Wipo information: published in national office | Ref document number: 2024754033; Country of ref document: EP |