WO2025102355A1 - Methods and reagents for high-throughput single cell full length rna analysis - Google Patents
Methods and reagents for high-throughput single cell full length RNA analysis
- Publication number
- WO2025102355A1 (PCT/CN2023/132328)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- nucleic acids
- cdnas
- barcoded nucleic
- barcoded
- cell
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6806—Preparing nucleic acids for analysis, e.g. for polymerase chain reaction [PCR] assay
Definitions
- the present disclosure generally relates to molecular biology. More specifically, provided herein are methods, compositions, kits and systems for high-throughput single cell sequencing.
- the method comprises: partitioning a cell of a plurality of cells and a bead of a plurality of beads attached with a plurality of barcode oligonucleotides into a partition of a plurality of partitions; hybridizing the plurality of barcode oligonucleotides attached to the bead in the partition with the RNA targets associated with the cell in the partition; reverse transcribing the RNA targets hybridized to the barcode oligonucleotides to generate a first plurality of barcoded complementary deoxyribonucleic acids (cDNAs); obtaining a first portion and a second portion of the first plurality of barcoded cDNAs; circularizing each of the first portion of the first plurality of barcoded cDNAs to generate a plurality of circularized barcoded cDNAs; amplifying the plurality of circularized barcoded cDNAs to generate a second plurality of linear barcoded cDNAs; and analyzing the second portion of the first plurality of barcoded cDNAs and the second plurality of linear barcoded cDNAs, or products thereof.
- each barcode oligonucleotide of the plurality of barcode oligonucleotides comprises a cell barcode, a unique molecular identifier (UMI) and a probe sequence.
- the probe sequence is capable of binding to an RNA target associated with the cell.
- the first plurality of barcoded cDNAs comprises the barcode oligonucleotides and cDNAs corresponding to the RNA targets.
- the cDNAs corresponding to the RNA targets comprise one end attached to the UMI and the cell barcode and the other end.
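- the RNA target comprises a messenger RNA (mRNA).
- the method for single cell analysis comprises: partitioning a cell of a plurality of cells and a bead of a plurality of beads attached with a plurality of barcode oligonucleotides into a partition of a plurality of partitions; barcoding the nucleic acid targets associated with the cell in the partition to generate a first plurality of barcoded nucleic acids; obtaining a first portion and a second portion of the first plurality of barcoded nucleic acids; circularizing each of the first portion of the first plurality of barcoded nucleic acids to generate a plurality of circularized barcoded nucleic acids; amplifying the plurality of circularized barcoded nucleic acids to generate a second plurality of linear barcoded nucleic acids; and analyzing the second portion of the first plurality of barcoded nucleic acids and the second plurality of linear barcoded nucleic acids, or products thereof.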
- each barcode oligonucleotide of the plurality of barcode oligonucleotides comprises a cell barcode, a unique molecular identifier (UMI) and a probe sequence.
- the probe sequence is capable of binding to a nucleic acid target associated with the cell.
- the first plurality of barcoded nucleic acids comprises the barcode oligonucleotides and nucleotide sequences corresponding to the nucleic acid targets.
- the nucleotide sequences corresponding to the nucleic acid targets comprise one end attached to the UMI and the cell barcode and the other end.
- the nucleic acid targets can, e.g., comprise a ribonucleic acid (RNA), a messenger RNA (mRNA), and a deoxyribonucleic acid (DNA).
- the nucleic acid targets comprise nucleic acid targets of the cell, from the cell, in the cell, and/or on the surface of the cell.
- the partition can be a droplet or a microwell.
- the plurality of partitions comprises a plurality of microwells of a microwell array.
- the plurality of partitions comprises at least 1000 partitions.
- at least 50% of partitions of the plurality of partitions comprise a single cell of the plurality of cells and a single bead of the plurality of beads.
- at most 10% of partitions of the plurality of partitions comprise two or more cells of the plurality of cells.
- at most 10% of partitions of the plurality of partitions comprise no cell of the plurality of cells.
- at most 10% of partitions of the plurality of partitions comprise two or more beads of the plurality of beads.
- at most 10% of partitions of the plurality of partitions comprise no bead of the plurality of beads.
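The occupancy limits listed above can be checked against observed loading data. The following is a minimal sketch, not part of the disclosure: the function names, the `(n_cells, n_beads)` input format, and the choice to test all limits together are assumptions made for illustration, since the source states each limit as a separate embodiment.

```python
def occupancy_summary(partitions):
    """Return the fraction of partitions in each occupancy class.

    `partitions` is a list of (n_cells, n_beads) tuples, one per partition
    (a hypothetical input format chosen for this sketch).
    """
    total = len(partitions)
    if total == 0:
        raise ValueError("at least one partition is required")

    def fraction(predicate):
        return sum(1 for cells, beads in partitions if predicate(cells, beads)) / total

    return {
        "single_cell_single_bead": fraction(lambda c, b: c == 1 and b == 1),
        "multi_cell": fraction(lambda c, b: c >= 2),
        "no_cell": fraction(lambda c, b: c == 0),
        "multi_bead": fraction(lambda c, b: b >= 2),
        "no_bead": fraction(lambda c, b: b == 0),
    }


def meets_thresholds(summary, min_single=0.5, max_other=0.1):
    """Check the >= 50% and <= 10% limits from the text, applied jointly (an assumption)."""
    others = ("multi_cell", "no_cell", "multi_bead", "no_bead")
    return (summary["single_cell_single_bead"] >= min_single
            and all(summary[key] <= max_other for key in others))


# Toy example: 8 ideal partitions, 1 without a cell, 1 with two cells.
example = [(1, 1)] * 8 + [(0, 1), (2, 1)]
print(meets_thresholds(occupancy_summary(example)))
```

In practice the counts would come from imaging or from sequencing-based doublet estimates; the sketch only shows how the stated percentages translate into a pass/fail check.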
- the probe sequence can be, e.g., at least 10 nucleotides in length. In some embodiments, the probe sequence is not a poly-dT sequence. In some embodiments, a barcode oligonucleotide comprising such a probe sequence is capable of binding to a non-poly-A RNA target and/or nucleic acid target. In some embodiments, the barcode oligonucleotides comprising probe sequences that are not poly-dT sequences are capable of binding to an identical non-poly-A RNA target and/or nucleic acid target.
- the barcode oligonucleotides comprising probe sequences that are not poly-dT sequences are capable of binding to different non-poly-A RNA targets and/or nucleic acid targets.
- the probe sequence is a poly-dT sequence.
- the poly-dT sequence is at least 10 nucleotides in length.
- the poly-dT sequences of the barcode oligonucleotides attached to a bead of the plurality of beads are identical.
- the probe sequences of barcode oligonucleotides comprise a degenerate sequence. In some embodiments, the degenerate sequence is at least 3 nucleotides in length.
- the degenerate sequence spans, or corresponds to, a mutation.
- the probe sequences of barcode oligonucleotides span a region of interest. In some embodiments, the probe sequence is adjacent to a region of interest.
- the cell barcodes of two barcode oligonucleotides attached to a bead of the plurality of beads comprise an identical sequence. In some embodiments, the cell barcodes of two barcode oligonucleotides attached to two beads of the plurality of beads comprise different sequences. In some embodiments, the cell barcode of each barcode oligonucleotide is at least 6 nucleotides in length. In some embodiments, the UMIs of two barcode oligonucleotides attached to a bead of the plurality of beads comprise different sequences. In some embodiments, the UMIs of two barcode oligonucleotides attached to two beads of the plurality of beads comprise an identical sequence.
- the UMI of each barcode oligonucleotide is at least 6 nucleotides in length.
- the barcode oligonucleotide further comprises a first polymerase chain reaction (PCR) primer-binding sequence.
- the first PCR primer-binding sequence comprises a Read 1 sequence.
- the barcode oligonucleotide comprises, from the 5’ end to the 3’ end, (i) the cell barcode, the UMI, the PCR primer-binding sequence, and the probe sequence, or (ii) the UMI, the cell barcode, the PCR primer-binding sequence, and the probe sequence.
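The element order and minimum lengths described above can be illustrated with a short sketch. It is an assumption-laden example rather than the disclosed design: `make_barcode_oligo` is a hypothetical helper, the sequences are placeholders, and the length checks simply combine the "at least 6 nucleotides" (cell barcode, UMI) and "at least 10 nucleotides" (probe) embodiments.

```python
def make_barcode_oligo(cell_barcode, umi, primer_site, probe, umi_first=False):
    """Concatenate the elements 5' to 3' in one of the two orders given in the text."""
    elements = (("cell_barcode", cell_barcode), ("umi", umi),
                ("primer_site", primer_site), ("probe", probe))
    for name, seq in elements:
        # Only plain DNA letters (plus N for degenerate positions) are accepted here.
        if not seq or set(seq) - set("ACGTN"):
            raise ValueError(f"{name} must be a non-empty string over ACGTN")
    if len(cell_barcode) < 6 or len(umi) < 6:
        raise ValueError("cell barcode and UMI are assumed to be at least 6 nt each")
    if len(probe) < 10:
        raise ValueError("probe is assumed to be at least 10 nt")
    head = umi + cell_barcode if umi_first else cell_barcode + umi
    return head + primer_site + probe


# Placeholder sequences; a 20-nt poly-dT stands in for the probe.
oligo = make_barcode_oligo("ACGTAC", "GGTTCA", "CCCCAAAA", "T" * 20)
print(oligo)
```

Setting `umi_first=True` produces the second ordering, with the UMI at the 5' end.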
- the barcode oligonucleotides are reversibly attached to, covalently attached to, or irreversibly attached to the bead.
- the bead is a gel bead.
- the gel bead is degradable upon application of a stimulus.
- the stimulus comprises a thermal stimulus, a chemical stimulus, a biological stimulus, a photo-stimulus, or a combination thereof.
- the bead is a solid bead and/or a magnetic bead.
- barcoding the nucleic acid targets associated with the cell comprises: hybridizing the barcode oligonucleotides attached to the bead in each partition of the plurality of partitions with nucleic acid targets associated with the cell in the partition; extending the barcode oligonucleotides attached to the bead and hybridized to the nucleic acid targets using the nucleic acid targets as templates to generate single-stranded barcoded nucleic acids; and generating double-stranded barcoded nucleic acids from the single-stranded barcoded nucleic acids.
- generating double-stranded barcoded nucleic acids from the single-stranded barcoded nucleic acids comprises extending the single-stranded barcoded nucleic acids. In some embodiments, extending the single-stranded barcoded nucleic acids comprises extending the single-stranded barcoded nucleic acids using a template switching oligonucleotide.
- the method can further comprise pooling the beads prior to extending the barcode oligonucleotides or prior to generating the double-stranded barcoded nucleic acids.
- extending the barcode oligonucleotides attached to the bead and hybridized to the nucleic acid targets comprises extending the barcode oligonucleotides attached to the bead and hybridized to the nucleic acid targets in bulk.
- generating the double-stranded barcoded nucleic acids comprises generating the double-stranded barcoded nucleic acids from the single-stranded barcoded nucleic acids in bulk.
- the method can further comprise pooling the beads subsequent to extending the barcode oligonucleotides attached to the bead to generate the single-stranded barcoded nucleic acids or subsequent to generating the double-stranded barcoded nucleic acids.
- extending the barcode oligonucleotides attached to the bead and hybridized to the nucleic acid targets comprises extending the barcode oligonucleotides attached to the bead and hybridized to the nucleic acid targets in the partition.
- generating the double-stranded barcoded nucleic acids comprises generating the double-stranded barcoded nucleic acids from the single-stranded barcoded nucleic acids in the partition.
- circularizing each of the first portion of the first plurality of barcoded cDNAs comprises connecting the UMIs and the cell barcodes attached to one end of the cDNAs corresponding to the RNA targets to the other end of the cDNAs corresponding to the RNA targets. In some embodiments, circularizing each of the first portion of the first plurality of barcoded nucleic acids comprises connecting the UMIs and the cell barcodes attached to one end of the nucleotide sequences corresponding to the nucleic acid targets to the other end of the nucleotide sequences corresponding to the nucleic acid targets.
- circularizing each of the first portion of the first plurality of barcoded nucleic acids/cDNAs comprises: generating barcoded nucleic acid/cDNA comprising a first circle handle attached to one end of the first plurality of barcoded nucleic acid/cDNA and a second circle handle attached to the other end of the first plurality of barcoded nucleic acid/cDNA; and connecting the first circle handle and the second circle handle.
- the first circle handle and the second circle handle comprise an identical nucleotide sequence, an overlapping nucleotide sequence and/or a complementary nucleotide sequence.
- the identical nucleotide sequence, the overlapping nucleotide sequence and/or the complementary nucleotide sequence is at least 10 nucleotides in length and/or at most 150 nucleotides in length (e.g., about 40 nucleotides in length).
- connecting the first circle handle and the second circle handle comprises connecting the identical nucleotide sequences, the overlapping nucleotide sequences and/or the complementary nucleotide sequences of the first circle handle and the second circle handle.
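As a rough illustration of connecting circle handles, the sketch below models only the identical-sequence case: a linear molecule that carries the same handle at both ends is closed into a circle that keeps one copy of the handle. The function name, the string representation of a circle (the last base is taken to join the first), and the example sequences are assumptions, and the sketch says nothing about the enzymatic chemistry actually used.

```python
def circularize(linear, min_overlap=10, max_overlap=150):
    """Join the two ends of `linear` through an identical terminal sequence.

    Returns the circle as a string whose last base is understood to be
    joined to its first base. The 10 to 150 nt bounds follow the handle
    lengths mentioned in the text.
    """
    longest = min(max_overlap, len(linear) // 2)
    for k in range(longest, min_overlap - 1, -1):
        # Look for the longest prefix that is repeated as a suffix.
        if linear[:k] == linear[-k:]:
            return linear[:-k]  # keep a single copy of the shared handle
    raise ValueError("no identical terminal sequence of sufficient length")


handle = "ACGTACGTTGCA"        # placeholder 12-nt circle handle
insert = "CCCCGGGG"            # stands in for barcode + cDNA
circle = circularize(handle + insert + handle)
print(circle)
```

In the circle, the two ends of the original molecule are adjacent, which is what brings the cell barcode and UMI next to the far end of the cDNA.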
- amplifying the plurality of circularized barcoded nucleic acids/cDNAs comprises: hybridizing first linearization primers and second linearization primers to the plurality of circularized barcoded nucleic acids/cDNAs; and extending the first linearization primers and the second linearization primers using the plurality of circularized barcoded nucleic acids/cDNAs as templates.
- the first linearization primers and the second linearization primers hybridize to a sequence between 1) the one end of the nucleotide sequences corresponding to the nucleic acid targets or the cDNAs corresponding to the mRNA targets, and 2) the UMI and the cell barcode.
- the first linearization primers and the second linearization primers hybridize to a sequence comprising the first PCR primer-binding sequence of the barcode oligonucleotides.
- amplifying the plurality of circularized barcoded nucleic acids/cDNAs further comprises purifying the second plurality of linear barcoded nucleic acids/cDNAs.
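The effect of linearization primers can be pictured as re-opening the circle at a chosen site, so that the linear product is a rotation of the circular sequence. The sketch below assumes that picture (outward-facing primers within one binding site, as in inverse PCR); `linearize`, `cut_offset`, and the toy sequences are hypothetical names and values, not taken from the source.

```python
def linearize(circle, site, cut_offset=0):
    """Open a circular sequence at `cut_offset` bases into `site`.

    `circle` is a string whose last base joins its first base. The site is
    searched across the junction as well, and the returned linear product
    is the circle rotated to start at the cut point.
    """
    if not site or len(site) > len(circle):
        raise ValueError("site must be non-empty and no longer than the circle")
    index = (circle + circle).find(site)  # doubled string covers wrap-around matches
    if index == -1:
        raise ValueError("primer-binding site not found on the circle")
    cut = (index + cut_offset) % len(circle)
    return circle[cut:] + circle[:cut]


print(linearize("AAAACCCCGGGGTTTT", "CCGG", cut_offset=2))
```

Because the cut lies between the barcode region and the adjoining cDNA end, the linear product places the cell barcode and UMI next to the opposite end of the transcript from where they started.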
- analyzing the second portion of the first plurality of barcoded nucleic acids/cDNAs and the second plurality of linear barcoded nucleic acids/cDNAs comprises amplifying the second portion of the first plurality of barcoded nucleic acids/cDNAs to obtain an amplified second portion of the first plurality of barcoded nucleic acids/cDNAs and the second plurality of linear barcoded nucleic acids/cDNAs to obtain an amplified second plurality of linear barcoded nucleic acids/cDNAs.
- analyzing the second portion of the first plurality of barcoded nucleic acids/cDNAs and the second plurality of linear barcoded nucleic acids/cDNAs comprises pooling the second portion of the first plurality of barcoded nucleic acids/cDNAs and the second plurality of linear barcoded nucleic acids/cDNAs before amplifying the second portion of the first plurality of barcoded nucleic acids/cDNAs and the second plurality of linear barcoded nucleic acids/cDNAs.
- analyzing the second portion of the first plurality of barcoded nucleic acids/cDNAs and the second plurality of linear barcoded nucleic acids/cDNAs comprises processing the amplified second portion of the first plurality of barcoded nucleic acids/cDNAs and the amplified second plurality of linear barcoded nucleic acids/cDNAs to generate processed second portion of the first plurality of barcoded nucleic acids/cDNAs and processed second plurality of linear barcoded nucleic acids/cDNAs.
- processing the amplified second portion of the first plurality of barcoded nucleic acids/cDNAs and the amplified second plurality of linear barcoded nucleic acids/cDNAs comprises: fragmenting the amplified second portion of the first plurality of barcoded nucleic acids/cDNAs to generate fragmented second portion of the first plurality of barcoded nucleic acids/cDNAs and the amplified second plurality of linear barcoded nucleic acids/cDNAs to generate fragmented second plurality of linear barcoded nucleic acids/cDNAs; adding a second polymerase chain reaction (PCR) primer-binding sequence; and generating processed second portion of the first plurality of barcoded nucleic acids/cDNAs and processed second plurality of linear barcoded nucleic acids/cDNAs comprising sequencing primer sequences from the fragmented second portion of the first plurality of barcoded nucleic acids/cDNAs and fragmented second plurality of linear barcoded nucleic acids/cDNAs.
- fragmenting the amplified second portion of the first plurality of barcoded nucleic acids/cDNAs and the amplified second plurality of linear barcoded nucleic acids/cDNAs comprises fragmenting the amplified second portion of the first plurality of barcoded nucleic acids/cDNAs and the amplified second plurality of linear barcoded nucleic acids/cDNAs enzymatically.
- the second PCR primer-binding sequence comprises a Read 2 sequence.
- the sequencing primer sequences comprise a P5 sequence and a P7 sequence.
- analyzing the second portion of the first plurality of barcoded nucleic acids/cDNAs and the second plurality of linear barcoded nucleic acids/cDNAs further comprises pooling 1) the amplified second portion of the first plurality of barcoded nucleic acids/cDNAs and the amplified second plurality of linear barcoded nucleic acids/cDNAs; 2) the processed second portion of the first plurality of barcoded nucleic acids/cDNAs and the processed second plurality of linear barcoded nucleic acids/cDNAs; or 3) the fragmented second portion of the first plurality of barcoded nucleic acids/cDNAs and fragmented second plurality of linear barcoded nucleic acids/cDNAs.
- analyzing the second portion of the first plurality of barcoded nucleic acids/cDNAs and the second plurality of linear barcoded nucleic acids/cDNAs, or products thereof comprises sequencing the second portion of the first plurality of barcoded nucleic acids/cDNAs and the second plurality of linear barcoded nucleic acids/cDNAs, or products thereof to obtain sequencing information.
- sequencing the second portion of the first plurality of barcoded nucleic acids/cDNAs and the second plurality of linear barcoded nucleic acids/cDNAs, or products thereof comprises sequencing the processed second portion of the first plurality of barcoded nucleic acids/cDNAs and the processed second plurality of linear barcoded nucleic acids/cDNAs.
- sequencing the second portion of the first plurality of barcoded nucleic acids/cDNAs and the second plurality of linear barcoded nucleic acids/cDNAs, or products thereof comprises sequencing products of the second portion of the first plurality of barcoded nucleic acids/cDNAs and the second plurality of linear barcoded nucleic acids/cDNAs each comprising a P5 sequence, a Read 1 sequence, a cell barcode, a UMI, a poly-dT sequence, a probe sequence, a sequence of a nucleic acid target or a part thereof, a Read 2 sequence, a sample index, and/or a P7 sequence to obtain sequencing information.
- analyzing the second portion of the first plurality of barcoded nucleic acids/cDNAs and the second plurality of linear barcoded nucleic acids/cDNAs, or products thereof comprises obtaining the full-length sequences of the nucleic acid targets or the RNA targets by integrating the sequencing information of the second portion of the first plurality of barcoded nucleic acids/cDNAs and the second plurality of linear barcoded nucleic acids/cDNAs.
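One way to picture the integration step is to group reads from the two libraries by cell barcode and UMI and then merge their transcript coordinates. The source does not specify an integration algorithm, so the interval merging below is an assumed stand-in; the record layout `(cell_barcode, umi, start, end)` with half-open transcript coordinates is also hypothetical.

```python
from collections import defaultdict


def merge_intervals(intervals):
    """Merge overlapping or touching half-open intervals into sorted, disjoint ones."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(interval) for interval in merged]


def integrate_coverage(reads_3p, reads_5p):
    """Combine 3'-library and 5'-library reads that share a cell barcode and UMI."""
    by_molecule = defaultdict(list)
    for cell_barcode, umi, start, end in list(reads_3p) + list(reads_5p):
        by_molecule[(cell_barcode, umi)].append((start, end))
    return {key: merge_intervals(spans) for key, spans in by_molecule.items()}


reads_3p = [("CB1", "U1", 600, 1000), ("CB1", "U1", 900, 1200)]
reads_5p = [("CB1", "U1", 0, 400), ("CB1", "U1", 350, 650)]
print(integrate_coverage(reads_3p, reads_5p))
```

A molecule whose merged coverage collapses to a single interval spanning the transcript would be a candidate for full-length reconstruction; real pipelines would also need alignment and assembly, which are outside this sketch.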
- analyzing the second portion of the first plurality of barcoded nucleic acids/cDNAs and the second plurality of linear barcoded nucleic acids/cDNAs, or products thereof comprises analyzing the sequencing information. In some embodiments, analyzing the second portion of the first plurality of barcoded nucleic acids/cDNAs and the second plurality of linear barcoded nucleic acids/cDNAs, or products thereof comprises: determining an expression profile of each of the nucleic acid targets and/or the RNA targets using a number of UMIs with different sequences associated with the nucleic acid targets and/or the RNA targets in the sequencing information. In some embodiments, the expression profile comprises an absolute abundance or a relative abundance.
- the expression profile comprises an RNA expression profile, an mRNA expression profile and/or a protein expression profile.
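Counting UMIs with different sequences, as described above, can be sketched as follows. The record layout `(cell_barcode, target, umi)` and both function names are assumptions for illustration; UMI error correction, which real pipelines usually apply, is deliberately left out.

```python
from collections import defaultdict


def umi_counts(records):
    """Absolute abundance: number of distinct UMIs per (cell barcode, target)."""
    seen = defaultdict(set)
    for cell_barcode, target, umi in records:
        seen[(cell_barcode, target)].add(umi)
    return {key: len(umis) for key, umis in seen.items()}


def relative_abundance(counts):
    """Relative abundance: each target's UMI count divided by its cell's total."""
    totals = defaultdict(int)
    for (cell_barcode, _target), n in counts.items():
        totals[cell_barcode] += n
    return {key: n / totals[key[0]] for key, n in counts.items()}


records = [
    ("CB1", "GeneA", "AAAAAA"),
    ("CB1", "GeneA", "AAAAAA"),  # PCR duplicate: same UMI, counted once
    ("CB1", "GeneA", "CCCCCC"),
    ("CB1", "GeneB", "GGGGGG"),
    ("CB2", "GeneA", "AAAAAA"),
]
counts = umi_counts(records)
print(counts)
print(relative_abundance(counts))
```

Using a set per (cell, target) pair is what removes amplification duplicates, since reads that share a UMI are treated as copies of one original molecule.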
- analyzing the second portion of the first plurality of barcoded nucleic acids/cDNAs and the second plurality of linear barcoded nucleic acids/cDNAs, or products thereof comprises:
- the second portion of the first plurality of barcoded nucleic acids/cDNAs and the second plurality of linear barcoded nucleic acids/cDNAs are from at least 100 cells (e.g., at least 1,000 cells, or about 100 cells to about 50,000 cells).
- the method can further comprise releasing the nucleic acids from the cell prior to barcoding the nucleic acid targets associated with the cell.
- the method can further comprise lysing the cell to release the nucleic acid targets from the cell.
- reverse transcribing the RNA targets hybridized to the barcode oligonucleotides is performed without lysing or digesting the cells.
- compositions for single cell analysis comprising a plurality of beads disclosed herein.
- the cell barcodes of the plurality of barcode oligonucleotides attached to each of the plurality of beads are identical.
- the cell barcodes of barcode oligonucleotides attached to different beads of the plurality of beads are different.
- the plurality of beads comprises at least 100 beads.
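The bead composition described above (one cell barcode per bead, different barcodes on different beads) lends itself to a simple consistency check. This is a sketch under assumptions: `check_bead_barcodes` and the dictionary input mapping a bead identifier to the cell barcodes of its oligonucleotides are hypothetical, and the "at least 100 beads" embodiment is not enforced.

```python
def check_bead_barcodes(beads):
    """Return True if every bead carries one cell barcode and no two beads share it.

    `beads` maps a bead identifier to the list of cell barcodes read from
    the oligonucleotides attached to that bead.
    """
    barcode_of_bead = {}
    for bead_id, barcodes in beads.items():
        unique = set(barcodes)
        if len(unique) != 1:
            return False  # mixed (or missing) cell barcodes on a single bead
        barcode_of_bead[bead_id] = unique.pop()
    # Different beads must carry different cell barcodes.
    return len(set(barcode_of_bead.values())) == len(barcode_of_bead)


good = {"bead1": ["ACGTAC", "ACGTAC"], "bead2": ["TTGGCC", "TTGGCC"]}
clash = {"bead1": ["ACGTAC"], "bead2": ["ACGTAC"]}
print(check_bead_barcodes(good), check_bead_barcodes(clash))
```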
- kits for single cell analysis comprise: a composition disclosed herein; and instructions for using the composition for single cell sequencing or analysis.
- FIG. 1 depicts non-limiting exemplary embodiments and data related to workflow of high-throughput single-cell full-length sequencing.
- FIG. 2 depicts non-limiting exemplary embodiments and data related to workflow of 3’ and 5’ library preparation.
- FIG. 3 depicts non-limiting exemplary embodiments and data related to t-SNE clustering of 3T3 and CCRF cell lines.
- FIG. 4A-FIG. 4C depict non-limiting exemplary embodiments and data related to transcript coverage of 3’ transcripts (FIG. 4A), 5’ transcripts (FIG. 4B) and merged transcripts (FIG. 4C).
- a method for simultaneously detecting the gene expression of 3' and 5' ends of transcripts at a high-throughput single-cell level comprises capturing mRNA using magnetic beads with poly-T tails.
- the beads are barcoded with barcode oligonucleotides, and the barcode oligonucleotides comprise a cell barcode to distinguish individual cells, and a UMI used for transcript quantification.
- the mRNA is reverse transcribed and amplified by PCR to generate cDNA.
- a portion of the cDNA is used to construct a 3' end transcriptome library for quantifying 3' end gene expression, while another portion is used for circular amplification and construction of a 5' end transcriptome library for quantifying 5' end gene expression.
- transcriptome sequencing typically only provides gene expression levels and sequence information from one end (e.g., either 3' end or 5' end) of the gene, without differentiating between different transcripts.
- expression heterogeneity of transcripts within different cells can lead to significant functional differences and regulatory mechanisms.
- expression heterogeneity or “expression heterogeneity of transcripts” refers to the differences in gene expression between individual cells. Expression heterogeneity could be caused by mechanisms such as sequence alterations to the constructs during integration, chromatin changes imparted during integration, locus-mediated inhibition of expression, or insufficient chromatin insulation.
- One mechanism is alternative splicing, generating many transcript isoforms from a single gene.
- a classic example is the Drosophila sex-determination pathway, in which alternative splicing acts as a sex-specific genetic switch that forms the basis of a regulatory hierarchy.
- Alternative splicing is also implicated in human diseases.
- the neurodegenerative disease FTDP-17 has been associated with mutations that affect the alternative splicing of tau pre-mRNAs. Therefore, sequence information from both ends of transcripts would provide additional information, compared to sequence information obtained from only one end of transcripts.
- Simultaneous detection of 3' and 5' gene expression of RNA at the single-cell level disclosed herein enables differentiation and quantification of different transcripts. It also allows for the analysis of transcript isoforms, which is crucial for a comprehensive understanding of gene regulation and cellular functions.
- SMART-seq3 is a classic single-cell transcriptome sequencing method that can simultaneously obtain gene sequence information from both the 3' and 5' ends of RNA.
- individual libraries can be constructed for each well without the need for adding cell barcodes.
- Full-length sequencing of individual cell transcripts can then be achieved by assembling short-read sequencing data.
- this method has limitations such as low cell throughput, labor-intensive procedures, and high costs due to the need for independent library construction and sequencing for each cell.
- third-generation sequencing technologies such as PacBio and Nanopore have been increasingly applied.
- the combination of third-generation sequencing technologies and single-cell sequencing has expanded the application scenarios of single-cell sequencing.
- Long-read sequencing has made it possible to sequence the entire transcriptome of massive single cells.
- third-generation sequencing also has corresponding drawbacks.
- Nanopore has lower sequencing accuracy, which significantly affects the recognition of cell barcodes and UMIs in single cells, leading to low data utilization and low accuracy of the gene sequences obtained.
- PacBio has relatively higher accuracy, but is expensive.
- the limited number of zero-mode waveguides (ZMWs) in PacBio sequencing chips results in a lower number of transcripts sequenced and lower sequencing throughput, compared to Nanopore.
- Second-generation sequencing is currently the platform with the highest sequencing accuracy.
- Due to the short read lengths and the lack of suitable high-throughput single-cell library construction methods, there are significant technical barriers to achieving high-throughput full-length transcriptome sequencing using second-generation sequencing. Therefore, there is an urgent need to establish a high-throughput single-cell full-length library preparation and sequencing method.
- the method comprises: partitioning a cell of a plurality of cells and a bead of a plurality of beads attached with a plurality of barcode oligonucleotides into a partition of a plurality of partitions; hybridizing the plurality of barcode oligonucleotides attached to the bead in the partition with the RNA targets associated with the cell in the partition; reverse transcribing the RNA targets hybridized to the barcode oligonucleotides to generate a first plurality of barcoded complementary deoxyribonucleic acids (cDNAs); obtaining a first portion and a second portion of the first plurality of barcoded cDNAs; circularizing each of the first portion of the first plurality of barcoded cDNAs to generate a plurality of circularized barcoded cDNAs; amplifying the plurality of circularized barcoded cDNAs to generate a second plurality of linear barcoded cDNAs; and analyzing the second portion of the first plurality of barcoded cDNAs and the second plurality of linear barcoded cDNAs, or products thereof.
- each barcode oligonucleotide of the plurality of barcode oligonucleotides comprises a cell barcode, a unique molecular identifier (UMI) and a probe sequence.
- the probe sequence is capable of binding to an RNA target associated with the cell.
- the first plurality of barcoded cDNAs comprises the barcode oligonucleotides and cDNAs corresponding to the RNA targets.
- the cDNAs corresponding to the RNA targets comprise one end attached to the UMI and the cell barcode and the other end.
- the RNA target can comprise an mRNA. Reverse transcribing the RNA targets hybridized to the barcode oligonucleotides can be performed without lysing or digesting the cells.
- the method comprises: partitioning a cell of a plurality of cells and a bead of a plurality of beads attached with a plurality of barcode oligonucleotides into a partition of a plurality of partitions; barcoding the nucleic acid targets associated with the cell in the partition to generate a first plurality of barcoded nucleic acids; obtaining a first portion and a second portion of the first plurality of barcoded nucleic acids; circularizing each of the first portion of the first plurality of barcoded nucleic acids to generate a plurality of circularized barcoded nucleic acids; amplifying the plurality of circularized barcoded nucleic acids to generate a second plurality of linear barcoded nucleic acids; and analyzing the second portion of the first plurality of barcoded nucleic acids and the second plurality of linear barcoded nucleic acids, or products thereof.
- each barcode oligonucleotide of the plurality of barcode oligonucleotides comprises a cell barcode, a unique molecular identifier (UMI) and a probe sequence.
- the probe sequence is capable of binding to a nucleic acid target associated with the cell.
- the first plurality of barcoded nucleic acids comprises the barcode oligonucleotides and nucleotide sequences corresponding to the nucleic acid targets.
- the nucleotide sequences corresponding to the nucleic acid targets comprise one end attached to the UMI and the cell barcode and the other end.
- the nucleic acid targets can comprise a ribonucleic acid (RNA), a messenger RNA (mRNA), and a deoxyribonucleic acid (DNA).
- the nucleic acid targets can comprise nucleic acid targets of the cell, from the cell, in the cell, and/or on the surface of the cell.
- a nucleic acid target can be in the cell (which can be released from the cell by cell lysis before the nucleic acid target is barcoded).
- a nucleic acid target can be on the surface of the cell (e.g., an oligonucleotide attached to an antibody bound to an antigen on the surface of the cell).
- the method comprises releasing the nucleic acids of (or from or in) the cell prior to barcoding the nucleic acid targets associated with the cell.
- the method comprises lysing the cell to release the nucleic acids from (or in) the cell.
- the nucleic acid targets analyzed can be, e.g., from at least 100 cells (e.g., at least 1,000 cells, or about 100 cells to about 50,000 cells).
- compositions for single cell analysis can comprise a plurality of beads disclosed herein.
- the cell barcodes of the plurality of barcode oligonucleotides attached to each of the plurality of beads are identical.
- the cell barcodes of barcode oligonucleotides attached to different beads of the plurality of beads are different.
- the plurality of beads comprises at least 100 beads.
- kits for single cell analysis can comprise: a composition disclosed herein; and instructions for using the composition for single cell sequencing or analysis.
- Also disclosed herein is a method of nucleic acid sequencing.
- the method can comprise introducing a plurality of cells and/or a plurality of barcode oligonucleotides into a plurality of partitions.
- the introduction of a plurality of cells and/or a plurality of barcode oligonucleotides (alone or attached to beads) can be performed using partitioning.
- partitioning refers to introducing particles (e.g., cells, or beads) into vessels (e.g., microwells, droplets) that can be used to sequester or separate one particle from another. Such vessels are referred to using the noun “partition.”
- a partition can include two or more particles of the same type or different types.
- Partitioning can be performed using a variety of methods known to a person skilled in the art, for example, using microfluidics, wells, microwells, multi-well plates, multi-well arrays, dispensing, dilution, droplets and the like.
- the cells, barcode oligonucleotides, and/or beads can be diluted and dispensed across a plurality of partitions via the use of flow channels in a microwell array.
- a “partition” as used herein can refer to a part, a portion, or a division sequestered from the rest of the parts, portions, or divisions.
- a partition can be formed through the use of wells, microwells, multi-well plates, microwell arrays, microfluidics, dilution, dispensing, droplets, or any other means of sequestering one fraction of a sample from another.
- a partition is a droplet or a microwell.
- the method can comprise partitioning a plurality of cells into a plurality of partitions, wherein a partition of the plurality of partitions comprises one cell of the plurality of cells.
- the method can also comprise partitioning a plurality of barcode oligonucleotides into the plurality of partitions.
- the plurality of barcode oligonucleotides can be attached to beads and the method can comprise partitioning a plurality of beads with the plurality of barcode oligonucleotides attached thereon into the plurality of partitions.
- a plurality of cells and/or a plurality of beads with a plurality of barcode oligonucleotides attached thereon can be co-partitioned by combining the plurality of cells and/or the plurality of beads with a plurality of barcode oligonucleotides attached thereon to form a mixture that can be then partitioned into a plurality of partitions.
- partitioning a plurality of cells and/or a plurality of beads with a plurality of barcode oligonucleotides attached thereon can be performed through the use of fluid flow in a microwell array.
- the partitioning can comprise flowing one or more solutions comprising a plurality of cells and/or a plurality of beads with a plurality of barcode oligonucleotides attached thereon, sequentially or concurrently in a mixture, into the plurality of microwells via the inlet port.
- introducing the plurality of barcode oligonucleotides into the plurality of partitions can be performed without using a bead.
- the plurality of barcode oligonucleotides can be introduced into the partitions (e.g. microwells) by attaching or synthesizing the plurality of barcode oligonucleotides onto the surface of the partitions.
- attaching or synthesizing the plurality of barcode oligonucleotides onto the surface of the partitions can involve a ligation step.
- synthesizing the plurality of barcode oligonucleotides can comprise ligating two smaller oligonucleotides together to generate a plurality of barcode oligonucleotides each having a pre-designed sequence.
- a primer can be attached to the surface of a partition which can hybridize to a primer binding site of an oligonucleotide that also contains a template nucleotide sequence. The primer can then be extended by a primer extension reaction or other amplification reaction, and an oligonucleotide complementary to the template oligonucleotide can thereby be attached to the surface of the partition.
- the surface of the partitions can be pre-functionalized with a chemical moiety to facilitate the attachment of barcode oligonucleotides.
- the attachment of the barcode oligonucleotides can occur through the interaction between two members of a binding pair, one attached to the surface of the partitions and the other comprised in or conjugated to the barcode oligonucleotides, or a portion thereof.
- the surface of the microwell can be coated with a moiety (e.g. a member of a binding pair) capable of binding with another moiety (e.g. the other member of the binding pair) of the barcode oligonucleotide such that the binding of the two moieties results in the attachment of the barcode oligonucleotide or a portion thereof to the microwell.
- the surface of the microwell can be coated with streptavidin.
- the biotinylated barcode oligonucleotides can be attached to the surface of the microwell via streptavidin-biotin interaction.
- the surface of the partitions can be modified to enhance its chemical reactivity and facilitate the oligonucleotide attachment, such as, by treating the microwells with oxygen plasma, corona discharges, and ultraviolet/ozone (UVO) as will be understood by a person skilled in the art.
- a partition can be sized to fit at most one bead (and a cell) , not two beads.
- a size or dimension (e.g., length, width, depth, radius, or diameter) of a partition can be different in different embodiments.
- a size or dimension of one, one or more, or each, of the plurality of partitions is, is about, is at least, is at least about, is at most, or is at most about, 1 nanometer (nm) , 2 nm, 3 nm, 4 nm, 5 nm, 6 nm, 7 nm, 8 nm, 9 nm, 10 nm, 11 nm, 12 nm, 13 nm, 14 nm, 15 nm, 16 nm, 17 nm, 18 nm, 19 nm, 20 nm, 21 nm, 22 nm, 23 nm, 24 nm, 25 nm, 26 nm, 27 nm, 28 nm, 29
- the volume of one, one or more, or each, of the plurality of partitions can be different in different embodiments.
- the volume of one, one or more, or each, of the plurality of partitions can be, be about, be at least, be at least about, be at most, or be at most about, 1 nm³, 2 nm³, 3 nm³, 4 nm³, 5 nm³, 6 nm³, 7 nm³, 8 nm³, 9 nm³, 10 nm³, 20 nm³, 30 nm³, 40 nm³, 50 nm³, 60 nm³, 70 nm³, 80 nm³, 90 nm³, 100 nm³, 200 nm³, 300 nm³, 400 nm³, 500 nm³, 600 nm³, 700 nm³, 800 nm³
- the volume of one, one or more, or each, of the plurality of partitions can be, be about, be at least, be at least about, be at most, or be at most about, 1 nanoliter (nl), 2 nl, 3 nl, 4 nl, 5 nl, 6 nl, 7 nl, 8 nl, 9 nl, 10 nl, 11 nl, 12 nl, 13 nl, 14 nl, 15 nl, 16 nl, 17 nl, 18 nl, 19 nl, 20 nl, 21 nl, 22 nl, 23 nl, 24 nl, 25 nl, 26 nl, 27 nl, 28 nl, 29 nl, 30 nl, 31 nl, 32 nl, 33 nl, 34 nl, 35 nl, 36 nl, 37 nl, 38 nl, 39 nl, 40 nl, 41 nl,
- the number of partitions can be different in different embodiments.
- the number of partitions is, is about, is at least, is at least about, is at most, or is at most about, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000, 20000, 30000, 40000, 50000, 60000, 70000, 80000, 90000, 100000, 200000, 300000, 400000, 500000, 600000, 700000, 800000, 900000, 1000000, 2000000, 3000000, 4000000, 5000000, 6000000, 7000000, 8000000, 9000000, 10000000, 20000000, 30000000, 40000000, 50000000, 60000000, 70000000, 80000000, 90000000, 100000000, 200000000, 300000, 400000, 500000000, 600000, 700000, 800000, 900000000, 1000000000, 20000000, 30000000, 40000000, 50000000,
- the percentage of the plurality of partitions comprising a single cell and a single bead can be different in different embodiments.
- the percentage of the plurality of partitions comprising a single cell and a single bead is, is about, is at least, is at least about, is at most, or is at most about, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%
- the percentage of the plurality of partitions comprising no cell can be different in different embodiments.
- the percentage of the plurality of partitions comprising no cell is, is about, is at least, is at least about, is at most, or is at most about, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or a number or a range between any two of these values.
- at least 50% of partitions of the plurality of partitions can comprise no cell of the plurality of cells.
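- As a rough illustration of how the single-cell and empty-partition percentages above relate to loading density, the sketch below assumes that cells land in partitions independently, so that occupancy follows a Poisson distribution. The Poisson model and the function names are assumptions made for illustration; the source does not state a loading model.

```python
import math


def poisson_occupancy(mean_cells_per_partition: float) -> dict:
    """Expected fractions of partitions holding 0, 1, or >= 2 cells.

    Assumes independent (Poisson) loading, which the source does not specify.
    """
    lam = mean_cells_per_partition
    empty = math.exp(-lam)
    single = lam * math.exp(-lam)
    multiple = 1.0 - empty - single
    return {"empty": empty, "single": single, "multiple": multiple}


def max_loading_for_empty_fraction(min_empty_fraction: float) -> float:
    """Largest mean occupancy that still leaves the given fraction of partitions empty."""
    return -math.log(min_empty_fraction)


if __name__ == "__main__":
    # Hypothetical loading of 0.1 cells per partition on average.
    for name, fraction in poisson_occupancy(0.1).items():
        print(f"{name}: {fraction:.4f}")
    # Mean occupancy at which exactly half of the partitions stay empty.
    print(max_loading_for_empty_fraction(0.5))
```

Under this assumed model, keeping at least 50% of partitions empty corresponds to a mean of at most ln 2 (about 0.69) cells per partition.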
- the partition is a microwell and the plurality of partitions comprise a plurality of microwells in a microwell array.
- the term “microwell” generally refers to a well with a volume of less than 1 mL.
- a microwell array can contain a number of microwells arranged in rows and columns. The size and spacing of the microwells may vary depending on different applications.
- a location of a microwell in a microwell array can be identified by its unique address describing its row and column position within the microwell array.
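- The row-and-column addressing described above can be sketched as a mapping between a linear microwell index and a (row, column) address. The zero-based, row-major convention and the function names are illustrative assumptions, not part of the source.

```python
from typing import Tuple


def index_to_address(index: int, n_columns: int) -> Tuple[int, int]:
    """Convert a zero-based linear well index to a (row, column) address."""
    if index < 0 or n_columns <= 0:
        raise ValueError("index must be >= 0 and n_columns must be > 0")
    return divmod(index, n_columns)


def address_to_index(row: int, column: int, n_columns: int) -> int:
    """Convert a (row, column) address back to the zero-based linear index."""
    if row < 0 or not 0 <= column < n_columns:
        raise ValueError("address is outside the array")
    return row * n_columns + column


if __name__ == "__main__":
    # Hypothetical array with 5 columns: well 7 sits in row 1, column 2.
    print(index_to_address(7, 5))
    print(address_to_index(1, 2, 5))
```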
- a microwell array comprising a plurality of microwells can be formed from a material selected from the group consisting of silicon, glass, ceramic, elastomers such as polydimethylsiloxane (PDMS) and thermoset polyester, thermoplastic polymers such as polystyrene, polycarbonate, poly(methyl methacrylate) (PMMA), poly-ethylene glycol diacrylate (PEGDA), Teflon, polyurethane (PU), composite materials such as cyclic-olefin copolymer, and combinations thereof.
- the microwell array can comprise an inlet port in fluid communication with the plurality of microwells.
- the microwell array can also comprise an outlet port in fluid communication with the plurality of microwells.
- Samples, free reagents, and/or reagents encapsulated in microcapsules can be introduced into the microwells.
- the reagents can comprise restriction enzymes, ligase, polymerase, fluorophores, oligonucleotide barcodes, oligonucleotide probes, adapters, buffers, dNTPs, ddNTPs, and other reagents required for performing the methods described herein. Samples and reagents can flow from the inlet port through a flow channel to deliver to the microwell array, and the waste can be pushed out from the outlet port and removed.
- the plurality of cells introduced into a plurality of partitions can be obtained from any organism of interest, such as an organism of the kingdoms Monera (bacteria), Protista, Fungi, Plantae, or Animalia.
- a cell can be a mammalian cell, and particularly a human cell such as T cells, B cells, natural killer cells, stem cells, or cancer cells.
- Cells described herein can be obtained from a cell sample.
- a cell sample comprising cells can be obtained from any source including a clinical sample and a derivative thereof, a biological sample and a derivative thereof, a forensic sample and a derivative thereof, an environmental sample and a derivative thereof and a combination thereof.
- a cell sample can be collected from any bodily fluids including, but not limited to, blood, urine, serum, lymph, saliva, anal and vaginal secretions, perspiration, and semen of any organism.
- a cell sample can be a product of experimental manipulation including purification, cell culture, cell isolation, cell separation, cell quantification, sample dilution, or any other cell sample processing approach.
- a cell sample can be obtained by dissociation of any biopsy tissues of any organism including, but not limited to, skin, bone, hair, brain, liver, heart, kidney, spleen, pancreas, stomach, intestine, bladder, lung, and esophagus.
- “sample nucleic acids” and “nucleic acid targets” are used interchangeably.
- the sample nucleic acids associated with a plurality of cells can comprise deoxyribonucleic acid (DNA) , ribonucleic acid (RNA) , and/or any combination or hybrid thereof.
- “nucleic acid” and “polynucleotide” are interchangeable and can refer to any nucleic acid, whether composed of phosphodiester linkages or modified linkages such as phosphotriester, phosphoramidate, siloxane, carbonate, carboxymethylester, acetamidate, carbamate, thioether, bridged phosphoramidate, bridged methylene phosphonate, phosphorothioate, methylphosphonate, phosphorodithioate, bridged phosphorothioate or sultone linkages, and combinations of such linkages.
- “nucleic acid” and “polynucleotide” also specifically include nucleic acids composed of bases other than the five biologically occurring bases (adenine, guanine, thymine, cytosine and uracil).
- the sample nucleic acids can be single-stranded or double-stranded, or contain portions of both double-stranded and single-stranded sequences.
- the sample nucleic acids can contain any combination of nucleotides, including uracil, adenine, thymine, cytosine, guanine, inosine, xanthine, hypoxanthine, isocytosine, isoguanine and any nucleotide derivative thereof.
- “nucleotide” may include naturally occurring nucleotides and nucleotide analogs, including both synthetic and naturally occurring species.
- the sample nucleic acids can be genomic DNA (gDNA), mitochondrial DNA (mtDNA), messenger RNA (mRNA), ribosomal RNA (rRNA), transfer RNA (tRNA), nuclear RNA (nRNA), small interfering RNA (siRNA), small nuclear RNA (snRNA), small nucleolar RNA (snoRNA), small Cajal body-specific RNA (scaRNA), microRNA (miRNA), double-stranded RNA (dsRNA), ribozyme, riboswitch or viral RNA, or any nucleic acids that may be obtained from a sample.
- the plurality of cells can be diluted prior to partitioning to ensure that the majority of the partitions comprise at most one cell, with few doublets (more than one cell in one partition).
- a dilution can be prepared such that a desired cell concentration is achieved.
- the cell concentration can be between 1 × 10⁴ and 1 × 10⁶ (e.g.
- the cell concentration is about 1 × 10⁵ to 3 × 10⁵ (e.g. about, at least, at least about, at most, at most about, 1 × 10⁵, 1.1 × 10⁵, 1.2 × 10⁵, 1.3 × 10⁵, 1.4 × 10⁵, 1.5 × 10⁵, 1.6 × 10⁵, 1.7 × 10⁵, 1.8 × 10⁵, 1.9 × 10⁵, 2.0 × 10⁵, 2.1 × 10⁵, 2.2 × 10⁵, 2.3 × 10⁵, 2.4 × 10⁵, 2.5 × 10⁵, 2.6 × 10⁵, 2.7 × 10⁵, 2.8 × 10⁵, 2.9 × 10⁵, 3.0 × 10⁵, or a number or a range between any two of these values).
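- Preparing a dilution to reach a desired cell concentration can be sketched with the standard relation C1·V1 = C2·V2. The concentration unit (cells per mL) and the example values are assumptions; this excerpt does not state the unit.

```python
from typing import Tuple


def dilution_volumes(stock_conc: float, target_conc: float,
                     final_volume: float) -> Tuple[float, float]:
    """Return (stock volume, diluent volume) needed to reach target_conc.

    Uses C1 * V1 = C2 * V2. Concentrations share one unit (assumed cells/mL)
    and volumes share another (assumed mL).
    """
    if stock_conc <= 0 or target_conc <= 0 or final_volume <= 0:
        raise ValueError("all inputs must be positive")
    if target_conc > stock_conc:
        raise ValueError("dilution cannot increase the concentration")
    stock_volume = target_conc * final_volume / stock_conc
    return stock_volume, final_volume - stock_volume


if __name__ == "__main__":
    # Hypothetical: dilute a 1e6 cells/mL stock to 2e5 cells/mL in 1 mL total.
    stock_ml, diluent_ml = dilution_volumes(1e6, 2e5, 1.0)
    print(f"stock: {stock_ml:.3f} mL, diluent: {diluent_ml:.3f} mL")
```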
- the plurality of barcode oligonucleotides introduced into the plurality of partitions are associated with a bead.
- the beads can provide a surface upon which molecules, such as oligonucleotides, can be synthesized or attached.
- a bead comprises, comprises about, comprises at least, comprises at least about, comprises at most, or comprises at most about 1, 5, 10, 50, 100, 1000, 10000, 20000, 30000, 40000, 50000, 60000, 70000, 80000, 90000, 100000, 200,000, 300,000, 400,000, 500,000, 600,000, 700,000, 800,000, 900,000, 1000000, 2000000, 3000000, 4000000, 5000000, 6000000, 7000000, 8000000, 9000000, 10000000, 20000000, 50000000, 100000000, 200000000, 300000000, 400000000, 500000000, 600000000, 700000, 800000000, 900000000, 1000000000, or a number or a range between any two of these values, barcode oligonucleotides.
- FIG. 2 shows a bead attached with a barcode oligonucleotide for illustrative purposes and is not intended to be limiting.
- the attachment can be reversible or irreversible.
- the attachment can be covalent, or non-covalent via interactions such as ionic bonds, hydrogen bonds, or van der Waals interactions.
- the attachment can be direct to the surface of a bead or indirect through other oligonucleotide sequences attached to the surface of a bead.
- a bead can be dissolvable, degradable, or disruptable. Barcode oligonucleotides can be reversibly attached to, covalently attached to, or irreversibly attached to the bead.
- a bead can be a gel bead such as a hydrogel bead. In some embodiments, the gel bead is degradable upon application of a stimulus.
- the stimulus can comprise a thermal stimulus, a chemical stimulus, a biological stimulus, a photo-stimulus, or a combination thereof.
- a bead can be a solid bead and/or a magnetic bead.
- the bead is a magnetic bead.
- the magnetic bead can comprise a paramagnetic material coated or embedded in the magnetic bead (e.g. on a surface, in an intermediate layer, and/or mixed with other materials of the magnetic bead) .
- a paramagnetic material refers to a material having a magnetic susceptibility slightly greater than 1 (e.g. between about 1 and about 5) .
- a magnetic susceptibility is a measure of how much a material can become magnetized in an applied magnetic field.
- Paramagnetic materials include, but are not limited to, magnesium, molybdenum, lithium, aluminum, nickel, tantalum, titanium, iron oxide, gold, copper, or a combination thereof.
- the magnetic bead comprising barcode oligonucleotides can be immobilized or retained in a partition (e.g. a microwell) by an external magnetic field, thereby retaining the barcode oligonucleotides in a partition.
- the magnetic bead comprising barcode oligonucleotides can be mobilized or released when the external magnetic field is removed.
- a bead can be immobilized or retained in a partition (e.g. a microwell) through an interaction between two members of a binding pair.
- the partition can be coated with a capture moiety (e.g. a member of a binding pair) capable of binding with a binding moiety (the other member of the binding pair) comprised in or conjugated to a bead, such that the binding of the two moieties results in the attachment of the bead to the partition (e.g. microwell) , thereby immobilizing or retaining the bead in the partition.
- the surface of a partition (e.g. microwell) can be coated with streptavidin, and a biotinylated bead can be attached to the surface of the partition (e.g. microwell) via streptavidin-biotin interaction.
- Beads can be of uniform size or heterogeneous size.
- the beads have a diameter of about, at least, at least about, at most, or at most about, 1 µm, 5 µm, 10 µm, 20 µm, 30 µm, 40 µm, 45 µm, 50 µm, 60 µm, 65 µm, 70 µm, 75 µm, 80 µm, 90 µm, 100 µm, 250 µm, 500 µm, or 1 mm.
- a bead can be sized such that at most one bead (and a cell), not two beads, can fit in one partition.
- a size or dimension (e.g., length, width, depth, radius, or diameter) of a bead can be different in different embodiments.
- a size or dimension of one, or each, bead is, is about, is at least, is at least about, is at most, or is at most about, 1 nanometer (nm) , 2 nm, 3 nm, 4 nm, 5 nm, 6 nm, 7 nm, 8 nm, 9 nm, 10 nm, 11 nm, 12 nm, 13 nm, 14 nm, 15 nm, 16 nm, 17 nm, 18 nm, 19 nm, 20 nm, 21 nm, 22 nm, 23 nm, 24 nm, 25 nm, 26 nm, 27 nm, 28 nm, 29 nm, 30 nm, 31 nm, 32 nm, 33 nm, 34 nm, 35 nm, 36 nm, 37 nm, 38 nm, 39 nm, 40 nm, 41 nm, 42 n
- a size or dimension of one, or each, bead is about 1 nm to about 100 µm.
- the bead can have a dimension of about 10 µm to about 100 µm.
- the bead can have a dimension of about 30 µm.
- the volume of one, or each, bead can be different in different embodiments.
- the volume of one, or each, bead can be, be about, be at least, be at least about, be at most, or be at most about, 1 nm³, 2 nm³, 3 nm³, 4 nm³, 5 nm³, 6 nm³, 7 nm³, 8 nm³, 9 nm³, 10 nm³, 20 nm³, 30 nm³, 40 nm³, 50 nm³, 60 nm³, 70 nm³, 80 nm³, 90 nm³, 100 nm³, 200 nm³, 300 nm³, 400 nm³, 500 nm³, 600 nm³, 700 nm³, 800 nm³, 900 µm³, 1000 nm³,
- the volume of one, or each, bead can be, be about, be at least, be at least about, be at most, or be at most about, 1 nanoliter (nL), 2 nL, 3 nL, 4 nL, 5 nL, 6 nL, 7 nL, 8 nL, 9 nL, 10 nL, 11 nL, 12 nL, 13 nL, 14 nL, 15 nL, 16 nL, 17 nL, 18 nL, 19 nL, 20 nL, 21 nL, 22 nL, 23 nL, 24 nL, 25 nL, 26 nL, 27 nL, 28 nL, 29 nL, 30 nL, 31 nL, 32 nL, 33 nL, 34 nL, 35 nL, 36 nL, 37 nL, 38 nL, 39 nL, 40 nL, 41 nL, 42 nL, 43 n
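- To relate the bead dimensions and bead volumes listed above, the sketch below converts a diameter to a volume. It assumes a spherical bead, which the source does not require, and the function names are illustrative.

```python
import math

# 1 nL = 1e-12 m^3 and 1 µm^3 = 1e-18 m^3, so 1 nL = 1e6 µm^3.
UM3_PER_NL = 1e6


def sphere_volume_um3(diameter_um: float) -> float:
    """Volume in µm^3 of a sphere with the given diameter in µm: pi * d^3 / 6."""
    if diameter_um <= 0:
        raise ValueError("diameter must be positive")
    return math.pi * diameter_um ** 3 / 6.0


def um3_to_nl(volume_um3: float) -> float:
    """Convert a volume from µm^3 to nanoliters."""
    return volume_um3 / UM3_PER_NL


if __name__ == "__main__":
    # A 30 µm bead, one of the dimensions mentioned in the text.
    volume = sphere_volume_um3(30.0)
    print(f"{volume:.1f} µm^3 = {um3_to_nl(volume):.5f} nL")
```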
- the number of beads introduced into a plurality of partitions can be different in different embodiments.
- the number of beads introduced into a plurality of partitions is, is about, is at least, is at least about, is at most, or is at most about, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000, 20000, 30000, 40000, 50000, 60000, 70000, 80000, 90000, 100000, 200000, 300000, 400000, 500000, 600000, 700000, 800000, 900000, 1000000, 2000000, 3000000, 4000000, 5000000, 6000000, 7000000, 8000000, 9000000, 10000000, 20000000, 30000000, 40000000, 50000000, 60000000, 70000000, 80000000, 90000000, 100000000, 200000000, 300000, 400000, 500000, 600000, 7000000, 8000000, 9000000, 10000000, 20000000
- beads are introduced to the partitions such that the percentage of partitions each occupied with one bead is, is about, is at least, is at least about, is at most, or is at most about, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or a number or a range between any two of these values. For example, at least 80% of the plurality of partitions can be each occupied with one bead.
- beads are introduced to the partitions such that the percentage of partitions with no bead is, is about, is at least, is at least about, is at most, or is at most about, 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, or a number or a range between any two of these values. For example, at most 20% of the plurality of partitions contain no bead.
- the method described herein can comprise barcoding a plurality of sample nucleic acids associated with the cell in the partition using the plurality of barcode oligonucleotides to generate a plurality of barcoded nucleic acids.
- FIG. 2 shows barcoding an mRNA molecule with a barcode oligonucleotide.
- the barcode oligonucleotide is shown attached to a bead for illustrative purposes and is not intended to be limiting.
- the method can comprise lysing cells (e.g. after introducing a plurality of barcode oligonucleotides and/or a plurality of cells to the partition) to release the content of the cell within the partition.
- Lysis agents can be contacted with the cells or cell suspension concurrently, or immediately after the introduction of the cells into the partition and before the barcoding, e.g. through the flow channels.
- lysis agents include bioactive reagents, such as lysis enzymes, or surfactant-based lysis solutions including non-ionic surfactants such as Triton X-100 and Tween 20 and ionic surfactants such as sodium dodecyl sulfate (SDS). Lysis methods including, but not limited to, thermal, acoustic, electrical, or mechanical cellular disruption can also be used.
- barcoding a plurality of sample nucleic acids (e.g., mRNA shown in FIG. 2) associated with the cell in the partition can comprise extending the plurality of barcode oligonucleotides using the plurality of sample nucleic acids as templates to generate partially single-stranded/partially double-stranded barcoded nucleic acids of the plurality of barcoded nucleic acids hybridized to sample nucleic acids of the plurality of sample nucleic acids.
- the partially single-stranded/partially double-stranded barcoded nucleic acids hybridized to sample nucleic acids can be separated by denaturation (e.g., heat denaturation or chemical denaturation using for example, sodium hydroxide) to generate single-stranded barcoded nucleic acids of the plurality of barcoded nucleic acids.
- the single-stranded barcoded nucleic acids can comprise a barcode oligonucleotide and an oligonucleotide complementary to the sample nucleic acids.
- the single-stranded barcoded nucleic acids can be generated by reverse transcription using a reverse transcriptase.
- the single-stranded barcoded nucleic acids can be generated by using a DNA polymerase.
- the single-stranded barcoded nucleic acids can be cDNA produced by extending a barcode oligonucleotide using a sample RNA (e.g., mRNA) associated with the cell as a template.
- the single-stranded barcoded nucleic acids can be further extended using a template switching oligonucleotide (TSO) .
- a TSO is an oligo that hybridizes to untemplated C nucleotides added by a reverse transcriptase during reverse transcription. The TSO can be introduced into the partitions together with the reverse transcription reagents.
- a reverse transcriptase can be used to generate a cDNA by extending a barcode oligonucleotide hybridized to an RNA. After extending the barcode oligonucleotide to the 5’-end of the RNA, the reverse transcriptase can add one or more nucleotides with cytosine (C) bases (e.g. two or three) to the 3’-end of the cDNA.
- the TSO can include one or more nucleotides with guanine (G) bases (e.g. two or more) on the 3’-end of the TSO.
- the nucleotides with G bases can be ribonucleotides.
- the G bases at the 3’-end of the TSO can hybridize to the cytosine bases at the 3’-end of the cDNA.
- the reverse transcriptase can further extend the cDNA using the TSO as the template to generate a cDNA with the reverse complement of the TSO sequence on its 3’-end.
- the barcoded nucleic acid can include the barcode sequences (e.g., cell barcode and UMI) on the 5’-end and a TSO sequence at its 3’-end.
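- The template-switching steps above can be sketched as string operations that assemble the expected first-strand cDNA, written 5’ to 3’. All sequences below are made-up placeholders, the TSO's ribonucleotide G bases are written as plain "G", and the function names are hypothetical; this is an illustration of the sequence logic only, not a model of the enzymology.

```python
RNA_TO_DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_transcribe(rna: str) -> str:
    """DNA strand (5'->3') complementary to an RNA template given 5'->3'."""
    return "".join(RNA_TO_DNA_COMPLEMENT[base] for base in reversed(rna))


def reverse_complement(dna: str) -> str:
    """Reverse complement of a DNA sequence given 5'->3'."""
    return "".join(DNA_COMPLEMENT[base] for base in reversed(dna))


def barcoded_cdna(cell_barcode: str, umi: str, poly_dt: str,
                  mrna_body: str, tso: str) -> str:
    """Assemble the first-strand cDNA after template switching.

    mrna_body is the mRNA upstream of its poly(A) tail; the tail itself is
    paired with the poly(dT) probe of the barcode oligonucleotide.
    """
    if not tso.endswith("GG"):
        raise ValueError("this sketch expects a TSO ending in at least two G bases")
    # The reverse complement of the TSO starts with C bases; these stand for
    # the untemplated C bases that pair with the G bases at the TSO 3'-end.
    return (cell_barcode + umi + poly_dt
            + reverse_transcribe(mrna_body)
            + reverse_complement(tso))


if __name__ == "__main__":
    cdna = barcoded_cdna("ACGTACGT", "TTGCA", "TTTTTTTT", "AUGGCC", "AGTCGGG")
    print(cdna)
```

The result carries the barcode sequences at the 5’-end and the reverse complement of the TSO at the 3’-end, matching the layout described in the text.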
- barcoding a plurality of sample nucleic acids comprises extending the barcode oligonucleotides using the sample nucleic acids as templates and the plurality of barcode oligonucleotides as TSOs to generate a plurality of single-stranded barcoded nucleic acids of the plurality of barcoded nucleic acids that are hybridized to the plurality of sample nucleic acids.
- the barcode oligonucleotides are not attached to a bead and the barcode oligonucleotides can be TSOs.
- extension primers (e.g. poly (dT) primers) can hybridize to a sample nucleic acid (e.g. the poly-adenylated mRNA).
- the extension primers can be extended using the sample nucleic acids as a template.
- a reverse transcriptase can be used to generate a cDNA by extending an extension primer hybridized to an RNA. After extending the extension primers to the 5’-end of the RNA, the reverse transcriptase can add one or more C bases (e.g.
- the TSO or barcode oligonucleotide can include one or more G bases (e.g. two or more) on the 3’-end of the TSO.
- the nucleotides with guanine bases can be ribonucleotides.
- the G bases at the 3’-end of the TSO or barcode oligonucleotide can hybridize to the cytosine bases at the 3’-end of the cDNA.
- the reverse transcriptase can switch template from the mRNA to the TSO or barcode oligonucleotide.
- the reverse transcriptase can further extend the cDNA using the TSO or barcode oligonucleotide as the template to generate a cDNA further comprising the reverse complement of the TSO or barcode oligonucleotide.
- the single-stranded barcoded nucleic acids can be separated from the template sample nucleic acids by digesting the template sample nucleic acids (e.g., using RNase) , by chemical treatment (e.g., using sodium hydroxide) , by hydrolyzing the template sample nucleic acids, or via a denaturation or melting process by increasing the temperature, adding organic solvents, or increasing pH. Following the melting process, the sample nucleic acids can be removed (e.g. washed away) and the single-stranded barcoded nucleic acids can be retained in the partition (e.g. through attachment to the partitions or through attachments to beads which can be retained in the partitions) .
- barcoding a plurality of sample nucleic acids associated with the cell in the partition can comprise generating the plurality of barcoded nucleic acids comprising double-stranded barcoded nucleic acids in the partition using the single-stranded barcoded nucleic acids as templates.
- the double-stranded barcoded nucleic acids can be generated from the single-stranded barcoded nucleic acids retained in the partition using, for example, second-strand synthesis or one-cycle PCR.
- the generated double-stranded barcoded nucleic acid can be denatured or melted to generate two single-stranded barcoded nucleic acids: one single-stranded barcoded nucleic acid retained in the partition (e.g., attached to the bead) and the other single-stranded barcoded nucleic acid released into the solution from the retained single-stranded barcoded nucleic acid that can then be pooled to provide a pooled mixture outside the partitions.
- Both single-stranded barcoded nucleic acids (e.g. retained in the partitions or pooled outside the partitions) have a sequence comprising a sequence of a barcode oligonucleotide (e.g. a cell barcode sequence and/or a UMI barcode) and a sequence of a sample nucleic acid or a reverse complement thereof.
- the term “barcode” as used herein can generally be a verb or a noun.
- as a noun, the term “barcode” or “barcode oligonucleotide” refers to a label that can be attached to a polynucleotide, or any variant thereof, to convey information about the polynucleotide.
- a barcode can be a polynucleotide sequence attached to all fragments of the sample nucleic acids associated with the cell in the partition. The barcode can then be sequenced alone or with the fragments and/or full length of the sample nucleic acids associated with the cell.
- as a verb, “barcode” refers to a process of attaching a barcode or a barcode oligonucleotide to a sample nucleic acid associated with the cell.
- the barcode oligonucleotides can be attached to a partition directly or indirectly.
- the barcode oligonucleotides can also be associated with beads.
- Barcode oligonucleotides can be generated from a variety of different formats, including pre-designed polynucleotide barcodes, randomly synthesized barcode sequences, microarray-based barcode synthesis, random N-mers, or combinations thereof as will be understood by a person skilled in the art.
- the plurality of barcode oligonucleotides comprise, comprise about, comprise at least, comprise at least about, comprise at most, or comprise at most about 1, 5, 10, 50, 100, 1000, 10000, 20000, 30000, 40000, 50000, 60000, 70000, 80000, 90000, 100000, 200,000, 300,000, 400,000, 500,000, 600,000, 700,000, 800,000, 900,000, 1000000, 2000000, 3000000, 4000000, 5000000, 6000000, 7000000, 8000000, 9000000, 10000000, 20000000, 50000000, 100000000, 200000000, 300000000, 400000000, 500000000, 600000000, 700000, 800000000, 900000000, 1000000000 barcode oligonucleotides, or a number or a range between any two of these values.
- a barcode oligonucleotide of the plurality of barcode oligonucleotides can be of any suitable length.
- a barcode oligonucleotide of the plurality of barcode oligonucleotides can be about 2 to about 500 nucleotides in length, about 2 to about 100 nucleotides in length, about 2 to about 50 nucleotides in length, about 2 to about 40 nucleotides in length, about 4 to about 20 nucleotides in length, or about 6 to 16 nucleotides in length.
- a barcode oligonucleotide of the plurality of barcode oligonucleotides is about, at least, at least about, at most, or at most about 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 85, 90, 95, 100, 150, 200, 250, 300, 400, or 500 nucleotides in length, or a number or a range between any two of these values.
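- One practical consequence of sequence length is the number of distinct sequences available. The sketch below counts the sequences of a given length over the four standard bases and finds the shortest length that can give every one of a given number of labels its own sequence. It ignores real design constraints such as minimum edit distance or GC content, and the function names are illustrative.

```python
def barcode_space(length_nt: int) -> int:
    """Number of distinct A/C/G/T sequences of the given length."""
    if length_nt < 0:
        raise ValueError("length must be non-negative")
    return 4 ** length_nt


def min_length_for(n_labels: int) -> int:
    """Shortest length whose sequence space holds at least n_labels sequences."""
    if n_labels < 1:
        raise ValueError("n_labels must be at least 1")
    length = 0
    while 4 ** length < n_labels:
        length += 1
    return length


if __name__ == "__main__":
    print(barcode_space(6))        # lower end of the 6 to 16 nt range in the text
    print(barcode_space(16))       # upper end of that range
    print(min_length_for(100000))  # hypothetical count of 100000 distinct labels
```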
- Each of the plurality of barcode oligonucleotides used herein can comprise a cell barcode and a molecular barcode (e.g. a UMI) (see FIG. 2) .
- a barcode oligonucleotide can also comprise a probe sequence or region capable of hybridizing to sample nucleic acids (e.g. poly (dT) sequence in FIG. 2) .
- a barcode oligonucleotide can also include additional sequence segments such as additional recognition or binding sequences, a template switching oligonucleotide, and primer-binding sequences (e.g. the sequencing primer-binding sequence in FIG. 2, or a PCR primer-binding sequence) for subsequent processing (e.g. PCR amplification) and/or sequencing.
- the configuration of the various sequences comprised in a barcode oligonucleotide of the plurality of barcode oligonucleotides introduced into a partition can vary depending on, for example, the particular configuration desired and/or the order in which the various components of the sequence are added as will be understood by a person skilled in the art.
- the barcode oligonucleotide can comprise, from the 5’ end to the 3’ end, either (i) the cell barcode, the UMI, the PCR primer-binding sequence, and the probe sequence, or (ii) the UMI, the cell barcode, the PCR primer-binding sequence, and the probe sequence.
- a barcode oligonucleotide has a configuration from the 5’ end to the 3’ end: cell barcode, UMI, primer-binding sequence, probe sequence. In some embodiments, a barcode oligonucleotide has a configuration from the 5’ end to the 3’ end: cell barcode, UMI, primer-binding sequence, TSO.
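The 5’ to 3’ configuration described above (cell barcode, UMI, primer-binding sequence, probe sequence) can be illustrated with a minimal sketch. The class name `BarcodeOligo`, its field names, and all sequences below are hypothetical placeholders chosen for illustration; they are not taken from the disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BarcodeOligo:
    """Hypothetical 5'->3' layout: cell barcode, UMI, primer-binding sequence, probe."""
    cell_barcode: str
    umi: str
    primer_binding: str
    probe: str

    def sequence(self) -> str:
        # Concatenate the segments in 5'->3' order.
        return self.cell_barcode + self.umi + self.primer_binding + self.probe

    def segments(self) -> dict:
        # 0-based, half-open (start, end) coordinates of each segment.
        coords, pos = {}, 0
        for name in ("cell_barcode", "umi", "primer_binding", "probe"):
            length = len(getattr(self, name))
            coords[name] = (pos, pos + length)
            pos += length
        return coords


# Placeholder sequences (assumptions): 12-nt cell barcode, 8-nt UMI,
# 12-nt primer-binding sequence, 30-nt poly(dT) probe.
oligo = BarcodeOligo("ACGTACGTACGT", "GATTACAG", "GGATCCGGATCC", "T" * 30)
print(oligo.sequence())
print(oligo.segments())
```

Replacing the `probe` field with a TSO sequence would give the alternative configuration (cell barcode, UMI, primer-binding sequence, TSO) under the same assumptions.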
- the cell barcodes are for identifying that the plurality of barcoded nucleic acids originate from the cell.
- the cell barcodes of the barcode oligonucleotides in a partition can be identical or different.
- the cell barcodes can serve to track the sample nucleic acids associated with the cell throughout the processing (e.g., location of the cells in the plurality of partitions) when the cell barcode associated with the sample nucleic acids is read during sequencing.
- the cell barcodes can serve to provide linkage information between cell nucleic acid sequences and cell functionality when in combination with optical imaging. Barcoded nucleic acids with an identical cell barcode can be generated from sample nucleic acids of a cell within a given partition. Some barcoded nucleic acids are pooled and sequenced to determine cell nucleic acid sequences or a profile (e.g., an mRNA expression profile) which is associated with (e.g., identifiable by or linked with) the cell barcode sequence.
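As a rough illustration of how a profile can be linked to a cell barcode after pooling, the sketch below groups sequenced records by cell barcode. The function name `expression_profiles`, the record layout `(cell_barcode, gene)`, and the example barcodes are assumptions made for illustration; the sketch counts reads only and performs no UMI-based deduplication or barcode error correction.

```python
from collections import Counter, defaultdict


def expression_profiles(reads):
    """Group (cell_barcode, gene) records into per-cell gene counts.

    `reads` is a hypothetical, already-annotated list of records; real
    pipelines would first extract the barcode and align the cDNA read.
    """
    profiles = defaultdict(Counter)
    for cell_barcode, gene in reads:
        profiles[cell_barcode][gene] += 1
    return {barcode: dict(counts) for barcode, counts in profiles.items()}


# Toy pooled reads from two partitions (placeholder barcodes).
reads = [
    ("AAAC", "GAPDH"),
    ("AAAC", "ACTB"),
    ("AAAC", "GAPDH"),
    ("TTTG", "ACTB"),
]
profiles = expression_profiles(reads)
print(profiles)
```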
- the number (or percentage) of barcode oligonucleotides introduced in a partition with cell barcodes having an identical sequence can be different in different embodiments.
- the number of barcode oligonucleotides introduced in a partition with cell barcodes having an identical sequence is, is about, is at least, is at least about, is at most, or is at most about, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000, 20000, 30000, 40000, 50000, 60000, 70000, 80000, 90000, 100000, 200000, 300000, 400000, 500000, 600000, 700000, 800000, 900000, 1000000, 2000000, 3000000, 4000000, 5000000, 6000000, 7000000, 8000000, 9000000, 10000000, 20000000, 30000000, 40000000, 50000000, 60000000, 70000000, 80000000, 90000000,
- the percentage of barcode oligonucleotides introduced in a partition with cell barcodes having an identical sequence is, is about, is at least, is at least about, is at most, or is at most about, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 99.9%, 100%, or a number or a range between any two of these values.
- the cell barcodes of at least two barcode oligonucleotides introduced in a partition comprise an identical sequence.
- a cell barcode can be unique (or substantially unique) to a partition.
- the number of unique cell barcode sequences can be different in different embodiments. In some embodiments, the number of unique cell barcode sequences is, is about, is at least, is at least about, is at most, or is at most about, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000, 20000, 30000, 40000, 50000, 60000, 70000, 80000, 90000, 100000, 200000, 300000, 400000, 500000, 600000, 700000, 800000, 900000, 1000000, 2000000, 3000000, 4000000, 5000000, 6000000, 7000000, 8000000, 9000000, 10000000, 20000000, 30000000, 40000000, 50000000, 60000000, 70000000, 80000000, 90000000, 100000000, 200000000, 300000, 400000, 500000, 6000000,
- the percentage of unique cell barcode sequences is, is about, is at least, is at least about, is at most, or is at most about, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 99.9%, 100%, or a number or a range between any two of these values, of the cell barcode sequences of the barcode oligonucleotides introduced in a partition.
- the cell barcodes of barcode oligonucleotides introduced in two partitions can be a cell bar
- barcode oligonucleotides are introduced to the plurality of partitions such that the barcode oligonucleotides introduced in different partitions have different cell barcodes and the barcode oligonucleotides introduced in a same partition have the same cell barcode.
- nucleic acids associated with the cell in a partition of the plurality of partitions can be barcoded with the same cell barcode.
- the cell barcodes of two barcode oligonucleotides attached to a bead of the plurality of beads can comprise an identical sequence.
- the cell barcodes of two barcode oligonucleotides attached to two beads of the plurality of beads can comprise different sequences.
- the length of a cell barcode of a barcode oligonucleotide (or a cell barcode of each barcode oligonucleotide or all cell barcodes of the plurality of barcode oligonucleotides) can be different in different embodiments.
- a cell barcode of a barcode oligonucleotide is, is about, is at least, is at least about, is at most, or is at most about, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88,
- a cell barcode of a barcode oligonucleotide (or each cell barcode of each barcode oligonucleotide or all cell barcodes of the plurality of barcode oligonucleotides) has a length greater than 2 nucleic acid bases. In some embodiments, a cell barcode of a barcode oligonucleotide (or each cell barcode of each barcode oligonucleotide or all cell barcodes of the plurality of barcode oligonucleotides) is 2-40 nucleotides in length.
- a cell barcode of a barcode oligonucleotide (or each cell barcode of each barcode oligonucleotide or all cell barcodes of the plurality of barcode oligonucleotides) is at least 6 nucleic acid bases in length.
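The relationship between cell barcode length and the number of distinct cell barcode sequences follows from simple counting: a barcode of length L over the four bases has at most 4^L possible sequences. The sketch below, with hypothetical function names, computes that upper bound and the shortest length that can supply a requested number of unique barcodes. It assumes every sequence is usable, which practical designs (e.g. those that keep a minimum edit distance between barcodes) would not.

```python
def barcode_capacity(length: int) -> int:
    """Upper bound on distinct barcodes of a given length (4 bases per position)."""
    return 4 ** length


def min_length_for(n_barcodes: int) -> int:
    """Smallest barcode length whose capacity is at least n_barcodes."""
    length = 0
    while barcode_capacity(length) < n_barcodes:
        length += 1
    return length


print(barcode_capacity(6))
print(min_length_for(1_000_000))
```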
- the unique molecular identifiers (UMIs) are for identifying molecular origins of the plurality of barcoded nucleic acids.
- UMIs are short sequences used to uniquely tag each molecule in a sample in some embodiments.
- the UMIs of the barcode oligonucleotides of the plurality of barcode oligonucleotides partitioned into a partition can be identical or different.
- the UMIs of two barcode oligonucleotides attached to a bead of the plurality of beads can comprise different sequences.
- the UMIs of two barcode oligonucleotides attached to two beads of the plurality of beads can comprise an identical sequence.
- the UMIs of the plurality of barcode oligonucleotides are different.
- the number (or percentage) of UMIs of barcode oligonucleotides introduced in a partition with different sequences can be different in different embodiments.
- the number of UMIs of barcode oligonucleotides introduced in a partition with different sequences is, is about, is at least, is at least about, is at most, or is at most about, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000, 20000, 30000, 40000, 50000, 60000, 70000, 80000, 90000, 100000, 200000, 300000, 400000, 500000, 600000, 700000, 800000, 900000, 1000000, 2000000, 3000000, 4000000, 5000000, 6000000, 7000000, 8000000, 9000000, 10000000, 20000000, 30000000, 40000000, 50000000, 60000000, 70000000, 80000000, 90000000, 100000000, 200000000, 300000, 400000, 500000, 600000, 700000, 800000, 900000000, 1000000000, 20000000, 30000000, 40000000
- the percentage of UMIs of barcode oligonucleotides introduced in a partition with different sequences is, is about, is at least, is at least about, is at most, or is at most about, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 99.9%, 100%, or a number or a range between any two of these values.
- the number of barcode oligonucleotides introduced in a partition with UMIs having an identical sequence can be different in different embodiments.
- the number of barcode oligonucleotides introduced in a partition with UMIs having an identical sequence is, is about, is at least, is at least about, is at most, or is at most about, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, or a number or a range between any two of these values.
- the UMIs of two barcode oligonucleotides introduced in a partition can comprise an identical sequence.
- the number of unique UMI sequences can be different in different embodiments.
- the number of unique UMI sequences is, is about, is at least, is at least about, is at most, or is at most about, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000, 20000, 30000, 40000, 50000, 60000, 70000, 80000, 90000, 100000, 200000, 300000, 400000, 500000, 600000, 700000, 800000, 900000, 1000000, 2000000, 3000000, 4000000, 5000000, 6000000, 7000000, 8000000, 9000000, 10000000, 20000000, 30000000, 40000000, 50000000, 60000000, 70000000, 80000000, 90000000, 100000000, 200000000, 300000, 400000, 500000000, 600000, 700000, 800000, 900000000, 1000000000, 20000000, 30000000, 40000
- the length of a UMI of a barcode oligonucleotide can be different in different embodiments.
- a UMI of a barcode oligonucleotide is, is about, is at least, is at least about, is at most, or is at most about, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81
- the UMIs have a length greater than 2 nucleic acid bases. In some embodiments, the UMIs are 2-40 nucleotides in length. In some embodiments, the UMIs are at least 6 nucleic acid bases in length.
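Because a UMI tags an individual molecule, reads that share the same cell barcode, target, and UMI can be treated as copies of one original molecule. The sketch below shows this counting under stated assumptions: the function name `count_molecules` and the record layout `(cell_barcode, umi, gene)` are hypothetical, and UMIs are compared by exact match without any sequencing-error correction.

```python
from collections import defaultdict


def count_molecules(reads):
    """Count distinct UMIs per (cell_barcode, gene) pair.

    Reads with an identical cell barcode, gene, and UMI are collapsed into
    one molecule (exact matching only; an illustrative simplification).
    """
    umis_seen = defaultdict(set)
    for cell_barcode, umi, gene in reads:
        umis_seen[(cell_barcode, gene)].add(umi)
    return {key: len(umis) for key, umis in umis_seen.items()}


reads = [
    ("AAAC", "ACGT", "GAPDH"),
    ("AAAC", "ACGT", "GAPDH"),  # amplification duplicate of the read above
    ("AAAC", "TTGA", "GAPDH"),
    ("TTTG", "ACGT", "GAPDH"),  # same UMI, different cell barcode
]
molecule_counts = count_molecules(reads)
print(molecule_counts)
```

The last record illustrates why identical UMIs on two different beads are tolerable: the cell barcode still distinguishes the molecules.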
- a barcode oligonucleotide (or each barcode oligonucleotide of the plurality of barcode oligonucleotides) can comprise a primer sequence.
- the primer sequence can be a sequencing primer sequence (or a sequencing primer binding sequence) or a PCR primer sequence (or PCR primer binding sequence) .
- the PCR primer binding sequence is a Read 1 sequence.
- a barcode oligonucleotide (or each barcode oligonucleotide of the plurality of barcode oligonucleotides) can comprise a probe sequence or region capable of hybridizing to a plurality of sample nucleic acids, a particular type of sample nucleic acids (e.g. mRNA) , and/or specific sample nucleic acids (e.g. specific gene of interest) .
- a probe sequence is, is about, is at least, is at least about, is at most, or is at most about, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, or a number or a range between any two
- barcode oligonucleotides comprising probe sequences can be introduced into the partitions together with other reagents such as the reverse transcription reagents.
- the number of the barcode oligonucleotides introduced into a partition comprising a probe sequence can be different in different embodiments.
- the number of barcode oligonucleotides introduced into a partition comprising a probe sequence is, is about, is at least, is at least about, is at most, or is at most about, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000, 20000, 30000, 40000, 50000, 60000, 70000, 80000, 90000, 100000, 200000, 300000, 400000, 500000, 600000, 700000, 800000, 900000, 1000000, 2000000, 3000000, 4000000, 5000000, 6000000, 7000000, 8000000, 9000000, 10000000, 20000000, 30000000, 40000000, 50000000, 60000000, 70000000, 80000000, 90000000, 100000000, 200000000, 300000, 400000, 50000000, 60000000, 70000000, 80000000, 90
- the probe sequence can be on a 3’ end of a barcode oligonucleotide of the plurality of barcode oligonucleotides introduced in a partition.
- Barcode oligonucleotides each comprising a poly (dT) probe sequence can be used to capture (e.g., hybridize to) the 3’ end of polyadenylated mRNA transcripts in a sample nucleic acid for downstream 3’ gene expression library construction.
- the probe sequence can comprise a poly (dT) sequence which is a single-stranded sequence of deoxythymidine (dT) used for first-strand cDNA synthesis catalyzed by reverse transcriptase.
- barcode oligonucleotides whose probe sequence comprises a poly (dT) sequence can be introduced into the partitions as extension primers to synthesize the first-strand cDNA using the sample nucleic acid (e.g. RNA) as a template.
- the barcode oligonucleotides of the plurality of barcode oligonucleotides each comprises a poly (dT) sequence.
- the poly (dT) sequence can be capable of binding to a poly (A) region (e.g., a poly (A) tail) of a nucleic acid target (e.g., mRNA target) .
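A very simplified way to picture poly (dT) capture is to check whether an RNA sequence ends in a run of A bases long enough to pair with the probe. In the sketch below, the function names and the `min_tail` threshold are assumptions for illustration only; actual hybridization depends on temperature, salt, and probe design and does not reduce to a fixed length cutoff.

```python
def polya_tail_length(rna: str) -> int:
    """Length of the uninterrupted run of A bases at the 3' end of an RNA."""
    rna = rna.upper()
    return len(rna) - len(rna.rstrip("A"))


def can_capture(rna: str, min_tail: int = 20) -> bool:
    """Hypothetical rule: a poly(dT) probe captures RNAs whose 3' poly(A)
    run is at least `min_tail` bases long (an illustrative threshold)."""
    return polya_tail_length(rna) >= min_tail


polyadenylated = "AUGGCC" + "A" * 25
non_polyadenylated = "AUGGCCAAA"
print(polya_tail_length(polyadenylated), can_capture(polyadenylated))
print(polya_tail_length(non_polyadenylated), can_capture(non_polyadenylated))
```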
- the poly (dT) sequences of the barcode oligonucleotides of the plurality of barcode oligonucleotides attached to a bead (or each bead or all beads) are identical.
- the percentage of the barcode oligonucleotides of the plurality of barcode oligonucleotides attached to a bead (or each bead or all beads) with an identical poly (dT) sequence can be different in different embodiments. In some embodiments, the percentage of the barcode oligonucleotides of the plurality of barcode oligonucleotides attached to a bead (or each bead or all beads) with an identical poly-dT sequence is, is about, is at least, is at least about, is at most, is at most about, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 8
- barcode oligonucleotides of the plurality of barcode oligonucleotides each comprises a probe sequence.
- the probe sequence, for example, is not a poly (dT) sequence (though a probe sequence can comprise a stretch of Ts) .
- the probe sequence can be capable of binding to a non-poly (A) nucleic acid target.
- the number of different probe sequences of the barcode oligonucleotides attached to a bead (or each bead or all beads) can be different in different embodiments.
- barcode oligonucleotides of the plurality of barcode oligonucleotides can comprise probe sequences that are capable of binding to an identical non-poly (A) nucleic acid target.
- barcode oligonucleotides of the plurality of barcode oligonucleotides can comprise probe sequences that are capable of binding to different non-poly (A) nucleic acid targets.
- the number of different probe sequences of the barcode oligonucleotides attached to a bead is, is about, is at least, is at least about, is at most, or is at most about, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96
- the probe sequences of all barcode oligonucleotides of the plurality of barcode oligonucleotides comprise poly (dT) capable of hybridizing to poly (A) tails of mRNA molecules (or poly (dA) regions or tails of DNA) .
- the probe sequences of some barcode oligonucleotides of the plurality of barcode oligonucleotides comprise non-poly (dT) (e.g., gene-specific or target-specific probe) sequences.
- the non-poly (dT) probe sequences can be designed based on known sequences of a target nucleic acid of interest.
- the non-poly (dT) probe sequences can span a nucleic acid region of interest, or be adjacent to (upstream or downstream of) a nucleic acid region of interest.
- a non-poly (dT) probe sequence is, is about, is at least, is at least about, is at most, or is at most about, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98
- the number of the barcode oligonucleotides introduced into a partition comprising a gene-specific probe sequence can be different in different embodiments.
- the number of barcode oligonucleotides introduced into a partition comprising a gene-specific probe sequence is, is about, is at least, is at least about, is at most, or is at most about, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000, 20000, 30000, 40000, 50000, 60000, 70000, 80000, 90000, 100000, 200000, 300000, 400000, 500000, 600000, 700000, 800000, 900000, 1000000, 2000000, 3000000, 4000000, 5000000, 6000000, 7000000, 8000000, 9000000, 10000000, 20000000, 30000000, 40000000, 50000000, 60000000,
- the number of nucleic acid targets of interest (e.g., genes of interest) that the barcode oligonucleotides introduced into a partition are capable of binding can be different in different embodiments.
- the number of nucleic acid targets of interest (e.g., genes of interest) that the barcode oligonucleotides introduced into a partition are capable of binding is, is about, is at least, is at least about, is at most, or is at most about, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 200, 300, 400,
- A barcode oligonucleotide introduced into a partition can bind to a molecule (or a copy) of a nucleic acid target.
- Barcode oligonucleotides introduced into a partition can bind to molecules (or copies) of a nucleic acid target or a plurality of nucleic acid targets.
- the barcode oligonucleotides of the plurality of barcode oligonucleotides can each comprise a poly (dT) sequence, a non-poly (dT) probe sequence, or both.
- the poly (dT) sequence and the gene-specific probe sequence can be on a same barcode oligonucleotide or different barcode oligonucleotides of the plurality of barcode oligonucleotides introduced into a partition.
- the probe sequences of barcode oligonucleotides of the plurality of barcode oligonucleotides comprise a degenerate sequence.
- the length of a degenerate sequence can be different in different embodiments.
- the length of the degenerate sequence is, is about, is at least, is at least about, is at most, or is at most about, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83,
- a length of the degenerate sequence can be at least 3 nucleotides.
- the degenerate sequence can span a mutation.
- the degenerate sequence is three nucleotides in length, and the second position of the degenerate sequence is the position of a single nucleotide variation.
- the degenerate sequence can correspond to a mutation.
- the degenerate sequence is one nucleotide in length, and the position of the degenerate sequence corresponds to the position of a single nucleotide variation.
- the length of the degenerate sequence and the length of the mutation can be identical.
- the length of the degenerate sequence and the length of the mutation can be different.
- the length of the degenerate sequence can be longer than the length of the mutation.
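A degenerate position in a probe stands for a set of possible bases, so a probe with a degenerate sequence corresponds to a family of concrete sequences. The sketch below expands such a probe using the standard IUPAC nucleotide codes. The function name `expand_degenerate` and the example probes are hypothetical; the first example places a one-nucleotide degenerate position (N) at the site of a single nucleotide variation.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes (DNA alphabet).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(probe: str) -> list:
    """List every concrete sequence represented by a degenerate probe."""
    choices = (IUPAC[base] for base in probe.upper())
    return ["".join(bases) for bases in product(*choices)]


# One degenerate position at the SNV site (placeholder flanking sequence).
single_position = expand_degenerate("ACNGT")
# Three-nucleotide degenerate sequence spanning the SNV at its second position.
three_positions = expand_degenerate("ACNNNGT")
print(single_position)
print(len(three_positions))
```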
- a barcode oligonucleotide (or each barcode oligonucleotide of the plurality of barcode oligonucleotides) can be a template switching oligonucleotide.
- a primer comprising a target binding region, such as a poly (dT) sequence can hybridize to a sample nucleic acid (e.g., an mRNA) and be extended by, for example, reverse transcription to generate an extended primer comprising a reverse complement of the sample nucleic acid, or a portion thereof (e.g., cDNA) .
- the extended primer or cDNA can be further extended to include the reverse complement of a TSO oligonucleotide or barcode oligonucleotide as illustrated in FIG. 2.
- the resulting barcoded nucleic acid includes the barcodes of the barcode oligonucleotide on the 3’-end.
- a barcode oligonucleotide (or each barcode oligonucleotide of the plurality of barcode oligonucleotides) is not a template switching oligonucleotide.
- a barcode oligonucleotide comprising a target binding region, such as a poly (dT) sequence can hybridize to a sample nucleic acid (e.g., an mRNA) and be extended by, for example, reverse transcription to generate an extended primer comprising a reverse complement of the sample nucleic acid, or a portion thereof (e.g., cDNA) .
- the extended primer or cDNA can be further extended to include the reverse complement of a TSO oligonucleotide.
- the resulting barcoded nucleic acid includes the barcodes of the barcode oligonucleotide on the 5’-end.
- a template switching oligonucleotide is an oligonucleotide that hybridizes to untemplated C nucleotides added by a reverse transcriptase during reverse transcription.
- the TSO can hybridize to the 3’ end of a cDNA molecule.
- the TSO can include one or more nucleotides with guanine (G) bases on the 3’-end of the TSO, with which the one or more cytosine (C) bases added by a reverse transcriptase to the 3’-end of a cDNA can hybridize.
- the series of G bases can comprise 1 G base, 2 G bases, 3 G bases, 4 G bases, 5 G bases or more than 5 G bases.
- the series of G bases can be ribonucleotides.
- the reverse transcriptase can further extend the cDNA using the TSO as the template to generate a barcoded cDNA comprising the TSO.
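The template-switching steps described above can be traced at the sequence level with a small simulation. This is a minimal sketch under stated assumptions: the function names and all sequences are hypothetical, the poly(A) tail is approximated by stripping trailing A bases, the number of untemplated C bases is fixed, and the primer (not the TSO) carries the barcode, so the barcode ends up at the 5’ end of the cDNA.

```python
# RNA or DNA base -> complementary DNA base.
_COMPLEMENT = str.maketrans("ACGTU", "TGCAA")


def reverse_complement(seq: str) -> str:
    """Return the DNA reverse complement of an RNA or DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def first_strand_with_template_switch(mrna, primer, tso, n_untemplated_c=3):
    """Simulate first-strand cDNA synthesis followed by template switching.

    mrna:   RNA written 5'->3', ending in a poly(A) tail
    primer: oligo written 5'->3', ending in poly(dT)
    tso:    oligo written 5'->3', ending in n_untemplated_c G bases
    """
    # Approximation: treat all trailing A bases as the poly(A) tail.
    body = mrna.upper().rstrip("A")
    # The primer is extended along the transcript body.
    cdna = primer.upper() + reverse_complement(body)
    # The reverse transcriptase adds untemplated C bases to the cDNA 3' end.
    cdna += "C" * n_untemplated_c
    if not tso.upper().endswith("G" * n_untemplated_c):
        raise ValueError("TSO 3' end cannot pair with the untemplated C bases")
    # After the G bases pair with the C bases, the rest of the TSO is copied.
    cdna += reverse_complement(tso[:-n_untemplated_c])
    return cdna


mrna = "GGAUCC" + "A" * 10       # placeholder transcript
primer = "ACGT" + "T" * 5        # placeholder barcode + poly(dT)
tso = "AAGCAT" + "GGG"           # placeholder TSO with 3' G bases
cdna = first_strand_with_template_switch(mrna, primer, tso)
print(cdna)
```

The resulting cDNA ends with the reverse complement of the full TSO, which is the handle later used for second-strand synthesis or amplification.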
- a template switching oligonucleotide is, is about, is at least, is at least about, is at most, or is at most about, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, or a
- the TSO can, e.g., have a length greater than 2 nucleic acid bases.
- the template switching oligonucleotides are 2-40 nucleotides in length. In some embodiments, the template switching oligonucleotides are at least 12 nucleic acid bases in length.
- the number of the barcode oligonucleotides introduced into a partition comprising a TSO can be different in different embodiments.
- the number of barcode oligonucleotides introduced into a partition comprising a TSO is, is about, is at least, is at least about, is at most, or is at most about, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000, 20000, 30000, 40000, 50000, 60000, 70000, 80000, 90000, 100000, 200000, 300000, 400000, 500000, 600000, 700000, 800000, 900000, 1000000, 2000000, 3000000, 4000000, 5000000, 6000000, 7000000, 8000000, 9000000, 10000000, 20000000, 30000000, 40000000, 50000000, 60000000, 70000000, 8000
- the TSO of the barcode oligonucleotides introduced into a partition can be identical. In some embodiments, the TSO of the barcode oligonucleotides introduced into a partition can be different. The percentage of the barcode oligonucleotides of the plurality of barcode oligonucleotides introduced into a partition with an identical TSO sequence can be different in different embodiments.
- the percentage of the barcode oligonucleotides of the plurality of barcode oligonucleotides introduced into a partition with an identical TSO sequence is, is about, is at least, is at least about, is at most, is at most about, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 99.9%, 100%, or a number or a range between any two of these values.
- Barcoding the nucleic acid targets associated with the cell can comprise: hybridizing the barcode oligonucleotides attached to the bead in each partition of the plurality of partitions with nucleic acid targets associated with the cell in the partition; extending the barcode oligonucleotides attached to the bead and hybridized to the nucleic acid targets using the nucleic acid targets as templates to generate single-stranded barcoded nucleic acids; and generating double-stranded barcoded nucleic acids from the single-stranded barcoded nucleic acids.
- Generating double-stranded barcoded nucleic acids from the single-stranded barcoded nucleic acids can comprise extending the single-stranded barcoded nucleic acids.
- Extending the single-stranded barcoded nucleic acids can comprise extending the single-stranded barcoded nucleic acids using a template switching oligonucleotide.
- the barcoded nucleic acids can be generated by reverse transcription using a reverse transcriptase.
- the barcoded nucleic acids can be generated by using a DNA polymerase.
- Barcoding the nucleic acids associated with the cell can comprise generating double-stranded barcoded nucleic acids from the single-stranded barcoded nucleic acids.
- Extending the single-stranded barcoded nucleic acids comprises further extending the single-stranded barcoded nucleic acids using a template switching oligonucleotide.
- a reverse transcriptase can be used to generate a cDNA by extending a barcode oligonucleotide hybridized to an RNA.
- the reverse transcriptase can add one or more nucleotides with cytosine (C) bases (e.g., two or three) to the 3’-end of the cDNA.
- the template switch oligonucleotide (TSO) can include one or more nucleotides with guanine (G) bases (e.g., two or three) on the 3’-end of the TSO.
- the nucleotides with guanine bases can be ribonucleotides.
- the guanine bases at the 3’-end of the TSO can hybridize to the cytosine bases at the 3’-end of the cDNA.
- the reverse transcriptase can further extend the cDNA using the TSO as the template to generate a cDNA with the TSO sequence on its 3’-end.
- a barcoded nucleic acid can include a TSO sequence at its 3’-end.
- the method comprises pooling barcoded nucleic acids of the plurality of barcoded nucleic acids after barcoding the sample nucleic acids and before sequencing the barcoded nucleic acids to obtain pooled barcoded nucleic acids.
- the method comprises pooling the beads prior to extending the barcode oligonucleotides.
- the method can comprise pooling the beads prior to generating the double-stranded barcoded nucleic acids.
- extending the barcode oligonucleotides attached to the bead and hybridized to the nucleic acid targets comprises extending the barcode oligonucleotides attached to the bead and hybridized to the nucleic acid targets in bulk.
- Generating the double-stranded barcoded nucleic acids can comprise generating the double-stranded barcoded nucleic acids from the single-stranded barcoded nucleic acids in bulk.
- the method comprises pooling the beads subsequent to extending the barcode oligonucleotides to generate the single-stranded barcoded nucleic acids.
- the method can comprise pooling the beads subsequent to generating the double-stranded barcoded nucleic acids.
- extending the barcode oligonucleotides attached to the bead and hybridized to the nucleic acid targets comprises extending the barcode oligonucleotides attached to the bead and hybridized to the nucleic acid targets in the partition.
- Generating the double-stranded barcoded nucleic acids can comprise generating the double-stranded barcoded nucleic acids from the single-stranded barcoded nucleic acids in the partition.
- pooling barcoded nucleic acids occurs after generating the double-stranded barcoded nucleic acids. In some embodiments, pooling barcoded nucleic acids occurs after denaturing (such as heat denaturation or chemical denaturation with, for example, sodium hydroxide) the double-stranded barcoded nucleic acids, which generates two single-stranded barcoded nucleic acids, one retained in the partition and one released from the barcoded nucleic acids retained in the partition. In some embodiments, pooling barcoded nucleic acids occurs after amplification of the barcoded nucleic acids. In some embodiments, pooling barcoded nucleic acids occurs after further processing (e.g., fragmentation) of the barcoded nucleic acids. In some embodiments, pooling barcoded nucleic acids comprises collecting the single-stranded barcoded nucleic acids released from the barcoded nucleic acids retained in the partition.
- the barcode oligonucleotides are attached to beads, only single-stranded barcoded nucleic acids released into bulk are collected by pooling, and the beads are not pooled (e.g. not removed from the partitions) but retained in the partitions (e.g. by an external magnetic field applied on magnetic beads) , thereby allowing one to trace the origin of the pooled barcoded nucleic acids, for example, to its original location in the plurality of partitions.
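When the beads stay in their partitions, tracing a pooled barcoded nucleic acid back to its origin amounts to a lookup from cell barcode to partition location. The sketch below assumes such a lookup table already exists (for example, obtained by decoding the bead in each partition); the function name, the `(row, column)` coordinates, and the read identifiers are hypothetical.

```python
def trace_origins(barcode_to_partition, pooled_reads):
    """Map each pooled read to the partition its cell barcode came from.

    barcode_to_partition: hypothetical dict of cell barcode -> (row, column)
    pooled_reads:         iterable of (read_id, cell_barcode) pairs
    Reads whose barcode is not in the table map to None.
    """
    return [
        (read_id, barcode_to_partition.get(cell_barcode))
        for read_id, cell_barcode in pooled_reads
    ]


barcode_to_partition = {"AAAC": (0, 3), "TTTG": (5, 1)}
pooled_reads = [("read1", "TTTG"), ("read2", "AAAC"), ("read3", "GGGG")]
origins = trace_origins(barcode_to_partition, pooled_reads)
print(origins)
```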
- the pooled barcoded nucleic acids can be single-stranded or double-stranded (e.g. generated from the single-stranded pooled barcoded nucleic acids by PCR amplification) .
- the pooled barcoded nucleic acids (e.g. barcoded cDNA) with a desired length may be selected.
- the barcoded nucleic acid (e.g., barcoded cDNA) is circularized by connecting the two ends of the barcoded nucleic acid (e.g., barcoded cDNA) .
- the barcoded nucleic acid (e.g., barcoded cDNA) is divided into a first portion and a second portion. Barcoded nucleic acid (e.g., barcoded cDNA) circularization can comprise circularization of the first portion, the second portion or both portions of the barcoded nucleic acid (e.g., barcoded cDNA) .
- the barcoded nucleic acid (e.g., barcoded cDNA) to be circularized can comprise a nucleotide sequence corresponding to the nucleic acid target (e.g., mRNA) and the barcode oligonucleotide comprising the cell barcode and the UMI attached to one end of the nucleotide sequence corresponding to the nucleic acid target (e.g., mRNA) .
- Barcoded nucleic acid (e.g., barcoded cDNA) circularization can comprise connecting the barcode oligonucleotide comprising the cell barcode and the UMI attached to one end of the nucleotide sequence corresponding to the nucleic acid target (e.g., mRNA) to the other end of the nucleotide sequence corresponding to the nucleic acid target (e.g., mRNA) .
- the barcode oligonucleotide can be connected to the other end of the nucleotide sequence corresponding to the nucleic acid target (e.g., mRNA) directly or indirectly.
- the cell barcode, UMI or primer-binding sequence of the barcode oligonucleotide can connect to the other end of the nucleotide sequence corresponding to the nucleic acid target (e.g., mRNA) directly or indirectly.
- the barcode oligonucleotide can be connected to the other end of the nucleotide sequence corresponding to the nucleic acid target (e.g., mRNA) through a region of sequence identity.
- the barcoded nucleic acid (e.g., barcoded cDNA) can be circularized using isolated protein reagents (proteins) .
- the two ends of the barcoded nucleic acid (e.g., barcoded cDNA) share a region of sequence identity and are contacted with: a non-processive 5' exonuclease; a non-strand-displacing DNA polymerase; and a ligase.
- the 5' exonuclease and the DNA polymerase are different entities.
- the barcoded nucleic acid (e.g., barcoded cDNA) can additionally be contacted with a single stranded DNA binding protein (SSB) , which accelerates nucleic acid annealing.
- the non-processive 5' exonuclease, the non-strand-displacing DNA polymerase, the SSB and the ligase are each isolated (e.g., purified) .
- an “isolated” protein means that the protein is removed from its original environment (e.g., the natural environment if it is naturally occurring) , and isolated or separated from at least one other component with which it is naturally associated.
- a naturally-occurring protein present in its natural living host e.g. a bacteriophage protein present in a bacterium that has been infected with the phage
- Such proteins can be part of a composition or reaction mixture, and still be isolated in that such composition or reaction mixture is not part of its natural environment.
- “an isolated protein,” as used herein, can include 1, 2, 3, 4 or more copies of the protein, i.e., the protein can be in the form of a monomer, or it can be in the form of a multimer, such as dimer, trimer, tetramer or the like, depending on the particular protein under consideration.
- the protein is purified. Methods for purifying the proteins of the invention are known to one of skill in the art.
- the protein is substantially purified or is purified to homogeneity.
- substantially purified means that the protein is separated and is essentially free from other proteins, i.e., the protein is the primary and active constituent.
- the purified protein can then be contacted with the DNAs to be joined, where it then acts in concert with other proteins to achieve the joining.
- the proteins can be contacted with (combined with) the DNAs in any order; for example, the proteins can be added to a reaction mixture comprising the DNAs, or the DNAs can be added to a reaction mixture comprising the proteins.
- Proteins used herein can be in the form of “active fragments, ” rather than the full-length proteins, provided that the fragments retain the activities (enzymatic activities or binding activities) required to achieve the joining.
- the non-processive 5' exonuclease can be any non-processive 5'→3' double strand specific exodeoxyribonuclease.
- the terms “5' exonuclease” or “exonuclease” are used herein to refer to a 5'→3' exodeoxyribonuclease and sometimes used interchangeably with “non-processive 5' exonuclease.”
- a “non-processive” exonuclease, as used herein, is an exonuclease that degrades a limited number (e.g., only a few) nucleotides during each DNA binding event.
- the 5' exonucleases can be the phage T7 gene 6 product, RedA of lambda phage (lambda exonuclease), RecE of Rac prophage, or any of a variety of 5'→3' exonucleases that can be involved in homologous recombination reactions. Methods for preparing the T7 gene 6 product and optimal reaction conditions for using it are known to one of skill in the art.
- SSB can protect the single stranded overhangs generated by the 5' exonuclease, as well as facilitating the rapid annealing of the homologous single stranded regions.
- Any SSB which can accelerate nucleic acid annealing can be used herein.
- An SSB, which “accelerates nucleic acid annealing, ” as used herein, can be an SSB which can accelerate nucleic acid binding by a factor of greater than about 500-fold, compared to the binding in the absence of the SSB.
- the SSBs can be the T7 gene 2.5 product, the E. coli RecA protein, RedB of lambda phage, and RecT of Rac prophage. Methods for preparing the T7 protein and optimal reaction conditions for using it are known to one of skill in the art. In yet a further embodiment, polyethylene glycol ( "PEG" ) is used to enhance the annealing process.
- the non-strand-displacing DNA polymerase used herein can be any non-strand-displacing DNA polymerase capable of filling in the gaps left by the 5' exonuclease digestion.
- the term “polymerase” is sometimes used herein to refer to a DNA polymerase.
- a “non-strand-displacing DNA polymerase,” as used herein, is a DNA polymerase that terminates synthesis of DNA when it encounters DNA strands which lie in its path as it proceeds to copy a dsDNA molecule, or degrades the encountered DNA strands as it proceeds while concurrently filling in the gap thus created, thereby generating a “moving nick.”
- non-strand-displacing DNA polymerase synthesizes DNA faster than the exonuclease in the reaction mixture degrades it.
- Suitable non-strand-displacing DNA polymerases will be evident to the skilled worker.
- the non-strand-displacing DNA polymerase can be the T7 gene 5 product, T4 DNA polymerase, and E. coli Pol I. Methods for preparing and using the above-noted DNA polymerases are known to one of skill in the art.
- the ligase used herein can be any DNA ligase.
- the term “ligase” is sometimes used herein to refer to a DNA ligase.
- Suitable DNA ligases include, e.g., the T7 gene 1.3 product, T4 DNA ligase, E. coli DNA ligase and Taq Ligase. Methods for their preparation and optimal reaction conditions are known to one of skill in the art.
- the 5' exonuclease is the phage T7 gene 6 product, RedA of lambda phage, or RecE of Rac prophage;
- the SSB is the phage T7 gene 2.5 product, the E. coli recA protein, RedB of lambda phage, or RecT of Rac prophage;
- the DNA polymerase is the phage T7 gene 5 product, phage T4 DNA polymerase, or E. coli pol I;
- the ligase is the phage T7 gene 1.3 product, phage T4 DNA ligase, or E. coli DNA ligase.
- the four proteins used herein can be contacted with the barcoded nucleic acid (e.g., barcoded cDNA) to be circularized (e.g., added to a reaction mixture comprising a solution containing suitable salts, buffers, ATP, deoxynucleotides, etc. plus the DNA molecules) simultaneously.
- the four proteins are added substantially simultaneously.
- a mixture of the four proteins in suitable ratios can be added to the reaction mixture with a single pipetting operation.
- the barcoded nucleic acid (e.g., barcoded cDNA) are added to a reaction mixture comprising a solution containing suitable salts, buffers, ATP, deoxynucleotides, etc. and the four proteins.
- the barcoded nucleic acid (e.g., barcoded cDNA) is in contact with the four proteins sequentially.
- the barcoded nucleic acid (e.g., barcoded cDNA) can be in contact with a reaction mixture comprising a solution containing suitable salts, buffers, ATP, deoxynucleotides, etc. and a subset of the four proteins, and the remaining proteins are then added, in any order or in any combination (e.g. the exonuclease can be added last; and preceding the addition of the exonuclease, the SSB, polymerase and ligase can be added sequentially, in any order, or two of the proteins can be added substantially simultaneously, and the other protein can be added before or after those two proteins).
- the circularization is performed under conditions whereby a 3' single-stranded overhang is generated in the regions of sequence identity at each end of the barcoded nucleic acid (e.g., barcoded cDNA) by the exonuclease without the use of a restriction enzyme; the two single-stranded overhangs anneal to form a gapped molecule; the gaps are filled in by the polymerase leaving nicks; and nicks are sealed by the ligase, thereby joining the two ends of the barcoded nucleic acid (e.g., barcoded cDNA) and forming an intact (un-nicked) circularized barcoded nucleic acid molecule, in which a single copy of the region of sequence identity is retained.
- the 5' exonuclease can generate 3' single stranded overhangs in both ends of the barcoded nucleic acid (e.g., barcoded cDNA) ; the overhangs are generated in the regions of sequence identity; the two single stranded overhangs anneal to form a gapped molecule; the DNA polymerase fills in the gaps; and the ligase seals the nicks.
- the four proteins can act together in a concerted fashion; the individual enzymatic reactions are not actively terminated (e.g., by an experimenter or investigator) before a subsequent reaction begins.
- formation of a double stranded DNA molecule results in the molecule being relatively withdrawn or inert from the reactions.
- Conditions which are effective for connecting the two ends of the barcoded nucleic acid allow for the net assembly of a circularized barcoded nucleic acid, rather than the degradation of the barcoded nucleic acid (e.g., barcoded cDNA) by the exonuclease.
- the gaps formed by digestion by the 5' exonuclease can be filled in by the polymerase substantially immediately after they are formed. This can be accomplished by contacting the barcoded nucleic acid (e.g., barcoded cDNA) with a substantially lower amount of 5' exonuclease activity than the amount of DNA polymerase activity.
- the gaps formed by digestion by the 5' exonuclease can be filled in by the polymerase substantially immediately after they are formed, and the intact (un-nicked) reaction product is “fixed” by the ligation reaction.
- Suitable amounts of activities can include: exonuclease activity between about 0.1 U/mL and about 50 U/mL; DNA polymerase between about 10 U/mL and about 30 U/mL; SSB between about 0.1 μM and about 1 μM; and ligase between about 0.1 μM and about 1 μM.
- Reaction conditions (such as the presence of salts, buffers, ATP, dNTPs, etc. and the times and temperature of incubation) can be optimized readily by one of skill in the art.
- the incubation temperature can be about 25°C to about 50°C, and the reaction can be carried out for about 1-1.5 hours at 37°C, or for about 2-3 hours at 30°C.
- the regions of sequence identity are sometimes referred to herein as “overlaps” or “regions of overlap.”
- the region of sequence identity should be sufficiently long to allow the circularization to occur.
- the length can vary from a minimum of about 10 base pairs (bp) to about 300 bp (e.g., 10 bp, 20 bp, 30 bp, 40 bp, 50 bp, 60 bp, 70 bp, 80 bp, 90 bp, 100 bp, 150 bp, 200 bp, 250 bp, 300 bp or a number between any two of these values) or more.
- the length of the overlap is not greater than about 1/10 the length of the nucleic acid fragment to be circularized; otherwise there may not be sufficient time for annealing and gap filling. If longer overlaps are used, the T7 endonuclease can also be required to debranch the joint molecules.
- the region of sequence identity is of a length that allows it to be generated readily by synthetic methods, e.g. about 40 bp (e.g., about 35 to about 45 bp) .
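The length guidance above (an overlap of at least about 10 bp, and not greater than about 1/10 the length of the fragment to be circularized) can be expressed as a simple check. The following is a minimal sketch, not part of the disclosed method; the function name `overlap_length_ok` and the hard thresholds are illustrative assumptions, since the text gives only approximate ("about") values.

```python
def overlap_length_ok(overlap_bp: int, fragment_bp: int, min_bp: int = 10) -> bool:
    """Check an overlap length against the approximate guidance in the text.

    Assumptions (illustrative only): the "about 10 bp" minimum and the
    "about 1/10 of the fragment length" ceiling are treated as hard limits.
    """
    if overlap_bp <= 0 or fragment_bp <= 0:
        raise ValueError("lengths must be positive")
    # Integer arithmetic avoids floating-point edge cases at the 1/10 boundary.
    return overlap_bp >= min_bp and overlap_bp * 10 <= fragment_bp


# A 40 bp overlap fits a 1,500 bp barcoded cDNA but not a 300 bp one.
print(overlap_length_ok(40, 1500))
print(overlap_length_ok(40, 300))
```

In practice the limits are soft, so a check like this is best used to flag designs for review rather than to reject them outright.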
- the regions of sequence identity can be added to the ends of the barcoded nucleic acid (e.g., barcoded cDNA) to be circularized by any of a variety of methods.
- the regions of sequence identity can be introduced by PCR amplification.
- Circularization through the region of sequence identity can be achieved by using circle handle (s) .
- a first circle handle can be added to one end of the barcoded nucleic acid (e.g., barcoded cDNA) and a second circle handle can be added to the other end of the barcoded nucleic acid (e.g., barcoded cDNA) .
- the first and second circle handles can comprise regions of sequence identity and nucleotide sequences capable of hybridizing to the ends of the barcoded nucleic acid (e.g., barcoded cDNA) .
- the first and second circle handles can be extended using the barcoded nucleic acid (e.g., barcoded cDNA) as a template.
- the extension can be achieved using a one-round PCR amplification that can be performed by one of skill in the art.
- the barcoded nucleic acid (e.g., barcoded cDNA) with the circle handles can be circularized by connecting the first and second circle handles.
- the first and second circle handles can be double-stranded DNA, single-stranded DNA or partially double-stranded/partially single-stranded DNA.
- the regions of sequence identity in the first and second circle handles can be double-stranded DNA, single-stranded DNA or partially double-stranded/partially single-stranded DNA.
- the first and second circle handles can be double-stranded DNA in full length, including double-stranded regions of sequence identity.
- the first and second circle handles can comprise single-stranded regions of sequence identity.
- a non-strand displacing DNA polymerase used herein must elongate in the 5' direction from a primer molecule, the regions of sequence identity cannot have a free 5' end (e.g.
- the 5' ends of the barcoded nucleic acids to be circularized are blocked so that 5' exonuclease cannot digest them.
- the blocking agent can be reversible, so that the blocked end (s) can eventually be connected to form a circularized nucleic acid.
- Suitable blocking agents include, e.g., phosphorothioate bonds, 5' spacer molecules, and Locked Nucleic Acid (LNA) .
- the barcoded nucleic acid is a double-stranded cDNA.
- the first circle handle comprising a region of sequence identity is added to one end of the double-stranded barcoded cDNA; and the second circle handle comprising the same region of sequence identity is added to the other end of the double-stranded barcoded cDNA.
- the addition of the first and second circle handles can be by using PCR amplification.
- the regions of sequence identity incorporated into the barcoded cDNA can be double-stranded.
- the regions of sequence identity within the first and second circle handles can be digested by the exonuclease to generate 3' single-stranded overhangs; the two single-stranded overhangs can anneal to form a gapped molecule; the gaps can be filled in by the polymerase leaving nicks; and nicks can be sealed by the ligase, thereby connecting the two ends of the barcoded cDNA and forming an intact (un-nicked) circularized barcoded cDNA, in which a single copy of the region of sequence identity is retained.
- the regions of sequence identity added to the ends of barcoded nucleic acid are single-stranded and comprise complementary sequences.
- the two single-stranded regions of sequence identity can anneal to form a gapped molecule without the need to be digested by the exonuclease; the gaps can be filled in by the polymerase leaving nicks; and nicks can be sealed by the ligase, thereby connecting the two ends of the barcoded nucleic acid and forming an intact (un-nicked) circularized barcoded nucleic acid.
- the method comprises amplifying the barcoded nucleic acid to generate amplified barcoded nucleic acids, such as amplifying barcoded cDNAs.
- Amplifying the barcoded nucleic acids can comprise amplifying the barcoded nucleic acids using polymerase chain reaction (PCR) to generate the amplified barcoded nucleic acids.
- the barcode oligonucleotide can include a first polymerase chain reaction (PCR) primer-binding sequence and a TSO sequence. The first PCR primer-binding sequence and the TSO sequence can be used to amplify the barcoded nucleic acid, such as a barcoded cDNA.
- the barcode oligonucleotide can include a second polymerase chain reaction (PCR) primer-binding sequence (e.g., a Read 2 sequence) .
- a first primer comprising the sequence of the second PCR primer-binding sequence and a second primer comprising a random sequence (e.g., a random hexamer) can be used to amplify the barcoded nucleic acid, such as a barcoded cDNA.
- the second primer can include one or more non-random sequences, such as a third PCR primer-binding sequence (e.g., a Read 3 sequence) .
- the method comprises amplifying the circularized barcoded nucleic acid to generate a second linear barcoded nucleic acid.
- Amplifying the circularized barcoded nucleic acids can comprise amplifying the circularized barcoded nucleic acids using polymerase chain reaction (PCR) to generate the second linear barcoded nucleic acids.
- the circularized barcoded nucleic acid can comprise a barcode oligonucleotide that can include a first polymerase chain reaction (PCR) primer-binding sequence (e.g., a Read 1 sequence) .
- the first PCR primer-binding sequence can be used to amplify the circularized barcoded nucleic acid, such as a circularized barcoded cDNA.
- the barcode oligonucleotide can include a first polymerase chain reaction (PCR) primer-binding sequence (e.g., a Read 1 sequence) .
- a first primer and a second primer, each comprising the sequence of the first PCR primer-binding sequence or a portion thereof, can be used to amplify the circularized barcoded nucleic acid, such as a circularized barcoded cDNA.
- the amplification of the circularized barcoded nucleic acid generates the second linear barcoded nucleic acid comprising the barcode oligonucleotide on a different end of the nucleic acid sequence corresponding to the nucleic acid target, compared to the barcoded nucleic acid from which the circularized barcoded nucleic acid is generated.
- a first barcoded nucleic acid can comprise from 5’ to 3’: a region of sequence identity, a cell barcode, a UMI, a first PCR primer-binding sequence, a probe sequence (e.g., poly (dT) ) , a nucleic acid sequence corresponding to the nucleic acid target, a TSO, and another copy of the same region of sequence identity.
- the circularized barcoded nucleic acid can comprise the region of sequence identity, the cell barcode, the UMI, the first PCR primer-binding sequence, the probe sequence (e.g., poly-dT) , a nucleic acid sequence corresponding to the nucleic acid target, and the TSO, in the order listed as a loop.
- the second linear barcoded nucleic acid generated by such amplification can comprise from 5’ to 3’: the first PCR primer-binding sequence or a portion thereof, the probe sequence (e.g., poly-dT) , a nucleic acid sequence corresponding to the nucleic acid target, the TSO, the region of sequence identity, the cell barcode, the UMI, and another copy of the first PCR primer-binding sequence or a portion thereof.
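The rearrangement described in the three preceding items can be traced with a symbolic model that treats each barcoded nucleic acid as an ordered list of named elements. This is a sketch under stated assumptions: the element labels and the functions `circularize` and `linearize_at` are hypothetical names introduced for illustration, and "Read 1 or a portion thereof" is represented by a single `read1` label at both ends of the product.

```python
# Element order of the first barcoded nucleic acid, 5' to 3', as listed in the text.
FIRST_LINEAR = [
    "identity", "cell_barcode", "UMI", "read1",
    "poly_dT", "target", "TSO", "identity",
]


def circularize(linear):
    """Join the two ends through the shared region of sequence identity.

    Only a single copy of the region is retained, so the trailing copy is
    dropped. The returned list is read as a loop (last element joins first).
    """
    if linear[0] != linear[-1]:
        raise ValueError("both ends must carry the same region of sequence identity")
    return linear[:-1]


def linearize_at(circle, primer_site):
    """Model amplification with two primers located in `primer_site`.

    The loop is opened at the primer site, so the linear product starts and
    ends with a copy (or portion) of that site.
    """
    i = circle.index(primer_site)
    return circle[i:] + circle[:i] + [primer_site]


circle = circularize(FIRST_LINEAR)
second_linear = linearize_at(circle, "read1")
print(second_linear)
```

In this model the cell barcode and UMI, which start next to the poly(dT) end of the target, end up adjacent to the TSO end after linearization, matching the element order given for the second linear barcoded nucleic acid.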
- the second linear barcoded nucleic acid can be purified after amplification.
- the barcoded nucleic acids are further processed prior to sequencing to generate processed barcoded nucleic acids.
- the method can include amplification of barcoded nucleic acids, fragmentation of amplified barcoded nucleic acids, end repair of fragmented barcoded nucleic acids, A-tailing of fragmented barcoded nucleic acids that have been end-repaired (e.g., to facilitate ligation to adapters) , and attaching (e.g. by ligation and/or PCR) with a second sequencing primer sequence (e.g. a Read 2 sequence) , sample indexes (e.g. short sequences specific to a given sample library) , and/or flow cell binding sequences (e.g. P5 and/or P7) . Additional PCR amplification can also be performed. This process can also be referred to as sequencing library construction.
- the method comprises performing a polymerase chain reaction in bulk, subsequent to the pooling, on the pooled barcoded nucleic acids, thereby generating amplified barcoded nucleic acids.
- PCR amplification can be carried out to generate sufficient mass for the subsequent library construction processes. PCR amplification can also be performed with primers specific to target nucleic acids of interest.
- the method comprises fragmenting (e.g., via enzymatic fragmentation, mechanical force, chemical treatment, etc. ) the pooled barcoded nucleic acids to generate fragmented barcoded nucleic acids. Fragmentation can be carried out by any suitable process such as physical fragmentation, enzymatic fragmentation, or a combination of both.
- the barcoded nucleic acids can be sheared physically using acoustics, nebulization, centrifugal force, needles, or hydrodynamics.
- the barcoded nucleic acids can also be fragmented using enzymes, such as restriction enzymes and endonucleases.
- Fragmentation can yield fragments of a desired size for subsequent sequencing.
- the desired sizes of the fragmented nucleic acids are determined by the limitations of the next generation sequencing instrumentation and by the specific sequencing application as will be understood by a person skilled in the art.
- the fragmented nucleic acids can have a length of between about 50 bases and about 1,500 bases.
- the fragmented barcoded nucleic acids are about 100 bp to about 700 bp in length.
- Fragmented barcoded nucleic acids can undergo end-repair and A-tailing (to add one or more adenine bases) to form an A overhang.
- This A overhang allows an adapter containing one or more thymine overhanging bases to base pair with the fragmented barcoded nucleic acids.
- Fragmented barcoded nucleic acids can be further processed by adding additional sequences (e.g. adapters) for use in sequencing based on specific sequencing platforms.
- Adapters can be attached to the fragmented barcoded nucleic acids by ligation using a ligase and/or PCR.
- fragmented barcoded nucleic acids can be processed by adding a second sequencing primer sequence.
- the second sequencing primer sequence can comprise a Read 2 sequence.
- An adapter comprising the second primer sequence can be ligated to the fragmented barcoded nucleic acids after, for example, end-repair and A tailing, using a ligase.
- the adaptor can include one or more thymine (T) bases that can hybridize to the one or more A bases added by A tailing.
- An adaptor can be, for example, partially double-stranded or double stranded.
- the adapter can also include platform-specific sequences for fragment recognition by specific sequencing instruments.
- the adapter can comprise a sequence for attaching the fragmented barcoded nucleic acids to a flow cell of Illumina platforms, such as a P5 sequence, a P7 sequence, or a portion thereof.
- Different adapter sequences can be used for different next generation sequencing instruments as will be understood by a person skilled in the art.
- the adapter can also contain sample indexes to identify samples and to permit multiplexing. Sample indexes enable multiple samples to be sequenced together (i.e. multiplexed) on the same instrument flow cell as will be understood by a person skilled in the art. Adapters can comprise a single sample index or dual sample indexes depending on the implementations such as the number of libraries combined and the level of accuracy desired.
- the amplified barcoded nucleic acids generated from sequencing library construction can include a P5 sequence, a sample index, a Read 1 sequence, a cell barcode, a UMI, a poly (dT) sequence, a target binding region, a sequence of a sample nucleic acid or a portion thereof, a Read 2 sequence, a sample index, and/or a P7 sequence (e.g., from 5’-end to 3’-end) .
- the amplified barcoded nucleic acids can include a P5 sequence, a sample index, a Read 1 sequence, a cell barcode, a UMI, a sequence of a template switching oligonucleotide, a sequence of a sample nucleic acid or a portion thereof, a Read 2 sequence, a sample index, and/or a P7 sequence (e.g., from 5’-end to 3’-end) .
- sequencing the barcoded nucleic acids, or products thereof comprises sequencing products of the barcoded nucleic acids.
- Products of the barcoded nucleic acids can include the processed nucleic acids generated by any step of the sequencing library construction process, such as amplified barcoded nucleic acids, fragmented barcoded nucleic acids, fragmented barcoded nucleic acids comprising additional sequences such as the second sequencing primer sequence and/or adapter sequences described herein.
- the method disclosed herein can comprise sequencing the plurality of barcoded nucleic acids or products thereof to obtain nucleic acid sequences of the plurality of barcoded nucleic acids.
- the barcoded nucleic acids generated by the method disclosed herein comprise barcoded nucleic acids retained in a partition and barcoded nucleic acids pooled, from each partition, into a pooled mixture outside the partitions.
- the barcoded nucleic acids retained in a partition and the pooled barcoded nucleic acids in a pooled mixture outside the partitions can be sequenced using a same or different sequencing technique.
- sequencing the plurality of barcoded nucleic acids or products thereof comprises sequencing the pooled barcoded nucleic acids to obtain nucleic acid sequences of the pooled barcoded nucleic acids.
- a “sequence” can refer to the sequence, a complementary sequence thereof (e.g., a reverse, a complement, or a reverse complement), the full-length sequence, a subsequence, or a combination thereof.
- the nucleic acid sequences of the pooled barcoded nucleic acids can each comprise a sequence of a barcode oligonucleotide (e.g. the cell barcode and the UMI) and a sequence of a sample nucleic acid associated with the cell or a reverse complement thereof.
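The three related forms named in the definition above (reverse, complement, reverse complement) can be illustrated for a DNA string. This is a minimal sketch; the function names are illustrative, and only the bases A, C, G, T and N (upper or lower case) are handled.

```python
# Translation table mapping each base to its Watson-Crick complement.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse(seq: str) -> str:
    """Return the sequence read in the opposite direction."""
    return seq[::-1]


def complement(seq: str) -> str:
    """Return the base-by-base complement, in the same direction."""
    return seq.translate(_COMPLEMENT)


def reverse_complement(seq: str) -> str:
    """Return the complement read in the opposite direction."""
    return seq.translate(_COMPLEMENT)[::-1]


print(reverse("ATGC"))             # CGTA
print(complement("ATGC"))          # TACG
print(reverse_complement("ATGC"))  # GCAT
```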
- Pooled barcoded nucleic acids can be sequenced using any suitable sequencing method identifiable to a person skilled in the art. For example, sequencing the pooled barcoded nucleic acids can be performed using high-throughput sequencing, pyrosequencing, sequencing-by-synthesis, single-molecule sequencing, nanopore sequencing, sequencing-by-ligation, sequencing-by-hybridization, next generation sequencing, massively-parallel sequencing, primer walking, and any other sequencing methods known in the art and suitable for sequencing the barcoded nucleic acids generated using the methods herein described.
- sequencing the plurality of barcoded nucleic acids or products thereof comprises sequencing the barcoded nucleic acids retained in the partitions to obtain the nucleic acid sequences of the retained barcoded nucleic acids.
- Sequencing the barcoded nucleic acids retained in the partitions can comprise sequencing the entire sequence of a barcoded nucleic acid or sequencing a portion of the sequence of a barcoded nucleic acid, such as the cell barcode sequence of a barcoded nucleic acid.
- sequencing the barcoded nucleic acids retained in the partition can comprise determining the cell barcode sequences of the barcoded nucleic acids retained in the partition using oligonucleotide probes each comprising a fluorescent label.
- the cell barcode sequences of the barcoded nucleic acids retained in the partition can be determined using sequencing-by-ligation.
- the sequencing-by-ligation process can be carried out in the same microfluidic device used for performing other steps of the methods described herein, such as partitioning cells and barcode molecules and barcoding sample nucleic acids, without the necessity to transfer the barcoded nucleic acids elsewhere and therefore can be referred to as on-chip sequencing.
- a first sequencing primer is hybridized to a single-stranded barcoded nucleic acid to be sequenced.
- a mixture e.g., 16
- n-mer probes e.g. 8-mer probes
- m e.g., four
- the number of n-mer probes can be, for example, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, or more.
- the n-mer probes can be, for example, 2-, 3-, 4-, 5-, 6-, 7-, 8-, 9-, 10-, 11-, 12-, 13-, 14-, 15-, 16-, 17-, 18-, 19-, 20-, or more.
- the number of fluorescent labels used can be, for example, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more.
- the fluorophore encoding which can be based on the two 3’-most nucleotides of a probe, is read. Three bases including the dye can be cleaved from the 5’ end of the probe, leaving a free 5’ phosphate on the extended primer, which can be then available for further ligation. After multiple ligations (e.g.
- the synthesized strands can be melted and the ligation product can be washed away before a second sequencing primer is annealed.
- a second sequencing primer can then hybridize to the single-stranded barcoded nucleic acid at a base position shifted by one nucleotide with respect to the position the first sequencing primer binds to.
- the ligation process can be then repeated for the second sequencing primer.
- the same process can be followed for the rest of the sequencing primers.
- the dye read outs can be converted to a sequence.
- 5 different sequencing primers are provided to sequence the first barcode sequences of the single-stranded barcoded nucleic acids retained in the partition.
- determining the cell barcode sequences of the barcoded nucleic acids retained in the partition using sequencing-by-ligation can comprise introducing a sequencing primer capable of hybridizing to the barcoded nucleic acids retained in the partition.
- the method can also comprise extending the sequencing primer using the barcoded nucleic acids retained in the partition as templates to generate a plurality of extended sequencing primers comprising the barcode sequences, or a portion thereof, of the barcoded nucleic acids retained in the partition.
- a different sequencing primer can be introduced and extended in each of one or more cycles herein described (e.g.
- the method can also comprise introducing a plurality of oligonucleotide probes each comprising a fluorescent label.
- the plurality of oligonucleotide probes can be octamer probes.
- the obtained nucleic acid sequences of the barcoded nucleic acids can be subjected to any downstream post-sequencing data analysis as will be understood by a person skilled in the art.
- the sequence data can undergo a quality control process to remove adapter sequences, low-quality reads, uncalled bases, and/or to filter out contaminants.
- the high-quality data obtained from the quality control can be mapped or aligned to a reference genome or assembled de novo.
- Analyzing the sequence information can comprise determining a number of the barcoded nucleic acids of each of the nucleic acid targets comprising UMIs with different sequences; and/or determining sequences of the barcoded nucleic acids of the nucleic acid targets comprising UMIs with different sequences.
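Counting barcoded nucleic acids by distinct UMI, as described above, amounts to collapsing reads that share a cell barcode, target and UMI. The sketch below assumes reads have already been aligned and annotated as `(cell_barcode, target, umi)` tuples; that record layout and the function name `count_molecules` are assumptions for illustration, and no UMI error correction is attempted.

```python
from collections import defaultdict


def count_molecules(reads):
    """Count distinct UMIs per (cell barcode, target).

    `reads` is an iterable of (cell_barcode, target, umi) tuples. Reads that
    repeat the same UMI for the same cell and target are counted once, since
    they are taken to be amplification copies of one original molecule.
    """
    umis = defaultdict(set)
    for cell_barcode, target, umi in reads:
        umis[(cell_barcode, target)].add(umi)
    return {key: len(seen) for key, seen in umis.items()}


example_reads = [
    ("AAAC", "GAPDH", "TTGA"),
    ("AAAC", "GAPDH", "TTGA"),  # duplicate of the read above
    ("AAAC", "GAPDH", "CCGT"),
    ("GGTA", "ACTB", "TTGA"),
]
print(count_molecules(example_reads))
```

Here the cell `AAAC` yields two GAPDH molecules from three reads, because two reads carry the same UMI.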
- Gene expression quantification and differential expression analysis can be carried out to identify genes whose expression differs under different conditions, such as, external stimuli and/or signals received from other cells.
- the method can comprise determining a profile (e.g. an expression profile, an omics profile, or a multi-omics profile) of the sample nucleic acids associated with the cell.
- a profile can be a single omics profile, such as a transcriptome profile.
- the profile can be a multi-omics profile, which can include profiles of genome (e.g. a genomics profile) , proteome (e.g. a proteomics profile) , transcriptome (e.g. a transcriptomics profile) , epigenome (e.g. an epigenomics profile) , metabolome (e.g. a metabolomics profile) , and/or microbiome (e.g. microbiome profile) .
- the profile can include an RNA expression profile.
- the profile can include a protein expression profile.
- the expression profile can comprise an RNA expression profile, an mRNA expression profile, and/or a protein expression profile.
- the expression profile can comprise an absolute abundance or a relative abundance.
- a profile can also be a profile of one or more target nucleic acids (e.g. gene markers) or a selection of genes associated with the cell.
- Analyzing the sequencing information can comprise determining the pairing between the 5’ and 3’ sequences of the nucleic acid targets.
- the 5’ and 3’ sequences of the same nucleic acid target can be provided with the same unique identifier (e.g., cell barcode and UMI) .
- the ability to match the 5’ and 3’ sequences derived from the same nucleic acid target is provided by the assignment of unique identifiers (e.g., cell barcode) specifically to the nucleic acid target.
- nucleic acid barcodes can be assigned or associated with the 5’ and 3’ ends of the nucleic acid target, in order to tag or label the 5’ and 3’ sequences with the unique identifiers. These unique identifiers can then be used to attribute the 5’ and 3’ sequences to the same nucleic acid target. In some examples, this is carried out by the circularization of barcoded nucleic acid and amplification of the circularized barcoded nucleic acid as described above.
- the nucleic acid target can be an mRNA.
- a barcode oligonucleotide comprising cell barcode, UMI, first PCR primer-binding sequence and poly (dT) sequence can be added to a cDNA at the end corresponding to the 3’ end of the mRNA target producing a barcoded cDNA, by hybridizing the poly (A) tail of the mRNA and reverse transcription.
- the barcoded cDNA can be divided into a first and a second portions. The first portion of the barcoded cDNAs can be circularized to produce circularized barcoded cDNAs.
- the circularized barcoded cDNAs can be then amplified to produce second linear barcoded cDNAs, with the cell barcode and the UMI attached to the cDNAs at the end corresponding to the 5’ end of the mRNA target.
- the cell barcode and the UMI in the second portion of the barcoded cDNA and the second linear barcoded cDNA can comprise the same sequence, allowing the sequences of the second portion of the barcoded cDNA and the second linear barcoded cDNA to be attributed to the same mRNA target.
- Analyzing the sequencing information can comprise integrating the 5’ and 3’ sequences of the same nucleic acid target to obtain the full-length sequence information of the nucleic acid target.
- the 5’ and 3’ sequences attributed to the same nucleic acid target can be obtained using the method of pairing/matching the 5’ and 3’ sequences of the same nucleic acid target described above. After pairing, the 5’ and 3’ sequences of the same nucleic acid target can be integrated, such as by aligning an overlapping sequence.
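The pairing and integration steps can be illustrated with a toy sketch. It assumes that one consensus sequence per molecule is already available for each library, keyed by a hypothetical `(cell_barcode, umi)` tuple, and that the two sequences share an exact overlap; the function names and the `min_overlap` parameter are illustrative assumptions, not the disclosed procedure.

```python
def merge_by_overlap(five_prime, three_prime, min_overlap=4):
    """Join two sequences where a suffix of five_prime equals a prefix of three_prime.

    Tries the longest candidate overlap first; returns None if no exact
    overlap of at least min_overlap bases exists.
    """
    max_len = min(len(five_prime), len(three_prime))
    for k in range(max_len, min_overlap - 1, -1):
        if five_prime[-k:] == three_prime[:k]:
            return five_prime + three_prime[k:]
    return None


def pair_and_merge(five_reads, three_reads, min_overlap=4):
    """Pair sequences sharing the same (cell_barcode, umi) key and merge them."""
    merged = {}
    for key in five_reads.keys() & three_reads.keys():
        merged[key] = merge_by_overlap(five_reads[key], three_reads[key], min_overlap)
    return merged


five = {("AAAC", "GGTT"): "ATGGCCTTAGC"}
three = {("AAAC", "GGTT"): "TTAGCGGATAA", ("TTTG", "CCAT"): "GGGG"}
full_length = pair_and_merge(five, three)
```

Only the molecule present in both libraries is merged; the unpaired 3’ entry is ignored. With real short-read data, alignment to a reference would typically replace the exact string match.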
- the method disclosed herein can be used to determine a profile (e.g., an expression profile, an omics profile, or a multi-omics profile) of a cell, such as to detect changes in gene expression profile of the cell in terms of identification of RNA transcripts and their quantitation.
- a profile of a cell can be determined using the nucleic acid sequences of the plurality of barcoded nucleic acids.
- the profile can comprise a transcriptomics profile, a multi-omics profile such as a genomics profile, a proteomics profile, a transcriptomics profile, an epigenomics profile, a metabolomics profile, a chromatics profile, or a combination thereof.
- determining the profile of the cell can comprise determining the profile of the cell using the UMIs and sequences of the sample nucleic acids, or a portion thereof, present in the nucleic acid sequences.
- the cell can have a differential expression of genes upon stimulation.
- a differential expression analysis can be performed to detect quantitative changes in expression levels of the cell. Genes expressed differentially can be detected. Differential expression profile can be correlated to the cell functionality and/or cell’s phenotypes.
- compositions for single cell sequencing or single cell analysis comprise a plurality of beads of the present disclosure.
- the cell barcodes of the plurality of barcode oligonucleotides attached to each of the plurality of beads can be identical.
- the cell barcodes of barcode oligonucleotides attached to different beads of the plurality of beads can be different.
- the number of beads can be different in different embodiments.
- the number of beads is, is about, is at least, is at least about, is at most, or is at most about, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000, 20000, 30000, 40000, 50000, 60000, 70000, 80000, 90000, 100000, 200000, 300000, 400000, 500000, 600000, 700000, 800000, 900000, 1000000, 2000000, 3000000, 4000000, 5000000, 6000000, 7000000, 8000000, 9000000, 10000000, 20000000, 30000000, 40000000, 50000000, 60000000, 70000000, 80000000, 90000000, 100000000, 200000000, 300000000, 400000000, 500000000, 600000000, 700000000, 800000000, 900000000, 1000000000, or a number or a range between any two of these values.
- kits for single cell sequencing or single cell analysis comprise a composition comprising a plurality of beads of the present disclosure.
- the kit can comprise instructions for using the composition for single cell sequencing or single cell analysis.
- a method of generating beads comprising barcode oligonucleotides comprises providing a plurality of beads each attached to a plurality of oligonucleotide barcodes.
- Each barcode oligonucleotide of the plurality of barcode oligonucleotides can comprise a cell barcode, a unique molecular identifier (UMI) , and a poly (dT) sequence.
- the method can comprise adding, to the 3’-end of each of the barcode oligonucleotides of the plurality of barcode oligonucleotides, a probe sequence that is not a poly (dT) sequence and is capable of binding to a nucleic acid target.
- adding the probe sequence comprises adding the probe sequence to the 3’-end of each of the barcode oligonucleotides of the plurality of barcode oligonucleotides chemically. In some embodiments, adding the probe sequence comprises adding the probe sequence to the 3’-end of each of the barcode oligonucleotides of the plurality of barcode oligonucleotides using an enzyme. In some embodiments, the enzyme is a ligase. Adding the probe sequence can comprise ligating a probe oligonucleotide comprising the probe sequence to the 3’-end of each of the barcode oligonucleotides of the plurality of barcode oligonucleotides using the ligase.
- the enzyme is a DNA polymerase.
- Adding the probe sequence can comprise synthesizing the probe sequence at the 3’-end of each of the barcode oligonucleotides of the plurality of barcode oligonucleotides using the DNA polymerase.
- a method of generating beads comprising barcode oligonucleotides comprises providing a plurality of beads each attached to a plurality of oligonucleotide barcodes.
- Each barcode oligonucleotide of the plurality of barcode oligonucleotides can comprise a cell barcode and a unique molecular identifier (UMI) .
- the method can comprise adding to 3’-end of each of barcode oligonucleotides of the plurality of barcode oligonucleotides (i) a poly (dT) sequence and/or (ii) a probe sequence that is a non-poly (dT) sequence and is capable of binding to a nucleic acid target.
- the Examples provide a non-limiting method for simultaneously detecting full-length transcript sequences at a high-throughput single-cell level. It involved capturing mRNA using magnetic beads with poly (dT) tails. The beads were attached with cell barcodes for cell identification and UMIs for transcript quantification. After the capturing, cDNA was synthesized through reverse transcription and PCR amplification. A portion of the cDNA was used for 3' end transcriptome library construction to obtain gene expression and sequence information from the 3' end. Another portion was subjected to ligation-mediated amplification and reverse PCR to obtain expression quantification and sequence information of genes from the 5' end. By integrating the information from the 3' and 5' ends, the complete full-length sequence of transcript was obtained.
- single-cell suspension was loaded onto a microfluidic chip, and individual cells were isolated into individual wells on the chip. Then, capture beads with cell barcodes and UMIs were loaded onto the chip. Based on the diameters of the beads and wells (e.g., approximately 25 µm and 40 µm, respectively) , only one bead was loaded into each well. 100 µL of cell lysis buffer was loaded into the chip, and the chip was incubated at room temperature for 15 minutes to lyse cells and capture RNA. After 15 minutes, the magnetic beads with captured RNA were taken out of the microchip and subjected to reverse transcription and PCR amplification to generate cDNA.
- cDNA was used for 3' end transcriptome library construction, following the specific methods provided by the Singleron GEXSCOPE kit. The remaining cDNA was used for 5' end library construction. Prior to 5' end library construction, the cDNA underwent circularization and amplification, following the specific methods outlined below:
- For the Circularization Mix, 250 ng of cDNA product was taken.
- the Circularization Mix was prepared on ice according to the following table, vortexed to mix and centrifuged briefly.
- the circularization program was set up on a thermal cycler, according to the following table.
- the temperature of the lid of the thermal cycler was set at 85°C.
- the Circularization Mix was mixed well by pipetting and centrifuged briefly. The Circularization Mix was placed in the thermal cycler and the circularization program was run.
- cyclicase is a mixture of an exonuclease, a DNA polymerase, and a ligase.
- the digestion reaction mix was mixed well by pipetting, centrifuged briefly and kept on ice.
- the digestion program was set up on a thermal cycler, according to the following table.
- the lid of the thermal cycler was set to be OFF.
- the PCR tube was placed in the thermal cycler and the digestion program was run.
- 0.5 mL of 80% ethanol per reaction was prepared.
- the enriched product was centrifuged briefly and the volume was measured with a pipette.
- AMPure beads were vortexed until homogenized and 32.5 µL of beads were added into 25 µL of the fragmented product obtained from the digestion step above.
- the product and the beads were mixed well by vortexing and incubated at room temperature for 5 minutes.
- the tube was centrifuged briefly and placed on the magnetic rack for 5 minutes or until the liquid was clear. The supernatant was carefully removed and discarded without disturbing the beads.
- the tube was kept on the magnetic stand and 200 µL of freshly prepared 80% ethanol was added to wash the magnetic beads.
- the tube was incubated at room temperature for 30 seconds, and the supernatant was carefully aspirated without disrupting the beads.
- the wash with 80% ethanol was repeated one more time.
- the tube was centrifuged briefly and returned onto the magnetic stand. The excess of ethanol was removed using a fine pipet tip.
- the lid was kept open to dry the beads for about 2 minutes or until the beads were not shiny anymore, but no more than 5 minutes.
- the tube was removed from the magnetic stand.
- the target was eluted from the beads by adding 20 µL of nuclease-free water.
- the beads and the nuclease-free water were mixed well by pipetting up and down 10 times and incubated for at least 5 minutes at room temperature.
- the tube was centrifuged briefly, and placed back onto the magnetic stand until the liquid was clear.
- the supernatant (purified product) was transferred to a new EP tube. Quantification of the purified product was not necessary.
- the PCR Mix was prepared on ice according to the following table, vortexed to mix and centrifuged briefly.
- 200 µL of PCR Mix was taken, mixed by pipetting and distributed into PCR tubes, with a volume of 50 µL in each tube.
- the PCR tubes were covered and placed in a PCR machine for amplification, with a lid temperature of 105°C and a reaction volume of 50 µL.
- the PCR program was set as in the table below. After the completion of the PCR program, the amplification products can be stored at 4°C for 48 hours or at -20°C for 3 months. Alternatively, the amplification products were purified by proceeding directly with the cDNA purification step below.
- 0.5 mL of 80% ethanol per reaction was prepared.
- the enriched product was centrifuged briefly and the volume was measured with a pipette.
- AMPure beads were vortexed until homogenized and 40 µL of beads were added into 50 µL of first-round PCR-enriched product from the amplification step above.
- the beads and the PCR product were mixed well by vortexing and incubated at room temperature for 5 minutes.
- the tube was centrifuged briefly and placed on the magnetic rack for 5 minutes or until the liquid was clear. The supernatant was carefully removed and discarded without disturbing the beads.
- the tube was kept on the magnetic stand and 200 µL of freshly prepared 80% ethanol was added to wash the magnetic beads.
- the tube was incubated at room temperature for 30 seconds, and the supernatant was carefully aspirated without disrupting the beads.
- the 80% ethanol wash step was repeated one more time.
- the tube was centrifuged briefly and returned onto the magnetic stand. The excess of ethanol was removed using a fine pipet tip.
- the lid was kept open to dry the beads for about 2 minutes or until the beads were not shiny anymore, but no more than 5 minutes.
- the tube was removed from the magnetic stand.
- the target was eluted from the beads by adding 20 µL of elution buffer (EB) .
- the beads and the EB were mixed well by pipetting up and down 10 times and incubated for at least 5 minutes at room temperature.
- the tube was centrifuged briefly and placed back on the magnetic stand until the liquid was clear.
- the supernatant (purified product) was transferred to a new EP tube.
- the product can be stored at 4°C for 72 hours or at -20°C for 3 months or directly proceed to the next step.
- the quality control of the product was conducted by taking 1 µL of sample for Qubit concentration detection.
- the amplified products after circularization served as templates for the construction of the 5' end transcriptome library.
- the construction method for the 5' end transcriptome library followed the enrichment library construction method provided by the Singleron sCircle kit.
- the aforementioned single-cell sequencing method was applied to a mixed sample of human and mouse cell lines. 3T3 cells from mice and CCRF cells from humans were mixed in a 1:1 ratio, and single-cell sequencing was performed. The transcriptome information obtained from the sequencing was used for cell clustering analysis. Additionally, the coverage of the 3' and 5' ends was evaluated, along with the assessment of complete coverage of the full-length gene after integrating the 3' and 5' sequence information.
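One common way to read out a human/mouse mixing experiment of this kind is to assign each cell barcode to a species from its per-species UMI totals. The sketch below is an assumption about how such a check could be done, not the analysis used in the Examples; the function name and the 0.9 purity threshold are hypothetical.

```python
def classify_cell(human_umis, mouse_umis, purity=0.9):
    """Assign a cell barcode to a species from its per-species UMI totals.

    purity is an assumed cutoff: a barcode is called single-species when at
    least this fraction of its UMIs map to one genome, otherwise 'mixed'.
    """
    total = human_umis + mouse_umis
    if total == 0:
        return "empty"
    if human_umis / total >= purity:
        return "human"
    if mouse_umis / total >= purity:
        return "mouse"
    return "mixed"


# Toy per-barcode totals: (human UMIs, mouse UMIs)
calls = [classify_cell(h, m) for h, m in [(950, 50), (10, 990), (500, 500), (0, 0)]]
```

The fraction of barcodes called "mixed" gives a rough estimate of how often two cells shared a partition.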
- a single experiment allowed simultaneous detection of gene expression at both the 5' and 3' ends using the obtained cDNA.
- This approach provided full-length sequence information of transcripts and a more comprehensive understanding of gene expression.
- the method can also be performed in high throughput, allowing for the sequencing of tens of thousands of single-cell full-length transcriptomes in a single experiment.
- the library preparation and sequencing strategy were well-developed, utilizing second-generation sequencing technology to achieve highly accurate full-length single-cell transcriptome sequencing at a relatively low cost.
Abstract
Disclosed herein include methods, compositions, and kits suitable for single cell target analysis, including but not limited to, high-throughput analysis of the full length of nucleic acid sequences from both 3' and 5' ends.
Description
The present disclosure generally relates to molecular biology. More specifically, provided herein include methods, compositions, kits and systems for high-throughput single cell sequencing.
Single-cell transcriptome technology has been rapidly developed. However, current technology cannot sequence the full length of the nucleic acids with high accuracy and in high throughput. Therefore, there is an urgent need to establish a high-throughput single-cell full-length library preparation and sequencing method.
Disclosed herein include methods for single cell analysis. In some embodiments, the method comprises: partitioning a cell of a plurality of cells and a bead of a plurality of beads attached with a plurality of barcode oligonucleotides into a partition of a plurality of partitions; hybridizing the plurality of barcode oligonucleotides attached to the bead in the partition with the RNA targets associated with the cell in the partition; reverse transcribing the RNA targets hybridized to the barcode oligonucleotides to generate a first plurality of barcoded complementary deoxyribonucleic acids (cDNAs) ; obtaining a first portion and a second portion of the first plurality of barcoded cDNAs; circularizing each of the first portion of the first plurality of barcoded cDNAs to generate a plurality of circularized barcoded cDNAs; amplifying the plurality of circularized barcoded cDNAs to generate a second plurality of linear barcoded cDNAs; and analyzing the second portion of the first plurality of barcoded cDNAs and the second plurality of linear barcoded cDNAs, or products thereof. In some embodiments, each barcode oligonucleotide of the plurality of barcode oligonucleotides comprises a cell barcode, a unique molecular identifier (UMI) and a probe sequence. In some embodiments, the probe sequence is capable of binding to an RNA target associated with the cell. In some embodiments, the first plurality of barcoded cDNAs comprises the barcode oligonucleotides and cDNAs corresponding to the RNA targets. In some embodiments, the cDNAs corresponding to the RNA targets comprise one end attached to the UMI and the cell barcode and the other end. In some embodiments, the RNA target comprises a messenger RNA (mRNA) .
The method for single cell analysis, in some embodiments, comprises: partitioning a cell of a plurality of cells and a bead of a plurality of beads attached with a plurality of barcode oligonucleotides into a partition of a plurality of partitions; barcoding the nucleic acid targets associated with the cell in the partition to generate a first plurality of barcoded nucleic acids; obtaining a first portion and a second portion of the first plurality of barcoded nucleic acids; circularizing each of the first portion of the first plurality of barcoded nucleic acids to generate a plurality of circularized barcoded nucleic acids; amplifying the plurality of circularized barcoded nucleic acids to generate a second plurality of linear barcoded nucleic acids; and analyzing the second portion of the first plurality of barcoded nucleic acids and the second plurality of linear barcoded nucleic acids, or products thereof. In some embodiments, each barcode oligonucleotide of the plurality of barcode oligonucleotides comprises a cell barcode, a unique molecular identifier (UMI) and a probe sequence. In some embodiments, the probe sequence is capable of binding to a nucleic acid target associated with the cell. In some embodiments, the first plurality of barcoded nucleic acids comprises the barcode oligonucleotides and nucleotide sequences corresponding to the nucleic acid targets. In some embodiments, the nucleotide sequences corresponding to the nucleic acid targets comprise one end attached to the UMI and the cell barcode and the other end.
The nucleic acid targets can, e.g., comprise a ribonucleic acid (RNA) , a messenger RNA (mRNA) , and a deoxyribonucleic acid (DNA) . In some embodiments, the nucleic acid targets comprise nucleic acid targets of the cell, from the cell, in the cell, and/or on the surface of the cell.
The partition can be a droplet or a microwell. In some embodiments, the plurality of partitions comprises a plurality of microwells of a microwell array. In some embodiments, the plurality of partitions comprises at least 1000 partitions. In some embodiments, at least 50% of partitions of the plurality of partitions comprise a single cell of the plurality of cells and a single bead of the plurality of beads. In some embodiments, at most 10% of partitions of the plurality of partitions comprise two or more cells of the plurality of cells. In some embodiments, at most 10% of partitions of the plurality of partitions comprise no cell of the plurality of cells. In some embodiments, at most 10% of partitions of the plurality of partitions comprise two or more beads of the plurality of beads. In some embodiments, at most 10% of partitions of the plurality of partitions comprise no bead of the plurality of beads.
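The occupancy fractions named in this paragraph can be computed from per-partition counts. The following is a minimal sketch, assuming a hypothetical list of `(n_cells, n_beads)` tuples obtained, for example, from imaging the microwell array; the function and key names are illustrative.

```python
def occupancy_summary(partitions):
    """Return the fraction of partitions in each occupancy class.

    partitions: non-empty list of (n_cells, n_beads) tuples, one per partition.
    The classes are not mutually exclusive (e.g. two cells and no bead counts
    toward both 'multi_cell' and 'no_bead').
    """
    n = len(partitions)
    return {
        "single_cell_single_bead": sum(c == 1 and b == 1 for c, b in partitions) / n,
        "multi_cell": sum(c >= 2 for c, _ in partitions) / n,
        "no_cell": sum(c == 0 for c, _ in partitions) / n,
        "multi_bead": sum(b >= 2 for _, b in partitions) / n,
        "no_bead": sum(b == 0 for _, b in partitions) / n,
    }


# Toy array of 10 partitions
wells = [(1, 1)] * 6 + [(2, 1), (0, 1), (1, 0), (1, 2)]
summary = occupancy_summary(wells)
```

Each fraction can then be compared with the corresponding threshold in the paragraph (at least 50% for single cell plus single bead, at most 10% for the others).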
The probe sequence can be, e.g., at least 10 nucleotides in length. In some embodiments, the probe sequence is not a poly-dT sequence. In some embodiments, the barcode oligonucleotide comprising the probe sequence is capable of binding to a non-poly-A RNA target and/or nucleic acid target. In some embodiments, the barcode oligonucleotides comprising probe sequences that are not poly-dT sequences are capable of binding to an identical non-poly-A RNA target and/or nucleic acid target. In some embodiments, the barcode oligonucleotides comprising probe sequences that are not poly-dT sequences are capable of binding to different non-poly-A RNA targets and/or nucleic acid targets. In some embodiments, the probe sequence is a poly-dT sequence. In some embodiments, the poly-dT sequence is at least 10 nucleotides in length. In some embodiments, the poly-dT sequences of the barcode oligonucleotides attached to a bead of the plurality of beads are identical. In some embodiments, the probe sequences of barcode oligonucleotides comprise a degenerate sequence. In some embodiments, the degenerate sequence is at least 3 nucleotides in length. In some embodiments, the degenerate sequence spans, or corresponds to, a mutation. In some embodiments, the probe sequences of barcode oligonucleotides span a region of interest. In some embodiments, the probe sequence is adjacent to a region of interest.
In some embodiments, the cell barcodes of two barcode oligonucleotides attached to a bead of the plurality of beads comprise an identical sequence. In some embodiments, the cell barcodes of two barcode oligonucleotides attached to two beads of the plurality of beads comprise different sequences. In some embodiments, the cell barcode of each barcode oligonucleotide is at least 6 nucleotides in length. In some embodiments, the UMIs of two barcode oligonucleotides attached to a bead of the plurality of beads comprise different sequences. In some embodiments, the UMIs of two barcode oligonucleotides attached to two beads of the plurality of beads comprise an identical sequence. In some embodiments, the UMI of each barcode oligonucleotide is at least 6 nucleotides in length. In some embodiments, the barcode oligonucleotide further comprises a first polymerase chain reaction (PCR) primer-binding sequence. In some embodiments, the first PCR primer-binding sequence comprises a Read 1 sequence. In some embodiments, the barcode oligonucleotide comprises from the 5’ end to the 3’ end, the cell barcode, the UMI, the PCR primer-binding sequence, and the probe sequence or the UMI, the cell barcode, the PCR primer-binding sequence, and the probe sequence.
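The first 5’-to-3’ layout described above (cell barcode, UMI, PCR primer-binding sequence, probe sequence) can be modeled as a simple record. This is a sketch under stated assumptions: the class name, the element lengths, and the placeholder sequences are hypothetical and do not correspond to any disclosed oligonucleotide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BarcodeOligo:
    """One barcode oligonucleotide, elements listed 5' to 3'."""
    cell_barcode: str
    umi: str
    primer_site: str  # first PCR primer-binding sequence (placeholder here)
    probe: str        # e.g. a poly(dT) stretch or a target-specific probe

    def sequence(self):
        return self.cell_barcode + self.umi + self.primer_site + self.probe


def parse_oligo(seq, cb_len, umi_len, primer_len):
    """Split a 5'-to-3' sequence into its elements given assumed fixed lengths."""
    umi_start = cb_len
    primer_start = cb_len + umi_len
    probe_start = primer_start + primer_len
    return BarcodeOligo(
        cell_barcode=seq[:umi_start],
        umi=seq[umi_start:primer_start],
        primer_site=seq[primer_start:probe_start],
        probe=seq[probe_start:],
    )


# Placeholder sequences chosen only to make the element boundaries visible
oligo = BarcodeOligo("ACGTACGT", "GGCCAA", "CTCTCTCTCT", "T" * 12)
round_trip = parse_oligo(oligo.sequence(), cb_len=8, umi_len=6, primer_len=10)
```

Oligonucleotides on one bead would share `cell_barcode` and differ in `umi`; the alternative layout with the UMI first would only change the slicing order.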
In some embodiments, the barcode oligonucleotides are reversibly attached to, covalently attached to, or irreversibly attached to the bead. In some embodiments, the bead is a gel bead. In some embodiments, the gel bead is degradable upon application of a stimulus. In some embodiments, the stimulus comprises a thermal stimulus, a chemical stimulus, a biological stimulus, a photo-stimulus, or a combination thereof. In some embodiments, the bead is a solid bead and/or a magnetic bead.
In some embodiments, barcoding the nucleic acid targets associated with the cell comprises: hybridizing the barcode oligonucleotides attached to the bead in each partition of the plurality of partitions with nucleic acid targets associated with the cell in the partition; extending the barcode oligonucleotides attached to the bead and hybridized to the nucleic acid targets using the nucleic acid targets as templates to generate single-stranded barcoded nucleic acids; and generating double-stranded barcoded nucleic acids from the single-stranded barcoded nucleic acids. In some embodiments, generating double-stranded barcoded nucleic acids from the single-stranded barcoded nucleic acids comprises extending the single-stranded barcoded nucleic acids. In some embodiments, extending the single-stranded barcoded nucleic acids comprises extending the single-stranded barcoded nucleic acids using a template switching oligonucleotide.
The method can further comprise pooling the beads prior to extending the barcode oligonucleotides or prior to generating the double-stranded barcoded nucleic acids. In some embodiments, extending the barcode oligonucleotides attached to the bead and hybridized to the nucleic acid targets comprises extending the barcode oligonucleotides attached to the bead and hybridized to the nucleic acid targets in bulk. In some embodiments, generating the double-stranded barcoded nucleic acids comprises generating the double-stranded barcoded nucleic acids from the single-stranded barcoded nucleic acids in bulk. The method can further comprise pooling the beads subsequent to extending the barcode oligonucleotides attached to the bead to generate the single-stranded barcoded nucleic acids or subsequent to generating the double-stranded barcoded nucleic acids. In some embodiments, extending the barcode oligonucleotides attached to the bead and hybridized to the nucleic acid targets comprises extending the barcode oligonucleotides attached to the bead and hybridized to the nucleic acid targets in the partition. In some embodiments, generating the double-stranded barcoded nucleic acids comprises generating the double-stranded barcoded nucleic acids from the single-stranded barcoded nucleic acids in the partition.
In some embodiments, circularizing each of the first portion of the first plurality of barcoded cDNAs comprises connecting the UMIs and the cell barcodes attached to one end of the cDNAs corresponding to the RNA targets to the other end of the cDNAs corresponding to the RNA targets. In some embodiments, circularizing each of the first portion of the first plurality of barcoded nucleic acids comprises connecting the UMIs and the cell barcodes attached to one end of the nucleotide sequences corresponding to the nucleic acid targets to the other end of the nucleotide sequences corresponding to the nucleic acid targets. In some embodiments, circularizing each of the first portion of the first plurality of barcoded nucleic acids/cDNAs comprises: generating barcoded nucleic acid/cDNA comprising a first circle handle attached to one end of the first plurality of barcoded nucleic acid/cDNA and a second circle handle attached to the other end of the first plurality of barcoded nucleic acid/cDNA; and connecting the first circle handle and the second circle handle. In some embodiments, the first circle handle and the second circle handle comprise an identical nucleotide sequence, an overlapping nucleotide sequence and/or a complementary nucleotide sequence. In some embodiments, the identical nucleotide sequence, the overlapping nucleotide sequence and/or the complementary nucleotide sequence is at least 10 nucleotides in length and/or at most 150 nucleotides in length (e.g., about 40 nucleotides in length) . In some embodiments, generating barcoded nucleic acid/cDNA comprising a first circle handle attached to one end of the first plurality of barcoded nucleic acid/cDNA and a second circle handle attached to the other end of the first plurality of barcoded nucleic acid/cDNA comprises: hybridizing first circularization primers comprising the first circle handles to the one end of each of the first plurality of barcoded nucleic acids/cDNAs, and second circularization primers comprising the second circle handles to the other end of each of the first plurality of barcoded nucleic acids/cDNAs; and extending the first circularization primers and the second circularization primers using each of the first portion of the first plurality of barcoded nucleic acids/cDNAs as templates. In some embodiments, connecting the first circle handle and the second circle handle comprises connecting the identical nucleotide sequences, the overlapping nucleotide sequences and/or the complementary nucleotide sequences of the first circle handle and the second circle handle.
In some embodiments, amplifying the plurality of circularized barcoded nucleic acids/cDNAs comprises: hybridizing first linearization primers and second linearization primers to the plurality of circularized barcoded nucleic acids/cDNAs; extending the first linearization primers and the second linearization primers using the plurality of circularized barcoded nucleic acids/cDNAs as templates. In some embodiments, the first linearization primers and the second linearization primers hybridize to a sequence between 1) the one end of the nucleotide sequences corresponding to the nucleic acid targets or the cDNAs corresponding to the mRNA targets, and 2) the UMI and the cell barcode. In some embodiments, the first linearization primers and the second linearization primers hybridize to a sequence comprising the first PCR primer-binding sequence of the barcode oligonucleotides. In some embodiments, amplifying the plurality of circularized barcoded nucleic acids/cDNAs further comprises purifying the second plurality of linear barcoded nucleic acids/cDNAs.
In some embodiments, analyzing the second portion of the first plurality of barcoded nucleic acids/cDNAs and the second plurality of linear barcoded nucleic acids/cDNAs comprises amplifying the second portion of the first plurality of barcoded nucleic acids/cDNAs to obtain an amplified second portion of the first plurality of barcoded nucleic acids/cDNAs and the second plurality of linear barcoded nucleic acids/cDNAs to obtain an amplified second plurality of linear barcoded nucleic acids/cDNAs. In some embodiments, analyzing the second portion of the first plurality of barcoded nucleic acids/cDNAs and the second plurality of linear barcoded nucleic acids/cDNAs comprises pooling the second portion of the first plurality of barcoded nucleic acids/cDNAs and the second plurality of linear barcoded nucleic acids/cDNAs before amplifying the second portion of the first plurality of barcoded nucleic acids/cDNAs and the second plurality of linear barcoded nucleic acids/cDNAs. In some embodiments, analyzing the second portion of the first plurality of barcoded nucleic acids/cDNAs and the second plurality of linear barcoded nucleic acids/cDNAs comprises processing the amplified second portion of the first plurality of barcoded nucleic acids/cDNAs and the amplified second plurality of linear barcoded nucleic acids/cDNAs to generate processed second portion of the first plurality of barcoded nucleic acids/cDNAs and processed second plurality of linear barcoded nucleic acids/cDNAs.
In some embodiments, processing the amplified second portion of the first plurality of barcoded nucleic acids/cDNAs and the amplified second plurality of linear barcoded nucleic acids/cDNAs comprises: fragmenting the amplified second portion of the first plurality of barcoded nucleic acids/cDNAs to generate fragmented second portion of the first plurality of barcoded nucleic acids/cDNAs and the amplified second plurality of linear barcoded nucleic acids/cDNAs to generate fragmented second plurality of linear barcoded nucleic acids/cDNAs; adding a second polymerase chain reaction (PCR) primer-binding sequence; and generating processed second portion of the first plurality of barcoded nucleic acids/cDNAs and processed second plurality of linear barcoded nucleic acids/cDNAs comprising sequencing primer sequences from the fragmented second portion of the first plurality of barcoded nucleic acids/cDNAs and fragmented second plurality of linear barcoded nucleic acids/cDNAs. In some embodiments, fragmenting the amplified second portion of the first plurality of barcoded nucleic acids/cDNAs and the amplified second plurality of linear barcoded nucleic acids/cDNAs comprises fragmenting the amplified second portion of the first plurality of barcoded nucleic acids/cDNAs and the amplified second plurality of linear barcoded nucleic acids/cDNAs enzymatically. In some embodiments, the second PCR primer-binding sequence comprises a Read 2 sequence. In some embodiments, the sequencing primer sequences comprise a P5 sequence and a P7 sequence.
In some embodiments, analyzing the second portion of the first plurality of barcoded nucleic acids/cDNAs and the second plurality of linear barcoded nucleic acids/cDNAs further comprises pooling 1) the amplified second portion of the first plurality of barcoded nucleic acids/cDNAs and the amplified second plurality of linear barcoded nucleic acids/cDNAs; 2) the processed second portion of the first plurality of barcoded nucleic acids/cDNAs and the processed second plurality of linear barcoded nucleic acids/cDNAs; or 3) the fragmented second portion of the first plurality of barcoded nucleic acids/cDNAs and fragmented second plurality of linear barcoded nucleic acids/cDNAs.
In some embodiments, analyzing the second portion of the first plurality of barcoded nucleic acids/cDNAs and the second plurality of linear barcoded nucleic acids/cDNAs, or products thereof, comprises sequencing the second portion of the first plurality of barcoded nucleic acids/cDNAs and the second plurality of linear barcoded nucleic acids/cDNAs, or products thereof, to obtain sequencing information. In some embodiments, sequencing the second portion of the first plurality of barcoded nucleic acids/cDNAs and the second plurality of linear barcoded nucleic acids/cDNAs, or products thereof, comprises sequencing the processed second portion of the first plurality of barcoded nucleic acids/cDNAs and the processed second plurality of linear barcoded nucleic acids/cDNAs. In some embodiments, sequencing the second portion of the first plurality of barcoded nucleic acids/cDNAs and the second plurality of linear barcoded nucleic acids/cDNAs, or products thereof, comprises sequencing products of the second portion of the first plurality of barcoded nucleic acids/cDNAs and the second plurality of linear barcoded nucleic acids/cDNAs each comprising a P5 sequence, a Read 1 sequence, a cell barcode, a UMI, a poly-dT sequence, a probe sequence, a sequence of a nucleic acid target or a part thereof, a Read 2 sequence, a sample index, and/or a P7 sequence to obtain sequencing information.
In some embodiments, analyzing the second portion of the first plurality of barcoded nucleic acids/cDNAs and the second plurality of linear barcoded nucleic acids/cDNAs, or products thereof, comprises matching the second portion of the first plurality of barcoded nucleic acids/cDNAs and the second plurality of linear barcoded nucleic acids/cDNAs by the cell barcodes and the UMIs. In some embodiments, analyzing the second portion of the first plurality of barcoded nucleic acids/cDNAs and the second plurality of linear barcoded nucleic acids/cDNAs, or products thereof, comprises obtaining the full-length sequences of the nucleic acid targets or the RNA targets by integrating the sequencing information of the second portion of the first plurality of barcoded nucleic acids/cDNAs and the second plurality of linear barcoded nucleic acids/cDNAs.
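As a non-limiting illustration, matching by cell barcode and UMI can be sketched as a grouping step. The code below is a minimal sketch under stated assumptions: each read has already been reduced to a (cell barcode, UMI, sequence) tuple, and the function name and record layout are hypothetical rather than part of the disclosed method. Assembly of the matched fragments into a full-length sequence is not shown.

```python
from collections import defaultdict


def match_by_barcode_and_umi(three_prime_reads, five_prime_reads):
    """Group reads from the two libraries that share a cell barcode and UMI.

    Each read is a (cell_barcode, umi, sequence) tuple (assumed layout).
    Returns only the (cell_barcode, umi) keys observed in both libraries.
    """
    groups = defaultdict(lambda: {"3p": [], "5p": []})
    for cell_barcode, umi, seq in three_prime_reads:
        groups[(cell_barcode, umi)]["3p"].append(seq)
    for cell_barcode, umi, seq in five_prime_reads:
        groups[(cell_barcode, umi)]["5p"].append(seq)
    return {key: val for key, val in groups.items() if val["3p"] and val["5p"]}


# Made-up example reads.
reads_3p = [("AAAC", "GGT", "TTTACG"), ("AAAC", "CCA", "GATTAC")]
reads_5p = [("AAAC", "GGT", "ATGGCC"), ("TTTG", "GGT", "ATGAAA")]

matched = match_by_barcode_and_umi(reads_3p, reads_5p)
print(matched)
```

Only the molecule tagged ("AAAC", "GGT") appears in both example libraries, so it is the only entry retained for combined analysis.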
In some embodiments, analyzing the second portion of the first plurality of barcoded nucleic acids/cDNAs and the second plurality of linear barcoded nucleic acids/cDNAs, or products thereof, comprises analyzing the sequencing information. In some embodiments, analyzing the second portion of the first plurality of barcoded nucleic acids/cDNAs and the second plurality of linear barcoded nucleic acids/cDNAs, or products thereof, comprises: determining an expression profile of each of the nucleic acid targets and/or the RNA targets using a number of UMIs with different sequences associated with the nucleic acid targets and/or the RNA targets in the sequencing information. In some embodiments, the expression profile comprises an absolute abundance or a relative abundance. In some embodiments, the expression profile comprises an RNA expression profile, an mRNA expression profile and/or a protein expression profile. In some embodiments, analyzing the second portion of the first plurality of barcoded nucleic acids/cDNAs and the second plurality of linear barcoded nucleic acids/cDNAs, or products thereof, comprises: determining a number of the second portion of the first plurality of barcoded nucleic acids/cDNAs and the second plurality of linear barcoded nucleic acids/cDNAs of each of the nucleic acid targets/RNA targets comprising UMIs with different sequences; and/or determining sequences of the second portion of the first plurality of barcoded nucleic acids/cDNAs and the second plurality of linear barcoded nucleic acids/cDNAs of the nucleic acid targets/RNA targets comprising UMIs with different sequences.
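Counting UMIs with different sequences per target can be illustrated as follows. This is a minimal sketch, assuming reads have already been assigned to a cell barcode and a target; the function name and record layout are hypothetical, and UMI error correction, which practical pipelines often apply, is omitted.

```python
from collections import defaultdict


def count_distinct_umis(records):
    """Count UMIs with different sequences for each (cell barcode, target).

    `records` is an iterable of (cell_barcode, target, umi) tuples (assumed
    layout). Reads sharing a cell barcode, target and UMI are counted once.
    """
    umis = defaultdict(set)
    for cell_barcode, target, umi in records:
        umis[(cell_barcode, target)].add(umi)
    return {key: len(value) for key, value in umis.items()}


# Made-up example records: the first two are PCR duplicates of one molecule.
records = [
    ("AAAC", "geneA", "GGT"),
    ("AAAC", "geneA", "GGT"),
    ("AAAC", "geneA", "CCA"),
    ("TTTG", "geneA", "GGT"),
]
counts = count_distinct_umis(records)
print(counts)
```

Collapsing reads by UMI in this way gives a molecule count that is not inflated by amplification duplicates, which is the rationale for using the number of distinct UMIs as the abundance estimate.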
In some embodiments, the second portion of the first plurality of barcoded nucleic acids/cDNAs and the second plurality of linear barcoded nucleic acids/cDNAs are from at least 100 cells (e.g., at least 1,000 cells, or about 100 cells to about 50,000 cells) .
The method can further comprise releasing the nucleic acids from the cell prior to barcoding the nucleic acid targets associated with the cell. The method can further comprise lysing the cell to release the nucleic acid targets from the cell. In some embodiments, reverse transcribing the RNA targets hybridized to the barcode oligonucleotides is performed without lysing or digesting the cells.
Disclosed herein include compositions for single cell analysis. The composition comprises a plurality of beads disclosed herein. In some embodiments, the cell barcodes of the plurality of barcode oligonucleotides attached to each of the plurality of beads are identical. In some embodiments, the cell barcodes of the barcode oligonucleotides attached to different beads of the plurality of beads are different. In some embodiments, the plurality of beads comprises at least 100 beads.
Disclosed herein include kits for single cell analysis. The kit comprises: a composition disclosed herein; and instructions for using the composition for single cell sequencing or analysis.
FIG. 1 depicts non-limiting exemplary embodiments and data related to workflow of high-throughput single-cell full-length sequencing.
FIG. 2 depicts non-limiting exemplary embodiments and data related to workflow of 3’ and 5’ library preparation.
FIG. 3 depicts non-limiting exemplary embodiments and data related to t-SNE clustering of 3T3 and CCRF cell lines.
FIG. 4A-FIG. 4C depict non-limiting exemplary embodiments and data related to transcript coverage of 3’ transcripts (FIG. 4A) , 5’ transcripts (FIG. 4B) and merged transcripts (FIG. 4C) .
In the following detailed description, reference is made to the accompanying drawings, which form a part hereof. In the drawings, similar symbols typically identify similar components, unless context dictates otherwise. The illustrative embodiments described in the detailed description, drawings, and claims are not meant to be limiting. Other embodiments may be utilized, and other changes may be made, without departing from the spirit or scope of the subject matter presented herein. It will be readily understood that the aspects of the present disclosure, as generally described herein, and illustrated in the Figures, can be arranged, substituted, combined, separated, and designed in a wide variety of different configurations, all of which are explicitly contemplated herein and made part of the present disclosure herein.
All patents, published patent applications, other publications, and sequences from GenBank, and other databases referred to herein are incorporated by reference in their entirety with respect to the related technology.
A method for simultaneously detecting the gene expression of 3' and 5' ends of transcripts at a high-throughput single-cell level is disclosed herein. The method, for example, comprises capturing mRNA using magnetic beads with poly-T tails.
In a non-limiting example of the method, the beads are barcoded with barcode oligonucleotides, and the barcode oligonucleotides comprise a cell barcode to distinguish individual cells, and a UMI used for transcript quantification. After the capture, the mRNA is reverse transcribed and amplified by PCR to generate cDNA. A portion of the cDNA is used to construct a 3' end transcriptome library for quantifying 3' end gene expression, while another portion is used for circular amplification and construction of a 5' end transcriptome library for quantifying 5' end gene expression. By simultaneously detecting the 3' and 5' ends of RNA at the single-cell level and performing combined analysis of the 5' and 3' libraries, the full-length sequences of genes can be covered. The comprehensive approach disclosed herein can enhance understanding of gene expression patterns at a cellular level and reveal more cellular functions and regulatory mechanisms.
Definitions
Unless defined otherwise, technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which the present disclosure belongs. See, e.g., Singleton et al., Dictionary of Microbiology and Molecular Biology 2nd ed., J. Wiley & Sons (New York, NY 1994); Sambrook et al., Molecular Cloning, A Laboratory Manual, Cold Spring Harbor Press (Cold Spring Harbor, NY 1989). For purposes of the present disclosure, the following terms are defined below.
In single-cell research, traditional transcriptome sequencing typically only provides gene expression levels and sequence information from one end (e.g., either 3' end or 5' end) of the gene, without differentiating between different transcripts. However, the expression heterogeneity of transcripts within different cells can lead to significant functional differences and regulatory mechanisms. As used herein, the term “expression heterogeneity” or “expression heterogeneity of transcripts” refers to the differences in gene expression between individual cells. Expression heterogeneity could be caused by mechanisms such as sequence alterations to the constructs during integration, chromatin changes imparted during integration, locus-mediated inhibition of expression, or insufficient chromatin insulation. One mechanism is alternative splicing, which generates many transcript isoforms from a single gene. A classic example is the Drosophila sex-determination pathway, in which alternative splicing acts as a sex-specific genetic switch that forms the basis of a regulatory hierarchy. Alternative splicing is also implicated in human diseases. For example, the neurodegenerative disease FTDP-17 has been associated with mutations that affect the alternative splicing of tau pre-mRNAs. Therefore, sequence information from both ends of transcripts would provide additional information, compared to sequence information obtained from only one end of transcripts. Simultaneous detection of 3' and 5' gene expression of RNA at the single-cell level disclosed herein enables differentiation and quantification of different transcripts. It also allows for the analysis of transcript isoforms, which is crucial for a comprehensive understanding of gene regulation and cellular functions.
Low-throughput single-cell full-length sequencing can be achieved using the library preparation and sequencing method called SMART-seq3. SMART-seq3 is a classic single-cell transcriptome sequencing method that can simultaneously obtain gene sequence information from both the 3' and 5' ends of RNA. By individually isolating single cells into 96- or 384-well plates and performing cell lysis, mRNA capture, first-strand synthesis, amplification, and library construction steps for each well, individual libraries can be constructed for each well without the need for adding cell barcodes. Full-length sequencing of individual cell transcripts can then be achieved by assembling short-read sequencing data. However, this method has limitations such as low cell throughput, labor-intensive procedures, and high costs due to the need for independent library construction and sequencing for each cell.
With the advancement of sequencing technologies, third-generation sequencing technologies such as PacBio and Nanopore have been increasingly applied. The combination of third-generation sequencing technologies and single-cell sequencing has expanded the application scenarios of single-cell sequencing. Long-read sequencing has made it possible to sequence the entire transcriptome of large numbers of single cells. However, third-generation sequencing also has corresponding drawbacks. Nanopore has lower sequencing accuracy, which significantly affects the recognition of cell barcodes and UMIs in single cells, leading to low data utilization and low accuracy of the gene sequences obtained. PacBio has relatively higher accuracy, but is expensive. Moreover, the limited number of nanowells (zero-mode waveguides) in PacBio sequencing chips results in a lower number of transcripts sequenced and lower sequencing throughput, compared to Nanopore.
Second-generation sequencing is currently the platform with the highest sequencing accuracy. However, due to the short read lengths and the lack of suitable high-throughput single-cell library construction methods, there are significant technical barriers to achieving high-throughput full-length transcriptome sequencing using second-generation sequencing. Therefore, there is an urgent need to establish a high-throughput single-cell full-length library preparation and sequencing method.
Disclosed herein include methods for single cell analysis. In some embodiments, the method comprises: partitioning a cell of a plurality of cells and a bead of a plurality of beads attached with a plurality of barcode oligonucleotides into a partition of a plurality of partitions; hybridizing the plurality of barcode oligonucleotides attached to the bead in the partition with the RNA targets associated with the cell in the partition; reverse transcribing the RNA targets hybridized to the barcode oligonucleotides to generate a first plurality of barcoded complementary deoxyribonucleic acids (cDNAs); obtaining a first portion and a second portion of the first plurality of barcoded cDNAs; circularizing each of the first portion of the first plurality of barcoded cDNAs to generate a plurality of circularized barcoded cDNAs; amplifying the plurality of circularized barcoded cDNAs to generate a second plurality of linear barcoded cDNAs; and analyzing the second portion of the first plurality of barcoded cDNAs and the second plurality of linear barcoded cDNAs, or products thereof. In some embodiments, each barcode oligonucleotide of the plurality of barcode oligonucleotides comprises a cell barcode, a unique molecular identifier (UMI) and a probe sequence. In some embodiments, the probe sequence is capable of binding to an RNA target associated with the cell. In some embodiments, the first plurality of barcoded cDNAs comprises the barcode oligonucleotides and cDNAs corresponding to the RNA targets. In some embodiments, the cDNAs corresponding to the RNA targets comprise one end attached to the UMI and the cell barcode and the other end. The RNA target can comprise an mRNA. Reverse transcribing the RNA targets hybridized to the barcode oligonucleotides can be performed without lysing or digesting the cells.
Disclosed herein include methods for single cell analysis. In some embodiments, the method comprises: partitioning a cell of a plurality of cells and a bead of a plurality of beads attached with a plurality of barcode oligonucleotides into a partition of a plurality of partitions; barcoding the nucleic acid targets associated with the cell in the partition to generate a first plurality of barcoded nucleic acids; obtaining a first portion and a second portion of the first plurality of barcoded nucleic acids; circularizing each of the first portion of the first plurality of barcoded nucleic acids to generate a plurality of circularized barcoded nucleic acids; amplifying the plurality of circularized barcoded nucleic acids to generate a second plurality of linear barcoded nucleic acids; and analyzing the second portion of the first plurality of barcoded nucleic acids and the second plurality of linear barcoded nucleic acids, or products thereof. In some embodiments, each barcode oligonucleotide of the plurality of barcode oligonucleotides comprises a cell barcode, a unique molecular identifier (UMI) and a probe sequence. In some embodiments, the probe sequence is capable of binding to a nucleic acid target associated with the cell. In some embodiments, the first plurality of barcoded nucleic acids comprises the barcode oligonucleotides and nucleotide sequences corresponding to the nucleic acid targets. In some embodiments, the nucleotide sequences corresponding to the nucleic acid targets comprise one end attached to the UMI and the cell barcode and the other end.
The nucleic acid targets can comprise a ribonucleic acid (RNA), a messenger RNA (mRNA), and/or a deoxyribonucleic acid (DNA). The nucleic acid targets can comprise nucleic acid targets of the cell, from the cell, in the cell, and/or on the surface of the cell.
A nucleic acid target can be in the cell (which can be released from the cell by cell lysis before the nucleic acid target is barcoded). A nucleic acid target can be on the surface of the cell (e.g., an oligonucleotide attached to an antibody bound to an antigen on the surface of the cell). In some embodiments, the method comprises releasing the nucleic acids of (or from or in) the cell prior to barcoding the nucleic acid targets associated with the cell. The method can comprise lysing the cell to release the nucleic acids from (or in) the cell. The nucleic acid targets analyzed can be, e.g., from at least 100 cells (e.g., at least 1,000 cells, or about 100 cells to about 50,000 cells).
Disclosed herein include compositions for single cell analysis. The composition can comprise a plurality of beads disclosed herein. In some embodiments, the cell barcodes of the plurality of barcode oligonucleotides attached to each of the plurality of beads are identical. In some embodiments, the cell barcodes of the barcode oligonucleotides attached to different beads of the plurality of beads are different. In some embodiments, the plurality of beads comprises at least 100 beads.
Disclosed herein include kits for single cell analysis. The kit can comprise: a composition disclosed herein; and instructions for using the composition for single cell sequencing or analysis.
Introducing cells and barcode oligonucleotides into partitions
Disclosed herein also include a method of nucleic acid sequencing. In some embodiments, the method can comprise introducing a plurality of cells and/or a plurality of barcode oligonucleotides into a plurality of partitions. The introduction of a plurality of cells and/or a plurality of barcode oligonucleotides (alone or attached to beads) can be performed using partitioning.
As used herein, the term “partitioning” refers to introducing particles (e.g., cells, or beads) into vessels (e.g., microwells, droplets) that can be used to sequester or separate one particle from another. Such vessels are referred to using the noun “partition.” A partition can include two or more particles of the same type or different types.
Partitioning can be performed using a variety of methods known to a person skilled in the art, for example, using microfluidics, wells, microwells, multi-well plates, multi-well arrays, dispensing, dilution, droplets and the like. For example, the cells, barcode oligonucleotides, and/or beads can be diluted and dispensed across a plurality of partitions via the use of flow channels in a microwell array.
Partitions
A “partition” as used herein can refer to a part, a portion, or a division sequestered from the rest of the parts, portions, or divisions. A partition can be formed through the use of wells, microwells, multi-well plates, microwell arrays, microfluidics, dilution, dispensing, droplets, or any other means of sequestering one fraction of a sample from another. In some embodiments, a partition is a droplet or a microwell.
In some embodiments, the method can comprise partitioning a plurality of cells into a plurality of partitions, wherein a partition of the plurality of partitions comprises one cell of the plurality of cells. In some embodiments, the method can also comprise partitioning a plurality of barcode oligonucleotides into the plurality of partitions. For example, the plurality of barcode oligonucleotides can be attached to beads and the method can comprise partitioning a plurality of beads with the plurality of barcode oligonucleotides attached thereon into the plurality of partitions.
In some embodiments, a plurality of cells and/or a plurality of beads with a plurality of barcode oligonucleotides attached thereon can be co-partitioned by combining the plurality of cells and/or the plurality of beads with a plurality of barcode oligonucleotides attached thereon to form a mixture that can be then partitioned into a plurality of partitions.
In some embodiments, partitioning a plurality of cells and/or a plurality of beads with a plurality of barcode oligonucleotides attached thereon can be performed through the use of fluid flow in a microwell array. For example, the partitioning can comprise flowing one or more solutions comprising a plurality of cells and/or a plurality of beads with a plurality of barcode oligonucleotides attached thereon, sequentially or concurrently in a mixture, into the plurality of microwells via an inlet port.
In some embodiments, introducing the plurality of barcode oligonucleotides into the plurality of partitions can be performed without using a bead. In some embodiments, the plurality of barcode oligonucleotides can be introduced into the partitions (e.g. microwells) by attaching or synthesizing the plurality of barcode oligonucleotides onto the surface of the partitions.
In some embodiments, attaching or synthesizing the plurality of barcode oligonucleotides onto the surface of the partitions can involve a ligation step. In some embodiments, synthesizing the plurality of barcode oligonucleotides can comprise ligating two smaller oligonucleotides together to generate a plurality of barcode oligonucleotides each having a pre-designed sequence. For example, a primer can be attached to the surface of a partition which can hybridize to a primer binding site of an oligonucleotide that also contains a template nucleotide sequence. The primer can then be extended by a primer extension reaction or other amplification reaction, and an oligonucleotide complementary to the template oligonucleotide can thereby be attached to the surface of the partition.
The surface of the partitions (e.g. microwells) can be pre-functionalized with a chemical moiety to facilitate the attachment of barcode oligonucleotides. The attachment of the barcode oligonucleotides can occur through the interaction between two members of a binding pair, one attached to the surface of the partitions and the other comprised in or conjugated to the barcode oligonucleotides, or a portion thereof. For example, the surface of the microwell can be coated with a moiety (e.g. a member of a binding pair) capable of binding with another moiety (e.g. the other member of the binding pair) of the barcode oligonucleotide, such that the binding of the two moieties results in the attachment of the barcode oligonucleotide or a portion thereof to the microwell. For example, the surface of the microwell can be coated with streptavidin. The biotinylated barcode oligonucleotides can be attached to the surface of the microwell via streptavidin-biotin interaction.
In some embodiments, the surface of the partitions (e.g., microwells) can be modified to enhance its chemical reactivity and facilitate the oligonucleotide attachment, such as by treating the microwells with oxygen plasma, corona discharges, or ultraviolet/ozone (UVO), as will be understood by a person skilled in the art.
A partition can be sized to fit at most one bead (and a cell) , not two beads. A size or dimension (e.g., length, width, depth, radius, or diameter) of a partition can be different in different embodiments. In some embodiments, a size or dimension of one, one or more, or each, of the plurality of partitions is, is about, is at least, is at least about, is at most, or is at most about, 1 nanometer (nm) , 2 nm, 3 nm, 4 nm, 5 nm, 6 nm, 7 nm, 8 nm, 9 nm, 10 nm, 11 nm, 12 nm, 13 nm, 14 nm, 15 nm, 16 nm, 17 nm, 18 nm, 19 nm, 20 nm, 21 nm, 22 nm, 23 nm, 24 nm, 25 nm, 26 nm, 27 nm, 28 nm, 29 nm, 30 nm, 31 nm, 32 nm, 33 nm, 34 nm, 35 nm, 36 nm, 37 nm, 38 nm, 39 nm, 40 nm, 41 nm, 42 nm, 43 nm, 44 nm, 45 nm, 46 nm, 47 nm, 48 nm, 49 nm, 50 nm, 51 nm, 52 nm, 53 nm, 54 nm, 55 nm, 56 nm, 57 nm, 58 nm, 59 nm, 60 nm, 61 nm, 62 nm, 63 nm, 64 nm, 65 nm, 66 nm, 67 nm, 68 nm, 69 nm, 70 nm, 71 nm, 72 nm, 73 nm, 74 nm, 75 nm, 76 nm, 77 nm, 78 nm, 79 nm, 80 nm, 81 nm, 82 nm, 83 nm, 84 nm, 85 nm, 86 nm, 87 nm, 88 nm, 89 nm, 90 nm, 91 nm, 92 nm, 93 nm, 94 nm, 95 nm, 96 nm, 97 nm, 98 nm, 99 nm, 100 nm, 110 nm, 120 nm, 130 nm, 140 nm, 150 nm, 160 nm, 170 nm, 180 nm, 190 nm, 200 nm, 210 nm, 220 nm, 230 nm, 240 nm, 250 nm, 260 nm, 270 nm, 280 nm, 290 nm, 300 nm, 310 nm, 320 nm, 330 nm, 340 nm, 350 nm, 360 nm, 370 nm, 380 nm, 390 nm, 400 nm, 410 nm, 420 nm, 430 nm, 440 nm, 450 nm, 460 nm, 470 nm, 480 nm, 490 nm, 500 nm, 510 nm, 520 nm, 530 nm, 540 nm, 550 nm, 560 nm, 570 nm, 580 nm, 590 nm, 600 nm, 610 nm, 620 nm, 630 nm, 640 nm, 650 nm, 660 nm, 670 nm, 680 nm, 690 nm, 700 nm, 710 nm, 720 nm, 730 nm, 740 nm, 750 nm, 760 nm, 770 nm, 780 nm, 790 nm, 800 nm, 810 nm, 820 nm, 830 nm, 840 nm, 850 nm, 860 nm, 870 nm, 880 nm, 890 nm, 900 nm, 910 nm, 920 nm, 930 nm, 940 nm, 950 nm, 960 nm, 970 nm, 980 nm, 990 nm, 1000 nm, 2 micrometer (μm) , 3 μm, 4 μm, 5 μm, 6 μm, 7 μm, 8 μm, 9 μm, 10 μm, 20 μm, 30 μm, 40 μm, 50 μm, 60 μm, 70 μm, 80 μm, 90 μm, 100 μm, 110 μm, 120 μm, 130 μm, 140 μm, 150 μm, 160 μm, 170 
μm, 180 μm, 190 μm, 200 μm, 210 μm, 220 μm, 230 μm, 240 μm, 250 μm, 260 μm, 270 μm, 280 μm, 290 μm, 300 μm, 310 μm, 320 μm, 330 μm, 340 μm, 350 μm, 360 μm, 370 μm, 380 μm, 390 μm, 400 μm, 410 μm, 420 μm, 430 μm, 440 μm, 450 μm, 460 μm, 470 μm, 480 μm, 490 μm, 500 μm, or a number or a range between any two of these values. For example, a size or dimension of one, one or more, or each, of the plurality of partitions is about 1 nm to about 100 μm.
The volume of one, one or more, or each, of the plurality of partitions can be different in different embodiments. The volume of one, one or more, or each, of the plurality of partitions can be, be about, be at least, be at least about, be at most, or be at most about, 1 nm³, 2 nm³, 3 nm³, 4 nm³, 5 nm³, 6 nm³, 7 nm³, 8 nm³, 9 nm³, 10 nm³, 20 nm³, 30 nm³, 40 nm³, 50 nm³, 60 nm³, 70 nm³, 80 nm³, 90 nm³, 100 nm³, 200 nm³, 300 nm³, 400 nm³, 500 nm³, 600 nm³, 700 nm³, 800 nm³, 900 nm³, 1000 nm³, 10000 nm³, 100000 nm³, 1000000 nm³, 10000000 nm³, 100000000 nm³, 1000000000 nm³, 2 μm³, 3 μm³, 4 μm³, 5 μm³, 6 μm³, 7 μm³, 8 μm³, 9 μm³, 10 μm³, 20 μm³, 30 μm³, 40 μm³, 50 μm³, 60 μm³, 70 μm³, 80 μm³, 90 μm³, 100 μm³, 200 μm³, 300 μm³, 400 μm³, 500 μm³, 600 μm³, 700 μm³, 800 μm³, 900 μm³, 1000 μm³, 10000 μm³, 100000 μm³, 1000000 μm³, or a number or a range between any two of these values. The volume of one, one or more, or each, of the plurality of partitions can be, be about, be at least, be at least about, be at most, or be at most about, 1 nanoliter (nl), 2 nl, 3 nl, 4 nl, 5 nl, 6 nl, 7 nl, 8 nl, 9 nl, 10 nl, 11 nl, 12 nl, 13 nl, 14 nl, 15 nl, 16 nl, 17 nl, 18 nl, 19 nl, 20 nl, 21 nl, 22 nl, 23 nl, 24 nl, 25 nl, 26 nl, 27 nl, 28 nl, 29 nl, 30 nl, 31 nl, 32 nl, 33 nl, 34 nl, 35 nl, 36 nl, 37 nl, 38 nl, 39 nl, 40 nl, 41 nl, 42 nl, 43 nl, 44 nl, 45 nl, 46 nl, 47 nl, 48 nl, 49 nl, 50 nl, 51 nl, 52 nl, 53 nl, 54 nl, 55 nl, 56 nl, 57 nl, 58 nl, 59 nl, 60 nl, 61 nl, 62 nl, 63 nl, 64 nl, 65 nl, 66 nl, 67 nl, 68 nl, 69 nl, 70 nl, 71 nl, 72 nl, 73 nl, 74 nl, 75 nl, 76 nl, 77 nl, 78 nl, 79 nl, 80 nl, 81 nl, 82 nl, 83 nl, 84 nl, 85 nl, 86 nl, 87 nl, 88 nl, 89 nl, 90 nl, 91 nl, 92 nl, 93 nl, 94 nl, 95 nl, 96 nl, 97 nl, 98 nl, 99 nl, 100 nl, or a number or a range between any two of these values. For example, the volume of one, one or more, or each, of the plurality of partitions is about 1 nm³ to about 1000000 μm³.
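Because the volumes above are stated in cubic nanometers, cubic micrometers and nanoliters, the conversion between these units may be helpful. The sketch below uses the standard relations 1 cubic micrometer = 10^9 cubic nanometers and 1 nanoliter = 10^6 cubic micrometers. The cylindrical well shape and the example dimensions are assumptions chosen only for illustration and are not taken from the disclosure.

```python
import math

NM3_PER_UM3 = 10**9   # (1000 nm)^3 per cubic micrometer
UM3_PER_NL = 10**6    # 1 nl = 1e-9 L = 1e6 cubic micrometers


def nm3_to_um3(volume_nm3):
    """Convert cubic nanometers to cubic micrometers."""
    return volume_nm3 / NM3_PER_UM3


def um3_to_nl(volume_um3):
    """Convert cubic micrometers to nanoliters."""
    return volume_um3 / UM3_PER_NL


def cylinder_volume_um3(diameter_um, depth_um):
    """Volume of a cylindrical well (assumed shape) in cubic micrometers."""
    return math.pi * (diameter_um / 2) ** 2 * depth_um


# Hypothetical well: 50 um diameter, 50 um deep.
example_volume_um3 = cylinder_volume_um3(50, 50)
example_volume_nl = um3_to_nl(example_volume_um3)
print(example_volume_um3, example_volume_nl)
```

Under these assumptions, a well a few tens of micrometers across holds roughly a tenth of a nanoliter, which falls within the ranges recited above.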
The number of partitions can be different in different embodiments. In some embodiments, the number of partitions is, is about, is at least, is at least about, is at most, or is at most about, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000, 20000, 30000, 40000, 50000, 60000, 70000, 80000, 90000, 100000, 200000, 300000, 400000, 500000, 600000, 700000, 800000, 900000, 1000000, 2000000, 3000000, 4000000, 5000000, 6000000, 7000000, 8000000, 9000000, 10000000, 20000000, 30000000, 40000000, 50000000, 60000000, 70000000, 80000000, 90000000, 100000000, 200000000, 300000000, 400000000, 500000000, 600000000, 700000000, 800000000, 900000000, 1000000000, or a number or a range between any two of these values. For example, the number of partitions can be at least 1000 partitions.
The percentage of the plurality of partitions comprising a single cell and a single bead can be different in different embodiments. In some embodiments, the percentage of the plurality of partitions comprising a single cell and a single bead is, is about, is at least, is at least about, is at most, or is at most about, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or a number or a range between any two of these values. For example, at least 10% of partitions of the plurality of partitions comprise a cell of the plurality of cells and a bead of the plurality of beads.
The percentage of the plurality of partitions comprising no cell can be different in different embodiments. In some embodiments, the percentage of the plurality of partitions comprising no cell is, is about, is at least, is at least about, is at most, or is at most about, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or a number or a range between any two of these values. For example, at least 50% of partitions of the plurality of partitions can comprise no cell of the plurality of cells.
The percentage of the plurality of partitions comprising more than two cells can be different in different embodiments. In some embodiments, the percentage of the plurality of partitions comprising more than two cells is, is about, is at least, is at least about, is at most, or is at most about, 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or a number or a range between any two of these values. For example, at most 10% of partitions of the plurality of partitions can comprise more than two cells of the plurality of cells.
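The occupancy percentages above can be related to the loading density with a simple model. The following is a minimal sketch that assumes cells fall into partitions independently and at random, so that the number of cells per partition follows a Poisson distribution. The Poisson model and the function name `poisson_occupancy` are illustrative assumptions and are not part of the disclosure.

```python
import math


def poisson_occupancy(mean_cells_per_partition: float) -> dict:
    """Expected fraction of partitions holding 0, 1, >1, or >2 cells.

    Assumes Poisson loading (an illustrative assumption, not stated in the text).
    """
    if mean_cells_per_partition < 0:
        raise ValueError("mean_cells_per_partition must be non-negative")
    lam = mean_cells_per_partition
    p0 = math.exp(-lam)           # probability of an empty partition
    p1 = lam * p0                 # probability of exactly one cell
    p2 = lam * lam / 2.0 * p0     # probability of exactly two cells
    return {
        "no_cell": p0,
        "one_cell": p1,
        "more_than_one_cell": 1.0 - p0 - p1,
        "more_than_two_cells": 1.0 - p0 - p1 - p2,
    }


if __name__ == "__main__":
    # Hypothetical loading densities (mean cells per partition).
    for lam in (0.05, 0.1, 0.5):
        fractions = poisson_occupancy(lam)
        print(lam, {name: round(value, 4) for name, value in fractions.items()})
```

Under this assumed model, a low loading density leaves most partitions empty while keeping the fraction of partitions with multiple cells small, which is consistent with the example values given above.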
Microwells
In some embodiments, the partition is a microwell and the plurality of partitions comprise a plurality of microwells in a microwell array. The term “microwell, ” as used herein, generally refers to a well with a volume of less than 1 mL. A microwell array can contain a number of microwells arranged in rows and columns. The size and spacing of the microwells may vary depending on different applications. A location of a microwell in a microwell array can be identified by its unique address describing its row and column position within the microwell array.
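The row and column addressing described above can be sketched as follows. This is only an illustration: the row-major numbering, the zero-based indices, and the function names `well_address` and `well_index` are assumptions and are not specified in the disclosure.

```python
def well_address(index: int, n_columns: int) -> tuple:
    """Convert a zero-based well index into a (row, column) address.

    Assumes row-major numbering of the array (an illustrative choice).
    """
    if index < 0 or n_columns <= 0:
        raise ValueError("index must be >= 0 and n_columns must be > 0")
    return divmod(index, n_columns)


def well_index(row: int, column: int, n_columns: int) -> int:
    """Inverse of well_address: convert (row, column) back to the well index."""
    if row < 0 or not 0 <= column < n_columns:
        raise ValueError("address is outside the array")
    return row * n_columns + column


if __name__ == "__main__":
    # Hypothetical array with 300 columns.
    row, column = well_address(1234, 300)
    print(row, column, well_index(row, column, 300))
```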
The microwell array comprising a plurality of microwells can be formed from any suitable material as will be understood by a person skilled in the art. In some embodiments, a microwell array comprising a plurality of microwells can be formed from a material selected from the group consisting of silicon, glass, ceramic, elastomers such as polydimethylsiloxane (PDMS) and thermoset polyester, thermoplastic polymers such as polystyrene, polycarbonate, poly(methyl methacrylate) (PMMA), poly-ethylene glycol diacrylate (PEGDA), Teflon, polyurethane (PU), composite materials such as cyclic-olefin copolymer, and combinations thereof.
In some embodiments, the microwell array can comprise an inlet port in fluid communication with the plurality of microwells. The microwell array can also comprise an outlet port in fluid communication with the plurality of microwells. Microwells can be introduced with samples, free reagents, and/or reagents encapsulated in microcapsules. The reagents can comprise restriction enzymes, ligase, polymerase, fluorophores, oligonucleotide barcodes, oligonucleotide probes, adapters, buffers, dNTPs, ddNTPs, and other reagents required for performing the methods described herein. Samples and reagents can flow from the inlet port through a flow channel to deliver to the microwell array, and the waste can be pushed out from the outlet port and removed.
Sample Nucleic Acids and Cells
The plurality of cells introduced into a plurality of partitions can be obtained from any organism of interest, such as an organism of the Monera (bacteria), Protista, Fungi, Plantae, or Animalia kingdom. A cell can be a mammalian cell, and particularly a human cell such as a T cell, B cell, natural killer cell, stem cell, or cancer cell.
Cells described herein can be obtained from a cell sample. A cell sample comprising cells can be obtained from any source including a clinical sample and a derivative thereof, a biological sample and a derivative thereof, a forensic sample and a derivative thereof, an environmental sample and a derivative thereof, and a combination thereof. A cell sample can be collected from any bodily fluid including, but not limited to, blood, urine, serum, lymph, saliva, anal and vaginal secretions, perspiration, and semen of any organism. A cell sample can be a product of experimental manipulation including purification, cell culture, cell isolation, cell separation, cell quantification, sample dilution, or any other cell sample processing approach. A cell sample can be obtained by dissociation of any biopsy tissue of any organism including, but not limited to, skin, bone, hair, brain, liver, heart, kidney, spleen, pancreas, stomach, intestine, bladder, lung, and esophagus.
As used herein, the terms “sample nucleic acids” and “nucleic acid targets” are used interchangeably. The sample nucleic acids associated with a plurality of cells can comprise deoxyribonucleic acid (DNA), ribonucleic acid (RNA), and/or any combination or hybrid thereof. As used herein, the terms “nucleic acid” and “polynucleotide” are interchangeable and can refer to any nucleic acid, whether composed of phosphodiester linkages or modified linkages such as phosphotriester, phosphoramidate, siloxane, carbonate, carboxymethylester, acetamidate, carbamate, thioether, bridged phosphoramidate, bridged methylene phosphonate, phosphorothioate, methylphosphonate, phosphorodithioate, bridged phosphorothioate or sultone linkages, and combinations of such linkages. The terms “nucleic acid” and “polynucleotide” also specifically include nucleic acids composed of bases other than the five biologically occurring bases (adenine, guanine, thymine, cytosine and uracil).
The sample nucleic acids can be single-stranded or double-stranded, or contain portions of both double-stranded and single-stranded sequences. The sample nucleic acids can contain any combination of nucleotides, including uracil, adenine, thymine, cytosine, guanine, inosine, xanthine, hypoxanthine, isocytosine, isoguanine and any nucleotide derivative thereof. As used herein, the term “nucleotide” may include naturally occurring nucleotides and nucleotide analogs, including both synthetic and naturally occurring species. The sample nucleic acids can be genomic DNA (gDNA), mitochondrial DNA (mtDNA), messenger RNA (mRNA), ribosomal RNA (rRNA), transfer RNA (tRNA), nuclear RNA (nRNA), small interfering RNA (siRNA), small nuclear RNA (snRNA), small nucleolar RNA (snoRNA), small Cajal body-specific RNA (scaRNA), microRNA (miRNA), double-stranded RNA (dsRNA), ribozyme, riboswitch or viral RNA, or any nucleic acids that may be obtained from a sample.
In some embodiments, the plurality of cells can be diluted prior to partitioning to ensure that a majority of the partitions comprise at most one cell, with few doublets (more than one cell in one partition). A dilution can be prepared such that a desired cell concentration is achieved. The cell concentration can be between 1×10⁴ and 1×10⁶ (e.g. about, at least, at least about, at most, at most about, 1×10⁴, 2×10⁴, 3×10⁴, 4×10⁴, 5×10⁴, 6×10⁴, 7×10⁴, 8×10⁴, 9×10⁴, 1×10⁵, 1.5×10⁵, 2×10⁵, 2.5×10⁵, 3×10⁵, 3.5×10⁵, 4×10⁵, 4.5×10⁵, 5×10⁵, 5.5×10⁵, 6×10⁵, 6.5×10⁵, 7×10⁵, 7.5×10⁵, 8×10⁵, 8.5×10⁵, 9×10⁵, 1×10⁶, or a number or a range between any two of these values) cells/mL. In some embodiments, the cell concentration is about 1×10⁵ to about 3×10⁵ (e.g. about, at least, at least about, at most, at most about, 1×10⁵, 1.1×10⁵, 1.2×10⁵, 1.3×10⁵, 1.4×10⁵, 1.5×10⁵, 1.6×10⁵, 1.7×10⁵, 1.8×10⁵, 1.9×10⁵, 2.0×10⁵, 2.1×10⁵, 2.2×10⁵, 2.3×10⁵, 2.4×10⁵, 2.5×10⁵, 2.6×10⁵, 2.7×10⁵, 2.8×10⁵, 2.9×10⁵, 3.0×10⁵, or a number or a range between any two of these values) cells/mL.
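Preparing a dilution to a desired cell concentration is ordinary dilution arithmetic (C1·V1 = C2·V2). The sketch below shows that calculation. The function name `dilution_volumes` and the example numbers are hypothetical and serve only as an illustration.

```python
def dilution_volumes(stock_cells_per_ml: float,
                     target_cells_per_ml: float,
                     final_volume_ml: float) -> tuple:
    """Return (stock volume, diluent volume) in mL to reach the target concentration.

    Uses C1 * V1 = C2 * V2; assumes the stock suspension is well mixed.
    """
    if stock_cells_per_ml <= 0 or target_cells_per_ml <= 0 or final_volume_ml <= 0:
        raise ValueError("all inputs must be positive")
    if target_cells_per_ml > stock_cells_per_ml:
        raise ValueError("target concentration exceeds the stock concentration")
    stock_volume_ml = target_cells_per_ml * final_volume_ml / stock_cells_per_ml
    return stock_volume_ml, final_volume_ml - stock_volume_ml


if __name__ == "__main__":
    # Hypothetical example: dilute a 1e6 cells/mL stock to 2e5 cells/mL in 1 mL.
    stock_ml, diluent_ml = dilution_volumes(1e6, 2e5, 1.0)
    print(f"stock: {stock_ml:.3f} mL, diluent: {diluent_ml:.3f} mL")
```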
Beads
In some embodiments, the plurality of barcode oligonucleotides introduced into the plurality of partitions are associated with a bead. The beads can provide a surface upon which molecules, such as oligonucleotides, can be synthesized or attached. In some embodiments, a bead comprises, comprises about, comprises at least, comprises at least about, comprises at most, or comprises at most about 1, 5, 10, 50, 100, 1000, 10000, 20000, 30000, 40000, 50000, 60000, 70000, 80000, 90000, 100000, 200,000, 300,000, 400,000, 500,000, 600,000, 700,000, 800,000, 900,000, 1000000, 2000000, 3000000, 4000000, 5000000, 6000000, 7000000, 8000000, 9000000, 10000000, 20000000, 50000000, 100000000, 200000000, 300000000, 400000000, 500000000, 600000000, 700000000, 800000000, 900000000, 1000000000, or a number or a range between any two of these values, barcode oligonucleotides. FIG. 2 shows a bead attached with a barcode oligonucleotide for illustrative purposes and is not intended to be limiting. The attachment can be reversible or irreversible. The attachment can be covalent, or non-covalent via bonds such as ionic bonds, hydrogen bonds, or van der Waals interactions. The attachment can be direct to the surface of a bead or indirect through other oligonucleotide sequences attached to the surface of a bead.
A bead can be dissolvable, degradable, or disruptable. Barcode oligonucleotides can be reversibly attached to, covalently attached to, or irreversibly attached to the bead. A bead can be a gel bead such as a hydrogel bead. In some embodiments, the gel bead is degradable upon application of a stimulus. The stimulus can comprise a thermal stimulus, a chemical stimulus, a biological stimulus, a photo-stimulus, or a combination thereof.
A bead can be a solid bead and/or a magnetic bead. In some embodiments, the bead is a magnetic bead. The magnetic bead can comprise a paramagnetic material coated on or embedded in the magnetic bead (e.g. on a surface, in an intermediate layer, and/or mixed with other materials of the magnetic bead). A paramagnetic material refers to a material having a magnetic susceptibility slightly greater than 1 (e.g. between about 1 and about 5). Magnetic susceptibility is a measure of how much a material can become magnetized in an applied magnetic field. Paramagnetic materials include, but are not limited to, magnesium, molybdenum, lithium, aluminum, nickel, tantalum, titanium, iron oxide, gold, copper, or a combination thereof.
In some embodiments, the magnetic bead comprising barcode oligonucleotides can be immobilized or retained in a partition (e.g. a microwell) by an external magnetic field, thereby retaining the barcode oligonucleotides in a partition. The magnetic bead comprising barcode oligonucleotides can be mobilized or released when the external magnetic field is removed.
In some embodiments, a bead can be immobilized or retained in a partition (e.g. a microwell) through an interaction between two members of a binding pair. For example, the partition (e.g. microwell) can be coated with a capture moiety (e.g. a member of a binding pair) capable of binding with a binding moiety (the other member of the binding pair) comprised in or conjugated to a bead, such that the binding of the two moieties results in the attachment of the bead to the partition (e.g. microwell) , thereby immobilizing or retaining the bead in the partition. For example, the surface of a partition (e.g. microwell) can be coated with streptavidin. The biotinylated bead can be attached to the surface of the partition (e.g. microwell) via streptavidin-biotin interaction.
Beads can be of uniform size or heterogeneous size. In some embodiments, the beads have a diameter of about, at least, at least about, at most, or at most about, 1 μm, 5 μm, 10 μm, 20 μm, 30 μm, 40 μm, 45 μm, 50 μm, 60 μm, 65 μm, 70 μm, 75 μm, 80 μm, 90 μm, 100 μm, 250 μm, 500 μm, or 1 mm.
In some embodiments, a bead can be sized such that at most one bead (and a cell), not two beads, can fit in one partition. A size or dimension (e.g., length, width, depth, radius, or diameter) of a bead can be different in different embodiments. In some embodiments, a size or dimension of one, or each, bead is, is about, is at least, is at least about, is at most, or is at most about, 1 nanometer (nm), 2 nm, 3 nm, 4 nm, 5 nm, 6 nm, 7 nm, 8 nm, 9 nm, 10 nm, 11 nm, 12 nm, 13 nm, 14 nm, 15 nm, 16 nm, 17 nm, 18 nm, 19 nm, 20 nm, 21 nm, 22 nm, 23 nm, 24 nm, 25 nm, 26 nm, 27 nm, 28 nm, 29 nm, 30 nm, 31 nm, 32 nm, 33 nm, 34 nm, 35 nm, 36 nm, 37 nm, 38 nm, 39 nm, 40 nm, 41 nm, 42 nm, 43 nm, 44 nm, 45 nm, 46 nm, 47 nm, 48 nm, 49 nm, 50 nm, 51 nm, 52 nm, 53 nm, 54 nm, 55 nm, 56 nm, 57 nm, 58 nm, 59 nm, 60 nm, 61 nm, 62 nm, 63 nm, 64 nm, 65 nm, 66 nm, 67 nm, 68 nm, 69 nm, 70 nm, 71 nm, 72 nm, 73 nm, 74 nm, 75 nm, 76 nm, 77 nm, 78 nm, 79 nm, 80 nm, 81 nm, 82 nm, 83 nm, 84 nm, 85 nm, 86 nm, 87 nm, 88 nm, 89 nm, 90 nm, 91 nm, 92 nm, 93 nm, 94 nm, 95 nm, 96 nm, 97 nm, 98 nm, 99 nm, 100 nm, 110 nm, 120 nm, 130 nm, 140 nm, 150 nm, 160 nm, 170 nm, 180 nm, 190 nm, 200 nm, 210 nm, 220 nm, 230 nm, 240 nm, 250 nm, 260 nm, 270 nm, 280 nm, 290 nm, 300 nm, 310 nm, 320 nm, 330 nm, 340 nm, 350 nm, 360 nm, 370 nm, 380 nm, 390 nm, 400 nm, 410 nm, 420 nm, 430 nm, 440 nm, 450 nm, 460 nm, 470 nm, 480 nm, 490 nm, 500 nm, 510 nm, 520 nm, 530 nm, 540 nm, 550 nm, 560 nm, 570 nm, 580 nm, 590 nm, 600 nm, 610 nm, 620 nm, 630 nm, 640 nm, 650 nm, 660 nm, 670 nm, 680 nm, 690 nm, 700 nm, 710 nm, 720 nm, 730 nm, 740 nm, 750 nm, 760 nm, 770 nm, 780 nm, 790 nm, 800 nm, 810 nm, 820 nm, 830 nm, 840 nm, 850 nm, 860 nm, 870 nm, 880 nm, 890 nm, 900 nm, 910 nm, 920 nm, 930 nm, 940 nm, 950 nm, 960 nm, 970 nm, 980 nm, 990 nm, 1000 nm, 2 micrometers (μm), 3 μm, 4 μm, 5 μm, 6 μm, 7 μm, 8 μm, 9 μm, 10 μm, 20 μm, 30 μm, 40 μm, 50 μm, 60 μm, 70 μm, 80 μm, 90 μm, 100 μm, 110 μm, 120 μm, 130 μm, 140 μm, 150 μm, 160 μm, 170 μm, 180 μm, 190 μm, 200 μm, 210 μm, 220 μm, 230 μm, 240 μm, 250 μm, 260 μm, 270 μm, 280 μm, 290 μm, 300 μm, 310 μm, 320 μm, 330 μm, 340 μm, 350 μm, 360 μm, 370 μm, 380 μm, 390 μm, 400 μm, 410 μm, 420 μm, 430 μm, 440 μm, 450 μm, 460 μm, 470 μm, 480 μm, 490 μm, 500 μm, or a number or a range between any two of these values. For example, a size or dimension of one, or each, bead is about 1 nm to about 100 μm. As another example, the bead can have a dimension of about 10 μm to about 100 μm. As another example, the bead can have a dimension of about 30 μm.
The volume of one, or each, bead can be different in different embodiments. The volume of one, or each, bead can be, be about, be at least, be at least about, be at most, or be at most about, 1 nm3, 2 nm3, 3 nm3, 4 nm3, 5 nm3, 6 nm3, 7 nm3, 8 nm3, 9 nm3, 10 nm3, 20 nm3, 30 nm3, 40 nm3, 50 nm3, 60 nm3, 70 nm3, 80 nm3, 90 nm3, 100 nm3, 200 nm3, 300 nm3, 400 nm3, 500 nm3, 600 nm3, 700 nm3, 800 nm3, 900 nm3, 1000 nm3, 10000 nm3, 100000 nm3, 1000000 nm3, 10000000 nm3, 100000000 nm3, 1000000000 nm3, 2 μm3, 3 μm3, 4 μm3, 5 μm3, 6 μm3, 7 μm3, 8 μm3, 9 μm3, 10 μm3, 20 μm3, 30 μm3, 40 μm3, 50 μm3, 60 μm3, 70 μm3, 80 μm3, 90 μm3, 100 μm3, 200 μm3, 300 μm3, 400 μm3, 500 μm3, 600 μm3, 700 μm3, 800 μm3, 900 μm3, 1000 μm3, 10000 μm3, 100000 μm3, 1000000 μm3, or a number or a range between any two of these values. The volume of one, or each, bead can be, be about, be at least, be at least about, be at most, or be at most about, 1 nanoliter (nL), 2 nL, 3 nL, 4 nL, 5 nL, 6 nL, 7 nL, 8 nL, 9 nL, 10 nL, 11 nL, 12 nL, 13 nL, 14 nL, 15 nL, 16 nL, 17 nL, 18 nL, 19 nL, 20 nL, 21 nL, 22 nL, 23 nL, 24 nL, 25 nL, 26 nL, 27 nL, 28 nL, 29 nL, 30 nL, 31 nL, 32 nL, 33 nL, 34 nL, 35 nL, 36 nL, 37 nL, 38 nL, 39 nL, 40 nL, 41 nL, 42 nL, 43 nL, 44 nL, 45 nL, 46 nL, 47 nL, 48 nL, 49 nL, 50 nL, 51 nL, 52 nL, 53 nL, 54 nL, 55 nL, 56 nL, 57 nL, 58 nL, 59 nL, 60 nL, 61 nL, 62 nL, 63 nL, 64 nL, 65 nL, 66 nL, 67 nL, 68 nL, 69 nL, 70 nL, 71 nL, 72 nL, 73 nL, 74 nL, 75 nL, 76 nL, 77 nL, 78 nL, 79 nL, 80 nL, 81 nL, 82 nL, 83 nL, 84 nL, 85 nL, 86 nL, 87 nL, 88 nL, 89 nL, 90 nL, 91 nL, 92 nL, 93 nL, 94 nL, 95 nL, 96 nL, 97 nL, 98 nL, 99 nL, 100 nL, or a number or a range between any two of these values. For example, the volume of one, or each, bead is about 1 nm3 to about 1000000 μm3.
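The bead dimensions and volumes above are expressed in several units (nm3, μm3, and nL). The following sketch relates a bead diameter to its volume and converts between those units. It assumes a spherical bead, which the disclosure does not require, and the function and constant names are hypothetical.

```python
import math

# Unit relationships: 1 nL = 1e-12 m^3 and 1 um^3 = 1e-18 m^3, so 1 nL = 1e6 um^3.
UM3_PER_NL = 1e6
# 1 um = 1000 nm, so 1 um^3 = 1e9 nm^3.
NM3_PER_UM3 = 1e9


def sphere_volume_um3(diameter_um: float) -> float:
    """Volume of a spherical bead in cubic micrometers (sphere shape is an assumption)."""
    if diameter_um <= 0:
        raise ValueError("diameter_um must be positive")
    return math.pi * diameter_um ** 3 / 6.0


def um3_to_nl(volume_um3: float) -> float:
    """Convert cubic micrometers to nanoliters."""
    return volume_um3 / UM3_PER_NL


def um3_to_nm3(volume_um3: float) -> float:
    """Convert cubic micrometers to cubic nanometers."""
    return volume_um3 * NM3_PER_UM3


if __name__ == "__main__":
    volume = sphere_volume_um3(30.0)  # a bead with a diameter of about 30 um
    print(f"{volume:.1f} um^3 = {um3_to_nl(volume):.4f} nL")
```

Under the spherical assumption, a bead with a diameter of about 30 μm has a volume of roughly 1.4×10⁴ μm3, or about 0.014 nL.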
The number of beads introduced into a plurality of partitions can be different in different embodiments. In some embodiments, the number of beads introduced into a plurality of partitions is, is about, is at least, is at least about, is at most, or is at most about, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000, 20000, 30000, 40000, 50000, 60000, 70000, 80000, 90000, 100000, 200000, 300000, 400000, 500000, 600000, 700000, 800000, 900000, 1000000, 2000000, 3000000, 4000000, 5000000, 6000000, 7000000, 8000000, 9000000, 10000000, 20000000, 30000000, 40000000, 50000000, 60000000, 70000000, 80000000, 90000000, 100000000, 200000000, 300000000, 400000000, 500000000, 600000000, 700000000, 800000000, 900000000, 1000000000, or a number or a range between any two of these values. For example, the number of beads introduced into a plurality of partitions (e.g. microwells) can be at least 80,000 beads.
In some embodiments, beads are introduced to the partitions such that the percentage of partitions each occupied with one bead is, is about, is at least, is at least about, is at most, or is at most about, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or a number or a range between any two of these values. For example, at least 80% of the plurality of partitions can be each occupied with one bead.
In some embodiments, beads are introduced to the partitions such that the percentage of partitions with no bead is, is about, is at least, is at least about, is at most, or is at most about, 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, or a number or a range between any two of these values. For example, at most 20% of the plurality of partitions contain no bead.
Barcoding Sample Nucleic Acids
The method described herein can comprise barcoding a plurality of sample nucleic acids associated with the cell in the partition using the plurality of barcode oligonucleotides to generate a plurality of barcoded nucleic acids. FIG. 2 shows barcoding an mRNA molecule with a barcode oligonucleotide. The barcode oligonucleotide is shown attached to a bead for illustrative purposes and is not intended to be limiting.
Prior to barcoding the sample nucleic acids, the method can comprise lysing cells (e.g. after introducing a plurality of barcode oligonucleotides and/or a plurality of cells to the partition) to release the content of the cell within the partition. Lysis agents can be contacted with the cells or cell suspension concurrently with, or immediately after, the introduction of the cells into the partition and before the barcoding, e.g. through the flow channels. Examples of lysis agents include bioactive reagents, such as lysis enzymes, or surfactant-based lysis solutions including non-ionic surfactants such as Triton X-100 and Tween 20 and ionic surfactants such as sodium dodecyl sulfate (SDS). Lysis methods including, but not limited to, thermal, acoustic, electrical, or mechanical cellular disruption can also be used.
Synthesis of Single-Stranded Barcoded Nucleic Acids
In some embodiments, barcoding a plurality of sample nucleic acids (e.g., mRNA shown in FIG. 2) associated with the cell in the partition can comprise extending the plurality of barcode oligonucleotides using the plurality of sample nucleic acids as templates to generate partially single-stranded/partially double-stranded barcoded nucleic acids of the plurality of barcoded nucleic acids hybridized to sample nucleic acids of the plurality of sample nucleic acids. The partially single-stranded/partially double-stranded barcoded nucleic acids hybridized to sample nucleic acids can be separated by denaturation (e.g., heat denaturation or chemical denaturation using, for example, sodium hydroxide) to generate single-stranded barcoded nucleic acids of the plurality of barcoded nucleic acids. The single-stranded barcoded nucleic acids can comprise a barcode oligonucleotide and an oligonucleotide complementary to the sample nucleic acids. In some embodiments, the single-stranded barcoded nucleic acids can be generated by reverse transcription using a reverse transcriptase. For example, the single-stranded barcoded nucleic acids can be generated by using a DNA polymerase.
In some embodiments, the single-stranded barcoded nucleic acids can be cDNA produced by extending a barcode oligonucleotide using a sample RNA (e.g., mRNA) associated with the cell as a template. The single-stranded barcoded nucleic acids can be further extended using a template switching oligonucleotide (TSO) . A TSO is an oligo that hybridizes to untemplated C nucleotides added by a reverse transcriptase during reverse transcription. The TSO can be introduced into the partitions together with the reverse transcription reagents. For example, a reverse transcriptase can be used to generate a cDNA by extending a barcode oligonucleotide hybridized to an RNA. After extending the barcode oligonucleotide to the 5’-end of the RNA, the reverse transcriptase can add one or more nucleotides with cytosine (C) bases (e.g. two or three) to the 3’-end of the cDNA. The TSO can include one or more nucleotides with guanine (G) bases (e.g. two or more) on the 3’-end of the TSO. The nucleotides with G bases can be ribonucleotides. The G bases at the 3’-end of the TSO can hybridize to the cytosine bases at the 3’-end of the cDNA. The reverse transcriptase can further extend the cDNA using the TSO as the template to generate a cDNA with the reverse complement of the TSO sequence on its 3’-end. The barcoded nucleic acid can include the barcode sequences (e.g., cell barcode and UMI) on the 5’-end and a TSO sequence at its 3’-end.
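The template-switching steps described above can be traced at the sequence level with a short sketch. It is a simplified string model rather than a simulation of the enzymology. It assumes that exactly three untemplated C nucleotides are added and that the TSO ends in three G nucleotides (ribonucleotides are written as plain G). All sequences and the function names are hypothetical.

```python
# Complement table for building the DNA copy; U in the RNA template pairs with A.
_COMPLEMENT = str.maketrans("ACGTU", "TGCAA")


def reverse_complement_dna(seq: str) -> str:
    """Return the DNA reverse complement of a DNA or RNA sequence (5' to 3')."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def template_switched_cdna(barcode_oligo: str, mrna: str,
                           poly_t_len: int, tso: str) -> str:
    """Return the first-strand cDNA (5' to 3') after template switching.

    barcode_oligo: cell barcode + UMI + poly(dT), written 5' to 3'.
    mrna: transcript ending in a poly(A) tail, written 5' to 3'.
    tso: template switching oligonucleotide ending in GGG (assumed here).
    """
    if not barcode_oligo.endswith("T" * poly_t_len):
        raise ValueError("barcode oligo must end with the poly(dT) probe")
    if not mrna.endswith("A" * poly_t_len):
        raise ValueError("mRNA must end with a poly(A) tail of at least poly_t_len")
    if not tso.endswith("GGG"):
        raise ValueError("this sketch assumes a TSO ending in GGG")
    # The poly(dT) probe anneals to the tail, so only the upstream part is copied.
    transcript_body = mrna[: len(mrna) - poly_t_len]
    # The untemplated CCC plus the copied TSO equals the reverse complement of the TSO.
    return (barcode_oligo
            + reverse_complement_dna(transcript_body)
            + reverse_complement_dna(tso))


if __name__ == "__main__":
    oligo = "ACGTACGT" + "GATTAC" + "T" * 10   # hypothetical cell barcode + UMI + poly(dT)
    print(template_switched_cdna(oligo, "AUGGCC" + "A" * 10, 10, "AAGCAGTGGTGGG"))
```

In this model the barcode sequences sit at the 5’-end of the cDNA and the reverse complement of the TSO sits at its 3’-end, matching the arrangement described above.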
In some embodiments, barcoding a plurality of sample nucleic acids comprises extending the barcode oligonucleotides using the sample nucleic acids as templates and the plurality of barcode oligonucleotides as TSO to generate a plurality of single-stranded barcoded nucleic acids of the plurality of barcoded nucleic acids that are hybridized to the plurality of sample nucleic acids.
In some embodiments, the barcode oligonucleotides are not attached to a bead and the barcode oligonucleotides can be TSO. For example, extension primers (e.g. poly(dT)) can be introduced into the partitions which hybridize to a sample nucleic acid (e.g. the poly-adenylated mRNA). The extension primers can be extended using the sample nucleic acids as a template. For example, a reverse transcriptase can be used to generate a cDNA by extending an extension primer hybridized to an RNA. After extending the extension primers to the 5’-end of the RNA, the reverse transcriptase can add one or more C bases (e.g. two or three) to the 3’-end of the cDNA. The TSO or barcode oligonucleotide can include one or more G bases (e.g. two or more) on the 3’-end of the TSO. The nucleotides with guanine bases can be ribonucleotides. The G bases at the 3’-end of the TSO or barcode oligonucleotide can hybridize to the cytosine bases at the 3’-end of the cDNA. The reverse transcriptase can switch template from the mRNA to the TSO or barcode oligonucleotide. The reverse transcriptase can further extend the cDNA using the TSO or barcode oligonucleotide as the template to generate a cDNA further comprising the reverse complement of the TSO or barcode oligonucleotide. In this case, the barcode sequences (e.g. cell barcode and UMI) are on the 3’-end of the generated cDNA.
The single-stranded barcoded nucleic acids can be separated from the template sample nucleic acids by digesting the template sample nucleic acids (e.g., using RNase) , by chemical treatment (e.g., using sodium hydroxide) , by hydrolyzing the template sample nucleic acids, or via a denaturation or melting process by increasing the temperature, adding organic solvents, or increasing pH. Following the melting process, the sample nucleic acids can be removed (e.g. washed away) and the single-stranded barcoded nucleic acids can be retained in the partition (e.g. through attachment to the partitions or through attachments to beads which can be retained in the partitions) .
Synthesis of Double-Stranded Barcoded Nucleic Acids
In some embodiments, barcoding a plurality of sample nucleic acids associated with the cell in the partition can comprise generating the plurality of barcoded nucleic acids comprising double-stranded barcoded nucleic acids in the partition using the single-stranded barcoded nucleic acids as templates. The double-stranded barcoded nucleic acids can be generated from the single-stranded barcoded nucleic acids retained in the partition using, for example, second-strand synthesis or one-cycle PCR.
The generated double-stranded barcoded nucleic acid can be denatured or melted to generate two single-stranded barcoded nucleic acids: one single-stranded barcoded nucleic acid retained in the partition (e.g., attached to the bead) and the other single-stranded barcoded nucleic acid released into the solution from the retained single-stranded barcoded nucleic acid, which can then be pooled to provide a pooled mixture outside the partitions. Both single-stranded barcoded nucleic acids (e.g. retained in the partitions or pooled outside the partitions) have a sequence comprising a sequence of a barcode oligonucleotide (e.g. a cell barcode sequence and/or a UMI barcode) and a sequence of a sample nucleic acid or a reverse complement thereof.
Barcodes
The term “barcode” as used herein can generally be a verb or a noun. When used as a noun, the term “barcode” or “barcode oligonucleotide” refers to a label that can be attached to a polynucleotide, or any variant thereof, to convey information about the polynucleotide. For example, a barcode can be a polynucleotide sequence attached to all fragments of the sample nucleic acids associated with the cell in the partition. The barcode can then be sequenced alone or with the fragments and/or full length of the sample nucleic acids associated with the cell. The presence of the same barcode on multiple sequences or different barcodes on different sequences can provide information about the cell origin and/or the molecular origin of the sequences. When used as a verb, the term “barcode” refers to a process of attaching a barcode or a barcode oligonucleotide to a sample nucleic acid associated with the cell. The barcode oligonucleotides can be attached to a partition directly or indirectly. The barcode oligonucleotides can also be associated with beads.
Barcode oligonucleotides can be generated from a variety of different formats, including pre-designed polynucleotide barcodes, randomly synthesized barcode sequences, microarray-based barcode synthesis, random N-mers, or combinations thereof as will be understood by a person skilled in the art.
In some embodiments, the plurality of barcode oligonucleotides comprise, comprise about, comprise at least, comprise at least about, comprise at most, or comprise at most about 1, 5, 10, 50, 100, 1000, 10000, 20000, 30000, 40000, 50000, 60000, 70000, 80000, 90000, 100000, 200,000, 300,000, 400,000, 500,000, 600,000, 700,000, 800,000, 900,000, 1000000, 2000000, 3000000, 4000000, 5000000, 6000000, 7000000, 8000000, 9000000, 10000000, 20000000, 50000000, 100000000, 200000000, 300000000, 400000000, 500000000, 600000000, 700000000, 800000000, 900000000, 1000000000 barcode oligonucleotides, or a number or a range between any two of these values.
A barcode oligonucleotide of the plurality of barcode oligonucleotides can be in any suitable length. In some embodiments, a barcode oligonucleotide of the plurality of barcode oligonucleotides can be about 2 to about 500 nucleotides in length, about 2 to about 100 nucleotides in length, about 2 to about 50 nucleotides in length, about 2 to about 40 nucleotides in length, about 4 to about 20 nucleotides in length, or about 6 to 16 nucleotides in length. In some embodiments, a barcode oligonucleotide of the plurality of barcode oligonucleotides is about, at least, at least about, at most, or at most about 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 85, 90, 95, 100, 150, 200, 250, 300, 400, or 500 nucleotides in length, or a number or a range between any two of these values.
Each of the plurality of barcode oligonucleotides used herein can comprise a cell barcode and a molecular barcode (e.g. a UMI) (see FIG. 2). A barcode oligonucleotide can also comprise a probe sequence or region capable of hybridizing to sample nucleic acids (e.g. poly(dT) sequence in FIG. 2). A barcode oligonucleotide can also include additional sequence segments such as additional recognition or binding sequences, a template switching oligonucleotide, and primer-binding sequences (e.g. a sequencing primer-binding sequence as in FIG. 2, or a PCR primer-binding sequence) for subsequent processing (e.g. PCR amplification) and/or sequencing.
The configuration of the various sequences comprised in a barcode oligonucleotide of the plurality of barcode oligonucleotides introduced into a partition (e.g. cell barcode sequence, UMI, primer sequence, probe sequence or region, and/or any additional sequences) can vary depending on, for example, the particular configuration desired and/or the order in which the various components of the sequence are added, as will be understood by a person skilled in the art. For example, the barcode oligonucleotide can comprise, from the 5’ end to the 3’ end, the cell barcode, the UMI, the PCR primer-binding sequence, and the probe sequence, or the UMI, the cell barcode, the PCR primer-binding sequence, and the probe sequence. In some embodiments, a barcode oligonucleotide has a configuration from the 5’ end to the 3’ end: cell barcode, UMI, primer-binding sequence, probe sequence. In some embodiments, a barcode oligonucleotide has a configuration from the 5’ end to the 3’ end: cell barcode, UMI, primer-binding sequence, TSO.
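One of the configurations above (cell barcode, UMI, primer-binding sequence, probe sequence, from the 5’ end to the 3’ end) can be represented as a simple layout, as sketched below. The segment lengths, the sequences, and the names `OligoLayout`, `build_barcode_oligo`, and `split_barcode_oligo` are hypothetical and chosen only for illustration.

```python
from typing import NamedTuple


class OligoLayout(NamedTuple):
    """Lengths of the fixed-length segments, in 5' to 3' order (assumed values)."""
    cell_barcode_len: int
    umi_len: int
    primer_site_len: int


def build_barcode_oligo(cell_barcode: str, umi: str,
                        primer_site: str, probe: str) -> str:
    """Concatenate the segments 5' to 3': cell barcode, UMI, primer site, probe."""
    return cell_barcode + umi + primer_site + probe


def split_barcode_oligo(oligo: str, layout: OligoLayout) -> dict:
    """Split an oligonucleotide back into its named segments using the layout."""
    if len(oligo) < sum(layout):
        raise ValueError("oligo is shorter than the fixed-length segments")
    i = layout.cell_barcode_len
    j = i + layout.umi_len
    k = j + layout.primer_site_len
    return {
        "cell_barcode": oligo[:i],
        "umi": oligo[i:j],
        "primer_site": oligo[j:k],
        "probe": oligo[k:],  # e.g. a poly(dT) probe or a TSO sequence
    }


if __name__ == "__main__":
    layout = OligoLayout(cell_barcode_len=8, umi_len=6, primer_site_len=4)
    oligo = build_barcode_oligo("ACGTACGT", "GATTAC", "CCGG", "T" * 12)
    print(split_barcode_oligo(oligo, layout))
```

Swapping the order of the cell barcode and the UMI, as in the alternative configuration above, would only change the order of the slices.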
Cell Barcode
In some embodiments, the cell barcodes are for identifying that the plurality of barcoded nucleic acids originate from the cell. The cell barcodes of the barcode oligonucleotides in a partition can be identical or different.
In some embodiments, the cell barcodes can serve to track the sample nucleic acids associated with the cell throughout the processing (e.g., location of the cells in the plurality of partitions) when the cell barcode associated with the sample nucleic acids is read during sequencing. In some embodiments, the cell barcodes can serve to provide linkage information between cell nucleic acid sequences and cell functionality when used in combination with optical imaging. Barcoded nucleic acids with an identical cell barcode can be generated from sample nucleic acids of a cell within a given partition. Some barcoded nucleic acids are pooled and sequenced to determine cell nucleic acid sequences or a profile (e.g., an mRNA expression profile) which is associated with (e.g., identifiable by or linked with) the cell barcode sequence.
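The link between a cell barcode and an expression profile can be illustrated with a small grouping routine. The sketch below assumes that each sequenced read has already been reduced to a (cell barcode, UMI, gene) triple and that reads sharing all three values derive from the same original molecule. The record format and the function name `expression_profiles` are hypothetical.

```python
from collections import defaultdict


def expression_profiles(records) -> dict:
    """Count distinct UMIs per gene for each cell barcode.

    records: iterable of (cell_barcode, umi, gene) triples (assumed input format).
    Returns {cell_barcode: {gene: number_of_distinct_umis}}.
    """
    umis_seen = defaultdict(set)
    for cell_barcode, umi, gene in records:
        # Reads with the same cell barcode, gene, and UMI are counted once.
        umis_seen[(cell_barcode, gene)].add(umi)

    profiles = defaultdict(dict)
    for (cell_barcode, gene), umis in umis_seen.items():
        profiles[cell_barcode][gene] = len(umis)
    return dict(profiles)


if __name__ == "__main__":
    # Hypothetical reads; barcodes, UMIs, and gene labels are placeholders.
    reads = [
        ("AAAA", "U1", "geneA"),
        ("AAAA", "U1", "geneA"),  # duplicate read of the same molecule
        ("AAAA", "U2", "geneA"),
        ("CCCC", "U1", "geneB"),
    ]
    print(expression_profiles(reads))
```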
The number (or percentage) of barcode oligonucleotides introduced in a partition with cell barcodes having an identical sequence can be different in different embodiments. In some embodiments, the number of barcode oligonucleotides introduced in a partition with cell barcodes having an identical sequence is, is about, is at least, is at least about, is at most, or is at most about, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000, 20000, 30000, 40000, 50000, 60000, 70000, 80000, 90000, 100000, 200000, 300000, 400000, 500000, 600000, 700000, 800000, 900000, 1000000, 2000000, 3000000, 4000000, 5000000, 6000000, 7000000, 8000000, 9000000, 10000000, 20000000, 30000000, 40000000, 50000000, 60000000, 70000000, 80000000, 90000000, 100000000, 200000000, 300000000, 400000000, 500000000, 600000000, 700000000, 800000000, 900000000, 1000000000, or a number or a range between any two of these values. In some embodiments, the percentage of barcode oligonucleotides introduced in a partition with cell barcodes having an identical sequence is, is about, is at least, is at least about, is at most, or is at most about, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 99.9%, 100%, or a number or a range between any two of these values. For example, the cell barcodes of at least two barcode oligonucleotides introduced in a partition comprise an identical sequence.
A cell barcode can be unique (or substantially unique) to a partition. The number of unique cell barcode sequences can be different in different embodiments. In some embodiments, the number of unique cell barcode sequences is, is about, is at least, is at least about, is at most, or is at most about, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000, 20000, 30000, 40000, 50000, 60000, 70000, 80000, 90000, 100000, 200000, 300000, 400000, 500000, 600000, 700000, 800000, 900000, 1000000, 2000000, 3000000, 4000000, 5000000, 6000000, 7000000, 8000000, 9000000, 10000000, 20000000, 30000000, 40000000, 50000000, 60000000, 70000000, 80000000, 90000000, 100000000, 200000000, 300000000, 400000000, 500000000, 600000000, 700000000, 800000000, 900000000, 1000000000, or a number or a range between any two of these values. In some embodiments, the percentage of unique cell barcode sequences is, is about, is at least, is at least about, is at most, or is at most about, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 99.9%, 100%, or a number or a range between any two of these values, of the cell barcode sequences of the barcode oligonucleotides introduced in a partition. For example, the cell barcodes of barcode oligonucleotides introduced in two partitions can comprise different sequences.
In some embodiments, barcode oligonucleotides are introduced to the plurality of partitions such that different sets of barcode oligonucleotides introduced in different partitions have different cell barcodes and the barcode oligonucleotides of a set introduced in a same partition have the same cell barcode. For example, nucleic acids associated with the cell in a partition of the plurality of partitions can be barcoded with the same cell barcode. For another example, the cell barcodes of two barcode oligonucleotides attached to a bead of the plurality of beads can comprise an identical sequence. The cell barcodes of two barcode oligonucleotides attached to two beads of the plurality of beads can comprise different sequences.
The length of a cell barcode of a barcode oligonucleotide (or a cell barcode of each barcode oligonucleotide or all cell barcodes of the plurality of barcode oligonucleotides) can be different in
different embodiments. In some embodiments, a cell barcode of a barcode oligonucleotide (or each cell barcode of each barcode oligonucleotide or all cell barcodes of the plurality of barcode oligonucleotides) is, is about, is at least, is at least about, is at most, or is at most about, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, or a number or a range between any two of these values, nucleotides in length.
In some embodiments, a cell barcode of a barcode oligonucleotide (or each cell barcode of each barcode oligonucleotide or all cell barcodes of the plurality of barcode oligonucleotides) has a length greater than 2 nucleic acid bases. In some embodiments, a cell barcode of a barcode oligonucleotide (or each cell barcode of each barcode oligonucleotide or all cell barcodes of the plurality of barcode oligonucleotides) is 2-40 nucleotides in length. In some embodiments, a cell barcode of a barcode oligonucleotide (or each cell barcode of each barcode oligonucleotide or all cell barcodes of the plurality of barcode oligonucleotides) is at least 6 nucleic acid bases in length.
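As a rough illustration of how cell barcode length relates to the number of unique cell barcode sequences discussed above, the following Python sketch computes the number of distinct sequences available at a given length (4^n over the four bases A, C, G and T) and the shortest length that can supply a requested number of unique barcodes. The function names are hypothetical and are not part of the disclosure; a practical design would likely use longer barcodes than this lower bound, for example to tolerate sequencing errors.

```python
def barcode_space(length: int) -> int:
    """Number of distinct sequences of the given length over A, C, G, T."""
    if length < 0:
        raise ValueError("length must be non-negative")
    return 4 ** length


def min_barcode_length(num_unique: int) -> int:
    """Shortest barcode length (>= 1) whose sequence space holds num_unique barcodes."""
    if num_unique < 1:
        raise ValueError("num_unique must be at least 1")
    length = 1
    while barcode_space(length) < num_unique:
        length += 1
    return length


if __name__ == "__main__":
    # Illustrative target counts taken from the ranges listed above.
    for n in (100, 10_000, 1_000_000):
        print(n, "unique barcodes need at least", min_barcode_length(n), "nt")
```

This is only a counting argument: it gives the theoretical minimum length and says nothing about how barcodes are synthesized or assigned to beads.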
UMI
In some embodiments, the unique molecule identifiers (UMIs) are for identifying molecular origins of the plurality of barcoded nucleic acids. UMIs are short sequences used to uniquely tag each molecule in a sample in some embodiments. The UMIs of the barcode oligonucleotides of the plurality of barcode oligonucleotides partitioned into a partition can be identical or different. For example, the UMIs of two barcode oligonucleotides attached to a bead of the plurality of beads can comprise different sequences. The UMIs of two barcode oligonucleotides attached to two beads of the plurality of beads can comprise an identical sequence.
In some embodiments, the UMIs of the plurality of barcode oligonucleotides are different. The number (or percentage) of UMIs of barcode oligonucleotides introduced in a partition with different sequences can be different in different embodiments. In some embodiments, the number of UMIs of barcode oligonucleotides introduced in a partition with different sequences is, is about, is at least, is at least about, is at most, or is at most about, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000, 20000, 30000, 40000, 50000, 60000, 70000, 80000, 90000, 100000, 200000, 300000, 400000, 500000, 600000, 700000, 800000, 900000, 1000000, 2000000, 3000000, 4000000, 5000000, 6000000, 7000000, 8000000, 9000000, 10000000, 20000000, 30000000, 40000000, 50000000, 60000000, 70000000, 80000000, 90000000, 100000000, 200000000, 300000000, 400000000, 500000000, 600000000, 700000000, 800000000,
900000000, 1000000000, or a number or a range between any two of these values. In some embodiments, the percentage of UMIs of barcode oligonucleotides introduced in a partition with different sequences is, is about, is at least, is at least about, is at most, or is at most about, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 99.9%, 100%, or a number or a range between any two of these values. For example, the UMIs of two barcode oligonucleotides of the plurality of barcode oligonucleotides introduced in a partition can comprise different sequences.
The number of barcode oligonucleotides introduced in a partition with UMIs having an identical sequence can be different in different embodiments. In some embodiments, the number of barcode oligonucleotides introduced in a partition with UMIs having an identical sequence is, is about, is at least, is at least about, is at most, or is at most about, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, or a number or a range between any two of these values. For example, the UMIs of two barcode oligonucleotides introduced in a partition can comprise an identical sequence.
The number of unique UMI sequences can be different in different embodiments. In some embodiments, the number of unique UMI sequences is, is about, is at least, is at least about, is at most, or is at most about, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000, 20000, 30000, 40000, 50000, 60000, 70000, 80000, 90000, 100000, 200000, 300000, 400000, 500000, 600000, 700000, 800000, 900000, 1000000, 2000000, 3000000, 4000000, 5000000, 6000000, 7000000, 8000000, 9000000, 10000000, 20000000, 30000000, 40000000, 50000000, 60000000, 70000000, 80000000, 90000000, 100000000, 200000000, 300000000, 400000000, 500000000, 600000000, 700000000, 800000000, 900000000, 1000000000, or a number or a range between any two of these values.
The length of a UMI of a barcode oligonucleotide (or a UMI of each barcode oligonucleotide) can be different in different embodiments. In some embodiments, a UMI of a barcode oligonucleotide (or a UMI of each barcode oligonucleotide) is, is about, is at least, is at least about, is at most, or is at most about, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, or a number or a range between any two of these values, nucleotides in length.
In some embodiments, the UMIs have a length greater than 2 nucleic acid bases. In some embodiments, the UMIs are 2-40 nucleotides in length. In some embodiments, the UMIs are at least 6 nucleic acid bases in length.
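The roles of the cell barcode (identifying the cell of origin) and the UMI (identifying the molecule of origin) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the method of the disclosure: the read layout (a 16-nucleotide cell barcode immediately followed by a 12-nucleotide UMI), the gene labels, and all function names are hypothetical choices made for illustration.

```python
from collections import defaultdict

CELL_BARCODE_LEN = 16  # assumed length, for illustration only
UMI_LEN = 12           # assumed length, for illustration only


def split_barcode_read(read: str) -> tuple[str, str]:
    """Split a barcode read into (cell barcode, UMI) under the assumed layout."""
    if len(read) < CELL_BARCODE_LEN + UMI_LEN:
        raise ValueError("read is shorter than cell barcode + UMI")
    cell_barcode = read[:CELL_BARCODE_LEN]
    umi = read[CELL_BARCODE_LEN:CELL_BARCODE_LEN + UMI_LEN]
    return cell_barcode, umi


def count_molecules(records):
    """Count distinct UMIs per (cell barcode, gene).

    records: iterable of (barcode_read, gene) pairs. Reads sharing the same
    cell barcode, gene and UMI are treated as copies of one original molecule.
    """
    umis = defaultdict(set)
    for barcode_read, gene in records:
        cell_barcode, umi = split_barcode_read(barcode_read)
        umis[(cell_barcode, gene)].add(umi)
    return {key: len(value) for key, value in umis.items()}
```

Exact-match collapsing is the simplest possible choice; it does not correct sequencing errors within the UMI or the cell barcode.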
Primer Sequence
In some embodiments, a barcode oligonucleotide (or each barcode oligonucleotide of the plurality of barcode oligonucleotides) can comprise a primer sequence. The primer sequence can be a sequencing primer sequence (or a sequencing primer binding sequence) or a PCR primer sequence (or PCR primer binding sequence) . For example, the PCR primer binding sequence is a Read 1 sequence.
Probe Sequence
In some embodiments, a barcode oligonucleotide (or each barcode oligonucleotide of the plurality of barcode oligonucleotides) can comprise a probe sequence or region capable of hybridizing to a plurality of sample nucleic acids, a particular type of sample nucleic acids (e.g. mRNA) , and/or specific sample nucleic acids (e.g. specific gene of interest) .
The length of a probe sequence can be different in different embodiments. In some embodiments, a probe sequence is, is about, is at least, is at least about, is at most, or is at most about, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, or a number or a range between any two of these values, nucleotides in length. The probe sequence can be 12-18 deoxythymidines in length. In some embodiments, the probe sequence can be 20 nucleotides or longer to enable its annealing in reverse transcription reactions at higher temperatures, as will be understood by a person skilled in the art.
In some embodiments, barcode oligonucleotides comprising probe sequences can be introduced into the partitions together with other reagents such as the reverse transcription reagents. The number of the barcode oligonucleotides introduced into a partition comprising a probe sequence can be different in different embodiments. In some embodiments, the number of barcode oligonucleotides introduced into a partition comprising a probe sequence (e.g., poly (dT) sequence) is, is about, is at least, is at least about, is at most, or is at most about, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000, 20000, 30000, 40000, 50000, 60000, 70000, 80000, 90000, 100000, 200000, 300000, 400000, 500000, 600000, 700000, 800000, 900000, 1000000, 2000000, 3000000, 4000000, 5000000, 6000000, 7000000, 8000000, 9000000, 10000000, 20000000, 30000000, 40000000, 50000000,
60000000, 70000000, 80000000, 90000000, 100000000, 200000000, 300000000, 400000000, 500000000, 600000000, 700000000, 800000000, 900000000, 1000000000, or a number or a range between any two of these values.
In some embodiments, the probe sequence can be on a 3’ end of a barcode oligonucleotide of the plurality of barcode oligonucleotides introduced in a partition. Barcode oligonucleotides each comprising a poly (dT) probe sequence can be used to capture (e.g., hybridize to) the 3’ end of polyadenylated mRNA transcripts in a sample nucleic acid for a downstream 3’ gene expression library construction.
In some embodiments, the probe sequence can comprise a poly (dT) sequence which is a single-stranded sequence of deoxythymidine (dT) used for first-strand cDNA synthesis catalyzed by reverse transcriptase. In some embodiments, barcode oligonucleotides with a probe sequence comprising a poly (dT) sequence can be introduced into the partitions as extension primers to synthesize the first-strand cDNA using the sample nucleic acid (e.g. RNA) as a template.
In some embodiments, the barcode oligonucleotides of the plurality of barcode oligonucleotides each comprise a poly (dT) sequence. The poly (dT) sequence can be capable of binding to a poly (A) region (e.g., a poly (A) tail) of a nucleic acid target (e.g., mRNA target) . In some embodiments, the poly (dT) sequences of the barcode oligonucleotides of the plurality of barcode oligonucleotides attached to a bead (or each bead or all beads) are identical. The percentage of the barcode oligonucleotides of the plurality of barcode oligonucleotides attached to a bead (or each bead or all beads) with an identical poly (dT) sequence can be different in different embodiments. In some embodiments, the percentage of the barcode oligonucleotides of the plurality of barcode oligonucleotides attached to a bead (or each bead or all beads) with an identical poly (dT) sequence is, is about, is at least, is at least about, is at most, or is at most about, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or a number or a range between any two of these values.
In some embodiments, barcode oligonucleotides of the plurality of barcode oligonucleotides each comprise a probe sequence. The probe sequence, for example, is not a poly (dT) sequence (though a probe sequence can comprise a stretch of Ts) . The probe sequence can be capable of binding to a non-poly (A) nucleic acid target. The number of different probe sequences of the barcode oligonucleotides attached to a bead (or each bead or all beads) can be different in different embodiments. For example, barcode oligonucleotides of the plurality of barcode oligonucleotides
can comprise probe sequences that are capable of binding to an identical non-poly (A) nucleic acid target. For another example, barcode oligonucleotides of the plurality of barcode oligonucleotides can comprise probe sequences that are capable of binding to different non-poly (A) nucleic acid targets. In some embodiments, the number of different probe sequences of the barcode oligonucleotides attached to a bead (or each bead or all beads) is, is about, is at least, is at least about, is at most, or is at most about, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000, 20000, 50000, 1000000, or a number or a range between any two of these values.
In some embodiments, the probe sequences of all barcode oligonucleotides of the plurality of barcode oligonucleotides comprise poly (dT) capable of hybridizing to poly (A) tails of mRNA molecules (or poly (dA) regions or tails of DNA) . In some embodiments, the probe sequences of some barcode oligonucleotides of the plurality of barcode oligonucleotides comprise non-poly (dT) (e.g., gene-specific or target-specific probe) sequences. The non-poly (dT) probe sequences can be designed based on known sequences of a target nucleic acid of interest. The non-poly (dT) probe sequences can span a nucleic acid region of interest, or be adjacent to (upstream or downstream of) a nucleic acid region of interest.
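The hybridization of a probe sequence to its target can be pictured as a reverse-complement match. The Python sketch below is an idealized illustration that assumes perfect complementarity and ignores melting temperature, secondary structure, and mismatches; the function names and the example sequences are hypothetical and are not taken from the disclosure.

```python
# Maps each base to its DNA complement; U (RNA) pairs with A like T does.
_COMPLEMENT = str.maketrans("ACGTU", "TGCAA")


def reverse_complement(seq: str) -> str:
    """Reverse complement written as DNA; accepts DNA or RNA input."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_binding_site(probe: str, target: str) -> int:
    """Return the 0-based start on the target (read 5'->3') of the first site
    perfectly complementary to the probe, or -1 if there is none."""
    target_dna = target.upper().replace("U", "T")
    return target_dna.find(reverse_complement(probe))


if __name__ == "__main__":
    mrna = "AUGGCCAUUGUAAUGGGCCGC" + "A" * 25   # toy transcript with a poly(A) tail
    poly_dt_probe = "T" * 18                    # poly(dT) probe
    gene_probe = "CCATTACAATGGC"                # toy gene-specific probe
    print(find_binding_site(poly_dt_probe, mrna))
    print(find_binding_site(gene_probe, mrna))
```

In this toy model a poly (dT) probe finds the start of the poly (A) tail, while a gene-specific probe finds an internal site of the transcript.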
The length of a non-poly (dT) probe sequence can be different in different embodiments. In some embodiments, a non-poly (dT) probe sequence is, is about, is at least, is at least about, is at most, or is at most about, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, or a number or a range between any two of these values, nucleotides in length. For example, a non-poly (dT) probe sequence is at least 10 nucleotides in length.
The number of the barcode oligonucleotides introduced into a partition comprising a gene-specific probe sequence can be different in different embodiments. In some embodiments, the number of barcode oligonucleotides introduced into a partition comprising a gene-specific probe sequence is, is about, is at least, is at least about, is at most, or is at most about, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000, 20000, 30000, 40000, 50000, 60000, 70000, 80000, 90000, 100000, 200000, 300000, 400000, 500000, 600000, 700000, 800000, 900000, 1000000, 2000000, 3000000, 4000000,
5000000, 6000000, 7000000, 8000000, 9000000, 10000000, 20000000, 30000000, 40000000, 50000000, 60000000, 70000000, 80000000, 90000000, 100000000, 200000000, 300000000, 400000000, 500000000, 600000000, 700000000, 800000000, 900000000, 1000000000, or a number or a range between any two of these values.
Accordingly, the number of nucleic acid targets of interest (e.g. genes of interest) that the barcode oligonucleotides introduced into a partition are capable of binding can be different in different embodiments. In some embodiments, the number of nucleic acid targets of interest (e.g. genes of interest) the barcode oligonucleotides introduced into a partition are capable of binding is, is about, is at least, is at least about, is at most, or is at most about, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000, 20000, 50000, 1000000, or a number or a range between any two of these values. One barcode oligonucleotide introduced into a partition can bind to a molecule (or a copy) of a nucleic acid target. Barcode oligonucleotides introduced into a partition can bind to molecules (or copies) of a nucleic acid target or a plurality of nucleic acid targets.
In some embodiments, the barcode oligonucleotides of the plurality of barcode oligonucleotides can each comprise a poly (dT) sequence, a non-poly (dT) probe sequence, or both. The poly (dT) sequence and the gene-specific probe sequence can be on a same barcode oligonucleotide or different barcode oligonucleotides of the plurality of barcode oligonucleotides introduced into a partition.
In some embodiments, the probe sequences of barcode oligonucleotides of the plurality of barcode oligonucleotides comprise a degenerate sequence. The length of a degenerate sequence can be different in different embodiments. In some embodiments, the length of the degenerate sequence is, is about, is at least, is at least about, is at most, or is at most about, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, or a number or a range between any two of these values, nucleotides. For example, a length of the degenerate sequence can be at least 3 nucleotides. The degenerate sequence can span a mutation. For example, the degenerate sequence is three nucleotides in length, and the second position of the degenerate sequence is the position of a single nucleotide variation. The degenerate sequence can correspond to a mutation. For example, the degenerate sequence is one nucleotide in length, and the position of the degenerate sequence corresponds to the position of a single nucleotide variation. The length of the degenerate sequence and the length of the mutation can be identical. The length of the degenerate sequence and the length of the mutation can be different. The length of the degenerate sequence can be longer than the length of the mutation.
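To make the notion of a degenerate sequence concrete, the following Python sketch enumerates the concrete sequences represented by a degenerate one. It assumes the degenerate positions are written with standard IUPAC nucleotide codes (for example N for any base), which is a notation choice made here for illustration; the disclosure does not specify a notation, and the function name is hypothetical.

```python
from itertools import product

# Standard IUPAC nucleotide codes and the bases each one represents.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(seq: str) -> list[str]:
    """List every concrete sequence encoded by a degenerate sequence."""
    choices = [IUPAC[base] for base in seq.upper()]
    return ["".join(combo) for combo in product(*choices)]


if __name__ == "__main__":
    # One degenerate position (e.g. at a single nucleotide variation)
    # flanked by two fixed bases.
    print(expand_degenerate("ANG"))
    # A fully degenerate three-nucleotide sequence.
    print(len(expand_degenerate("NNN")))
```

A probe carrying such a degenerate stretch is in effect a mixture of oligonucleotides, one of which matches each variant at the covered position.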
Template Switching Oligonucleotide
In some embodiments, a barcode oligonucleotide (or each barcode oligonucleotide of the plurality of barcode oligonucleotides) can be a template switching oligonucleotide (TSO) . A primer comprising a target binding region, such as a poly (dT) sequence, can hybridize to a sample nucleic acid (e.g., an mRNA) and be extended by, for example, reverse transcription to generate an extended primer comprising a reverse complement of the sample nucleic acid, or a portion thereof (e.g., cDNA) . The extended primer or cDNA can be further extended to include the reverse complement of a TSO or barcode oligonucleotide as illustrated in FIG. 2. The resulting barcoded nucleic acid includes the barcodes of the barcode oligonucleotide on the 3’-end. In some embodiments, a barcode oligonucleotide (or each barcode oligonucleotide of the plurality of barcode oligonucleotides) is not a template switching oligonucleotide. A barcode oligonucleotide comprising a target binding region, such as a poly (dT) sequence, can hybridize to a sample nucleic acid (e.g., an mRNA) and be extended by, for example, reverse transcription to generate an extended primer comprising a reverse complement of the sample nucleic acid, or a portion thereof (e.g., cDNA) . The extended primer or cDNA can be further extended to include the reverse complement of a TSO. The resulting barcoded nucleic acid includes the barcodes of the barcode oligonucleotide on the 5’-end.
A template switching oligonucleotide (TSO) is an oligonucleotide that hybridizes to untemplated C nucleotides added by a reverse transcriptase during reverse transcription. The TSO can hybridize to the 3’ end of a cDNA molecule. The TSO can include one or more nucleotides with guanine (G) bases on the 3’-end of the TSO, with which the one or more cytosine (C) bases added by a reverse transcriptase to the 3’-end of a cDNA can hybridize. The series of G bases can comprise 1 G base, 2 G bases, 3 G bases, 4 G bases, 5 G bases or more than 5 G bases. The series of G bases can be ribonucleotides. The reverse transcriptase can further extend the cDNA using the TSO as the template to generate a barcoded cDNA comprising the reverse complement of the TSO.
The length of a TSO can be different in different embodiments. In some embodiments, a template switching oligonucleotide is, is about, is at least, is at least about, is at most, or is at most about, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, or a number or a range between any two of these values, nucleotides in length.
The TSO can, e.g., have a length greater than 2 nucleic acid bases. In some embodiments, the template switching oligonucleotides are 2-40 nucleotides in length. In some embodiments, the template switching oligonucleotides are at least 12 nucleic acid bases in length.
The number of the barcode oligonucleotides introduced into a partition comprising a TSO can be different in different embodiments. In some embodiments, the number of barcode oligonucleotides introduced into a partition comprising a TSO is, is about, is at least, is at least about, is at most, or is at most about, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000, 20000, 30000, 40000, 50000, 60000, 70000, 80000, 90000, 100000, 200000, 300000, 400000, 500000, 600000, 700000, 800000, 900000, 1000000, 2000000, 3000000, 4000000, 5000000, 6000000, 7000000, 8000000, 9000000, 10000000, 20000000, 30000000, 40000000, 50000000, 60000000, 70000000, 80000000, 90000000, 100000000, 200000000, 300000000, 400000000, 500000000, 600000000, 700000000, 800000000, 900000000, 1000000000, or a number or a range between any two of these values.
In some embodiments, the TSOs of the barcode oligonucleotides introduced into a partition can be identical. In some embodiments, the TSOs of the barcode oligonucleotides introduced into a partition can be different. The percentage of the barcode oligonucleotides of the plurality of barcode oligonucleotides introduced into a partition with an identical TSO sequence can be different in different embodiments. In some embodiments, the percentage of the barcode oligonucleotides of the plurality of barcode oligonucleotides introduced into a partition with an identical TSO sequence is, is about, is at least, is at least about, is at most, or is at most about, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 61%, 62%, 63%, 64%, 65%, 66%, 67%, 68%, 69%, 70%, 71%, 72%, 73%, 74%, 75%, 76%, 77%, 78%, 79%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 99.9%, 100%, or a number or a range between any two of these values.
Barcode Oligonucleotide Extension (e.g., Reverse Transcription)
Barcoding the nucleic acid targets associated with the cell can comprise: hybridizing the barcode oligonucleotides attached to the bead in each partition of the plurality of partitions with nucleic acid targets associated with the cell in the partition; extending the barcode oligonucleotides attached to the bead and hybridized to the nucleic acid targets using the nucleic acid targets as templates to generate single-stranded barcoded nucleic acids; and generating double-stranded barcoded nucleic acids from the single-stranded barcoded nucleic acids. Generating double-stranded barcoded nucleic acids from the single-stranded barcoded nucleic acids can comprise extending the single-stranded barcoded nucleic acids. Extending the single-stranded barcoded nucleic acids can comprise extending the single-stranded barcoded nucleic acids using a template switching oligonucleotide.
For example, the barcoded nucleic acids can be generated by reverse transcription using a reverse transcriptase. For example, the barcoded nucleic acids can be generated by using a DNA polymerase. Barcoding the nucleic acids associated with the cell can comprise generating double-stranded barcoded nucleic acids from the single-stranded barcoded nucleic acids. Extending the single-stranded barcoded nucleic acids can comprise further extending the single-stranded barcoded nucleic acids using a template switching oligonucleotide. For example, a reverse transcriptase can be used to generate a cDNA by extending a barcode oligonucleotide hybridized to an RNA. After extending the barcode oligonucleotide to the 5’-end of the RNA, the reverse transcriptase can add one or more nucleotides with cytosine (C) bases (e.g., two or three) to the 3’-end of the cDNA. The template switching oligonucleotide (TSO) can include one or more nucleotides with guanine (G) bases (e.g., two or three) on the 3’-end of the TSO. The nucleotides with guanine bases can be ribonucleotides. The guanine bases at the 3’-end of the TSO can hybridize to the cytosine bases at the 3’-end of the cDNA. The reverse transcriptase can further extend the cDNA using the TSO as the template to generate a cDNA with the reverse complement of the TSO sequence on its 3’-end. Similarly, a barcoded nucleic acid can include the reverse complement of a TSO sequence at its 3’-end.
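The reverse transcription and template switching steps described above can be followed at the sequence level with a short Python sketch. It is a simplified string model under stated assumptions, not the method of the disclosure: the barcode oligonucleotide is assumed to end in a poly (dT) stretch that anneals to the last bases of the poly (A) tail, exactly three untemplated C bases are added, the TSO ends in three G bases, and all names and sequences are hypothetical.

```python
# Maps each base to its DNA complement; U (RNA) pairs with A like T does.
_COMPLEMENT = str.maketrans("ACGTU", "TGCAA")


def reverse_complement(seq: str) -> str:
    """Reverse complement written as DNA; accepts DNA or RNA input."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def barcoded_cdna(barcode_oligo: str, rna: str, tso: str, dt_len: int = 18) -> str:
    """Model first-strand synthesis followed by template switching.

    barcode_oligo: 5'->3' oligonucleotide ending in dt_len T bases (assumed)
    rna:           5'->3' RNA ending in at least dt_len A bases (assumed)
    tso:           5'->3' template switching oligonucleotide ending in GGG (assumed)
    """
    if not barcode_oligo.endswith("T" * dt_len):
        raise ValueError("barcode oligonucleotide must end in poly(dT)")
    if not rna.endswith("A" * dt_len):
        raise ValueError("RNA must end in a poly(A) tail")
    if not tso.endswith("GGG"):
        raise ValueError("TSO must end in GGG")
    # 1. Extend the annealed oligonucleotide to the 5' end of the RNA.
    first_strand = barcode_oligo + reverse_complement(rna[:-dt_len])
    # 2. Reverse transcriptase adds untemplated C bases to the cDNA 3' end.
    first_strand += "CCC"
    # 3. GGG of the TSO pairs with CCC; the cDNA is extended along the
    #    remainder of the TSO, copying its reverse complement.
    return first_strand + reverse_complement(tso[:-3])
```

In this model the barcodes carried by the barcode oligonucleotide end up at the 5’ end of the cDNA, and the reverse complement of the TSO ends up at its 3’ end.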
Pooling
In some embodiments, the method comprises pooling barcoded nucleic acids of the plurality of barcoded nucleic acids after barcoding the sample nucleic acids and before sequencing the barcoded nucleic acids to obtain pooled barcoded nucleic acids.
In some embodiments, the method comprises pooling the beads prior to extending the barcode oligonucleotides. The method can comprise pooling the beads prior to generating the double-stranded barcoded nucleic acids. In some embodiments, extending the barcode oligonucleotides attached to the bead and hybridized to the nucleic acid targets comprises extending the barcode oligonucleotides attached to the bead and hybridized to the nucleic acid targets in bulk. Generating the double-stranded barcoded nucleic acids can comprise generating the double-stranded barcoded nucleic acids from the single-stranded barcoded nucleic acids in bulk.
In some embodiments, the method comprises pooling the beads subsequent to extending the barcode oligonucleotides to generate the single-stranded barcoded nucleic acids. The method can comprise pooling the beads subsequent to generating the double-stranded barcoded nucleic acids. In some embodiments, extending the barcode oligonucleotides attached to the bead and hybridized to the nucleic acid targets comprises extending the barcode oligonucleotides attached to the bead and hybridized to the nucleic acid targets in the partition. Generating the double-stranded barcoded nucleic acids can comprise generating the double-stranded barcoded nucleic acids from the single-stranded barcoded nucleic acids in the partition.
In some embodiments, pooling barcoded nucleic acids occurs after generating the double-stranded barcoded nucleic acids. In some embodiments, pooling barcoded nucleic acids occurs after denaturing (such as by heat denaturation or chemical denaturation with, for example, sodium hydroxide) the double-stranded barcoded nucleic acids, which generates two single-stranded barcoded nucleic acids, one retained in the partition and one released from the barcoded nucleic acids retained in the partition. In some embodiments, pooling barcoded nucleic acids occurs after amplification of the barcoded nucleic acids. In some embodiments, pooling barcoded nucleic acids occurs after further processing (e.g., fragmentation) of the barcoded nucleic acids. In some embodiments, pooling barcoded nucleic acids comprises collecting the single-stranded barcoded nucleic acids released from the barcoded nucleic acids retained in the partition.
In some embodiments, the barcode oligonucleotides are attached to beads, only single-stranded barcoded nucleic acids released into bulk are collected by pooling, and the beads are not pooled (e.g. not removed from the partitions) but retained in the partitions (e.g. by an external magnetic field applied on magnetic beads) , thereby allowing one to trace the origin of the pooled barcoded nucleic acids, for example, to their original locations in the plurality of partitions.
The pooled barcoded nucleic acids can be single-stranded or double-stranded (e.g. generated from the single-stranded pooled barcoded nucleic acids by PCR amplification) . The pooled barcoded nucleic acids (e.g. barcoded cDNA) can be purified and/or amplified prior to sequencing library construction. The pooled barcoded nucleic acids with a desired length may be selected.
Barcoded Nucleic Acid (e.g., cDNA) Circularization
In some embodiments, the barcoded nucleic acid (e.g., barcoded cDNA) is circularized by connecting the two ends of the barcoded nucleic acid (e.g., barcoded cDNA) . In some embodiments, the barcoded nucleic acid (e.g., barcoded cDNA) is divided into a first portion and a second portion. Barcoded nucleic acid (e.g., barcoded cDNA) circularization can comprise circularization of the first portion, the second portion or both portions of the barcoded nucleic acid (e.g., barcoded cDNA) .
The barcoded nucleic acid (e.g., barcoded cDNA) to be circularized can comprise a nucleotide sequence corresponding to the nucleic acid target (e.g., mRNA) and the barcode oligonucleotide comprising the cell barcode and the UMI attached to one end of the nucleotide sequence corresponding to the nucleic acid target (e.g., mRNA) . Barcoded nucleic acid (e.g., barcoded cDNA) circularization can comprise connecting the barcode oligonucleotide comprising the cell barcode and the UMI attached to one end of the nucleotide sequence corresponding to the nucleic acid target (e.g., mRNA) to the other end of the nucleotide sequence corresponding to the nucleic acid target (e.g., mRNA) .
The barcode oligonucleotide can be connected to the other end of the nucleotide sequence corresponding to the nucleic acid target (e.g., mRNA) directly or indirectly. The cell barcode, UMI or primer-binding sequence of the barcode oligonucleotide can connect to the other end of the nucleotide sequence corresponding to the nucleic acid target (e.g., mRNA) directly or indirectly. The barcode oligonucleotide can be connected to the other end of the nucleotide sequence corresponding to the nucleic acid target (e.g., mRNA) through a region of sequence identity.
The barcoded nucleic acid (e.g., barcoded cDNA) can be circularized using isolated protein reagents (proteins) . In some embodiments, the two ends of the barcoded nucleic acid (e.g., barcoded cDNA) share a region of sequence identity and are contacted with: a non-processive 5' exonuclease; a non-strand-displacing DNA polymerase; and a ligase. In some embodiments, the 5' exonuclease and the DNA polymerase are different entities. The barcoded nucleic acid (e.g., barcoded cDNA) can additionally be contacted with a single stranded DNA binding protein (SSB) , which accelerates nucleic acid annealing.
In some embodiments, the non-processive 5' exonuclease, the non-strand-displacing DNA polymerase, the SSB and the ligase are each isolated (e.g., purified) . As used herein, an “isolated” protein means that the protein is removed from its original environment (e.g., the natural environment if it is naturally occurring) , and isolated or separated from at least one other component with which it is naturally associated. For example, a naturally-occurring protein present in its natural living host (e.g. a bacteriophage protein present in a bacterium that has been infected with the phage) is not isolated, but the same protein, separated from some or all of the coexisting materials in the natural system, is isolated. Such proteins can be part of a composition or reaction mixture, and still be isolated in that such composition or reaction mixture is not part of its natural environment. The term “an isolated protein, ” as used herein, can include 1, 2, 3, 4 or more copies of the protein, i.e., the protein can be in the form of a monomer, or it can be in the form of a multimer, such as dimer, trimer, tetramer or the like, depending on the particular protein under consideration. In some embodiments, the protein is purified. Methods for purifying the proteins of the invention are known to one of skill in the art. In some embodiments, the protein is substantially purified or is
purified to homogeneity. The term “substantially purified” means that the protein is separated and is essentially free from other proteins, i.e., the protein is the primary and active constituent. The purified protein can then be contacted with the DNAs to be joined, where it then acts in concert with other proteins to achieve the joining. The proteins can be contacted with (combined with) the DNAs in any order; for example, the proteins can be added to a reaction mixture comprising the DNAs, or the DNAs can be added to a reaction mixture comprising the proteins. Proteins used herein can be in the form of “active fragments, ” rather than the full-length proteins, provided that the fragments retain the activities (enzymatic activities or binding activities) required to achieve the joining. One of skill in the art will recognize how to generate such active fragments.
The non-processive 5' exonuclease can be any non-processive 5'→3' double strand specific exodeoxyribonuclease. The terms “5' exonuclease” or “exonuclease” are used herein to refer to a 5'→3' exodeoxyribonuclease and are sometimes used interchangeably with “non-processive 5' exonuclease. ” A “non-processive” exonuclease, as used herein, is an exonuclease that degrades a limited number of nucleotides (e.g., only a few) during each DNA binding event. Among other properties which are desirable for the 5' exonuclease are that it lacks 3' exonuclease activity, it is double strand DNA specific, it generates 5' phosphate ends, and it initiates degradation from both 5'-phosphorylated and unphosphorylated ends. Suitable 5' exonucleases will be evident to one of skill in the art. The 5' exonuclease can be the phage T7 gene 6 product, RedA of lambda phage (lambda exonuclease) , RecE of Rac prophage, or any of a variety of 5'→3' exonucleases that can be involved in homologous recombination reactions. Methods for preparing the T7 gene 6 product and optimal reaction conditions for using it are known to one of skill in the art.
Without wishing to be bound by any particular mechanism of action, SSB can protect the single stranded overhangs generated by the 5' exonuclease, and can facilitate the rapid annealing of the homologous single stranded regions. Any SSB which can accelerate nucleic acid annealing can be used herein. An SSB which “accelerates nucleic acid annealing, ” as used herein, can be an SSB which can accelerate nucleic acid binding by a factor of greater than about 500-fold, compared to the binding in the absence of the SSB. Among other properties which are desirable for the SSB are that it binds single stranded DNA (ssDNA) more tightly than double stranded DNA (dsDNA) , and that it interacts with both the exonuclease and the DNA polymerase. Suitable SSBs will be evident to the skilled worker. The SSB can be the T7 gene 2.5 product, the E. coli RecA protein, RedB of lambda phage, or RecT of Rac prophage. Methods for preparing the T7 protein and optimal reaction conditions for using it are known to one of skill in the art. In yet a further embodiment, polyethylene glycol ( "PEG" ) is used to enhance the annealing process.
The non-strand-displacing DNA polymerase used herein can be any non-strand-displacing DNA
polymerase capable of filling in the gaps left by the 5' exonuclease digestion. The term “polymerase” is sometimes used herein to refer to a DNA polymerase. A “non-strand-displacing DNA polymerase, ” as used herein, is a DNA polymerase that terminates synthesis of DNA when it encounters DNA strands which lie in its path as it proceeds to copy a dsDNA molecule, or degrades the encountered DNA strands as it proceeds while concurrently filling in the gap thus created, thereby generating a “moving nick. ” Among the other properties which are desirable for the non-strand-displacing DNA polymerase is that it synthesizes DNA faster than the exonuclease in the reaction mixture degrades it. Suitable non-strand-displacing DNA polymerases will be evident to the skilled worker. The non-strand-displacing DNA polymerase can be the T7 gene 5 product, T4 DNA polymerase, or E. coli Pol I. Methods for preparing and using the above-noted DNA polymerases are known to one of skill in the art.
The ligase used herein can be any DNA ligase. The term “ligase” is sometimes used herein to refer to a DNA ligase. Suitable DNA ligases include, e.g., the T7 gene 1.3 product, T4 DNA ligase, E. coli DNA ligase and Taq Ligase. Methods for their preparation and optimal reaction conditions are known to one of skill in the art. By using a ligase, substantially all of the nicks (e.g., all of the nicks) can be sealed during the reaction procedure, in order to prevent degradation by the exonuclease.
In some embodiments, the 5' exonuclease is the phage T7 gene 6 product, RedA of lambda phage, or RecE of Rac prophage; the SSB is the phage T7 gene 2.5 product, the E. coli recA protein, RedB of lambda phage, or RecT of Rac prophage; the DNA polymerase is the phage T7 gene 5 product, phage T4 DNA polymerase, or E. coli pol I; and/or the ligase is the phage T7 gene 1.3 product, phage T4 DNA ligase, or E. coli DNA ligase.
The four proteins used herein (the exonuclease, SSB, polymerase and ligase) can be contacted with the barcoded nucleic acid (e.g., barcoded cDNA) to be circularized (e.g., added to a reaction mixture comprising a solution containing suitable salts, buffers, ATP, deoxynucleotides, etc. plus the DNA molecules) simultaneously. In some embodiments, the four proteins are added substantially simultaneously. For example, a mixture of the four proteins in suitable ratios can be added to the reaction mixture with a single pipetting operation. In another embodiment, the barcoded nucleic acid (e.g., barcoded cDNA) is added to a reaction mixture comprising a solution containing suitable salts, buffers, ATP, deoxynucleotides, etc. and the four proteins. In yet another embodiment, the barcoded nucleic acid (e.g., barcoded cDNA) is contacted with the four proteins sequentially. For example, the barcoded nucleic acid (e.g., barcoded cDNA) can be in contact with a reaction mixture comprising a solution containing suitable salts, buffers, ATP, deoxynucleotides, etc. and a subset of the four proteins, and the remaining proteins are then added, in any order or in any combination (e.g. the exonuclease can be added last; and preceding the addition of the exonuclease,
the SSB, polymerase and ligase can be added sequentially, in any order, or two of the proteins can be added substantially simultaneously, and the other protein can be added before or after those two proteins) .
In some embodiments, the circularization is performed under conditions whereby a 3' single-stranded overhang is generated in the region of sequence identity at each end of the barcoded nucleic acid (e.g., barcoded cDNA) by the exonuclease without the use of a restriction enzyme; the two single-stranded overhangs anneal to form a gapped molecule; the gaps are filled in by the polymerase leaving nicks; and the nicks are sealed by the ligase, thereby joining the two ends of the barcoded nucleic acid (e.g., barcoded cDNA) and forming an intact (un-nicked) circularized barcoded nucleic acid molecule, in which a single copy of the region of sequence identity is retained. In some embodiments, none of the enzymatic reactions is actively terminated prior to beginning another of the reactions. In the circularization reaction, the 5' exonuclease can generate 3' single stranded overhangs at both ends of the barcoded nucleic acid (e.g., barcoded cDNA) ; the overhangs are generated in the regions of sequence identity; the two single stranded overhangs anneal to form a gapped molecule; the DNA polymerase fills in the gaps; and the ligase seals the nicks.
The four proteins (e.g., the non-processive 5' exonuclease, the non-strand-displacing DNA polymerase, the SSB and the ligase) can act together in a concerted fashion; the individual enzymatic reactions are not actively terminated (e.g., by an experimenter or investigator) before a subsequent reaction begins. In some embodiments, formation of a double stranded DNA molecule results in the molecule being relatively withdrawn or inert from the reactions. Conditions which are effective for connecting the two ends of the barcoded nucleic acid (e.g., barcoded cDNA) allow for the net assembly of a circularized barcoded nucleic acid, rather than the degradation of the barcoded nucleic acid (e.g., barcoded cDNA) by the exonuclease. In other words, the gaps formed by digestion by the 5' exonuclease can be filled in by the polymerase substantially immediately after they are formed. This can be accomplished by contacting the barcoded nucleic acid (e.g., barcoded cDNA) with a substantially lower amount of 5' exonuclease activity than the amount of DNA polymerase activity. That is, the gaps formed by digestion by the 5' exonuclease can be filled in by the polymerase substantially immediately after they are formed, and the intact (un-nicked) reaction product is “fixed” by the ligation reaction. Suitable amounts of activities can include: exonuclease activity between about 0.1 U/mL and about 50 U/mL; DNA polymerase between about 10 U/mL and about 30 U/mL; SSB between about 0.1 μM and about 1 μM; and ligase between about 0.1 μM and about 1 μM. Lower amounts of polymerase would likely not be able to catch up with the exonuclease, and higher amounts would likely degrade the 3' overhang generated by exonuclease,
resulting in overlaps being digested before annealing can occur. Lower amounts of SSB would likely not allow annealing to occur rapidly enough, and higher amounts would likely stimulate exonuclease processivity, so that the polymerase again cannot catch up with the exonuclease.
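As a non-limiting illustration, the approximate activity ranges recited above can be expressed as a simple range check. This is a minimal sketch: the dictionary keys, the function name `out_of_range` and the example mixture are hypothetical and chosen only for illustration, and the bounds are treated as hard limits even though the text qualifies each with "about".

```python
# Approximate ranges quoted in the text above. The key names and the function
# name are illustrative assumptions, not part of the disclosure.
RANGES = {
    "exonuclease_U_per_mL": (0.1, 50.0),
    "polymerase_U_per_mL": (10.0, 30.0),
    "ssb_uM": (0.1, 1.0),
    "ligase_uM": (0.1, 1.0),
}


def out_of_range(mix):
    """Return the components of `mix` that fall outside the quoted ranges.

    `mix` maps each key of RANGES to a numeric amount in the unit named by
    the key. The result maps each offending key to (value, low, high).
    """
    problems = {}
    for name, (low, high) in RANGES.items():
        value = mix[name]
        if not (low <= value <= high):
            problems[name] = (value, low, high)
    return problems


# Hypothetical mixture used only to exercise the check.
example_mix = {
    "exonuclease_U_per_mL": 4.0,
    "polymerase_U_per_mL": 25.0,
    "ssb_uM": 0.5,
    "ligase_uM": 0.5,
}
print(out_of_range(example_mix))
```

A mixture with too little polymerase, for instance, would be reported under the `polymerase_U_per_mL` key, which corresponds to the case in which the polymerase cannot catch up with the exonuclease.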
Reaction conditions (such as the presence of salts, buffers, ATP, dNTPs, etc. and the times and temperature of incubation) can be optimized readily by one of skill in the art. Preferably, the incubation temperature can be about 25℃ to about 50℃, and the reaction can be carried out for about 1-1.5 hours at 37℃, or for about 2-3 hours at 30℃.
The regions of sequence identity are sometimes referred to herein as “overlaps” or “regions of overlap. ” The region of sequence identity should be sufficiently long to allow the circularization to occur. The length can vary from a minimum of about 10 base pairs (bp) to about 300 bp (e.g., 10 bp, 20 bp, 30 bp, 40 bp, 50 bp, 60 bp, 70 bp, 80 bp, 90 bp, 100 bp, 150 bp, 200 bp, 250 bp, 300 bp or a number between any two of these values) or more. Without being bound by any mechanism, it is preferable that the length of the overlap is not greater than about 1/10 the length of the nucleic acid fragment to be circularized; otherwise there may not be sufficient time for annealing and gap filling. If longer overlaps are used, the T7 endonuclease can also be required to debranch the joint molecules. In one embodiment, the region of sequence identity is of a length that allows it to be generated readily by synthetic methods, e.g. about 40 bp (e.g., about 35 to about 45 bp) .
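As a non-limiting illustration, the overlap length guidance above (about 10 bp to about 300 bp, and preferably not greater than about 1/10 of the fragment length) can be sketched as a check that returns warnings rather than rejecting a design outright, because the text permits longer overlaps. The function name `overlap_warnings` and the constants are hypothetical.

```python
MIN_OVERLAP_BP = 10    # approximate minimum quoted in the text
TYPICAL_MAX_BP = 300   # the text allows "or more", so this is a soft limit


def overlap_warnings(overlap_bp, fragment_bp):
    """Return a list of warnings for a proposed region of sequence identity.

    An empty list means the overlap satisfies the guidance sketched here.
    """
    warnings = []
    if overlap_bp < MIN_OVERLAP_BP:
        warnings.append("overlap shorter than about 10 bp")
    if overlap_bp > TYPICAL_MAX_BP:
        warnings.append("overlap longer than about 300 bp")
    # Integer comparison avoids floating-point division for the 1/10 rule.
    if overlap_bp * 10 > fragment_bp:
        warnings.append("overlap exceeds about 1/10 of the fragment length")
    return warnings


print(overlap_warnings(40, 1500))  # a 40 bp overlap on a 1,500 bp fragment
print(overlap_warnings(40, 300))   # the same overlap on a 300 bp fragment
```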
The regions of sequence identity can be added to the ends of the barcoded nucleic acid (e.g., barcoded cDNA) to be circularized by any of a variety of methods. For example, the regions of sequence identity can be introduced by PCR amplification.
Circularization through the region of sequence identity can be achieved by using circle handle (s) . Before circularization, a first circle handle can be added to one end of the barcoded nucleic acid (e.g., barcoded cDNA) and a second circle handle can be added to the other end of the barcoded nucleic acid (e.g., barcoded cDNA) . The first and second circle handles can comprise regions of sequence identity and nucleotide sequences capable of hybridizing to the ends of the barcoded nucleic acid (e.g., barcoded cDNA) . After hybridizing the first and second circle handles to the barcoded nucleic acid (e.g., barcoded cDNA) , the first and second circle handles can be extended using the barcoded nucleic acid (e.g., barcoded cDNA) as a template. The extension can be achieved using a one-round PCR amplification that can be performed by one of skill in the art. The barcoded nucleic acid (e.g., barcoded cDNA) with the circle handles can be circularized by connecting the first and second circle handles.
The first and second circle handles can be double-stranded DNA, single-stranded DNA or partially double-stranded/partially single-stranded DNA. The regions of sequence identity in the first and
second circle handles can be double-stranded DNA, single-stranded DNA or partially double-stranded/partially single-stranded DNA. For example, the first and second circle handles can be double-stranded DNA in full length, including double-stranded regions of sequence identity. Alternatively, the first and second circle handles can comprise single-stranded regions of sequence identity. However, because a non-strand displacing DNA polymerase used herein must elongate in the 5' direction from a primer molecule, the regions of sequence identity cannot have a free 5' end (e.g. at the 5' end of the most 5' DNA to be joined) in the presence of exonuclease. Because no primer can be available in such a molecule to be extended, such a molecule would be digested by the exonuclease and the resulting gap could not be filled in by a polymerase. In one embodiment, the 5' ends of the barcoded nucleic acids to be circularized are blocked so that 5' exonuclease cannot digest them. The blocking agent can be reversible, so that the blocked end (s) can eventually be connected to form a circularized nucleic acid. Suitable blocking agents include, e.g., phosphorothioate bonds, 5' spacer molecules, and Locked Nucleic Acid (LNA) .
In some embodiments, the barcoded nucleic acid is a double-stranded cDNA. The first circle handle comprising a region of sequence identity is added to one end of the double-stranded barcoded cDNA; and the second circle handle comprising the same region of sequence identity is added to the other end of the double-stranded barcoded cDNA. The first and second circle handles can be added by PCR amplification. The regions of sequence identity incorporated into the barcoded cDNA can be double-stranded. Then, the regions of sequence identity within the first and second circle handles can be digested by the exonuclease to generate 3' single-stranded overhangs; the two single-stranded overhangs can anneal to form a gapped molecule; the gaps can be filled in by the polymerase leaving nicks; and the nicks can be sealed by the ligase, thereby connecting the two ends of the barcoded cDNA and forming an intact (un-nicked) circularized barcoded cDNA, in which a single copy of the region of sequence identity is retained.
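As a non-limiting illustration, the net outcome of the circularization described above (two terminal copies of the region of sequence identity reduced to a single retained copy) can be modeled at the level of a sequence string. This sketch deliberately ignores strandedness and the enzymatic intermediates (overhangs, gaps and nicks); the function name `circularize` and the toy sequences are hypothetical.

```python
def circularize(linear, overlap_len):
    """Model the joined product of a linear molecule whose two ends share a
    region of sequence identity of length `overlap_len`.

    The circular product is returned as a string written from an arbitrary
    origin (here, the start of the retained copy of the shared region).
    """
    if overlap_len <= 0 or 2 * overlap_len > len(linear):
        raise ValueError("overlap length is not compatible with the molecule")
    head = linear[:overlap_len]
    tail = linear[-overlap_len:]
    if head != tail:
        raise ValueError("ends do not share a region of sequence identity")
    # Dropping the trailing copy leaves exactly one copy in the circle.
    return linear[:-overlap_len]


# Toy example: a 4-base shared region flanking a 7-base insert.
shared = "ACGT"
insert = "TTGGCCA"
circle = circularize(shared + insert + shared, len(shared))
print(circle, len(circle))
```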
In another embodiment, the regions of sequence identity added to the ends of the barcoded nucleic acid are single-stranded and comprise complementary sequences. The two single-stranded regions of sequence identity can anneal to form a gapped molecule without the need to be digested by the exonuclease; the gaps can be filled in by the polymerase leaving nicks; and the nicks can be sealed by the ligase, thereby connecting the two ends of the barcoded nucleic acid and forming an intact (un-nicked) circularized barcoded nucleic acid.
Barcoded Nucleic Acid (e.g., cDNA) Amplification
In some embodiments, the method comprises amplifying the barcoded nucleic acid to generate amplified barcoded nucleic acids, such as amplifying barcoded cDNAs. Amplifying the barcoded
nucleic acids can comprise amplifying the barcoded nucleic acids using polymerase chain reaction (PCR) to generate the amplified barcoded nucleic acids. For example, the barcode oligonucleotide can include a first polymerase chain reaction (PCR) primer-binding sequence and a TSO sequence. The first PCR primer-binding sequence and the TSO sequence can be used to amplify the barcoded nucleic acid, such as a barcoded cDNA. As another example, the barcode oligonucleotide can include a second polymerase chain reaction (PCR) primer-binding sequence (e.g., a Read 2 sequence) . A first primer comprising the sequence of the second PCR primer-binding sequence and a second primer comprising a random sequence (e.g., a random hexamer) can be used to amplify the barcoded nucleic acid, such as a barcoded cDNA. The second primer can include one or more non-random sequences, such as a third PCR primer-binding sequence (e.g., a Read 3 sequence) .
In some embodiments, the method comprises amplifying the circularized barcoded nucleic acid to generate a second linear barcoded nucleic acid. Amplifying the circularized barcoded nucleic acids can comprise amplifying the circularized barcoded nucleic acids using polymerase chain reaction (PCR) to generate the second linear barcoded nucleic acids. For example, the circularized barcoded nucleic acid can comprise a barcode oligonucleotide that can include a first polymerase chain reaction (PCR) primer-binding sequence (e.g., a Read 1 sequence) . The first PCR primer-binding sequence can be used to amplify the circularized barcoded nucleic acid, such as a circularized barcoded cDNA. For example, a first primer and a second primer, each comprising the sequence of the first PCR primer-binding sequence or a portion thereof, can be used to amplify the circularized barcoded nucleic acid, such as a circularized barcoded cDNA.
In some embodiments, the amplification of the circularized barcoded nucleic acid generates the second linear barcoded nucleic acid comprising the barcode oligonucleotide on a different end of the nucleic acid sequence corresponding to the nucleic acid target, compared to the barcoded nucleic acid from which the circularized barcoded nucleic acid is generated. For example, a first barcoded nucleic acid can comprise from 5’ to 3’: a region of sequence identity, a cell barcode, a UMI, a first PCR primer-binding sequence, a probe sequence (e.g., poly (dT) ) , a nucleic acid sequence corresponding to the nucleic acid target, a TSO, and another copy of the same region of sequence identity. After circularization, the circularized barcoded nucleic acid can comprise the region of sequence identity, the cell barcode, the UMI, the first PCR primer-binding sequence, the probe sequence (e.g., poly-dT) , a nucleic acid sequence corresponding to the nucleic acid target, and the TSO, in the order listed as a loop. After amplifying the circularized barcoded nucleic acid, using primers hybridizing with the first PCR primer-binding sequence, the second linear barcoded nucleic acid generated by such amplification can comprise from 5’ to 3’: the first PCR
primer-binding sequence or a portion thereof, the probe sequence (e.g., poly-dT) , a nucleic acid sequence corresponding to the nucleic acid target, the TSO, the region of sequence identity, the cell barcode, the UMI, and another copy of the first PCR primer-binding sequence or a portion thereof. The second linear barcoded nucleic acid can be purified after amplification.
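As a non-limiting illustration, the rearrangement of elements described above can be sketched with a list of element labels: the linear molecule is closed into a loop with one retained copy of the region of sequence identity, and the loop is then reopened at the first PCR primer-binding sequence. The labels and the function names `circularize_elements` and `relinearize` are hypothetical, and the sketch tracks only element order, not sequence or strand.

```python
# Element order of the first barcoded nucleic acid, 5' to 3', as listed above.
FIRST_LINEAR = [
    "identity", "cell_barcode", "UMI", "PCR1", "probe", "target", "TSO",
    "identity",
]


def circularize_elements(elements):
    """Close the molecule into a loop, retaining one copy of the shared end."""
    if elements[0] != elements[-1]:
        raise ValueError("ends do not share a region of sequence identity")
    return elements[:-1]


def relinearize(circle, primer_site):
    """Open the loop at `primer_site`, as amplification with primers that
    hybridize to that site would; the product carries the site (or a portion
    of it) at both ends."""
    i = circle.index(primer_site)
    rotated = circle[i:] + circle[:i]
    return rotated + [primer_site]


circle = circularize_elements(FIRST_LINEAR)
second_linear = relinearize(circle, "PCR1")
print(second_linear)
```

In the resulting order the cell barcode and the UMI lie on the opposite side of the target sequence (next to the TSO end) compared with the first barcoded nucleic acid, which is the repositioning described above.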
Sequencing Library Construction
The barcoded nucleic acids (e.g. pooled barcoded nucleic acids) are further processed prior to sequencing to generate processed barcoded nucleic acids. For example, the method can include amplification of barcoded nucleic acids, fragmentation of amplified barcoded nucleic acids, end repair of fragmented barcoded nucleic acids, A-tailing of fragmented barcoded nucleic acids that have been end-repaired (e.g., to facilitate ligation to adapters) , and attaching (e.g. by ligation and/or PCR) with a second sequencing primer sequence (e.g. a Read 2 sequence) , sample indexes (e.g. short sequences specific to a given sample library) , and/or flow cell binding sequences (e.g. P5 and/or P7) . Additional PCR amplification can also be performed. This process can also be referred to as sequencing library construction.
In some embodiments, the method comprises performing a polymerase chain reaction in bulk, subsequent to the pooling, on the pooled barcoded nucleic acids, thereby generating amplified barcoded nucleic acids. PCR amplification can be carried out to generate sufficient mass for the subsequent library construction processes. PCR amplification can also be performed with primers specific to target nucleic acids of interest.
In some embodiments, the method comprises fragmenting (e.g., via enzymatic fragmentation, mechanical force, chemical treatment, etc. ) the pooled barcoded nucleic acids to generate fragmented barcoded nucleic acids. Fragmentation can be carried out by any suitable process such as physical fragmentation, enzymatic fragmentation, or a combination of both. For example, the barcoded nucleic acids can be sheared physically using acoustics, nebulization, centrifugal force, needles, or hydrodynamics. The barcoded nucleic acids can also be fragmented using enzymes, such as restriction enzymes and endonucleases.
Fragmentation can yield fragments of a desired size for subsequent sequencing. The desired sizes of the fragmented nucleic acids are determined by the limitations of the next generation sequencing instrumentation and by the specific sequencing application as will be understood by a person skilled in the art. For example, when using Illumina technology, the fragmented nucleic acids can have a length of between about 50 bases and about 1,500 bases. In some embodiments, the fragmented barcoded nucleic acids are about 100 bp to about 700 bp in length.
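As a non-limiting illustration, selection of fragments within a desired size window can be sketched as a simple length filter. The function name `select_by_length` is hypothetical, the default bounds reuse the approximate 100 bp to 700 bp window mentioned above, and the toy fragments are placeholders rather than real data.

```python
def select_by_length(fragments, min_len=100, max_len=700):
    """Keep fragments whose length lies within [min_len, max_len] bases."""
    return [f for f in fragments if min_len <= len(f) <= max_len]


# Toy fragments of 50, 300 and 900 bases (placeholder sequences).
toy_fragments = ["A" * 50, "C" * 300, "G" * 900]
kept = select_by_length(toy_fragments)
print([len(f) for f in kept])
```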
Fragmented barcoded nucleic acids can undergo end-repair and A-tailing (to add one or more
adenine bases) to form an A overhang. This A overhang allows an adapter containing one or more overhanging thymine bases to base pair with the fragmented barcoded nucleic acids.
Fragmented barcoded nucleic acids can be further processed by adding additional sequences (e.g. adapters) for use in sequencing based on specific sequencing platforms. Adapters can be attached to the fragmented barcoded nucleic acids by ligation using a ligase and/or PCR. For example, fragmented barcoded nucleic acids can be processed by adding a second sequencing primer sequence. The second sequencing primer sequence can comprise a Read 2 sequence. An adapter comprising the second sequencing primer sequence can be ligated to the fragmented barcoded nucleic acids after, for example, end-repair and A-tailing, using a ligase. The adapter can include one or more thymine (T) bases that can hybridize to the one or more A bases added by A-tailing. An adapter can be, for example, partially double-stranded or double-stranded.
The adapter can also include platform-specific sequences for fragment recognition by a specific sequencing instrument. For example, the adapter can comprise a sequence for attaching the fragmented barcoded nucleic acids to a flow cell of Illumina platforms, such as a P5 sequence, a P7 sequence, or a portion thereof. Different adapter sequences can be used for different next generation sequencing instruments as will be understood by a person skilled in the art.
The adapter can also contain sample indexes to identify samples and to permit multiplexing. Sample indexes enable multiple samples to be sequenced together (i.e. multiplexed) on the same instrument flow cell as will be understood by a person skilled in the art. Adapters can comprise a single sample index or dual sample indexes depending on the implementation, such as the number of libraries combined and the level of accuracy desired.
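As a non-limiting illustration, the use of sample indexes to separate multiplexed reads can be sketched as a lookup from index sequence to sample name. This sketch assumes a single sample index per read and exact matching; the function name `demultiplex`, the index sequences and the sample names are hypothetical.

```python
from collections import defaultdict


def demultiplex(reads, index_to_sample):
    """Group reads by sample using the sample index read with each insert.

    `reads` is an iterable of (index_sequence, insert_sequence) pairs. Reads
    whose index is not in `index_to_sample` are grouped as "undetermined".
    """
    by_sample = defaultdict(list)
    for index_seq, insert_seq in reads:
        sample = index_to_sample.get(index_seq, "undetermined")
        by_sample[sample].append(insert_seq)
    return dict(by_sample)


# Hypothetical index assignments and reads, for illustration only.
index_map = {"ACGTACGT": "sample_A", "TTGGCCAA": "sample_B"}
toy_reads = [
    ("ACGTACGT", "GATTACA"),
    ("TTGGCCAA", "CCCGGG"),
    ("ACGTACGT", "TTTAAA"),
    ("GGGGGGGG", "ACACAC"),
]
print(demultiplex(toy_reads, index_map))
```

For dual sample indexes, the same structure could be keyed on a pair of index sequences instead of a single sequence.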
In some embodiments, the amplified barcoded nucleic acids generated from sequencing library construction can include a P5 sequence, a sample index, a Read 1 sequence, a cell barcode, a UMI, a poly (dT) sequence, a target binding region, a sequence of a sample nucleic acid or a portion thereof, a Read 2 sequence, a sample index, and/or a P7 sequence (e.g., from 5’-end to 3’-end) . In some embodiments, the amplified barcoded nucleic acids can include a P5 sequence, a sample index, a Read 1 sequence, a cell barcode, a UMI, a sequence of a template switching oligonucleotide, a sequence of a sample nucleic acid or a portion thereof, a Read 2 sequence, a sample index, and/or a P7 sequence (e.g., from 5’-end to 3’-end) .
In some embodiments, sequencing the barcoded nucleic acids, or products thereof, comprises sequencing products of the barcoded nucleic acids. Products of the barcoded nucleic acids can include the processed nucleic acids generated by any step of the sequencing library construction process, such as amplified barcoded nucleic acids, fragmented barcoded nucleic acids, fragmented
barcoded nucleic acids comprising additional sequences such as the second sequencing primer sequence and/or adapter sequences described herein.
Sequencing Barcoded Nucleic Acids
The method disclosed herein can comprise sequencing the plurality of barcoded nucleic acids or products thereof to obtain nucleic acid sequences of the plurality of barcoded nucleic acids. The barcoded nucleic acids generated by the method disclosed herein comprise barcoded nucleic acids retained in a partition and barcoded nucleic acids pooled, from each partition, into a pooled mixture outside the partitions. The barcoded nucleic acids retained in a partition and the pooled barcoded nucleic acids in a pooled mixture outside the partitions can be sequenced using a same or different sequencing technique.
Sequencing Pooled Barcoded Nucleic Acids
In some embodiments, sequencing the plurality of barcoded nucleic acids or products thereof comprises sequencing the pooled barcoded nucleic acids to obtain nucleic acid sequences of the pooled barcoded nucleic acids. As used herein, a “sequence” can refer to the sequence, a complementary sequence thereof (e.g., a reverse, a complement, or a reverse complement) , the full-length sequence, a subsequence, or a combination thereof. The nucleic acid sequences of the pooled barcoded nucleic acids can each comprise a sequence of a barcode oligonucleotide (e.g. the cell barcode and the UMI) and a sequence of a sample nucleic acid associated with the cell or a reverse complement thereof.
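As a non-limiting illustration, the reverse, the complement and the reverse complement of a DNA sequence referred to above can be computed as follows. The function names are hypothetical, and the sketch handles only the bases A, C, G, T and the placeholder N.

```python
# Translation table for Watson-Crick complements; N is left unchanged.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse(seq):
    """Return the sequence read in the opposite direction."""
    return seq[::-1]


def complement(seq):
    """Return the base-by-base complement, in the same direction."""
    return seq.translate(_COMPLEMENT)


def reverse_complement(seq):
    """Return the complementary strand written 5' to 3'."""
    return complement(seq)[::-1]


seq = "AACGT"
print(reverse(seq), complement(seq), reverse_complement(seq))
```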
Pooled barcoded nucleic acids can be sequenced using any suitable sequencing method identifiable to a person skilled in the art. For example, sequencing the pooled barcoded nucleic acids can be performed using high-throughput sequencing, pyrosequencing, sequencing-by-synthesis, single-molecule sequencing, nanopore sequencing, sequencing-by-ligation, sequencing-by-hybridization, next generation sequencing, massively-parallel sequencing, primer walking, and any other sequencing methods known in the art and suitable for sequencing the barcoded nucleic acids generated using the methods herein described.
Sequencing Barcoded Nucleic Acids Retained in the Partitions
In some embodiments, sequencing the plurality of barcoded nucleic acids or products thereof comprises sequencing the barcoded nucleic acids retained in the partitions to obtain the nucleic acid sequences of the retained barcoded nucleic acids. Sequencing the barcoded nucleic acids retained in the partitions can comprise sequencing the entire sequence of a barcoded nucleic acid or sequencing a portion of the sequence of a barcoded nucleic acid, such as the cell barcode sequence of a
barcoded nucleic acid. In some embodiments, sequencing the barcoded nucleic acids retained in the partition can comprise determining the cell barcode sequences of the barcoded nucleic acids retained in the partition using oligonucleotide probes each comprising a fluorescent label.
In some embodiments, the cell barcode sequences of the barcoded nucleic acids retained in the partition can be determined using sequencing-by-ligation. The sequencing-by-ligation process can be carried out in the same microfluidic device used for performing other steps of the methods described herein, such as partitioning cells and barcode molecules and barcoding sample nucleic acids, without the necessity to transfer the barcoded nucleic acids elsewhere and therefore can be referred to as on-chip sequencing.
In the sequencing-by-ligation process, a first sequencing primer is hybridized to a single-stranded barcoded nucleic acid to be sequenced. A mixture of n-mer probes (e.g., a mixture of 16 probes, such as 8-mer probes) carrying m (e.g., four) distinct fluorescent labels can compete for ligation to the first barcode (e.g., cell barcode) immediately after the first sequencing primer. The number of n-mer probes can be, for example, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, or more. The n-mer probes can be, for example, 2-, 3-, 4-, 5-, 6-, 7-, 8-, 9-, 10-, 11-, 12-, 13-, 14-, 15-, 16-, 17-, 18-, 19-, or 20-mers, or longer. The number of fluorescent labels used can be, for example, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more. The fluorophore encoding, which can be based on the two 3’-most nucleotides of a probe, is read. Three bases including the dye can be cleaved from the 5’ end of the probe, leaving a free 5’ phosphate on the extended primer, which is then available for further ligation. After multiple ligations (e.g. 3 rounds of ligation) , the synthesized strands can be melted and the ligation product can be washed away before a second sequencing primer is annealed. The second sequencing primer can then hybridize to the single-stranded barcoded nucleic acid at a base position shifted by one nucleotide with respect to the position to which the first sequencing primer binds. The ligation process can then be repeated for the second sequencing primer. The same process can be followed for the rest of the sequencing primers. The dye readouts can be converted to a sequence. In some embodiments, 5 different sequencing primers are provided to sequence the first barcode sequences of the single-stranded barcoded nucleic acids retained in the partition.
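As a non-limiting illustration, the arithmetic behind the use of several sequencing primers offset by one nucleotide can be sketched as follows. With an 8-mer probe from which three bases are cleaved after each ligation, each ligation advances the extended primer by five bases, so a single primer interrogates one dinucleotide in every five positions and five primers together interrogate every position. The sketch makes several assumptions that are not stated in the text: position 0 is the first template base after the first sequencing primer, each successive primer is shifted by one base away from the barcode (hence the negative offsets), and the function name `interrogated_starts` is hypothetical.

```python
def interrogated_starts(probe_len=8, cleaved=3, n_primers=5, n_ligations=3):
    """Map each primer shift to the template positions at which its
    interrogated dinucleotides start.

    Each ligation leaves (probe_len - cleaved) bases on the extended primer,
    so consecutive ligations from one primer are spaced by that step.
    """
    step = probe_len - cleaved
    return {
        shift: [round_no * step - shift for round_no in range(n_ligations)]
        for shift in range(n_primers)
    }


starts = interrogated_starts()
covered = sorted(pos for positions in starts.values() for pos in positions)
print(starts[0], starts[4])
print(covered)
```

Under these assumptions, five primers with three ligations each give fifteen consecutive dinucleotide start positions, so that each base in the covered window is interrogated by two overlapping dinucleotides.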
In some embodiments, determining the cell barcode sequences of the barcoded nucleic acids retained in the partition using sequencing-by-ligation can comprise introducing a sequencing primer capable of hybridizing to the barcoded nucleic acids retained in the partition. The method can also comprise extending the sequencing primer using the barcoded nucleic acids retained in the partition as templates to generate a plurality of extended sequencing primers comprising the barcode sequences, or a portion thereof, of the barcoded nucleic acids retained in the partition. For example, a different sequencing primer can be introduced and extended in each of one or more cycles herein
described (e.g. one or more cycles of introducing a sequencing primer capable of hybridizing to the barcoded nucleic acids retained in the partition and extending the sequencing primer using the barcoded nucleic acids retained in the partition as templates to generate a plurality of extended sequencing primers comprising the barcode sequences, or a portion thereof, of the barcoded nucleic acids retained in the partition) . The introducing and extending can be repeated with a different sequencing primer capable of hybridizing to the barcoded nucleic acids retained in the partition. The method can also comprise introducing a plurality of oligonucleotide probes each comprising a fluorescent label. For example, the plurality of oligonucleotide probes can be octamer probes.
Post-sequencing analysis
The obtained nucleic acid sequences of the barcoded nucleic acids (e.g. nucleic acid sequences of both the first portion of first barcoded nucleic acids and the second linear barcoded nucleic acids) can be subjected to any downstream post-sequencing data analysis as will be understood by a person skilled in the art. The sequence data can undergo a quality control process to remove adapter sequences, low-quality reads, uncalled bases, and/or to filter out contaminants. The high-quality data obtained from the quality control can be mapped or aligned to a reference genome or assembled de novo.
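The read-level part of such a quality control step can be sketched as follows. This is a minimal illustration, assuming FASTQ-style quality strings in Phred+33 encoding; the function names and the thresholds (mean quality of 20, at most 10% uncalled bases) are hypothetical and not taken from the disclosure.

```python
def mean_phred(quality_string, offset=33):
    """Mean Phred score of a quality string (Phred+33 encoding assumed)."""
    return sum(ord(ch) - offset for ch in quality_string) / len(quality_string)


def passes_qc(sequence, quality_string, min_mean_quality=20.0, max_n_fraction=0.1):
    """Keep a read only if it has few uncalled bases and adequate mean quality."""
    if not sequence or len(sequence) != len(quality_string):
        return False  # empty or malformed record
    n_fraction = sequence.upper().count("N") / len(sequence)
    if n_fraction > max_n_fraction:
        return False  # too many uncalled bases
    return mean_phred(quality_string) >= min_mean_quality
```

Adapter trimming and contaminant filtering are not covered by this sketch and would typically be handled by dedicated tools.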
Analyzing the sequence information can comprise determining a number of the barcoded nucleic acids of each of the nucleic acid targets comprising UMIs with different sequences; and/or determining sequences of the barcoded nucleic acids of the nucleic acid targets comprising UMIs with different sequences. Gene expression quantification and differential expression analysis can be carried out to identify genes whose expression differs under different conditions, such as external stimuli and/or signals received from other cells.
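Counting barcoded nucleic acids with distinct UMIs can be sketched as below. The input layout, a list of (cell barcode, UMI, gene) tuples obtained after alignment, and the function name are assumptions for illustration only; UMI error correction is omitted.

```python
from collections import defaultdict


def count_umis(reads):
    """Count distinct UMIs per (cell barcode, gene).

    reads: iterable of (cell_barcode, umi, gene) tuples (hypothetical layout).
    Reads sharing cell barcode, gene and UMI are treated as PCR duplicates
    of one original molecule and counted once.
    """
    umis_seen = defaultdict(set)
    for cell_barcode, umi, gene in reads:
        umis_seen[(cell_barcode, gene)].add(umi)
    return {key: len(umis) for key, umis in umis_seen.items()}
```

For example, three reads from the same cell and gene with UMIs `AAC`, `AAC` and `GTT` yield a count of 2 rather than 3.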
In some embodiments, the method can comprise determining a profile (e.g. an expression profile, an omics profile, or a multi-omics profile) of the sample nucleic acids associated with the cell. A profile can be a single omics profile, such as a transcriptome profile. The profile can be a multi-omics profile, which can include profiles of genome (e.g. a genomics profile) , proteome (e.g. a proteomics profile) , transcriptome (e.g. a transcriptomics profile) , epigenome (e.g. an epigenomics profile) , metabolome (e.g. a metabolomics profile) , and/or microbiome (e.g. microbiome profile) . The profile can include an RNA expression profile. The profile can include a protein expression profile. The expression profile can comprise an RNA expression profile, an mRNA expression profile, and/or a protein expression profile. The expression profile can comprise an absolute abundance or a relative abundance. A profile can also be a profile of one or more target nucleic acids (e.g. gene markers) or a selection of genes associated with the cell.
Analyzing the sequencing information can comprise determining the pairing between the 5’ and 3’ sequences of the nucleic acid targets. Using the barcode oligonucleotide and barcoding method described herein, the 5’ and 3’ sequences of the same nucleic acid target can be provided with the same unique identifier (e.g., cell barcode and UMI) . Upon characterization of those sequences, they can be attributed as having been derived from the same nucleic acid target. The ability to match the 5’ and 3’ sequences derived from the same nucleic acid target is provided by the assignment of unique identifiers (e.g., cell barcode) specifically to the nucleic acid target. Unique identifiers, e.g., in the form of nucleic acid barcodes, can be assigned or associated with the 5’ and 3’ ends of the nucleic acid target, in order to tag or label the 5’ and 3’ sequences with the unique identifiers. These unique identifiers can then be used to attribute the 5’ and 3’ sequences to the same nucleic acid target. In some examples, this is carried out by the circularization of the barcoded nucleic acid and amplification of the circularized barcoded nucleic acid as described above. For example, the nucleic acid target can be an mRNA. A barcode oligonucleotide comprising a cell barcode, a UMI, a first PCR primer-binding sequence and a poly (dT) sequence can be added to a cDNA at the end corresponding to the 3’ end of the mRNA target, producing a barcoded cDNA, by hybridizing to the poly (A) tail of the mRNA and reverse transcription. The barcoded cDNAs can be divided into a first portion and a second portion. The first portion of the barcoded cDNAs can be circularized to produce circularized barcoded cDNAs. The circularized barcoded cDNAs can then be amplified to produce second linear barcoded cDNAs, with the cell barcode and the UMI attached to the cDNAs at the end corresponding to the 5’ end of the mRNA target.
The cell barcode and the UMI in the second portion of the barcoded cDNA and the second linear barcoded cDNA can comprise the same sequence, allowing the sequences of the second portion of the barcoded cDNA and the second linear barcoded cDNA to be attributed to the same mRNA target.
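The attribution of 5’ and 3’ sequences to the same target through a shared cell barcode and UMI can be sketched as a simple key-based join. The record layout (cell barcode, UMI, sequence) and the function name are assumptions made for illustration.

```python
from collections import defaultdict


def pair_ends(three_prime_reads, five_prime_reads):
    """Group 3' and 5' reads that share the same (cell barcode, UMI).

    Each input is an iterable of (cell_barcode, umi, sequence) tuples
    (hypothetical layout). Returns {(cell_barcode, umi): (3' seqs, 5' seqs)}
    for identifiers observed in both libraries.
    """
    three_by_key = defaultdict(list)
    for cell_barcode, umi, sequence in three_prime_reads:
        three_by_key[(cell_barcode, umi)].append(sequence)

    five_by_key = defaultdict(list)
    for cell_barcode, umi, sequence in five_prime_reads:
        five_by_key[(cell_barcode, umi)].append(sequence)

    shared_keys = three_by_key.keys() & five_by_key.keys()
    return {key: (three_by_key[key], five_by_key[key]) for key in shared_keys}
```

Identifiers seen in only one of the two libraries are left unpaired in this sketch; how such reads are handled is a design choice not addressed here.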
Analyzing the sequencing information can comprise integrating the 5’ and 3’ sequences of the same nucleic acid target to obtain the full-length sequence information of the nucleic acid target. The 5’ and 3’ sequences attributed to the same nucleic acid target can be obtained using the method of pairing/matching the 5’ and 3’ sequences of the same nucleic acid target described above. After pairing, the 5’ and 3’ sequences of the same nucleic acid target can be integrated, such as by aligning an overlapping sequence.
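Integration by an overlapping sequence can be sketched as an exact suffix-prefix merge. This is a simplified illustration, assuming both sequences are in the same orientation and overlap without mismatches; the function name and the default minimum overlap are hypothetical, and a practical pipeline would more likely align both ends to a reference or tolerate sequencing errors.

```python
def merge_by_overlap(five_prime_seq, three_prime_seq, min_overlap=10):
    """Join two sequences where a suffix of the 5' part equals a prefix of the 3' part.

    Returns the merged sequence using the longest exact overlap of at least
    min_overlap bases, or None if no such overlap exists.
    """
    if min_overlap < 1:
        raise ValueError("min_overlap must be at least 1")
    longest_possible = min(len(five_prime_seq), len(three_prime_seq))
    for overlap in range(longest_possible, min_overlap - 1, -1):
        if five_prime_seq[-overlap:] == three_prime_seq[:overlap]:
            return five_prime_seq + three_prime_seq[overlap:]
    return None
```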
In some embodiments, the method disclosed herein can be used to determine a profile (e.g., an expression profile, an omics profile, or a multi-omics profile) of a cell, such as to detect changes in gene expression profile of the cell in terms of identification of RNA transcripts and their quantitation. In some embodiments, a profile of a cell can be determined using the nucleic acid sequences of the plurality of barcoded nucleic acids. The profile can comprise a transcriptomics profile,
a multi-omics profile such as a genomics profile, a proteomics profile, a transcriptomics profile, an epigenomics profile, a metabolomics profile, a chromatics profile, or a combination thereof. For example, determining the profile of the cell can comprise determining the profile of the cell using the UMIs and sequences of the sample nucleic acids, or a portion thereof, present in the nucleic acid sequences.
In some embodiments, the cell can have a differential expression of genes upon stimulation. A differential expression analysis can be performed to detect quantitative changes in expression levels of the cell. Genes expressed differentially can be detected. Differential expression profile can be correlated to the cell functionality and/or cell’s phenotypes.
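A quantitative change in expression between two conditions can be summarized, for instance, as a log2 fold change of UMI counts. The sketch below is only an effect-size calculation under stated assumptions (a pseudocount of 1 to avoid division by zero, counts already normalized to comparable depth); it is not a statistical test, and the function name is hypothetical.

```python
import math


def log2_fold_change(stimulated_count, control_count, pseudocount=1.0):
    """log2 ratio of stimulated to control counts, stabilized by a pseudocount."""
    return math.log2((stimulated_count + pseudocount) / (control_count + pseudocount))
```

A positive value indicates higher expression after stimulation and a negative value indicates lower expression; identifying differentially expressed genes additionally requires replicates and an appropriate statistical model.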
Composition & Kit
Disclosed herein include compositions for single cell sequencing or single cell analysis. In some embodiments, a composition for single cell sequencing or single cell analysis comprises a plurality of beads of the present disclosure. The cell barcodes of the plurality of barcode oligonucleotides attached to each of the plurality of beads can be identical. The cell barcodes of barcode oligonucleotides attached to different beads of the plurality of beads can be different. The number of beads can be different in different embodiments. In some embodiments, the number of beads is, is about, is at least, is at least about, is at most, or is at most about, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000, 20000, 30000, 40000, 50000, 60000, 70000, 80000, 90000, 100000, 200000, 300000, 400000, 500000, 600000, 700000, 800000, 900000, 1000000, 2000000, 3000000, 4000000, 5000000, 6000000, 7000000, 8000000, 9000000, 10000000, 20000000, 30000000, 40000000, 50000000, 60000000, 70000000, 80000000, 90000000, 100000000, 200000000, 300000000, 400000000, 500000000, 600000000, 700000000, 800000000, 900000000, 1000000000, or a number or a range between any two of these values. For example, the number of beads can be at least 100 beads.
Disclosed herein include kits for single cell sequencing or single cell analysis. In some embodiments, a kit for single cell sequencing or single cell analysis comprises a composition comprising a plurality of beads of the present disclosure. The kit can comprise instructions for using the composition for single cell sequencing or single cell analysis.
Disclosed herein include methods of generating beads comprising barcode oligonucleotides. In some embodiments, a method of generating beads comprising barcode oligonucleotides comprises providing a plurality of beads each attached to a plurality of barcode oligonucleotides. Each barcode oligonucleotide of the plurality of barcode oligonucleotides can comprise a cell barcode, a unique molecular identifier (UMI) , and a poly (dT) sequence. The method can comprise adding, to the 3’-end of each of the barcode oligonucleotides of the plurality of barcode oligonucleotides, a probe sequence that is not a poly (dT) sequence and is capable of binding to a nucleic acid target.
In some embodiments, adding the probe sequence comprises adding the probe sequence to the 3’-end of each of the barcode oligonucleotides of the plurality of barcode oligonucleotides chemically. In some embodiments, adding the probe sequence comprises adding the probe sequence to the 3’-end of each of the barcode oligonucleotides of the plurality of barcode oligonucleotides using an enzyme. In some embodiments, the enzyme is a ligase. Adding the probe sequence can comprise ligating a probe oligonucleotide comprising the probe sequence to the 3’-end of each of the barcode oligonucleotides of the plurality of barcode oligonucleotides using the ligase. In some embodiments, the enzyme is a DNA polymerase. Adding the probe sequence can comprise synthesizing the probe sequence at the 3’-end of each of the barcode oligonucleotides of the plurality of barcode oligonucleotides using the DNA polymerase.
Disclosed herein include methods of generating beads comprising barcode oligonucleotides. In some embodiments, a method of generating beads comprising barcode oligonucleotides comprises providing a plurality of beads each attached to a plurality of oligonucleotide barcodes. Each barcode oligonucleotide of the plurality of barcode oligonucleotides can comprise a cell barcode and a unique molecular identifier (UMI) . The method can comprise adding to 3’-end of each of barcode oligonucleotides of the plurality of barcode oligonucleotides (i) a poly (dT) sequence and/or (ii) a probe sequence that is a non-poly (dT) sequence and is capable of binding to a nucleic acid target.
Examples
Some aspects of the embodiments discussed above are disclosed in further detail in the following examples, which are not in any way intended to limit the scope of the present disclosure.
Materials and Methods
The following experimental materials and methods were used for Example 1 described below.
The Examples provide a non-limiting method for detecting full-length transcript sequences at a high-throughput single-cell level. The method involved capturing mRNA using magnetic beads with poly (dT) tails. The beads were attached with cell barcodes for cell identification and UMIs for transcript quantification. After the capturing, cDNA was synthesized through reverse transcription and PCR amplification. A portion of the cDNA was used for 3' end transcriptome library construction to obtain gene expression and sequence information from the 3' end. Another portion was subjected to ligation-mediated amplification and reverse PCR to obtain expression quantification and sequence information of genes from the 5' end. By integrating the information
from the 3' and 5' ends, the complete full-length sequence of transcript was obtained.
First, single-cell suspension was loaded onto a microfluidic chip, and individual cells were isolated into individual wells on the chip. Then, capture beads with cell barcodes and UMIs were loaded onto the chip. Based on the diameters of the beads and wells (e.g., approximately 25 μm and 40 μm, respectively) , only one bead was loaded into each well. 100 μL of cell lysis buffer was loaded into the chip, and the chip was incubated at room temperature for 15 minutes to lyse cells and capture RNA. After 15 minutes, the magnetic beads with captured RNA were taken out of the microchip and subjected to reverse transcription and PCR amplification to generate cDNA.
A portion of the obtained cDNA was used for 3' end transcriptome library construction, following the specific methods provided by the Singleron GEXSCOPE kit. The remaining cDNA was used for 5' end library construction. Prior to 5' end library construction, the cDNA underwent circularization and amplification, following the specific methods outlined below:
Circularization
To prepare the Circularization Mix, 250 ng of cDNA product was taken. The Circularization Mix was prepared on ice according to the following table, vortexed to mix and centrifuged briefly.
TABLE 1: CIRCULARIZATION MIX
The circularization program was set up on a thermal cycler, according to the following table. The temperature of the lid of the thermal cycler was set at 85℃.
TABLE 2: CIRCULARIZATION PROGRAM
After the circularization program was set, the Circularization Mix was mixed well by pipetting and centrifuged briefly. The Circularization Mix was placed in the thermal cycler and the circularization program was run.
Once the program was finished, the PCR tube with the cDNA was placed on ice (no purification was needed) . The digestion reaction was prepared according to the following table. In Table 3, cyclicase is a mixture of an exonuclease, a DNA polymerase, and a ligase.
TABLE 3: DIGESTION REACTION
The digestion reaction mix was mixed well by pipetting, centrifuged briefly and kept on ice. The digestion program was set up on a thermal cycler, according to the following table. The lid of the thermal cycler was set to be OFF. The PCR tube was placed in the thermal cycler and the digestion program was run.
TABLE 4: DIGESTION PROGRAM
Purification of the Digested cDNA
0.5 mL of 80% ethanol per reaction was prepared. The enriched product was centrifuged briefly and the volume was measured with a pipette. AMPure beads were vortexed until homogenized and 32.5 μL of beads were added into 25 μL of the fragmented product obtained from the digestion step above. The product and the beads were mixed well by vortexing and incubated at room temperature for 5 minutes. The tube was centrifuged briefly and placed on the magnetic rack for 5 minutes or until the liquid was clear. The supernatant was carefully removed and discarded without disturbing the beads.
The tube was kept on the magnetic stand and 200 μL of freshly prepared 80% ethanol was added to wash the magnetic beads. The tube was incubated at room temperature for 30 seconds, and the supernatant was carefully aspirated without disrupting the beads. The wash with 80% ethanol was repeated one more time. The tube was centrifuged briefly and returned onto the magnetic stand. The excess ethanol was removed using a fine pipette tip. The lid was kept open to dry the beads for about 2 minutes or until the beads were no longer shiny, but no more than 5 minutes. The tube was removed from the magnetic stand. The target was eluted from the beads by adding 20 μL of nuclease-free water. The beads and the nuclease-free water were mixed well by pipetting up and down 10 times and incubated for at least 5 minutes at room temperature. The tube was centrifuged briefly, and placed back onto the magnetic stand until the liquid was clear. The supernatant (purified product) was transferred to a new EP tube. Quantification of the purified product was not necessary.
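The bead volumes used in the purification steps can be expressed as bead-to-sample volume ratios, which is how such clean-ups are commonly described. The helper below is a small arithmetic sketch; the function name is hypothetical.

```python
def bead_to_sample_ratio(bead_volume_ul, sample_volume_ul):
    """Return the bead-to-sample volume ratio (e.g. 1.3 for a '1.3x' clean-up)."""
    if sample_volume_ul <= 0:
        raise ValueError("sample volume must be positive")
    return bead_volume_ul / sample_volume_ul
```

With the volumes stated here, 32.5 μL of beads added to 25 μL of digested product corresponds to a ratio of 1.3, whereas the purification of amplification products (40 μL of beads into 50 μL of product) corresponds to a ratio of 0.8.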
Amplification of circularization products.
The PCR Mix was prepared on ice according to the following table, vortexed to mix and centrifuged briefly.
TABLE 5: PCR MIX
200 μL of PCR Mix was taken, mixed by pipetting and distributed into PCR tubes, with a volume of 50 μL in each tube. The PCR tubes were covered and placed in a PCR machine for amplification, with a lid temperature of 105℃ and a reaction volume of 50 μL. The PCR program was set as in the table below. After the completion of the PCR program, the amplification products can be stored at 4℃ for 48 hours or at -20℃ for 3 months. Alternatively, the amplification products were purified directly by proceeding to the purification step below.
TABLE 6: PCR PROGRAM
Purification of amplification products.
0.5 mL of 80% ethanol per reaction was prepared. The enriched product was centrifuged briefly and the volume was measured with a pipette. AMPure beads were vortexed until homogenized and 40 μL of beads were added into 50 μL of first-round PCR-enriched product from the amplification step above. The beads and the PCR product were mixed well by vortexing and incubated at room temperature for 5 minutes. The tube was centrifuged briefly and placed on the magnetic rack for 5 minutes or until the liquid was clear. The supernatant was carefully removed and discarded without disturbing the beads. The tube was kept on the magnetic stand and 200 μL of freshly prepared 80% ethanol was added to wash the magnetic beads. The tube was incubated at room temperature for 30 seconds, and the supernatant was carefully aspirated without disrupting the beads. The 80% ethanol wash step was repeated one more time. The tube was centrifuged briefly and returned onto the magnetic stand. The excess ethanol was removed using a fine pipette tip. The lid was kept open to dry the beads for about 2 minutes or until the beads were no longer shiny, but no more than 5 minutes. The tube was removed from the magnetic stand. The target was eluted from the beads by adding 20 μL of elution buffer (EB) . The beads and the EB were mixed well by pipetting up and down 10 times and incubated for at least 5 minutes at room temperature. The tube was centrifuged briefly and placed back on the magnetic stand until the liquid was clear. The supernatant (purified product) was transferred to a new EP tube. The product can be stored at 4℃ for 72 hours or at -20℃ for 3 months, or can be used directly in the next step. The quality control of the product was conducted by taking 1 μL of sample for Qubit concentration detection.
5’ Transcriptome Library construction
The amplified products after circularization served as templates for the construction of the 5' end transcriptome library. The construction method for the 5' end transcriptome library followed the enrichment library construction method provided by the Singleron sCircle kit.
Example 1
Cell Clustering Analysis
The aforementioned single-cell sequencing method was applied to a mixed sample of human and mouse cell lines. 3T3 cells from mice and CCRF cells from humans were mixed in a 1:1 ratio, and single-cell sequencing was performed. The transcriptome information obtained from the sequencing was used for cell clustering analysis. Additionally, the coverage of the 3' and 5' ends was evaluated, along with the assessment of complete coverage of the full-length gene after integrating the 3' and 5' sequence information.
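In a human and mouse mixing experiment of this kind, each cell barcode is commonly assigned to a species from the share of its UMIs that map to each genome. The sketch below illustrates that idea only; the function name, the 0.9 purity threshold and the category labels are assumptions and are not taken from the Example.

```python
def assign_species(human_umis, mouse_umis, purity_threshold=0.9):
    """Label a cell barcode by the species contributing most of its UMIs.

    Barcodes whose UMIs are split between species below the purity threshold
    are labeled 'mixed' (e.g. possible doublets or ambient RNA).
    """
    total = human_umis + mouse_umis
    if total == 0:
        return "unassigned"
    human_fraction = human_umis / total
    if human_fraction >= purity_threshold:
        return "human"
    if 1.0 - human_fraction >= purity_threshold:
        return "mouse"
    return "mixed"
```

The fraction of barcodes labeled as mixed gives a rough indication of how often two cells shared a partition.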
Using the high-throughput single-cell analysis provided herein, a single experiment allowed simultaneous detection of gene expression at both the 5' and 3' ends using the obtained cDNA. This
approach provided full-length sequence information of transcripts and a more comprehensive understanding of gene expression. Moreover, introducing Unique Molecular Identifiers (UMIs) during the cell lysis process enabled correction of transcript abundance for both the 5' and 3' ends. This correction enhanced the accuracy of gene expression measurements for transcripts sharing the same UMI. The method can also be performed in high throughput, allowing for the sequencing of tens of thousands of single-cell full-length transcriptomes in a single experiment. The library preparation and sequencing strategy were well-developed, utilizing second-generation sequencing technology to achieve highly accurate full-length single-cell transcriptome sequencing at a relatively low cost.
In at least some of the previously described embodiments, one or more elements used in an embodiment can interchangeably be used in another embodiment unless such a replacement is not technically feasible. It will be appreciated by those skilled in the art that various other omissions, additions and modifications may be made to the methods and structures described above without departing from the scope of the claimed subject matter. All such modifications and changes are intended to fall within the scope of the subject matter, as defined by the appended claims.
With respect to the use of substantially any plural and/or singular terms herein, those having skill in the art can translate from the plural to the singular and/or from the singular to the plural as is appropriate to the context. The various singular/plural permutations may be expressly set forth herein for sake of clarity. As used in this specification and the appended claims, the singular forms “a, ” “an, ” and “the” include plural references unless the context clearly dictates otherwise. Any reference to “or” herein is intended to encompass “and/or” unless otherwise stated.
It will be understood by those within the art that, in general, terms used herein, and especially in the appended claims (e.g., bodies of the appended claims) are generally intended as “open” terms (e.g., the term “including” should be interpreted as “including but not limited to, ” the term “having” should be interpreted as “having at least, ” the term “includes” should be interpreted as “includes but is not limited to, ” etc. ) . It will be further understood by those within the art that if a specific number of an introduced claim recitation is intended, such an intent will be explicitly recited in the claim, and in the absence of such recitation no such intent is present. For example, as an aid to understanding, the following appended claims may contain usage of the introductory phrases “at least one” and “one or more” to introduce claim recitations. However, the use of such phrases should not be construed to imply that the introduction of a claim recitation by the indefinite articles “a” or “an” limits any particular claim containing such introduced claim recitation to embodiments containing only one such recitation, even when the same claim includes the introductory phrases “one or more” or “at least one” and indefinite articles such as “a” or “an” (e.g., “a” and/or “an”
should be interpreted to mean “at least one” or “one or more” ) ; the same holds true for the use of definite articles used to introduce claim recitations. In addition, even if a specific number of an introduced claim recitation is explicitly recited, those skilled in the art will recognize that such recitation should be interpreted to mean at least the recited number (e.g., the bare recitation of “two recitations, ” without other modifiers, means at least two recitations, or two or more recitations) . Furthermore, in those instances where a convention analogous to “at least one of A, B, and C, etc. ” is used, in general such a construction is intended in the sense one having skill in the art would understand the convention (e.g., “a system having at least one of A, B, and C” would include but not be limited to systems that have A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B, and C together, etc. ) . In those instances where a convention analogous to “at least one of A, B, or C, etc. ” is used, in general such a construction is intended in the sense one having skill in the art would understand the convention (e.g., “a system having at least one of A, B, or C” would include but not be limited to systems that have A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B, and C together, etc. ) . It will be further understood by those within the art that virtually any disjunctive word and/or phrase presenting two or more alternative terms, whether in the description, claims, or drawings, should be understood to contemplate the possibilities of including one of the terms, either of the terms, or both terms.
In addition, where features or aspects of the present disclosure are described in terms of Markush groups, those skilled in the art will recognize that the present disclosure is also thereby described in terms of any individual member or subgroup of members of the Markush group.
As will be understood by one skilled in the art, for any and all purposes, such as in terms of providing a written description, all ranges disclosed herein also encompass any and all possible sub-ranges and combinations of sub-ranges thereof. Any listed range can be easily recognized as sufficiently describing and enabling the same range being broken down into at least equal halves, thirds, quarters, fifths, tenths, etc. As a non-limiting example, each range discussed herein can be readily broken down into a lower third, middle third and upper third, etc. As will also be understood by one skilled in the art all language such as “up to, ” “at least, ” “greater than, ” “less than, ” and the like include the number recited and refer to ranges which can be subsequently broken down into sub-ranges as discussed above. Finally, as will be understood by one skilled in the art, a range includes each individual member. Thus, for example, a group having 1-3 articles refers to groups having 1, 2, or 3 articles. Similarly, a group having 1-5 articles refers to groups having 1, 2, 3, 4, or 5 articles, and so forth.
While various aspects and embodiments have been disclosed herein, other aspects and embodiments will be apparent to those skilled in the art. The various aspects and embodiments disclosed herein
are for purposes of illustration and are not intended to be limiting, with the true scope and spirit being indicated by the following claims.
Claims (62)
- A method for single cell analysis, comprising: partitioning a cell of a plurality of cells and a bead of a plurality of beads attached with a plurality of barcode oligonucleotides into a partition of a plurality of partitions, wherein each barcode oligonucleotide of the plurality of barcode oligonucleotides comprises a cell barcode, a unique molecular identifier (UMI) and a probe sequence, and wherein the probe sequence is capable of binding to an RNA target associated with the cell; hybridizing the plurality of barcode oligonucleotides attached to the bead in the partition with the RNA targets associated with the cell in the partition; reverse transcribing the RNA targets hybridized to the barcode oligonucleotides to generate a first plurality of barcoded complementary deoxyribonucleic acids (cDNAs) , wherein the first plurality of barcoded cDNAs comprises the barcode oligonucleotides and cDNAs corresponding to the RNA targets, and wherein the cDNAs corresponding to the RNA targets comprise one end attached to the UMI and the cell barcode and the other end; obtaining a first portion and a second portion of the first plurality of barcoded cDNAs; circularizing each of the first portion of the first plurality of barcoded cDNAs to generate a plurality of circularized barcoded cDNAs; amplifying the plurality of circularized barcoded cDNAs to generate a second plurality of linear barcoded cDNAs; and analyzing the second portion of the first plurality of barcoded cDNAs and the second plurality of linear barcoded cDNAs, or products thereof.
- The method of claim 1, wherein the RNA target comprises a messenger RNA (mRNA) .
- A method for single cell analysis comprising: partitioning a cell of a plurality of cells and a bead of a plurality of beads attached with a plurality of barcode oligonucleotides into a partition of a plurality of partitions, wherein each barcode oligonucleotide of the plurality of barcode oligonucleotides comprises a cell barcode, a unique molecular identifier (UMI) and a probe sequence, and wherein the probe sequence is capable of binding to a nucleic acid target associated with the cell; barcoding the nucleic acid targets associated with the cell in the partition to generate a first plurality of barcoded nucleic acids, wherein the first plurality of barcoded nucleic acids comprises the barcode oligonucleotides and nucleotide sequences corresponding to the nucleic acid targets, and wherein the nucleotide sequences corresponding to the nucleic acid targets comprise one end attached to the UMI and the cell barcode and the other end; obtaining a first portion and a second portion of the first plurality of barcoded nucleic acids; circularizing each of the first portion of the first plurality of barcoded nucleic acids to generate a plurality of circularized barcoded nucleic acids; amplifying the plurality of circularized barcoded nucleic acids to generate a second plurality of linear barcoded nucleic acids; and analyzing the second portion of the first plurality of barcoded nucleic acids and the second plurality of linear barcoded nucleic acids, or products thereof.
- The method of claim 3, wherein the nucleic acid targets comprise a ribonucleic acid (RNA) , a messenger RNA (mRNA) , and a deoxyribonucleic acid (DNA) , and/or wherein the nucleic acid targets comprise nucleic acid targets of the cell, from the cell, in the cell, and/or on the surface of the cell.
- The method of any one of claims 1-4, wherein the partition is a droplet or a microwell.
- The method of any one of claims 1-5, wherein the plurality of partitions comprises a plurality of microwells of a microwell array.
- The method of any one of claims 1-6, wherein the plurality of partitions comprises at least 1000 partitions.
- The method of any one of claims 1-7, wherein at least 50% of partitions of the plurality of partitions comprise a single cell of the plurality of cells and a single bead of the plurality of beads, wherein at most 10% of partitions of the plurality of partitions comprise two or more cells of the plurality of cells, wherein at most 10% of partitions of the plurality of partitions comprise no cell of the plurality of cells, wherein at most 10% of partitions of the plurality of partitions comprise two or more beads of the plurality of beads, and/or wherein at most 10% of partitions of the plurality of partitions comprise no bead of the plurality of beads.
- The method of any one of claims 1-8, wherein the probe sequence is at least 10 nucleotides in length.
- The method of any one of claims 1-9, wherein the probe sequence is not a poly-dT sequence.
- The method of any one of claims 1-10, wherein the barcode oligonucleotides comprising probe sequence is capable of binding to a non-poly-A RNA target and/or nucleic acid target.
- The method of claim 10 or 11, wherein the barcode oligonucleotides comprising probe sequences that are not poly-dT sequences are capable of binding to an identical non-poly-A RNA target and/or nucleic acid target.
- The method of claim 10 or 11, wherein the barcode oligonucleotides comprising probe sequences that are not poly-dT sequences are capable of binding to different non-poly-A RNA targets and/or nucleic acid targets.
- The method of any one of claims 1-9, wherein the probe sequence is a poly-dT sequence, optionally wherein the poly-dT sequence is at least 10 nucleotides in length.
- The method of claim 14, wherein the poly-dT sequences of the barcode oligonucleotides attached to a bead of the plurality of beads are identical.
- The method of any one of claims 1-15, wherein the probe sequences of barcode oligonucleotides comprise a degenerate sequence, optionally wherein the degenerate sequence is at least 3 nucleotides in length, optionally wherein the degenerate sequence spans, or corresponds to, a mutation.
- The method of any one of claims 1-16, wherein the probe sequences of barcode oligonucleotides span a region of interest.
- The method of any one of claims 1-17, wherein the probe sequence is adjacent to a region of interest.
- The method of any one of claims 1-18, wherein the cell barcodes of two barcode oligonucleotides attached to a bead of the plurality of beads comprise an identical sequence, wherein the cell barcodes of two barcode oligonucleotides attached to two beads of the plurality of beads comprise different sequences, and/or wherein the cell barcode of each barcode oligonucleotide is at least 6 nucleotides in length.
- The method of any one of claims 1-19, wherein the UMIs of two barcode oligonucleotides attached to a bead of the plurality of beads comprise different sequences, wherein the UMIs of two barcode oligonucleotides attached to two beads of the plurality of beads comprise an identical sequence, and/or wherein the UMI of each barcode oligonucleotide is at least 6 nucleotides in length.
- The method of any one of claims 1-20, wherein the barcode oligonucleotide further comprises a first polymerase chain reaction (PCR) primer-binding sequence, optionally wherein the first PCR primer-binding sequence comprises a Read 1 sequence.
- The method of any one of claims 1-21, wherein the barcode oligonucleotides are reversibly attached to, covalently attached to, or irreversibly attached to the bead.
- The method of any one of claims 1-22, wherein the barcode oligonucleotide comprises, from the 5’ end to the 3’ end, the cell barcode, the UMI, the PCR primer-binding sequence, and the probe sequence; or the UMI, the cell barcode, the PCR primer-binding sequence, and the probe sequence.
- The method of any one of claims 1-23, wherein the bead is a gel bead, optionally wherein the gel bead is degradable upon application of a stimulus, further optionally wherein the stimulus comprises a thermal stimulus, a chemical stimulus, a biological stimulus, a photo-stimulus, or a combination thereof.
- The method of any one of claims 1-23, wherein the bead is a solid bead and/or a magnetic bead.
- The method of any one of claims 3-25, wherein barcoding the nucleic acid targets associated with the cell comprises: hybridizing the barcode oligonucleotides attached to the bead in each partition of the plurality of partitions with nucleic acid targets associated with the cell in the partition; extending the barcode oligonucleotides attached to the bead and hybridized to the nucleic acid targets using the nucleic acid targets as templates to generate single-stranded barcoded nucleic acids; and generating double-stranded barcoded nucleic acids from the single-stranded barcoded nucleic acids.
- The method of claim 26, wherein generating double-stranded barcoded nucleic acids from the single-stranded barcoded nucleic acids comprises extending the single-stranded barcoded nucleic acids.
- The method of claim 27, wherein extending the single-stranded barcoded nucleic acids comprises extending the single-stranded barcoded nucleic acids using a template switching oligonucleotide.
- The method of any one of claims 26-28, further comprising pooling the beads prior to extending the barcode oligonucleotides or prior to generating the double-stranded barcoded nucleic acids.
- The method of any one of claims 26-29, wherein extending the barcode oligonucleotides attached to the bead and hybridized to the nucleic acid targets comprises extending the barcode oligonucleotides attached to the bead and hybridized to the nucleic acid targets in bulk, and wherein generating the double-stranded barcoded nucleic acids comprises generating the double-stranded barcoded nucleic acids from the single-stranded barcoded nucleic acids in bulk.
- The method of any one of claims 26-30, further comprising pooling the beads subsequent to extending the barcode oligonucleotides attached to the bead to generate the single-stranded barcoded nucleic acids or subsequent to generating the double-stranded barcoded nucleic acids.
- The method of any one of claims 26-31, wherein extending the barcode oligonucleotides attached to the bead and hybridized to the nucleic acid targets comprises extending the barcode oligonucleotides attached to the bead and hybridized to the nucleic acid targets in the partition, and wherein generating the double-stranded barcoded nucleic acids comprises generating the double-stranded barcoded nucleic acids from the single-stranded barcoded nucleic acids in the partition.
- The method of any one of claims 1-32, wherein circularizing each of the first portion of the first plurality of barcoded cDNAs comprises connecting the UMIs and the cell barcodes attached to one end of the cDNAs corresponding to the RNA targets to the other end of the cDNAs corresponding to the RNA targets, or wherein circularizing each of the first portion of the first plurality of barcoded nucleic acids comprises connecting the UMIs and the cell barcodes attached to one end of the nucleotide sequences corresponding to the nucleic acid targets to the other end of the nucleotide sequences corresponding to the nucleic acid targets.
- The method of any one of claims 1-33, wherein circularizing each of the first portion of the first plurality of barcoded nucleic acids/cDNAs comprises: generating barcoded nucleic acid/cDNA comprising a first circle handle attached to one end of the first plurality of barcoded nucleic acid/cDNA and a second circle handle attached to the other end of the first plurality of barcoded nucleic acid/cDNA; and connecting the first circle handle and the second circle handle.
- The method of claim 34, wherein the first circle handle and the second circle handle comprise an identical nucleotide sequence, an overlapping nucleotide sequence and/or a complementary nucleotide sequence, optionally wherein the identical nucleotide sequence, the overlapping nucleotide sequence and/or the complementary nucleotide sequence is at least 10 nucleotides in length and/or at most 150 nucleotides in length, further optionally about 40 nucleotides in length.
- The method of any one of claims 34-35, wherein generating barcoded nucleic acid/cDNA comprising a first circle handle attached to one end of the first plurality of barcoded nucleic acid/cDNA and a second circle handle attached to the other end of the first plurality of barcoded nucleic acid/cDNA comprises: hybridizing first circularization primers comprising the first circle handles to the one end of each of the first plurality of barcoded nucleic acids/cDNAs, and second circularization primers comprising the second circle handles to the other end of each of the first plurality of barcoded nucleic acids/cDNAs; and extending the first circularization primers and the second circularization primers using each of the first portion of the first plurality of barcoded nucleic acids/cDNAs as templates.
- The method of claim 35, wherein connecting the first circle handle and the second circle handle comprises connecting the identical nucleotide sequences, the overlapping nucleotide sequences and/or the complementary nucleotide sequences of the first circle handle and the second circle handle.
- The method of any one of claims 1-37, wherein amplifying the plurality of circularized barcoded nucleic acids/cDNAs comprises: hybridizing first linearization primers and second linearization primers to the plurality of circularized barcoded nucleic acids/cDNAs; and extending the first linearization primers and the second linearization primers using the plurality of circularized barcoded nucleic acids/cDNAs as templates.
- The method of claim 38, wherein the first linearization primers and the second linearization primers hybridize to a sequence between 1) the one end of the nucleotide sequences corresponding to the nucleic acid targets or the cDNAs corresponding to the mRNA targets, and 2) the UMI and the cell barcode.
- The method of claim 38 or 39, wherein the first linearization primers and the second linearization primers hybridize to a sequence comprising the first PCR primer-binding sequence of the barcode oligonucleotides.
- The method of any one of claims 1-40, wherein amplifying the plurality of circularized barcoded nucleic acids/cDNAs further comprises purifying the second plurality of linear barcoded nucleic acids/cDNAs.
- The method of any one of claims 1-41, wherein analyzing the second portion of the first plurality of barcoded nucleic acids/cDNAs and the second plurality of linear barcoded nucleic acids/cDNAs comprises amplifying the second portion of the first plurality of barcoded nucleic acids/cDNAs to obtain an amplified second portion of the first plurality of barcoded nucleic acids/cDNAs and the second plurality of linear barcoded nucleic acids/cDNAs to obtain an amplified second plurality of linear barcoded nucleic acids/cDNAs.
- The method of claim 42, wherein analyzing the second portion of the first plurality of barcoded nucleic acids/cDNAs and the second plurality of linear barcoded nucleic acids/cDNAs comprises pooling the second portion of the first plurality of barcoded nucleic acids/cDNAs and the second plurality of linear barcoded nucleic acids/cDNAs before amplifying the second portion of the first plurality of barcoded nucleic acids/cDNAs and the second plurality of linear barcoded nucleic acids/cDNAs.
- The method of claim 42 or 43, wherein analyzing the second portion of the first plurality of barcoded nucleic acids/cDNAs and the second plurality of linear barcoded nucleic acids/cDNAs comprises processing the amplified second portion of the first plurality of barcoded nucleic acids/cDNAs and the amplified second plurality of linear barcoded nucleic acids/cDNAs to generate a processed second portion of the first plurality of barcoded nucleic acids/cDNAs and a processed second plurality of linear barcoded nucleic acids/cDNAs.
- The method of claim 44, wherein processing the amplified second portion of the first plurality of barcoded nucleic acids/cDNAs and the amplified second plurality of linear barcoded nucleic acids/cDNAs comprises: fragmenting the amplified second portion of the first plurality of barcoded nucleic acids/cDNAs to generate fragmented second portion of the first plurality of barcoded nucleic acids/cDNAs and the amplified second plurality of linear barcoded nucleic acids/cDNAs to generate fragmented second plurality of linear barcoded nucleic acids/cDNAs, optionally wherein fragmenting the amplified second portion of the first plurality of barcoded nucleic acids/cDNAs and the amplified second plurality of linear barcoded nucleic acids/cDNAs comprises fragmenting the amplified second portion of the first plurality of barcoded nucleic acids/cDNAs and the amplified second plurality of linear barcoded nucleic acids/cDNAs enzymatically; adding a second polymerase chain reaction (PCR) primer-binding sequence, optionally wherein the second PCR primer-binding sequence comprises a Read 2 sequence; and generating processed second portion of the first plurality of barcoded nucleic acids/cDNAs and processed second plurality of linear barcoded nucleic acids/cDNAs comprising sequencing primer sequences from the fragmented second portion of the first plurality of barcoded nucleic acids/cDNAs and fragmented second plurality of linear barcoded nucleic acids/cDNAs, optionally wherein the sequencing primer sequences comprise a P5 sequence and a P7 sequence.
- The method of claim 45, wherein analyzing the second portion of the first plurality of barcoded nucleic acids/cDNAs and the second plurality of linear barcoded nucleic acids/cDNAs further comprises pooling 1) the amplified second portion of the first plurality of barcoded nucleic acids/cDNAs and the amplified second plurality of linear barcoded nucleic acids/cDNAs; 2) the processed second portion of the first plurality of barcoded nucleic acids/cDNAs and the processed second plurality of linear barcoded nucleic acids/cDNAs; or 3) the fragmented second portion of the first plurality of barcoded nucleic acids/cDNAs and fragmented second plurality of linear barcoded nucleic acids/cDNAs.
- The method of any one of claims 1-45, wherein analyzing the second portion of the first plurality of barcoded nucleic acids/cDNAs and the second plurality of linear barcoded nucleic acids/cDNAs, or products thereof comprises sequencing the second portion of the first plurality of barcoded nucleic acids/cDNAs and the second plurality of linear barcoded nucleic acids/cDNAs, or products thereof to obtain sequencing information.
- The method of claim 47, wherein sequencing the second portion of the first plurality of barcoded nucleic acids/cDNAs and the second plurality of linear barcoded nucleic acids/cDNAs, or products thereof comprises sequencing the processed second portion of the first plurality of barcoded nucleic acids/cDNAs and the processed second plurality of linear barcoded nucleic acids/cDNAs.
- The method of claim 47 or 48, wherein sequencing the second portion of the first plurality of barcoded nucleic acids/cDNAs and the second plurality of linear barcoded nucleic acids/cDNAs, or products thereof, comprises sequencing products of the second portion of the first plurality of barcoded nucleic acids/cDNAs and the second plurality of linear barcoded nucleic acids/cDNAs each comprising a P5 sequence, a Read 1 sequence, a cell barcode, a UMI, a poly-dT sequence, a probe sequence, a sequence of a nucleic acid target or a part thereof, a Read 2 sequence, a sample index, and/or a P7 sequence to obtain sequencing information.
- The method of any one of claims 1-49, wherein analyzing the second portion of the first plurality of barcoded nucleic acids/cDNAs and the second plurality of linear barcoded nucleic acids/cDNAs, or products thereof comprises matching the second portion of the first plurality of barcoded nucleic acids/cDNAs and the second plurality of linear barcoded nucleic acids/cDNAs by the cell barcodes and the UMIs.
- The method of any one of claims 47-50, wherein analyzing the second portion of the first plurality of barcoded nucleic acids/cDNAs and the second plurality of linear barcoded nucleic acids/cDNAs, or products thereof comprises obtaining the full-length sequences of the nucleic acid targets or the RNA targets by integrating the sequencing information of the second portion of the first plurality of barcoded nucleic acids/cDNAs and the second plurality of linear barcoded nucleic acids/cDNAs.
- The method of any one of claims 47-51, wherein analyzing the second portion of the first plurality of barcoded nucleic acids/cDNAs and the second plurality of linear barcoded nucleic acids/cDNAs, or products thereof comprises analyzing the sequencing information.
- The method of any one of claims 47-53, wherein analyzing the second portion of the first plurality of barcoded nucleic acids/cDNAs and the second plurality of linear barcoded nucleic acids/cDNAs, or products thereof comprises: determining an expression profile of each of the nucleic acid targets and/or the RNA targets using a number of UMIs with different sequences associated with the nucleic acid targets and/or the RNA targets in the sequencing information, optionally wherein the expression profile comprises an absolute abundance or a relative abundance.
- The method of claim 53, wherein the expression profile comprises an RNA expression profile, an mRNA expression profile and/or a protein expression profile.
- The method of any one of claims 1-54, wherein analyzing the second portion of the first plurality of barcoded nucleic acids/cDNAs and the second plurality of linear barcoded nucleic acids/cDNAs, or products thereof comprises: determining a number of the second portion of the first plurality of barcoded nucleic acids/cDNAs and the second plurality of linear barcoded nucleic acids/cDNAs of each of the nucleic acid targets/RNA targets comprising UMIs with different sequences; and/or determining sequences of the second portion of the first plurality of barcoded nucleic acids/cDNAs and the second plurality of linear barcoded nucleic acids/cDNAs of the nucleic acid targets/RNA targets comprising UMIs with different sequences.
- The method of any one of claims 1-55, wherein the second portion of the first plurality of barcoded nucleic acids/cDNAs and the second plurality of linear barcoded nucleic acids/cDNAs are from at least 100 cells, optionally at least 1,000 cells.
- The method of any one of claims 1-56, wherein the second portion of the first plurality of barcoded nucleic acids/cDNAs and the second plurality of linear barcoded nucleic acids/cDNAs are from about 100 cells to about 50,000 cells.
- The method of any one of claims 3-57, further comprising releasing the nucleic acids from the cell prior to barcoding the nucleic acid targets associated with the cell.
- The method of any one of claims 3-58, further comprising lysing the cell to release the nucleic acid targets from the cell.
- The method of any one of claims 1-2 and 5-59, wherein reverse transcribing the RNA targets hybridized to the barcode oligonucleotides is performed without lysing or digesting the cells.
- A composition comprising a plurality of beads of any one of claims 1-60, wherein the cell barcodes of the plurality of barcode oligonucleotides attached to each of the plurality of beads are identical, and wherein the cell barcodes of barcode oligonucleotides attached to different beads of the plurality of beads are different, optionally wherein the plurality of beads comprises at least 100 beads.
- A kit comprising: a composition of claim 61; and instructions for using the composition for single cell sequencing or analysis.
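The analysis steps recited in the claims above (matching the two read sets by cell barcode and UMI, and estimating expression from the number of distinct UMIs per target) can be illustrated with a minimal Python sketch. This is not the applicants' implementation: the `Read` record, its field names, the `"direct"`/`"circularized"` source labels, and the example sequences are all assumptions made for illustration, and real data would first need barcode extraction, error correction, and alignment.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Tuple


class Read(NamedTuple):
    """Hypothetical parsed read; all field names are illustrative assumptions."""
    cell_barcode: str
    umi: str
    target: str   # target (e.g. gene) the insert was assigned to
    source: str   # "direct" = second portion of the first plurality;
                  # "circularized" = second plurality of linear cDNAs


def match_by_barcode_and_umi(reads: Iterable[Read]) -> Dict[Tuple[str, str], List[Read]]:
    """Group reads that share a cell barcode and a UMI (same original molecule)."""
    groups: Dict[Tuple[str, str], List[Read]] = defaultdict(list)
    for read in reads:
        groups[(read.cell_barcode, read.umi)].append(read)
    return dict(groups)


def molecules_seen_in_both(groups: Dict[Tuple[str, str], List[Read]]) -> List[Tuple[str, str]]:
    """Return (cell barcode, UMI) keys supported by both library types."""
    needed = {"direct", "circularized"}
    return sorted(key for key, rs in groups.items()
                  if needed <= {r.source for r in rs})


def umi_counts(reads: Iterable[Read]) -> Dict[Tuple[str, str], int]:
    """Count distinct UMIs per (cell barcode, target); duplicates collapse."""
    seen: Dict[Tuple[str, str], set] = defaultdict(set)
    for read in reads:
        seen[(read.cell_barcode, read.target)].add(read.umi)
    return {key: len(umis) for key, umis in seen.items()}


# Made-up example data (sequences and target names are placeholders).
example_reads = [
    Read("AAACCC", "TTGACA", "GENE1", "direct"),
    Read("AAACCC", "TTGACA", "GENE1", "circularized"),
    Read("AAACCC", "TTGACA", "GENE1", "direct"),        # duplicate of the first read
    Read("AAACCC", "GGCATA", "GENE1", "direct"),
    Read("CCCGGG", "TTGACA", "GENE2", "circularized"),
]

groups = match_by_barcode_and_umi(example_reads)
both = molecules_seen_in_both(groups)
counts = umi_counts(example_reads)
```

In this sketch, reads sharing a (cell barcode, UMI) key are treated as derived from one original barcoded cDNA, so their sequences could in principle be integrated toward a full-length sequence, while the distinct-UMI count per cell and target serves as an abundance estimate that is not inflated by amplification duplicates.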
Priority Applications (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/CN2023/132328 WO2025102355A1 (en) | 2023-11-17 | 2023-11-17 | Methods and reagents for high-throughput single cell full length rna analysis |
| PCT/CN2024/132050 WO2025103412A1 (en) | 2023-11-17 | 2024-11-14 | Methods and reagents for high-throughput single cell full length rna analysis |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/CN2023/132328 WO2025102355A1 (en) | 2023-11-17 | 2023-11-17 | Methods and reagents for high-throughput single cell full length rna analysis |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2025102355A1 (en) | 2025-05-22 |
Family
ID=95741829
Family Applications (2)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/CN2023/132328 Pending WO2025102355A1 (en) | 2023-11-17 | 2023-11-17 | Methods and reagents for high-throughput single cell full length rna analysis |
| PCT/CN2024/132050 Pending WO2025103412A1 (en) | 2023-11-17 | 2024-11-14 | Methods and reagents for high-throughput single cell full length rna analysis |
Family Applications After (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/CN2024/132050 Pending WO2025103412A1 (en) | 2023-11-17 | 2024-11-14 | Methods and reagents for high-throughput single cell full length rna analysis |
Country Status (1)
| Country | Link |
|---|---|
| WO (2) | WO2025102355A1 (en) |
Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN109811045A (en) * | 2017-11-22 | 2019-05-28 | 深圳华大智造科技有限公司 | Construction method and application of high-throughput single-cell full-length transcriptome sequencing library |
| US20190345488A1 (en) * | 2016-10-01 | 2019-11-14 | Berkeley Lights, Inc. | Dna barcode compositions and methods of in situ identification in a microfluidic device |
| US20220145370A1 (en) * | 2019-03-27 | 2022-05-12 | 10X Genomics, Inc. | Systems and methods for processing rna from cells |
| WO2023030479A1 (en) * | 2021-09-02 | 2023-03-09 | 新格元(南京)生物科技有限公司 | Reagent and method for high-throughput single-cell targeted sequencing |
| US20230193355A1 (en) * | 2020-04-16 | 2023-06-22 | Singleron (Nanjing) Biotechnologies, Ltd. | Methods and compositions for high-throughput target sequencing in single cells |
Family Cites Families (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| SG10202012440VA (en) * | 2016-10-19 | 2021-01-28 | 10X Genomics Inc | Methods and systems for barcoding nucleic acid molecules from individual cells or cell populations |
| US20230151355A1 (en) * | 2019-03-12 | 2023-05-18 | Universal Sequencing Technology Corporation | Methods for Single Cell Intracellular Capture and its Applications |
| WO2021208036A1 (en) * | 2020-04-16 | 2021-10-21 | Singleron (Nanjing) Biotechnologies, Ltd. | A method for detection of whole transcriptome in single cells |
| WO2022188054A1 (en) * | 2021-03-10 | 2022-09-15 | Nanjing University | Methods and reagents for sample multiplexing for high throughput single-cell rna sequencing |
| CN114540472B (en) * | 2021-08-27 | 2024-02-23 | 四川大学华西第二医院 | Three-generation sequencing method |
- 2023-11-17: WO application PCT/CN2023/132328, published as WO2025102355A1 (en); status: active, pending
- 2024-11-14: WO application PCT/CN2024/132050, published as WO2025103412A1 (en); status: active, pending
Also Published As
| Publication number | Publication date |
|---|---|
| WO2025103412A1 (en) | 2025-05-22 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US20240141426A1 (en) | | Compositions and methods for identification of a duplicate sequencing read |
| CN113106150B (en) | | An ultra-high-throughput single-cell sequencing method |
| US20210071171A1 (en) | | Compositions and methods for targeted nucleic acid sequence enrichment and high efficiency library generation |
| KR102366116B1 (en) | | Compositions and methods for sample processing |
| US20140162278A1 (en) | | Methods and compositions for enrichment of target polynucleotides |
| US20140024541A1 (en) | | Methods and compositions for high-throughput sequencing |
| CN118497317A (en) | | Single cell whole genome library for methylation sequencing |
| KR20170020704A (en) | | Methods of analyzing nucleic acids from individual cells or cell populations |
| CA2974398A1 (en) | | Methods for highly parallel and accurate measurement of nucleic acids |
| US20140024536A1 (en) | | Apparatus and methods for high-throughput sequencing |
| CN114555802A (en) | | single cell analysis |
| CN106460051A (en) | | Integrated single cell sequencing |
| AU2019248635B2 (en) | | Compositions and methods for making controls for sequence-based genetic testing |
| WO2025103412A1 (en) | | Methods and reagents for high-throughput single cell full length rna analysis |
| JP2023514388A (en) | | Parallelized sample processing and library preparation |
| US20220373544A1 (en) | | Methods and systems for determining cell-cell interaction |
| WO2023138655A1 (en) | | Adjustable droplets distribution |
| EP4455306A1 (en) | | Labeling and analysis method for single-cell nucleic acid |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 23958628; Country of ref document: EP; Kind code of ref document: A1 |