WO2025059808A1 - Method and kit for high-throughput tagging of cell nucleic acid molecules - Google Patents

Info

Publication number
WO2025059808A1
Authority
WO
WIPO (PCT)
Prior art keywords
sequence
nucleic acid
primer
oligonucleotide
molecule
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
PCT/CN2023/119479
Other languages
French (fr)
Chinese (zh)
Inventor
蒋岚
李芸
黄正
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Institute Of Genomics Chinese Academy Of Sciences China National Center For Bioinformation
Beijing Institute of Genomics of CAS
Original Assignee
Beijing Institute Of Genomics Chinese Academy Of Sciences China National Center For Bioinformation
Beijing Institute of Genomics of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Institute Of Genomics Chinese Academy Of Sciences China National Center For Bioinformation, Beijing Institute of Genomics of CAS
Priority to PCT/CN2023/119479 (WO2025059808A1)
Priority to CN202380014812.9A (CN118647729A)
Publication of WO2025059808A1
Legal status: Pending

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6806: Preparing nucleic acids for analysis, e.g. for polymerase chain reaction [PCR] assay
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6869: Methods for sequencing
    • C: CHEMISTRY; METALLURGY
    • C40: COMBINATORIAL TECHNOLOGY
    • C40B: COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
    • C40B40/00: Libraries per se, e.g. arrays, mixtures
    • C40B40/04: Libraries containing only organic compounds
    • C40B40/06: Libraries containing nucleotides or polynucleotides, or derivatives thereof
    • C: CHEMISTRY; METALLURGY
    • C40: COMBINATORIAL TECHNOLOGY
    • C40B: COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
    • C40B50/00: Methods of creating libraries, e.g. combinatorial synthesis
    • C40B50/06: Biochemical methods, e.g. using enzymes or whole viable microorganisms

Definitions

  • This application relates to the field of high-throughput single-cell omics, in particular, high-throughput single-cell transcriptome sequencing technology, high-throughput single-cell chromatin accessibility (ATAC, Assay for Transposase-Accessible Chromatin) sequencing technology, and high-throughput single-cell transcriptome + chromatin accessibility multi-omics sequencing technology.
  • single-cell omics sequencing technology has greatly deepened human understanding of cell diversity and heterogeneity, and has played a revolutionary role in the development of multiple biological and biomedical research fields such as developmental biology, tumors and other diseases, assisted reproduction, immunology, neuroscience, and microbiology.
  • Existing single-cell sequencing mainly includes single-cell genome sequencing, transcriptome sequencing, methylation sequencing, chromatin accessibility sequencing, and single-cell multi-omics sequencing containing the above omics information. Its essence is to reveal the genome, transcriptome, methylation, chromatin open state and other omics changes of single cells by analyzing the sequence, copy number, modification status, and interaction of DNA and RNA in a single cell.
  • High-throughput single-cell library construction currently relies mainly on cell barcode labeling carried out in microfluidic droplets or microplates.
  • All commercial single-cell library construction technologies based on microfluidic droplets or microplates, such as 10x Genomics (Zheng GX, et al. Massively parallel digital transcriptional profiling of single cells. Nat Commun. 2017 Jan 16; 8:14049. doi: 10.1038/ncomms14049. PMID: 28091601), have the following disadvantages: low cell throughput, high library construction cost, a high proportion of empty micro-reaction compartments, and a high rate of pseudo-single cells.
  • the present application provides a method for labeling a nucleic acid molecule from a cell or a cell nucleus, comprising the following steps:
  • the plurality of first oligonucleotide molecules on the same bead have the same first tag sequence, and the first oligonucleotide molecules on different beads have first tag sequences different from each other;
  • the cell is a naturally occurring cell or a recombinant cell, or a mixture of the two;
  • the cell nucleus is a cell nucleus derived from a naturally occurring cell or a cell nucleus derived from a recombinant cell, or a mixture of the two.
  • the recombinant cell refers to a cell comprising a modified (e.g., artificially modified) nucleic acid molecule (e.g., gene) and/or its product (e.g., protein, RNA), wherein the modification includes, but is not limited to, increasing or decreasing the copy number of endogenous genes in the cell, mutating endogenous genes in the cell, upregulating or downregulating or silencing the expression of endogenous gene products in the cell, introducing exogenous nucleic acid molecules into the cell (the exogenous nucleic acid molecules are integrated into the genome of the cell or exist in a non-integrated form), etc.
  • the method of the present application can be used to label unmodified nucleic acid molecules in the cell/cell nucleus, and can also be used to label modified nucleic acid molecules in the cell/cell nucleus (e.g., modified endogenous nucleic acid molecules (e.g., genes) of the cell or introduced exogenous nucleic acid molecules contained in the cell).
  • cells or cell nuclei containing the first nucleic acid molecule, originating from at least 2 (e.g., at least 10, or 2-10) of the first discrete partitions, are mixed and redistributed to different second discrete partitions.
  • the first tag sequence contained in the first oligonucleotide molecule is specific to the first discrete partition, and therefore, all the first nucleic acid molecules derived from cells or cell nuclei in the same first discrete partition in step (2) contain the same first tag sequence or its complementary sequence.
  • the second tag sequence contained in the second oligonucleotide molecule is specific to the second discrete partition, and therefore, all the second nucleic acid molecules derived from cells or cell nuclei assigned to the same second discrete partition in step (4) contain the same second tag sequence or its complementary sequence.
  • the first tag sequence and the second tag sequence can be used together to identify the cell from which the sequencing data originates.
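As a minimal sketch of how the two tags can jointly identify a cell, the following Python groups reads by their (first tag, second tag) pair. The read records, the tag strings, and the function name `group_reads_by_cell` are illustrative assumptions and do not come from the application.

```python
from collections import defaultdict


def group_reads_by_cell(reads):
    """Group read IDs by their (first_tag, second_tag) pair.

    `reads` is an iterable of (read_id, first_tag, second_tag) tuples.
    Reads that share both tags are treated as coming from the same cell.
    """
    cells = defaultdict(list)
    for read_id, first_tag, second_tag in reads:
        cells[(first_tag, second_tag)].append(read_id)
    return dict(cells)


# Hypothetical reads: r1-r3 share a first tag (same bead), r3 has another second tag.
reads = [
    ("r1", "ACGT", "TTAA"),
    ("r2", "ACGT", "TTAA"),
    ("r3", "ACGT", "GGCC"),
    ("r4", "TGCA", "TTAA"),
]
cells = group_reads_by_cell(reads)
```

In this toy input, `r3` carries the same first tag as `r1` and `r2` but a different second tag, so it is assigned to a separate cell. That is the situation in which several cells shared one first discrete partition and were later separated by redistribution.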
  • the cells or cell nuclei can be of the same source or a mixture of different sources; the cells or cell nuclei can be derived from the same cell line or from different cell lines, from the same tissue or from different tissues, from the same individual or from different individuals, from the same species or from different species.
  • the cells or cell nuclei can also be a mixture of cells and cell nuclei.
  • a single said first discrete partition contains one said bead.
  • the first discrete partitions each independently contain one or more cells or cell nuclei.
  • each of the first discrete partitions independently contains 0-10 (e.g., 0-2, 0-3, 0-4, 0-5, 0-8, 1-2, 1-3, 1-4, 1-5, 1-8, 1-10, 2-3, 2-4, 2-5, 2-8, 2-10, 3-4, 3-5, 3-8, 3-10, 4-5, 4-8, 4-10) cells or cell nuclei.
  • each of the second discrete partitions independently contains one or more cells or cell nuclei derived from the first discrete partition that contain the first nucleic acid molecule.
  • each of the second discrete partitions independently contains 0-10^7 (e.g., 0-10, 0-10^2, 0-10^3, 0-10^4, 0-10^5, 0-10^6, 0-10^7, 1-10, 1-10^2, 1-10^3, 1-10^4, 1-10^5, 1-10^6, or 1-10^7) cells or cell nuclei derived from the first discrete partition that contain the first nucleic acid molecule.
  • in step (2), the method randomly distributes the plurality of beads and the plurality of cells or cell nuclei to different first discrete partitions using a droplet microfluidic system or a microplate system.
  • the droplet microfluidic system is selected from but not limited to: the microfluidic water-in-oil system of the 10x Genomics platform, the microfluidic system of the Fluidigm C1 platform, and the microfluidic system of the Bio-Rad ddSEQ system.
  • the microplate system is selected from but not limited to: the microplate system of the BD Rhapsody platform and the microplate system of the Neocell platform.
  • the method uses methanol to fix and permeabilize the cells, or uses formaldehyde or paraformaldehyde and Triton X-100 to fix and permeabilize the cells.
  • the method fixes and permeabilizes the cells by a treatment selected from the group consisting of:
  • the concentration of use is 0.05%-2% (for example, 0.05%-0.2%, 0.05%-0.25%, 0.05%-0.3%, 0.05%-0.5%, 0.05%-0.8%, 0.05%-1%, 0.1%-0.2%, 0.1%-0.25%, 0.1%-0.3%, 0.1%-0.4%,
  • the cells are permeabilized by treating the cells with Triton X-100 (e.g., 0.1%-0.5%, 0.1%-0.8%, 0.1%-1%, 0.2%-0.25%, 0.2%-0.3%, 0.2%-0.4%, 0.2%-0.5%, 0.2%-0.8%, 0.2%-1% or 0.2%) at -4°C to 10°C (e.g., 0°C to 4°C) for 0.5-10 min (e.g., 1-5 min or 3 min).
  • cells are fixed and permeabilized by treating the cells with 80% methanol for 10 min at -20°C.
  • the method uses formaldehyde or paraformaldehyde and digitonin to fix and permeabilize the nuclei.
  • the method further comprises applying IGEPAL (e.g., CA-630) and/or Tween-20 to permeabilize the cell nuclei.
  • the method fixes and permeabilizes the cell nuclei by a treatment selected from the group consisting of:
  • the permeabilization solution further comprises IGEPAL (e.g., CA-630) and/or Tween-20.
  • the concentration of digitonin in the permeabilization solution is 0.0005%-0.05% (e.g., 0.0008%-0.005%, 0.0005%-0.002%, 0.0008%-0.002%, or 0.001%).
  • the concentration of IGEPAL (e.g., CA-630) in the permeabilization solution is 0.005%-0.1% (e.g., 0.005%-0.05%, 0.008%-0.05%, 0.005%-0.02%, 0.008%-0.02%, or 0.01%).
  • the concentration of Tween-20 in the permeabilization solution is 0.005%-0.1% (e.g., 0.005%-0.05%, 0.008%-0.05%, 0.005%-0.02%, 0.008%-0.02%, or 0.01%).
  • the cell nuclei are fixed by treating them with 1% formaldehyde for 10 min at room temperature; after fixation, the cell nuclei are permeabilized by treating them with a permeabilization solution containing 0.001% digitonin, 0.01% IGEPAL (e.g., CA-630) and 0.01% Tween-20 at 0°C to 4°C for 2-4 min (e.g., 3 min).
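To make the example permeabilization recipe concrete, the sketch below computes how much of each detergent stock would be needed for a given final volume using C1·V1 = C2·V2. The final concentrations are the ones quoted above; the stock concentrations (1% digitonin, 10% IGEPAL CA-630, 10% Tween-20), the 1 mL final volume, and the function name are assumptions chosen only for illustration.

```python
def stock_volume_ul(final_percent, stock_percent, final_volume_ul):
    """Volume of stock (in microliters) needed so that C1 * V1 = C2 * V2."""
    if stock_percent <= 0 or final_percent > stock_percent:
        raise ValueError("stock must be more concentrated than the final solution")
    return final_percent * final_volume_ul / stock_percent


# (final %, assumed stock %) per component; stock values are hypothetical.
recipe = {
    "digitonin": (0.001, 1.0),
    "IGEPAL CA-630": (0.01, 10.0),
    "Tween-20": (0.01, 10.0),
}
final_volume_ul = 1000.0  # assumed batch size of 1 mL

volumes = {
    name: stock_volume_ul(final, stock, final_volume_ul)
    for name, (final, stock) in recipe.items()
}
```

With these assumed stocks, each component works out to about 1 µL per mL of permeabilization solution; different stock concentrations would change the volumes but not the calculation.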
  • the method of labeling nucleic acid molecules from cells or cell nuclei comprises one or more selected from the following:
  • in step (1), providing at least 2 (e.g., at least 10, at least 10^2, at least 10^3, at least 10^4, at least 10^5, at least 10^6, at least 10^7, 2-10, 2-10^2, 2-10^3, 2-10^4, 2-10^5, 2-10^6, 2-10^7, 2-10^8 or 2-10^9) cells or cell nuclei; and/or providing at least 2 (e.g., at least 10, at least 10^2, at least 10^3, at least 10^4, at least 10^5, at least 10^6, at least 10^7, at least 10^8, 2-10, 2-10^2, 2-10^3, 2-10^4, 2-10^5, 2-10^6, 2-10^7, 2-10^8 or 2-10^9) beads;
  • the first discrete partitions are discrete micropores or discrete microdroplets (e.g., water-in-oil droplets);
  • the beads are coupled to at least 2 (e.g., at least 10, or 2-10) of the first oligonucleotide molecules;
  • the bead is capable of releasing the first oligonucleotide molecule spontaneously or upon exposure to one or more stimuli (e.g., temperature change, pH change, exposure to a specific chemical or phase, exposure to light, a reducing agent, etc.);
  • the beads are gel beads
  • the cells or cell nuclei are divided into at least 2 (e.g., at least 3, at least 4, at least 5, at least 8, at least 10, at least 12, at least 20, at least 24, at least 50, at least 96, at least 100, at least 200, at least 384, at least 400, 2-5, 2-10, 2-50, 2-80, 2-100, 2-500, 2-10^3, 2-10^4, 2-10^5 or 2-10^6) of the second discrete partitions, wherein each of the second discrete partitions contains at least one cell or cell nucleus;
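To illustrate why a larger number of second discrete partitions helps separate cells that shared one bead, here is a rough estimate. It assumes each cell is assigned to a second partition uniformly and independently, which is an idealization not stated in the application. Under that assumption, the probability that k cells from one first partition all receive different second tags is the birthday-problem product.

```python
def prob_all_distinct(k, n_partitions):
    """Probability that k cells from the same first partition land in
    k different second partitions, assuming uniform independent assignment."""
    if k > n_partitions:
        return 0.0
    p = 1.0
    for i in range(k):
        p *= (n_partitions - i) / n_partitions
    return p


# Three cells sharing one bead, redistributed into 96 or 384 wells.
p96 = prob_all_distinct(3, 96)
p384 = prob_all_distinct(3, 384)
```

Under this model, three co-encapsulated cells are fully resolved with a probability of roughly 0.97 in 96 wells and roughly 0.99 in 384 wells. Real assignment is not perfectly uniform, so these figures are only indicative.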
  • the second discrete partitions are discrete holes in a porous plate
  • after step (3) and before step (4), the method further includes the steps of lysing cells and/or purifying the first nucleic acid molecule.
  • the nucleic acid molecule to be labeled is mRNA
  • the first oligonucleotide molecule is the first oligonucleotide molecule a.
  • the step (2) comprises the following steps:
  • the primer B comprises a consensus sequence T or a partial sequence thereof and a poly(T) sequence;
  • (b) annealing the first oligonucleotide molecule a with the cDNA chain generated in (a) within the first discrete partition, and performing an extension reaction to generate an extension product, wherein the extension product is the first nucleic acid molecule; wherein the first oligonucleotide molecule a comprises, from the 5' end to the 3' end: a consensus sequence R1 or a partial sequence thereof, the first tag sequence and a complementary sequence to the 3' end overhang.
  • an overhang can be formed or added at the 3' end of the cDNA chain by using a reverse transcriptase with terminal transfer activity.
  • step (ii)(a) is performed before or after assigning said cells or cell nuclei to said first discrete partitions.
  • the first oligonucleotide molecule a further comprises a unique molecular tag sequence, and the plurality of first oligonucleotide molecules a coupled to the same bead have unique molecular tag sequences different from each other.
  • the unique molecular tag sequence is located at the 3' end of the consensus sequence R1 or a partial sequence thereof.
  • the consensus sequence O is identical or partially identical to the consensus sequence T.
  • the 3' terminal overhang has a length of at least 1, at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, 1-10, 1-5, or 2-10 nucleotides.
  • the 3' terminal overhang is an overhang of 2-5 cytosine nucleotides (e.g., a CCC overhang).
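The layout of the first oligonucleotide molecule a in this embodiment can be sketched as a simple string assembly: consensus sequence R1, then the unique molecular tag at its 3' end, then the first tag sequence, then the sequence complementary to the cDNA 3' overhang. All sequences below, the tag length, and the function names are placeholders invented for illustration; the application's actual sequences are not reproduced here.

```python
import random

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def build_first_oligo_a(r1, first_tag, overhang, umi_length=8, rng=None):
    """Assemble R1 + UMI + first tag + complement of the cDNA 3' overhang (5' to 3')."""
    rng = rng or random.Random()
    umi = "".join(rng.choice("ACGT") for _ in range(umi_length))
    return r1 + umi + first_tag + reverse_complement(overhang)


# Placeholder sequences, not taken from the application.
R1 = "ACGTACGTAC"
FIRST_TAG = "TTGGCCAA"
oligo = build_first_oligo_a(R1, FIRST_TAG, "CCC", rng=random.Random(0))
```

For a CCC overhang the 3' end of the oligonucleotide is GGG, which can anneal to the overhang so that the extension reaction copies the tag onto the cDNA. Drawing a fresh random UMI per molecule mirrors the requirement that molecules on the same bead carry different unique molecular tags while sharing the first tag.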
  • step (4) comprises the following steps:
  • the first nucleic acid molecule is amplified using the second oligonucleotide molecule and primer C as primers, and the generated extension product is the second nucleic acid molecule;
  • the second oligonucleotide molecule comprises from the 5' end to the 3' end: the consensus sequence P1 or a partial sequence thereof, the second tag sequence, the consensus sequence R1 or a partial sequence thereof; the primer C comprises the consensus sequence O or a partial sequence thereof, or the primer C comprises the consensus sequence T or a partial sequence thereof.
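After amplification with a second oligonucleotide of the form P1, second tag, R1, one would expect the product to read, from its 5' end, P1, second tag, R1, and then whatever followed R1 in the first nucleic acid molecule (here assumed to be the unique molecular tag and then the first tag). The sketch below extracts the tags by locating R1 in a read. The expected layout, all sequences, and all lengths are assumptions for illustration only.

```python
def parse_tags(seq, r1, tag2_len, umi_len, tag1_len):
    """Extract second tag, UMI and first tag around the first occurrence of R1.

    Assumed layout (5' to 3'): ... second_tag | R1 | UMI | first_tag | ...
    Returns None if R1 is missing or the read is too short.
    """
    pos = seq.find(r1)
    if pos < tag2_len:  # also covers pos == -1 (R1 not found)
        return None
    start = pos + len(r1)
    end = start + umi_len + tag1_len
    if end > len(seq):
        return None
    return {
        "second_tag": seq[pos - tag2_len:pos],
        "umi": seq[start:start + umi_len],
        "first_tag": seq[start + umi_len:end],
    }


# Placeholder sequences, not taken from the application.
P1 = "AATGATACGG"
R1 = "CTCTTCCGATCT"
read = P1 + "TTAA" + R1 + "ACGTACGT" + "GGCCAA" + "GGG" + "ATATAT"
parsed = parse_tags(read, R1, tag2_len=4, umi_len=8, tag1_len=6)
```

Anchoring on R1 rather than on fixed offsets keeps the parser independent of the length of P1, at the cost of requiring an exact R1 match in this simplified version.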
  • the nucleic acid molecule to be labeled is genomic DNA
  • the first oligonucleotide molecule is the first oligonucleotide molecule b.
  • the step (2) comprises the following steps:
  • the incubation is performed under conditions that allow the nucleic acid molecule to be broken into nucleic acid fragments by the transposase complex I and the transferred strands are connected to the ends of the nucleic acid fragments (e.g., the 5' ends of the nucleic acid fragments).
  • the extension product is the first nucleic acid molecule; wherein the first oligonucleotide molecule b comprises, from the 5' end to the 3' end: a consensus sequence P1 or a partial sequence thereof and the first tag sequence.
  • step (a) is performed before or after partitioning the cells or cell nuclei into the first discrete partitions.
  • step (4) comprises the following steps:
  • the first nucleic acid molecule is amplified using the second oligonucleotide molecule and primer D as primers, and the generated extension product is the second nucleic acid molecule;
  • the second oligonucleotide molecule comprises from the 5' end to the 3' end: the consensus sequence P2 or a partial sequence thereof, the second tag sequence, the consensus sequence R2 or a partial sequence thereof; and the primer D comprises the consensus sequence P1 or a partial sequence thereof.
  • the nucleic acid molecules to be labeled are mRNA and genomic DNA, and the mRNA and genomic DNA have the same cell source;
  • the first oligonucleotide molecule includes a first oligonucleotide molecule a and a first oligonucleotide molecule b
  • the second oligonucleotide molecule includes a second oligonucleotide molecule a and a second oligonucleotide molecule b;
  • the beads are coupled to a plurality of the first oligonucleotide molecules a and a plurality of the first oligonucleotide molecules b at the same time; and the plurality of the first oligonucleotide molecules a and the plurality of the first oligonucleotide molecules b on the same bead have the same first tag sequence.
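Because oligonucleotide molecules a and b on one bead share the same first tag, mRNA-derived and genomic-DNA-derived data from one cell can in principle be linked by their barcodes. The sketch below pairs per-cell summaries from the two modalities. It assumes that both are keyed by the same (first tag, second tag) pair, meaning that the second tags used for molecules a and b in one second partition are identical or have already been mapped to each other; the application does not spell this out, and the counts are made up.

```python
def pair_modalities(rna_counts, atac_counts):
    """Keep only cells seen in both modalities, keyed by (first_tag, second_tag)."""
    shared = sorted(set(rna_counts) & set(atac_counts))
    return {
        cell: {"rna": rna_counts[cell], "atac": atac_counts[cell]}
        for cell in shared
    }


# Hypothetical per-cell totals (e.g., UMI counts and fragment counts).
rna_counts = {("ACGT", "TTAA"): 1200, ("TGCA", "TTAA"): 800}
atac_counts = {("ACGT", "TTAA"): 5400, ("GGGG", "CCCC"): 300}
paired = pair_modalities(rna_counts, atac_counts)
```

Only the cell present in both dictionaries is retained, which is the usual starting point for joint transcriptome and chromatin accessibility analysis.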
  • the step (2) comprises the following steps:
  • (A)(i)(a) in the first discrete partition, reverse transcribing the mRNA molecule to be labeled using the first oligonucleotide molecule a to generate a cDNA chain, wherein the cDNA chain comprises a cDNA sequence complementary to the mRNA molecule to be labeled formed by using the first oligonucleotide molecule a as a reverse transcription primer, and a 3' terminal overhang; wherein the first oligonucleotide molecule a comprises, from the 5' end to the 3' end: a consensus sequence R1 or a partial sequence thereof, the first tag sequence, and a poly(T) sequence; and, (b) annealing primer A with the cDNA chain generated in (a), and performing an extension reaction to generate an extension product, wherein the extension product is the first nucleic acid molecule a; wherein the primer A comprises, from the 5' end to the 3' end, a consensus sequence O and a complementary sequence to the 3' terminal
  • in the same first discrete partition as in (A), the first oligonucleotide molecule b is connected to the double-stranded nucleic acid fragment generated in (a), and an extension reaction is performed to generate an extension product, wherein the extension product is the first nucleic acid molecule b; wherein the first oligonucleotide molecule b comprises, from the 5' end to the 3' end: a consensus sequence P1 or a partial sequence thereof and the first tag sequence;
  • step (A) and the step (B) may be performed in any order (for example, (A) first and then (B), (B) first and then (A), or simultaneously).
  • step (A)(ii)(a) is performed before or after assigning said cells or cell nuclei to said first discrete partitions.
  • step (B)(a) is performed before or after partitioning the cells or cell nuclei into the first discrete partitions.
  • the consensus sequence O is identical or partially identical to the consensus sequence T.
  • the first nucleic acid molecule b comprises a sequence derived from a genomic DNA fragment in an open chromatin region in the cell or cell nucleus.
  • the 5' end of the consensus sequence R1 or a partial sequence thereof in the transposase complex I is phosphorylated.
  • the first oligonucleotide molecule a further comprises a unique molecular tag sequence, and the plurality of first oligonucleotide molecules a coupled to the same bead have unique molecular tag sequences different from each other.
  • the unique molecular tag sequence is located at the 3' end of the consensus sequence R1 or a partial sequence thereof.
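Since molecules on the same bead carry different unique molecular tags, reads that share cell, gene, and unique molecular tag can be collapsed into a single original molecule. The following is a minimal sketch of that counting step; the record format and gene names are hypothetical, and assigning reads to genes (alignment) is assumed to have happened elsewhere.

```python
from collections import defaultdict


def count_molecules(records):
    """Count distinct unique molecular tags per (cell_id, gene).

    `records` is an iterable of (cell_id, gene, umi) tuples; PCR duplicates
    share all three fields and are counted once.
    """
    umis = defaultdict(set)
    for cell_id, gene, umi in records:
        umis[(cell_id, gene)].add(umi)
    return {key: len(values) for key, values in umis.items()}


# Hypothetical records: the first two are PCR duplicates of one molecule.
records = [
    ("cell1", "GeneA", "AAAACCCC"),
    ("cell1", "GeneA", "AAAACCCC"),
    ("cell1", "GeneA", "GGGGTTTT"),
    ("cell2", "GeneA", "AAAACCCC"),
]
counts = count_molecules(records)
```

This exact-match version ignores sequencing errors in the tag; tolerant matching would be a separate design decision.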
  • the 3' terminal overhang has a length of at least 1, at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, 1-10, 1-5, or 2-10 nucleotides.
  • the 3' terminal overhang is an overhang of 2-5 cytosine nucleotides (e.g., a CCC overhang).
  • step (4) comprises the following steps:
  • the second oligonucleotide molecule a comprises from the 5' end to the 3' end: the consensus sequence P1 or a partial sequence thereof, the second tag sequence, the consensus sequence R1 or a partial sequence thereof;
  • the primer C comprises the consensus sequence O or a partial sequence thereof, or the primer C comprises the consensus sequence T or a partial sequence thereof;
  • the second oligonucleotide molecule b comprises from the 5' end to the 3' end: the consensus sequence P2 or a partial sequence thereof, the second tag sequence, the consensus sequence R2 or a partial sequence thereof;
  • the primer D comprises the consensus sequence P1 or a partial sequence thereof;
  • step (a) and the step (b) may be performed in any order (for example, (a) first and then (b), (b) first and then (a), or simultaneously).
  • library construction methods of the present invention are respectively:
  • the present application also provides a method for constructing a nucleic acid molecule library, which comprises:
  • nucleic acid molecule library is obtained.
  • in step (2), the second nucleic acid molecules generated in a plurality of the second discrete partitions are recovered and/or combined.
  • the method comprises:
  • sequence of the nucleic acid molecule library is obtained.
  • the cell is a T cell or a B cell.
  • the method further comprises, after step (a) and before step (c), a step of enriching the target nucleic acid molecule;
  • the target nucleic acid molecule is a second nucleic acid molecule comprising: (i) a nucleotide sequence encoding a T cell receptor (TCR) or a B cell receptor (BCR) or a partial sequence thereof (e.g., a V(D)J sequence), and/or (ii) a complementary sequence of (i).
  • in step (c), the second nucleic acid molecule is randomly fragmented by a transposase and a linker sequence is added to its 5' end.
  • the linker sequence comprises the consensus sequence R2 or a partial sequence thereof.
  • the method further comprises step (d):
  • step (d) comprises: amplifying the product of step (c) using primer E and primer F, wherein primer E comprises a consensus sequence P1 and an optional third tag sequence, and primer F comprises from 5' to 3': a consensus sequence P2 or its complementary sequence, an optional fourth tag sequence, a consensus sequence R2 or a partial sequence thereof.
  • the method comprises:
  • "the nucleic acid molecule to be labeled is genomic DNA" of the first aspect above, and,
  • sequence of the nucleic acid molecule library is obtained.
  • the method further comprises step (c):
  • step (c) comprises: amplifying the product of step (b) using primer E' and primer F', wherein primer E' comprises a consensus sequence P1 and an optional third tag sequence, and primer F' comprises a consensus sequence P2 and an optional fourth tag sequence.
  • the method comprises:
  • "the nucleic acid molecules to be labeled are mRNA and genomic DNA from the same cell" of the first aspect above, comprising a plurality of second nucleic acid molecules a and a plurality of second nucleic acid molecules b, and,
  • sequence of the nucleic acid molecule library is obtained.
  • the method further comprises, after step (b), step (c): randomly breaking the second nucleic acid molecule a and adding a linker sequence.
  • in step (c), the second nucleic acid molecule a is randomly fragmented by a transposase and a linker sequence is added to its 5' end.
  • the linker sequence comprises the consensus sequence R2 or a partial sequence thereof.
  • the method before step (c), further comprises specifically enriching the second nucleic acid molecule a from the product of step (b).
  • the method specifically amplifies and enriches the second nucleic acid molecule a by using a primer G carrying a biotin label.
  • the primer G contains the consensus sequence O or a partial sequence thereof, or the primer G contains the consensus sequence T or a partial sequence thereof.
  • the amplification and enrichment further comprises using a primer H, wherein the primer H comprises a consensus sequence P1 or a partial sequence thereof.
  • the method further comprises step (d):
  • step (d) comprises: amplifying the product of step (c) using primer E and primer F, wherein primer E comprises a consensus sequence P1 and an optional third tag sequence, and primer F comprises from 5' to 3': a consensus sequence P2 or its complementary sequence, an optional fourth tag sequence, and a consensus sequence R2.
  • the method further comprises step (d)':
  • step (d)' comprises: amplifying the product of step (b) using primer E' and primer F', wherein primer E' comprises a consensus sequence P1 and an optional third tag sequence, and primer F' comprises a consensus sequence P2 and an optional fourth tag sequence.
  • the present application also provides a method for performing omics sequencing on a cell or a cell nucleus, comprising:
  • the nucleic acid molecule library is sequenced.
  • before sequencing, at least 2, at least 3, at least 4, at least 5, at least 8, at least 10, at least 12, at least 15, at least 18, at least 20, at least 25, 2-5, 2-10, 2-20, 2-30, 2-40 or 2-50 nucleic acid molecule libraries are combined and then sequenced; wherein each nucleic acid molecule library has multiple nucleic acid molecules (i.e., amplification products), and the multiple nucleic acid molecules in the same library have the same third tag sequence or the same fourth tag sequence; and nucleic acid molecules derived from different libraries have third tag sequences or fourth tag sequences that differ from each other.
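When several libraries are pooled and sequenced together, the third (or fourth) tag sequence is what lets reads be assigned back to their library of origin. A minimal sketch of that assignment follows; the tag sequences, library names, and read records are hypothetical, and only exact tag matches are handled.

```python
def demultiplex(reads, library_by_tag):
    """Assign reads to libraries by their third tag sequence.

    `reads` is an iterable of (read_id, third_tag); `library_by_tag` maps each
    tag to a library name. Returns (reads per library, unassigned read IDs).
    """
    by_library = {name: [] for name in library_by_tag.values()}
    unassigned = []
    for read_id, tag in reads:
        name = library_by_tag.get(tag)
        if name is None:
            unassigned.append(read_id)
        else:
            by_library[name].append(read_id)
    return by_library, unassigned


# Hypothetical third tags for two pooled libraries.
library_by_tag = {"ATCACG": "library_1", "CGATGT": "library_2"}
reads = [("r1", "ATCACG"), ("r2", "CGATGT"), ("r3", "ATCACG"), ("r4", "NNNNNN")]
by_library, unassigned = demultiplex(reads, library_by_tag)
```

Reads whose tag is not in the lookup table are kept aside rather than guessed, which avoids misassigning cells between libraries.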
  • the present application also provides a nucleic acid molecule library, which is constructed by the method described in any one of the second aspects above.
  • the present application also provides a reagent composition having characteristics selected from I, II and III:
  • the reagent composition comprises a second oligonucleotide molecule a, the sequence of which comprises, from the 5' end to the 3' end, a consensus sequence P1 or a partial sequence thereof, a second tag sequence, and a consensus sequence R1 or a partial sequence thereof;
  • the reagent composition further comprises one or more selected from the following:
  • the plurality of first oligonucleotide molecules a on the same bead have the same first tag sequence, and the first oligonucleotide molecules a on different beads have first tag sequences different from each other;
  • the first oligonucleotide molecule a comprises, from the 5' end to the 3' end: (i) a consensus sequence R1 or a partial sequence thereof, the first tag sequence, and a poly(T) sequence; or (ii) a consensus sequence R1 or a partial sequence thereof, the first tag sequence, and a complementary sequence to the 3' end overhang of the cDNA;
  • the first oligonucleotide molecule a further comprises a unique molecular tag sequence, and the plurality of first oligonucleotide molecules a coupled to the same bead have unique molecular tag sequences that are different from each other; in some embodiments, the unique molecular tag sequence is located at the 3' end of the consensus sequence R1 or a partial sequence thereof;
  • primer A comprises, from the 5' end to the 3' end, a consensus sequence O and a complementary sequence of the cDNA 3' end overhang
  • primer B comprises, from the 5' end to the 3' end, a consensus sequence T or a partial sequence thereof and a poly(T) sequence
  • the consensus sequence O is identical or partially identical to the consensus sequence T
  • primer C comprises the consensus sequence O or a partial sequence thereof, or the primer C comprises the consensus sequence T or a partial sequence thereof; in certain embodiments, the consensus sequence O is identical or partially identical to the consensus sequence T;
  • primer E and/or primer F, wherein the primer E comprises a consensus sequence P1 and an optional third tag sequence, and the primer F comprises from 5' to 3': a consensus sequence P2 or a complementary sequence thereof, an optional fourth tag sequence, a consensus sequence R2 or a partial sequence thereof;
  • a transposase complex II, wherein the transposase complex II contains a transposase and a transposition sequence that the transposase can recognize and bind to, and can cut or break a double-stranded nucleic acid; and the transposition sequence comprises a transferred strand and a non-transferred strand; wherein the transferred strand comprises a linker sequence; in certain embodiments, the linker sequence comprises a consensus sequence R2 or a partial sequence thereof;
  • the cDNA 3' terminal overhang has a length of at least 1, at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, 1-10, 1-5 or 2-10 nucleotides. In some embodiments, the cDNA 3' terminal overhang is an overhang of 2-5 cytosine nucleotides (e.g., CCC overhang);
  • the reagent composition comprises a second oligonucleotide molecule b, the sequence of which comprises, from the 5' end to the 3' end, a consensus sequence P2 or a partial sequence thereof, a second tag sequence, and a consensus sequence R2 or a partial sequence thereof;
  • the reagent composition further comprises one or more selected from the following:
  • the plurality of first oligonucleotide molecules b on the same bead have the same first tag sequence, and the first oligonucleotide molecules b on different beads have first tag sequences different from each other;
  • the first oligonucleotide molecule b comprises from the 5' end to the 3' end: a consensus sequence P1 or a partial sequence thereof and the first tag sequence;
  • transposase complex I wherein the transposase complex I is as defined in the section "the nucleic acid molecule to be labeled is genomic DNA" in the labeling method described in the first aspect;
  • primer D comprises the consensus sequence P1 or a partial sequence thereof
  • primer E' comprises a consensus sequence P1 and an optional third tag sequence
  • primer F' comprises a consensus sequence P2 and an optional fourth tag sequence
  • the reagent composition comprises a second oligonucleotide molecule a and a second oligonucleotide molecule b; wherein the sequence of the second oligonucleotide molecule a comprises, from the 5' end to the 3' end: a consensus sequence P1 or a partial sequence thereof, a second tag sequence, and a consensus sequence R1 or a partial sequence thereof; the sequence of the second oligonucleotide molecule b comprises, from the 5' end to the 3' end: a consensus sequence P2 or a partial sequence thereof, a second tag sequence, and a consensus sequence R2 or a partial sequence thereof;
  • the reagent composition further comprises one or more selected from the following:
  • the first oligonucleotide molecule a comprises, from the 5' end to the 3' end: (i) a consensus sequence R1 or a partial sequence thereof, the first tag sequence, and a poly(T) sequence; or, (ii) a consensus sequence R1 or a partial sequence thereof, the first tag sequence, and a complementary sequence to the 3' end overhang of the cDNA; and/or, the first oligonucleotide molecule b comprises, from the 5' end to the 3' end: a consensus sequence P1 or a partial sequence thereof and the first tag sequence;
  • the first oligonucleotide molecule a further comprises a unique molecular tag sequence, and the plurality of first oligonucleotide molecules a coupled to the same bead have unique molecular tag sequences that are different from each other; in some embodiments, the unique molecular tag sequence is located at the 3' end of the consensus sequence R1 or a partial sequence thereof;
  • primer A comprises a consensus sequence O and a complementary sequence of the cDNA 3’ end overhang from the 5’ end to the 3’ end
  • primer B comprises a consensus sequence T or a partial sequence thereof and a poly(T) sequence from the 5’ end to the 3’ end
  • the consensus sequence O is identical or partially identical to the consensus sequence T
  • transposase complex I wherein the transposase complex I is as defined in the section "the nucleic acid molecule to be labeled is genomic DNA" in the labeling method described in the first aspect;
  • primer C comprises the consensus sequence O or a partial sequence thereof, or the primer C comprises the consensus sequence T or a partial sequence thereof; in certain embodiments, the consensus sequence O is identical or partially identical to the consensus sequence T;
  • primer D comprises the consensus sequence P1 or a partial sequence thereof
  • Primer E and/or primer F wherein the primer E comprises a consensus sequence P1 and an optional third tag sequence, and the primer F comprises from 5' to 3': a consensus sequence P2 or a complementary sequence thereof, an optional fourth tag sequence, a consensus sequence R2 or a partial sequence thereof;
  • Primer E' and/or primer F' wherein the primer E' comprises a consensus sequence P1 and an optional third tag sequence, and the primer F' comprises a consensus sequence P2 and an optional fourth tag sequence;
  • primer G and/or primer H, wherein primer G carries a biotin label and contains a consensus sequence O or a partial sequence thereof or a consensus sequence T or a partial sequence thereof, and primer H comprises a consensus sequence P1 or a partial sequence thereof; in certain embodiments, the consensus sequence O is identical or partially identical to the consensus sequence T;
  • transposase complex II contains a transposase and a transposition sequence that the transposase can recognize and bind to, and can cut or break a double-stranded nucleic acid; and the transposition sequence comprises a transferred strand and a non-transferred strand; wherein the transferred strand comprises a linker sequence; in certain embodiments, the linker sequence comprises a consensus sequence R2 or a partial sequence thereof;
  • the cDNA 3' terminal overhang has a length of at least 1, at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, 1-10, 1-5, or 2-10 nucleotides. In certain embodiments, the cDNA 3' terminal overhang is an overhang of 2-5 cytosine nucleotides (e.g., a CCC overhang).
  • the reagent composition further comprises a reagent for fixing and/or permeabilizing cells or cell nuclei.
  • the reagent composition further comprises methanol, formaldehyde and/or paraformaldehyde.
  • the reagent composition further comprises Triton X-100, digitonin, IGEPAL (e.g., CA-630), and/or Tween-20.
  • the reagent composition further comprises: an RNase inhibitor, mineral oil, a buffer, dNTPs, one or more nucleic acid polymerases (e.g., DNA polymerases; e.g., DNA polymerases having strand displacement activity and/or high fidelity), reagents for recovering or purifying nucleic acids (e.g., magnetic beads), a well plate, or any combination thereof.
  • the reagent composition further comprises reagents for sequencing, such as reagents for next-generation sequencing.
  • the present application also provides a kit, which comprises: a multi-reaction system containing a plurality of oligonucleotide molecules, each of which contains a specific tag sequence;
  • the oligonucleotide molecules in each reaction system have the same tag sequence, and the oligonucleotide molecules in different reaction systems have different tag sequences.
  • the oligonucleotide molecule further comprises a consensus sequence P1 or a partial sequence thereof, or the oligonucleotide molecule further comprises a consensus sequence P2 or a partial sequence thereof.
  • the multi-reaction system comprises at least 2 (e.g., at least 3, at least 4, at least 5, at least 8, at least 10, at least 12, at least 20, at least 24, at least 50, at least 96, at least 100, at least 200, at least 384, at least 400, 2-5, 2-10, 2-50, 2-80, 2-100, 2-500, 2-10^3, 2-10^4, 2-10^5, 2-10^6) reaction systems containing oligonucleotides;
  • the multi-reaction system is preferably a multi-well plate, and the oligonucleotides can be free or fixed in the reaction system.
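The kit layout described above (one tag per reaction system, different tags in different reaction systems) can be sketched as follows. This is only an illustration under stated assumptions: the 96-well geometry, the 6-nt tag length, and the minimum Hamming distance of 2 are hypothetical choices that do not come from the application, and the greedy selection is a generic technique rather than the applicant's design procedure.

```python
from itertools import product


def hamming(a, b):
    """Number of positions at which two equal-length tags differ."""
    return sum(x != y for x, y in zip(a, b))


def pick_tags(n_tags, length=6, min_dist=2):
    """Greedily pick n_tags DNA tags whose pairwise Hamming distance is >= min_dist.

    length and min_dist are hypothetical parameters chosen for illustration.
    """
    chosen = []
    for cand in product("ACGT", repeat=length):
        tag = "".join(cand)
        if all(hamming(tag, c) >= min_dist for c in chosen):
            chosen.append(tag)
            if len(chosen) == n_tags:
                return chosen
    raise ValueError("tag length too short for the requested number of tags")


def plate_layout(n_rows=8, n_cols=12, length=6, min_dist=2):
    """Map each well of a multi-well plate (e.g. A1..H12) to its own tag."""
    wells = [f"{chr(ord('A') + r)}{c + 1}" for r in range(n_rows) for c in range(n_cols)]
    return dict(zip(wells, pick_tags(len(wells), length, min_dist)))


if __name__ == "__main__":
    layout = plate_layout()
    print(layout["A1"], layout["H12"], len(layout))
```

Requiring a minimum distance between tags means a single sequencing error cannot turn one well's tag into another's; whether such a constraint is used in practice is not stated in the source.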
  • the present application provides a method for fixing and permeabilizing cells, comprising the following steps:
  • cells are fixed and permeabilized by treating the cells with 80% methanol for 10 min at -20°C.
  • the cell is a naturally occurring cell or a recombinant cell, or a mixture of both.
  • the recombinant cell refers to a cell comprising a modified (e.g., artificially modified) nucleic acid molecule (e.g., gene) and/or its product (e.g., protein, RNA), wherein the modification includes, but is not limited to, increasing or decreasing the copy number of endogenous genes in the cell, mutating endogenous genes in the cell, upregulating or downregulating or silencing the expression of endogenous gene products in the cell, introducing exogenous nucleic acid molecules into the cell (the exogenous nucleic acid molecules are integrated into the genome of the cell or exist in a non-integrated form), etc.
  • the present application provides a method for fixing and permeabilizing a cell nucleus, comprising the following steps:
  • the permeabilization solution further comprises IGEPAL (e.g., CA-630) and/or Tween-20.
  • the concentration of digitonin in the permeabilization solution is 0.0005%-0.05% (e.g., 0.0008%-0.005%, 0.0005%-0.002%, 0.0008%-0.002%, or 0.001%).
  • the concentration of IGEPAL (e.g., CA-630) in the permeabilization solution is 0.005%-0.1% (e.g., 0.005%-0.05%, 0.008%-0.05%, 0.005%-0.02%, 0.008%-0.02%, or 0.01%).
  • the concentration of Tween-20 in the permeabilization solution is 0.005%-0.1% (e.g., 0.005%-0.05%, 0.008%-0.05%, 0.005%-0.02%, 0.008%-0.02%, or 0.01%).
  • the cell nuclei are fixed by treating the cell nuclei with 1% formaldehyde for 10 min at room temperature, and after fixation, the cell nuclei are permeabilized by treating the cell nuclei with a permeabilization solution containing 0.001% digitonin, 0.01% IGEPAL (e.g., CA-630) and 0.01% Tween-20 at 0°C to 4°C for 2-4 min (e.g., 3 min).
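As a worked example of preparing a permeabilization solution at the stated final concentrations, the dilution relation C1·V1 = C2·V2 can be applied per component. The stock concentrations used below (1% digitonin, 10% IGEPAL CA-630, 10% Tween-20) and the 1000 µL final volume are assumptions made for illustration only; the application does not specify stock solutions.

```python
def stock_volumes_ul(final_volume_ul, targets_pct, stocks_pct):
    """Volume of each stock (in µL) needed to reach the target final concentration.

    Uses C1 * V1 = C2 * V2 for each component; the remainder is buffer.
    targets_pct and stocks_pct map component name -> concentration in percent.
    """
    volumes = {}
    for name, target in targets_pct.items():
        stock = stocks_pct[name]
        if target > stock:
            raise ValueError(f"{name}: target exceeds stock concentration")
        volumes[name] = final_volume_ul * target / stock
    volumes["buffer"] = final_volume_ul - sum(volumes.values())
    return volumes


if __name__ == "__main__":
    # Final concentrations taken from the text; stock concentrations are hypothetical.
    targets = {"digitonin": 0.001, "IGEPAL CA-630": 0.01, "Tween-20": 0.01}
    stocks = {"digitonin": 1.0, "IGEPAL CA-630": 10.0, "Tween-20": 10.0}
    for name, vol in stock_volumes_ul(1000.0, targets, stocks).items():
        print(f"{name}: {vol:.2f} µL")
```

With these assumed stocks, each detergent contributes about 1 µL per 1000 µL of solution, so in practice an intermediate dilution may be more convenient to pipette.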
  • the cell nucleus is a cell nucleus derived from a naturally occurring cell or a cell nucleus derived from a recombinant cell, or a mixture of both.
  • the recombinant cell refers to a cell comprising a modified (e.g., artificially modified) nucleic acid molecule (e.g., gene) and/or its product (e.g., protein, RNA), wherein the modification includes, but is not limited to, increasing or decreasing the copy number of the endogenous gene of the cell, mutating the endogenous gene of the cell, upregulating, downregulating or silencing the expression of the endogenous gene product of the cell, introducing exogenous nucleic acid molecules into the cell (the exogenous nucleic acid molecules are integrated into the genome of the cell or exist in a non-integrated form), etc.
  • the present application provides a device for labeling nucleic acid molecules from cells or cell nuclei and/or constructing a nucleic acid molecule library, the device comprising:
  • a processor coupled to the memory, the processor being configured to execute the method described in any one of the first aspects and/or the method described in any one of the second aspects based on instructions stored in the memory.
  • the present application provides a computer-readable storage medium having a computer program stored thereon, characterized in that when the program is executed by a processor, it implements any method of the above-mentioned first aspect and/or any method of the above-mentioned second aspect.
  • the present application also provides the use of any one of the methods of the first aspect, the reagent composition of the fifth aspect, the kit of the sixth aspect, the method of the seventh aspect, the method of the eighth aspect, the device of the ninth aspect, or the computer-readable storage medium of the tenth aspect for constructing a nucleic acid molecule library or for performing transcriptome sequencing; or the use of any one of the methods of the second aspect for performing transcriptome sequencing.
  • the present application provides the following embodiments:
  • Embodiment 1 A method for labeling a nucleic acid molecule from a cell or a cell nucleus, comprising the following steps:
  • the plurality of first oligonucleotide molecules on the same bead have the same first tag sequence, and the first oligonucleotide molecules on different beads have first tag sequences different from each other;
  • the cell is a naturally occurring cell or a recombinant cell, or a mixture of the two; the cell nucleus is a nucleus derived from a naturally occurring cell or a nucleus derived from a recombinant cell, or a mixture of both.
  • Embodiment 2 The method of embodiment 1, wherein the method uses methanol to fix and permeabilize the cells, or uses formaldehyde or paraformaldehyde and Triton X-100 to fix and permeabilize the cells.
  • Embodiment 3 The method of embodiment 1, wherein the method uses formaldehyde or paraformaldehyde and digitonin to fix and permeabilize the cell nucleus;
  • the method further comprises administering IGEPAL (e.g., CA-630) and/or Tween-20 to permeabilize the cell nuclei.
  • Embodiment 4 The method of any one of embodiments 1-3, comprising one or more selected from the following:
  • step (1) providing at least 2 cells or cell nuclei; and/or providing at least 2 beads;
  • the first discrete partition is a discrete micropore or a discrete microdroplet
  • the beads are coupled to at least two of the first oligonucleotide molecules
  • the bead is capable of releasing the first oligonucleotide molecule spontaneously or upon exposure to one or more stimuli;
  • the beads are gel beads
  • step (3) the cells or cell nuclei are distributed into at least two of the second discrete partitions, wherein each of the second discrete partitions contains at least one cell or cell nucleus;
  • the second discrete partitions are discrete wells of a multi-well plate
  • step (3) and before step (4) the method further includes the steps of lysing cells and/or purifying the first nucleic acid molecule.
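Because a first tag is assigned in the first discrete partition and a second tag in the second discrete partition, a cell can be identified by the pair of tags it carries. The sketch below estimates how often two cells would be expected to share the same pair. It rests on a simplifying assumption that is not stated in the application: every cell receives its first tag and its second tag independently and uniformly at random. The function name and the example numbers are hypothetical.

```python
def expected_collision_fraction(n_cells, n_first_tags, n_second_tags):
    """Expected fraction of cells sharing their (first tag, second tag) pair
    with at least one other cell, assuming uniform, independent assignment.

    For one cell, each of the other n_cells - 1 cells misses its tag pair
    with probability 1 - 1/combos, so the collision probability is
    1 - (1 - 1/combos) ** (n_cells - 1).
    """
    combos = n_first_tags * n_second_tags
    if n_cells < 1 or combos < 1:
        raise ValueError("need at least one cell and one tag combination")
    return 1.0 - (1.0 - 1.0 / combos) ** (n_cells - 1)


if __name__ == "__main__":
    # Hypothetical numbers: 100,000 cells, 100,000 bead tags, 96 wells.
    print(f"{expected_collision_fraction(100_000, 100_000, 96):.4%}")
```

The estimate shows why adding the second tag matters under this model: multiplying the number of available combinations by the number of wells reduces the expected collision fraction roughly in proportion.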
  • Embodiment 5 The method of any one of Embodiments 1-4, wherein the nucleic acid molecule to be labeled is mRNA, and the first oligonucleotide molecule is the first oligonucleotide molecule a.
  • Embodiment 6 The method of embodiment 5, wherein step (2) comprises the following steps:
  • the nucleic acid molecule to be labeled is reverse transcribed with primer B to generate a cDNA chain, wherein the cDNA chain comprises a cDNA sequence complementary to the nucleic acid molecule to be labeled formed by using primer B as a reverse transcription primer, and a 3' terminal overhang; wherein the primer B comprises a consensus sequence T or a partial sequence thereof and a poly(T) sequence from the 5' end to the 3'end; and, (b) in the first discrete partition, the first oligonucleotide molecule a is annealed with the cDNA chain generated in (a), and an extension reaction is performed to generate an extension product, wherein the extension product is the first nucleic acid molecule; wherein the first oligonucleotide molecule a comprises, from the 5' end to the 3' end: a consensus sequence R1 or a partial sequence thereof, the first tag sequence and a complementary sequence to the 3' terminal overhang
  • step (ii)(a) is performed before or after assigning said cells or cell nuclei to said first discrete partitions.
  • the first oligonucleotide molecule a further comprises a unique molecular tag sequence, and the multiple first oligonucleotide molecules a coupled to the same bead have unique molecular tag sequences that are different from each other; in some embodiments, the unique molecular tag sequence is located at the 3' end of the consensus sequence R1 or a partial sequence thereof.
  • the consensus sequence O is identical or partially identical to the consensus sequence T.
  • the 3' terminal overhang has a length of at least 1, at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, 1-10, 1-5 or 2-10 nucleotides; in certain embodiments, the 3' terminal overhang is an overhang of 2-5 cytosine nucleotides (e.g., a CCC overhang).
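The layout of the first oligonucleotide molecule a described in this embodiment (consensus sequence R1, an optional unique molecular tag at the 3' end of R1, the first tag sequence, and a sequence complementary to the cDNA 3' terminal overhang) can be sketched as below. The R1 placeholder, the tag, and the 8-nt unique molecular tag length are hypothetical values chosen for illustration; none of them are sequences disclosed in the application.

```python
import random

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def bead_oligos(r1, first_tag, n_oligos, overhang="CCC", umi_len=8, seed=0):
    """Build n_oligos first oligonucleotide molecules a for a single bead.

    All oligos share r1 and first_tag; each carries a different unique
    molecular tag placed directly 3' of r1. The 3' end anneals to the cDNA
    overhang (GGG for a CCC overhang). umi_len and seed are illustrative.
    """
    if n_oligos > 4 ** umi_len:
        raise ValueError("not enough distinct unique molecular tags")
    rng = random.Random(seed)
    umis = set()
    while len(umis) < n_oligos:
        umis.add("".join(rng.choice("ACGT") for _ in range(umi_len)))
    anneal = reverse_complement(overhang)
    return [r1 + umi + first_tag + anneal for umi in sorted(umis)]


if __name__ == "__main__":
    # "ACGTACGTAC" stands in for consensus sequence R1; it is a placeholder.
    for oligo in bead_oligos("ACGTACGTAC", "TTGCA", 3):
        print(oligo)
```

Fixing the segment order and lengths in this way lets downstream parsing recover the unique molecular tag and the first tag by position alone, which is one common design choice rather than a requirement of the method.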
  • Embodiment 7 The method of embodiment 5 or 6, wherein step (4) comprises the following steps:
  • the first nucleic acid molecule is amplified using the second oligonucleotide molecule and primer C as primers, and the generated extension product is the second nucleic acid molecule;
  • the second oligonucleotide molecule comprises from the 5' end to the 3' end: the consensus sequence P1 or a partial sequence thereof, the second tag sequence, the consensus sequence R1 or a partial sequence thereof; the primer C comprises the consensus sequence O or a partial sequence thereof, or the primer C comprises the consensus sequence T or a partial sequence thereof.
  • Embodiment 8 The method of any one of Embodiments 1-4, wherein the nucleic acid molecule to be labeled is genomic DNA, and the first oligonucleotide molecule is the first oligonucleotide molecule b.
  • Embodiment 9 The method of embodiment 8, wherein step (2) comprises the following steps:
  • the incubation is performed under conditions that allow the nucleic acid molecule to be broken into nucleic acid fragments by the transposase complex I and the transferred strands are connected to the ends of the nucleic acid fragments (e.g., the 5' ends of the nucleic acid fragments).
  • the first oligonucleotide molecule b is connected to the double-stranded nucleic acid fragment generated in (a) (for example, by using a nuclease), and an extension reaction is performed to generate an extension product, which is the first nucleic acid molecule; wherein the first oligonucleotide molecule b comprises, from the 5' end to the 3' end: a consensus sequence P1 or a partial sequence thereof and the first tag sequence.
  • the first nucleic acid molecule comprises a sequence derived from a genomic DNA fragment in an open chromatin region in the cell or cell nucleus.
  • step (a) is performed before or after partitioning the cells or cell nuclei into the first discrete partitions.
  • the 5' end of the consensus sequence R1 or a partial sequence thereof in the transposase complex I is phosphorylated.
  • Embodiment 10 The method of embodiment 8 or 9, wherein step (4) comprises the following steps:
  • the first nucleic acid molecule is amplified using the second oligonucleotide molecule and primer D as primers, and the generated extension product is the second nucleic acid molecule;
  • the second oligonucleotide molecule comprises from the 5' end to the 3' end: the consensus sequence P2 or a partial sequence thereof, the second tag sequence, the consensus sequence R2 or a partial sequence thereof; and the primer D comprises the consensus sequence P1 or a partial sequence thereof.
  • Embodiment 11 The method according to any one of embodiments 1 to 4, wherein the nucleic acid molecules to be labeled are mRNA and genomic DNA, and the mRNA and genomic DNA have the same cell source;
  • the first oligonucleotide molecule includes a first oligonucleotide molecule a and a first oligonucleotide molecule b
  • the second oligonucleotide molecule includes a second oligonucleotide molecule a and a second oligonucleotide molecule b;
  • the beads are coupled to a plurality of the first oligonucleotide molecules a and a plurality of the first oligonucleotide molecules b at the same time; and the plurality of the first oligonucleotide molecules a and the plurality of the first oligonucleotide molecules b on the same bead have the same first tag sequence.
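Since the first oligonucleotide molecules a and b on the same bead carry the same first tag sequence, mRNA-derived and genomic-DNA-derived reads from one cell can in principle be matched by their tags. The sketch below groups already-parsed reads by tag pair. It assumes that each read has been reduced to a (first tag, second tag, modality) tuple and that the second oligonucleotide molecules a and b within one second discrete partition share a second tag; the tuple format and the modality labels are hypothetical.

```python
from collections import defaultdict


def link_modalities(reads):
    """Count mRNA- and gDNA-derived reads per (first tag, second tag) pair.

    reads: iterable of (first_tag, second_tag, modality) tuples, where
    modality is "mRNA" or "gDNA" (labels chosen for this sketch).
    """
    cells = defaultdict(lambda: {"mRNA": 0, "gDNA": 0})
    for first_tag, second_tag, modality in reads:
        if modality not in ("mRNA", "gDNA"):
            raise ValueError(f"unknown modality: {modality}")
        cells[(first_tag, second_tag)][modality] += 1
    return dict(cells)


if __name__ == "__main__":
    example = [
        ("AAGT", "CC", "mRNA"),
        ("AAGT", "CC", "gDNA"),
        ("AAGT", "CC", "mRNA"),
        ("TTCA", "CC", "gDNA"),
    ]
    for barcode, counts in link_modalities(example).items():
        print(barcode, counts)
```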
  • Embodiment 12 The method of embodiment 11, wherein step (2) comprises the following steps:
  • (A)(i)(a) in the first discrete partition, reverse transcribing the mRNA molecule to be labeled using the first oligonucleotide molecule a to generate a cDNA chain, wherein the cDNA chain comprises a cDNA sequence complementary to the mRNA molecule to be labeled formed by using the first oligonucleotide molecule a as a reverse transcription primer, and a 3' terminal overhang; wherein the first oligonucleotide molecule a comprises, from the 5' end to the 3' end: a consensus sequence R1 or a partial sequence thereof, the first tag sequence, and a poly(T) sequence; and, (b) annealing primer A with the cDNA chain generated in (a), and performing an extension reaction to generate an extension product, wherein the extension product is the first nucleic acid molecule a; wherein the primer A comprises, from the 5' end to the 3' end, a consensus sequence O and a complementary sequence to the 3' terminal
  • (B)(a) incubating the DNA molecule to be labeled with transposase complex I; wherein the transposase complex I is as defined in Embodiment 9; and the incubation is performed under conditions that allow the DNA molecule to be broken into nucleic acid fragments by the transposase complex I and the transferred strand to be connected to the end of the nucleic acid fragment (e.g., the 5' end of the nucleic acid fragment); thereby generating double-stranded nucleic acid fragments whose 5' ends contain a consensus sequence R2 or a partial sequence thereof and a consensus sequence R1 or a partial sequence thereof, respectively; and,
  • the first oligonucleotide molecule b is connected to the double-stranded nucleic acid fragments generated in (a), and an extension reaction is performed to generate an extension product, which is the first nucleic acid molecule b; wherein the first oligonucleotide molecule b comprises, from the 5' end to the 3' end: a consensus sequence P1 or a partial sequence thereof and the first tag sequence;
  • step (A) and the step (B) may be performed in any order (for example, (A) first and then (B), (B) first and then (A), or simultaneously).
  • step (A)(ii)(a) is performed before or after assigning said cells or cell nuclei to said first discrete partitions.
  • step (B)(a) is performed before or after partitioning the cells or cell nuclei into the first discrete partitions.
  • the consensus sequence O is identical or partially identical to the consensus sequence T.
  • the first nucleic acid molecule b comprises a sequence derived from a genomic DNA fragment in an open chromatin region in the cell or cell nucleus.
  • the 5' end of the consensus sequence R1 or a partial sequence thereof in the transposase complex I is phosphorylated.
  • the first oligonucleotide molecule a further comprises a unique molecular tag sequence, and the multiple first oligonucleotide molecules a coupled to the same bead have unique molecular tag sequences that are different from each other; in some embodiments, the unique molecular tag sequence is located at the 3' end of the consensus sequence R1 or a partial sequence thereof.
  • the 3' terminal overhang has a length of at least 1, at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, 1-10, 1-5 or 2-10 nucleotides; in certain embodiments, the 3' terminal overhang is an overhang of 2-5 cytosine nucleotides (e.g., a CCC overhang).
  • Embodiment 13 The method of embodiment 11 or 12, wherein step (4) comprises the following steps:
  • the second oligonucleotide molecule a comprises from the 5' end to the 3' end: the consensus sequence P1 or a partial sequence thereof, the second tag sequence, the consensus sequence R1 or a partial sequence thereof;
  • the primer C comprises the consensus sequence O or a partial sequence thereof, or the primer C comprises the consensus sequence T or a partial sequence thereof;
  • the second oligonucleotide molecule b comprises from the 5' end to the 3' end: the consensus sequence P2 or a partial sequence thereof, the second tag sequence, the consensus sequence R2 or a partial sequence thereof;
  • the primer D comprises the consensus sequence P1 or a partial sequence thereof;
  • step (a) and the step (b) may be performed in any order (for example, (a) first and then (b), (b) first and then (a), or simultaneously).
  • Embodiment 14 A method for constructing a nucleic acid molecule library, comprising:
  • nucleic acid molecule library is obtained.
  • step (2) the second nucleic acid molecules generated in a plurality of the second discrete partitions are recovered and/or combined.
  • Embodiment 15 The method of embodiment 14, comprising:
  • sequence of the nucleic acid molecule library is obtained.
  • Embodiment 16 The method of Embodiment 15, wherein the cell is a T cell or a B cell.
  • the method further comprises, after step (a) and before step (c), a step of enriching the target nucleic acid molecule;
  • the target nucleic acid molecule is a second nucleic acid molecule comprising: (i) a nucleotide sequence encoding a T cell receptor (TCR) or a B cell receptor (BCR) or a partial sequence thereof (e.g., a V(D)J sequence), and/or (ii) a complementary sequence of (i).
  • Embodiment 17 The method of any one of Embodiments 14-16, wherein, in step (c), the second nucleic acid molecule is randomly fragmented by a transposase and a linker sequence is added to its 5' end.
  • the linker sequence comprises the consensus sequence R2 or a partial sequence thereof.
  • Embodiment 18 The method of any one of Embodiments 14-17, wherein the method further comprises step (d):
  • step (d) comprises: amplifying the product of step (c) using primer E and primer F, wherein primer E comprises a consensus sequence P1 and an optional third tag sequence, and primer F comprises from 5' to 3': a consensus sequence P2 or its complementary sequence, an optional fourth tag sequence, a consensus sequence R2 or a partial sequence thereof.
  • Embodiment 19 The method of embodiment 14, comprising:
  • sequence of the nucleic acid molecule library is obtained.
  • Embodiment 20 The method of Embodiment 19, wherein the method further comprises step (c):
  • step (c) comprises: amplifying the product of step (b) using primer E’ and primer F’, wherein primer E’ comprises a consensus sequence P1 and an optional third tag sequence, and primer F’ comprises a consensus sequence P2 and an optional fourth tag sequence.
  • Embodiment 21 The method of embodiment 14, comprising:
  • sequence of the nucleic acid molecule library is obtained.
  • Embodiment 22 The method of embodiment 21, wherein, after step (b), the method further comprises step (c): randomly breaking the second nucleic acid molecule a and adding a linker sequence.
  • step (c) the second nucleic acid molecule a is randomly fragmented by a transposase and a linker sequence is added to its 5' end.
  • the linker sequence comprises the consensus sequence R2 or a partial sequence thereof.
  • Embodiment 23 The method of embodiment 22, wherein, before step (c), the method further comprises specifically enriching the second nucleic acid molecule a from the product of step (b).
  • the method specifically amplifies and enriches the second nucleic acid molecule a by using a primer G carrying a biotin label.
  • the primer G contains the consensus sequence O or a partial sequence thereof, or the primer G contains the consensus sequence T or a partial sequence thereof.
  • the amplification and enrichment further comprises using a primer H, wherein the primer H comprises a consensus sequence P1 or a partial sequence thereof.
  • Embodiment 24 The method of Embodiment 22 or 23, wherein the method further comprises step (d):
  • step (d) comprises: amplifying the product of step (c) using primer E and primer F, wherein primer E comprises a consensus sequence P1 and an optional third tag sequence, and primer F comprises from 5' to 3': a consensus sequence P2 or its complementary sequence, an optional fourth tag sequence, and a consensus sequence R2.
  • Embodiment 25 The method of any one of Embodiments 21-24, wherein the method further comprises step (d)':
  • step (d)' comprises: amplifying the product of step (b) using primer E' and primer F', wherein primer E' comprises a consensus sequence P1 and an optional third tag sequence, and primer F' comprises a consensus sequence P2 and an optional fourth tag sequence.
  • Embodiment 26 A method for performing omics sequencing on a cell or a cell nucleus, comprising:
  • the nucleic acid molecule library is sequenced.
  • prior to sequencing, at least 2, at least 3, at least 4, at least 5, at least 8, at least 10, at least 12, at least 15, at least 18, at least 20, at least 25, 2-5, 2-10, 2-20, 2-30, 2-40 or 2-50 nucleic acid molecule libraries are combined and then sequenced; wherein each nucleic acid molecule library has multiple nucleic acid molecules (i.e., amplification products), and the multiple nucleic acid molecules in the same library have the same third tag sequence or the same fourth tag sequence; and the nucleic acid molecules derived from different libraries have different third tag sequences or different fourth tag sequences from each other.
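When several libraries are combined before sequencing, the third or fourth tag sequence is what allows each read to be traced back to its library. A minimal sketch of that assignment step is given below. It assumes the tag has already been extracted from each read and that an exact match is required; the tag values, library names, and input format are hypothetical.

```python
def demultiplex(reads, library_by_tag):
    """Assign pooled reads to libraries by exact match on the library tag.

    reads: iterable of (library_tag, sequence) pairs.
    library_by_tag: dict mapping a third or fourth tag sequence to a library name.
    Returns (reads grouped per library, reads whose tag matched no library).
    """
    per_library = {name: [] for name in library_by_tag.values()}
    unassigned = []
    for tag, sequence in reads:
        name = library_by_tag.get(tag)
        if name is None:
            unassigned.append(sequence)
        else:
            per_library[name].append(sequence)
    return per_library, unassigned


if __name__ == "__main__":
    tags = {"ACGTAC": "library_1", "TGCATG": "library_2"}  # hypothetical tags
    pooled = [("ACGTAC", "read_a"), ("TGCATG", "read_b"), ("NNNNNN", "read_c")]
    assigned, leftover = demultiplex(pooled, tags)
    print(assigned, leftover)
```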
  • Embodiment 27 A nucleic acid molecule library constructed by the method described in any one of embodiments 14-25.
  • Embodiment 28 A reagent composition having the characteristics selected from I, II and III:
  • the reagent composition comprises a second oligonucleotide molecule a, the sequence of which comprises, from the 5' end to the 3' end, a consensus sequence P1 or a partial sequence thereof, a second tag sequence, and a consensus sequence R1 or a partial sequence thereof;
  • the reagent composition further comprises one or more selected from the following:
  • the plurality of first oligonucleotide molecules a on the same bead have the same first tag sequence, and the first oligonucleotide molecules a on different beads have first tag sequences different from each other;
  • the first oligonucleotide molecule a comprises, from the 5' end to the 3' end: (i) a consensus sequence R1 or a partial sequence thereof, the first tag sequence, and a poly(T) sequence; or (ii) a consensus sequence R1 or a partial sequence thereof, the first tag sequence, and a complementary sequence to the 3' end overhang of the cDNA;
  • the first oligonucleotide molecule a further comprises a unique molecular tag sequence, and the plurality of first oligonucleotide molecules a coupled to the same bead have unique molecular tag sequences that are different from each other; in some embodiments, the unique molecular tag sequence is located at the 3' end of the consensus sequence R1 or a partial sequence thereof;
  • primer A comprises a consensus sequence O and a complementary sequence of the cDNA 3’ end overhang from the 5’ end to the 3’ end
  • primer B comprises a consensus sequence T or a partial sequence thereof and a poly(T) sequence from the 5’ end to the 3’ end
  • the consensus sequence O is identical or partially identical to the consensus sequence T
  • primer C comprises the consensus sequence O or a partial sequence thereof, or the primer C comprises the consensus sequence T or a partial sequence thereof; in certain embodiments, the consensus sequence O is identical or partially identical to the consensus sequence T;
  • Primer E and/or primer F wherein the primer E comprises a consensus sequence P1 and an optional third tag sequence, and the primer F comprises from 5' to 3': a consensus sequence P2 or a complementary sequence thereof, an optional fourth tag sequence, a consensus sequence R2 or a partial sequence thereof;
  • a transposase complex II, wherein the transposase complex II contains a transposase and a transposition sequence that the transposase can recognize and bind to, and can cut or break a double-stranded nucleic acid; and the transposition sequence comprises a transferred strand and a non-transferred strand; wherein the transferred strand comprises a linker sequence; in certain embodiments, the linker sequence comprises a consensus sequence R2 or a partial sequence thereof;
  • the cDNA 3' terminal overhang has a length of at least 1, at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, 1-10, 1-5 or 2-10 nucleotides; in certain embodiments, the cDNA 3' terminal overhang is an overhang of 2-5 cytosine nucleotides (e.g., a CCC overhang);
  • the reagent composition comprises a second oligonucleotide molecule b, the sequence of which comprises, from the 5' end to the 3' end, a consensus sequence P2 or a partial sequence thereof, a second tag sequence, and a consensus sequence R2 or a partial sequence thereof;
  • the reagent composition further comprises one or more selected from the following:
  • the plurality of first oligonucleotide molecules b on the same bead have the same first tag sequence, and the first oligonucleotide molecules b on different beads have first tag sequences different from each other;
  • the first oligonucleotide molecule b comprises from the 5' end to the 3' end: a consensus sequence P1 or a partial sequence thereof and the first tag sequence;
  • transposase complex I wherein the transposase complex I is as defined in embodiment 9;
  • primer D comprises the consensus sequence P1 or a partial sequence thereof
  • primer E' comprises a consensus sequence P1 and an optional third tag sequence
  • primer F' comprises a consensus sequence P2 and an optional fourth tag sequence
  • the reagent composition comprises a second oligonucleotide molecule a and a second oligonucleotide molecule b; wherein the sequence of the second oligonucleotide molecule a comprises, from the 5' end to the 3' end: a consensus sequence P1 or a partial sequence thereof, a second tag sequence, and a consensus sequence R1 or a partial sequence thereof; the sequence of the second oligonucleotide molecule b comprises, from the 5' end to the 3' end: a consensus sequence P2 or a partial sequence thereof, a second tag sequence, and a consensus sequence R2 or a partial sequence thereof;
  • the reagent composition further comprises one or more selected from the following:
  • the first oligonucleotide molecule a comprises, from the 5' end to the 3' end: (i) a consensus sequence R1 or a partial sequence thereof, the first tag sequence, and a poly(T) sequence; or, (ii) a consensus sequence R1 or a partial sequence thereof, the first tag sequence, and a complementary sequence to the 3' end overhang of the cDNA; and/or, the first oligonucleotide molecule b comprises, from the 5' end to the 3' end: a consensus sequence P1 or a partial sequence thereof and the first tag sequence;
  • the first oligonucleotide molecule a further comprises a unique molecular tag sequence, and the plurality of first oligonucleotide molecules a coupled to the same bead have unique molecular tag sequences that are different from each other; in some embodiments, the unique molecular tag sequence is located at the 3' end of the consensus sequence R1 or a partial sequence thereof;
  • primer A comprises a consensus sequence O and a complementary sequence of the cDNA 3’ end overhang from the 5’ end to the 3’ end
  • primer B comprises a consensus sequence T or a partial sequence thereof and a poly(T) sequence from the 5’ end to the 3’ end
  • the consensus sequence O is identical or partially identical to the consensus sequence T
  • transposase complex I wherein the transposase complex I is as defined in embodiment 9;
  • primer C comprises the consensus sequence O or a partial sequence thereof, or the primer C comprises the consensus sequence T or a partial sequence thereof; in certain embodiments, the consensus sequence O is identical or partially identical to the consensus sequence T;
  • primer D comprises the consensus sequence P1 or a partial sequence thereof
  • Primer E and/or primer F wherein the primer E comprises a consensus sequence P1 and an optional third tag sequence, and the primer F comprises from 5' to 3': a consensus sequence P2 or a complementary sequence thereof, an optional fourth tag sequence, a consensus sequence R2 or a partial sequence thereof;
  • Primer E' and/or primer F' wherein the primer E' comprises a consensus sequence P1 and an optional third tag sequence, and the primer F' comprises a consensus sequence P2 and an optional fourth tag sequence;
  • primer G carries a biotin label and contains a consensus sequence O or a partial sequence thereof or a consensus sequence T or a partial sequence thereof
  • primer H contains a consensus sequence P1 or a partial sequence thereof
  • the consensus sequence O is identical or partially identical to the consensus sequence T
  • transposase complex II wherein the transposase complex II contains a transposase and a transposition sequence that the transposase can recognize and bind to, and can cut or break a double-stranded nucleic acid; and the transposition sequence comprises a transferred strand and a non-transferred strand; wherein the transferred strand comprises a linker sequence; in certain embodiments, the linker sequence comprises a consensus sequence R2 or a partial sequence thereof;
  • the cDNA 3’ terminal overhang has a length of at least 1, at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, 1-10, 1-5 or 2-10 nucleotides; in some embodiments, the cDNA 3’ terminal overhang is an overhang of 2-5 cytosine nucleotides (e.g., a CCC overhang).
  • Embodiment 29 The reagent composition of Embodiment 28, further comprising a reagent for fixing and/or permeabilizing cells or cell nuclei.
  • the reagent composition further comprises methanol, formaldehyde and/or paraformaldehyde.
  • the reagent composition further comprises Triton X-100, digitonin, IGEPAL (e.g., CA-630), and/or Tween-20.
  • the reagent composition further comprises: an RNase inhibitor, mineral oil, a buffer, dNTPs, one or more nucleic acid polymerases (e.g., DNA polymerases; e.g., DNA polymerases having strand displacement activity and/or high fidelity), reagents for recovering or purifying nucleic acids (e.g., magnetic beads), a well plate, or any combination thereof.
  • the reagent composition further comprises reagents for sequencing; for example, reagents for next-generation sequencing.
  • Embodiment 30 A kit comprising: a multi-reaction system containing a plurality of oligonucleotide molecules, each of the oligonucleotide molecules containing a specific tag sequence;
  • the oligonucleotide molecules in each reaction system have the same tag sequence, and the oligonucleotide molecules in different reaction systems have different tag sequences.
  • the oligonucleotide molecule further comprises a consensus sequence P1 or a partial sequence thereof, or the oligonucleotide molecule further comprises a consensus sequence P2 or a partial sequence thereof.
  • the multi-reaction system comprises at least 2 (e.g., at least 3, at least 4, at least 5, at least 8, at least 10, at least 12, at least 20, at least 24, at least 50, at least 96, at least 100, at least 120, at least 200, at least 240, at least 500, at least 96, at least 100, at least 12 ... at least 100, at least 200, at least 384, at least 400, 2-5, 2-10, 2-50, 2-80, 2-100, 2-500, 2-10^3, 2-10^4, 2-10^5, 2-10^6) reaction systems containing oligonucleotides;
  • the multi-reaction system is preferably a multi-well plate, and the oligonucleotides can be free or fixed in the reaction system.
  • Embodiment 31 A method for fixing and permeabilizing cells, comprising the following steps:
  • the concentration used is 0.05%-2% (e.g., 0.05%-0.2%, 0.05%-0.25%, 0.05%-0.3%, 0.05%-0.5%, 0.05%-0.8%, 0.05%-1%, 0.1%-0.2%, 0.1%-0.25%, 0.1%-0.3%, 0.1%-0.4%, 0.1%-0.5%, 0.1%-0.8%
  • the cells are permeabilized by treating the cells with Triton X-100 (0.1%-1%, 0.2%-0.25%, 0.2%-0.3%, 0.2%-0.4%, 0.2%-0.5%, 0.2%-0.8%, 0.2%-1%, or 0.2%) at -4°C to 10°C (e.g., 0°C to 4°C) for 0.5-10 min (e.g., 1-5 min or 3 min).
  • the cell is a naturally occurring cell or a recombinant cell, or a mixture of both.
  • Embodiment 32 A method for fixing and permeabilizing a cell nucleus, comprising the following steps:
  • the permeabilization solution further comprises IGEPAL (e.g., CA-630) and/or Tween-20.
  • the concentration of digitonin in the permeabilization solution is 0.0005%-0.05% (e.g., 0.0008%-0.005%, 0.0005%-0.002%, 0.0008%-0.002%, or 0.001%).
  • the concentration of IGEPAL (e.g., CA-630) in the permeabilization solution is 0.005%-0.1% (e.g., 0.005%-0.05%, 0.008%-0.05%, 0.005%-0.02%, 0.008%-0.02%, or 0.01%).
  • the concentration of Tween-20 in the permeabilization solution is 0.005%-0.1% (e.g., 0.005%-0.05%, 0.008%-0.05%, 0.005%-0.02%, 0.008%-0.02%, or 0.01%).
  • the cell nucleus is a cell nucleus derived from a naturally occurring cell or a cell nucleus derived from a recombinant cell, or a mixture of both.
  • Embodiment 33 A device for labeling nucleic acid molecules from cells or cell nuclei and/or constructing a nucleic acid molecule library, the device comprising:
  • a processor coupled to the memory, the processor being configured to execute the method described in any one of embodiments 1-13 and/or the method described in any one of embodiments 14-25 based on instructions stored in the memory.
  • Embodiment 34 A computer-readable storage medium having a computer program stored thereon, characterized in that when the program is executed by a processor, the method of any one of embodiments 1-13 and/or the method of any one of embodiments 14-25 is implemented.
  • Embodiment 35 Use of the method of any one of embodiments 1-13, the kit of embodiment 28 or 29, the reagent composition of embodiment 30, the method of embodiment 31 or 32, the device of embodiment 33 or the computer-readable storage medium of embodiment 34 for constructing a nucleic acid molecule library or for performing transcriptome sequencing; or, use of the method of any one of embodiments 14-25 for performing transcriptome sequencing.
  • the term "pseudo-single cell" refers to a situation in which, in a transcriptome experiment analyzing single cells, a micro-reaction system (e.g., a water-in-oil droplet or a microwell) contains two or more cells.
  • two or more cells in the same micro-reaction system e.g., the same droplet or microwell
  • the sequencing data generated by a "pseudo-single cell" micro-reaction system cannot be used to analyze the transcriptome information of a single cell because it contains sequencing results from two or more cells. Therefore, in traditional high-throughput single-cell transcriptome sequencing methods, the sequencing data generated by "pseudo-single cell" micro-reaction systems must be filtered out of the final sequencing data; and, to avoid wasting a large amount of sequencing data, the number of "pseudo-single cell" micro-reaction systems must be reduced or controlled as much as possible.
  • the term "pseudo-single cell rate" refers to the ratio of the number of "pseudo-single cell" micro-reaction systems to the number of all micro-reaction systems containing cells.
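The ratio defined above (micro-reaction systems holding two or more cells, divided by all micro-reaction systems holding at least one cell) can be illustrated with a short calculation. This is a minimal Python sketch; the function name `pseudo_single_cell_rate` and the occupancy counts are hypothetical and are not data from the invention.

```python
from typing import Sequence


def pseudo_single_cell_rate(cells_per_system: Sequence[int]) -> float:
    """Ratio of micro-reaction systems holding two or more cells to all
    micro-reaction systems holding at least one cell."""
    occupied = [n for n in cells_per_system if n >= 1]
    if not occupied:
        raise ValueError("no micro-reaction system contains a cell")
    multi = sum(1 for n in occupied if n >= 2)
    return multi / len(occupied)


# Hypothetical occupancy counts for ten droplets (illustrative only).
counts = [0, 1, 1, 2, 0, 1, 3, 1, 0, 1]
rate = pseudo_single_cell_rate(counts)
print(f"{rate:.3f}")
```

Empty micro-reaction systems are excluded from the denominator, which matches the wording "all micro-reaction systems containing cells".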
  • cell throughput refers to the number of cells that can be simultaneously labeled in a single library construction reaction for a given single-cell library construction technology protocol.
  • sample throughput refers to the number of samples that can be simultaneously labeled in a single library construction reaction for a given single-cell library construction technology protocol.
  • the cells or cell nuclei that can be used in the methods of the present invention can be any cell or cell nucleus of interest, for example, cancer cells, stem cells, neural cells, fetal cells, and immune cells or cell nuclei involved in immune responses.
  • the cells/cell nuclei can be a mixture of cells/cell nuclei of the same type, or a mixture of completely heterogeneous cells/cell nuclei of different types.
  • Different cell/cell nuclei types may include different tissue cells/cell nuclei of an individual or the same tissue cells/cell nuclei of different individuals, or cells/cell nuclei of microorganisms derived from different genera, species, strains, variants, or any or all of the foregoing combinations.
  • different cell/cell nuclei types may include normal cells/cell nuclei and cancer cells/cell nuclei of an individual; various cell/cell nuclei types obtained from human subjects, such as a variety of immune cells/cell nuclei; a variety of different bacterial species, strains, and/or variants from environmental, forensic, microbial groups, or other samples; or any other various mixtures of cell/cell nuclei types.
  • a "library of nucleic acid molecules” refers to a collection or population of labeled nucleic acid fragments generated from a target nucleic acid molecule, wherein the combination of labeled nucleic acid fragments in the collection or population exhibits a sequence that qualitatively and/or quantitatively represents the sequence of the target nucleic acid molecule from which the labeled nucleic acid fragment was generated.
  • discrete partitions refer to mutually independent spatial units containing target substances, such as droplets or wells. Generally speaking, each discrete partition can keep its own contents separate from the contents of other discrete partitions. In some embodiments, the discrete partitions may also contain other substances added according to different needs, such as dyes, emulsifiers, surfactants, stabilizers, polymers, aptamers, reducing agents, initiators, biotin labels, fluorophores, buffers, acidic solutions, alkaline solutions, light-sensitive enzymes, pH-sensitive enzymes, aqueous buffers, detergents, ionic detergents, non-ionic detergents, etc.
  • cDNA refers to "complementary DNA” synthesized by extension of a primer annealed to the RNA molecule of interest catalyzed by RNA-dependent DNA polymerase or reverse transcriptase using at least a portion of the RNA molecule of interest as a template (this process is also called “reverse transcription”).
  • the synthesized cDNA molecule is "homologous” or “complementary” or “base paired” or “forms a complex” with at least a portion of the template.
  • transposase refers to an enzyme that is capable of forming a functional complex with a composition comprising a transposon end (e.g., a transposon, a transposon end, a transposon end composition) and catalyzing the insertion or transposition of the composition comprising a transposon end into a double-stranded nucleic acid molecule (e.g., a DNA double strand, an RNA/cDNA hybrid double strand) incubated with the enzyme in a transposition reaction (e.g., an in vitro transposition reaction).
  • transposases include Tn5 transposase, MuA transposase, Sleeping Beauty transposase, Mariner transposase, Tn7 transposase, Tn10 transposase, Ty1 transposase, Tn552 transposase, and variants, modified products, and derivatives having the transposition activity (e.g., having higher transposition activity) of the above transposases.
  • modifications of the nucleic acids or polynucleotides of the present invention may serve purposes that include, but are not limited to: (1) alteration of the Tm; (2) alteration of the susceptibility of the polynucleotide to one or more nucleases; (3) provision of a moiety for attachment of a label; (4) provision of a label or label quencher; or (5) provision of a moiety for attachment of another molecule in solution or bound to a surface, such as biotin.
  • the nucleic acid or polynucleotide of the invention e.g., the first oligonucleotide molecule, the second oligonucleotide molecule, primer A, primer B, primer C, primer E, primer F, primer D, primer E', primer F', primer G, primer H, the transferred strand in the transposase complex, the non-transferred strand
  • the random portion comprises one or more conformationally restricted nucleic acid analogs, such as, but not limited to, one or more ribonucleic acid analogs in which the ribose ring is "locked" by a methylene bridge connecting the 2'-O atom and the 4'-C atom;
  • the 3' end of the oligonucleotide can be treated with dideoxy to make the 3' end unable to be extended; in some embodiments, the 5' end of the nucleic acid or polynucleotide of
  • the nucleic acid base in the single nucleotide at one or more positions in the polynucleotide or oligonucleotide may include guanine, adenine, uracil, thymine or cytosine, or alternatively, one or more of the nucleic acid bases may include a modified base such as, but not limited to, xanthine, allyamino-uracil, allyamino-thymidine, hypoxanthine, 2-aminoadenine, 5-propynyluracil, 5-propynylcytosine, 4-thiouracil
  • nucleic acid bases may comprise nucleic acid bases derivatized with a biotin moiety, a digoxigenin moiety, a fluorescent or chemiluminescent moiety, a quenching moiety, or some other moiety.
  • one or more of the sugar moieties may include 2'-deoxyribose, or alternatively, one or more of the sugar moieties may include some other sugar moiety, such as, but not limited to: ribose or 2'-fluoro-2'-deoxyribose or 2'-O-methyl-ribose that provide resistance to some nu
  • the internucleoside linkages of the nucleic acids or polynucleotides of the invention can be phosphodiester linkages, or alternatively, one or more of the internucleoside linkages can include modified linkages such as, but not limited to, phosphorothioate, phosphorodithioate, phosphoroselenate, or phosphorodiselenate linkages, which are resistant to some nucleases.
  • the first tag sequence, the second tag sequence, the third tag sequence, the fourth tag sequence, the unique molecular tag sequence, and the tag sequence are not limited in composition or length, as long as they can serve for identification.
  • the first tag sequence has a length of at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 15, at least 20, at least 25, 3-8, 3-15, 3-25 or 3-50 nucleotides.
  • the second tag sequence has a length of at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, 3-8, 3-15, 3-25 or 3-50 nucleotides.
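As a rough illustration of how tag length relates to the number of distinguishable tags: a tag of n nucleotides over the four bases A, C, G and T has at most 4^n distinct sequences. The sketch below computes this upper bound and the shortest length that can hold a given number of labels. The function names are hypothetical, and the bound is general arithmetic rather than a statement from the source; practical tag sets typically use only a subset of this space so that tags stay distinguishable despite sequencing errors.

```python
def tag_space(length: int) -> int:
    """Upper bound on distinct tags of the given length over A, C, G, T."""
    if length < 1:
        raise ValueError("length must be at least 1")
    return 4 ** length


def min_tag_length(n_labels: int) -> int:
    """Smallest length whose tag space holds at least n_labels tags."""
    if n_labels < 1:
        raise ValueError("n_labels must be at least 1")
    length = 1
    while tag_space(length) < n_labels:
        length += 1
    return length


# Label counts chosen to match 96-well and 384-well plates.
for n in (96, 384):
    print(n, min_tag_length(n))
```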
  • the consensus sequence R1, consensus sequence R2, consensus sequence O, consensus sequence P1, consensus sequence P2, primer A, primer B, primer C, primer E, primer F, primer D, primer E', primer F', primer G, primer H, and the transferred strand and non-transferred strand in the transposase complex, etc., are not limited in composition or length. Those skilled in the art can reasonably adjust the length and/or composition of these sequences for various reasons, which will not be elaborated here.
  • the consensus sequence R1 is the same as the Read1 sequence of the 10X Genomics platform or a partial sequence thereof or a complementary sequence thereof.
  • the consensus sequence R2 is the same as the Read2 sequence of the 10X Genomics platform or a partial sequence thereof or a complementary sequence thereof.
  • the consensus sequence O is the same as the TSO sequence of the 10X Genomics platform or a partial sequence thereof or a complementary sequence thereof.
  • the consensus sequence P1 is the same as the P5 sequence of the 10X Genomics platform or a partial sequence thereof or a complementary sequence thereof.
  • the consensus sequence P2 is the same as the P7 sequence of the 10X Genomics platform or a partial sequence thereof or a complementary sequence thereof.
  • a bead generally refers to a particle.
  • a bead may be porous, non-porous, solid, semi-solid, semi-fluid or fluid.
  • a bead may be magnetic or non-magnetic.
  • a bead may be soluble, rupturable or degradable.
  • a bead may be non-degradable.
  • a bead may be a gel bead.
  • a gel bead may be a hydrogel bead.
  • a gel bead may be formed from a molecular precursor, such as a polymer or a monomeric substance.
  • a semi-solid bead may be a liposomal bead.
  • the present invention can simultaneously meet the following five points:
  • the empty rate of the micro-reactor system is greatly reduced, which can increase the cell throughput of the existing single-cell omics library construction system based on the micro-reactor system by 10-100 times, reaching 100,000-1 million cells per reaction;
  • the data quality obtained by this scheme is close to that obtained by standard operation of the commercial 10x Genomics platform.
  • the indicators include but are not limited to: number of genes detected, VDJ capture rate, and detected ATAC-seq signal.
  • Figure 1 Principle of an exemplary existing 10X Genomics single-cell 3' RNA-Seq library construction process and the resulting library structure. Specifically, a single-cell suspension, reverse transcription reaction solution, and 10X Genomics cell barcode-labeled microbeads are prepared on the 10X Genomics platform into water-in-oil microdroplets, each containing one cell and one microbead. After the microdroplets are collected and subjected to reverse transcription and template switching on a PCR instrument, the cell barcode on the microbead is incorporated into the cell's cDNA product, and the final library is obtained after subsequent cDNA amplification, enzymatic fragmentation, adapter addition, and PCR amplification.
  • P5 and P7 at the two ends of the library are the sequencing adapter sequences of Illumina; the sample index on the right is the label sequence of the library, which is used to distinguish the sample source of the sequencing data; Read1 and Read2 at both ends of the library are the two primers for sequencing at both ends; the 10X Barcode on the left end of the library is the cell barcode, which can distinguish different single cells; the UMI (unique molecular identifiers) single molecule barcode is used to mark different mRNAs in the same cell; Poly (dT) is a polythymine oligonucleotide, which is introduced during the reverse transcription process of mRNA; the thin line in the middle is the transcriptome sequence.
  • Figure 2 Schematic diagram of the exemplary process principle of the present invention.
  • the present invention combines cell barcode labeling in a micro-reaction system (such as the water-in-oil microdroplet shown in Figure 2) with a subsequent combinatorial labeling step.
  • the cells are re-mixed and redistributed, and a second round of labels (96-384 types) is introduced onto the nucleic acid molecules of the cells, enabling a variety of new single-cell omics library construction schemes.
  • Figure 3 Schematic diagram of exemplary single-cell transcriptome, VDJ library construction and library structure of the present invention.
  • the present invention prepares water-in-oil microdroplets from intact fixed cells/cell nuclei and 10X Genomics RNA microbeads carrying cell labels, and adds the first round of labels to the mRNA to be tested in situ in the cell through a reverse transcription reaction (3' RNA) or a template switching reaction (5' RNA).
  • the cells that have completed the first round of label loading are then released from the microdroplets, mixed thoroughly and divided into 96/384-well plates, and different sequencing primers with specific label sequences are added to each well. Through PCR amplification, different second-round labels are added to the cells in different wells. Next, the amplified products are collected.
  • the amplification product of the 5' end RNA of a single cell can be further enriched for VDJ to obtain a VDJ library.
  • the final library structure is as follows: P5 and P7 at the two ends are the Illumina sequencing adapter sequences; i7 on the right is the label sequence of the library, which is used to distinguish different sequencing libraries; Read1 and Read2 at the two ends of the library are the primer sequences for paired-end sequencing; the 10X Barcode at the left end of the library is the cell barcode, which serves as the first-round cell label; the part marked "round 2" is the second-round cell label introduced by index PCR; the UMI (unique molecular identifier) single-molecule barcode is used to mark different mRNAs in the same cell; TSO is the template switch oligo sequence; the unmarked part in the middle is the transcriptome sequence.
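To make the library layout more concrete, the following minimal Python sketch splits a Read 1 sequence into a cell barcode, a UMI and the remaining sequence. Several points are assumptions made for illustration and are not stated in the source: the barcode-then-UMI order (taken from the order in which the Figure 1 description lists them), the default lengths `barcode_len=16` and `umi_len=12`, the name `split_read1`, and the synthetic read. The position of the second-round label is not specified in the text, so it is not modeled here.

```python
from typing import NamedTuple


class Read1Fields(NamedTuple):
    cell_barcode: str  # first-round cell label
    umi: str           # unique molecular identifier
    remainder: str     # whatever follows (e.g., poly(dT) or TSO, then insert)


def split_read1(seq: str, barcode_len: int = 16, umi_len: int = 12) -> Read1Fields:
    """Split a Read 1 sequence into barcode, UMI and remainder.

    barcode_len and umi_len are assumed values, not taken from the source.
    """
    if len(seq) < barcode_len + umi_len:
        raise ValueError("read is shorter than barcode plus UMI")
    barcode = seq[:barcode_len]
    umi = seq[barcode_len:barcode_len + umi_len]
    return Read1Fields(barcode, umi, seq[barcode_len + umi_len:])


# Synthetic read built for illustration only.
example = "ACGTACGTACGTACGT" + "TTGCAATTGCAA" + "TTTTTTTTGG"
fields = split_read1(example)
print(fields.cell_barcode, fields.umi, fields.remainder)
```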
  • Figure 4 Schematic diagram of an exemplary single-cell multi-omics library construction and library structure of the present invention.
  • the present invention prepares water-in-oil microdroplets from intact fixed cells/cell nuclei and 10X Genomics RNA microbeads carrying cell labels, and adds the first round of labels in situ in the cell: to the mRNA to be tested through a reverse transcription reaction (3' RNA), and to the gDNA in open chromatin regions through a ligation reaction.
  • the cells that have completed the first round of label loading are then released from the microdroplets, mixed thoroughly and divided into 96/384-well plates, and different sequencing primers with specific label sequences are added to each well. Through PCR amplification, different second-round labels are added to the cells in different wells.
  • the amplified products are collected together for purification and further library construction, and finally the two products are enriched by the corresponding primers of cDNA and gDNA, respectively, and then further library construction is performed to obtain the corresponding single-cell transcriptome sequencing library and ATAC-seq sequencing library.
  • Figure 5 Results of sequencing mixed samples of human and mouse cell lines using the single-cell transcriptome method of the present invention.
  • Figure 6 Results of sequencing human peripheral blood mononuclear cell samples fixed under different conditions using the transcriptome method of the present invention.
  • Figure 7 Results of sequencing frozen human peripheral blood mononuclear cell samples using the single-cell 5' RNA-seq method of the present invention.
  • Figure 8 Results of sequencing human peripheral blood mononuclear cell samples using the single-cell VDJ-seq method of the present invention.
  • Figure 9 Results of sequencing frozen human kidney samples using the single-cell transcriptome + ATAC multi-omics method of the present invention.
  • on micro-reaction cell barcode labeling platforms, taking the Chromium platform of 10X Genomics as an example, not all micro-reaction systems (GEMs, water-in-oil droplets) contain a single cell as expected. Some usually contain two or more cells, which are also called "pseudo-single cells". In that case, two or more cells in the same GEM are marked with the same barcode, so the cells present in the GEM cannot be identified "one-to-one" using only the barcode in the GEM.
  • the sequencing data generated by a "pseudo-single cell" GEM cannot be used to analyze the transcriptome information of a single cell because it contains sequencing results derived from two or more cells. Therefore, the sequencing data generated by "pseudo-single cell" GEMs must be filtered out of the final sequencing data; and, to avoid wasting a large amount of sequencing data, the number or ratio of "pseudo-single cell" GEMs must be reduced or controlled as much as possible, which greatly limits the library construction throughput.
  • based on the barcode labeling of cells in micro-reaction systems (e.g., water-in-oil droplets, nano-/microwells), the present invention takes the cells/nuclei that have completed the first round of in situ cell barcode loading out of the micro-reaction system, mixes them thoroughly and divides them into several equal parts, then introduces a second round of labels on the nucleic acid molecules of the cells/nuclei through index PCR in a micro-system, and finally uses the two rounds of label information to jointly define a cell.
  • the nucleic acid molecule library constructed using the method of the present invention carries two rounds of cell labels, which makes it possible to split the sequencing data generated by "pseudo-single cells" and then accurately track and determine the cell of origin of the sequencing data.
  • although two or more cells in a "pseudo-single cell" micro-reaction system all carry the same first-round cell barcode label, they each carry different second-round cell barcode labels, so the sequencing data generated by each cell can be distinguished according to the second-round label, and even the sequencing data generated by "pseudo-single cells" can be used.
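The idea of defining a cell by the combination of both label rounds can be sketched as follows. This is a minimal Python illustration, not the applicant's analysis pipeline: the names `group_reads_by_cell` and `prob_all_in_distinct_wells`, the barcode strings and the read identifiers are all hypothetical. The second function adds an assumption that the source does not state, namely that cells are distributed uniformly and independently across wells, and under that assumption estimates how often cells sharing a first-round barcode also end up with distinct second-round labels.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Read = Tuple[str, str, str]  # (round1_barcode, round2_label, read_id)


def group_reads_by_cell(reads: Iterable[Read]) -> Dict[Tuple[str, str], List[str]]:
    """Group reads by the (first-round, second-round) label pair."""
    cells: Dict[Tuple[str, str], List[str]] = defaultdict(list)
    for round1, round2, read_id in reads:
        cells[(round1, round2)].append(read_id)
    return dict(cells)


def prob_all_in_distinct_wells(n_cells: int, n_wells: int) -> float:
    """Probability that n_cells land in pairwise different wells,
    assuming uniform, independent assignment (an assumption of this sketch)."""
    if n_cells > n_wells:
        return 0.0
    p = 1.0
    for i in range(n_cells):
        p *= (n_wells - i) / n_wells
    return p


# Two cells shared droplet barcode BC_A but were split into wells W01 and W07.
reads = [
    ("BC_A", "W01", "r1"),
    ("BC_A", "W07", "r2"),
    ("BC_A", "W01", "r3"),
    ("BC_B", "W01", "r4"),
]
cells = group_reads_by_cell(reads)
print(len(cells))
print(prob_all_in_distinct_wells(2, 96))
```

Using the label pair as the dictionary key means that reads sharing a first-round barcode are still separated whenever their second-round labels differ.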
  • the applicant also hopes to emphasize that the method based on pre-labeling and microfluidic droplet high-throughput library construction technology can improve the throughput and reduce the empty rate and pseudo-single cell rate of the micro-reaction system compared with the existing traditional microfluidic droplet high-throughput library construction technology.
  • the reagents used in the examples of the present application have the meanings generally understood by those skilled in the art.
  • the reagents used in the examples of the present application can be purchased commercially or prepared in-house according to formulations widely used in the corresponding field.
  • Example 1 Fixation and permeabilization of single cell suspension
  • the library can be constructed using intact single cells from fresh tissues, fresh cell lines, fresh blood samples, primary cells, and frozen cell samples. Before the library is constructed, the cells need to be fixed and permeabilized.
  • HeLa cell line, NIH3T3 cell line (purchased from the cell bank of the Chinese Academy of Sciences) and peripheral blood mononuclear cells (PBMCs) were used for the experiment.
  • fixation and permeabilization steps are as follows:
  • Method 1 Fix and permeabilize in 80% methanol at -20°C for 10 min.
  • Method 2 Fix with 1% formaldehyde at room temperature for 10 minutes, then centrifuge and remove the supernatant. Resuspend the cells in 0.2% Triton X-100 and permeabilize on ice for 3 minutes.
  • Method 3 Fix with 1% paraformaldehyde at room temperature for 10 minutes, then centrifuge and remove the supernatant. Resuspend the cells in 0.2% Triton X-100 and permeabilize on ice for 3 minutes.
  • The library can be constructed from cell nuclei obtained from fresh tissues, frozen tissues, cell lines, blood samples, primary cells, or frozen cell samples (nuclei are extracted following conventional procedures widely used in the field). Before library construction, the nuclei must be fixed and permeabilized.
  • The fixation and permeabilization steps are as follows:
  • Method 1 Fix with 1% formaldehyde at room temperature for 10 minutes.
  • Method 2 Fix with 1.6% paraformaldehyde at room temperature for 5 minutes.
  • The single-end (i7-end) Tn5 transposase complex is used in the single-cell transcriptome library construction process of the present invention.
  • Transposon preparation: The Tn5-top_ME oligonucleotide (SEQ ID NO: 9) and the Tn5-bottom_Read2N oligonucleotide (SEQ ID NO: 10) are each dissolved to 100 µM in the annealing buffer from the TruePrep Tagment Enzyme kit and then mixed at a 1:1 volume ratio. In this embodiment, 10 µl of each oligonucleotide is taken and mixed thoroughly. Place the mixture in a PCR instrument and run the following annealing program: 75°C for 15 minutes, 60°C for 10 minutes, 50°C for 10 minutes, 40°C for 10 minutes, 25°C for 30 minutes. The annealed adapter mixture is the transposon and is stored at -20°C.
  • Tn5 transposase complex assembly (embedding): Use the TruePrep Tagment Enzyme (2 µg/µl) and Coupling Buffer from the TruePrep Tagment Enzyme kit to prepare the following reaction solution: 10 µl TruePrep Tagment Enzyme (2 µg/µl), 33 µl Coupling Buffer, 7 µl transposon (obtained in the previous step). After thorough mixing, place in a PCR instrument and react at 30°C for 1 hour. The resulting single-end (i7-end) Tn5 transposase complex is stored at -20°C.
  • Example 4 Preparation of single-cell transcriptome library (including single-cell 3'RNA-
  • The 10x Genomics Chromium platform (10x Single Cell 5' RNA-seq and 10x Single Cell 3' RNA-seq systems) is used as an example to prepare water-in-oil microdroplets, giving each microdroplet a unique label.
  • The water-in-oil droplet preparation and the cell barcode-labeled microbeads can be replaced by those of other platforms.
  • The 10x Single Cell 5' RNA-seq reaction system is: 18.8 µl RT Reagent B, 7.3 µl Poly-dT RT Primer, 1.9 µl Reducing Agent B, 2 µl RT Enzyme C, 38.7 µl fixed and permeabilized cell/cell nucleus suspension;
  • The 10x Single Cell 3' RNA-seq reaction system is: 18.8 µl RT Reagent B, 2.4 µl Template Switch Oligo, 2 µl Reducing Agent B, 8.7 µl RT Enzyme C, 43.2 µl fixed and permeabilized cell/cell nucleus suspension.
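As a quick consistency check on the two reaction systems above, the component volumes can be summed per reaction. The sketch below is illustrative only: the dictionary and function names are hypothetical, and the volumes are copied exactly as listed in this example.

```python
# Component volumes in microlitres, copied from the two reaction systems above.
FIVE_PRIME_MIX = {
    "RT Reagent B": 18.8,
    "Poly-dT RT Primer": 7.3,
    "Reducing Agent B": 1.9,
    "RT Enzyme C": 2.0,
    "cell/nucleus suspension": 38.7,
}
THREE_PRIME_MIX = {
    "RT Reagent B": 18.8,
    "Template Switch Oligo": 2.4,
    "Reducing Agent B": 2.0,
    "RT Enzyme C": 8.7,
    "cell/nucleus suspension": 43.2,
}


def total_volume_ul(mix):
    """Sum the component volumes, rounded to 0.1 µl to hide float noise."""
    return round(sum(mix.values()), 1)


def scale_mix(mix, n_reactions):
    """Scale every component for n_reactions identical reactions."""
    return {name: round(vol * n_reactions, 1) for name, vol in mix.items()}


if __name__ == "__main__":
    for label, mix in (("5' RNA-seq", FIVE_PRIME_MIX), ("3' RNA-seq", THREE_PRIME_MIX)):
        print(label, total_volume_ul(mix), "µl per reaction")
```

Summing the listed volumes makes it easy to compare each mix with the loading volume expected by the droplet platform before any cells are committed.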
  • After the first round of cell barcode labeling is complete, the water-in-oil microdroplets are broken, the cells are recovered from the aqueous phase and mixed thoroughly, and the cell/cell nucleus suspension is then distributed into a 96-well plate.
  • Cell lysis and purification Place the 96-well plate with cells in a PCR instrument and incubate at 85°C for 5 minutes. Then perform purification.
  • Index PCR amplification reaction (loading the second round of cell labeling): Add cDNA amplification reaction solution to the above purified product.
  • The reaction solution added to each well of the 96-well plate contains: 20 µl KAPA HiFi HotStart 2X ReadyMix, 2 µl 10 µM Partial TSO/IS primer (SEQ ID NO: 2), and 2 µl 10 µM Truseq-i5-end specific second label primer (sequence as shown in SEQ ID NO: 5; 96 such primers are used in this embodiment, one added to each well, and the label sequences contained in the 96 primers are each selected from the sequences shown in SEQ ID NO: 7). Mix well and quickly place in a PCR instrument for amplification.
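The one-primer-per-well layout described above can be sketched as a mapping from well names to second label primers. The tag names used here (TAG001 to TAG096) are hypothetical placeholders, not the sequences of SEQ ID NO: 7, and the row-major A1 to H12 ordering is an assumption.

```python
from string import ascii_uppercase


def plate_wells(n_rows=8, n_cols=12):
    """Return well names of a plate in row-major order (A1..A12, B1.., H12)."""
    return [
        f"{ascii_uppercase[row]}{col}"
        for row in range(n_rows)
        for col in range(1, n_cols + 1)
    ]


def assign_second_tags(tags):
    """Map each well of a 96-well plate to exactly one second-round tag.

    Raises ValueError if the number of tags does not match the number of
    wells or if any tag is repeated, since a repeated tag would make two
    wells indistinguishable after sequencing.
    """
    wells = plate_wells()
    if len(tags) != len(wells):
        raise ValueError(f"expected {len(wells)} tags, got {len(tags)}")
    if len(set(tags)) != len(tags):
        raise ValueError("second-round tags must be unique per well")
    return dict(zip(wells, tags))


if __name__ == "__main__":
    placeholder_tags = [f"TAG{i:03d}" for i in range(1, 97)]  # hypothetical names
    layout = assign_second_tags(placeholder_tags)
    print(layout["A1"], layout["H12"])
```

Checking uniqueness up front reflects the requirement that different second partitions carry different second tag sequences.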
  • This example provides a library construction method in which a single-end transposase inserts the i7-end sequencing primer sequence by transposition, followed by amplification.
  • 100 ng of the product from the previous step is fragmented by transposition with the single-end (i7-end) Tn5 transposase complex (prepared in Example 3).
  • The reaction system is: 10 µl 5X Reaction (vazyme#S601-01), 5 µl single-end (i7-end) Tn5 transposase, and 100 ng of the above Index PCR cDNA amplification product. Mix the reaction solution thoroughly and incubate in a PCR instrument at 55°C for 15 minutes. Purify with 0.8x SPRIselect magnetic beads and elute.
  • The purified product is then amplified. The reaction system is: 50 µl NEBNext High-Fidelity 2x PCR Master Mix, 5 µl 10 µM P5 end primer (SEQ ID NO: 1), Nextera-i7 end second label primer (sequence structure as shown in SEQ ID NO: 6; the label sequence contained in the primer actually used in this embodiment is selected from the sequences shown in SEQ ID NO: 8), and 40 µl purified transposition product.
  • Reaction conditions: 72°C 5 min, 98°C 45 s, 8 cycles of [98°C 20 s, 60°C 30 s, 72°C 1 min], 72°C 5 min, hold at 4°C.
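The cycling conditions above can be written down as data, which makes the programmed hold time easy to total. This is a minimal sketch with a hypothetical tuple-based representation; it counts only programmed holds, ignores ramp times, and leaves out the open-ended 4°C storage step.

```python
# Each step is either ("hold", temp_c, seconds) or
# ("cycle", repeats, [(temp_c, seconds), ...]).
LIBRARY_PCR = [
    ("hold", 72, 300),                             # 72°C 5 min
    ("hold", 98, 45),                              # 98°C 45 s
    ("cycle", 8, [(98, 20), (60, 30), (72, 60)]),  # 8 cycles
    ("hold", 72, 300),                             # 72°C 5 min
]


def programmed_seconds(program):
    """Total programmed hold time in seconds (ramp times not included)."""
    total = 0
    for step in program:
        kind = step[0]
        if kind == "hold":
            _, _temp_c, seconds = step
            total += seconds
        elif kind == "cycle":
            _, repeats, segments = step
            total += repeats * sum(seconds for _temp_c, seconds in segments)
        else:
            raise ValueError(f"unknown step type: {kind!r}")
    return total


if __name__ == "__main__":
    seconds = programmed_seconds(LIBRARY_PCR)
    print(f"{seconds // 60} min {seconds % 60} s of programmed holds")
```

The same structure can hold the other programs in these examples by changing the cycle count and temperatures.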
  • Sequencing library purification and fragment selection: Use 0.6X and 0.2X SPRIselect magnetic beads to purify and size-select the products from the previous step. This yields a sequencing library with a fragment size of about 300-600 bp.
  • Example 5 Preparation of single cell VDJ library (taking human peripheral blood mononuclear cells as an example)
  • The single-cell VDJ library preparation method of this example starts from the purified single-cell 5' RNA-seq Index PCR cDNA amplification product of Example 4, obtained from immune cells (T cells and B cells). That is, steps 1 to 5 of Example 4 are completed using the 10x Genomics Chromium platform (10x Single Cell 5' RNA-seq), and the resulting cDNA amplification product, which carries two rounds of labels, is subjected to nested PCR with primers specific for the VDJ conserved region to enrich the VDJ sequences.
  • The enriched product still carries the same two rounds of cell labels as the single-cell 5' RNA-seq transcriptome.
  • This example uses PBMCs from human peripheral blood and constructs separate VDJ libraries for the T cells and the B cells they contain.
  • Nested PCR specific enrichment of VDJ sequences: The 10x Single Cell 5' RNA-seq Index PCR cDNA amplification product of Example 4, prepared from PBMCs derived from human peripheral blood, was purified and then enriched in two rounds of PCR using two sets of specific primers.
  • The first-round nested PCR reaction system was: 50 µl KAPA HiFi HotStart 2X ReadyMix, 5 µl 10 µM P5 end primer, 5 µl 10 µM human T Cell/B Cell Outer primers (2 primers for T cells, SEQ ID NOs: 11-12; 7 primers for B cells, SEQ ID NOs: 15-21; the corresponding primers are mixed 1:1 before amplifying TCR/BCR), 5 µl cDNA amplification product, and 35 µl nuclease-free water. Mix thoroughly and quickly place in a PCR instrument. The reaction conditions were: 98°C 45 s, 11 cycles for TCR amplification of [98°C 20 s, 62°C 30 s, 72°C 1 min], 72°C 1 min, hold at 4°C.
  • The PCR amplification product was purified and size-selected using 0.5X and 0.3X SPRIselect magnetic beads and eluted with 40.5 µl EB buffer. The second round of nested PCR amplification was then carried out.
  • The second-round PCR reaction system was: 50 µl KAPA HiFi HotStart 2X ReadyMix, 5 µl 10 µM P5 end primer, 5 µl 10 µM human T Cell/B Cell Inner primers (2 primers for T cells, SEQ ID NOs: 13-14; 7 primers for B cells, SEQ ID NOs: 22-28; the corresponding primers are mixed 1:1 before amplifying TCR/BCR), and 40 µl first-round nested PCR amplification product. Mix thoroughly and quickly place in a PCR instrument.
  • The reaction conditions were: 98°C 45 s, 9 cycles for TCR amplification (9 cycles for BCR amplification) of [98°C 20 s, 62°C 30 s, 72°C 1 min], 72°C 1 min, hold at 4°C. The final PCR amplification product was purified and size-selected using 0.5X and 0.25X SPRIselect magnetic beads and eluted with 30.5 µl EB buffer. 1 µl of the eluted product was used to measure the concentration with Qubit. The remaining sample can be stored at -80°C for 3 months.
  • VDJ sequencing library: As in Example 4, this example provides a library construction method in which a single-end transposase inserts the i7-end sequencing primer sequence by transposition, followed by amplification. 100 ng of the VDJ-enriched product is fragmented by transposition with the single-end (i7-end) Tn5 transposase complex (prepared in Example 3).
  • The reaction system is: 10 µl 5X Reaction (vazyme#S601-01), 5 µl single-end (i7-end) Tn5 transposase, and 100 ng of the above Index PCR cDNA amplification product, brought to a total volume of 50 µl with nuclease-free water. After thorough mixing, the reaction is incubated in a PCR instrument at 55°C for 5 min. The product is purified with 0.8x SPRIselect magnetic beads and eluted from the beads with 40.5 µl EB.
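Because the transposition reaction above fixes the buffer volume, the enzyme volume, the DNA mass, and the 50 µl total, the DNA and water volumes follow from the measured DNA concentration. A minimal sketch, assuming the concentration is a Qubit reading in ng/µl; the function and parameter names are hypothetical.

```python
def tagmentation_mix(dna_ng_per_ul, dna_ng=100.0, total_ul=50.0,
                     buffer_ul=10.0, tn5_ul=5.0):
    """Return component volumes (µl) for one transposition reaction.

    dna_ng_per_ul: measured DNA concentration (assumed to be a Qubit reading).
    Raises ValueError if the DNA is too dilute to fit into the total volume.
    """
    if dna_ng_per_ul <= 0:
        raise ValueError("DNA concentration must be positive")
    dna_ul = dna_ng / dna_ng_per_ul
    water_ul = total_ul - buffer_ul - tn5_ul - dna_ul
    if water_ul < 0:
        raise ValueError(
            f"{dna_ng} ng at {dna_ng_per_ul} ng/µl needs {dna_ul:.1f} µl, "
            "which exceeds the space left in the reaction"
        )
    return {
        "buffer_ul": buffer_ul,
        "tn5_ul": tn5_ul,
        "dna_ul": round(dna_ul, 2),
        "water_ul": round(water_ul, 2),
    }


if __name__ == "__main__":
    print(tagmentation_mix(10.0))
```

With the default volumes, at most 35 µl is available for DNA, so a sample below roughly 2.9 ng/µl cannot supply 100 ng in this reaction.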
  • The purified product was amplified to generate the sequencing library. The reaction system was: 50 µl NEBNext High-Fidelity 2x PCR Master Mix, 5 µl 10 µM P5 end primer, 5 µl 10 µM Nextera-i7 end second label primer (the sequence structure is shown in SEQ ID NO: 6, and the label sequence contained in the primer actually used in this embodiment is selected from SEQ ID NO: 8), and 40 µl purified transposition product.
  • reaction conditions 72°C 5min, 98°C 45s, 7 cycles [98°C 20s, 60°C 30s, 72°C 1min], 72°C 5min, 4°C temporary storage.
  • Sequencing library purification and fragment selection: Use 0.6X and 0.2X SPRIselect magnetic beads to purify and size-select the products from the previous step. This yields a sequencing library with a fragment size of about 300-600 bp.
  • Example 6 Preparation of single-cell mRNA+genomic DNA multi-omics library (taking human kidney single-cell sample as an example)
  • This example uses the 10x Genomics Chromium platform Single Cell Multiome ATAC+RNA-seq system to first complete water-in-oil microdroplet preparation and loading of the first round of cell label barcodes. On this basis, the second round of cell labels is loaded onto the transcriptome and the open chromatin regions of the cells/nuclei by index PCR, which completes construction of the single-cell multi-omics library.
  • The water-in-oil droplet preparation and the cell barcode-labeled microbeads can be replaced by those of other platforms. Specific method:
  • In situ cell transposition reaction According to the Chromium Next GEM Single Cell Multiome ATAC+Gene Expression User Guide, the fixed and permeabilized single cell nuclei and permeabilized cell samples in the above example were subjected to in situ transposition reaction.
  • the reaction system was: 7ul ATAC Buffer B, 3ul ATAC Enzyme B, 5ul fixed cell/nucleus suspension. After thorough mixing, the suspension was placed in a PCR instrument.
  • the reaction conditions were as follows: 37°C for 60 min, and then temporarily stored at 4°C. After the reaction, a specific linker sequence was introduced into the open chromatin region of the sample cell nucleus.
  • Cell lysis and purification Add 1ul Proteinase K to each well of the above-mentioned products, mix thoroughly and centrifuge, then place in a PCR instrument and incubate at 55°C for 5 minutes. Then configure Dynabeads Cleanup Mix according to the 10X Chromium Single Cell Reagent Kits User Guide, add 16ul Dynabeads Cleanup Mix to each well of the 96-well plate for purification, and finally elute with 16.5ul Elution Solution I to each well, and transfer the eluate to a new 96-well plate. The purified product was purified again with 1.8x SPRIselect magnetic beads, and finally eluted with 16.5ul EB, and the eluted product was transferred to a new 96-well plate again.
  • Index PCR amplification reaction (loading the second round of cell labels): Add cDNA and gDNA amplification reaction solution to the above purified products.
  • The reaction solution added to each well of the 96-well plate contains: 25 µl NEBNext High-Fidelity 2x PCR Master Mix, 2 µl 10 µM P5 end primer (SEQ ID NO: 1), 2 µl 10 µM Nextera-i7 end second label primer (sequence as shown in SEQ ID NO: 6; 96 such primers are used in this embodiment, one added to each well to label the gDNA from the open chromatin regions, and the label sequences contained in the 96 primers are selected from SEQ ID NO: 6).
  • Enrichment of cDNA amplification products Take 40ul of the above Index PCR amplification products and enrich the cDNA amplification products with biotin-modified primers, add 60ul reaction solution, the reaction solution contains: 50ul KAPA HiFi HotStart 2X ReadyMix, 5ul 10uM P5 end primer (SEQ ID NO: 1), 5ul 10uM Bio-Partial TSO/IS primer (SEQ ID NO: 3), mix well and quickly place in PCR instrument, the reaction conditions are as follows: 98°C 30s, 6 cycles [98°C 20s, 54°C 30s, 72°C 20s], 72°C 1min, 4°C temporary storage.
  • The C1 beads carrying the adsorbed cDNA amplification product were resuspended in fresh PCR reaction solution (50 µl KAPA HiFi HotStart 2X ReadyMix, 5 µl 10 µM P5 end primer (SEQ ID NO: 1), 5 µl 10 µM Partial TSO/IS primer (SEQ ID NO: 2), 40 µl nuclease-free water), mixed thoroughly, and amplified further.
  • the reaction conditions were as follows: 98°C 30s, 4 cycles [98°C 20s, 54°C 30s, 72°C 20s], 72°C 1min, and stored at 4°C.
  • the PCR tube was placed on a magnetic stand for 5min, the supernatant was aspirated, and the supernatant was purified with 0.6x SPRIselect magnetic beads and eluted with EB. Take 1ul of the eluted product and measure the concentration with Qubit. The remaining sample can be stored at -80°C for 3 months.
  • ATAC-seq sequencing library construction: Take 40 µl of the purified Index PCR amplification product from step 6 of this example to construct the ATAC-seq sequencing library. Add 50 µl KAPA HiFi HotStart 2X ReadyMix, 5 µl 10 µM P5 end primer (SEQ ID NO: 1), and 5 µl 10 µM P7 end primer (SEQ ID NO: 4) to the 40 µl Index PCR amplification product, mix well, and perform PCR amplification. Reaction conditions: 98°C 45 s, 7-10 cycles (depending on the number of cells loaded) of [98°C 20 s, 67°C 30 s, 72°C 20 s], 72°C 1 min, hold at 4°C.
  • ATAC-seq sequencing library purification and fragment selection: Use 0.4X and 1X SPRIselect magnetic beads to purify and size-select the products from the previous step. This yields a sequencing library with a fragment size of about 200-700 bp.
  • ATAC-seq library sequencing: The constructed library was sequenced on a NovaSeq 6000 (Illumina, San Diego, CA) with a read length of 50 bp and 25,000 reads per cell.
  • Figure 5 shows the results of sequencing a mixed sample of human and mouse cell lines using the single-cell transcriptome method of the present invention. It is a scatter plot of the number of UMIs per cell that map to the genome of each species, where the number of UMIs is the number of cDNA molecules sequenced. Each point represents one cell (6446 points in total): light-colored points are cells containing almost only mouse cDNA, dark-colored points are cells containing almost only human cDNA, and black points are contaminated cells (i.e., pseudo-single cells).
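The species-mixing readout described for Figure 5 can be sketched as a per-cell classification on the fraction of UMIs assigned to each genome. The 90% purity threshold and the function names below are assumptions for illustration; the source does not state the threshold used for the figure, and the counts in the usage example are made up.

```python
def classify_cell(human_umis, mouse_umis, purity=0.9):
    """Label one cell barcode by the share of its UMIs mapping to each species.

    purity is a hypothetical cutoff: a cell is called single-species when at
    least this fraction of its UMIs maps to one genome, otherwise "mixed".
    """
    total = human_umis + mouse_umis
    if total == 0:
        return "empty"
    if human_umis / total >= purity:
        return "human"
    if mouse_umis / total >= purity:
        return "mouse"
    return "mixed"


def mixed_rate(cells, purity=0.9):
    """Fraction of non-empty cells called "mixed" (observed pseudo-single cells)."""
    labels = [classify_cell(h, m, purity) for h, m in cells]
    called = [label for label in labels if label != "empty"]
    if not called:
        return 0.0
    return called.count("mixed") / len(called)


if __name__ == "__main__":
    # Made-up (human_umis, mouse_umis) pairs, for illustration only.
    example = [(950, 50), (20, 980), (500, 500), (1000, 0)]
    print(mixed_rate(example))
```

Note that such a readout only detects cross-species collisions; two cells of the same species sharing a label combination are invisible to it.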
  • Figure 6 shows the results of sequencing human peripheral blood mononuclear cell samples fixed under different conditions using the transcriptome method of the present invention.
  • A is the number of UMIs detected after peripheral blood mononuclear cells were fixed under three conditions (methanol, 1% formaldehyde, 1% paraformaldehyde);
  • B is the number of genes detected after peripheral blood mononuclear cells were fixed under three conditions (methanol, 1% formaldehyde, 1% paraformaldehyde);
  • C is the visualization result of unsupervised clustering of all cells under the three fixing conditions;
  • D is the distribution of cell clusters under the three fixing conditions.
  • Figure 7 shows the results of sequencing frozen human peripheral blood mononuclear cell samples using the single-cell 5'RNA-seq method of the present invention (single experiment cell throughput: 118,819), specifically the cell clustering visualization results of single-cell 5'RNA-seq of frozen human peripheral blood mononuclear cells, showing that 27 major cell types in the blood were detected in this method.
  • Figure 8 shows the results of sequencing human peripheral blood mononuclear cell samples using the single-cell VDJ-seq method of the present invention.
  • A and C are visualizations of the cells in which BCR/TCR clones were detected: black dots are cells with a detected BCR/TCR clone, and light dots are cells without a detected BCR/TCR.
  • Cells with detected BCR clones completely overlap the B cell positions annotated in the single-cell transcriptome data of Figure 7, and cells with detected TCR clones completely overlap the T cell positions annotated in the same data.
  • B and D are the proportions of B cells and T cells with detected BCR and TCR clones, respectively.
  • Figure 9 shows the results of sequencing frozen human kidney samples using the single-cell transcriptome + ATAC multi-omics method of the present invention.
  • A is the cell clustering visualization result of the single-cell transcriptome of the frozen human kidney sample, showing that the 12 major cell types in the kidney are detected by this method;
  • B is the number of genes detected in a single cell of the single-cell transcriptome of the frozen human kidney sample;
  • C is the cell clustering visualization result of the single-cell ATAC-seq part of the frozen human kidney sample, and 18 cell clusters are obtained by clustering;
  • D is the ATAC-seq peak information obtained for various cell types in the single-cell ATAC-seq part.

Abstract

Provided are a method for introducing dual cell-specific tags into nucleic acid molecules derived from cells; a method, based thereon, for constructing a nucleic acid molecule library for single-cell transcriptome sequencing, single-cell chromatin accessibility sequencing, or multi-omic single-cell transcriptome + chromatin accessibility sequencing; and a method for high-throughput sequencing of the single-cell transcriptome, single-cell chromatin accessibility, or the multi-omic single-cell transcriptome + chromatin accessibility. Further provided are a library of nucleic acid molecules constructed using the method, and a kit for implementing the method.

Description

Method and kit for high-throughput labeling of cellular nucleic acid molecules

Technical Field

This application relates to the field of high-throughput single-cell omics, in particular to high-throughput single-cell transcriptome sequencing, high-throughput single-cell chromatin accessibility (ATAC, Assay for Transposase-Accessible Chromatin) sequencing, and high-throughput single-cell transcriptome + chromatin accessibility multi-omics sequencing.

Background Art

The development of single-cell omics sequencing has greatly deepened our understanding of cellular diversity and heterogeneity and has driven revolutionary advances in many fields of biological and biomedical research, including developmental biology, tumors and other diseases, assisted reproduction, immunology, neuroscience, and microbiology. Existing single-cell sequencing mainly includes single-cell genome sequencing, transcriptome sequencing, methylation sequencing, chromatin accessibility sequencing, and single-cell multi-omics sequencing that combines these types of omics information. In essence, these methods analyze the sequence, copy number, modification status, and interactions of DNA and RNA within a single cell to reveal changes in its genome, transcriptome, methylation, chromatin accessibility, and other omics layers. To characterize the cellular heterogeneity of different samples in finer detail, analyze single-cell gene regulatory networks, and map the full cellular landscape of a sample, there remains an urgent need for single-cell sequencing technologies with higher throughput, more modalities, better data quality, and lower cost.

High-throughput single-cell library construction currently relies mainly on cell barcode labeling in microfluidic droplets or microwell plates. However, all current commercial single-cell library construction technologies based on microfluidic droplets or microwell plates, for example 10x Genomics (Zheng GX, et al. Massively parallel digital transcriptional profiling of single cells. Nat Commun. 2017 Jan 16;8:14049. doi:10.1038/ncomms14049. PMID: 28091601), share the following disadvantages: low cell throughput, high library construction cost, a high empty rate of the micro-reaction system, and a high pseudo-single-cell rate.

In summary, there is a clear need for a technical method that is simple to operate, inexpensive, high in cell throughput, and reliable in data quality, and that is also suitable for multi-omics single-cell library construction.

Summary of the Invention

Existing high-throughput single-cell library construction technologies based on cell barcode labeling in micro-reactions (for example, water-in-oil droplets or nanowells), and existing high-throughput library construction technologies based on microfluidic droplets combined with combinatorial labeling, cannot simultaneously offer high cell throughput, simple operation, low cost, a low pseudo-single-cell rate, a low empty rate of the micro-reaction system, good data quality, and broad applicability. To solve this problem, the inventors of the present application have developed a new, improved library construction scheme built on existing mature micro-reaction (for example, water-in-oil droplet or nanowell) cell barcode labeling platforms.

Labeling Method

Therefore, in a first aspect, the present application provides a method for labeling nucleic acid molecules from cells or cell nuclei, comprising the following steps:

(1) providing a plurality of fixed and permeabilized cells or cell nuclei, wherein the cells or cell nuclei contain nucleic acid molecules to be labeled; and,

a plurality of beads, each coupled with a plurality of first oligonucleotide molecules, wherein the first oligonucleotide molecules contain a first tag sequence;

wherein the plurality of first oligonucleotide molecules on the same bead have the same first tag sequence, and the first oligonucleotide molecules on different beads have first tag sequences different from each other;

(2) randomly distributing the plurality of beads and the plurality of cells or cell nuclei into different first discrete partitions, and, within each first discrete partition, releasing the first oligonucleotide molecules from the bead and bringing them into contact with the cells or cell nuclei, thereby generating, within the cells or cell nuclei, first nucleic acid molecules derived from the nucleic acid molecules to be labeled, wherein the first nucleic acid molecules contain the first tag sequence or its complementary sequence;

(3) mixing the cells or cell nuclei that contain the first nucleic acid molecules and originate from different first discrete partitions, and redistributing them into different second discrete partitions;

(4) within each second discrete partition, contacting second oligonucleotide molecules containing a second tag sequence with the first nucleic acid molecules to generate second nucleic acid molecules containing the first tag sequence or its complementary sequence and the second tag sequence or its complementary sequence;

wherein the second oligonucleotide molecules in the same second discrete partition have the same second tag sequence, and the second oligonucleotide molecules in different second discrete partitions have second tag sequences different from each other;

wherein the cells are naturally occurring cells, recombinant cells, or a mixture of the two; and the cell nuclei are derived from naturally occurring cells, from recombinant cells, or from a mixture of the two.

In certain embodiments, the recombinant cell refers to a cell comprising a modified (e.g., artificially modified) nucleic acid molecule (e.g., a gene) and/or its product (e.g., a protein or RNA), wherein the modification includes, but is not limited to, increasing or decreasing the copy number of an endogenous gene of the cell, mutating an endogenous gene of the cell, upregulating, downregulating, or silencing the expression of an endogenous gene product of the cell, and introducing an exogenous nucleic acid molecule into the cell (the exogenous nucleic acid molecule being integrated into the genome of the cell or existing in a non-integrated form).

The method of the present application can be used to label unmodified nucleic acid molecules in the cells/cell nuclei, and can also be used to label modified nucleic acid molecules in the cells/cell nuclei (e.g., modified endogenous nucleic acid molecules (e.g., genes) of the cell, or introduced exogenous nucleic acid molecules contained in the cell).

In certain embodiments, in step (3), cells or cell nuclei containing the first nucleic acid molecules and originating from at least 2 (e.g., at least 10, at least 10², at least 10³, at least 10⁴, at least 10⁵, at least 10⁶, at least 10⁷, at least 10⁸, 2-10, 2-10², 2-10³, 2-10⁴, 2-10⁵, 2-10⁶, 2-10⁷, 2-10⁸, or 2-10⁹) of the first discrete partitions are mixed and redistributed into different second discrete partitions.

Those skilled in the art will readily understand that, because the first tag sequence contained in the first oligonucleotide molecules is specific to a first discrete partition, all first nucleic acid molecules derived in step (2) from cells or cell nuclei in the same first discrete partition contain the same first tag sequence or its complementary sequence. Likewise, because the second tag sequence contained in the second oligonucleotide molecules is specific to a second discrete partition, all second nucleic acid molecules derived in step (4) from cells or cell nuclei assigned to the same second discrete partition contain the same second tag sequence or its complementary sequence. The first tag sequence and the second tag sequence can therefore be used together to identify the cell from which the sequencing data originate.
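The identification logic in the preceding paragraph can be sketched as grouping reads by the pair (first tag, second tag). The second function estimates how often cells that shared a first partition remain distinguishable after redistribution; it assumes each cell lands in a second partition uniformly and independently at random, which is an idealization not stated in the source. All names and tag strings are hypothetical.

```python
from collections import defaultdict
from math import prod


def group_reads_by_cell(reads):
    """Group read IDs by their (first_tag, second_tag) combination.

    reads: iterable of (first_tag, second_tag, read_id) tuples.
    """
    groups = defaultdict(list)
    for first_tag, second_tag, read_id in reads:
        groups[(first_tag, second_tag)].append(read_id)
    return dict(groups)


def p_all_distinct(cells_in_partition, n_second_partitions):
    """Probability that k cells sharing one first tag all receive different
    second tags, under uniform independent redistribution (an assumption)."""
    k, w = cells_in_partition, n_second_partitions
    if k > w:
        return 0.0
    return float(prod((w - i) / w for i in range(k)))


if __name__ == "__main__":
    reads = [
        ("BC1-A", "BC2-01", "read1"),
        ("BC1-A", "BC2-01", "read2"),
        ("BC1-A", "BC2-07", "read3"),  # same droplet, different well
    ]
    print(group_reads_by_cell(reads))
    print(p_all_distinct(2, 96))
```

Under this idealization, two cells that shared a first partition keep the same tag combination only when they also fall into the same second partition, which for 96 second partitions happens with probability 1/96.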

It will be readily understood that the labeling method provided in the present application can be used for high-throughput library construction and sequencing in single-cell omics. Therefore, in step (1), the cells or cell nuclei may come from the same source or be a mixture from different sources: they may be derived from the same cell line or different cell lines, the same tissue or different tissues, the same individual or different individuals, and the same species or different species. The cells or cell nuclei may also be a mixture of cells and cell nuclei.

In certain embodiments, a single first discrete partition contains one bead.

In certain embodiments, the first discrete partitions each independently contain one or more cells or cell nuclei.

In certain embodiments, the first discrete partitions each independently contain 0-10 (e.g., 0-2, 0-3, 0-4, 0-5, 0-8, 1-2, 1-3, 1-4, 1-5, 1-8, 1-10, 2-3, 2-4, 2-5, 2-8, 2-10, 3-4, 3-5, 3-8, 3-10, 4-5, 4-8, or 4-10) cells or cell nuclei.

In certain embodiments, the second discrete partitions each independently contain one or more cells or cell nuclei that originate from the first discrete partitions and contain the first nucleic acid molecules.

In certain embodiments, the second discrete partitions each independently contain 0-10⁷ (e.g., 0-10, 0-10², 0-10³, 0-10⁴, 0-10⁵, 0-10⁶, 0-10⁷, 1-10, 1-10², 1-10³, 1-10⁴, 1-10⁵, 1-10⁶, or 1-10⁷) cells or cell nuclei that originate from the first discrete partitions and contain the first nucleic acid molecules.

In certain embodiments, in step (2), the method uses a droplet microfluidic system or a microwell plate system to randomly distribute the plurality of beads and the plurality of cells or cell nuclei into different first discrete partitions.

In certain embodiments, the droplet microfluidic system is selected from, but not limited to: the water-in-oil microfluidic system of the 10X GENOMICS platform, the microfluidic system of the Fluidigm C1 platform, and the microfluidic system of the Bio-Rad ddSEQ system.

In certain embodiments, the microplate system is selected from, but not limited to: the microplate system of the BD Rhapsody platform and the microplate system of the Singleron (新格元) platform.

在某些实施方案中,所述方法使用甲醇对细胞进行固定和透化,或者,使用甲醛或多聚甲醛以及Triton X-100对细胞进行固定和透化。In some embodiments, the method uses methanol to fix and permeabilize the cells, or uses formaldehyde or paraformaldehyde and Triton X-100 to fix and permeabilize the cells.

在某些实施方案中,所述方法通过选自以下的处理对细胞进行固定和透化:In certain embodiments, the method fixes and permeabilizes the cells by a treatment selected from the group consisting of:

(i)在-40℃至-10℃(例如,-25℃至-15℃或-20℃)的条件下,使用浓度为60%-100%(例如,60%-80%、70%-80%、70%-85%、70%-90%、75%-80%、75%-85%、75%-90%、75%-100%、80%-85%、80%-90%、80%-100%或80%)的甲醇处理细胞5-30min(例如,8-20min或10min)对细胞进行固定和透化;(i) treating the cells with 60%-100% (e.g., 60%-80%, 70%-80%, 70%-85%, 70%-90%, 75%-80%, 75%-85%, 75%-90%, 75%-100%, 80%-85%, 80%-90%, 80%-100%, or 80%) methanol at -40°C to -10°C (e.g., -25°C to -15°C or -20°C) for 5-30 min (e.g., 8-20 min or 10 min) to fix and permeabilize the cells;

或者,or,

(ii)(a)在0℃至37℃(例如,15℃至30℃或25℃)的条件下,使用浓度为0.05%-5%(例如,0.5%-1%、0.5%-2%、0.5%-3%、0.5%-4%、0.5%-5%或1%)的甲醛或多聚甲醛处理细胞5-30min(例如,5-20min或10min)对细胞进行固定;和,(ii)(a) fixing the cells by treating the cells with formaldehyde or paraformaldehyde at a concentration of 0.05%-5% (e.g., 0.5%-1%, 0.5%-2%, 0.5%-3%, 0.5%-4%, 0.5%-5%, or 1%) at 0°C to 37°C (e.g., 15°C to 30°C or 25°C) for 5-30 minutes (e.g., 5-20 minutes or 10 minutes); and,

(b) permeabilizing the cells by treating the cells with Triton X-100 at a concentration of 0.05%-2% (e.g., 0.05%-0.2%, 0.05%-0.25%, 0.05%-0.3%, 0.05%-0.5%, 0.05%-0.8%, 0.05%-1%, 0.1%-0.2%, 0.1%-0.25%, 0.1%-0.3%, 0.1%-0.4%, 0.1%-0.5%, 0.1%-0.8%, 0.1%-1%, 0.2%-0.25%, 0.2%-0.3%, 0.2%-0.4%, 0.2%-0.5%, 0.2%-0.8%, 0.2%-1% or 0.2%) at -4°C to 10°C (e.g., 0°C to 4°C) for 0.5-10 min (e.g., 1-5 min or 3 min).

在某些实施方案中,通过在-20℃的条件下,使用浓度为80%的甲醇处理细胞10min来对细胞进行固定和透化。In certain embodiments, cells are fixed and permeabilized by treating the cells with 80% methanol for 10 min at -20°C.

在某些实施方案中,所述方法使用甲醛或多聚甲醛以及digitonin对细胞核进行固定和透化。In certain embodiments, the methods use formaldehyde or paraformaldehyde and digitonin to fix and permeabilize the nuclei.

In certain embodiments, the method further comprises permeabilizing the cell nuclei with IGEPAL (e.g., CA-630) and/or Tween-20.

在某些实施方案中,所述方法通过选自以下的处理对细胞核进行固定和透化:In certain embodiments, the method fixes and permeabilizes the cell nuclei by a treatment selected from the group consisting of:

(i)对细胞核进行固定,所述固定方法选自:(i) fixing the cell nucleus, wherein the fixing method is selected from:

(a)在0℃至30℃(例如,15℃至28℃或25℃)的条件下,使用浓度为0.05%-4%(例如,0.5%-1%、0.5%-2%、0.5%-3%、0.5%-4%或1%)的甲醛处理细胞核2-20min(例如,5-15min或10min)对细胞核进行固定;或者,(a) fixing the cell nuclei by treating the cell nuclei with formaldehyde at a concentration of 0.05%-4% (e.g., 0.5%-1%, 0.5%-2%, 0.5%-3%, 0.5%-4% or 1%) at 0°C to 30°C (e.g., 15°C to 28°C or 25°C) for 2-20 min (e.g., 5-15 min or 10 min); or,

(b)在0℃至30℃(例如,15℃至28℃或25℃)的条件下,使用浓度为0.05%-4%(例如,0.5%-2%、0.5%-3%、0.5%-4%或1.6%)的多聚甲醛处理细胞核1-15min(例如,1-10min或5min)对细胞核进行固定;(b) treating the cell nuclei with paraformaldehyde at a concentration of 0.05%-4% (e.g., 0.5%-2%, 0.5%-3%, 0.5%-4% or 1.6%) at 0°C to 30°C (e.g., 15°C to 28°C or 25°C) for 1-15 min (e.g., 1-10 min or 5 min) to fix the cell nuclei;

以及,as well as,

(ii)使用包含digitonin的透化液在-4℃至10℃(例如,0℃至4℃)的条件下处理细胞核0.5-10min(例如,1-5min或3min)对细胞核进行透化。(ii) permeabilizing the cell nuclei by treating the cell nuclei with a permeabilization solution containing digitonin at -4°C to 10°C (eg, 0°C to 4°C) for 0.5-10 min (eg, 1-5 min or 3 min).

在某些实施方案中,所述透化液进一步包含IGEPAL(例如,CA-630)和/或Tween-20。In certain embodiments, the permeabilization solution further comprises IGEPAL (e.g., CA-630) and/or Tween-20.

在某些实施方案中,所述透化液中,digitonin的浓度为0.0005%-0.05%(例如,0.0008%-0.005%、0.0005%-0.002%、0.0008%-0.002%或0.001%)。In certain embodiments, the concentration of digitonin in the permeabilization solution is 0.0005%-0.05% (eg, 0.0008%-0.005%, 0.0005%-0.002%, 0.0008%-0.002%, or 0.001%).

In certain embodiments, the concentration of IGEPAL (e.g., CA-630) in the permeabilization solution is 0.005%-0.1% (e.g., 0.005%-0.05%, 0.008%-0.05%, 0.005%-0.02%, 0.008%-0.02%, or 0.01%).

在某些实施方案中,所述透化液中,Tween-20的浓度为0.005%-0.1%(例如,0.005%-0.05%、0.008%-0.05%、0.005%-0.02%、0.008%-0.02%或0.01%)。In certain embodiments, the concentration of Tween-20 in the permeabilization solution is 0.005%-0.1% (eg, 0.005%-0.05%, 0.008%-0.05%, 0.005%-0.02%, 0.008%-0.02%, or 0.01%).

In certain embodiments, the cell nuclei are fixed by treating them with 1% formaldehyde for 10 min at room temperature and, after fixation, are permeabilized by treating them with a permeabilization solution containing 0.001% digitonin, 0.01% IGEPAL (e.g., CA-630) and 0.01% Tween-20 at 0°C to 4°C for 2-4 min (e.g., 3 min).
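As a worked example of preparing the permeabilization solution just described, the sketch below applies the standard dilution relation C1·V1 = C2·V2. The stock concentrations (1% digitonin, 10% IGEPAL CA-630, 10% Tween-20) and the 1000 µL final volume are hypothetical values chosen for illustration; the application does not specify them.

```python
def stock_volume_ul(final_pct: float, stock_pct: float, final_volume_ul: float) -> float:
    """Volume of stock needed so that the final solution has final_pct (C1*V1 = C2*V2)."""
    if stock_pct <= 0 or final_pct < 0 or final_pct > stock_pct:
        raise ValueError("stock must be positive and at least as concentrated as the target")
    return final_pct / stock_pct * final_volume_ul


if __name__ == "__main__":
    final_volume_ul = 1000.0  # hypothetical batch size
    # (target %, assumed stock %) for each detergent; targets are from the text above.
    recipe = {
        "digitonin": (0.001, 1.0),
        "IGEPAL CA-630": (0.01, 10.0),
        "Tween-20": (0.01, 10.0),
    }
    total = 0.0
    for name, (target, stock) in recipe.items():
        vol = stock_volume_ul(target, stock, final_volume_ul)
        total += vol
        print(f"{name}: {vol:.2f} uL of {stock}% stock")
    print(f"buffer to add: {final_volume_ul - total:.2f} uL")
```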

在某些实施方案中,所述标记来自细胞或细胞核的核酸分子的方法具备选自以下的一项或多项:In certain embodiments, the method of labeling nucleic acid molecules from cells or cell nuclei comprises one or more selected from the following:

(1) in step (1), at least 2 (e.g., at least 10, at least 10^2, at least 10^3, at least 10^4, at least 10^5, at least 10^6, at least 10^7, 2-10, 2-10^2, 2-10^3, 2-10^4, 2-10^5, 2-10^6, 2-10^7, 2-10^8 or 2-10^9) cells or cell nuclei are provided; and/or at least 2 (e.g., at least 10, at least 10^2, at least 10^3, at least 10^4, at least 10^5, at least 10^6, at least 10^7, at least 10^8, 2-10, 2-10^2, 2-10^3, 2-10^4, 2-10^5, 2-10^6, 2-10^7, 2-10^8 or 2-10^9) beads are provided;

(2) the first discrete partitions are discrete microwells or discrete microdroplets (e.g., water-in-oil droplets);

(3) the beads are coupled to at least 2 (e.g., at least 10, at least 10^2, at least 10^3, at least 10^4, at least 10^5, at least 10^6, at least 10^7, at least 10^8, 2-10, 2-10^2, 2-10^3, 2-10^4, 2-10^5, 2-10^6, 2-10^7, 2-10^8 or 2-10^9) of the first oligonucleotide molecules;

(4)所述珠粒能够自发地或在暴露于一种或多种刺激(例如,温度变化、pH变化、暴露于特定化学物质或相、暴露于光、还原剂等)时释放所述第一寡核苷酸分子;(4) the bead is capable of releasing the first oligonucleotide molecule spontaneously or upon exposure to one or more stimuli (e.g., temperature change, pH change, exposure to a specific chemical or phase, exposure to light, a reducing agent, etc.);

(5)所述珠粒是凝胶珠粒;(5) The beads are gel beads;

(6) in step (3), the cells or cell nuclei are distributed into at least 2 (e.g., at least 3, at least 4, at least 5, at least 8, at least 10, at least 12, at least 20, at least 24, at least 50, at least 96, at least 100, at least 200, at least 384, at least 400, 2-5, 2-10, 2-50, 2-80, 2-100, 2-500, 2-10^3, 2-10^4, 2-10^5 or 2-10^6) of the second discrete partitions, wherein each of the second discrete partitions contains at least one cell or cell nucleus;

(7) the second discrete partitions are discrete wells of a multi-well plate;

(8)步骤(3)之后,步骤(4)之前,所述方法还包括裂解细胞和/或对所述第一核酸分子进行纯化的步骤。(8) After step (3) and before step (4), the method further includes the steps of lysing cells and/or purifying the first nucleic acid molecule.
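One way to read the two-level partitioning in items (1) to (8) is that cells which shared a first discrete partition carry the same first tag and become distinguishable once they land in different second discrete partitions. The sketch below estimates how likely that is. It assumes cells are assigned to second partitions uniformly and independently, which is an illustrative assumption rather than a statement from the application; the function name is hypothetical.

```python
def prob_all_resolved(n_cells: int, n_second_partitions: int) -> float:
    """Probability that n_cells sharing one first tag all fall into
    different second partitions under uniform random assignment."""
    if n_cells < 0 or n_second_partitions < 1:
        raise ValueError("need n_cells >= 0 and at least one second partition")
    if n_cells > n_second_partitions:
        return 0.0  # pigeonhole: at least two cells must share a partition
    p = 1.0
    for i in range(n_cells):
        p *= (n_second_partitions - i) / n_second_partitions
    return p


if __name__ == "__main__":
    # Hypothetical plate formats taken from the ranges listed in item (6).
    for wells in (24, 96, 384):
        for cells in (2, 5, 10):
            print(wells, cells, round(prob_all_resolved(cells, wells), 4))
```

Under this model, increasing the number of second discrete partitions raises the chance that co-encapsulated cells receive distinct tag combinations.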

进一步,本发明的标记方法分别为:Further, the labeling methods of the present invention are respectively:

The nucleic acid molecule to be labeled is mRNA:

在某些实施方案中,所述待标记的核酸分子为mRNA,并且,所述第一寡核苷酸分子为第一寡核苷酸分子a。In certain embodiments, the nucleic acid molecule to be labeled is mRNA, and the first oligonucleotide molecule is the first oligonucleotide molecule a.

在某些实施方案中,所述步骤(2)包括以下步骤:In certain embodiments, the step (2) comprises the following steps:

(i)(a)在所述第一离散分区内,用所述第一寡核苷酸分子a对所述待标记的核酸分子进行逆转录,生成cDNA链,所述cDNA链包含以所述第一寡核苷酸分子a为逆转录引物形成的与所述待标记核酸分子互补的cDNA序列,以及3’末端悬突;其中,所述第一寡核苷酸分子a从5’端至3’端包含:共有序列R1或其部分序列、所述第一标签序列、poly(T)序列;和,(b)将引物A与(a)中生成的所述cDNA链进行退火,并进行延伸反应,生成延伸产物,所述延伸产物即为所述第一核酸分子;其中,所述引物A从5’端至3’端包含共有序列O和所述3’末端悬突的互补序列;(i) (a) in the first discrete partition, reversely transcribe the nucleic acid molecule to be labeled with the first oligonucleotide molecule a to generate a cDNA chain, wherein the cDNA chain comprises a cDNA sequence complementary to the nucleic acid molecule to be labeled formed by using the first oligonucleotide molecule a as a reverse transcription primer, and a 3' terminal overhang; wherein the first oligonucleotide molecule a comprises, from the 5' end to the 3' end: a consensus sequence R1 or a partial sequence thereof, the first tag sequence, and a poly(T) sequence; and, (b) annealing primer A with the cDNA chain generated in (a), and performing an extension reaction to generate an extension product, wherein the extension product is the first nucleic acid molecule; wherein the primer A comprises, from the 5' end to the 3' end, a consensus sequence O and a complementary sequence to the 3' terminal overhang;

或者,or,

(ii) (a) in the first discrete partition, reverse-transcribing the nucleic acid molecule to be labeled with primer B to generate a cDNA chain, wherein the cDNA chain comprises a cDNA sequence that is complementary to the nucleic acid molecule to be labeled and is formed using primer B as the reverse transcription primer, and a 3' terminal overhang; wherein primer B comprises, from the 5' end to the 3' end, a consensus sequence T or a partial sequence thereof and a poly(T) sequence; and (b) in the first discrete partition, annealing the first oligonucleotide molecule a to the cDNA chain generated in (a) and performing an extension reaction to generate an extension product, the extension product being the first nucleic acid molecule; wherein the first oligonucleotide molecule a comprises, from the 5' end to the 3' end: a consensus sequence R1 or a partial sequence thereof, the first tag sequence, and a sequence complementary to the 3' terminal overhang.

Various suitable methods can be used to form or add an overhang at the 3' end of the cDNA chain. In certain embodiments, the overhang can be formed or added at the 3' end of the cDNA chain by using a reverse transcriptase with terminal transferase activity.

在某些实施方案中,所述步骤(ii)中,步骤(ii)(a)在将所述细胞或细胞核分配到所述第一离散分区之前或之后进行。In certain embodiments, in said step (ii), step (ii)(a) is performed before or after assigning said cells or cell nuclei to said first discrete partitions.

在某些实施方案中,所述第一寡核苷酸分子a进一步包含独特分子标签序列,并且,同一个珠粒上偶联的多个所述第一寡核苷酸分子a具有彼此不同的独特分子标签序列。在某些实施方案中,所述独特分子标签序列位于所述共有序列R1或其部分序列的3’端。In some embodiments, the first oligonucleotide molecule a further comprises a unique molecular tag sequence, and the plurality of first oligonucleotide molecules a coupled to the same bead have unique molecular tag sequences different from each other. In some embodiments, the unique molecular tag sequence is located at the 3' end of the consensus sequence R1 or a partial sequence thereof.
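The layout of the first oligonucleotide molecule a in option (i) can be sketched as a simple concatenation. In the sketch below all sequences, lengths, and names are hypothetical placeholders, and placing the unique molecular tag between the first tag and the poly(T) stretch is an assumption; the text only states that it lies 3' of consensus sequence R1.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class FirstOligoA:
    """5'->3' layout: R1, first tag, unique molecular tag, poly(T) (order of tag/UMI assumed)."""
    r1: str
    first_tag: str
    umi: str
    poly_t_len: int = 20

    def sequence(self) -> str:
        return self.r1 + self.first_tag + self.umi + "T" * self.poly_t_len


def parse_tag_and_umi(seq: str, r1: str, tag_len: int, umi_len: int) -> Tuple[str, str]:
    """Recover the first tag and unique molecular tag from a sequence that starts with R1."""
    if not seq.startswith(r1):
        raise ValueError("sequence does not start with the expected R1")
    start = len(r1)
    if len(seq) < start + tag_len + umi_len:
        raise ValueError("sequence too short to contain tag and UMI")
    tag = seq[start:start + tag_len]
    umi = seq[start + tag_len:start + tag_len + umi_len]
    return tag, umi


if __name__ == "__main__":
    # Two oligos on the same (hypothetical) bead: same first tag, different UMIs.
    a = FirstOligoA("ACGTACGT", "AACCGGTT", "GATTACA")
    b = FirstOligoA("ACGTACGT", "AACCGGTT", "TTGACCA")
    print(parse_tag_and_umi(a.sequence(), "ACGTACGT", 8, 7))
    print(parse_tag_and_umi(b.sequence(), "ACGTACGT", 8, 7))
```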

在某些实施方案中,所述共有序列O与所述共有序列T相同或部分相同。In certain embodiments, the consensus sequence O is identical or partially identical to the consensus sequence T.

在某些实施方案中,所述3’末端悬突具有至少1个,至少2个,至少3个,至少4个,至少5个,至少6个,至少7个,至少8个,至少9个,至少10个,1-10个,1-5个或2-10个核苷酸的长度。在某些实施方案中,所述3’末端悬突为2-5个胞嘧啶核苷酸的悬突(例如CCC悬突)。In certain embodiments, the 3' terminal overhang has a length of at least 1, at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, 1-10, 1-5, or 2-10 nucleotides. In certain embodiments, the 3' terminal overhang is an overhang of 2-5 cytosine nucleotides (e.g., a CCC overhang).
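Because primer A carries, at its 3' end, the sequence complementary to the cDNA 3' terminal overhang, that portion can be derived by reverse complementation. The sketch below is illustrative only: the consensus sequence O shown is a made-up placeholder, and plain DNA bases are assumed throughout.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    seq = seq.upper()
    if set(seq) - set("ACGT"):
        raise ValueError("only A, C, G, T are supported")
    return seq.translate(_COMPLEMENT)[::-1]


def primer_a(consensus_o: str, overhang: str = "CCC") -> str:
    """5'->3': consensus sequence O followed by the complement of the 3' overhang."""
    return consensus_o + reverse_complement(overhang)


if __name__ == "__main__":
    # "AAGCAGTGGT" is a hypothetical stand-in for consensus sequence O.
    print(primer_a("AAGCAGTGGT"))         # overhang CCC
    print(primer_a("AAGCAGTGGT", "CC"))   # shorter overhang within the 2-5 C range
```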

在某些实施方案中,所述步骤(4)包括以下步骤:In certain embodiments, step (4) comprises the following steps:

在所述第二离散分区内,以所述第二寡核苷酸分子和引物C为引物扩增所述第一核酸分子,生成的延伸产物即为所述第二核酸分子;In the second discrete partition, the first nucleic acid molecule is amplified using the second oligonucleotide molecule and primer C as primers, and the generated extension product is the second nucleic acid molecule;

其中,所述第二寡核苷酸分子从5’端至3’端包含:共有序列P1或其部分序列、所述第二标签序列、所述共有序列R1或其部分序列;所述引物C包含所述共有序列O或其部分序列,或者,所述引物C包含共有序列T或其部分序列。Wherein, the second oligonucleotide molecule comprises from the 5' end to the 3' end: the consensus sequence P1 or a partial sequence thereof, the second tag sequence, the consensus sequence R1 or a partial sequence thereof; the primer C comprises the consensus sequence O or a partial sequence thereof, or the primer C comprises the consensus sequence T or a partial sequence thereof.
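Given the layouts above, one strand of the second nucleic acid molecule would begin, 5' to 3', with P1, the second tag, R1, and then the first tag. The sketch below reads both tags from such a strand. It assumes full-length P1 and R1 and fixed tag lengths; all sequences and the function name are hypothetical.

```python
from typing import Tuple


def split_tags(molecule: str, p1: str, r1: str,
               second_tag_len: int, first_tag_len: int) -> Tuple[str, str]:
    """Return (first_tag, second_tag) from a strand laid out as
    P1 + second tag + R1 + first tag + remainder."""
    if not molecule.startswith(p1):
        raise ValueError("strand does not start with P1")
    pos = len(p1)
    second_tag = molecule[pos:pos + second_tag_len]
    pos += second_tag_len
    if molecule[pos:pos + len(r1)] != r1:
        raise ValueError("R1 not found after the second tag")
    pos += len(r1)
    first_tag = molecule[pos:pos + first_tag_len]
    if len(second_tag) < second_tag_len or len(first_tag) < first_tag_len:
        raise ValueError("strand too short to contain both tags")
    return first_tag, second_tag


if __name__ == "__main__":
    p1, r1 = "AATGATACGG", "ACACGACGCT"  # placeholder consensus sequences
    strand = p1 + "TTAGGC" + r1 + "CAGTCAGT" + "TTTTTTTTTT"
    print(split_tags(strand, p1, r1, 6, 8))
```

The pair of tags, rather than either tag alone, then serves as the identifier for the cell of origin.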

待标记的核酸分子为基因组DNA:The nucleic acid molecule to be labeled is genomic DNA:

在某些实施方案中,所述待标记的核酸分子为基因组DNA,并且,所述第一寡核苷酸分子为第一寡核苷酸分子b。In certain embodiments, the nucleic acid molecule to be labeled is genomic DNA, and the first oligonucleotide molecule is the first oligonucleotide molecule b.

在某些实施方案中,所述步骤(2)包括以下步骤:In certain embodiments, the step (2) comprises the following steps:

(a)将所述待标记核酸分子与转座酶复合体I孵育;其中,所述转座酶复合体I含有转座酶和所述转座酶能够识别并结合的转座序列,且能够切割或断裂双链核酸;并且,所述转座序列包含转移链和非转移链;其中,所述转移链包含第一转移链和第二转移链,所述第一转移链包含转座酶识别序列和共有序列R2或其部分序列,所述第二转移链包含转座酶识别序列和共有序列R1或其部分序列;并且,所述孵育在允许所述待标记的核酸分子被所述转座酶复合体I断裂成核酸片段且所述转移链被连接至所述核酸片段的末端(例如,所述核酸片段的5’端)的条件下进行;从而生成5’端分别含有共有序列R2或其部分序列以及共有序列R1或其部分序列的双链核酸片段;和,(a) incubating the nucleic acid molecule to be labeled with a transposase complex I; wherein the transposase complex I contains a transposase and a transposition sequence that the transposase can recognize and bind to, and can cut or break a double-stranded nucleic acid; and the transposition sequence comprises a transferred strand and a non-transferred strand; wherein the transferred strand comprises a first transferred strand and a second transferred strand, the first transferred strand comprises a transposase recognition sequence and a consensus sequence R2 or a partial sequence thereof, and the second transferred strand comprises a transposase recognition sequence and a consensus sequence R1 or a partial sequence thereof; and the incubation is performed under conditions that allow the nucleic acid molecule to be broken into nucleic acid fragments by the transposase complex I and the transferred strands are connected to the ends of the nucleic acid fragments (e.g., the 5' ends of the nucleic acid fragments); thereby generating double-stranded nucleic acid fragments whose 5' ends contain the consensus sequence R2 or a partial sequence thereof and the consensus sequence R1 or a partial sequence thereof, respectively; and,

(b) in the first discrete partition, connecting the first oligonucleotide molecule b to the double-stranded nucleic acid fragment generated in (a) (for example, by using a nuclease to connect), and performing an extension reaction to generate an extension product, the extension product being the first nucleic acid molecule; wherein the first oligonucleotide molecule b comprises, from the 5' end to the 3' end: a consensus sequence P1 or a partial sequence thereof and the first tag sequence.

在某些实施方案中,所述第一核酸分子包含源自所述细胞或细胞核中处于染色质开放区的基因组DNA片段的序列。In certain embodiments, the first nucleic acid molecule comprises a sequence derived from a genomic DNA fragment in an open chromatin region in the cell or cell nucleus.

在某些实施方案中,所述步骤(a)在将所述细胞或细胞核分配到所述第一离散分区之前或之后进行。In certain embodiments, step (a) is performed before or after partitioning the cells or cell nuclei into the first discrete partitions.

在某些实施方案中,所述转座酶复合体I的共有序列R1或其部分序列的5’端是磷酸化的。In certain embodiments, the 5' end of the consensus sequence R1 of the transposase complex I or a partial sequence thereof is phosphorylated.

在某些实施方案中,所述步骤(4)包括以下步骤:In certain embodiments, step (4) comprises the following steps:

在所述第二离散分区内,以所述第二寡核苷酸分子和引物D为引物扩增所述第一核酸分子,生成的延伸产物即为所述第二核酸分子;In the second discrete partition, the first nucleic acid molecule is amplified using the second oligonucleotide molecule and primer D as primers, and the generated extension product is the second nucleic acid molecule;

其中,所述第二寡核苷酸分子从5’端至3’端包含:共有序列P2或其部分序列、所述第二标签序列、共有序列R2或其部分序列;所述引物D包含共有序列P1或其部分序列。Wherein, the second oligonucleotide molecule comprises from the 5' end to the 3' end: the consensus sequence P2 or a partial sequence thereof, the second tag sequence, the consensus sequence R2 or a partial sequence thereof; and the primer D comprises the consensus sequence P1 or a partial sequence thereof.

The nucleic acid molecules to be labeled are mRNA and genomic DNA from the same cell:

在某些实施方案中,所述待标记的核酸分子为mRNA和基因组DNA,并且,所述mRNA和基因组DNA具有相同的细胞来源;In certain embodiments, the nucleic acid molecules to be labeled are mRNA and genomic DNA, and the mRNA and genomic DNA have the same cell source;

并且,所述第一寡核苷酸分子包括第一寡核苷酸分子a和第一寡核苷酸分子b,所述第二寡核苷酸分子包括第二寡核苷酸分子a和第二寡核苷酸分子b;Furthermore, the first oligonucleotide molecule includes a first oligonucleotide molecule a and a first oligonucleotide molecule b, and the second oligonucleotide molecule includes a second oligonucleotide molecule a and a second oligonucleotide molecule b;

其中,所述珠粒同时偶联了多个所述第一寡核苷酸分子a和多个所述第一寡核苷酸分子b;并且,同一个珠粒上的所述多个第一寡核苷酸分子a和多个所述第一寡核苷酸分子b具有相同的第一标签序列。The beads are coupled to a plurality of the first oligonucleotide molecules a and a plurality of the first oligonucleotide molecules b at the same time; and the plurality of the first oligonucleotide molecules a and the plurality of the first oligonucleotide molecules b on the same bead have the same first tag sequence.

在某些实施方案中,所述步骤(2)包括以下步骤:In certain embodiments, the step (2) comprises the following steps:

(A)(i)(a)在所述第一离散分区内,用所述第一寡核苷酸分子a对所述待标记的mRNA分子进行逆转录,生成cDNA链,所述cDNA链包含以所述第一寡核苷酸分子a为逆转录引物形成的与所述待标记mRNA分子互补的cDNA序列,以及3’末端悬突;其中,所述第一寡核苷酸分子a从5’端至3’端包含:共有序列R1或其部分序列、所述第一标签序列、poly(T)序列;和,(b)将引物A与(a)中生成的所述cDNA链进行退火,并进行延伸反应,生成延伸产物,所述延伸产物即为第一核酸分子a;其中,所述引物A从5’端至3’端包含共有序列O和所述3’末端悬突的互补序列;(A)(i)(a) in the first discrete partition, reversely transcribe the mRNA molecule to be labeled with the first oligonucleotide molecule a to generate a cDNA chain, wherein the cDNA chain comprises a cDNA sequence complementary to the mRNA molecule to be labeled formed by using the first oligonucleotide molecule a as a reverse transcription primer, and a 3' terminal overhang; wherein the first oligonucleotide molecule a comprises, from the 5' end to the 3' end: a consensus sequence R1 or a partial sequence thereof, the first tag sequence, and a poly(T) sequence; and, (b) annealing primer A with the cDNA chain generated in (a), and performing an extension reaction to generate an extension product, wherein the extension product is the first nucleic acid molecule a; wherein the primer A comprises, from the 5' end to the 3' end, a consensus sequence O and a complementary sequence to the 3' terminal overhang;

或者,or,

(ii) (a) in the first discrete partition, reverse-transcribing the mRNA molecule to be labeled with primer B to generate a cDNA chain, wherein the cDNA chain comprises a cDNA sequence that is complementary to the mRNA molecule to be labeled and is formed using primer B as the reverse transcription primer, and a 3' terminal overhang; wherein primer B comprises, from the 5' end to the 3' end, a consensus sequence T or a partial sequence thereof and a poly(T) sequence; and (b) in the first discrete partition, annealing the first oligonucleotide molecule a to the cDNA chain generated in (a) and performing an extension reaction to generate an extension product, the extension product being the first nucleic acid molecule a; wherein the first oligonucleotide molecule a comprises, from the 5' end to the 3' end: a consensus sequence R1 or a partial sequence thereof, the first tag sequence, and a sequence complementary to the 3' terminal overhang;

和,and,

(B)(a)将所述待标记DNA分子与转座酶复合体I孵育;其中,所述转座酶复合体I如权利要求9中所定义;并且,所述孵育在允许所述待标记的DNA分子被所述转座酶复合体I断裂成核酸片段且所述转移链被连接至所述核酸片段的末端(例如,所述核酸片段的5’端)的条件下进行;从而生成5’端分别含有共有序列R2或其部分序列以及共有序列R1或其部分序列的双链核酸片段;和,(B)(a) incubating the DNA molecule to be labeled with a transposase complex I; wherein the transposase complex I is as defined in claim 9; and the incubation is performed under conditions that allow the DNA molecule to be broken into nucleic acid fragments by the transposase complex I and the transferred strand to be connected to the end of the nucleic acid fragment (e.g., the 5' end of the nucleic acid fragment); thereby generating a double-stranded nucleic acid fragment having a consensus sequence R2 or a partial sequence thereof and a consensus sequence R1 or a partial sequence thereof at the 5' end, respectively; and,

(b)在与(A)相同的所述第一离散分区内,将所述第一寡核苷酸分子b与(a)中生成的所述双链核酸片段进行连接,并进行延伸反应,生成延伸产物,所述延伸产物即为第一核酸分子b;其中,所述第一寡核苷酸分子b从5’端至3’端包含:共有序列P1或其部分序列和所述第一标签序列;(b) in the first discrete partition that is the same as (A), the first oligonucleotide molecule b is connected to the double-stranded nucleic acid fragment generated in (a), and an extension reaction is performed to generate an extension product, wherein the extension product is the first nucleic acid molecule b; wherein the first oligonucleotide molecule b comprises, from the 5' end to the 3' end: a consensus sequence P1 or a partial sequence thereof and the first tag sequence;

其中,所述步骤(A)和所述步骤(B)可以以任意顺序进行(例如,先(A)后(B),先(B)后(A),或同时进行)。Wherein, the step (A) and the step (B) may be performed in any order (for example, (A) first and then (B), (B) first and then (A), or simultaneously).

在某些实施方案中,所述步骤(A)(ii)中,步骤(A)(ii)(a)在将所述细胞或细胞核分配到所述第一离散分区之前或之后进行。In certain embodiments, in said step (A)(ii), step (A)(ii)(a) is performed before or after assigning said cells or cell nuclei to said first discrete partitions.

在某些实施方案中,所述步骤(B)(a)在将所述细胞或细胞核分配到所述第一离散分区之前或之后进行。In certain embodiments, step (B)(a) is performed before or after partitioning the cells or cell nuclei into the first discrete partitions.

在某些实施方案中,所述共有序列O与所述共有序列T相同或部分相同。In certain embodiments, the consensus sequence O is identical or partially identical to the consensus sequence T.

在某些实施方案中,所述第一核酸分子b包含源自所述细胞或细胞核中处于染色质开放区的基因组DNA片段的序列。In certain embodiments, the first nucleic acid molecule b comprises a sequence derived from a genomic DNA fragment in an open chromatin region in the cell or cell nucleus.

In certain embodiments, the 5' end of the consensus sequence R1 or a partial sequence thereof in the transposase complex I is phosphorylated.

在某些实施方案中,所述第一寡核苷酸分子a进一步包含独特分子标签序列,并且,同一个珠粒上偶联的多个所述第一寡核苷酸分子a具有彼此不同的独特分子标签序列。在某些实施方案中,所述独特分子标签序列位于所述共有序列R1或其部分序列的3’端。In some embodiments, the first oligonucleotide molecule a further comprises a unique molecular tag sequence, and the plurality of first oligonucleotide molecules a coupled to the same bead have unique molecular tag sequences different from each other. In some embodiments, the unique molecular tag sequence is located at the 3' end of the consensus sequence R1 or a partial sequence thereof.

在某些实施方案中,所述3’末端悬突具有至少1个,至少2个,至少3个,至少4个,至少5个,至少6个,至少7个,至少8个,至少9个,至少10个,1-10个,1-5个或2-10个核苷酸的长度。在某些实施方案中,所述3’末端悬突为2-5个胞嘧啶核苷酸的悬突(例如CCC悬突)。In certain embodiments, the 3' terminal overhang has a length of at least 1, at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, 1-10, 1-5, or 2-10 nucleotides. In certain embodiments, the 3' terminal overhang is an overhang of 2-5 cytosine nucleotides (e.g., a CCC overhang).

在某些实施方案中,所述步骤(4)包括以下步骤:In certain embodiments, step (4) comprises the following steps:

(a)在所述第二离散分区内,以所述第二寡核苷酸分子a和引物C为引物扩增所述第一核酸分子a,生成的延伸产物即为所述第二核酸分子a;(a) amplifying the first nucleic acid molecule a in the second discrete partition using the second oligonucleotide molecule a and primer C as primers, and the generated extension product is the second nucleic acid molecule a;

其中,所述第二寡核苷酸分子a从5’端至3’端包含:共有序列P1或其部分序列、所述第二标签序列、所述共有序列R1或其部分序列;所述引物C包含共有序列O或其部分序列,或者,所述引物C包含共有序列T或其部分序列;Wherein, the second oligonucleotide molecule a comprises from the 5' end to the 3' end: the consensus sequence P1 or a partial sequence thereof, the second tag sequence, the consensus sequence R1 or a partial sequence thereof; the primer C comprises the consensus sequence O or a partial sequence thereof, or the primer C comprises the consensus sequence T or a partial sequence thereof;

and

(b)在同一所述第二离散分区内,以所述第二寡核苷酸分子b和引物D为引物扩增所述第一核酸分子b,生成的延伸产物即为所述第二核酸分子b;(b) amplifying the first nucleic acid molecule b in the same second discrete partition using the second oligonucleotide molecule b and primer D as primers, and the generated extension product is the second nucleic acid molecule b;

其中,所述第二寡核苷酸分子b从5’端至3’端包含:共有序列P2或其部分序列、所述第二标签序列、共有序列R2或其部分序列;所述引物D包含共有序列P1或其部分序列;Wherein, the second oligonucleotide molecule b comprises from the 5' end to the 3' end: the consensus sequence P2 or a partial sequence thereof, the second tag sequence, the consensus sequence R2 or a partial sequence thereof; the primer D comprises the consensus sequence P1 or a partial sequence thereof;

其中,所述步骤(a)和所述步骤(b)可以以任意顺序进行(例如,先(a)后(b),先(b)后(a),或同时进行)。Wherein, the step (a) and the step (b) may be performed in any order (for example, (a) first and then (b), (b) first and then (a), or simultaneously).
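Since the second nucleic acid molecules a (from mRNA) and b (from genomic DNA) of one cell share the same first tag and are amplified in the same second discrete partition, the two data types can be matched by their tag pair. The sketch below is a minimal illustration; the record format `(first_tag, second_tag, modality, payload)` and all values are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Record = Tuple[str, str, str, str]  # (first_tag, second_tag, modality, payload)


def group_by_cell(records: Iterable[Record]) -> Dict[Tuple[str, str], Dict[str, List[str]]]:
    """Collect mRNA-derived and DNA-derived records under their shared tag pair."""
    cells: Dict[Tuple[str, str], Dict[str, List[str]]] = defaultdict(
        lambda: {"rna": [], "dna": []}
    )
    for first_tag, second_tag, modality, payload in records:
        if modality not in ("rna", "dna"):
            raise ValueError(f"unknown modality: {modality}")
        cells[(first_tag, second_tag)][modality].append(payload)
    return dict(cells)


if __name__ == "__main__":
    demo = [
        ("AACC", "T1", "rna", "geneX"),
        ("AACC", "T1", "dna", "chr1:1000-1200"),
        ("AACC", "T2", "rna", "geneY"),  # same first tag, different second partition
    ]
    for key, value in group_by_cell(demo).items():
        print(key, value)
```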

进一步,本发明的建库方法分别为:Further, the library construction methods of the present invention are respectively:

在第二方面,本申请还提供了一种构建核酸分子文库的方法,其包括,In a second aspect, the present application also provides a method for constructing a nucleic acid molecule library, which comprises:

(1)根据第一方面任一项所述的方法生成多个经标记的所述第二核酸分子,以及,(1) generating a plurality of labeled second nucleic acid molecules according to the method described in any one of the first aspects, and,

(2)回收和/或合并多个所述第二核酸分子,(2) recovering and/or combining a plurality of said second nucleic acid molecules,

从而获得核酸分子文库。Thus, a nucleic acid molecule library is obtained.

在某些实施方案中,在步骤(2)中,回收和/或合并多个所述第二离散分区中生成的所述第二核酸分子。In certain embodiments, in step (2), the second nucleic acid molecules generated in a plurality of the second discrete partitions are recovered and/or combined.

The nucleic acid molecule to be labeled is mRNA:

在某些实施方案中,所述方法包括:In certain embodiments, the method comprises:

(a)根据上述第一方面“待标记的核酸分子为mRNA:”部分所描述的方法生成多个经标记的所述第二核酸分子,(a) generating a plurality of labeled second nucleic acid molecules according to the method described in the section "The nucleic acid molecule to be labeled is mRNA :" of the first aspect above,

(b)回收和/或合并多个所述第二核酸分子;在某些实施方案中,回收和/或合并多个所述第二离散分区中生成的所述第二核酸分子;和(b) recovering and/or combining a plurality of said second nucleic acid molecules; in certain embodiments, recovering and/or combining a plurality of said second nucleic acid molecules generated in said second discrete partitions; and

(c)将所述第二核酸分子随机打断并添加接头序列;(c) randomly breaking the second nucleic acid molecule and adding a linker sequence;

从而获得核酸分子文库序列。Thus, the sequence of the nucleic acid molecule library is obtained.

在某些实施方案中,所述细胞为T细胞或B细胞。In certain embodiments, the cell is a T cell or a B cell.

在某些实施方案中,所述方法在步骤(a)之后步骤(c)之前,所述方法还包括对靶核酸分子进行富集的步骤;所述靶核酸分子为包含:(i)编码T细胞受体(TCR)或B细胞受体(BCR)的核苷酸序列或其部分序列(例如,V(D)J序列),和/或,(ii)(i)的互补序列的所述第二核酸分子。In certain embodiments, the method further comprises, after step (a) and before step (c), a step of enriching the target nucleic acid molecule; the target nucleic acid molecule is a second nucleic acid molecule comprising: (i) a nucleotide sequence encoding a T cell receptor (TCR) or a B cell receptor (BCR) or a partial sequence thereof (e.g., a V(D)J sequence), and/or (ii) a complementary sequence of (i).

在某些实施方案中,所述步骤(c)中,通过转座酶将所述第二核酸分子随机打断并在其5’端添加接头序列。In certain embodiments, in step (c), the second nucleic acid molecule is randomly fragmented by a transposase and an adapter sequence is added to its 5' end.

在某些实施方案中,所述接头序列包含共有序列R2或其部分序列。In certain embodiments, the linker sequence comprises the consensus sequence R2 or a partial sequence thereof.

在某些实施方案中,所述方法还包括步骤(d):In certain embodiments, the method further comprises step (d):

纯化和/或扩增步骤(c)的产物中含有所述第一标签或其互补序列和所述第二标签或其互补序列的核酸分子的步骤。A step of purifying and/or amplifying the nucleic acid molecule containing the first tag or its complementary sequence and the second tag or its complementary sequence in the product of step (c).

在某些实施方案中,所述步骤(d)包括:使用引物E和引物F对步骤(c)的产物进行扩增,其中,所述引物E包含共有序列P1以及任选的第三标签序列,所述引物F从5’至3’包含:共有序列P2或其互补序列、任选的第四标签序列、共有序列R2或其部分序列。In certain embodiments, step (d) comprises: amplifying the product of step (c) using primer E and primer F, wherein primer E comprises a consensus sequence P1 and an optional third tag sequence, and primer F comprises from 5' to 3': a consensus sequence P2 or its complementary sequence, an optional fourth tag sequence, a consensus sequence R2 or a partial sequence thereof.

待标记的核酸分子为基因组DNA:The nucleic acid molecule to be labeled is genomic DNA:

在某些实施方案中,所述方法包括:In certain embodiments, the method comprises:

(a)根据上述第一方面“待标记的核酸分子为基因组DNA:”部分所描述的方法生成多个经标记的所述第二核酸分子,和,(a) generating a plurality of labeled second nucleic acid molecules according to the method described in the section "The nucleic acid molecule to be labeled is genomic DNA " of the first aspect above, and,

(b)回收和/或合并多个所述第二核酸分子;在某些实施方案中,回收和/或合并多个所述第二离散分区中生成的所述第二核酸分子;(b) recovering and/or combining a plurality of said second nucleic acid molecules; in certain embodiments, recovering and/or combining a plurality of said second nucleic acid molecules generated in said second discrete partitions;

从而获得核酸分子文库序列。Thus, the sequence of the nucleic acid molecule library is obtained.

在某些实施方案中,所述方法还包括步骤(c):In certain embodiments, the method further comprises step (c):

纯化和/或扩增步骤(b)的产物中含有所述第一标签或其互补序列和所述第二标签或其互补序列的核酸分子的步骤。A step of purifying and/or amplifying the nucleic acid molecule containing the first tag or its complementary sequence and the second tag or its complementary sequence in the product of step (b).

在某些实施方案中,所述步骤(c)包括:使用引物E’和引物F’对步骤(b)的产物进行扩增,其中,所述引物E’包含共有序列P1,以及任选的第三标签序列,所述引物F’包含共有序列P2以及任选的第四标签序列。In certain embodiments, step (c) comprises: amplifying the product of step (b) using primer E’ and primer F’, wherein primer E’ comprises a consensus sequence P1 and an optional third tag sequence, and primer F’ comprises a consensus sequence P2 and an optional fourth tag sequence.

待标记的核酸分子为来自相同细胞的mRNA和基因组DNA:The nucleic acid molecules to be labeled are mRNA and genomic DNA from the same cell:

在某些实施方案中,所述方法包括:In certain embodiments, the method comprises:

(a)根据上述第一方面“待标记的核酸分子为来自相同细胞的mRNA和基因组DNA:”部分所描述的方法生成多个经标记的所述第二核酸分子,其包括多个所述第二核酸分子a和多个所述第二核酸分子b,和,(a) generating a plurality of labeled second nucleic acid molecules according to the method described in the section "The nucleic acid molecules to be labeled are mRNA and genomic DNA from the same cell:" of the first aspect above, comprising a plurality of second nucleic acid molecules a and a plurality of second nucleic acid molecules b, and,

(b)回收和/或合并多个所述第二核酸分子;在某些实施方案中,回收和/或合并多个所述第二离散分区中生成的所述第二核酸分子;(b) recovering and/or combining a plurality of said second nucleic acid molecules; in certain embodiments, recovering and/or combining a plurality of said second nucleic acid molecules generated in said second discrete partitions;

从而获得核酸分子文库序列。Thus, the sequence of the nucleic acid molecule library is obtained.

在某些实施方案中,所述方法在步骤(b)之后,还包括步骤(c):将所述第二核酸分子a随机打断并添加接头序列。In certain embodiments, the method further comprises, after step (b), step (c): randomly breaking the second nucleic acid molecule a and adding a linker sequence.

在某些实施方案中,所述步骤(c)中,通过转座酶将所述第二核酸分子a随机打断并在其5’端添加接头序列。In certain embodiments, in step (c), the second nucleic acid molecule a is randomly fragmented by a transposase and a linker sequence is added to its 5' end.

在某些实施方案中,所述接头序列包含共有序列R2或其部分序列。In certain embodiments, the linker sequence comprises the consensus sequence R2 or a partial sequence thereof.

在某些实施方案中,在步骤(c)之前,所述方法还包括从步骤(b)的产物中特异性富集所述第二核酸分子a。In certain embodiments, before step (c), the method further comprises specifically enriching the second nucleic acid molecule a from the product of step (b).

在某些实施方案中,所述方法通过携带生物素标记的引物G特异性扩增富集所述第二核酸分子a。In certain embodiments, the method specifically amplifies and enriches the second nucleic acid molecule a by using a primer G carrying a biotin label.

在某些实施方案中,所述引物G含有共有序列O或其部分序列,或者,所述引物G包含共有序列T或其部分序列。In certain embodiments, the primer G contains the consensus sequence O or a partial sequence thereof, or the primer G contains the consensus sequence T or a partial sequence thereof.

在某些实施方案中,所述扩增富集还包括使用引物H,所述引物H包含共有序列P1或其部分序列。In certain embodiments, the amplification and enrichment further comprises using a primer H, wherein the primer H comprises a consensus sequence P1 or a partial sequence thereof.

在某些实施方案中,所述方法还包括步骤(d):In certain embodiments, the method further comprises step (d):

纯化和/或扩增步骤(c)的产物中含有所述第一标签或其互补序列和所述第二标签或其互补序列的核酸分子的步骤。 A step of purifying and/or amplifying the nucleic acid molecule containing the first tag or its complementary sequence and the second tag or its complementary sequence in the product of step (c).

在某些实施方案中,所述步骤(d)包括:使用引物E和引物F对步骤(c)的产物进行扩增,其中,所述引物E包含共有序列P1以及任选的第三标签序列,所述引物F从5’至3’包含:共有序列P2或其互补序列、任选的第四标签序列、共有序列R2。In certain embodiments, step (d) comprises: amplifying the product of step (c) using primer E and primer F, wherein primer E comprises a consensus sequence P1 and an optional third tag sequence, and primer F comprises, from 5' to 3': a consensus sequence P2 or its complementary sequence, an optional fourth tag sequence, and a consensus sequence R2.

在某些实施方案中,所述方法还包括步骤(d)’:In certain embodiments, the method further comprises step (d)':

纯化和/或扩增步骤(b)的产物中含有所述第一标签或其互补序列和所述第二标签或其互补序列的所述第二核酸分子b的步骤。A step of purifying and/or amplifying the second nucleic acid molecule b containing the first tag or its complementary sequence and the second tag or its complementary sequence in the product of step (b).

在某些实施方案中,所述步骤(d)’包括:使用引物E’和引物F’对步骤(b)的产物进行扩增,其中,所述引物E’包含共有序列P1,以及任选的第三标签序列,所述引物F’包含共有序列P2以及任选的第四标签序列。In certain embodiments, step (d)' comprises: amplifying the product of step (b) using primer E' and primer F', wherein primer E' comprises a consensus sequence P1 and an optional third tag sequence, and primer F' comprises a consensus sequence P2 and an optional fourth tag sequence.

第三方面,本申请还提供一种对细胞或细胞核进行组学测序的方法,其包括:In a third aspect, the present application also provides a method for performing omics sequencing on a cell or a cell nucleus, comprising:

根据上述第二方面任一项所述的方法构建核酸分子文库;和,Constructing a nucleic acid molecule library according to any one of the methods described in the second aspect above; and,

对所述核酸分子文库进行测序。The nucleic acid molecule library is sequenced.

在某些实施方案中,在测序之前,将至少2个,至少3个,至少4个,至少5个,至少8个,至少10个,至少12个,至少15个,至少18个,至少20个,至少25个,2-5个,2-10个,2-20个,2-30个,2-40个或2-50个核酸分子文库合并,然后进行测序;其中,每个核酸分子文库各自具有多个核酸分子(即,扩增产物),且同一个文库中的所述多个核酸分子具有相同的所述第三标签序列或者相同的所述第四标签序列;且,来源于不同文库的核酸分子具有彼此不同的所述第三标签序列或者彼此不同的所述第四标签序列。In certain embodiments, before sequencing, at least 2, at least 3, at least 4, at least 5, at least 8, at least 10, at least 12, at least 15, at least 18, at least 20, at least 25, 2-5, 2-10, 2-20, 2-30, 2-40 or 2-50 nucleic acid molecule libraries are combined and then sequenced; wherein each nucleic acid molecule library each has multiple nucleic acid molecules (i.e., amplification products), and the multiple nucleic acid molecules in the same library have the same third tag sequence or the same fourth tag sequence; and nucleic acid molecules derived from different libraries have different third tag sequences or different fourth tag sequences from each other.
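The pooling rule above (one shared third or fourth tag sequence per library, distinct tags between libraries) is what allows reads to be sorted back into libraries after sequencing. The following is a minimal sketch of that sorting step, not the applicant's software; the function name `demultiplex`, the exact-match rule, and the read representation are assumptions made for illustration.

```python
from collections import defaultdict


def demultiplex(reads, tag_by_library):
    """Group reads by library using the third (or fourth) tag sequence.

    reads: iterable of (tag_sequence, insert_sequence) tuples (assumed format).
    tag_by_library: dict mapping a library name to its tag sequence.
    """
    tags = list(tag_by_library.values())
    # Pooled libraries must carry tags that differ from each other.
    if len(set(tags)) != len(tags):
        raise ValueError("each pooled library needs a distinct tag sequence")

    library_by_tag = {tag: name for name, tag in tag_by_library.items()}
    assigned = defaultdict(list)
    for tag_seq, insert_seq in reads:
        # Exact matching only; reads with an unknown tag are set aside.
        name = library_by_tag.get(tag_seq, "unassigned")
        assigned[name].append(insert_seq)
    return dict(assigned)
```

Exact matching is the simplest choice; a practical pipeline would likely tolerate sequencing errors in the tag, which is outside what the text specifies.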

第四方面,本申请还提供一种核酸分子文库,其由上述第二方面任一项所述的方法所构建。In a fourth aspect, the present application also provides a nucleic acid molecule library, which is constructed by the method described in any one of the second aspects above.

产品 Product

第五方面,本申请还提供一种试剂组合物,其具备选自I、II和III的特征:In a fifth aspect, the present application also provides a reagent composition having characteristics selected from I, II and III:

(I)所述试剂组合物包含第二寡核苷酸分子a,所述第二寡核苷酸分子a的序列从5’端至3’端顺次包含:共有序列P1或其部分序列,第二标签序列,共有序列R1或其部分序列;(I) the reagent composition comprises a second oligonucleotide molecule a, the sequence of which comprises, from the 5' end to the 3' end, a consensus sequence P1 or a partial sequence thereof, a second tag sequence, and a consensus sequence R1 or a partial sequence thereof;

并且,所述试剂组合物进一步包含选自以下的一项或多项:Furthermore, the reagent composition further comprises one or more selected from the following:

(I-a)多个偶联了多个第一寡核苷酸分子a的珠粒,其中,所述第一寡核苷酸分子a含有第一标签序列;(I-a) a plurality of beads coupled with a plurality of first oligonucleotide molecules a, wherein the first oligonucleotide molecules a contain a first tag sequence;

并且,同一个珠粒上的所述多个第一寡核苷酸分子a具有相同的第一标签序列,并且,不同珠粒上的所述第一寡核苷酸分子a具有彼此不同的第一标签序列;Furthermore, the plurality of first oligonucleotide molecules a on the same bead have the same first tag sequence, and the first oligonucleotide molecules a on different beads have first tag sequences different from each other;

在某些实施方案中,所述第一寡核苷酸分子a从5’端至3’端包含:(i)共有序列R1或其部分序列、所述第一标签序列、poly(T)序列;或者,(ii)共有序列R1或其部分序列、所述第一标签序列和cDNA 3’末端悬突的互补序列;In certain embodiments, the first oligonucleotide molecule a comprises, from the 5' end to the 3' end: (i) a consensus sequence R1 or a partial sequence thereof, the first tag sequence, and a poly(T) sequence; or (ii) a consensus sequence R1 or a partial sequence thereof, the first tag sequence, and a complementary sequence to the 3' end overhang of the cDNA;

在某些实施方案中,所述第一寡核苷酸分子a进一步包含独特分子标签序列,并且,同一个珠粒上偶联的多个所述第一寡核苷酸分子a具有彼此不同的独特分子标签序列;在某些实施方案中,所述独特分子标签序列位于所述共有序列R1或其部分序列的3’端; In some embodiments, the first oligonucleotide molecule a further comprises a unique molecular tag sequence, and the plurality of first oligonucleotide molecules a coupled to the same bead have unique molecular tag sequences that are different from each other; in some embodiments, the unique molecular tag sequence is located at the 3' end of the common sequence R1 or a partial sequence thereof;

(I-b)引物A或引物B,其中,所述引物A从5’端至3’端包含共有序列O和cDNA 3’末端悬突的互补序列,所述引物B从5’端至3’端包含共有序列T或其部分序列和poly(T)序列;在某些实施方案中,所述共有序列O与所述共有序列T相同或部分相同;(I-b) primer A or primer B, wherein the primer A comprises a consensus sequence O and a complementary sequence of the cDNA 3’ end overhang from the 5’ end to the 3’ end, and the primer B comprises a consensus sequence T or a partial sequence thereof and a poly(T) sequence from the 5’ end to the 3’ end; in certain embodiments, the consensus sequence O is identical or partially identical to the consensus sequence T;

(I-c)引物C,所述引物C包含所述共有序列O或其部分序列,或者,所述引物C包含所述共有序列T或其部分序列;在某些实施方案中,所述共有序列O与所述共有序列T相同或部分相同;(I-c) primer C, the primer C comprises the consensus sequence O or a partial sequence thereof, or the primer C comprises the consensus sequence T or a partial sequence thereof; in certain embodiments, the consensus sequence O is identical or partially identical to the consensus sequence T;

(I-d)引物E和/或引物F,其中,所述引物E包含共有序列P1以及任选的第三标签序列,所述引物F从5’至3’包含:共有序列P2或其互补序列、任选的第四标签序列、共有序列R2或其部分序列;(I-d) Primer E and/or primer F, wherein the primer E comprises a consensus sequence P1 and an optional third tag sequence, and the primer F comprises from 5' to 3': a consensus sequence P2 or a complementary sequence thereof, an optional fourth tag sequence, a consensus sequence R2 or a partial sequence thereof;

(I-e)转座酶复合体II,所述转座酶复合体II含有转座酶和所述转座酶能够识别并结合的转座序列,且能够切割或断裂双链核酸;并且,所述转座序列包含转移链和非转移链;其中,所述转移链包含接头序列;在某些实施方案中,所述接头序列包含共有序列R2或其部分序列;(I-e) a transposase complex II, wherein the transposase complex II contains a transposase and a transposition sequence that the transposase can recognize and bind to, and can cut or break a double-stranded nucleic acid; and the transposition sequence comprises a transferred strand and a non-transferred strand; wherein the transferred strand comprises a linker sequence; in certain embodiments, the linker sequence comprises a consensus sequence R2 or a partial sequence thereof;

在某些实施方案中,所述cDNA 3’末端悬突具有至少1个,至少2个,至少3个,至少4个,至少5个,至少6个,至少7个,至少8个,至少9个,至少10个,1-10个,1-5个或2-10个核苷酸的长度。在某些实施方案中,所述cDNA 3’末端悬突为2-5个胞嘧啶核苷酸的悬突(例如CCC悬突);In some embodiments, the cDNA 3' terminal overhang has a length of at least 1, at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, 1-10, 1-5 or 2-10 nucleotides. In some embodiments, the cDNA 3' terminal overhang is an overhang of 2-5 cytosine nucleotides (e.g., CCC overhang);

(II)所述试剂组合物包含第二寡核苷酸分子b,所述第二寡核苷酸分子b的序列从5’端至3’端顺次包含:共有序列P2或其部分序列,第二标签序列,共有序列R2或其部分序列;(II) the reagent composition comprises a second oligonucleotide molecule b, the sequence of which comprises, from the 5' end to the 3' end, a consensus sequence P2 or a partial sequence thereof, a second tag sequence, and a consensus sequence R2 or a partial sequence thereof;

并且,所述试剂组合物进一步包含选自以下的一项或多项:Furthermore, the reagent composition further comprises one or more selected from the following:

(II-a)多个偶联了多个第一寡核苷酸分子b的珠粒,其中,所述第一寡核苷酸分子b含有第一标签序列;(II-a) a plurality of beads coupled with a plurality of first oligonucleotide molecules b, wherein the first oligonucleotide molecules b contain a first tag sequence;

并且,同一个珠粒上的所述多个第一寡核苷酸分子b具有相同的第一标签序列,并且,不同珠粒上的所述第一寡核苷酸分子b具有彼此不同的第一标签序列;Furthermore, the plurality of first oligonucleotide molecules b on the same bead have the same first tag sequence, and the first oligonucleotide molecules b on different beads have first tag sequences different from each other;

在某些实施方案中,所述第一寡核苷酸分子b从5’端至3’端包含:共有序列P1或其部分序列和所述第一标签序列;In certain embodiments, the first oligonucleotide molecule b comprises from the 5' end to the 3' end: a consensus sequence P1 or a partial sequence thereof and the first tag sequence;

(II-b)转座酶复合体I,所述转座酶复合体I如第一方面所述的标记方法中“待标记的核酸分子为基因组DNA”这部分中所定义;(II-b) transposase complex I, wherein the transposase complex I is as defined in the section "the nucleic acid molecule to be labeled is genomic DNA" in the labeling method described in the first aspect;

(II-c)引物D,其中,所述引物D包含共有序列P1或其部分序列;(II-c) primer D, wherein the primer D comprises the consensus sequence P1 or a partial sequence thereof;

(II-d)引物E’和/或引物F’,其中,所述引物E’包含共有序列P1,以及任选的第三标签序列,所述引物F’包含共有序列P2以及任选的第四标签序列;(II-d) primer E' and/or primer F', wherein the primer E' comprises a consensus sequence P1 and an optional third tag sequence, and the primer F' comprises a consensus sequence P2 and an optional fourth tag sequence;

(III)所述试剂组合物包含第二寡核苷酸分子a和第二寡核苷酸分子b;其中,所述第二寡核苷酸分子a的序列从5’端至3’端顺次包含:共有序列P1或其部分序列,第二标签序列,共有序列R1或其部分序列;所述第二寡核苷酸分子b的序列从5’端至3’端顺次包含:共有序列P2或其部分序列,第二标签序列,共有序列R2或其部分序列; (III) The reagent composition comprises a second oligonucleotide molecule a and a second oligonucleotide molecule b; wherein the sequence of the second oligonucleotide molecule a comprises, from the 5' end to the 3' end: a consensus sequence P1 or a partial sequence thereof, a second tag sequence, and a consensus sequence R1 or a partial sequence thereof; the sequence of the second oligonucleotide molecule b comprises, from the 5' end to the 3' end: a consensus sequence P2 or a partial sequence thereof, a second tag sequence, and a consensus sequence R2 or a partial sequence thereof;

并且,所述试剂组合物进一步包含选自以下的一项或多项:Furthermore, the reagent composition further comprises one or more selected from the following:

(III-a)多个同时偶联了多个所述第一寡核苷酸分子a和多个所述第一寡核苷酸分子b的珠粒,且,同一个珠粒上的所述多个第一寡核苷酸分子a和多个所述第一寡核苷酸分子b具有相同的第一标签序列,不同珠粒上的所述第一寡核苷酸分子a具有彼此不同的第一标签序列,不同珠粒上的所述第一寡核苷酸分子b具有彼此不同的第一标签序列;(III-a) a plurality of beads to which a plurality of the first oligonucleotide molecules a and a plurality of the first oligonucleotide molecules b are simultaneously coupled, wherein the plurality of the first oligonucleotide molecules a and the plurality of the first oligonucleotide molecules b on the same bead have the same first tag sequence, the first oligonucleotide molecules a on different beads have first tag sequences different from each other, and the first oligonucleotide molecules b on different beads have first tag sequences different from each other;

在某些实施方案中,所述第一寡核苷酸分子a从5’端至3’端包含:(i)共有序列R1或其部分序列、所述第一标签序列、poly(T)序列;或者,(ii)共有序列R1或其部分序列、所述第一标签序列和cDNA 3’末端悬突的互补序列;和/或,所述第一寡核苷酸分子b从5’端至3’端包含:共有序列P1或其部分序列和所述第一标签序列;In certain embodiments, the first oligonucleotide molecule a comprises, from the 5' end to the 3' end: (i) a consensus sequence R1 or a partial sequence thereof, the first tag sequence, and a poly(T) sequence; or, (ii) a consensus sequence R1 or a partial sequence thereof, the first tag sequence, and a complementary sequence to the 3' end overhang of the cDNA; and/or, the first oligonucleotide molecule b comprises, from the 5' end to the 3' end: a consensus sequence P1 or a partial sequence thereof and the first tag sequence;

在某些实施方案中,所述第一寡核苷酸分子a进一步包含独特分子标签序列,并且,同一个珠粒上偶联的多个所述第一寡核苷酸分子a具有彼此不同的独特分子标签序列;在某些实施方案中,所述独特分子标签序列位于所述共有序列R1或其部分序列的3’端;In some embodiments, the first oligonucleotide molecule a further comprises a unique molecular tag sequence, and the plurality of first oligonucleotide molecules a coupled to the same bead have unique molecular tag sequences that are different from each other; in some embodiments, the unique molecular tag sequence is located at the 3' end of the common sequence R1 or a partial sequence thereof;

(III-b)引物A或引物B,其中,所述引物A从5’端至3’端包含共有序列O和cDNA 3’末端悬突的互补序列,所述引物B从5’端至3’端包含共有序列T或其部分序列和poly(T)序列;在某些实施方案中,所述共有序列O与所述共有序列T相同或部分相同;(III-b) primer A or primer B, wherein the primer A comprises a consensus sequence O and a complementary sequence of the cDNA 3’ end overhang from the 5’ end to the 3’ end, and the primer B comprises a consensus sequence T or a partial sequence thereof and a poly(T) sequence from the 5’ end to the 3’ end; in certain embodiments, the consensus sequence O is identical or partially identical to the consensus sequence T;

(III-c)转座酶复合体I,所述转座酶复合体I如第一方面所述的标记方法中“待标记的核酸分子为基因组DNA”这部分中所定义;(III-c) transposase complex I, wherein the transposase complex I is as defined in the section "the nucleic acid molecule to be labeled is genomic DNA" in the labeling method described in the first aspect;

(III-d)引物C,所述引物C包含所述共有序列O或其部分序列,或者,所述引物C包含所述共有序列T或其部分序列;在某些实施方案中,所述共有序列O与所述共有序列T相同或部分相同;(III-d) primer C, the primer C comprises the consensus sequence O or a partial sequence thereof, or the primer C comprises the consensus sequence T or a partial sequence thereof; in certain embodiments, the consensus sequence O is identical or partially identical to the consensus sequence T;

(III-e)引物D,其中,所述引物D包含共有序列P1或其部分序列;(III-e) primer D, wherein the primer D comprises the consensus sequence P1 or a partial sequence thereof;

(III-f)引物E和/或引物F,其中,所述引物E包含共有序列P1以及任选的第三标签序列,所述引物F从5’至3’包含:共有序列P2或其互补序列、任选的第四标签序列、共有序列R2或其部分序列;(III-f) Primer E and/or primer F, wherein the primer E comprises a consensus sequence P1 and an optional third tag sequence, and the primer F comprises from 5' to 3': a consensus sequence P2 or a complementary sequence thereof, an optional fourth tag sequence, a consensus sequence R2 or a partial sequence thereof;

(III-g)引物E’和/或引物F’,其中,所述引物E’包含共有序列P1,以及任选的第三标签序列,所述引物F’包含共有序列P2以及任选的第四标签序列;(III-g) Primer E' and/or primer F', wherein the primer E' comprises a consensus sequence P1 and an optional third tag sequence, and the primer F' comprises a consensus sequence P2 and an optional fourth tag sequence;

(III-h)包含引物G和/或引物H,所述引物G携带生物素标记并且含有共有序列O或其部分序列或者共有序列T或其部分序列,所述引物H包含共有序列P1或其部分序列;在某些实施方案中,所述共有序列O与所述共有序列T相同或部分相同;(III-h) comprises primer G and/or primer H, wherein primer G carries a biotin label and contains a consensus sequence O or a partial sequence thereof or a consensus sequence T or a partial sequence thereof, and primer H comprises a consensus sequence P1 or a partial sequence thereof; in certain embodiments, the consensus sequence O is identical or partially identical to the consensus sequence T;

(III-i)转座酶复合体II,所述转座酶复合体II含有转座酶和所述转座酶能够识别并结合的转座序列,且能够切割或断裂双链核酸;并且,所述转座序列包含转移链和非转移链;其中,所述转移链包含接头序列;在某些实施方案中,所述接头序列包含共有序列R2或其部分序列;(III-i) a transposase complex II, wherein the transposase complex II contains a transposase and a transposition sequence that the transposase can recognize and bind to, and can cut or break a double-stranded nucleic acid; and the transposition sequence comprises a transferred strand and a non-transferred strand; wherein the transferred strand comprises a linker sequence; in certain embodiments, the linker sequence comprises a consensus sequence R2 or a partial sequence thereof;

在某些实施方案中,所述cDNA 3’末端悬突具有至少1个,至少2个,至少3个,至少4个,至少5个,至少6个,至少7个,至少8个,至少9个,至少10个,1-10个,1-5个或2-10个核苷酸的长度。在某些实施方案中,所述cDNA 3’末端悬突为2-5个胞嘧啶核苷酸的悬突(例如CCC悬突)。In certain embodiments, the cDNA 3' terminal overhang has a length of at least 1, at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, 1-10, 1-5, or 2-10 nucleotides. In certain embodiments, the cDNA 3' terminal overhang is an overhang of 2-5 cytosine nucleotides (e.g., a CCC overhang).

在某些实施方案中,所述试剂组合物进一步包含用于固定和/或透化细胞或细胞核的试剂。In certain embodiments, the reagent composition further comprises a reagent for fixing and/or permeabilizing cells or cell nuclei.

在某些实施方案中,所述试剂组合物进一步包含甲醇、甲醛和/或多聚甲醛。In certain embodiments, the reagent composition further comprises methanol, formaldehyde and/or paraformaldehyde.

在某些实施方案中,所述试剂组合物进一步包含Triton X-100、digitonin、IGEPAL(例如,CA-630)、和/或Tween-20。In certain embodiments, the reagent composition further comprises Triton X-100, digitonin, IGEPAL (e.g., CA-630), and/or Tween-20.

在某些实施方案中,所述试剂组合物进一步包含:Rnase抑制剂,矿物油,缓冲液,dNTP,一种或多种核酸聚合酶(例如DNA聚合酶;例如具有链置换活性和/或高保真性的DNA聚合酶),用于回收或纯化核酸的试剂(例如磁珠),孔板,或其任何组合。In certain embodiments, the reagent composition further comprises: an RNase inhibitor, mineral oil, a buffer, dNTPs, one or more nucleic acid polymerases (e.g., DNA polymerases; e.g., DNA polymerases having strand displacement activity and/or high fidelity), reagents for recovering or purifying nucleic acids (e.g., magnetic beads), a well plate, or any combination thereof.

在某些实施方案中,所述试剂组合物还包含用于测序的试剂。例如用于二代测序的试剂。In certain embodiments, the reagent composition further comprises reagents for sequencing, such as reagents for next-generation sequencing.
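The 5'-to-3' layouts recited in the fifth aspect for the second oligonucleotide molecules a and b and for primer F can be sketched as simple string assembly. The sequences assigned to `P1`, `P2`, `R1` and `R2` below are hypothetical placeholders chosen only for illustration, and the helper names are likewise assumptions; only the order of the elements is taken from the text.

```python
# Hypothetical placeholder sequences; they are not taken from the application.
P1 = "GATTACAGAT"
P2 = "CCTAGGTTAA"
R1 = "TGCATGCAAC"
R2 = "AGTCCGATTG"


def _check(seq):
    """Accept only non-empty sequences made of A, C, G and T."""
    if not seq or set(seq) - set("ACGT"):
        raise ValueError(f"not a plain DNA sequence: {seq!r}")
    return seq


def second_oligonucleotide_a(second_tag):
    # 5' -> 3': consensus sequence P1, second tag sequence, consensus sequence R1
    return P1 + _check(second_tag) + R1


def second_oligonucleotide_b(second_tag):
    # 5' -> 3': consensus sequence P2, second tag sequence, consensus sequence R2
    return P2 + _check(second_tag) + R2


def primer_f(fourth_tag=""):
    # 5' -> 3': consensus sequence P2, optional fourth tag sequence, consensus sequence R2
    tag = _check(fourth_tag) if fourth_tag else ""
    return P2 + tag + R2
```

For example, `second_oligonucleotide_a("ACGTACGT")` yields the placeholder P1, the tag, and the placeholder R1 in that order. The "or a partial sequence thereof" and "or its complementary sequence" variants are not modeled in this sketch.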

在第六方面,本申请还提供了一种试剂盒,其包含:含有多个寡核苷酸分子的多反应体系,所述每个寡核苷酸分子含有特定的标签序列;In a sixth aspect, the present application also provides a kit, which comprises: a multi-reaction system containing a plurality of oligonucleotide molecules, each of which contains a specific tag sequence;

并且,所述多反应体系中,每个反应体系中的寡核苷酸分子具有相同的标签序列,不同反应体系的寡核苷酸分子具有彼此不同的标签序列。Furthermore, in the multi-reaction system, the oligonucleotide molecules in each reaction system have the same tag sequence, and the oligonucleotide molecules in different reaction systems have different tag sequences.

在某些实施方案中,所述寡核苷酸分子还包含共有序列P1或其部分序列,或者,所述寡核苷酸分子还包含共有序列P2或其部分序列。In certain embodiments, the oligonucleotide molecule further comprises a consensus sequence P1 or a partial sequence thereof, or the oligonucleotide molecule further comprises a consensus sequence P2 or a partial sequence thereof.

在某些实施方案中,所述多反应体系包含至少2个(例如,至少3个,至少4个,至少5个,至少8个,至少10个,至少12个,至少20个,至少24个,至少50个,至少96个,至少100个,至少200个,至少384个,至少400个,2-5个,2-10个,2-50个,2-80个,2-100个,2-500个,2-10^3个,2-10^4个,2-10^5个,2-10^6个)含有寡核苷酸的多反应体系;In certain embodiments, the multi-reaction system comprises at least 2 (e.g., at least 3, at least 4, at least 5, at least 8, at least 10, at least 12, at least 20, at least 24, at least 50, at least 96, at least 100, at least 200, at least 384, at least 400, 2-5, 2-10, 2-50, 2-80, 2-100, 2-500, 2-10^3, 2-10^4, 2-10^5, 2-10^6) reaction systems containing oligonucleotides;

其中多反应体系优选为多孔板,寡核苷酸可以游离或固定在反应体系中。The multi-reaction system is preferably a multi-well plate, and the oligonucleotides can be free or fixed in the reaction system.
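The kit of the sixth aspect requires that every reaction system (for example, every well of a multi-well plate) holds oligonucleotides with one tag sequence and that tags differ between reaction systems. A minimal sketch of such a layout follows; the 8 x 12 plate, the well naming, and the very short tag length are illustrative assumptions, and the tags are simply enumerated rather than designed.

```python
from itertools import product


def plate_layout(rows="ABCDEFGH", n_cols=12, tag_length=4):
    """Assign one distinct tag sequence to each well of a plate."""
    wells = [f"{row}{col}" for row in rows for col in range(1, n_cols + 1)]
    # Enumerate every possible tag of the requested length in a fixed order.
    tags = ("".join(bases) for bases in product("ACGT", repeat=tag_length))
    # zip stops at the shorter input, so a short tag space leaves wells empty.
    layout = dict(zip(wells, tags))
    if len(layout) < len(wells):
        raise ValueError("tag_length is too short to give every well its own tag")
    return layout
```

Because the tags come from an enumeration without repeats, no two wells can share a tag. Practical tag sets are usually also chosen to differ at several positions so that sequencing errors do not convert one tag into another; that refinement is not part of this sketch.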

第七方面,本申请提供一种对细胞进行固定和透化的方法,其包括以下步骤:In a seventh aspect, the present application provides a method for fixing and permeabilizing cells, comprising the following steps:

(i)在-40℃至-10℃(例如,-25℃至-15℃或-20℃)的条件下,使用浓度为60%-100%(例如,60%-80%、70%-80%、70%-85%、70%-90%、75%-80%、75%-85%、75%-90%、75%-100%、80%-85%、80%-90%、80%-100%或80%)的甲醇处理细胞5-30min(例如,8-20min或10min)对细胞进行固定和透化;(i) treating the cells with 60%-100% (e.g., 60%-80%, 70%-80%, 70%-85%, 70%-90%, 75%-80%, 75%-85%, 75%-90%, 75%-100%, 80%-85%, 80%-90%, 80%-100%, or 80%) methanol at -40°C to -10°C (e.g., -25°C to -15°C or -20°C) for 5-30 min (e.g., 8-20 min or 10 min) to fix and permeabilize the cells;

或者,or,

(ii)(a)在0℃至37℃(例如,15℃至30℃或25℃)的条件下,使用浓度为0.05%-5%(例如,0.5%-1%、0.5%-2%、0.5%-3%、0.5%-4%、0.5%-5%或1%)的甲醛或多聚甲醛处理细胞5-30min(例如,5-20min或10min)对细胞进行固定;和,(ii)(a) fixing the cells by treating the cells with formaldehyde or paraformaldehyde at a concentration of 0.05%-5% (e.g., 0.5%-1%, 0.5%-2%, 0.5%-3%, 0.5%-4%, 0.5%-5%, or 1%) at 0°C to 37°C (e.g., 15°C to 30°C or 25°C) for 5-30 minutes (e.g., 5-20 minutes or 10 minutes); and,

(b)使用浓度为0.05%-2%(例如,0.05%-0.2%、0.05%-0.25%、0.05%-0.3%、0.05%-0.5%、0.05%-0.8%、0.05%-1%、0.1%-0.2%、0.1%-0.25%、0.1%-0.3%、0.1%-0.4%、0.1%-0.5%、0.1%-0.8%、0.1%-1%、0.2%-0.25%、0.2%-0.3%、0.2%-0.4%、0.2%-0.5%、0.2%-0.8%、0.2%-1%或0.2%)的Triton X-100在-4℃至10℃(例如,0℃至4℃)的条件下处理细胞0.5-10min(例如,1-5min或3min)对细胞进行透化。(b) permeabilizing the cells by treating the cells with Triton X-100 at a concentration of 0.05%-2% (e.g., 0.05%-0.2%, 0.05%-0.25%, 0.05%-0.3%, 0.05%-0.5%, 0.05%-0.8%, 0.05%-1%, 0.1%-0.2%, 0.1%-0.25%, 0.1%-0.3%, 0.1%-0.4%, 0.1%-0.5%, 0.1%-0.8%, 0.1%-1%, 0.2%-0.25%, 0.2%-0.3%, 0.2%-0.4%, 0.2%-0.5%, 0.2%-0.8%, 0.2%-1%, or 0.2%) at -4°C to 10°C (e.g., 0°C to 4°C) for 0.5-10 min (e.g., 1-5 min or 3 min).

在某些实施方案中,通过在-20℃的条件下,使用浓度为80%的甲醇处理细胞10min来对细胞进行固定和透化。In certain embodiments, cells are fixed and permeabilized by treating the cells with 80% methanol for 10 min at -20°C.
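The broadest numeric windows recited in option (i) of the seventh aspect (-40°C to -10°C, 60%-100% methanol, 5-30 min) can be written as a small range check, which may help when recording protocol parameters. This is only a sketch; the class and method names are hypothetical, and it covers the methanol route only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MethanolFixation:
    temperature_c: float     # treatment temperature in degrees Celsius
    methanol_percent: float  # methanol concentration in percent
    minutes: float           # treatment time in minutes

    def within_recited_ranges(self) -> bool:
        """True if all three values lie inside the broadest ranges of option (i)."""
        return (
            -40 <= self.temperature_c <= -10
            and 60 <= self.methanol_percent <= 100
            and 5 <= self.minutes <= 30
        )


# The specific embodiment above: -20 degrees C, 80% methanol, 10 min.
preferred = MethanolFixation(temperature_c=-20, methanol_percent=80, minutes=10)
```

Calling `preferred.within_recited_ranges()` returns `True`, while a treatment at 4°C would fall outside the temperature window.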

在某些实施方案中,所述细胞为天然存在的细胞或重组细胞,或两者的混合。In certain embodiments, the cell is a naturally occurring cell or a recombinant cell, or a mixture of both.

在某些实施方案中,所述重组细胞是指包含经修饰(例如,人为修饰)的核酸分子(例如,基因)和/或其产物(例如,蛋白、RNA)的细胞,所述修饰包括但不限于,增加或减少所述细胞内源基因的拷贝数、突变所述细胞内源基因、上调或下调或沉默所述细胞内源基因产物的表达、向所述细胞导入外源核酸分子(所述外源核酸分子被整合入所述细胞的基因组或以非整合形式存在)等。In certain embodiments, the recombinant cell refers to a cell comprising a modified (e.g., artificially modified) nucleic acid molecule (e.g., gene) and/or its product (e.g., protein, RNA), wherein the modification includes, but is not limited to, increasing or decreasing the copy number of endogenous genes in the cell, mutating endogenous genes in the cell, upregulating or downregulating or silencing the expression of endogenous gene products in the cell, introducing exogenous nucleic acid molecules into the cell (the exogenous nucleic acid molecules are integrated into the genome of the cell or exist in a non-integrated form), etc.

第八方面,本申请提供了一种对细胞核进行固定和透化的方法,其包括以下步骤:In an eighth aspect, the present application provides a method for fixing and permeabilizing a cell nucleus, comprising the following steps:

(i)对细胞核进行固定,所述固定选自:(i) fixing the cell nucleus, wherein the fixation is selected from:

(a)在0℃至30℃(例如,15℃至28℃或25℃)的条件下,使用浓度为0.05%-4%(例如,0.5%-1%、0.5%-2%、0.5%-3%、0.5%-4%或1%)的甲醛处理细胞核2-20min(例如,5-15min或10min)对细胞核进行固定;或者,(a) fixing the cell nuclei by treating the cell nuclei with formaldehyde at a concentration of 0.05%-4% (e.g., 0.5%-1%, 0.5%-2%, 0.5%-3%, 0.5%-4% or 1%) at 0°C to 30°C (e.g., 15°C to 28°C or 25°C) for 2-20 min (e.g., 5-15 min or 10 min); or,

(b)在0℃至30℃(例如,15℃至28℃或25℃)的条件下,使用浓度为0.05%-4%(例如,0.5%-2%、0.5%-3%、0.5%-4%或1.6%)的多聚甲醛处理细胞核1-15min(例如,1-10min或5min)对细胞核进行固定;(b) treating the cell nuclei with paraformaldehyde at a concentration of 0.05%-4% (e.g., 0.5%-2%, 0.5%-3%, 0.5%-4% or 1.6%) at 0°C to 30°C (e.g., 15°C to 28°C or 25°C) for 1-15 min (e.g., 1-10 min or 5 min) to fix the cell nuclei;

以及,and,

(ii)使用包含digitonin的透化液在-4℃至10℃(例如,0℃至4℃)的条件下处理细胞核0.5-10min(例如,1-5min或3min)对细胞核进行透化;(ii) permeabilizing the cell nucleus by treating the cell nucleus with a permeabilization solution containing digitonin at -4°C to 10°C (e.g., 0°C to 4°C) for 0.5-10 min (e.g., 1-5 min or 3 min);

在某些实施方案中,所述透化液进一步包含IGEPAL(例如,CA-630)和/或Tween-20。In certain embodiments, the permeabilization solution further comprises IGEPAL (e.g., CA-630) and/or Tween-20.

在某些实施方案中,所述透化液中,digitonin的浓度为0.0005%-0.05%(例如,0.0008%-0.005%、0.0005%-0.002%、0.0008%-0.002%或0.001%)。In certain embodiments, the concentration of digitonin in the permeabilization solution is 0.0005%-0.05% (eg, 0.0008%-0.005%, 0.0005%-0.002%, 0.0008%-0.002%, or 0.001%).

在某些实施方案中,所述透化液中,IGEPAL(例如,CA-630)的浓度为0.005%-0.1%(例如,0.005%-0.05%、0.008%-0.05%、0.005%-0.02%、0.008%-0.02%或0.01%)。In certain embodiments, the concentration of IGEPAL (e.g., CA-630) in the permeabilization solution is 0.005%-0.1% (e.g., 0.005%-0.05%, 0.008%-0.05%, 0.005%-0.02%, 0.008%-0.02%, or 0.01%).

在某些实施方案中,所述透化液中,Tween-20的浓度为0.005%-0.1%(例如,0.005%-0.05%、0.008%-0.05%、0.005%-0.02%、0.008%-0.02%或0.01%)。In certain embodiments, the concentration of Tween-20 in the permeabilization solution is 0.005%-0.1% (eg, 0.005%-0.05%, 0.008%-0.05%, 0.005%-0.02%, 0.008%-0.02%, or 0.01%).

在某些实施方案中,通过在室温的条件下,使用浓度为1%的甲醛处理细胞核10min来对细胞核进行固定,并在固定处理后,使用含有0.001%digitonin、0.01%IGEPAL(例如,CA-630)和0.01%Tween-20的透化液在0℃至4℃的条件下处理细胞核2-4min(例如3min)对细胞核进行透化。In certain embodiments, the cell nuclei are fixed by treating them with 1% formaldehyde for 10 min at room temperature and, after fixation, are permeabilized by treating them with a permeabilization solution containing 0.001% digitonin, 0.01% IGEPAL (e.g., CA-630) and 0.01% Tween-20 at 0°C to 4°C for 2-4 min (e.g., 3 min).

在某些实施方案中,所述细胞核为源自天然存在的细胞的细胞核或源自重组细胞的细胞核,或两者的混合。In certain embodiments, the cell nucleus is a cell nucleus derived from a naturally occurring cell or a cell nucleus derived from a recombinant cell, or a mixture of both.

在某些实施方案中,所述重组细胞是指包含经修饰(例如,人为修饰)的核酸分子(例如,基因)和/或其产物(例如,蛋白、RNA)的细胞,所述修饰包括但不限于,增加或减少所述细胞内源基因的拷贝数、突变所述细胞内源基因、上调或下调或沉默所述细胞内源基因产物的表达、向所述细胞导入外源核酸分子(所述外源核酸分子被整合入所述细胞的基因组或以非整合形式存在)等。In certain embodiments, the recombinant cell refers to a cell comprising a modified (e.g., artificially modified) nucleic acid molecule (e.g., gene) and/or its product (e.g., protein, RNA), wherein the modification includes, but is not limited to, increasing or decreasing the copy number of endogenous genes in the cell, mutating endogenous genes in the cell, upregulating or downregulating or silencing the expression of endogenous gene products in the cell, introducing exogenous nucleic acid molecules into the cell (the exogenous nucleic acid molecules are integrated into the genome of the cell or exist in a non-integrated form), etc.

在第九方面,本申请提供了一种装置,其用于标记来自细胞或细胞核的核酸分子和/或构建核酸分子文库,所述装置包括:In a ninth aspect, the present application provides a device for labeling nucleic acid molecules from cells or cell nuclei and/or constructing a nucleic acid molecule library, the device comprising:

存储器;和Memory; and

耦接至所述存储器的处理器,所述处理器被配置为基于存储在所述存储器中的指令,执行上述第一方面任一项所述的方法和/或上述第二方面任一项所述的方法。A processor coupled to the memory, the processor being configured to execute the method described in any one of the first aspects and/or the method described in any one of the second aspects based on instructions stored in the memory.

在第十方面,本申请提供了一种计算机可读存储介质,其上存储有计算机程序,其特征在于,该程序被处理器执行时实现上述第一方面任一项的方法和/或上述第二方面任一项的方法。In the tenth aspect, the present application provides a computer-readable storage medium having a computer program stored thereon, characterized in that when the program is executed by a processor, it implements any method of the above-mentioned first aspect and/or any method of the above-mentioned second aspect.

在第十一方面,本申请还提供了上述第一方面任一项的方法或上述第五方面的试剂组合物或第六方面的试剂盒、第七方面、第八方面的方法、第九方面的装置或第十方面的计算机可读存储介质用于构建核酸分子文库或用于进行转录组测序的用途;或者,上述第二方面任一项的方法用于进行转录组测序的用途。In the eleventh aspect, the present application also provides the use of any one of the methods of the first aspect above, the reagent composition of the fifth aspect above, the kit of the sixth aspect, the method of the seventh aspect, the eighth aspect, the device of the ninth aspect, or the computer-readable storage medium of the tenth aspect for constructing a nucleic acid molecule library or for performing transcriptome sequencing; or, the use of any one of the methods of the second aspect above for performing transcriptome sequencing.

在某些实施方案中,本申请提供了以下实施方案:In certain embodiments, the present application provides the following embodiments:

实施方案1.一种标记来自细胞或细胞核的核酸分子的方法,其包括下述步骤:Embodiment 1. A method for labeling a nucleic acid molecule from a cell or a cell nucleus, comprising the following steps:

(1)提供多个经固定和透化的细胞或细胞核,所述细胞或细胞核含有待标记的核酸分子;和,(1) providing a plurality of fixed and permeabilized cells or cell nuclei, wherein the cells or cell nuclei contain nucleic acid molecules to be labeled; and,

多个偶联了多个第一寡核苷酸分子的珠粒,其中,所述第一寡核苷酸分子含有第一标签序列;a plurality of beads coupled with a plurality of first oligonucleotide molecules, wherein the first oligonucleotide molecules contain a first tag sequence;

并且,同一个珠粒上的所述多个第一寡核苷酸分子具有相同的第一标签序列,并且,不同珠粒上的所述第一寡核苷酸分子具有彼此不同的第一标签序列;Furthermore, the plurality of first oligonucleotide molecules on the same bead have the same first tag sequence, and the first oligonucleotide molecules on different beads have first tag sequences different from each other;

(2)将多个所述珠粒和多个所述细胞或细胞核随机分配至不同的第一离散分区,在所述第一离散分区内使所述第一寡核苷酸分子从所述珠粒上释放并与所述细胞或细胞核接触,从而在所述细胞或细胞核内生成衍生自所述待标记核酸分子的第一核酸分子,所述第一核酸分子含有所述第一标签序列或其互补序列;(2) randomly allocating a plurality of the beads and a plurality of the cells or cell nuclei to different first discrete partitions, releasing the first oligonucleotide molecule from the beads and contacting the first oligonucleotide molecule with the cells or cell nuclei in the first discrete partitions, thereby generating a first nucleic acid molecule derived from the nucleic acid molecule to be labeled in the cells or cell nuclei, wherein the first nucleic acid molecule contains the first tag sequence or its complementary sequence;

(3)将源自不同所述第一离散分区的包含所述第一核酸分子的细胞或细胞核混合并重新分配到不同的第二离散分区;(3) mixing the cells or cell nuclei containing the first nucleic acid molecule originating from different first discrete partitions and redistributing them into different second discrete partitions;

(4)在所述第二离散分区内,使含有第二标签序列的第二寡核苷酸分子与所述第一核酸分子接触,生成含有第一标签序列或其互补序列以及第二标签序列或其互补序列的第二核酸分子;(4) contacting a second oligonucleotide molecule containing a second tag sequence with the first nucleic acid molecule within the second discrete partition to generate a second nucleic acid molecule containing the first tag sequence or its complementary sequence and the second tag sequence or its complementary sequence;

其中,同一个所述第二离散分区的所述第二寡核苷酸分子具有相同的第二标签序列,并且,不同所述第二离散分区的所述第二寡核苷酸分子具有彼此不同的第二标签序列;wherein the second oligonucleotide molecules in the same second discrete partition have the same second tag sequence, and the second oligonucleotide molecules in different second discrete partitions have different second tag sequences;

其中,所述细胞为天然存在的细胞或重组细胞,或两者的混合;所述细胞核为源自天然存在的细胞的细胞核或源自重组细胞的细胞核,或两者的混合。wherein the cell is a naturally occurring cell or a recombinant cell, or a mixture of the two; and the cell nucleus is a nucleus derived from a naturally occurring cell or a nucleus derived from a recombinant cell, or a mixture of the two.
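The two rounds of labeling in steps (2) to (4) can be illustrated with a small simulation. This is only a sketch under simplifying assumptions that are not stated in the text: each cell is taken to receive its first tag and its second tag independently and uniformly at random, and the function names `two_round_labels` and `collision_fraction` are hypothetical. It shows why a cell is identified by the pair (first tag, second tag) rather than by either tag alone.

```python
import random
from collections import Counter


def two_round_labels(n_cells, n_first_tags, n_second_tags, seed=0):
    """Assign each cell a (first_tag, second_tag) pair of tag indices.

    Round 1 models the first discrete partition (bead-derived first tag);
    round 2 models pooling and redistribution into second discrete
    partitions (second tag). Uniform random assignment is an assumption.
    """
    rng = random.Random(seed)
    return [
        (rng.randrange(n_first_tags), rng.randrange(n_second_tags))
        for _ in range(n_cells)
    ]


def collision_fraction(labels):
    """Fraction of cells whose tag pair is shared with at least one other cell."""
    counts = Counter(labels)
    collided = sum(c for c in counts.values() if c > 1)
    return collided / len(labels)


if __name__ == "__main__":
    labels = two_round_labels(n_cells=10_000, n_first_tags=100_000, n_second_tags=96)
    print(f"cells sharing a tag pair: {collision_fraction(labels):.4%}")
```

Under these assumptions, increasing the number of second discrete partitions multiplies the number of distinguishable tag pairs, which lowers the fraction of cells that cannot be told apart.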

实施方案2.实施方案1的方法,其中,所述方法使用甲醇对细胞进行固定和透化,或者,使用甲醛或多聚甲醛以及Triton X-100对细胞进行固定和透化。Embodiment 2. The method of Embodiment 1, wherein the method uses methanol to fix and permeabilize the cells, or uses formaldehyde or paraformaldehyde together with Triton X-100 to fix and permeabilize the cells.

实施方案3.实施方案1的方法,其中,所述方法使用甲醛或多聚甲醛以及digitonin对细胞核进行固定和透化;Embodiment 3. The method of Embodiment 1, wherein the method uses formaldehyde or paraformaldehyde together with digitonin to fix and permeabilize the cell nuclei;

在某些实施方案中,所述方法还包括使用IGEPAL(例如,CA-630)和/或Tween-20对细胞核进行透化。In certain embodiments, the method further comprises permeabilizing the cell nuclei using IGEPAL (e.g., CA-630) and/or Tween-20.

实施方案4.实施方案1-3任一项的方法,其具备选自以下的一项或多项:Embodiment 4. The method of any one of embodiments 1-3, comprising one or more selected from the following:

(1)在步骤(1)中,提供至少2个细胞或细胞核;和/或,提供至少2个珠粒;(1) In step (1), providing at least 2 cells or cell nuclei; and/or providing at least 2 beads;

(2)所述第一离散分区为离散微孔或离散微液滴;(2) the first discrete partition is a discrete microwell or a discrete microdroplet;

(3)所述珠粒偶联了至少2个所述第一寡核苷酸分子;(3) the beads are coupled to at least two of the first oligonucleotide molecules;

(4)所述珠粒能够自发地或在暴露于一种或多种刺激时释放所述第一寡核苷酸分子;(4) the bead is capable of releasing the first oligonucleotide molecule spontaneously or upon exposure to one or more stimuli;

(5)所述珠粒是凝胶珠粒;(5) The beads are gel beads;

(6)步骤(3)中,将所述细胞或细胞核分配到至少2个所述第二离散分区,其中,每个所述第二离散分区含有至少一个细胞或细胞核;(6) In step (3), the cells or cell nuclei are distributed into at least two of the second discrete partitions, wherein each of the second discrete partitions contains at least one cell or cell nucleus;

(7)所述第二离散分区为多孔板中的离散孔;(7) the second discrete partitions are discrete wells of a multi-well plate;

(8)步骤(3)之后,步骤(4)之前,所述方法还包括裂解细胞和/或对所述第一核酸分子进行纯化的步骤。(8) After step (3) and before step (4), the method further includes the steps of lysing cells and/or purifying the first nucleic acid molecule.

实施方案5.实施方案1-4任一项的方法,其中,所述待标记的核酸分子为mRNA,并且,所述第一寡核苷酸分子为第一寡核苷酸分子a。Embodiment 5. The method of any one of Embodiments 1-4, wherein the nucleic acid molecule to be labeled is mRNA, and the first oligonucleotide molecule is the first oligonucleotide molecule a.

实施方案6.实施方案5的方法,其中,所述步骤(2)包括以下步骤:Embodiment 6. The method of embodiment 5, wherein step (2) comprises the following steps:

(i)(a)在所述第一离散分区内,用所述第一寡核苷酸分子a对所述待标记的核酸分子进行逆转录,生成cDNA链,所述cDNA链包含以所述第一寡核苷酸分子a为逆转录引物形成的与所述待标记核酸分子互补的cDNA序列,以及3’末端悬突;其中,所述第一寡核苷酸分子a从5’端至3’端包含:共有序列R1或其部分序列、所述第一标签序列、poly(T)序列;和,(b)将引物A与(a)中生成的所述cDNA链进行退火,并进行延伸反应,生成延伸产物,所述延伸产物即为所述第一核酸分子;其中,所述引物A从5’端至3’端包含共有序列O和所述3’末端悬突的互补序列;(i) (a) in the first discrete partition, reverse transcribing the nucleic acid molecule to be labeled using the first oligonucleotide molecule a to generate a cDNA strand, wherein the cDNA strand comprises a cDNA sequence complementary to the nucleic acid molecule to be labeled, formed using the first oligonucleotide molecule a as the reverse transcription primer, and a 3' terminal overhang; wherein the first oligonucleotide molecule a comprises, from the 5' end to the 3' end: a consensus sequence R1 or a partial sequence thereof, the first tag sequence, and a poly(T) sequence; and, (b) annealing primer A to the cDNA strand generated in (a) and performing an extension reaction to generate an extension product, the extension product being the first nucleic acid molecule; wherein the primer A comprises, from the 5' end to the 3' end, a consensus sequence O and a sequence complementary to the 3' terminal overhang;

或者,or,

(ii)(a)在所述第一离散分区内,用引物B对所述待标记的核酸分子进行逆转录,生成cDNA链,所述cDNA链包含以所述引物B为逆转录引物形成的与所述待标记核酸分子互补的cDNA序列,以及3’末端悬突;其中,所述引物B从5’端至3’端包含共有序列T或其部分序列和poly(T)序列;和,(b)在所述第一离散分区内,将所述第一寡核苷酸分子a与(a)中生成的所述cDNA链进行退火,并进行延伸反应,生成延伸产物,所述延伸产物即为所述第一核酸分子;其中,所述第一寡核苷酸分子a从5’端至3’端包含:共有序列R1或其部分序列、所述第一标签序列和所述3’末端悬突的互补序列。(ii) (a) in the first discrete partition, reverse transcribing the nucleic acid molecule to be labeled using primer B to generate a cDNA strand, wherein the cDNA strand comprises a cDNA sequence complementary to the nucleic acid molecule to be labeled, formed using primer B as the reverse transcription primer, and a 3' terminal overhang; wherein the primer B comprises, from the 5' end to the 3' end, a consensus sequence T or a partial sequence thereof and a poly(T) sequence; and, (b) in the first discrete partition, annealing the first oligonucleotide molecule a to the cDNA strand generated in (a) and performing an extension reaction to generate an extension product, the extension product being the first nucleic acid molecule; wherein the first oligonucleotide molecule a comprises, from the 5' end to the 3' end: a consensus sequence R1 or a partial sequence thereof, the first tag sequence, and a sequence complementary to the 3' terminal overhang.

在某些实施方案中,所述步骤(ii)中,步骤(ii)(a)在将所述细胞或细胞核分配到所述第一离散分区之前或之后进行。In certain embodiments, in said step (ii), step (ii)(a) is performed before or after assigning said cells or cell nuclei to said first discrete partitions.

在某些实施方案中,所述第一寡核苷酸分子a进一步包含独特分子标签序列,并且,同一个珠粒上偶联的多个所述第一寡核苷酸分子a具有彼此不同的独特分子标签序列;在某些实施方案中,所述独特分子标签序列位于所述共有序列R1或其部分序列的3’端。In certain embodiments, the first oligonucleotide molecule a further comprises a unique molecular tag sequence, and the plurality of first oligonucleotide molecules a coupled to the same bead have unique molecular tag sequences that differ from one another; in certain embodiments, the unique molecular tag sequence is located at the 3' end of the consensus sequence R1 or a partial sequence thereof.

在某些实施方案中,所述共有序列O与所述共有序列T相同或部分相同。In certain embodiments, the consensus sequence O is identical or partially identical to the consensus sequence T.

在某些实施方案中,所述3’末端悬突具有至少1个,至少2个,至少3个,至少4个,至少5个,至少6个,至少7个,至少8个,至少9个,至少10个,1-10个,1-5个或2-10个核苷酸的长度;在某些实施方案中,所述3’末端悬突为2-5个胞嘧啶核苷酸的悬突(例如CCC悬突)。In certain embodiments, the 3' terminal overhang has a length of at least 1, at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, 1-10, 1-5 or 2-10 nucleotides; in certain embodiments, the 3' terminal overhang is an overhang of 2-5 cytosine nucleotides (e.g., a CCC overhang).
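The 5'-to-3' layouts described in Embodiment 6 can be sketched as simple string concatenations. The sketch below is illustrative only: the function names are hypothetical, no actual R1, O, or tag sequences are given in the text (any sequences passed in are placeholders), and placing the unique molecular tag between the first tag and poly(T) is an assumption, since the text only states that it lies 3' of R1.

```python
_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def revcomp(seq):
    """Reverse complement of a DNA sequence (A, C, G, T, N only)."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq))


def first_oligo_a_route_i(r1, first_tag, umi="", poly_t_len=20):
    """Route (i): 5'-R1, first tag, optional UMI, poly(T)-3' (RT primer)."""
    return r1 + first_tag + umi + "T" * poly_t_len


def primer_a(consensus_o, overhang="CCC"):
    """Primer A: 5'-consensus O, then the complement of the cDNA 3' overhang-3'."""
    return consensus_o + revcomp(overhang)


def first_oligo_a_route_ii(r1, first_tag, overhang="CCC"):
    """Route (ii): 5'-R1, first tag, complement of the cDNA 3' overhang-3'."""
    return r1 + first_tag + revcomp(overhang)


if __name__ == "__main__":
    # Placeholder sequences, not taken from the source.
    print(first_oligo_a_route_i("ACGTAC", "GATTACA", umi="NNNNNNNN", poly_t_len=10))
    print(primer_a("TTGACC"))
```

With a CCC overhang on the cDNA, the 3' end of primer A (or of the route (ii) oligonucleotide) is GGG, which is what allows it to anneal to the overhang before extension.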

实施方案7.实施方案5或6的方法,其中,所述步骤(4)包括以下步骤:Embodiment 7. The method of embodiment 5 or 6, wherein step (4) comprises the following steps:

在所述第二离散分区内,以所述第二寡核苷酸分子和引物C为引物扩增所述第一核酸分子,生成的延伸产物即为所述第二核酸分子;In the second discrete partition, the first nucleic acid molecule is amplified using the second oligonucleotide molecule and primer C as primers, and the generated extension product is the second nucleic acid molecule;

其中,所述第二寡核苷酸分子从5’端至3’端包含:共有序列P1或其部分序列、所述第二标签序列、所述共有序列R1或其部分序列;所述引物C包含所述共有序列O或其部分序列,或者,所述引物C包含共有序列T或其部分序列。Wherein, the second oligonucleotide molecule comprises from the 5' end to the 3' end: the consensus sequence P1 or a partial sequence thereof, the second tag sequence, the consensus sequence R1 or a partial sequence thereof; the primer C comprises the consensus sequence O or a partial sequence thereof, or the primer C comprises the consensus sequence T or a partial sequence thereof.

实施方案8.实施方案1-4任一项的方法,其中,所述待标记的核酸分子为基因组DNA,并且,所述第一寡核苷酸分子为第一寡核苷酸分子b。Embodiment 8. The method of any one of Embodiments 1-4, wherein the nucleic acid molecule to be labeled is genomic DNA, and the first oligonucleotide molecule is the first oligonucleotide molecule b.

实施方案9.实施方案8的方法,其中,所述步骤(2)包括以下步骤:Embodiment 9. The method of embodiment 8, wherein step (2) comprises the following steps:

(a)将所述待标记核酸分子与转座酶复合体I孵育;其中,所述转座酶复合体I含有转座酶和所述转座酶能够识别并结合的转座序列,且能够切割或断裂双链核酸;并且,所述转座序列包含转移链和非转移链;其中,所述转移链包含第一转移链和第二转移链,所述第一转移链包含转座酶识别序列和共有序列R2或其部分序列,所述第二转移链包含转座酶识别序列和共有序列R1或其部分序列;并且,所述孵育在允许所述待标记的核酸分子被所述转座酶复合体I断裂成核酸片段且所述转移链被连接至所述核酸片段的末端(例如,所述核酸片段的5’端)的条件下进行;从而生成5’端分别含有共有序列R2或其部分序列以及共有序列R1或其部分序列的双链核酸片段;和,(a) incubating the nucleic acid molecule to be labeled with a transposase complex I; wherein the transposase complex I contains a transposase and a transposition sequence that the transposase can recognize and bind to, and can cut or break a double-stranded nucleic acid; and the transposition sequence comprises a transferred strand and a non-transferred strand; wherein the transferred strand comprises a first transferred strand and a second transferred strand, the first transferred strand comprises a transposase recognition sequence and a consensus sequence R2 or a partial sequence thereof, and the second transferred strand comprises a transposase recognition sequence and a consensus sequence R1 or a partial sequence thereof; and the incubation is performed under conditions that allow the nucleic acid molecule to be broken into nucleic acid fragments by the transposase complex I and the transferred strands are connected to the ends of the nucleic acid fragments (e.g., the 5' ends of the nucleic acid fragments); thereby generating double-stranded nucleic acid fragments whose 5' ends contain the consensus sequence R2 or a partial sequence thereof and the consensus sequence R1 or a partial sequence thereof, respectively; and,

(b)在所述第一离散分区内,将所述第一寡核苷酸分子b与(a)中生成的所述双链核酸片段进行连接(例如,利用核酸酶进行连接),并进行延伸反应,生成延伸产物,所述延伸产物即为所述第一核酸分子;其中,所述第一寡核苷酸分子b从5’端至3’端包含:共有序列P1或其部分序列和所述第一标签序列。(b) in the first discrete partition, ligating the first oligonucleotide molecule b to the double-stranded nucleic acid fragments generated in (a) (for example, by using a nuclease), and performing an extension reaction to generate an extension product, the extension product being the first nucleic acid molecule; wherein the first oligonucleotide molecule b comprises, from the 5' end to the 3' end: a consensus sequence P1 or a partial sequence thereof and the first tag sequence.

在某些实施方案中,所述第一核酸分子包含源自所述细胞或细胞核中处于染色质开放区的基因组DNA片段的序列。In certain embodiments, the first nucleic acid molecule comprises a sequence derived from a genomic DNA fragment in an open chromatin region in the cell or cell nucleus.

在某些实施方案中,所述步骤(a)在将所述细胞或细胞核分配到所述第一离散分区之前或之后进行。In certain embodiments, step (a) is performed before or after partitioning the cells or cell nuclei into the first discrete partitions.

在某些实施方案中,所述转座酶复合体I的共有序列R1或其部分序列的5’端是磷酸化的。In certain embodiments, the 5' end of the consensus sequence R1, or of a partial sequence thereof, in the transposase complex I is phosphorylated.

实施方案10.实施方案8或9的方法,其中,所述步骤(4)包括以下步骤:Embodiment 10. The method of embodiment 8 or 9, wherein step (4) comprises the following steps:

在所述第二离散分区内,以所述第二寡核苷酸分子和引物D为引物扩增所述第一核酸分子,生成的延伸产物即为所述第二核酸分子;In the second discrete partition, the first nucleic acid molecule is amplified using the second oligonucleotide molecule and primer D as primers, and the generated extension product is the second nucleic acid molecule;

其中,所述第二寡核苷酸分子从5’端至3’端包含:共有序列P2或其部分序列、所述第二标签序列、共有序列R2或其部分序列;所述引物D包含共有序列P1或其部分序列。Wherein, the second oligonucleotide molecule comprises from the 5' end to the 3' end: the consensus sequence P2 or a partial sequence thereof, the second tag sequence, the consensus sequence R2 or a partial sequence thereof; and the primer D comprises the consensus sequence P1 or a partial sequence thereof.

实施方案11.实施方案1-4任一项的方法,其中,所述待标记的核酸分子为mRNA和基因组DNA,并且,所述mRNA和基因组DNA具有相同的细胞来源;Embodiment 11. The method according to any one of embodiments 1 to 4, wherein the nucleic acid molecules to be labeled are mRNA and genomic DNA, and the mRNA and genomic DNA have the same cell source;

并且,所述第一寡核苷酸分子包括第一寡核苷酸分子a和第一寡核苷酸分子b,所述第二寡核苷酸分子包括第二寡核苷酸分子a和第二寡核苷酸分子b;Furthermore, the first oligonucleotide molecule includes a first oligonucleotide molecule a and a first oligonucleotide molecule b, and the second oligonucleotide molecule includes a second oligonucleotide molecule a and a second oligonucleotide molecule b;

其中,所述珠粒同时偶联了多个所述第一寡核苷酸分子a和多个所述第一寡核苷酸分子b;并且,同一个珠粒上的所述多个第一寡核苷酸分子a和多个所述第一寡核苷酸分子b具有相同的第一标签序列。wherein each bead is simultaneously coupled with a plurality of the first oligonucleotide molecules a and a plurality of the first oligonucleotide molecules b; and the plurality of first oligonucleotide molecules a and the plurality of first oligonucleotide molecules b on the same bead have the same first tag sequence.

实施方案12.实施方案11的方法,其中,所述步骤(2)包括以下步骤:Embodiment 12. The method of embodiment 11, wherein step (2) comprises the following steps:

(A)(i)(a)在所述第一离散分区内,用所述第一寡核苷酸分子a对所述待标记的mRNA分子进行逆转录,生成cDNA链,所述cDNA链包含以所述第一寡核苷酸分子a为逆转录引物形成的与所述待标记mRNA分子互补的cDNA序列,以及3’末端悬突;其中,所述第一寡核苷酸分子a从5’端至3’端包含:共有序列R1或其部分序列、所述第一标签序列、poly(T)序列;和,(b)将引物A与(a)中生成的所述cDNA链进行退火,并进行延伸反应,生成延伸产物,所述延伸产物即为第一核酸分子a;其中,所述引物A从5’端至3’端包含共有序列O和所述3’末端悬突的互补序列;(A)(i)(a) in the first discrete partition, reverse transcribing the mRNA molecule to be labeled using the first oligonucleotide molecule a to generate a cDNA strand, wherein the cDNA strand comprises a cDNA sequence complementary to the mRNA molecule to be labeled, formed using the first oligonucleotide molecule a as the reverse transcription primer, and a 3' terminal overhang; wherein the first oligonucleotide molecule a comprises, from the 5' end to the 3' end: a consensus sequence R1 or a partial sequence thereof, the first tag sequence, and a poly(T) sequence; and, (b) annealing primer A to the cDNA strand generated in (a) and performing an extension reaction to generate an extension product, the extension product being the first nucleic acid molecule a; wherein the primer A comprises, from the 5' end to the 3' end, a consensus sequence O and a sequence complementary to the 3' terminal overhang;

或者,or,

(ii)(a)在所述第一离散分区内,用引物B对所述待标记的mRNA分子进行逆转录,生成cDNA链,所述cDNA链包含以所述引物B为逆转录引物形成的与所述待标记mRNA分子互补的cDNA序列,以及3’末端悬突;其中,所述引物B从5’端至3’端包含共有序列T或其部分序列和poly(T)序列;和,(b)在所述第一离散分区内,将所述第一寡核苷酸分子a与(a)中生成的所述cDNA链进行退火,并进行延伸反应,生成延伸产物,所述延伸产物即为第一核酸分子a;其中,所述第一寡核苷酸分子a从5’端至3’端包含:共有序列R1或其部分序列、所述第一标签序列和所述3’末端悬突的互补序列;(ii) (a) in the first discrete partition, reverse transcribing the mRNA molecule to be labeled using primer B to generate a cDNA strand, wherein the cDNA strand comprises a cDNA sequence complementary to the mRNA molecule to be labeled, formed using primer B as the reverse transcription primer, and a 3' terminal overhang; wherein the primer B comprises, from the 5' end to the 3' end, a consensus sequence T or a partial sequence thereof and a poly(T) sequence; and, (b) in the first discrete partition, annealing the first oligonucleotide molecule a to the cDNA strand generated in (a) and performing an extension reaction to generate an extension product, the extension product being the first nucleic acid molecule a; wherein the first oligonucleotide molecule a comprises, from the 5' end to the 3' end: a consensus sequence R1 or a partial sequence thereof, the first tag sequence, and a sequence complementary to the 3' terminal overhang;

和,and,

(B)(a)将所述待标记DNA分子与转座酶复合体I孵育;其中,所述转座酶复合体I如实施方案9中所定义;并且,所述孵育在允许所述待标记的DNA分子被所述转座酶复合体I断裂成核酸片段且所述转移链被连接至所述核酸片段的末端(例如,所述核酸片段的5’端)的条件下进行;从而生成5’端分别含有共有序列R2或其部分序列以及共有序列R1或其部分序列的双链核酸片段;和,(B) (a) incubating the DNA molecule to be labeled with transposase complex I; wherein the transposase complex I is as defined in Embodiment 9; and the incubation is performed under conditions that allow the DNA molecule to be broken into nucleic acid fragments by the transposase complex I and the transferred strand to be connected to the end of the nucleic acid fragment (e.g., the 5' end of the nucleic acid fragment); thereby generating double-stranded nucleic acid fragments whose 5' ends contain a consensus sequence R2 or a partial sequence thereof and a consensus sequence R1 or a partial sequence thereof, respectively; and,

(b)在与(A)相同的所述第一离散分区内,将所述第一寡核苷酸分子b与(a)中生成的所述双链核酸片段进行连接,并进行延伸反应,生成延伸产物,所述延伸产物即为第一核酸分子b;其中,所述第一寡核苷酸分子b从5’端至3’端包含:共有序列P1或其部分序列和所述第一标签序列;(b) in the same first discrete partition as in (A), ligating the first oligonucleotide molecule b to the double-stranded nucleic acid fragments generated in (a), and performing an extension reaction to generate an extension product, the extension product being the first nucleic acid molecule b; wherein the first oligonucleotide molecule b comprises, from the 5' end to the 3' end: a consensus sequence P1 or a partial sequence thereof and the first tag sequence;

其中,所述步骤(A)和所述步骤(B)可以以任意顺序进行(例如,先(A)后(B),先(B)后(A),或同时进行)。Wherein, the step (A) and the step (B) may be performed in any order (for example, (A) first and then (B), (B) first and then (A), or simultaneously).

在某些实施方案中,所述步骤(A)(ii)中,步骤(A)(ii)(a)在将所述细胞或细胞核分配到所述第一离散分区之前或之后进行。In certain embodiments, in said step (A)(ii), step (A)(ii)(a) is performed before or after assigning said cells or cell nuclei to said first discrete partitions.

在某些实施方案中,所述步骤(B)(a)在将所述细胞或细胞核分配到所述第一离散分区之前或之后进行。In certain embodiments, step (B)(a) is performed before or after partitioning the cells or cell nuclei into the first discrete partitions.

在某些实施方案中,所述共有序列O与所述共有序列T相同或部分相同。In certain embodiments, the consensus sequence O is identical or partially identical to the consensus sequence T.

在某些实施方案中,所述第一核酸分子b包含源自所述细胞或细胞核中处于染色质开放区的基因组DNA片段的序列。In certain embodiments, the first nucleic acid molecule b comprises a sequence derived from a genomic DNA fragment in an open chromatin region in the cell or cell nucleus.

在某些实施方案中,所述转座酶复合体I中的共有序列R1或其部分序列的5’端是磷酸化的。In certain embodiments, the 5' end of the consensus sequence R1, or of a partial sequence thereof, in the transposase complex I is phosphorylated.

在某些实施方案中,所述第一寡核苷酸分子a进一步包含独特分子标签序列,并且,同一个珠粒上偶联的多个所述第一寡核苷酸分子a具有彼此不同的独特分子标签序列;在某些实施方案中,所述独特分子标签序列位于所述共有序列R1或其部分序列的3’端。In certain embodiments, the first oligonucleotide molecule a further comprises a unique molecular tag sequence, and the plurality of first oligonucleotide molecules a coupled to the same bead have unique molecular tag sequences that differ from one another; in certain embodiments, the unique molecular tag sequence is located at the 3' end of the consensus sequence R1 or a partial sequence thereof.

在某些实施方案中,所述3’末端悬突具有至少1个,至少2个,至少3个,至少4个,至少5个,至少6个,至少7个,至少8个,至少9个,至少10个,1-10个,1-5个或2-10个核苷酸的长度;在某些实施方案中,所述3’末端悬突为2-5个胞嘧啶核苷酸的悬突(例如CCC悬突)。In certain embodiments, the 3' terminal overhang has a length of at least 1, at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, 1-10, 1-5 or 2-10 nucleotides; in certain embodiments, the 3' terminal overhang is an overhang of 2-5 cytosine nucleotides (e.g., a CCC overhang).

实施方案13.实施方案11或12的方法,其中,所述步骤(4)包括以下步骤:Embodiment 13. The method of embodiment 11 or 12, wherein step (4) comprises the following steps:

(a)在所述第二离散分区内,以所述第二寡核苷酸分子a和引物C为引物扩增所述第一核酸分子a,生成的延伸产物即为所述第二核酸分子a;(a) amplifying the first nucleic acid molecule a in the second discrete partition using the second oligonucleotide molecule a and primer C as primers, and the generated extension product is the second nucleic acid molecule a;

其中,所述第二寡核苷酸分子a从5’端至3’端包含:共有序列P1或其部分序列、所述第二标签序列、所述共有序列R1或其部分序列;所述引物C包含共有序列O或其部分序列,或者,所述引物C包含共有序列T或其部分序列;Wherein, the second oligonucleotide molecule a comprises from the 5' end to the 3' end: the consensus sequence P1 or a partial sequence thereof, the second tag sequence, the consensus sequence R1 or a partial sequence thereof; the primer C comprises the consensus sequence O or a partial sequence thereof, or the primer C comprises the consensus sequence T or a partial sequence thereof;

and

(b)在同一所述第二离散分区内,以所述第二寡核苷酸分子b和引物D为引物扩增所述第一核酸分子b,生成的延伸产物即为所述第二核酸分子b;(b) amplifying the first nucleic acid molecule b in the same second discrete partition using the second oligonucleotide molecule b and primer D as primers, and the generated extension product is the second nucleic acid molecule b;

其中,所述第二寡核苷酸分子b从5’端至3’端包含:共有序列P2或其部分序列、所述第二标签序列、共有序列R2或其部分序列;所述引物D包含共有序列P1或其部分序列;Wherein, the second oligonucleotide molecule b comprises from the 5' end to the 3' end: the consensus sequence P2 or a partial sequence thereof, the second tag sequence, the consensus sequence R2 or a partial sequence thereof; the primer D comprises the consensus sequence P1 or a partial sequence thereof;

其中,所述步骤(a)和所述步骤(b)可以以任意顺序进行(例如,先(a)后(b),先(b)后(a),或同时进行)。Wherein, the step (a) and the step (b) may be performed in any order (for example, (a) first and then (b), (b) first and then (a), or simultaneously).

实施方案14.一种构建核酸分子文库的方法,其包括,Embodiment 14. A method for constructing a nucleic acid molecule library, comprising:

(1)根据实施方案1-13任一项的方法生成多个经标记的所述第二核酸分子,以及, (1) generating a plurality of labeled second nucleic acid molecules according to the method of any one of embodiments 1 to 13, and,

(2)回收和/或合并多个所述第二核酸分子,(2) recovering and/or combining a plurality of said second nucleic acid molecules,

从而获得核酸分子文库。Thus, a nucleic acid molecule library is obtained.

在某些实施方案中,在步骤(2)中,回收和/或合并多个所述第二离散分区中生成的所述第二核酸分子。In certain embodiments, in step (2), the second nucleic acid molecules generated in a plurality of the second discrete partitions are recovered and/or combined.

实施方案15.实施方案14的方法,其包括:Embodiment 15. The method of embodiment 14, comprising:

(a)根据实施方案5-7任一项的方法生成多个经标记的所述第二核酸分子,(a) generating a plurality of labeled second nucleic acid molecules according to the method of any one of embodiments 5-7,

(b)回收和/或合并多个所述第二核酸分子;在某些实施方案中,回收和/或合并多个所述第二离散分区中生成的所述第二核酸分子;和(b) recovering and/or combining a plurality of said second nucleic acid molecules; in certain embodiments, recovering and/or combining a plurality of said second nucleic acid molecules generated in said second discrete partitions; and

(c)将所述第二核酸分子随机打断并添加接头序列;(c) randomly breaking the second nucleic acid molecule and adding a linker sequence;

从而获得核酸分子文库序列。Thus, the sequence of the nucleic acid molecule library is obtained.

实施方案16.实施方案15的方法,其中,所述细胞为T细胞或B细胞。Embodiment 16. The method of Embodiment 15, wherein the cell is a T cell or a B cell.

在某些实施方案中,所述方法在步骤(a)之后步骤(c)之前,所述方法还包括对靶核酸分子进行富集的步骤;所述靶核酸分子为包含:(i)编码T细胞受体(TCR)或B细胞受体(BCR)的核苷酸序列或其部分序列(例如,V(D)J序列),和/或,(ii)(i)的互补序列的所述第二核酸分子。In certain embodiments, the method further comprises, after step (a) and before step (c), a step of enriching the target nucleic acid molecule; the target nucleic acid molecule is a second nucleic acid molecule comprising: (i) a nucleotide sequence encoding a T cell receptor (TCR) or a B cell receptor (BCR) or a partial sequence thereof (e.g., a V(D)J sequence), and/or (ii) a complementary sequence of (i).

实施方案17.实施方案14-16任一项的方法,其中,所述步骤(c)中,通过转座酶将所述第二核酸分子随机打断并在其5’端添加接头序列。Embodiment 17. The method of any one of Embodiments 14-16, wherein, in step (c), the second nucleic acid molecule is randomly fragmented by a transposase and a linker sequence is added to its 5' end.

在某些实施方案中,所述接头序列包含共有序列R2或其部分序列。In certain embodiments, the linker sequence comprises the consensus sequence R2 or a partial sequence thereof.

实施方案18.实施方案14-17任一项的方法,其中,所述方法还包括步骤(d):Embodiment 18. The method of any one of Embodiments 14-17, wherein the method further comprises step (d):

纯化和/或扩增步骤(c)的产物中含有所述第一标签或其互补序列和所述第二标签或其互补序列的核酸分子的步骤。A step of purifying and/or amplifying the nucleic acid molecule containing the first tag or its complementary sequence and the second tag or its complementary sequence in the product of step (c).

在某些实施方案中,所述步骤(d)包括:使用引物E和引物F对步骤(c)的产物进行扩增,其中,所述引物E包含共有序列P1以及任选的第三标签序列,所述引物F从5’至3’包含:共有序列P2或其互补序列、任选的第四标签序列、共有序列R2或其部分序列。In certain embodiments, step (d) comprises: amplifying the product of step (c) using primer E and primer F, wherein primer E comprises a consensus sequence P1 and an optional third tag sequence, and primer F comprises from 5' to 3': a consensus sequence P2 or its complementary sequence, an optional fourth tag sequence, a consensus sequence R2 or a partial sequence thereof.

实施方案19.实施方案14的方法,其包括:Embodiment 19. The method of embodiment 14, comprising:

(a)根据实施方案8-10任一项的方法生成多个经标记的所述第二核酸分子,和,(a) generating a plurality of labeled second nucleic acid molecules according to the method of any one of embodiments 8-10, and,

(b)回收和/或合并多个所述第二核酸分子;在某些实施方案中,回收和/或合并多个所述第二离散分区中生成的所述第二核酸分子;(b) recovering and/or combining a plurality of said second nucleic acid molecules; in certain embodiments, recovering and/or combining a plurality of said second nucleic acid molecules generated in said second discrete partitions;

从而获得核酸分子文库序列。Thus, the sequence of the nucleic acid molecule library is obtained.

实施方案20.实施方案19的方法,其中,所述方法还包括步骤(c):Embodiment 20. The method of Embodiment 19, wherein the method further comprises step (c):

纯化和/或扩增步骤(b)的产物中含有所述第一标签或其互补序列和所述第二标签或其互补序列的核酸分子的步骤。A step of purifying and/or amplifying the nucleic acid molecule containing the first tag or its complementary sequence and the second tag or its complementary sequence in the product of step (b).

在某些实施方案中,所述步骤(c)包括:使用引物E’和引物F’对步骤(b)的产物进行扩增,其中,所述引物E’包含共有序列P1,以及任选的第三标签序列,所述引物F’包含共有序列P2以及任选的第四标签序列。In certain embodiments, step (c) comprises: amplifying the product of step (b) using primer E’ and primer F’, wherein primer E’ comprises a consensus sequence P1 and an optional third tag sequence, and primer F’ comprises a consensus sequence P2 and an optional fourth tag sequence.

实施方案21.实施方案14的方法,其包括: Embodiment 21. The method of embodiment 14, comprising:

(a)根据实施方案11-13任一项的方法生成多个经标记的所述第二核酸分子,其包括多个所述第二核酸分子a和多个所述第二核酸分子b,和,(a) generating a plurality of labeled second nucleic acid molecules according to the method of any one of embodiments 11 to 13, comprising a plurality of second nucleic acid molecules a and a plurality of second nucleic acid molecules b, and,

(b)回收和/或合并多个所述第二核酸分子;在某些实施方案中,回收和/或合并多个所述第二离散分区中生成的所述第二核酸分子;(b) recovering and/or combining a plurality of said second nucleic acid molecules; in certain embodiments, recovering and/or combining a plurality of said second nucleic acid molecules generated in said second discrete partitions;

从而获得核酸分子文库序列。Thus, the sequence of the nucleic acid molecule library is obtained.

实施方案22.实施方案21的方法,其中,所述方法在步骤(b)之后,还包括步骤(c):将所述第二核酸分子a随机打断并添加接头序列。Embodiment 22. The method of embodiment 21, wherein, after step (b), the method further comprises step (c): randomly breaking the second nucleic acid molecule a and adding a linker sequence.

在某些实施方案中,所述步骤(c)中,通过转座酶将所述第二核酸分子a随机打断并在其5’端添加接头序列。In certain embodiments, in step (c), the second nucleic acid molecule a is randomly fragmented by a transposase and a linker sequence is added to its 5' end.

在某些实施方案中,所述接头序列包含共有序列R2或其部分序列。In certain embodiments, the linker sequence comprises the consensus sequence R2 or a partial sequence thereof.

实施方案23.实施方案22的方法,其中,在步骤(c)之前,所述方法还包括从步骤(b)的产物中特异性富集所述第二核酸分子a。Embodiment 23. The method of embodiment 22, wherein, before step (c), the method further comprises specifically enriching the second nucleic acid molecule a from the product of step (b).

在某些实施方案中,所述方法通过携带生物素标记的引物G特异性扩增富集所述第二核酸分子a。In certain embodiments, the method specifically amplifies and enriches the second nucleic acid molecule a by using a primer G carrying a biotin label.

在某些实施方案中,所述引物G含有共有序列O或其部分序列,或者,所述引物G包含共有序列T或其部分序列。In certain embodiments, the primer G contains the consensus sequence O or a partial sequence thereof, or the primer G contains the consensus sequence T or a partial sequence thereof.

在某些实施方案中,所述扩增富集还包括使用引物H,所述引物H包含共有序列P1或其部分序列。In certain embodiments, the amplification and enrichment further comprises using a primer H, wherein the primer H comprises a consensus sequence P1 or a partial sequence thereof.

实施方案24.实施方案22或23的方法,其中,所述方法还包括步骤(d):Embodiment 24. The method of Embodiment 22 or 23, wherein the method further comprises step (d):

纯化和/或扩增步骤(c)的产物中含有所述第一标签或其互补序列和所述第二标签或其互补序列的核酸分子的步骤。A step of purifying and/or amplifying the nucleic acid molecule containing the first tag or its complementary sequence and the second tag or its complementary sequence in the product of step (c).

在某些实施方案中,所述步骤(c)包括:使用引物E和引物F对步骤(c)的产物进行扩增,其中,所述引物E包含共有序列P1以及任选的第三标签序列,所述引物F从5’至3’包含:共有序列P2或其互补序列、任选的第四标签序列、共有序列R2。In certain embodiments, step (c) comprises: amplifying the product of step (c) using primer E and primer F, wherein primer E comprises a consensus sequence P1 and an optional third tag sequence, and primer F comprises from 5' to 3': a consensus sequence P2 or its complementary sequence, an optional fourth tag sequence, and a consensus sequence R2.

实施方案25.实施方案21-24任一项的方法,其中,所述方法还包括步骤(d)’:Embodiment 25. The method of any one of Embodiments 21-24, wherein the method further comprises step (d)':

纯化和/或扩增步骤(b)的产物中含有所述第一标签或其互补序列和所述第二标签或其互补序列的所述第二核酸分子b的步骤。A step of purifying and/or amplifying the second nucleic acid molecule b containing the first tag or its complementary sequence and the second tag or its complementary sequence in the product of step (b).

在某些实施方案中,所述步骤(d)’包括:使用引物E’和引物F’对步骤(b)的产物进行扩增,其中,所述引物E’包含共有序列P1,以及任选的第三标签序列,所述引物F’包含共有序列P2以及任选的第四标签序列。In certain embodiments, step (d)' comprises: amplifying the product of step (b) using primer E' and primer F', wherein primer E' comprises a consensus sequence P1 and an optional third tag sequence, and primer F' comprises a consensus sequence P2 and an optional fourth tag sequence.

实施方案26.一种对细胞或细胞核进行组学测序的方法,其包括:Embodiment 26. A method for performing omics sequencing on a cell or a cell nucleus, comprising:

根据实施方案14-25任一项所述的方法构建核酸分子文库;和,Constructing a nucleic acid molecule library according to the method described in any one of Embodiments 14-25; and,

对所述核酸分子文库进行测序。The nucleic acid molecule library is sequenced.

在某些实施方案中,在测序之前,将至少2个,至少3个,至少4个,至少5个,至少8个,至少10个,至少12个,至少15个,至少18个,至少20个,至少25个,2-5个,2-10个,2-20个,2-30个,2-40个或2-50个核酸分子文库合并,然后进行测序;其中,每个核酸分子文库各自具有多个核酸分子(即,扩增产物),且同一个文库中的所述多个核酸分子具有相同的所述第三标签序列或者相同的所述第四标签序列;且,来源于不同文库的核酸分子具有彼此不同的所述第三标签序列或者彼此不同的所述第四标签序列。In certain embodiments, prior to sequencing, at least 2, at least 3, at least 4, at least 5, at least 8, at least 10, at least 12, at least 15, at least 18, at least 20, at least 25, 2-5, 2-10, 2-20, 2-30, 2-40 or 2-50 nucleic acid molecule libraries are combined and then sequenced; wherein each nucleic acid molecule library has a plurality of nucleic acid molecules (i.e., amplification products), and the plurality of nucleic acid molecules in the same library have the same third tag sequence or the same fourth tag sequence; and nucleic acid molecules derived from different libraries have third tag sequences that differ from one another or fourth tag sequences that differ from one another.
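When several libraries are pooled before sequencing, a read can be attributed to its cell of origin by combining the library-level tag (the third or fourth tag) with the first and second tags. The following is a minimal sketch of that bookkeeping, assuming the three tags have already been extracted from each read by some upstream step not described here; the function name `reads_per_cell` and the tuple layout are hypothetical.

```python
from collections import Counter


def reads_per_cell(records):
    """Count reads per (library_tag, first_tag, second_tag) combination.

    `records` is an iterable of 3-tuples. Reads from different pooled
    libraries stay separate even if their first and second tags coincide.
    """
    return Counter((lib, first, second) for lib, first, second in records)


if __name__ == "__main__":
    # Placeholder tag values, not taken from the source.
    demo = [("lib1", "AAGT", "CCTA"), ("lib1", "AAGT", "CCTA"), ("lib2", "AAGT", "CCTA")]
    for key, n in reads_per_cell(demo).items():
        print(key, n)
```

In this sketch the library tag only needs to be unique per pooled library, so the same sets of first and second tags can be reused across libraries.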

实施方案27.一种核酸分子文库,其由实施方案14-25任一项所述的方法所构建。Embodiment 27. A nucleic acid molecule library constructed by the method described in any one of embodiments 14-25.

实施方案28.试剂组合物,其具备选自I、II和III的特征:Embodiment 28. A reagent composition having a feature selected from (I), (II) and (III):

(I)所述试剂组合物包含第二寡核苷酸分子a,所述第二寡核苷酸分子a的序列从5’端至3’端顺次包含:共有序列P1或其部分序列,第二标签序列,共有序列R1或其部分序列;(I) the reagent composition comprises a second oligonucleotide molecule a, the sequence of which comprises, from the 5' end to the 3' end, a consensus sequence P1 or a partial sequence thereof, a second tag sequence, and a consensus sequence R1 or a partial sequence thereof;

并且,所述试剂组合物进一步包含选自以下的一项或多项:Furthermore, the reagent composition further comprises one or more selected from the following:

(I-a)多个偶联了多个第一寡核苷酸分子a的珠粒,其中,所述第一寡核苷酸分子a含有第一标签序列;(I-a) a plurality of beads coupled with a plurality of first oligonucleotide molecules a, wherein the first oligonucleotide molecules a contain a first tag sequence;

并且,同一个珠粒上的所述多个第一寡核苷酸分子a具有相同的第一标签序列,并且,不同珠粒上的所述第一寡核苷酸分子a具有彼此不同的第一标签序列;Furthermore, the plurality of first oligonucleotide molecules a on the same bead have the same first tag sequence, and the first oligonucleotide molecules a on different beads have first tag sequences different from each other;

在某些实施方案中,所述第一寡核苷酸分子a从5’端至3’端包含:(i)共有序列R1或其部分序列、所述第一标签序列、poly(T)序列;或者,(ii)共有序列R1或其部分序列、所述第一标签序列和cDNA 3’末端悬突的互补序列;In certain embodiments, the first oligonucleotide molecule a comprises, from the 5' end to the 3' end: (i) a consensus sequence R1 or a partial sequence thereof, the first tag sequence, and a poly(T) sequence; or (ii) a consensus sequence R1 or a partial sequence thereof, the first tag sequence, and a complementary sequence to the 3' end overhang of the cDNA;

在某些实施方案中,所述第一寡核苷酸分子a进一步包含独特分子标签序列,并且,同一个珠粒上偶联的多个所述第一寡核苷酸分子a具有彼此不同的独特分子标签序列;在某些实施方案中,所述独特分子标签序列位于所述共有序列R1或其部分序列的3’端;In certain embodiments, the first oligonucleotide molecule a further comprises a unique molecular tag sequence, and the plurality of first oligonucleotide molecules a coupled to the same bead have unique molecular tag sequences that differ from one another; in certain embodiments, the unique molecular tag sequence is located at the 3' end of the consensus sequence R1 or a partial sequence thereof;

(I-b)引物A或引物B,其中,所述引物A从5’端至3’端包含共有序列O和cDNA 3’末端悬突的互补序列,所述引物B从5’端至3’端包含共有序列T或其部分序列和poly(T)序列;在某些实施方案中,所述共有序列O与所述共有序列T相同或部分相同;(I-b) primer A or primer B, wherein the primer A comprises a consensus sequence O and a complementary sequence of the cDNA 3’ end overhang from the 5’ end to the 3’ end, and the primer B comprises a consensus sequence T or a partial sequence thereof and a poly(T) sequence from the 5’ end to the 3’ end; in certain embodiments, the consensus sequence O is identical or partially identical to the consensus sequence T;

(I-c)引物C,所述引物C包含所述共有序列O或其部分序列,或者,所述引物C包含所述共有序列T或其部分序列;在某些实施方案中,所述共有序列O与所述共有序列T相同或部分相同;(I-c) primer C, the primer C comprises the consensus sequence O or a partial sequence thereof, or the primer C comprises the consensus sequence T or a partial sequence thereof; in certain embodiments, the consensus sequence O is identical or partially identical to the consensus sequence T;

(I-d)引物E和/或引物F,其中,所述引物E包含共有序列P1以及任选的第三标签序列,所述引物F从5’至3’包含:共有序列P2或其互补序列、任选的第四标签序列、共有序列R2或其部分序列;(I-d) Primer E and/or primer F, wherein the primer E comprises a consensus sequence P1 and an optional third tag sequence, and the primer F comprises from 5' to 3': a consensus sequence P2 or a complementary sequence thereof, an optional fourth tag sequence, a consensus sequence R2 or a partial sequence thereof;

(I-e)转座酶复合体II,所述转座酶复合体II含有转座酶和所述转座酶能够识别并结合的转座序列,且能够切割或断裂双链核酸;并且,所述转座序列包含转移链和非转移链;其中,所述转移链包含接头序列;在某些实施方案中,所述接头序列包含共有序列R2或其部分序列;(I-e) a transposase complex II, wherein the transposase complex II contains a transposase and a transposition sequence that the transposase can recognize and bind to, and can cut or break a double-stranded nucleic acid; and the transposition sequence comprises a transferred strand and a non-transferred strand; wherein the transferred strand comprises a linker sequence; in certain embodiments, the linker sequence comprises a consensus sequence R2 or a partial sequence thereof;

在某些实施方案中,所述cDNA 3’末端悬突具有至少1个,至少2个,至少3个,至少4个,至少5个,至少6个,至少7个,至少8个,至少9个,至少10个,1-10个,1-5个或2-10个核苷酸的长度;在某些实施方案中,所述cDNA 3’末端悬突为2-5个胞嘧啶核苷酸的悬突(例如CCC悬突); In certain embodiments, the cDNA 3' terminal overhang has a length of at least 1, at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, 1-10, 1-5 or 2-10 nucleotides; in certain embodiments, the cDNA 3' terminal overhang is an overhang of 2-5 cytosine nucleotides (e.g., a CCC overhang);

(II)所述试剂组合物包含第二寡核苷酸分子b,所述第二寡核苷酸分子b的序列从5’端至3’端顺次包含:共有序列P2或其部分序列,第二标签序列,共有序列R2或其部分序列;(II) the reagent composition comprises a second oligonucleotide molecule b, the sequence of which comprises, from the 5' end to the 3' end, a consensus sequence P2 or a partial sequence thereof, a second tag sequence, and a consensus sequence R2 or a partial sequence thereof;

并且,所述试剂组合物进一步包含选自以下的一项或多项:Furthermore, the reagent composition further comprises one or more selected from the following:

(II-a)多个偶联了多个第一寡核苷酸分子b的珠粒,其中,所述第一寡核苷酸分子b含有第一标签序列;(II-a) a plurality of beads coupled with a plurality of first oligonucleotide molecules b, wherein the first oligonucleotide molecules b contain a first tag sequence;

并且,同一个珠粒上的所述多个第一寡核苷酸分子b具有相同的第一标签序列,并且,不同珠粒上的所述第一寡核苷酸分子b具有彼此不同的第一标签序列;Furthermore, the plurality of first oligonucleotide molecules b on the same bead have the same first tag sequence, and the first oligonucleotide molecules b on different beads have first tag sequences different from each other;

在某些实施方案中,所述第一寡核苷酸分子b从5’端至3’端包含:共有序列P1或其部分序列和所述第一标签序列;In certain embodiments, the first oligonucleotide molecule b comprises from the 5' end to the 3' end: a consensus sequence P1 or a partial sequence thereof and the first tag sequence;

(II-b)转座酶复合体I,所述转座酶复合体I如实施方案9中所定义;(II-b) transposase complex I, wherein the transposase complex I is as defined in embodiment 9;

(II-c)引物D,其中,所述引物D包含共有序列P1或其部分序列;(II-c) primer D, wherein the primer D comprises the consensus sequence P1 or a partial sequence thereof;

(II-d)引物E’和/或引物F’,其中,所述引物E’包含共有序列P1,以及任选的第三标签序列,所述引物F’包含共有序列P2以及任选的第四标签序列;(II-d) primer E' and/or primer F', wherein the primer E' comprises a consensus sequence P1 and an optional third tag sequence, and the primer F' comprises a consensus sequence P2 and an optional fourth tag sequence;

(III)所述试剂组合物包含第二寡核苷酸分子a和第二寡核苷酸分子b;其中,所述第二寡核苷酸分子a的序列从5’端至3’端顺次包含:共有序列P1或其部分序列,第二标签序列,共有序列R1或其部分序列;所述第二寡核苷酸分子b的序列从5’端至3’端顺次包含:共有序列P2或其部分序列,第二标签序列,共有序列R2或其部分序列;(III) The reagent composition comprises a second oligonucleotide molecule a and a second oligonucleotide molecule b; wherein the sequence of the second oligonucleotide molecule a comprises, from the 5' end to the 3' end: a consensus sequence P1 or a partial sequence thereof, a second tag sequence, and a consensus sequence R1 or a partial sequence thereof; the sequence of the second oligonucleotide molecule b comprises, from the 5' end to the 3' end: a consensus sequence P2 or a partial sequence thereof, a second tag sequence, and a consensus sequence R2 or a partial sequence thereof;

并且,所述试剂组合物进一步包含选自以下的一项或多项:Furthermore, the reagent composition further comprises one or more selected from the following:

(III-a)多个同时偶联了多个所述第一寡核苷酸分子a和多个所述第一寡核苷酸分子b的珠粒,且,同一个珠粒上的所述多个第一寡核苷酸分子a和多个所述第一寡核苷酸分子b具有相同的第一标签序列,不同珠粒上的所述第一寡核苷酸分子a具有彼此不同的第一标签序列,不同珠粒上的所述第一寡核苷酸分子b具有彼此不同的第一标签序列;(III-a) a plurality of beads to which a plurality of the first oligonucleotide molecules a and a plurality of the first oligonucleotide molecules b are simultaneously coupled, wherein the plurality of the first oligonucleotide molecules a and the plurality of the first oligonucleotide molecules b on the same bead have the same first tag sequence, the first oligonucleotide molecules a on different beads have first tag sequences different from each other, and the first oligonucleotide molecules b on different beads have first tag sequences different from each other;

在某些实施方案中,所述第一寡核苷酸分子a从5’端至3’端包含:(i)共有序列R1或其部分序列、所述第一标签序列、poly(T)序列;或者,(ii)共有序列R1或其部分序列、所述第一标签序列和cDNA 3’末端悬突的互补序列;和/或,所述第一寡核苷酸分子b从5’端至3’端包含:共有序列P1或其部分序列和所述第一标签序列;In certain embodiments, the first oligonucleotide molecule a comprises, from the 5' end to the 3' end: (i) a consensus sequence R1 or a partial sequence thereof, the first tag sequence, and a poly(T) sequence; or, (ii) a consensus sequence R1 or a partial sequence thereof, the first tag sequence, and a complementary sequence to the 3' end overhang of the cDNA; and/or, the first oligonucleotide molecule b comprises, from the 5' end to the 3' end: a consensus sequence P1 or a partial sequence thereof and the first tag sequence;

在某些实施方案中,所述第一寡核苷酸分子a进一步包含独特分子标签序列,并且,同一个珠粒上偶联的多个所述第一寡核苷酸分子a具有彼此不同的独特分子标签序列;在某些实施方案中,所述独特分子标签序列位于所述共有序列R1或其部分序列的3’端;In some embodiments, the first oligonucleotide molecule a further comprises a unique molecular tag sequence, and the plurality of first oligonucleotide molecules a coupled to the same bead have unique molecular tag sequences that are different from each other; in some embodiments, the unique molecular tag sequence is located at the 3' end of the common sequence R1 or a partial sequence thereof;

(III-b)引物A或引物B,其中,所述引物A从5’端至3’端包含共有序列O和cDNA 3’末端悬突的互补序列,所述引物B从5’端至3’端包含共有序列T或其部分序列和poly(T)序列;在某些实施方案中,所述共有序列O与所述共有序列T相同或部分相同;(III-b) primer A or primer B, wherein the primer A comprises a consensus sequence O and a complementary sequence of the cDNA 3’ end overhang from the 5’ end to the 3’ end, and the primer B comprises a consensus sequence T or a partial sequence thereof and a poly(T) sequence from the 5’ end to the 3’ end; in certain embodiments, the consensus sequence O is identical or partially identical to the consensus sequence T;

(III-c)转座酶复合体I,所述转座酶复合体I如实施方案9中所定义;(III-c) transposase complex I, wherein the transposase complex I is as defined in embodiment 9;

(III-d)引物C,所述引物C包含所述共有序列O或其部分序列,或者,所述引物C包含所述共有序列T或其部分序列;在某些实施方案中,所述共有序列O与所述共有序列T相同或部分相同;(III-d) primer C, wherein the primer C comprises the consensus sequence O or a partial sequence thereof, or the primer C comprises the consensus sequence T or a partial sequence thereof; in certain embodiments, the consensus sequence O is identical or partially identical to the consensus sequence T;

(III-e)引物D,其中,所述引物D包含共有序列P1或其部分序列;(III-e) primer D, wherein the primer D comprises the consensus sequence P1 or a partial sequence thereof;

(III-f)引物E和/或引物F,其中,所述引物E包含共有序列P1以及任选的第三标签序列,所述引物F从5’至3’包含:共有序列P2或其互补序列、任选的第四标签序列、共有序列R2或其部分序列;(III-f) Primer E and/or primer F, wherein the primer E comprises a consensus sequence P1 and an optional third tag sequence, and the primer F comprises from 5' to 3': a consensus sequence P2 or a complementary sequence thereof, an optional fourth tag sequence, a consensus sequence R2 or a partial sequence thereof;

(III-g)引物E’和/或引物F’,其中,所述引物E’包含共有序列P1,以及任选的第三标签序列,所述引物F’包含共有序列P2以及任选的第四标签序列;(III-g) Primer E' and/or primer F', wherein the primer E' comprises a consensus sequence P1 and an optional third tag sequence, and the primer F' comprises a consensus sequence P2 and an optional fourth tag sequence;

(III-h)引物G和/或引物H,所述引物G携带生物素标记并且含有共有序列O或其部分序列或者共有序列T或其部分序列,所述引物H包含共有序列P1或其部分序列;在某些实施方案中,所述共有序列O与所述共有序列T相同或部分相同;(III-h) primer G and/or primer H, wherein primer G carries a biotin label and contains a consensus sequence O or a partial sequence thereof or a consensus sequence T or a partial sequence thereof, and primer H contains a consensus sequence P1 or a partial sequence thereof; in certain embodiments, the consensus sequence O is identical or partially identical to the consensus sequence T;

(III-i)转座酶复合体II,所述转座酶复合体II含有转座酶和所述转座酶能够识别并结合的转座序列,且能够切割或断裂双链核酸;并且,所述转座序列包含转移链和非转移链;其中,所述转移链包含接头序列;在某些实施方案中,所述接头序列包含共有序列R2或其部分序列;(III-i) transposase complex II, wherein the transposase complex II contains a transposase and a transposition sequence that the transposase can recognize and bind to, and can cut or break a double-stranded nucleic acid; and the transposition sequence comprises a transferred strand and a non-transferred strand; wherein the transferred strand comprises a linker sequence; in certain embodiments, the linker sequence comprises a consensus sequence R2 or a partial sequence thereof;

在某些实施方案中,所述cDNA 3’末端悬突具有至少1个,至少2个,至少3个,至少4个,至少5个,至少6个,至少7个,至少8个,至少9个,至少10个,1-10个,1-5个或2-10个核苷酸的长度;在某些实施方案中,所述cDNA 3’末端悬突为2-5个胞嘧啶核苷酸的悬突(例如CCC悬突)。In some embodiments, the cDNA 3’ terminal overhang has a length of at least 1, at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, 1-10, 1-5 or 2-10 nucleotides; in some embodiments, the cDNA 3’ terminal overhang is an overhang of 2-5 cytosine nucleotides (e.g., a CCC overhang).
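The layout of the second oligonucleotide molecules and the bead tagging rule in Embodiment 28 can be sketched in Python. This is only an illustrative sketch: the sequences `P1`, `R1`, `P2` and `R2` below are hypothetical placeholders (the embodiment does not recite concrete sequences here), and the helper names `second_oligo` and `beads_are_valid` are assumptions introduced for illustration.

```python
# Hypothetical placeholder sequences; not taken from the patent.
P1 = "AATGATACGG"
R1 = "ACACTCTTTC"
P2 = "CAAGCAGAAG"
R2 = "GTGACTGGAG"


def second_oligo(consensus_p: str, second_tag: str, consensus_r: str) -> str:
    """Concatenate, 5' to 3': consensus P, second tag sequence, consensus R."""
    return consensus_p + second_tag + consensus_r


def beads_are_valid(beads: dict[str, list[str]]) -> bool:
    """Check the bead rule: one first tag per bead, distinct tags across beads.

    `beads` maps a bead identifier to the first tag sequences carried by the
    first oligonucleotide molecules coupled to that bead.
    """
    tags_per_bead = [set(tags) for tags in beads.values()]
    # Every oligonucleotide on the same bead must carry the same first tag.
    if any(len(tags) != 1 for tags in tags_per_bead):
        return False
    # Different beads must carry different first tags.
    bead_tags = [next(iter(tags)) for tags in tags_per_bead]
    return len(set(bead_tags)) == len(bead_tags)
```

For example, `second_oligo(P1, "ACGT", R1)` models second oligonucleotide molecule a, and `second_oligo(P2, "ACGT", R2)` models second oligonucleotide molecule b.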

实施方案29.实施方案28的试剂组合物,所述试剂组合物进一步包含用于固定和/或透化细胞或细胞核的试剂。Embodiment 29. The reagent composition of Embodiment 28, further comprising a reagent for fixing and/or permeabilizing cells or cell nuclei.

在某些实施方案中,所述试剂组合物进一步包含甲醇、甲醛和/或多聚甲醛。In certain embodiments, the reagent composition further comprises methanol, formaldehyde and/or paraformaldehyde.

在某些实施方案中,所述试剂组合物进一步包含Triton X-100、digitonin、IGEPAL(例如,CA-630)、和/或Tween-20。In certain embodiments, the reagent composition further comprises Triton X-100, digitonin, IGEPAL (e.g., CA-630), and/or Tween-20.

在某些实施方案中,所述试剂组合物进一步包含:Rnase抑制剂,矿物油,缓冲液,dNTP,一种或多种核酸聚合酶(例如DNA聚合酶;例如具有链置换活性和/或高保真性的DNA聚合酶),用于回收或纯化核酸的试剂(例如磁珠),孔板,或其任何组合。In certain embodiments, the reagent composition further comprises: an RNase inhibitor, mineral oil, a buffer, dNTPs, one or more nucleic acid polymerases (e.g., DNA polymerases; e.g., DNA polymerases having strand displacement activity and/or high fidelity), reagents for recovering or purifying nucleic acids (e.g., magnetic beads), a well plate, or any combination thereof.

在某些实施方案中,所述试剂组合物还包含用于测序的试剂;例如用于二代测序的试剂。In certain embodiments, the reagent composition further comprises reagents for sequencing; for example, reagents for next-generation sequencing.

实施方案30.一种试剂盒,其包含:含有多个寡核苷酸分子的多反应体系,所述每个寡核苷酸分子含有特定的标签序列;Embodiment 30. A kit comprising: a multi-reaction system containing a plurality of oligonucleotide molecules, each of the oligonucleotide molecules containing a specific tag sequence;

并且,所述多反应体系中,每个反应体系中的寡核苷酸分子具有相同的标签序列,不同反应体系的寡核苷酸分子具有彼此不同的标签序列。Furthermore, in the multi-reaction system, the oligonucleotide molecules in each reaction system have the same tag sequence, and the oligonucleotide molecules in different reaction systems have different tag sequences.

在某些实施方案中,所述寡核苷酸分子还包含共有序列P1或其部分序列,或者,所述寡核苷酸分子还包含共有序列P2或其部分序列。In certain embodiments, the oligonucleotide molecule further comprises a consensus sequence P1 or a partial sequence thereof, or the oligonucleotide molecule further comprises a consensus sequence P2 or a partial sequence thereof.

在某些实施方案中,所述多反应体系包含至少2个(例如,至少3个,至少4个,至少5个,至少8个,至少10个,至少12个,至少20个,至少24个,至少50个,至少96个,至少100个,至少200个,至少384个,至少400个,2-5个,2-10个,2-50个,2-80个,2-100个,2-500个,2-10^3个,2-10^4个,2-10^5个,2-10^6个)含有寡核苷酸的多反应体系;In certain embodiments, the multi-reaction system comprises at least 2 (e.g., at least 3, at least 4, at least 5, at least 8, at least 10, at least 12, at least 20, at least 24, at least 50, at least 96, at least 100, at least 200, at least 384, at least 400, 2-5, 2-10, 2-50, 2-80, 2-100, 2-500, 2-10^3, 2-10^4, 2-10^5 or 2-10^6) reaction systems containing oligonucleotides;

其中多反应体系优选为多孔板,寡核苷酸可以游离或固定在反应体系中。The multi-reaction system is preferably a multi-well plate, and the oligonucleotides can be free or fixed in the reaction system.
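The tagging rule of Embodiment 30 (one tag per reaction system, different tags in different reaction systems) can be illustrated with a short Python sketch. The function name `assign_well_tags` and the choice to enumerate tags in alphabetical order are assumptions made for illustration; the embodiment does not prescribe how tag sequences are chosen.

```python
from itertools import product


def assign_well_tags(n_wells: int, tag_length: int) -> dict[int, str]:
    """Give each well (reaction system) its own tag sequence.

    All oligonucleotides in one well share that well's tag, and no two
    wells share a tag. Tags are enumerated over the alphabet A, C, G, T.
    """
    if n_wells > 4 ** tag_length:
        raise ValueError("tag_length is too short for the number of wells")
    tags = ("".join(bases) for bases in product("ACGT", repeat=tag_length))
    return dict(zip(range(n_wells), tags))
```

Calling `assign_well_tags(96, 4)` returns 96 distinct 4-nucleotide tags, one per well of a 96-well plate. A practical tag set would likely be filtered further (for example, for sequence balance or error tolerance), which this sketch does not attempt.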

实施方案31.一种对细胞进行固定和透化的方法,其包括以下步骤:Embodiment 31. A method for fixing and permeabilizing cells, comprising the following steps:

(i)在-40℃至-10℃(例如,-25℃至-15℃或-20℃)的条件下,使用浓度为60%-100%(例如,60%-80%、70%-80%、70%-85%、70%-90%、75%-80%、75%-85%、75%-90%、75%-100%、80%-85%、80%-90%、80%-100%或80%)的甲醇处理细胞5-30min(例如,8-20min或10min)对细胞进行固定和透化;(i) treating the cells with 60%-100% (e.g., 60%-80%, 70%-80%, 70%-85%, 70%-90%, 75%-80%, 75%-85%, 75%-90%, 75%-100%, 80%-85%, 80%-90%, 80%-100%, or 80%) methanol at -40°C to -10°C (e.g., -25°C to -15°C or -20°C) for 5-30 min (e.g., 8-20 min or 10 min) to fix and permeabilize the cells;

或者,or,

(ii)(a)在0℃至37℃(例如,15℃至30℃或25℃)的条件下,使用浓度为0.05%-5%(例如,0.5%-1%、0.5%-2%、0.5%-3%、0.5%-4%、0.5%-5%或1%)的甲醛或多聚甲醛处理细胞5-30min(例如,5-20min或10min)对细胞进行固定;和,(ii)(a) fixing the cells by treating the cells with formaldehyde or paraformaldehyde at a concentration of 0.05%-5% (e.g., 0.5%-1%, 0.5%-2%, 0.5%-3%, 0.5%-4%, 0.5%-5%, or 1%) at 0°C to 37°C (e.g., 15°C to 30°C or 25°C) for 5-30 minutes (e.g., 5-20 minutes or 10 minutes); and,

(b)使用浓度为0.05%-2%(例如,0.05%-0.2%、0.05%-0.25%、0.05%-0.3%、0.05%-0.5%、0.05%-0.8%、0.05%-1%、0.1%-0.2%、0.1%-0.25%、0.1%-0.3%、0.1%-0.4%、0.1%-0.5%、0.1%-0.8%、0.1%-1%、0.2%-0.25%、0.2%-0.3%、0.2%-0.4%、0.2%-0.5%、0.2%-0.8%、0.2%-1%或0.2%)的Triton X-100在-4℃至10℃(例如,0℃至4℃)的条件下处理细胞0.5-10min(例如,1-5min或3min)对细胞进行透化。(b) permeabilizing the cells by treating the cells with Triton X-100 at a concentration of 0.05%-2% (e.g., 0.05%-0.2%, 0.05%-0.25%, 0.05%-0.3%, 0.05%-0.5%, 0.05%-0.8%, 0.05%-1%, 0.1%-0.2%, 0.1%-0.25%, 0.1%-0.3%, 0.1%-0.4%, 0.1%-0.5%, 0.1%-0.8%, 0.1%-1%, 0.2%-0.25%, 0.2%-0.3%, 0.2%-0.4%, 0.2%-0.5%, 0.2%-0.8%, 0.2%-1%, or 0.2%) at -4°C to 10°C (e.g., 0°C to 4°C) for 0.5-10 min (e.g., 1-5 min or 3 min).

在某些实施方案中,所述细胞为天然存在的细胞或重组细胞,或两者的混合。In certain embodiments, the cell is a naturally occurring cell or a recombinant cell, or a mixture of both.
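The parameter windows of route (i) in Embodiment 31 (methanol fixation and permeabilization) can be written down as data and checked programmatically. The following Python sketch is only an illustration: the names `Range`, `METHANOL_ROUTE` and `out_of_range` are assumptions, and only the outer ranges recited in step (i) are encoded, not the narrower example ranges.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    low: float
    high: float

    def contains(self, value: float) -> bool:
        """Return True if value lies within the closed interval [low, high]."""
        return self.low <= value <= self.high


# Outer ranges recited for route (i): temperature in degrees Celsius,
# methanol concentration in percent, treatment time in minutes.
METHANOL_ROUTE = {
    "temperature_c": Range(-40, -10),
    "methanol_percent": Range(60, 100),
    "minutes": Range(5, 30),
}


def out_of_range(params: dict[str, float], limits: dict[str, Range]) -> list[str]:
    """List the parameter names that are missing or outside their range."""
    return [
        name
        for name, allowed in limits.items()
        if name not in params or not allowed.contains(params[name])
    ]
```

With the exemplified condition of -20°C, 80% methanol and 10 min, `out_of_range` returns an empty list; raising the temperature to 4°C makes it report `temperature_c`.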

实施方案32.一种对细胞核进行固定和透化的方法,其包括以下步骤:Embodiment 32. A method for fixing and permeabilizing a cell nucleus, comprising the following steps:

(i)对细胞核进行固定,所述固定选自:(i) fixing the cell nucleus, wherein the fixation is selected from:

(a)在0℃至30℃(例如,15℃至28℃或25℃)的条件下,使用浓度为0.05%-4%(例如,0.5%-1%、0.5%-2%、0.5%-3%、0.5%-4%或1%)的甲醛处理细胞核2-20min(例如,5-15min或10min)对细胞核进行固定;或者,(a) fixing the cell nuclei by treating the cell nuclei with formaldehyde at a concentration of 0.05%-4% (e.g., 0.5%-1%, 0.5%-2%, 0.5%-3%, 0.5%-4% or 1%) at 0°C to 30°C (e.g., 15°C to 28°C or 25°C) for 2-20 min (e.g., 5-15 min or 10 min); or,

(b)在0℃至30℃(例如,15℃至28℃或25℃)的条件下,使用浓度为0.05%-4%(例如,0.5%-2%、0.5%-3%、0.5%-4%或1.6%)的多聚甲醛处理细胞核1-15min(例如,1-10min或5min)对细胞核进行固定;(b) treating the cell nuclei with paraformaldehyde at a concentration of 0.05%-4% (e.g., 0.5%-2%, 0.5%-3%, 0.5%-4% or 1.6%) at 0°C to 30°C (e.g., 15°C to 28°C or 25°C) for 1-15 min (e.g., 1-10 min or 5 min) to fix the cell nuclei;

以及,and,

(ii)使用包含digitonin的透化液在-4℃至10℃(例如,0℃至4℃)的条件下处理细胞核0.5-10min(例如,1-5min或3min)对细胞核进行透化。(ii) permeabilizing the cell nuclei by treating the cell nuclei with a permeabilization solution containing digitonin at -4°C to 10°C (eg, 0°C to 4°C) for 0.5-10 min (eg, 1-5 min or 3 min).

在某些实施方案中,所述透化液进一步包含IGEPAL(例如,CA-630)和/或Tween-20。In certain embodiments, the permeabilization solution further comprises IGEPAL (e.g., CA-630) and/or Tween-20.

在某些实施方案中,所述透化液中,digitonin的浓度为0.0005%-0.05%(例如,0.0008%-0.005%、0.0005%-0.002%、0.0008%-0.002%或0.001%)。In certain embodiments, the concentration of digitonin in the permeabilization solution is 0.0005%-0.05% (eg, 0.0008%-0.005%, 0.0005%-0.002%, 0.0008%-0.002%, or 0.001%).

在某些实施方案中,所述透化液中,IGEPAL(例如,CA-630)的浓度为0.005%-0.1%(例如,0.005%-0.05%、0.008%-0.05%、0.005%-0.02%、0.008%-0.02%或0.01%)。In certain embodiments, the concentration of IGEPAL (e.g., CA-630) in the permeabilization solution is 0.005%-0.1% (e.g., 0.005%-0.05%, 0.008%-0.05%, 0.005%-0.02%, 0.008%-0.02%, or 0.01%).

在某些实施方案中,所述透化液中,Tween-20的浓度为0.005%-0.1%(例如,0.005%-0.05%、0.008%-0.05%、0.005%-0.02%、0.008%-0.02%或0.01%)。In certain embodiments, the concentration of Tween-20 in the permeabilization solution is 0.005%-0.1% (eg, 0.005%-0.05%, 0.008%-0.05%, 0.005%-0.02%, 0.008%-0.02%, or 0.01%).

在某些实施方案中,所述细胞核为源自天然存在的细胞的细胞核或源自重组细胞的细胞核,或两者的混合。In certain embodiments, the cell nucleus is a cell nucleus derived from a naturally occurring cell or a cell nucleus derived from a recombinant cell, or a mixture of both.

实施方案33.一种装置,其用于标记来自细胞或细胞核的核酸分子和/或构建核酸分子文库,所述装置包括:Embodiment 33. A device for labeling nucleic acid molecules from cells or cell nuclei and/or constructing a nucleic acid molecule library, the device comprising:

存储器;和Memory; and

耦接至所述存储器的处理器,所述处理器被配置为基于存储在所述存储器中的指令,执行实施方案1-13任一项所述的方法和/或实施方案14-25任一项所述的方法。A processor coupled to the memory, the processor being configured to execute the method described in any one of embodiments 1-13 and/or the method described in any one of embodiments 14-25 based on instructions stored in the memory.

实施方案34.一种计算机可读存储介质,其上存储有计算机程序,其特征在于,该程序被处理器执行时实现实施方案1-13任一项的方法和/或实施方案14-25任一项的方法。Embodiment 34. A computer-readable storage medium having a computer program stored thereon, characterized in that when the program is executed by a processor, the method of any one of embodiments 1-13 and/or the method of any one of embodiments 14-25 is implemented.

实施方案35.实施方案1-13任一项的方法、实施方案28或29的试剂盒、实施方案30的试剂组合物、实施方案31或32的方法、实施方案33的装置或实施方案34的计算机可读存储介质用于构建核酸分子文库或用于进行转录组测序的用途;或者,实施方案14-25任一项的方法用于进行转录组测序的用途。Embodiment 35. Use of the method of any one of embodiments 1-13, the kit of embodiment 28 or 29, the reagent composition of embodiment 30, the method of embodiment 31 or 32, the device of embodiment 33 or the computer-readable storage medium of embodiment 34 for constructing a nucleic acid molecule library or for performing transcriptome sequencing; or, use of the method of any one of embodiments 14-25 for performing transcriptome sequencing.

术语定义Definition of terms

在本发明中,除非另有说明,否则本文中使用的科学和技术名词具有本领域技术人员所通常理解的含义。并且,本文中所用的病毒学、生物化学、免疫学实验室操作步骤均为相应领域内广泛使用的常规步骤。同时,为了更好地理解本发明,下面提供相关术语的定义和解释。In the present invention, unless otherwise specified, the scientific and technical terms used herein have the meanings commonly understood by those skilled in the art. In addition, the virology, biochemistry, and immunology laboratory operation steps used herein are conventional steps widely used in the corresponding fields. At the same time, in order to better understand the present invention, the definitions and explanations of the relevant terms are provided below.

当本文使用术语“例如”、“如”、“诸如”、“包括”、“包含”或其变体时,这些术语将不被认为是限制性术语,而将被解释为表示“但不限于”或“不限于”。When the terms "for example," "e.g.," "such as," "including," "comprising," or variations thereof are used herein, these terms are not to be regarded as limiting terms, but are to be interpreted as meaning "but not limited to" or "not limited to."

除非本文另外指明或根据上下文明显矛盾,否则术语“一个”和“一种”以及“该”和类似指称物在描述本发明的上下文中(尤其在以下权利要求的上下文中)应被解释成覆盖单数和复数。The terms "a" and "an" and "the" and similar referents in the context of describing the invention (especially in the context of the following claims) should be construed to cover both the singular and the plural, unless otherwise indicated herein or clearly contradicted by context.

如本文所用,术语“假单细胞”是指,在分析单细胞的转录组学实验中,一个微反应体系(例如,一个油包水液滴或者一个微孔)中含有两个或者更多个细胞的情形。在“假单细胞”的情况下,同一个微反应体系(例如,同一个液滴或微孔)中的两个或者更多个细胞将被标记上相同的细胞特异性标签。这导致,仅利用微反应体系引入的细胞特异性标签,是无法对微反应体系中的各个细胞进行“一对一”的标识作用。相应地,由“假单细胞”微反应体系所产生的测序数据,由于其含有来源于两个或多个细胞的测序结果,不能用于分析单细胞的转录组信息。因此,在传统的高通量单细胞转录组测序方法中,需要从最终产生的测序数据中过滤或移除由“假单细胞”微反应体系所产生的测序数据;并且,为了避免测序数据的大量浪费,需要尽可能降低或控制“假单细胞”微反应体系的数量或比率。如本文所用,术语“假单细胞率”是指,“假单细胞”微反应体系(数量)占所有包含细胞的微反应体系(数量)的比率。As used herein, the term "pseudo-single-cell" refers to the situation, in a transcriptomics experiment analyzing single cells, in which one micro-reaction system (e.g., one water-in-oil droplet or one microwell) contains two or more cells. In the "pseudo-single-cell" case, the two or more cells in the same micro-reaction system (e.g., the same droplet or microwell) are labeled with the same cell-specific tag. As a result, the cell-specific tag introduced by the micro-reaction system cannot, on its own, identify the individual cells in that micro-reaction system one-to-one. Accordingly, the sequencing data generated by a "pseudo-single-cell" micro-reaction system contain sequencing results from two or more cells and cannot be used to analyze the transcriptome information of a single cell. Therefore, in traditional high-throughput single-cell transcriptome sequencing methods, the sequencing data generated by "pseudo-single-cell" micro-reaction systems must be filtered out of or removed from the final sequencing data; and, to avoid wasting a large amount of sequencing data, the number or ratio of "pseudo-single-cell" micro-reaction systems must be reduced or controlled as far as possible. As used herein, the term "pseudo-single-cell rate" refers to the ratio of the number of "pseudo-single-cell" micro-reaction systems to the number of all micro-reaction systems that contain cells.
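The rate defined above (假单细胞率) is the number of micro-reaction systems holding two or more cells divided by the number of micro-reaction systems holding at least one cell. A minimal Python sketch of that arithmetic follows; the function name and the input format (a list of per-partition cell counts) are assumptions made for illustration.

```python
def pseudo_single_cell_rate(cells_per_partition: list[int]) -> float:
    """Fraction of cell-containing partitions that hold two or more cells.

    `cells_per_partition` gives the number of cells in each micro-reaction
    system (droplet or microwell); empty partitions are ignored.
    """
    occupied = [count for count in cells_per_partition if count >= 1]
    if not occupied:
        raise ValueError("no partition contains a cell")
    multi_cell = sum(1 for count in occupied if count >= 2)
    return multi_cell / len(occupied)
```

For counts `[0, 1, 1, 2, 0, 3, 1, 1]`, six partitions contain cells and two of them contain more than one cell, so the rate is 2/6.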

如本文所用,“细胞通量”是指对于给定的单细胞建库技术方案,单次建库反应能够同时进行标记的细胞数量。As used herein, "cell throughput" refers to the number of cells that can be simultaneously labeled in a single library construction reaction for a given single-cell library construction technology protocol.

如本文所用,“样品通量”是指对于给定单细胞建库技术方案,单次建库反应能够同时进行标记的样品数量。As used herein, "sample throughput" refers to the number of samples that can be simultaneously labeled in a single library construction reaction for a given single-cell library construction technology protocol.

如本文所用,可用于本发明方法的细胞或细胞核可以是任何感兴趣的细胞或其细胞核,例如,癌细胞、干细胞、神经细胞、胎儿细胞和参与免疫应答的免疫细胞或其细胞核。所述细胞/细胞核可以是相同类型的细胞/细胞核混合,也可以是完全异质的不同类型细胞/细胞核混合。不同的细胞/细胞核类型可包括个体的不同组织细胞/细胞核或不同个体的相同组织细胞/细胞核或者来源于不同属、种、菌株、变体或任何或所有前述的任何组合的微生物的细胞/细胞核。例如,不同的细胞/细胞核类型可包括个体的正常细胞/细胞核和癌细胞/细胞核;获自人类受试者的各种细胞/细胞核类型,例如多种免疫细胞/细胞核;来自环境、法医、微生物组或其他样品的多种不同的细菌物种、菌株和/或变体;或细胞/细胞核类型的任何其他各种混合物。As used herein, the cells or cell nuclei that can be used in the methods of the present invention can be any cells of interest or their nuclei, for example, cancer cells, stem cells, nerve cells, fetal cells, and immune cells involved in immune responses, or the nuclei thereof. The cells/nuclei can be a mixture of cells/nuclei of the same type, or a fully heterogeneous mixture of cells/nuclei of different types. Different cell/nucleus types may include cells/nuclei from different tissues of one individual, cells/nuclei from the same tissue of different individuals, or cells/nuclei of microorganisms derived from different genera, species, strains or variants, or any combination of any or all of the foregoing. For example, different cell/nucleus types may include normal cells/nuclei and cancer cells/nuclei of an individual; various cell/nucleus types obtained from a human subject, such as multiple types of immune cells/nuclei; multiple different bacterial species, strains and/or variants from environmental, forensic, microbiome or other samples; or any other mixture of cell/nucleus types.

如本文所用,“核酸分子文库”表示从靶核酸分子产生的经标记的核酸片段的集合或群体,其中,在该集合或群体中经标记的核酸片段的组合显示在性质上和/或数量上代表从中产生经标记的核酸片段的靶核酸分子的序列的序列。As used herein, a "library of nucleic acid molecules" refers to a collection or population of labeled nucleic acid fragments generated from a target nucleic acid molecule, wherein the combination of labeled nucleic acid fragments in the collection or population exhibits a sequence that qualitatively and/or quantitatively represents the sequence of the target nucleic acid molecule from which the labeled nucleic acid fragment was generated.

如本文所用,“离散分区”是指包含目的物质的相互之间独立的空间单元,例如微滴或孔。通常而言,每个离散分区可以保持其自己的内容物与其它离散分区的内容物的分离。在一些实施方式中,所述离散分区中还可以包含其他根据不同的需求而分配的其他物质,例如染料、乳化剂、表面活性剂、稳定剂、聚合物、适体、还原剂、引发剂、生物素标记物、荧光团、缓冲液、酸性溶液、碱性溶液、光敏感的酶、pH敏感的酶、水性缓冲液、去污剂、离子型去污剂、非离子型去污剂等等。As used herein, "discrete partitions" refers to mutually independent spatial units that contain the substance of interest, such as droplets or wells. In general, each discrete partition keeps its own contents separate from the contents of the other discrete partitions. In some embodiments, the discrete partitions may also contain other substances dispensed according to different needs, such as dyes, emulsifiers, surfactants, stabilizers, polymers, aptamers, reducing agents, initiators, biotin labels, fluorophores, buffers, acidic solutions, alkaline solutions, light-sensitive enzymes, pH-sensitive enzymes, aqueous buffers, detergents, ionic detergents, non-ionic detergents, and the like.

如本文所用,“cDNA”、“cDNA链”或“cDNA分子”是指使用感兴趣的RNA分子的至少一部分作为模板,通过RNA依赖性DNA聚合酶或反转录酶催化的与该感兴趣的RNA分子退火的引物的延伸而合成的“互补的DNA”(该过程也称为“反转录”)。所合成的cDNA分子与该模板的至少一部分“同源”或“互补”或“碱基配对”或“形成复合物”。As used herein, "cDNA", "cDNA chain" or "cDNA molecule" refers to "complementary DNA" synthesized by extension of a primer annealed to the RNA molecule of interest catalyzed by RNA-dependent DNA polymerase or reverse transcriptase using at least a portion of the RNA molecule of interest as a template (this process is also called "reverse transcription"). The synthesized cDNA molecule is "homologous" or "complementary" or "base paired" or "forms a complex" with at least a portion of the template.

如本文所用,“转座酶”表示如下的酶:该酶能够与包含转座子末端的组合物(例如,转座子、转座子末端、转座子末端组合物)形成功能复合物并催化该包含转座子末端的组合物插入或转座进入在转座反应(例如,体外转座反应)中与该酶孵育的双链核酸分子(例如DNA双链、RNA/cDNA杂合双链)中。非限制性转座酶的实例包括Tn5转座酶、MuA转座酶、睡美人转座酶、Mariner转座酶、Tn7转座酶、Tn10转座酶、Ty1转座酶、Tn552转座酶,以及具有上述转座酶的转座活性(例如,具有更高转座活性)的变体、修饰产物和衍生物。As used herein, "transposase" refers to an enzyme that is capable of forming a functional complex with a composition comprising a transposon end (e.g., a transposon, a transposon end, a transposon end composition) and catalyzing the insertion or transposition of the composition comprising a transposon end into a double-stranded nucleic acid molecule (e.g., a DNA double strand, an RNA/cDNA hybrid double strand) incubated with the enzyme in a transposition reaction (e.g., an in vitro transposition reaction). Non-limiting examples of transposases include Tn5 transposase, MuA transposase, Sleeping Beauty transposase, Mariner transposase, Tn7 transposase, Tn10 transposase, Ty1 transposase, Tn552 transposase, and variants, modified products, and derivatives having the transposition activity (e.g., having higher transposition activity) of the above transposases.

出于多种原因,本发明的核酸或多核苷酸(例如第一寡核苷酸分子、第二寡核苷酸分子、引物A、引物B、引物C、引物E、引物F、引物D,引物E’、引物F’、引物G、引物H、转座酶复合体中的转移链、非转移链)可包括一种或多种修饰的核酸碱基、糖部分或核苷间连接。例如,使用包含修饰的碱基、糖部分或核苷间连接的核酸或多核苷酸的一些原因包括但不限于:(1)Tm的改变;(2)改变多核苷酸对一种或多种核酸酶的易感性;(3)提供用于连接标记的部分;(4)提供标记或标记猝灭剂;或(5)提供用于连接溶液中或结合于表面的另一种分子的部分,诸如生物素。例如,在一些实施方案中,可将本发明的核酸或多核苷酸(例如第一寡核苷酸分子、第二寡核苷酸分子、引物A、引物B、引物C、引物E、引物F、引物D,引物E’、引物F’、引物G、引物H、转座酶复合体中的转移链、非转移链)合成为使得随机部分包含一种或多种构象受限制的核酸类似物,诸如但不限于其中的核糖环被连接2’-O原子与4’-C原子的亚甲基桥“锁定”的一种或多种核糖核酸类似物;在一些实施方案中,可将寡核苷酸3’末端进行双脱氧处理,使得所述3’末端无法延伸;在一些实施方案中,可将本发明的核酸或多核苷酸(例如第一寡核苷酸分子、第二寡核苷酸分子、引物A、引物B、引物C、引物E、引物F、引物D,引物E’、引物F’、引物G、引物H、转座酶复合体中的转移链、非转移链)5’末端进行磷酸化处理,使得所述5’末端在核酸连接酶作用下可以与另外的寡核苷酸3’末端连接。在本发明的方法中,例如,在多核苷酸或寡核苷酸(例如第一寡核苷酸分子、第二寡核苷酸分子、引物A、引物B、引物C、引物E、引物F、引物D,引物E’、引物F’、引物G、引物H、转座酶复合体中的转移链、非转移链)中的一个或多个位置的单核苷酸中的核酸碱基可包括鸟嘌呤、腺嘌呤、尿嘧啶、胸腺嘧啶或胞嘧啶,或者可选地,所述核酸碱基中的一种或多种可包含修饰的碱基,诸如但不限于黄嘌呤、烯丙氨基(allylamino)-尿嘧啶、烯丙氨基-胸腺嘧啶核苷、次黄嘌呤、2-氨基腺嘌呤、5-丙炔基尿嘧啶、5-丙炔基胞嘧啶、4-硫尿嘧啶、6-硫鸟嘌呤、氮尿嘧啶和脱氮尿嘧啶、胸腺嘧啶核苷、胞嘧啶、腺嘌呤或鸟嘌呤。此外,它们可包含用如下部分衍生的核酸碱基:生物素部分、洋地黄毒苷部分、荧光部分或化学发光部分、猝灭部分或某种其他部分。就本发明的核酸或多核苷酸(例如第一寡核苷酸分子、第二寡核苷酸分子、引物A、引物B、引物C、引物E、引物F、引物D,引物E’、引物F’、引物G、引物H、转座酶复合体中的转移链、非转移链)来说,糖部分中的一个或多个可包括2’-脱氧核糖,或者可选地,糖部分中的一个或多个可包括某种其他糖部分,诸如但不限于:提供对一些核酸酶的抵抗力的核糖或2’-氟代-2’-脱氧核糖或2’-O-甲基-核糖,或可通过与可见的、荧光的、红外荧光的或其他可检测的染料或具有亲电子的、光反应性的、炔基或其他反应性化学部分的化学物质进行反应而标记的2’-氨基-2’-脱氧核糖或2’-叠氮基-2’-脱氧核糖。本发明的核酸或多核苷酸的核苷间连接可以是磷酸二酯键连接,或者可选地,核苷间连接中的一种或多种可包括修饰的连接,诸如但不限于:硫代磷酸酯、二硫代磷酸酯、硒代磷酸酯(phosphoroselenate)、或二硒代磷酸酯(phosphorodiselenate)连接,它们对一些核酸酶具有抵抗力。For a variety of reasons, the nucleic acids or polynucleotides of the present invention (e.g., the first oligonucleotide molecule, the second oligonucleotide molecule, primer A, primer B, primer C, primer E, primer F, primer D, primer E', primer F', primer G, primer H, and the transferred strand and non-transferred strand in the transposase complex) may include one or more modified nucleic acid bases, sugar moieties or internucleoside linkages. For example, some reasons for using a nucleic acid or polynucleotide that comprises modified bases, sugar moieties or internucleoside linkages include, but are not limited to: (1) altering the Tm; (2) altering the susceptibility of the polynucleotide to one or more nucleases; (3) providing a moiety for attachment of a label; (4) providing a label or a label quencher; or (5) providing a moiety, such as biotin, for attachment to another molecule that is in solution or bound to a surface. For example, in some embodiments, the nucleic acids or polynucleotides of the present invention (e.g., those listed above) can be synthesized such that the random portion comprises one or more conformationally restricted nucleic acid analogs, such as, but not limited to, one or more ribonucleic acid analogs in which the ribose ring is "locked" by a methylene bridge connecting the 2'-O atom and the 4'-C atom; in some embodiments, the 3' end of the oligonucleotide can be dideoxy-modified so that the 3' end cannot be extended; in some embodiments, the 5' end of the nucleic acids or polynucleotides of the present invention (e.g., those listed above) can be phosphorylated so that the 5' end can be joined to the 3' end of another oligonucleotide by a nucleic acid ligase. In the methods of the present invention, for example, the nucleic acid base in the mononucleotide at one or more positions of the polynucleotide or oligonucleotide (e.g., those listed above) may include guanine, adenine, uracil, thymine or cytosine, or alternatively, one or more of the nucleic acid bases may include a modified base, such as, but not limited to, xanthine, allylamino-uracil, allylamino-thymidine, hypoxanthine, 2-aminoadenine, 5-propynyluracil, 5-propynylcytosine, 4-thiouracil, 6-thioguanine, azauracil and deazauracil, thymidine, cytosine, adenine or guanine. In addition, they may comprise nucleic acid bases derivatized with a biotin moiety, a digoxigenin moiety, a fluorescent or chemiluminescent moiety, a quenching moiety, or some other moiety. With respect to the nucleic acids or polynucleotides of the present invention (e.g., those listed above), one or more of the sugar moieties may include 2'-deoxyribose, or alternatively, one or more of the sugar moieties may include some other sugar moiety, such as, but not limited to: ribose, 2'-fluoro-2'-deoxyribose or 2'-O-methyl-ribose, which provide resistance to some nucleases; or 2'-amino-2'-deoxyribose or 2'-azido-2'-deoxyribose, which can be labeled by reaction with a visible, fluorescent, infrared-fluorescent or other detectable dye, or with a chemical having an electrophilic, photoreactive, alkynyl or other reactive chemical moiety. The internucleoside linkages of the nucleic acids or polynucleotides of the present invention can be phosphodiester linkages, or alternatively, one or more of the internucleoside linkages can include modified linkages, such as, but not limited to, phosphorothioate, phosphorodithioate, phosphoroselenate or phosphorodiselenate linkages, which are resistant to some nucleases.

In the methods of the present application, the first tag sequence, the second tag sequence, the third tag sequence, the fourth tag sequence, the unique molecular tag sequence and any other tag sequence are not limited in composition or length, provided that they can serve an identifying function. In some embodiments, the first tag sequence has a length of at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 15, at least 20, at least 25, 3-8, 3-15, 3-25 or 3-50 nucleotides. In some embodiments, the second tag sequence has a length of at least 2, at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, 3-8, 3-15, 3-25 or 3-50 nucleotides. For example, the length of the first tag sequence is 4-8 nucleotides. In some embodiments, the third tag sequence has a length of at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, 3-8, 3-15, 3-25 or 3-50 nucleotides. In some embodiments, the fourth tag sequence has a length of at least 3, at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, 3-8, 3-15, 3-25 or 3-50 nucleotides. In some embodiments, the unique molecular tag sequence has a length of at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 15, at least 20, at least 25, 5-8, 5-15, 5-25 or 5-50 nucleotides.
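As an illustration of how tag length relates to identifying capacity, the following minimal sketch counts the distinct sequences available for a tag of a given length, assuming each position is drawn freely from A, C, G and T. The function name `tag_space` is hypothetical and is not part of the application; real tag sets are usually smaller than this upper bound because similar sequences are excluded to tolerate sequencing errors.

```python
def tag_space(length: int) -> int:
    """Upper bound on distinct tags of the given length over the alphabet A, C, G, T."""
    if length < 1:
        raise ValueError("length must be a positive integer")
    return 4 ** length


# Example: the 4-8 nucleotide range mentioned for the first tag sequence.
for n in (4, 6, 8):
    print(f"{n} nt -> {tag_space(n)} possible sequences")
```

Under this assumption, even a 4-nucleotide tag offers more possible sequences than the 96 wells of a standard plate.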

Likewise, in the methods of the present application, consensus sequence R1, consensus sequence R2, consensus sequence O, consensus sequence P1, consensus sequence P2, primer A, primer B, primer C, primer E, primer F, primer D, primer E', primer F', primer G, primer H, the transferred strand and the non-transferred strand in the transposase complex, and the like are not limited in composition or length. Those skilled in the art can reasonably adjust the length and/or composition of these sequences for a variety of reasons, which are not elaborated here.

It is readily understood by those skilled in the art that, so that the methods of the present application can be applied directly to the most widely used existing single-cell omics platform, the 10X Genomics platform, in certain embodiments the consensus sequence R1 is identical to the Read1 sequence of the 10X Genomics platform, a partial sequence thereof, or a complementary sequence thereof. In certain embodiments, the consensus sequence R2 is identical to the Read2 sequence of the 10X Genomics platform, a partial sequence thereof, or a complementary sequence thereof. In certain embodiments, the consensus sequence O is identical to the TSO sequence of the 10X Genomics platform, a partial sequence thereof, or a complementary sequence thereof. In certain embodiments, the consensus sequence P1 is identical to the P5 sequence of the 10X Genomics platform, a partial sequence thereof, or a complementary sequence thereof. In certain embodiments, the consensus sequence P2 is identical to the P7 sequence of the 10X Genomics platform, a partial sequence thereof, or a complementary sequence thereof.

As used herein, the term "bead" generally refers to a particle. A bead may be porous, non-porous, solid, semi-solid, semi-fluid or fluid. A bead may be magnetic or non-magnetic. In some embodiments, a bead may be dissolvable, rupturable or degradable. In some cases, a bead may be non-degradable. In some embodiments, a bead may be a gel bead. A gel bead may be a hydrogel bead. A gel bead may be formed from molecular precursors, such as polymeric or monomeric species. A semi-solid bead may be a liposomal bead.

Advantageous Effects of the Invention

The present invention can simultaneously satisfy the following five points:

1. A greatly reduced empty rate of the microreaction systems, which allows existing microreaction-based single-cell omics library construction systems to increase cell throughput per reaction by 10- to 100-fold, reaching 100,000 to 1,000,000 cells;

2. A low pseudo-single-cell rate. In addition to the barcode labeling of cells in the microreaction, an additional cell barcode is introduced. Therefore, even if multiple cells enter the same microreaction system, they can be distinguished, so that sequencing data from what would traditionally be "pseudo-single cells" also become usable;

3. Compatibility with library construction starting from either whole cells or cell nuclei;

4. The scheme is applicable to a variety of single-cell omics library construction technologies on existing mature platforms that label cells with barcodes in microreactions (for example, water-in-oil droplets or nanowells), including single-cell 3' RNA-seq, single-cell 5' RNA-seq, single-cell VDJ-seq, and single-cell RNA plus ATAC-seq multi-omics; it is also applicable to other technical variants based on the 10X Genomics platform that extend modalities or improve quality, such as single-cell CUT&Tag;

5. The data quality obtained with this scheme is close to that obtained with the standard workflow of the commercial 10x Genomics platform, as measured by indicators including, but not limited to, the number of genes detected, the VDJ capture rate and the detected ATAC-seq signal.

Embodiments of the present invention are described in detail below with reference to the accompanying drawings and examples. Those skilled in the art will understand that the following drawings and examples serve only to illustrate the present invention and do not limit its scope. The various objects and advantageous aspects of the present invention will become apparent to those skilled in the art from the accompanying drawings and the following detailed description of the preferred embodiments.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1: Principle of an exemplary existing 10X Genomics single-cell 3' RNA-seq library construction workflow and the resulting library structure. Specifically, a single-cell suspension, the reverse transcription reaction mix and 10X Genomics microbeads carrying cell barcodes are partitioned on the 10X Genomics platform into water-in-oil microdroplets, each containing one cell and one bead. The microdroplets are collected and subjected to reverse transcription and template switching in a thermal cycler, whereby the cell barcode on the bead is incorporated into the cDNA products of the cell. The final library is obtained after subsequent cDNA amplification, enzymatic fragmentation, adapter ligation and PCR amplification. P5 and P7 at the two outermost ends of the library are Illumina sequencing adapter sequences; the sample index on the right is the tag sequence of the library, used to distinguish the sample origin of the sequencing data; Read1 and Read2 at the two ends of the library are the two primers for paired-end sequencing; the 10X Barcode at the left end of the library is the cell barcode, which distinguishes different single cells; the UMI (unique molecular identifier) is a single-molecule barcode used to label different mRNA molecules within the same cell; Poly(dT) is a poly-thymine oligonucleotide introduced during reverse transcription of mRNA; the thin line in the middle represents the transcriptome sequence.

Figure 2: Schematic diagram of the principle of an exemplary workflow of the present invention. Specifically, the present invention combines microreaction-based cell barcode labeling (for example, the water-in-oil microdroplets shown in Figure 2) with subsequent combinatorial labeling. After the first round of barcode labeling is completed in situ in the cells, the cells are pooled, mixed and redistributed, and a second round of tags (96 to 384 kinds) is introduced onto the nucleic acid molecules of the cells, enabling a variety of new single-cell omics library construction schemes.

Figure 3: Schematic diagram of exemplary single-cell transcriptome and VDJ library construction and library structure according to the present invention. Specifically, intact fixed cells or cell nuclei are partitioned into water-in-oil microdroplets together with 10X GENOMICS RNA microbeads carrying cell tags, and the first round of tags is added to the target mRNA in situ in the cells by reverse transcription (3' RNA) or template switching (5' RNA). The cells carrying the first round of tags are then released from the microdroplets, thoroughly mixed and dispensed into 96- or 384-well plates. A different sequencing primer carrying a specific tag sequence is added to each well, so that PCR amplification adds a different second-round tag to the cells in each well. The amplified products are then pooled for purification and further library construction, finally yielding the corresponding single-cell transcriptome library. The amplification products of single-cell 5' RNA can additionally be enriched for VDJ sequences to obtain a VDJ library. The final library structure is as follows: P5 and P7 at the two outermost ends are Illumina sequencing adapter sequences; i7 on the right is the tag sequence of the library, used to distinguish different sequencing libraries; Read1 and Read2 at the two ends of the library are the two primers for paired-end sequencing; the 10X Barcode at the left end of the library is the cell barcode, which is the first-round cell tag; the part labeled "round 2" is the second-round cell tag introduced by index PCR; the UMI (unique molecular identifier) is a single-molecule barcode used to label different mRNA molecules within the same cell; TSO is the template switching sequence; the unlabeled part in the middle is the transcriptome sequence.

Figure 4: Schematic diagram of exemplary single-cell multi-omics library construction and library structure according to the present invention. Specifically, intact fixed cells or cell nuclei are partitioned into water-in-oil microdroplets together with 10X GENOMICS RNA microbeads carrying cell tags, and the first round of tags is added in situ in the cells to the target mRNA by reverse transcription (3' RNA) and to the gDNA of open chromatin regions by a ligation reaction. The cells carrying the first round of tags are then released from the microdroplets, thoroughly mixed and dispensed into 96- or 384-well plates. A different sequencing primer carrying a specific tag sequence is added to each well, so that PCR amplification adds a different second-round tag to the cells in each well. The amplified products are then pooled for purification and further library construction. Finally, the two kinds of products are enriched separately with primers specific to cDNA and to gDNA, and each is carried through further library construction to obtain the corresponding single-cell transcriptome sequencing library and ATAC-seq sequencing library.

Figure 5: Results of sequencing a mixed sample of human and mouse cell lines using the single-cell transcriptome method of the present invention.

Figure 6: Results of sequencing human peripheral blood mononuclear cell samples fixed under different conditions using the transcriptome method of the present invention.

Figure 7: Results of sequencing cryopreserved human peripheral blood mononuclear cell samples using the single-cell 5' RNA-seq method of the present invention.

Figure 8: Results of sequencing human peripheral blood mononuclear cell samples using the single-cell VDJ-seq method of the present invention.

Figure 9: Results of sequencing cryopreserved human kidney samples using the single-cell transcriptome plus ATAC multi-omics method of the present invention.

Sequence information

A description of the sequences involved in this application is provided in the table below.

Table 1: Sequence information

Note: "-s-" indicates a phosphorothioate modification; each N is independently selected from A, T, C and G; "5Phos" indicates a phosphorylation modification; "3ddC" indicates 2',3'-dideoxycytidine-5'-monophosphate.

In existing microreaction-based cell barcoding platforms, taking the Chromium platform of 10X Genomics as an example, not all microreaction systems (GEMs, water-in-oil droplets) contain a single cell as intended. Some GEMs contain two or more cells, a situation referred to as a "pseudo-single cell". In a "pseudo-single cell" GEM, the two or more cells in the same GEM are labeled with the same barcode, so the GEM barcode alone cannot identify those cells on a one-to-one basis. Accordingly, the sequencing data generated from a "pseudo-single cell" GEM contain results from two or more cells and cannot be used to analyze the transcriptome of a single cell. Such data must therefore be filtered out of the final sequencing data. In addition, to avoid wasting a large amount of sequencing data, the number or proportion of "pseudo-single cell" GEMs must be kept as low as possible, which greatly limits library construction throughput.

Building on barcode labeling of cells in microreactions (for example, water-in-oil droplets or nanowells), the present invention recovers from the microreaction systems the cells or cell nuclei in which the first round of cell barcodes has been loaded in situ, mixes them thoroughly and divides them into a number of equal aliquots, then introduces a second round of labels onto the nucleic acid molecules of the cells or cell nuclei by index PCR in small-volume reactions, and finally uses the two rounds of tag information together to define a cell. The nucleic acid molecule library constructed by the method of the present invention therefore carries two rounds of cell tags, which makes it possible to split the sequencing data generated from "pseudo-single cells" and thus to trace and determine accurately the cell of origin of the sequencing data. For example, two or more cells in a "pseudo-single cell" microreaction system all carry the same first-round cell barcode, but each carries a different second-round cell barcode, so the sequencing data generated from each of those cells can be distinguished by the second-round barcode, and even sequencing data generated from "pseudo-single cells" can be used. This greatly reduces the negative impact of "pseudo-single cells" arising during library construction; for the same reason, it also greatly reduces the empty rate of the microreaction systems during library construction, significantly reducing reagent waste and cost.
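The idea of defining a cell by both rounds of tags can be sketched as a simple grouping step. The code below is an illustrative sketch only, not the applicant's analysis pipeline: the read tuples, barcode strings and well tags are made-up placeholders, and the function names are hypothetical. It also computes the chance that two cells sharing a droplet receive the same second-round tag, assuming the cells are redistributed uniformly and independently across the wells.

```python
from collections import defaultdict


def group_reads_by_cell(reads):
    """Group read IDs by the composite key (round-1 barcode, round-2 tag)."""
    cells = defaultdict(list)
    for read_id, round1_barcode, round2_tag in reads:
        cells[(round1_barcode, round2_tag)].append(read_id)
    return dict(cells)


def same_well_probability(n_wells: int) -> float:
    """Probability that two cells from one droplet land in the same well,
    assuming uniform, independent redistribution (an assumption of this sketch)."""
    return 1.0 / n_wells


# Placeholder reads: (read ID, first-round barcode, second-round tag).
reads = [
    ("read1", "AAAC", "W01"),
    ("read2", "AAAC", "W02"),  # same droplet barcode, different well: a different cell
    ("read3", "AAAC", "W01"),
    ("read4", "GGTT", "W02"),
]
cells = group_reads_by_cell(reads)
print(len(cells), "cells resolved")
print(f"collision chance with 96 wells: {same_well_probability(96):.4f}")
```

In this toy input, the droplet barcode "AAAC" alone would merge two cells into one; adding the second-round tag separates them.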

In addition, the applicant wishes to emphasize that methods based on pre-labeling combined with microfluidic droplet high-throughput library construction can, compared with conventional microfluidic droplet high-throughput library construction, increase throughput and reduce the empty rate of the microreaction systems and the pseudo-single-cell rate. However, such methods cannot be applied to, or used to improve, the multiple existing single-cell omics library construction technologies at the same time (single-cell 3' RNA-seq, single-cell 5' RNA-seq, single-cell VDJ-seq, single-cell RNA plus ATAC-seq multi-omics, and so on). Moreover, such methods are complicated to operate, involve many steps, and yield data of lower quality than standard microfluidic droplet high-throughput library construction.

Therefore, existing high-throughput single-cell omics library construction technologies cannot simultaneously achieve high data quality, high cell throughput, simple operation, low cost and a low pseudo-single-cell rate. Moreover, no method currently exists that can be applied to, or used to improve, the multiple existing single-cell omics library construction technologies at the same time.

DETAILED DESCRIPTION

The invention will now be described with reference to the following examples, which are intended to illustrate the invention rather than to limit it.

Unless otherwise specified, the molecular biology experimental methods used in the present invention follow the methods described in J. Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Edition, Cold Spring Harbor Laboratory Press, 1989, and F. M. Ausubel et al., Short Protocols in Molecular Biology, 3rd Edition, John Wiley & Sons, Inc., 1995. Those skilled in the art will understand that the examples describe the present invention by way of illustration and are not intended to limit the scope of protection claimed.

The reagents used in the examples of this application are as follows:

Note: All reagents shown in the table are commercially available.

Unless otherwise specified, the reagents used in the examples of the present application have the meanings commonly understood by those skilled in the art. In addition, unless otherwise specified or clearly contradicted by context, the reagents used in the examples of the present application can be purchased commercially or prepared according to formulations widely used in the relevant field.

Example 1: Fixation and permeabilization of single-cell suspensions

Depending on experimental needs, libraries can be constructed from intact single cells derived from fresh tissue, fresh cell lines, fresh blood samples, primary cells or cryopreserved cell samples. Before library construction, the cells must be fixed and permeabilized. In this example, the HeLa cell line, the NIH3T3 cell line (both purchased from the Cell Bank of the Chinese Academy of Sciences) and peripheral blood mononuclear cells (PBMCs) were used.

The fixation and permeabilization steps are as follows:

Method 1: Fix and permeabilize in 80% methanol at -20°C for 10 min.

Method 2: Fix in 1% formaldehyde at room temperature for 10 min, centrifuge and discard the supernatant, resuspend the cells in 0.2% Triton X-100 and permeabilize on ice for 3 min.

Method 3: Fix in 1% paraformaldehyde at room temperature for 10 min, centrifuge and discard the supernatant, resuspend the cells in 0.2% Triton X-100 and permeabilize on ice for 3 min.

After fixation, centrifuge at 500 g for 5 min at 4°C and discard the supernatant.

Example 2: Fixation and permeabilization of cell nuclei

Depending on experimental needs, libraries can be constructed from cell nuclei derived from fresh tissue, frozen tissue, cell lines, blood samples, primary cells or cryopreserved cell samples (nuclei are extracted following conventional procedures widely used in the field). Before library construction, the nuclei must be fixed and permeabilized.

The fixation and permeabilization steps are as follows:

Method 1: Fix in 1% formaldehyde at room temperature for 10 min.

Method 2: Fix in 1.6% paraformaldehyde at room temperature for 5 min.

After fixation, centrifuge at 600 g at 4°C and discard the supernatant. Resuspend the pellet in permeabilization solution (10 mM Tris-HCl, pH 7.4, 10 mM NaCl, 3 mM MgCl2, 0.01% Tween-20, 0.01% IGEPAL CA-630, 1% BSA, 0.001% digitonin, 1% SUPERase-In RNase Inhibitor), permeabilize on ice for 3 min, then centrifuge and discard the supernatant.

Example 3: Preparation of the single-end (i7-end) TN5 transposase complex

This single-end (i7-end) TN5 transposase complex is used in the single-cell transcriptome library construction process of the present invention.

1. Transposon preparation: Dissolve the Tn5-top_ME oligonucleotide (SEQ ID NO: 9) and the Tn5-bottom_Read2N oligonucleotide (SEQ ID NO: 10) separately to 100 µM in the Annealing buffer from the TruePrep Tagment Enzyme kit, then mix the two oligonucleotides at a 1:1 volume ratio. In this example, 10 µl of each oligonucleotide was used. After thorough mixing, place the mixture in a thermal cycler and run the following annealing program: 75°C for 15 min, 60°C for 10 min, 50°C for 10 min, 40°C for 10 min, 25°C for 30 min. The annealed adapter mixture is the transposon; store at -20°C.

2. TN5 transposase complex assembly: Using the TruePrep Tagment Enzyme (2 µg/µl) and the Coupling Buffer from the TruePrep Tagment Enzyme kit, prepare the following reaction mix: 10 µl TruePrep Tagment Enzyme (2 µg/µl), 33 µl Coupling Buffer, 7 µl transposon (obtained in the previous step). After thorough mixing, place the mixture in a thermal cycler and incubate at 30°C for 1 hour. The single-end (i7-end) TN5 transposase complex is obtained at the end of the reaction; store at -20°C.
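The two steps of Example 3 can be written down as plain data, which makes it easy to check the total annealing time and the final assembly volume. This is a minimal bookkeeping sketch: the temperatures, times and volumes are those stated in Example 3, while the variable and function names are hypothetical, and ramp times between temperature steps are ignored.

```python
# Annealing program from step 1: (temperature in °C, hold time in minutes).
ANNEALING_PROGRAM = [(75, 15), (60, 10), (50, 10), (40, 10), (25, 30)]

# Assembly mix from step 2: component -> volume in µl.
ASSEMBLY_MIX_UL = {
    "TruePrep Tagment Enzyme (2 ug/ul)": 10,
    "Coupling Buffer": 33,
    "annealed transposon": 7,
}


def total_minutes(program):
    """Sum of hold times, excluding ramping between steps."""
    return sum(minutes for _, minutes in program)


def total_volume_ul(mix):
    """Total volume of a reaction mix in µl."""
    return sum(mix.values())


print("annealing hold time:", total_minutes(ANNEALING_PROGRAM), "min")
print("assembly volume:", total_volume_ul(ASSEMBLY_MIX_UL), "ul")
```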

Example 4: Preparation of single-cell transcriptome libraries (single-cell 3' RNA-seq of mixed human and mouse cell lines, and single-cell 5' RNA-seq of human peripheral blood mononuclear cells)

1. Water-in-oil microdroplet preparation and barcode loading by reverse transcription (first-round cell tag): In this example, water-in-oil microdroplets are prepared using the 10x Single Cell 5' RNA-seq and 10x Single Cell 3' RNA-seq systems of the 10X Genomics Chromium platform, giving each microdroplet a unique tag. The droplet generation and the barcoded microbeads can be replaced by those of other platforms.

After counting the fixed and permeabilized nuclei and cell samples obtained in the above examples, immediately prepare the reverse transcription reaction mix. The 10x Single Cell 5' RNA-seq reaction mix is: 18.8 µl RT Reagent B, 7.3 µl Poly-dT RT Primer, 1.9 µl Reducing Agent B, 2 µl RT Enzyme C, 38.7 µl fixed and permeabilized cell or nucleus suspension. The 10x Single Cell 3' RNA-seq reaction mix is: 18.8 µl RT Reagent B, 2.4 µl Template Switch Oligo, 2 µl Reducing Agent B, 8.7 µl RT Enzyme C, 43.2 µl fixed and permeabilized cell or nucleus suspension. Following the 10X Chromium Single Cell Reagent Kits User Guide, load 70 µl of the cell reaction mix, 50 µl of 10X Single Cell Gel Beads and 45 µl of mineral oil onto a 10X Chip K (for 10x Single Cell 5' RNA-seq) or a 10X Chip G (for 10x Single Cell 3' RNA-seq), and generate the water-in-oil droplets on the 10X Genomics Chromium instrument. Afterwards, collect the water-in-oil product into a 200 µl PCR tube and promptly place it in a thermal cycler under the following conditions: 53°C for 45 min, then hold at 4°C.

2. Breaking the water-in-oil microdroplets to release and redistribute the cells: Break the water-in-oil microdroplets of the product that has received the first round of cell barcodes, recover the cells from the aqueous phase and mix them thoroughly, then dispense the cell or nucleus suspension into a 96-well plate.

3. Cell lysis and purification: Place the 96-well plate containing the dispensed cells in a thermal cycler and incubate at 85°C for 5 min, then purify.

4. Index PCR amplification (loading the second-round cell tag): Add the cDNA amplification mix to the purified products. The mix added to each well of the 96-well plate comprises: 20 µl KAPA HiFi HotStart 2X ReadyMix, 2 µl of 10 µM Partial TSO/IS primer (SEQ ID NO: 2), and 2 µl of 10 µM Truseq-i5-end specific second-tag primer (sequence as shown in SEQ ID NO: 5; 96 variants of this primer were used in this example, one per well, and the tag sequences contained in the 96 primers were each selected from the sequences shown in SEQ ID NO: 7). Mix thoroughly and promptly place the plate in a thermal cycler for amplification.
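When setting up the 96 index PCR reactions of step 4, the two components shared by every well can be prepared as one master mix, with the well-specific second-tag primer added individually. The sketch below scales the per-well volumes stated in step 4; premixing the shared components and the 10% pipetting overage are assumptions of this sketch, not instructions from the application, and the names are hypothetical.

```python
# Per-well volumes (µl) of the components shared by all wells, from step 4.
SHARED_PER_WELL_UL = {
    "KAPA HiFi HotStart 2X ReadyMix": 20.0,
    "10 uM Partial TSO/IS primer": 2.0,
}
# Per-well volume (µl) of the well-specific second-tag primer, added separately.
WELL_SPECIFIC_PRIMER_UL = 2.0


def shared_master_mix(per_well, n_wells, overage=0.10):
    """Scale shared components to n_wells plus a fractional pipetting overage."""
    factor = n_wells * (1.0 + overage)
    return {name: round(volume * factor, 1) for name, volume in per_well.items()}


mix = shared_master_mix(SHARED_PER_WELL_UL, n_wells=96)
added_per_well = sum(SHARED_PER_WELL_UL.values()) + WELL_SPECIFIC_PRIMER_UL
for name, volume in mix.items():
    print(f"{name}: {volume} ul")
print("reagent volume added per well:", added_per_well, "ul")
```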

5. Purification of the Index PCR cDNA amplification products: Pool the products from the 96-well plate into a new EP tube, then purify with 0.6x SPRIselect magnetic beads and elute.

6. Sequencing library construction: This example provides a library construction approach in which a single-end transposase inserts the i7-end sequencing primer by transposition, followed by amplification. First, take 100 ng of the product from the previous step and fragment it by transposition with the single-end (i7-end) TN5 transposase (prepared in Example 3). The reaction mix is: 10 µl 5X Reaction (Vazyme #S601-01), 5 µl single-end (i7-end) TN5 transposase, and 100 ng of the Index PCR cDNA amplification product. Mix thoroughly and incubate in a thermal cycler at 55°C for 15 min. Purify with 0.8x SPRIselect magnetic beads and elute. Next, amplify the purified product to generate the sequencing library. The reaction mix is: 50 µl NEBNext High-Fidelity 2x PCR Master Mix, 5 µl of 10 µM P5-end primer (SEQ ID NO: 1), Nextare-i7-end second-tag primer (sequence structure as shown schematically in SEQ ID NO: 6; the tag sequence contained in the primer actually used in this example was selected from the sequences shown in SEQ ID NO: 8), and 40 µl of purified transposition product. Mix thoroughly and place in a thermal cycler under the following conditions: 72°C for 5 min; 98°C for 45 s; 8 cycles of [98°C for 20 s, 60°C for 30 s, 72°C for 1 min]; 72°C for 5 min; hold at 4°C.
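The library amplification program in step 6 can be represented as a small data structure, which also gives the total programmed hold time. This is a sketch for bookkeeping only: the step temperatures and durations are those stated in step 6, the names are hypothetical, and ramp times and the final 4°C hold are not counted.

```python
# (temperature in °C, hold time in seconds), taken from step 6.
INITIAL_STEPS = [(72, 300), (98, 45)]
CYCLE_STEPS = [(98, 20), (60, 30), (72, 60)]
N_CYCLES = 8
FINAL_STEPS = [(72, 300)]


def hold_seconds(initial, cycle, n_cycles, final):
    """Total programmed hold time in seconds, excluding ramping."""
    one_cycle = sum(seconds for _, seconds in cycle)
    return (
        sum(seconds for _, seconds in initial)
        + n_cycles * one_cycle
        + sum(seconds for _, seconds in final)
    )


total = hold_seconds(INITIAL_STEPS, CYCLE_STEPS, N_CYCLES, FINAL_STEPS)
print(f"programmed hold time: {total} s ({total / 60:.1f} min)")
```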

7. Sequencing library purification and size selection: Purify and size-select the product of the previous step with 0.6X and 0.2X SPRIselect magnetic beads. The resulting sequencing library has a fragment size of approximately 300-600 bp.
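The step names the two bead ratios but not how they are applied. The sketch below assumes a double-sided selection in which 0.6X beads are added first and the supernatant is kept, after which a further 0.2X (relative to the original sample volume) is added to that supernatant. Both that interpretation and the function name are assumptions, not statements from this example.

```python
def double_sided_spri(sample_ul, first_ratio=0.6, second_ratio=0.2):
    """Return (first_addition_ul, second_addition_ul, cumulative_ratio).

    Assumption: both ratios are relative to the original sample volume, and the
    second ratio is the extra bead volume added to the retained supernatant.
    """
    first = round(sample_ul * first_ratio, 2)
    second = round(sample_ul * second_ratio, 2)
    return first, second, round(first_ratio + second_ratio, 2)


# For a 100 ul PCR product: bead volume for each addition and the cumulative ratio.
print(double_sided_spri(100))
```

If the second ratio is instead meant as a cumulative target, only the second addition changes; the first addition is the same under either reading.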

8. Sequencing: The library is sequenced on a NovaSeq 6000 (Illumina, San Diego, CA) with 150 bp paired-end reads, at 50,000 reads per cell.

Example 5: Preparation of a single-cell VDJ library (using human peripheral blood mononuclear cells as an example)

The single-cell VDJ library preparation method of this example starts from immune cells (T cells and B cells) that have been processed through the purification of the single-cell 5' RNA-seq Index PCR cDNA amplification product described in Example 4. That is, steps 1 to 5 of Example 4 are completed using 10x Single Cell 5' RNA-seq on the 10X genomics chromium platform, and the resulting cDNA amplification product, which already carries two rounds of tags, is subjected to nested PCR with primers specific for the VDJ conserved regions to enrich VDJ sequences. Because both rounds of tags on the cDNA are located at the P5 end of the sequence (the end near the VDJ region), the enriched product still carries the same two rounds of cell tags as the single-cell 5' RNA-seq transcriptome. A sequencing library is then constructed from the enriched VDJ product. This example uses PBMCs from human peripheral blood and constructs separate VDJ libraries for the T cells and the B cells.
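Because the VDJ product keeps the same two rounds of cell tags as the transcriptome, the two data sets can in principle be joined on the tag pair. The following is a minimal sketch of that join; the tag names, cell types, and clonotype names are hypothetical, and real pipelines would work on demultiplexed barcode sequences rather than short strings.

```python
def annotate_with_vdj(transcriptome_cells, vdj_clonotypes):
    """Attach a clonotype (or None) to each transcriptome cell via its (tag1, tag2) key."""
    return {
        key: {"cell_type": cell_type, "clonotype": vdj_clonotypes.get(key)}
        for key, cell_type in transcriptome_cells.items()
    }


# Hypothetical tag pairs and annotations, for illustration only.
cells = {
    ("BC_A", "W01"): "T cell",
    ("BC_A", "W02"): "B cell",   # same first-round tag, different second-round tag
    ("BC_B", "W01"): "monocyte",
}
clonotypes = {("BC_A", "W01"): "TCR_clone_1", ("BC_A", "W02"): "BCR_clone_7"}

merged = annotate_with_vdj(cells, clonotypes)
print(merged[("BC_A", "W01")])
```

Keying on the pair rather than on either tag alone is what keeps the two cells that share first-round tag `BC_A` apart.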

Specific method:

1. Specific enrichment of VDJ sequences by nested PCR: After the PBMCs from human peripheral blood have been processed through the purification of the 10x Single Cell 5' RNA-seq Index PCR cDNA amplification product in Example 4, the cDNA amplification product is enriched by two rounds of PCR with two sets of specific primers. The first-round nested PCR reaction contains: 50 ul KAPA HiFi HotStart 2X ReadyMix, 5 ul 10 uM P5-end primer, 5 ul 10 uM human T Cell/B Cell Outer primers (2 primers for T cells (SEQ ID NOs: 11-12) and 7 primers for B cells (SEQ ID NOs: 15-21); before TCR or BCR amplification, the corresponding primers are mixed 1:1), 5 ul cDNA amplification product, and 35 ul nuclease-free water. Mix thoroughly and promptly place in a PCR instrument with the following program: 98℃ 45 s; 11 cycles for TCR amplification of [98℃ 20 s, 62℃ 30 s, 72℃ 1 min]; 72℃ 1 min; hold at 4℃. The PCR product is then purified and size-selected with 0.5X and 0.3X SPRIselect magnetic beads and eluted in 40.5 ul EB buffer. The second-round nested PCR reaction contains: 50 ul KAPA HiFi HotStart 2X ReadyMix, 5 ul 10 uM P5-end primer, 5 ul 10 uM human T Cell/B Cell Inner primers (2 primers for T cells (SEQ ID NOs: 13-14) and 7 primers for B cells (SEQ ID NOs: 22-28); before TCR or BCR amplification, the corresponding primers are mixed 1:1), and 40 ul first-round nested PCR product. Mix thoroughly and promptly place in a PCR instrument with the following program: 98℃ 45 s; 9 cycles for TCR amplification (9 cycles for BCR amplification) of [98℃ 20 s, 62℃ 30 s, 72℃ 1 min]; 72℃ 1 min; hold at 4℃. Finally, the PCR end product is purified and size-selected with 0.5X and 0.25X SPRIselect magnetic beads and eluted in 30.5 ul EB buffer. Take 1 ul of the eluate to measure the concentration with Qubit; the remaining sample can be stored at -80℃ for 3 months.
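The volumes listed for the first-round reaction add up to 100 ul, and the final concentration of each primer in the 1:1 pool follows from simple dilution. The sketch below assumes that the stated 10 uM is the total primer concentration of the pooled mix (the text does not say whether 10 uM refers to the pool or to each primer); the function name is illustrative.

```python
def per_primer_final_um(n_primers, pool_um=10.0, pool_ul=5.0, reaction_ul=100.0):
    """Final concentration (uM) of each primer from an equimolar pool.

    Assumption: pool_um is the total primer concentration of the 1:1 pool.
    """
    return pool_um * pool_ul / reaction_ul / n_primers


reaction_ul = 50 + 5 + 5 + 5 + 35  # first-round components listed above, in ul
t_cell = per_primer_final_um(2)    # 2 T cell outer primers
b_cell = per_primer_final_um(7)    # 7 B cell outer primers
print(reaction_ul, t_cell, round(b_cell, 4))
```

Under the alternative reading (each primer at 10 uM before pooling), the per-primer values are the same, since 1:1 mixing of equal-concentration stocks also yields a 10 uM total pool.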

2. VDJ sequencing library construction: As in Example 4, this example builds the library by using a single-end transposase to insert the i7-end sequencing primer by transposition, followed by amplification. 100 ng of VDJ-enriched product is fragmented by transposition with the single-end (i7-end) TN5 transposase (prepared in Example 3). The reaction contains: 10 ul 5X Reaction (vazyme #S601-01), 5 ul single-end (i7-end) TN5 transposase, and 100 ng of the above Index PCR cDNA amplification product, in a total volume of 50 ul made up with nuclease-free water. Mix thoroughly and incubate in a PCR instrument at 55℃ for 5 min. The product is purified with 0.8x SPRIselect magnetic beads and eluted from the beads in 40.5 ul EB. The purified product is then amplified into a sequencing library. The reaction contains: 50 ul NEBNext High-Fidelity 2x PCR Master Mix, 5 ul 10 uM P5-end primer, 5 ul 10 uM Nextare-i7-end second-tag primer (sequence structure shown schematically in SEQ ID NO: 6; the tag sequence in the primer actually used in this example is one chosen at random from SEQ ID NO: 8), and 40 ul purified transposition product. Mix thoroughly and place in a PCR instrument with the following program: 72℃ 5 min; 98℃ 45 s; 7 cycles of [98℃ 20 s, 60℃ 30 s, 72℃ 1 min]; 72℃ 5 min; hold at 4℃.
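The transposition reaction is made up to 50 ul with nuclease-free water, so the water volume depends on the measured DNA concentration. A minimal sketch of that calculation follows; the 12.5 ng/ul Qubit reading is a made-up value for illustration, and the function name is hypothetical.

```python
def tagmentation_volumes(dna_ng_per_ul, dna_ng=100.0, buffer_ul=10.0,
                         enzyme_ul=5.0, total_ul=50.0):
    """Return (dna_ul, water_ul) for a reaction made up to total_ul with water."""
    dna_ul = dna_ng / dna_ng_per_ul
    water_ul = total_ul - buffer_ul - enzyme_ul - dna_ul
    if water_ul < 0:
        raise ValueError("DNA too dilute: the requested input does not fit in the reaction")
    return round(dna_ul, 2), round(water_ul, 2)


# 12.5 ng/ul is a hypothetical concentration, not a value from this example.
print(tagmentation_volumes(12.5))
```

With 10 ul buffer and 5 ul enzyme fixed, at most 35 ul is available for DNA, so a sample below roughly 2.9 ng/ul cannot supply 100 ng without concentrating it first.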

3. Sequencing library purification and size selection: Purify and size-select the product of the previous step with 0.6X and 0.2X SPRIselect magnetic beads. The resulting sequencing library has a fragment size of approximately 300-600 bp.

4. Sequencing: The library is sequenced on a NovaSeq 6000 (Illumina, San Diego, CA) with 150 bp paired-end reads, at 10,000 reads per cell.

Example 6: Preparation of a single-cell mRNA + genomic DNA multi-omics library (using a human kidney single-cell sample as an example)

This example uses the Single Cell Multiome ATAC+RNA-seq system on the 10X genomics chromium platform. Water-in-oil microdroplets are first prepared and the first-round cell tag barcodes are loaded; index PCR is then used to load second-round cell tags separately onto the transcriptome and the open chromatin regions of the cells/nuclei, completing the construction of the single-cell multi-omics library. The water-in-oil droplet preparation and the cell-barcoded microbeads can be replaced by those of other platforms. Specific method:

1. In situ transposition in cells: Following the Chromium Next GEM Single Cell Multiome ATAC+Gene Expression User Guide, an in situ transposition reaction is performed on the single-nucleus and permeabilized-cell samples that were fixed and permeabilized in the above examples. The reaction contains: 7 ul ATAC Buffer B, 3 ul ATAC Enzyme B, and 5 ul fixed cell/nucleus suspension. Mix thoroughly and place in a PCR instrument with the following program: 37℃ 60 min; hold at 4℃. After the reaction, specific adapter sequences have been introduced into the open chromatin regions of the sample nuclei.

2. Water-in-oil microdroplet preparation and barcode loading (first-round cell tag): After the in situ transposition reaction, prepare the reverse transcription and ligation mix: 49.5 μl Barcoding Reagent Mix, 1.9 μl Reducing Agent B, 1.1 μl Template Switch Oligo, and 7.5 μl Barcoding Enzyme Mix. Mix and add it to the product of the previous step. Load 70 ul cell reaction mix, 50 ul 10X Single Cell Gel Beads, and 45 ul mineral oil onto a 10X Chip J (load unused wells with 50% glycerol as directed in the manual) and generate the water-in-oil droplets on the 10X genomics chromium instrument. Afterwards, collect the water-in-oil product and promptly place it in a PCR instrument with the following program: 37℃ 45 min; 25℃ 30 min; hold at 4℃.
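The volumes in steps 1 and 2 can be cross-checked with a few lines of arithmetic. The sketch below uses `Decimal` so that values such as 1.9 and 1.1 add up exactly; the variable names are illustrative, and the interpretation of the 5 ul difference as pipetting excess is an assumption.

```python
from decimal import Decimal

# Step 1 transposition reaction: ATAC Buffer B, ATAC Enzyme B, cell/nucleus suspension.
transposition_ul = sum(Decimal(v) for v in ("7", "3", "5"))

# Step 2 mix: Barcoding Reagent Mix, Reducing Agent B, Template Switch Oligo, Barcoding Enzyme Mix.
barcoding_mix_ul = sum(Decimal(v) for v in ("49.5", "1.9", "1.1", "7.5"))

combined_ul = transposition_ul + barcoding_mix_ul
loaded_ul = Decimal("70")  # volume loaded onto the chip

print(barcoding_mix_ul, combined_ul, combined_ul - loaded_ul)
```

The combined 75 ul against 70 ul loaded leaves 5 ul, which is consistent with a small excess to avoid drawing air when loading the chip.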

3. Breaking the water-in-oil microdroplets to release the cells, and cell redistribution: Proceed as described in step 2 of Example 4.
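One way to see the value of redistributing cells before the second round of tagging is to ask how often cells that shared a droplet (and therefore a first-round tag) end up in different wells. The sketch below is an idealized model, not part of the described protocol: it assumes each cell lands in one of 96 wells uniformly and independently.

```python
from fractions import Fraction


def prob_all_separated(k_cells, n_wells=96):
    """Probability that k cells sharing one first-round tag land in k different wells.

    Idealized assumption: each cell is assigned to a well uniformly and independently.
    """
    p = Fraction(1)
    for i in range(k_cells):
        p *= Fraction(n_wells - i, n_wells)
    return p


print(float(prob_all_separated(2)), float(prob_all_separated(3)))
```

Under this model two co-encapsulated cells are separated with probability 95/96, so the second-round tag resolves the large majority of such pairs.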

4. Cell lysis and purification: Add 1 ul Proteinase K to each well of the aliquoted product, mix thoroughly, spin down briefly, and incubate in a PCR instrument at 55℃ for 5 min. Then prepare Dynabeads Cleanup Mix according to the 10X Chromium Single Cell Reagent Kits User Guide, add 16 ul Dynabeads Cleanup Mix to each well of the 96-well plate for purification, elute each well with 16.5 ul Elution Solution I, and transfer the eluates to a new 96-well plate. Purify the product again with 1.8x SPRIselect magnetic beads, elute with 16.5 ul EB, and transfer the eluted product to another new 96-well plate.

5. Index PCR amplification (loading the second-round cell tags): Add the cDNA and gDNA amplification reaction mix to the purified product from the previous step. Each well of the 96-well plate receives: 25 ul NEBNext High-Fidelity 2x PCR Master Mix; 2 ul 10 uM P5-end primer (SEQ ID NO: 1); 2 ul 10 uM Nextare-i7-end second-tag primer (sequence as in SEQ ID NO: 6; 96 such primers are used in this example, one per well, to tag gDNA from open chromatin regions, and the tag sequences they contain are selected from the sequences shown in SEQ ID NO: 8); 2 ul 10 uM Partial TSO/IS primer (SEQ ID NO: 2); and 2 ul 10 uM Truseq-i5-end-specific second-tag primer (sequence as in SEQ ID NO: 5; 96 such primers are used in this example, one per well, to tag transcriptome cDNA, and the tag sequences they contain are selected from the sequences shown in SEQ ID NO: 7). Mix thoroughly and promptly place in a PCR instrument with the following program: 72℃ 5 min; 98℃ 3 min; 7 cycles of [98℃ 20 s, 63℃ 30 s, 72℃ 1 min]; 72℃ 1 min; hold at 4℃.
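Because each well receives one i7 tag primer (for gDNA) and one i5 tag primer (for cDNA), the two second-round tags of a well identify the same set of cells even though their sequences differ. A minimal sketch of matching the two modalities through the well follows. All tag and well names are hypothetical, and the first-round tag is assumed to be directly comparable between the RNA and ATAC reads.

```python
def same_cell(rna_label, atac_label, i5_to_well, i7_to_well):
    """True if an RNA read and an ATAC read carry the labels of the same cell.

    Each label is (first_round_tag, second_round_tag). The second-round tags differ
    between modalities (i5 for cDNA, i7 for gDNA) but map to the same well.
    """
    rna_tag1, i5_tag = rna_label
    atac_tag1, i7_tag = atac_label
    return rna_tag1 == atac_tag1 and i5_to_well[i5_tag] == i7_to_well[i7_tag]


# Hypothetical well assignments for two wells of the plate.
i5_to_well = {"i5_01": "A1", "i5_02": "A2"}
i7_to_well = {"i7_01": "A1", "i7_02": "A2"}

print(same_cell(("BC_X", "i5_01"), ("BC_X", "i7_01"), i5_to_well, i7_to_well))
print(same_cell(("BC_X", "i5_01"), ("BC_X", "i7_02"), i5_to_well, i7_to_well))
```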

6. Purification of the Index PCR amplification product: Pool the products from the 96-well plate into a new EP tube, purify with 1.6x SPRIselect magnetic beads, and elute in 400 ul EB. Purify the eluate again with 1.6x SPRIselect magnetic beads and elute in 160 ul EB. Take 1 ul of the eluate to measure the concentration with Qubit; the remaining sample can be stored at -80℃ for 3 months.

7. Enrichment of the cDNA amplification product: Take 40 ul of the above Index PCR amplification product and specifically enrich the cDNA amplification product with a biotin-modified primer. Add 60 ul of reaction mix containing: 50 ul KAPA HiFi HotStart 2X ReadyMix, 5 ul 10 uM P5-end primer (SEQ ID NO: 1), and 5 ul 10 uM Bio-Partial TSO/IS primer (SEQ ID NO: 3). Mix thoroughly and promptly place in a PCR instrument with the following program: 98℃ 30 s; 6 cycles of [98℃ 20 s, 54℃ 30 s, 72℃ 20 s]; 72℃ 1 min; hold at 4℃. Next, enrich the biotin-labeled cDNA amplification product with MyOne Streptavidin C1 beads (purify according to the MyOne Streptavidin C1 beads instructions). Resuspend the C1 beads carrying the bound cDNA amplification product in fresh PCR reaction mix (containing: 50 ul KAPA HiFi HotStart 2X ReadyMix, 5 ul 10 uM P5-end primer (SEQ ID NO: 1), 5 ul 10 uM Partial TSO/IS primer (SEQ ID NO: 2), and 40 ul nuclease-free water), mix thoroughly, and amplify further with the following program: 98℃ 30 s; 4 cycles of [98℃ 20 s, 54℃ 30 s, 72℃ 20 s]; 72℃ 1 min; hold at 4℃. After the PCR, place the PCR tube on a magnetic rack for 5 min, collect the supernatant, purify it with 0.6x SPRIselect magnetic beads, and elute in EB. Take 1 ul of the eluate to measure the concentration with Qubit; the remaining sample can be stored at -80℃ for 3 months.

8. cDNA sequencing library construction and sequencing: As described in steps 6, 7, and 8 of Example 4.

9. ATAC-seq sequencing library construction: Take 40 ul of the purified Index PCR amplification product from step 6 of this example to construct the ATAC-seq sequencing library. To the 40 ul of Index PCR amplification product add: 50 ul KAPA HiFi HotStart 2X ReadyMix, 5 ul 10 uM P5-end primer (SEQ ID NO: 1), and 5 ul 10 uM P7-end primer (SEQ ID NO: 4). Mix thoroughly and amplify by PCR with the following program: 98℃ 45 s; 7-10 cycles (depending on the number of cells loaded) of [98℃ 20 s, 67℃ 30 s, 72℃ 20 s]; 72℃ 1 min; hold at 4℃.

10. ATAC-seq sequencing library purification and size selection: Purify and size-select the product of the previous step with 0.4X and 1X SPRIselect magnetic beads. The resulting sequencing library has a fragment size of approximately 200-700 bp.

11. ATAC-seq library sequencing: The library is sequenced on a NovaSeq 6000 (Illumina, San Diego, CA) with 50 bp paired-end reads, at 25,000 reads per cell.
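The per-cell sequencing depths stated in Examples 4 to 6 (50,000 reads for the cDNA library, 10,000 for the VDJ library, 25,000 for the ATAC-seq library) translate directly into a total read requirement once a cell number is fixed. The sketch below is simple arithmetic; the cell number of 10,000 is a hypothetical input, not a value from these examples.

```python
# Reads per cell as stated in the sequencing steps of Examples 4, 5, and 6.
READS_PER_CELL = {"cDNA": 50_000, "VDJ": 10_000, "ATAC-seq": 25_000}


def total_reads(n_cells, library):
    """Total reads needed to reach the stated per-cell depth for one library type."""
    return n_cells * READS_PER_CELL[library]


n_cells = 10_000  # hypothetical number of cells in the experiment
for library in READS_PER_CELL:
    print(library, total_reads(n_cells, library))
```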

The experimental results of the examples of the present application are shown in Figures 5-9.

Figure 5 shows the results of sequencing a mixed sample of human and mouse cell lines with the single-cell transcriptome method of the present invention, as a scatter plot of the number of UMIs per single cell that align to each species' genome. The UMI count is the number of cDNA molecules sequenced, and each point represents one cell (6446 points in total). Light-colored points are cells containing almost exclusively mouse cDNA, dark-colored points are cells containing almost exclusively human cDNA, and black points are contaminated cells (that is, false single cells).
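The three groups of points in such a species-mixing plot can be assigned with a simple purity rule on the per-cell UMI counts. The sketch below is illustrative only: the 0.9 purity threshold and the function name are assumptions, and the text does not state which criterion was used to color the points in Figure 5.

```python
def classify_cell(human_umis, mouse_umis, purity=0.9):
    """Label a cell by the fraction of its UMIs that align to each species.

    The purity threshold is a hypothetical choice, not a value from the text.
    """
    total = human_umis + mouse_umis
    if total == 0:
        return "empty"
    if human_umis / total >= purity:
        return "human"
    if mouse_umis / total >= purity:
        return "mouse"
    return "mixed"  # contaminated cell, i.e. a false single cell


print(classify_cell(950, 50), classify_cell(10, 990), classify_cell(500, 500))
```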

Figure 6 shows the results of sequencing human peripheral blood mononuclear cell samples fixed under different conditions with the transcriptome method of the present invention. A: number of UMIs detected after fixing the peripheral blood mononuclear cells under three conditions (methanol, 1% formaldehyde, 1% paraformaldehyde); B: number of genes detected after fixing the peripheral blood mononuclear cells under the same three conditions; C: visualization of the unsupervised clustering of all cells from the three fixation conditions; D: distribution of the cell clusters across the three fixation conditions.

Figure 7 shows the results of sequencing a cryopreserved human peripheral blood mononuclear cell sample with the single-cell 5' RNA-seq method of the present invention (cell throughput in a single experiment: 118,819), as a visualization of the cell clusters from the single-cell 5' RNA-seq data. All 27 major cell types in blood were detected with this method.

Figure 8 shows the results of sequencing a human peripheral blood mononuclear cell sample with the single-cell VDJ-seq method of the present invention. A and C are visualizations of the cells in which BCR and TCR clones, respectively, were detected: black points are cells with a detected BCR/TCR clone and light-colored points are cells without one. The cells with detected BCR clones coincide completely with the positions annotated as B cells in the single-cell transcriptome data of Figure 7, and the cells with detected TCR clones coincide completely with the positions annotated as T cells in the single-cell transcriptome data of Figure 7. B and D are the proportions of B cells and T cells in which BCR and TCR clones, respectively, could be detected.

Figure 9 shows the results of sequencing a cryopreserved human kidney sample with the single-cell transcriptome + ATAC multi-omics method of the present invention. A: visualization of the cell clusters from the single-cell transcriptome part of the sample, showing that all 12 major cell types in the kidney were detected with this method; B: number of genes detected per cell in the single-cell transcriptome of the sample; C: visualization of the cell clusters from the single-cell ATAC-seq part of the sample, in which clustering yielded 18 cell clusters; D: ATAC-seq peak information obtained for the various cell types in the single-cell ATAC-seq part.

Although specific embodiments of the present invention have been described in detail, those skilled in the art will understand that various modifications and changes can be made to the details in light of all the teachings that have been disclosed, and that such changes fall within the scope of protection of the present invention. The full scope of the present invention is given by the appended claims and any equivalents thereof.

Claims (17)

一种标记来自细胞或细胞核的核酸分子的方法,其包括下述步骤:A method for labeling nucleic acid molecules from cells or cell nuclei, comprising the following steps: (1)提供多个经固定和透化的细胞或细胞核,所述细胞或细胞核含有待标记的核酸分子;和,(1) providing a plurality of fixed and permeabilized cells or cell nuclei, wherein the cells or cell nuclei contain nucleic acid molecules to be labeled; and, 多个偶联了多个第一寡核苷酸分子的珠粒,其中,所述第一寡核苷酸分子含有第一标签序列;a plurality of beads coupled with a plurality of first oligonucleotide molecules, wherein the first oligonucleotide molecules contain a first tag sequence; 并且,同一个珠粒上的所述多个第一寡核苷酸分子具有相同的第一标签序列,并且,不同珠粒上的所述第一寡核苷酸分子具有彼此不同的第一标签序列;Furthermore, the plurality of first oligonucleotide molecules on the same bead have the same first tag sequence, and the first oligonucleotide molecules on different beads have first tag sequences different from each other; (2)将多个所述珠粒和多个所述细胞或细胞核随机分配至不同的第一离散分区,在所述第一离散分区内使所述第一寡核苷酸分子从所述珠粒上释放并与所述细胞或细胞核接触,从而在所述细胞或细胞核内生成衍生自所述待标记核酸分子的第一核酸分子,所述第一核酸分子含有所述第一标签序列或其互补序列;(2) randomly allocating a plurality of the beads and a plurality of the cells or cell nuclei to different first discrete partitions, releasing the first oligonucleotide molecule from the beads and contacting the first oligonucleotide molecule with the cells or cell nuclei in the first discrete partitions, thereby generating a first nucleic acid molecule derived from the nucleic acid molecule to be labeled in the cells or cell nuclei, wherein the first nucleic acid molecule contains the first tag sequence or its complementary sequence; (3)将源自不同所述第一离散分区的包含所述第一核酸分子的细胞或细胞核混合并重新分配到不同的第二离散分区;(3) mixing the cells or cell nuclei containing the first nucleic acid molecule originating from different first discrete partitions and redistributing them into different second discrete partitions; (4)在所述第二离散分区内,使含有第二标签序列的第二寡核苷酸分子与所述第一核酸分子接触,生成含有第一标签序列或其互补序列以及第二标签序列或其互补序列的第二核酸分子;(4) contacting a second oligonucleotide molecule containing a second tag sequence with the first nucleic acid molecule within the second discrete partition to 
generate a second nucleic acid molecule containing the first tag sequence or its complementary sequence and the second tag sequence or its complementary sequence; 其中,同一个所述第二离散分区的所述第二寡核苷酸分子具有相同的第二标签序列,并且,不同所述第二离散分区的所述第二寡核苷酸分子具有彼此不同的第二标签序列;wherein the second oligonucleotide molecules in the same second discrete partition have the same second tag sequence, and the second oligonucleotide molecules in different second discrete partitions have different second tag sequences; 其中,所述细胞为天然存在的细胞或重组细胞,或两者的混合;所述细胞核为源自天然存在的细胞的细胞核或重组细胞的细胞核,或两者的混合。Wherein, the cell is a naturally occurring cell or a recombinant cell, or a mixture of the two; the cell nucleus is a cell nucleus derived from a naturally occurring cell or a recombinant cell, or a mixture of the two. 权利要求1的方法,其中,所述待标记的核酸分子为mRNA,并且,所述第一寡核苷酸分子为第一寡核苷酸分子a;The method of claim 1, wherein the nucleic acid molecule to be labeled is mRNA, and the first oligonucleotide molecule is first oligonucleotide molecule a; 优选地,所述步骤(2)包括以下步骤:Preferably, the step (2) comprises the following steps: (i)(a)在所述第一离散分区内,用所述第一寡核苷酸分子a对所述待标记的核酸分子进行逆转录,生成cDNA链,所述cDNA链包含以所述第一寡核苷酸分子a为逆转录引物形成的与所述待标记核酸分子互补的cDNA序列,以及3’末端悬突;其中,所述第一寡核苷酸分子a从5’端至3’端包含:共有序列R1或其部分序列、所述第一标签序列、poly(T)序列;和,(b)将引物A与(a)中生成的所述cDNA链进行退火,并进行延伸反应,生成延伸产物,所述延伸产物即为所述第一核酸分子;其中,所述引物A从5’端至3’端包含共有序列O和所述3’末端悬突的互补序列;(i) (a) in the first discrete partition, reversely transcribe the nucleic acid molecule to be labeled with the first oligonucleotide molecule a to generate a cDNA chain, wherein the cDNA chain comprises a cDNA sequence complementary to the nucleic acid molecule to be labeled formed by using the first oligonucleotide molecule a as a reverse transcription primer, and a 3' terminal overhang; wherein the first oligonucleotide molecule a comprises, from the 5' end to the 3' end: a consensus sequence R1 or a partial sequence thereof, the first tag sequence, and a poly(T) sequence; and, (b) annealing primer A with the cDNA chain generated in (a), and performing an extension reaction to 
generate an extension product, wherein the extension product is the first nucleic acid molecule; wherein the primer A comprises, from the 5' end to the 3' end, a consensus sequence O and a complementary sequence to the 3' terminal overhang; 或者,or, (ii)(a)在所述第一离散分区内,用引物B对所述待标记的核酸分子进行逆转录,生成cDNA链,所述cDNA链包含以所述引物B为逆转录引物形成的与所述待标记核酸分子 互补的cDNA序列,以及3’末端悬突;其中,所述引物B从5’端至3’端包含共有序列T或其部分序列和poly(T)序列;和,(b)在所述第一离散分区内,将所述第一寡核苷酸分子a与(a)中生成的所述cDNA链进行退火,并进行延伸反应,生成延伸产物,所述延伸产物即为所述第一核酸分子;其中,所述第一寡核苷酸分子a从5’端至3’端包含:共有序列R1或其部分序列、所述第一标签序列和所述3’末端悬突的互补序列。(ii) (a) in the first discrete partition, reversely transcribe the nucleic acid molecule to be labeled using primer B to generate a cDNA chain, wherein the cDNA chain contains a sequence that is formed by using primer B as a reverse transcription primer and is identical to the nucleic acid molecule to be labeled. A complementary cDNA sequence, and a 3' terminal overhang; wherein the primer B comprises a consensus sequence T or a partial sequence thereof and a poly (T) sequence from the 5' end to the 3'end; and, (b) within the first discrete partition, the first oligonucleotide molecule a is annealed with the cDNA chain generated in (a), and an extension reaction is performed to generate an extension product, wherein the extension product is the first nucleic acid molecule; wherein the first oligonucleotide molecule a comprises, from the 5' end to the 3' end: a consensus sequence R1 or a partial sequence thereof, the first tag sequence and a complementary sequence to the 3' terminal overhang. 
权利要求2的方法,其中,所述步骤(4)包括以下步骤:The method of claim 2, wherein said step (4) comprises the following steps: 在所述第二离散分区内,以所述第二寡核苷酸分子和引物C为引物扩增所述第一核酸分子,生成的延伸产物即为所述第二核酸分子;In the second discrete partition, the first nucleic acid molecule is amplified using the second oligonucleotide molecule and primer C as primers, and the generated extension product is the second nucleic acid molecule; 其中,所述第二寡核苷酸分子从5’端至3’端包含:共有序列P1或其部分序列、所述第二标签序列、所述共有序列R1或其部分序列;所述引物C包含所述共有序列O或其部分序列,或者,所述引物C包含共有序列T或其部分序列。Wherein, the second oligonucleotide molecule comprises from the 5' end to the 3' end: the consensus sequence P1 or a partial sequence thereof, the second tag sequence, the consensus sequence R1 or a partial sequence thereof; the primer C comprises the consensus sequence O or a partial sequence thereof, or the primer C comprises the consensus sequence T or a partial sequence thereof. 权利要求1的方法,其中,所述待标记的核酸分子为基因组DNA,并且,所述第一寡核苷酸分子为第一寡核苷酸分子b;The method of claim 1, wherein the nucleic acid molecule to be labeled is genomic DNA, and the first oligonucleotide molecule is the first oligonucleotide molecule b; 优选地,所述步骤(2)包括以下步骤:Preferably, the step (2) comprises the following steps: (a)将所述待标记核酸分子与转座酶复合体I孵育;其中,所述转座酶复合体I含有转座酶和所述转座酶能够识别并结合的转座序列,且能够切割或断裂双链核酸;并且,所述转座序列包含转移链和非转移链;其中,所述转移链包含第一转移链和第二转移链,所述第一转移链包含转座酶识别序列和共有序列R2或其部分序列,所述第二转移链包含转座酶识别序列和共有序列R1或其部分序列;并且,所述孵育在允许所述待标记的核酸分子被所述转座酶复合体I断裂成核酸片段且所述转移链被连接至所述核酸片段的末端(例如,所述核酸片段的5’端)的条件下进行;从而生成5’端分别含有共有序列R2或其部分序列以及共有序列R1或其部分序列的双链核酸片段;和,(a) incubating the nucleic acid molecule to be labeled with a transposase complex I; wherein the transposase complex I contains a transposase and a transposition sequence that the transposase can recognize and bind to, and can cut or break a double-stranded nucleic acid; and the transposition sequence comprises a transferred strand and a non-transferred strand; wherein the transferred strand comprises a first transferred strand and a second transferred strand, the first transferred strand comprises a transposase recognition 
sequence and a consensus sequence R2 or a partial sequence thereof, and the second transferred strand comprises a transposase recognition sequence and a consensus sequence R1 or a partial sequence thereof; and the incubation is performed under conditions that allow the nucleic acid molecule to be broken into nucleic acid fragments by the transposase complex I and the transferred strands are connected to the ends of the nucleic acid fragments (e.g., the 5' ends of the nucleic acid fragments); thereby generating double-stranded nucleic acid fragments whose 5' ends contain the consensus sequence R2 or a partial sequence thereof and the consensus sequence R1 or a partial sequence thereof, respectively; and, (b)在所述第一离散分区内,将所述第一寡核苷酸分子b与(a)中生成的所述双链核酸片段进行连接(例如,利用核酸酶进行连接),并进行延伸反应,生成延伸产物,所述延伸产物即为所述第一核酸分子;其中,所述第一寡核苷酸分子b从5’端至3’端包含:共有序列P1或其部分序列和所述第一标签序列。(b) In the first discrete partition, the first oligonucleotide molecule b is connected to the double-stranded nucleic acid fragment generated in (a) (for example, by using a nuclease), and an extension reaction is performed to generate an extension product, which is the first nucleic acid molecule; wherein the first oligonucleotide molecule b comprises, from the 5' end to the 3' end: a consensus sequence P1 or a partial sequence thereof and the first tag sequence. 
权利要求4的方法,其中,所述步骤(4)包括以下步骤:The method of claim 4, wherein said step (4) comprises the following steps: 在所述第二离散分区内,以所述第二寡核苷酸分子和引物D为引物扩增所述第一核酸分子,生成的延伸产物即为所述第二核酸分子;In the second discrete partition, the first nucleic acid molecule is amplified using the second oligonucleotide molecule and primer D as primers, and the generated extension product is the second nucleic acid molecule; 其中,所述第二寡核苷酸分子从5’端至3’端包含:共有序列P2或其部分序列、所述第二标签序列、共有序列R2或其部分序列;所述引物D包含共有序列P1或其部分序列。Wherein, the second oligonucleotide molecule comprises from the 5' end to the 3' end: the consensus sequence P2 or a partial sequence thereof, the second tag sequence, the consensus sequence R2 or a partial sequence thereof; and the primer D comprises the consensus sequence P1 or a partial sequence thereof. 权利要求1的方法,其中,所述待标记的核酸分子为mRNA和基因组DNA,并且, 所述mRNA和基因组DNA具有相同的细胞来源;The method of claim 1, wherein the nucleic acid molecules to be labeled are mRNA and genomic DNA, and The mRNA and genomic DNA have the same cell origin; 并且,所述第一寡核苷酸分子包括第一寡核苷酸分子a和第一寡核苷酸分子b,所述第二寡核苷酸分子包括第二寡核苷酸分子a和第二寡核苷酸分子b;Furthermore, the first oligonucleotide molecule includes a first oligonucleotide molecule a and a first oligonucleotide molecule b, and the second oligonucleotide molecule includes a second oligonucleotide molecule a and a second oligonucleotide molecule b; 其中,所述珠粒同时偶联了多个所述第一寡核苷酸分子a和多个所述第一寡核苷酸分子b;并且,同一个珠粒上的所述多个第一寡核苷酸分子a和多个所述第一寡核苷酸分子b具有相同的第一标签序列;Wherein, the bead is coupled with a plurality of the first oligonucleotide molecules a and a plurality of the first oligonucleotide molecules b at the same time; and the plurality of the first oligonucleotide molecules a and the plurality of the first oligonucleotide molecules b on the same bead have the same first tag sequence; 优选地,所述步骤(2)包括以下步骤:Preferably, the step (2) comprises the following steps: 
(A)(i)(a) in the first discrete partition, the mRNA molecule to be labeled is reverse-transcribed with the first oligonucleotide molecule a to generate a cDNA strand, wherein the cDNA strand comprises a cDNA sequence complementary to the mRNA molecule to be labeled, formed using the first oligonucleotide molecule a as the reverse transcription primer, and a 3' terminal overhang; wherein the first oligonucleotide molecule a comprises, from the 5' end to the 3' end: a consensus sequence R1 or a partial sequence thereof, the first tag sequence, and a poly(T) sequence; and,
(b) primer A is annealed to the cDNA strand generated in (a), and an extension reaction is performed to generate an extension product, which is the first nucleic acid molecule a; wherein the primer A comprises, from the 5' end to the 3' end, a consensus sequence O and a sequence complementary to the 3' terminal overhang;
or,
(ii)(a) in the first discrete partition, the mRNA molecule to be labeled is reverse-transcribed with primer B to generate a cDNA strand, wherein the cDNA strand comprises a cDNA sequence complementary to the mRNA molecule to be labeled, formed using the primer B as the reverse transcription primer, and a 3' terminal overhang; wherein the primer B comprises, from the 5' end to the 3' end, a consensus sequence T or a partial sequence thereof and a poly(T) sequence; and,
(b) in the first discrete partition, the first oligonucleotide molecule a is annealed to the cDNA strand generated in (a), and an extension reaction is performed to generate an extension product, which is the first nucleic acid molecule a; wherein the first oligonucleotide molecule a comprises, from the 5' end to the 3' end: a consensus sequence R1 or a partial sequence thereof, the first tag sequence, and a sequence complementary to the 3' terminal overhang;
and,
(B)(a) the DNA molecule to be labeled is incubated with a transposase complex I; wherein the transposase complex I is as defined in claim 4; and the incubation is performed under conditions that allow the DNA molecule to be labeled to be broken into nucleic acid fragments by the transposase complex I and the transferred strands to be connected to the ends of the nucleic acid fragments (e.g., the 5' ends of the nucleic acid fragments); thereby generating double-stranded nucleic acid fragments whose 5' ends contain a consensus sequence R2 or a partial sequence thereof and a consensus sequence R1 or a partial sequence thereof, respectively; and,
(b) in the same first discrete partition as in (A), the first oligonucleotide molecule b is connected to the double-stranded nucleic acid fragment generated in (a), and an extension reaction is performed to generate an extension product, which is the first nucleic acid molecule b; wherein the first oligonucleotide molecule b comprises, from the 5' end to the 3' end: a consensus sequence P1 or a partial sequence thereof and the first tag sequence;
wherein step (A) and step (B) may be performed in any order (for example, (A) first and then (B), (B) first and then (A), or simultaneously).

The method of claim 6, wherein step (4) comprises the following steps:
(a) in the second discrete partition, the first nucleic acid molecule a is amplified using the second oligonucleotide molecule a and primer C as primers, and the generated extension product is the second nucleic acid molecule a;
wherein the second oligonucleotide molecule a comprises, from the 5' end to the 3' end: the consensus sequence P1 or a partial sequence thereof, the second tag sequence, and the consensus sequence R1 or a partial sequence thereof; and the primer C comprises the consensus sequence O or a partial sequence thereof, or the primer C comprises the consensus sequence T or a partial sequence thereof;
and
(b) in the same second discrete partition, the first nucleic acid molecule b is amplified using the second oligonucleotide molecule b and primer D as primers, and the generated extension product is the second nucleic acid molecule b;
wherein the second oligonucleotide molecule b comprises, from the 5' end to the 3' end: the consensus sequence P2 or a partial sequence thereof, the second tag sequence, and the consensus sequence R2 or a partial sequence thereof; and the primer D comprises the consensus sequence P1 or a partial sequence thereof;
wherein step (a) and step (b) may be performed in any order (for example, (a) first and then (b), (b) first and then (a), or simultaneously).
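The segment order recited for the mRNA branch can be traced with a small Python sketch. It assumes ordinary tailed-primer behaviour: the 3'-most segment of the second oligonucleotide molecule a (R1) overlaps the 5'-most segment of the first nucleic acid molecule a, and the primer's 5' tail is carried into the product. The function name and the segment labels used as strings are illustrative assumptions, not terms defined by the claims.

```python
def extend_with_tailed_primer(primer, template):
    """Return the 5'->3' segment order of the product strand.

    Assumption: the primer's 3'-most segment is identical to the
    template strand's 5'-most segment, so the two overlap there.
    """
    if primer[-1] != template[0]:
        raise ValueError("primer 3' segment does not match template 5' segment")
    # The shared segment appears once; the primer tail is added upstream.
    return primer[:-1] + template


# First nucleic acid molecule a, option (i): R1, first tag, poly(T), then cDNA.
first_molecule_a = ("R1", "tag1", "poly(T)", "cDNA")
# Second oligonucleotide molecule a, 5'->3': P1, second tag, R1.
second_oligo_a = ("P1", "tag2", "R1")

second_molecule_a = extend_with_tailed_primer(second_oligo_a, first_molecule_a)
print(second_molecule_a)
```

Under these assumptions the product carries both the second tag and the first tag on the same strand, which is what later allows a molecule to be traced through both partitioning rounds.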
A method for constructing a nucleic acid molecule library, comprising:
(1) generating a plurality of labeled second nucleic acid molecules according to the method of any one of claims 1-7, and,
(2) recovering and/or combining the plurality of second nucleic acid molecules,
thereby obtaining a nucleic acid molecule library;
preferably, in step (2), the second nucleic acid molecules generated in a plurality of the second discrete partitions are recovered and/or combined.

The method of claim 8, comprising:
(a) generating a plurality of labeled second nucleic acid molecules according to the method of claim 2 or 3,
(b) recovering and/or combining the plurality of second nucleic acid molecules; preferably, recovering and/or combining the second nucleic acid molecules generated in a plurality of the second discrete partitions; and
(c) randomly fragmenting the second nucleic acid molecules and adding a linker sequence;
thereby obtaining a nucleic acid molecule library;
preferably, in step (c), the second nucleic acid molecules are randomly fragmented by a transposase and a linker sequence is added to their 5' ends; preferably, the linker sequence comprises a consensus sequence R2 or a partial sequence thereof;
preferably, the method further comprises step (d): purifying and/or amplifying, from the product of step (c), the nucleic acid molecules containing the first tag or its complementary sequence and the second tag or its complementary sequence.
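Because each recovered molecule carries a first tag and a second tag, pooled molecules can in principle be sorted by that pair after sequencing. The following Python sketch assumes reads have already been reduced to hypothetical `(first_tag, second_tag, insert)` tuples; the record format, the tag strings and the function name are illustrative and not part of the claims.

```python
from collections import defaultdict


def group_by_tag_pair(records):
    """Group insert sequences by their (first tag, second tag) pair."""
    groups = defaultdict(list)
    for first_tag, second_tag, insert in records:
        groups[(first_tag, second_tag)].append(insert)
    return dict(groups)


# Hypothetical pooled records; the tags and inserts are placeholders.
pooled = [
    ("AAAC", "GGTA", "insert_1"),
    ("AAAC", "GGTA", "insert_2"),
    ("AAAC", "CCAT", "insert_3"),
    ("TTGA", "GGTA", "insert_4"),
]

groups = group_by_tag_pair(pooled)
for pair, inserts in groups.items():
    print(pair, inserts)
```

In this toy input the first two records share both tags and fall into one group, while a record sharing only one of the two tags is kept separate.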
The method of claim 8, comprising:
(a) generating a plurality of labeled second nucleic acid molecules according to the method of claim 4 or 5, and,
(b) recovering and/or combining the plurality of second nucleic acid molecules;
thereby obtaining a nucleic acid molecule library;
preferably, the method further comprises step (c): purifying and/or amplifying, from the product of step (b), the nucleic acid molecules containing the first tag or its complementary sequence and the second tag or its complementary sequence.

The method of claim 8, comprising:
(a) generating a plurality of labeled second nucleic acid molecules according to the method of claim 6 or 7, comprising a plurality of the second nucleic acid molecules a and a plurality of the second nucleic acid molecules b, and,
(b) recovering and/or combining the plurality of second nucleic acid molecules;
thereby obtaining a nucleic acid molecule library;
preferably, after step (b), the method further comprises step (c): randomly fragmenting the second nucleic acid molecules a and adding a linker sequence;
preferably, in step (c), the second nucleic acid molecules a are randomly fragmented by a transposase and a linker sequence is added to their 5' ends;
preferably, the linker sequence comprises a consensus sequence R2 or a partial sequence thereof.
The method of claim 11, wherein the method further comprises step (d): purifying and/or amplifying, from the product of step (c), the nucleic acid molecules containing the first tag or its complementary sequence and the second tag or its complementary sequence;
preferably, step (c) comprises: amplifying the product of step (c) using primer E and primer F, wherein the primer E comprises a consensus sequence P1 and an optional third tag sequence, and the primer F comprises, from 5' to 3': a consensus sequence P2 or its complementary sequence, an optional fourth tag sequence, and a consensus sequence R2.

The method of claim 11 or 12, wherein the method further comprises step (d)': purifying and/or amplifying, from the product of step (b), the second nucleic acid molecules b containing the first tag or its complementary sequence and the second tag or its complementary sequence.
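The optional third and fourth tags in primer E and primer F can be pictured as segments that are simply present or absent. The Python sketch below lays out the segment order only; the function names are illustrative, and placing the third tag after P1 in primer E is an assumption, since the claim recites an order for primer F but not for primer E.

```python
def primer_e(third_tag=None):
    """Primer E: consensus sequence P1 plus an optional third tag.

    Assumption: the optional tag is placed 3' of P1; the claim does
    not state the order for primer E.
    """
    return ("P1",) + ((third_tag,) if third_tag else ())


def primer_f(fourth_tag=None):
    """Primer F, 5'->3': P2 (or its complement), optional fourth tag, R2."""
    return ("P2",) + ((fourth_tag,) if fourth_tag else ()) + ("R2",)


print(primer_e())         # without the optional third tag
print(primer_f("tag4"))   # with the optional fourth tag
```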
A reagent composition having a feature selected from I, II and III:
(I) the reagent composition comprises a second oligonucleotide molecule a, the sequence of which comprises, in order from the 5' end to the 3' end: a consensus sequence P1 or a partial sequence thereof, a second tag sequence, and a consensus sequence R1 or a partial sequence thereof;
and the reagent composition further comprises one or more selected from the following:
(I-a) a plurality of beads each coupled with a plurality of first oligonucleotide molecules a, wherein the first oligonucleotide molecules a contain a first tag sequence;
and the plurality of first oligonucleotide molecules a on the same bead have the same first tag sequence, and the first oligonucleotide molecules a on different beads have first tag sequences different from each other;
preferably, the first oligonucleotide molecule a comprises, from the 5' end to the 3' end: (i) a consensus sequence R1 or a partial sequence thereof, the first tag sequence, and a poly(T) sequence; or, (ii) a consensus sequence R1 or a partial sequence thereof, the first tag sequence, and a sequence complementary to the cDNA 3' terminal overhang;
(I-b) primer A or primer B, wherein the primer A comprises, from the 5' end to the 3' end, a consensus sequence O and a sequence complementary to the cDNA 3' terminal overhang, and the primer B comprises, from the 5' end to the 3' end, a consensus sequence T or a partial sequence thereof and a poly(T) sequence;
(I-c) primer C, wherein the primer C comprises the consensus sequence O or a partial sequence thereof, or the primer C comprises the consensus sequence T or a partial sequence thereof;
(I-d) transposase complex II, wherein the transposase complex II contains a transposase and a transposition sequence that the transposase can recognize and bind to, and can cut or break a double-stranded nucleic acid; and the transposition sequence comprises a transferred strand and a non-transferred strand; wherein the transferred strand comprises a linker sequence; preferably, the linker sequence comprises a consensus sequence R2 or a partial sequence thereof;
(II) the reagent composition comprises a second oligonucleotide molecule b, the sequence of which comprises, in order from the 5' end to the 3' end: a consensus sequence P2 or a partial sequence thereof, a second tag sequence, and a consensus sequence R2 or a partial sequence thereof;
and the reagent composition further comprises one or more selected from the following:
(II-a) a plurality of beads each coupled with a plurality of first oligonucleotide molecules b, wherein the first oligonucleotide molecules b contain a first tag sequence;
and the plurality of first oligonucleotide molecules b on the same bead have the same first tag sequence, and the first oligonucleotide molecules b on different beads have first tag sequences different from each other;
preferably, the first oligonucleotide molecule b comprises, from the 5' end to the 3' end: a consensus sequence P1 or a partial sequence thereof and the first tag sequence;
(II-b) transposase complex I, the transposase complex I being as defined in claim 4;
(II-c) primer D, wherein the primer D comprises the consensus sequence P1 or a partial sequence thereof;
(III) the reagent composition comprises a second oligonucleotide molecule a and a second oligonucleotide molecule b; wherein the sequence of the second oligonucleotide molecule a comprises, in order from the 5' end to the 3' end: a consensus sequence P1 or a partial sequence thereof, a second tag sequence, and a consensus sequence R1 or a partial sequence thereof; and the sequence of the second oligonucleotide molecule b comprises, in order from the 5' end to the 3' end: a consensus sequence P2 or a partial sequence thereof, a second tag sequence, and a consensus sequence R2 or a partial sequence thereof;
and the reagent composition further comprises one or more selected from the following:
(III-a) a plurality of beads each simultaneously coupled with a plurality of the first oligonucleotide molecules a and a plurality of the first oligonucleotide molecules b, wherein the plurality of first oligonucleotide molecules a and the plurality of first oligonucleotide molecules b on the same bead have the same first tag sequence, the first oligonucleotide molecules a on different beads have first tag sequences different from each other, and the first oligonucleotide molecules b on different beads have first tag sequences different from each other;
preferably, the first oligonucleotide molecule a comprises, from the 5' end to the 3' end: (i) a consensus sequence R1 or a partial sequence thereof, the first tag sequence, and a poly(T) sequence; or, (ii) a consensus sequence R1 or a partial sequence thereof, the first tag sequence, and a sequence complementary to the cDNA 3' terminal overhang; and/or, the first oligonucleotide molecule b comprises, from the 5' end to the 3' end: a consensus sequence P1 or a partial sequence thereof and the first tag sequence;
(III-b) primer A or primer B, wherein the primer A comprises, from the 5' end to the 3' end, a consensus sequence O and a sequence complementary to the cDNA 3' terminal overhang, and the primer B comprises, from the 5' end to the 3' end, a consensus sequence T or a partial sequence thereof and a poly(T) sequence;
(III-c) transposase complex I, the transposase complex I being as defined in claim 4;
(III-d) primer C, wherein the primer C comprises the consensus sequence O or a partial sequence thereof, or the primer C comprises the consensus sequence T or a partial sequence thereof;
(III-e) primer D, wherein the primer D comprises the consensus sequence P1 or a partial sequence thereof;
(III-f) primer E and/or primer F, wherein the primer E comprises a consensus sequence P1 and an optional third tag sequence, and the primer F comprises, from 5' to 3': a consensus sequence P2 or its complementary sequence, an optional fourth tag sequence, and a consensus sequence R2 or a partial sequence thereof;
(III-g) transposase complex II, wherein the transposase complex II contains a transposase and a transposition sequence that the transposase can recognize and bind to, and can cut or break a double-stranded nucleic acid; and the transposition sequence comprises a transferred strand and a non-transferred strand; wherein the transferred strand comprises a linker sequence; preferably, the linker sequence comprises a consensus sequence R2 or a partial sequence thereof.

A kit comprising: a multi-reaction system containing a plurality of oligonucleotide molecules, each of which contains a specific tag sequence;
and, in the multi-reaction system, the oligonucleotide molecules within each reaction system have the same tag sequence, and the oligonucleotide molecules of different reaction systems have tag sequences different from each other;
the oligonucleotide molecules further comprise a consensus sequence P1 or a partial sequence thereof, or the oligonucleotide molecules further comprise a consensus sequence P2 or a partial sequence thereof;
the multi-reaction system comprises at least 2 (e.g., at least 3, at least 4, at least 5, at least 8, at least 10, at least 12, at least 20, at least 24, at least 50, at least 96, at least 100, at least 200, at least 384, at least 400, 2-5, 2-10, 2-50, 2-80, 2-100, 2-500, 2-10^3, 2-10^4, 2-10^5, or 2-10^6) reaction systems containing oligonucleotides;
wherein the multi-reaction system is preferably a multi-well plate, and the oligonucleotides may be free or immobilized in the reaction system.

A device for labeling nucleic acid molecules from cells or cell nuclei and/or constructing a nucleic acid molecule library, the device comprising:
a memory; and
a processor coupled to the memory, the processor being configured to execute, based on instructions stored in the memory, the method of any one of claims 1-7 and/or the method of any one of claims 8-13.

A computer-readable storage medium having a computer program stored thereon, characterized in that, when the program is executed by a processor, the method of any one of claims 1-7 and/or the method of any one of claims 8-13 is implemented.
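The kit's requirement that every reaction system holds one tag and that tags differ between reaction systems can be illustrated for a 96-well plate. This Python sketch is a toy assignment under stated assumptions: the 8 x 12 plate layout, the 4-nucleotide tag length and the function name are hypothetical, and the tags are simply enumerated in alphabetical order rather than designed for error tolerance.

```python
from itertools import islice, product


def assign_well_tags(rows="ABCDEFGH", n_cols=12, tag_len=4):
    """Map each well name to a distinct tag sequence.

    Tags are enumerated over A/C/G/T in lexicographic order; a real
    design would normally also space tags apart to tolerate errors.
    """
    wells = [f"{r}{c}" for r in rows for c in range(1, n_cols + 1)]
    if len(wells) > 4 ** tag_len:
        raise ValueError("tag_len is too short for the number of wells")
    tags = ("".join(p) for p in islice(product("ACGT", repeat=tag_len), len(wells)))
    return dict(zip(wells, tags))


plate = assign_well_tags()
print(len(plate), plate["A1"], plate["H12"])
```

With N distinct first tags on beads and W wells carrying distinct second tags, up to N x W tag pairs are available, which is the point of combining the two labeling rounds.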
PCT/CN2023/119479 2023-09-18 2023-09-18 Method and kit for high-throughput tagging of cell nucleic acid molecules Pending WO2025059808A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2023/119479 WO2025059808A1 (en) 2023-09-18 2023-09-18 Method and kit for high-throughput tagging of cell nucleic acid molecules
CN202380014812.9A CN118647729A (en) 2023-09-18 2023-09-18 Method and kit for high-throughput labeling of cellular nucleic acid molecules

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2023/119479 WO2025059808A1 (en) 2023-09-18 2023-09-18 Method and kit for high-throughput tagging of cell nucleic acid molecules

Publications (1)

Publication Number Publication Date
WO2025059808A1 (en) 2025-03-27

Family

ID=92661830

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/119479 Pending WO2025059808A1 (en) 2023-09-18 2023-09-18 Method and kit for high-throughput tagging of cell nucleic acid molecules

Country Status (2)

Country Link
CN (1) CN118647729A (en)
WO (1) WO2025059808A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180340171A1 (en) * 2017-05-26 2018-11-29 10X Genomics, Inc. Single cell analysis of transposase accessible chromatin
CN112005115A (en) * 2018-02-12 2020-11-27 10X基因组学有限公司 Methods to characterize multiple analytes from single cells or cell populations
CN114015755A (en) * 2020-12-31 2022-02-08 中国科学院北京基因组研究所(国家生物信息中心) Method and kit for labeling nucleic acid molecules
CN116064732A (en) * 2017-05-26 2023-05-05 10X基因组学有限公司 Single-cell analysis of transposase-accessible chromatin
CN116694730A (en) * 2022-02-28 2023-09-05 南方科技大学 A method for the construction of a single-cell open chromatin and transcriptome co-sequencing library

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
DATLINGER, P. ET AL.: "Ultra-high-throughput single-cell RNA sequencing and perturbation screening with combinatorial fluidic indexing", NATURE METHODS, vol. 18, 31 May 2021 (2021-05-31), pages 635 - 642, XP037473903, DOI: 10.1038/s41592-021-01153-z *
LI, YUN ET AL.: "FIPRESCI: droplet microfluidics based combinatorial indexing for massive‑scale 5′‑end single‑cell RNA sequencing", GENOME BIOLOGY, vol. 24, 6 April 2023 (2023-04-06), XP093078088, DOI: 10.1186/s13059-023-02893-1 *

Also Published As

Publication number Publication date
CN118647729A (en) 2024-09-13

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23952474

Country of ref document: EP

Kind code of ref document: A1