WO2024250155A1 - Method for constructing single cell sequencing library - Google Patents
- Publication number
- WO2024250155A1 (application PCT/CN2023/098377)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- seq
- index
- cells
- vector
- linker
- Prior art date
- Legal status
- Pending
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6806—Preparing nucleic acids for analysis, e.g. for polymerase chain reaction [PCR] assay
- C12Q1/6869—Methods for sequencing
- C12Q1/6876—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
- C12Q1/6883—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
- C12Q1/6886—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
-
- C—CHEMISTRY; METALLURGY
- C40—COMBINATORIAL TECHNOLOGY
- C40B—COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
- C40B40/00—Libraries per se, e.g. arrays, mixtures
- C40B40/04—Libraries containing only organic compounds
- C40B40/06—Libraries containing nucleotides or polynucleotides, or derivatives thereof
- C40B50/00—Methods of creating libraries, e.g. combinatorial synthesis
- C40B50/06—Biochemical methods, e.g. using enzymes or whole viable microorganisms
Definitions
- the present invention relates to the technical field of gene sequencing, and in particular to a method for pairing analysis of RNA and chromatin accessibility in the same single cell through sequencing.
- Single-cell sequencing technology has developed from single-cell RNA-seq to ultra-high-throughput, multimodal single-cell sequencing.
- G&T-seq detects the single-cell genome and transcriptome in the same cell.
- ScTrio-seq analyzes the relationship between the genome, DNA methylation, and transcriptome of a single mammalian cell, and CITE-seq simultaneously measures epitopes and transcriptomes in a single cell.
- sci-CAR, SNARE-Seq, Paired-Seq, SHARE-Seq and Chromium single-cell multi-omics ATAC + gene expression kits can profile chromatin and RNA in the same single cell. These methods dissect tissue heterogeneity and reveal relevant epigenomic regulatory elements.
- sci-CAR barcodes have low binding and high collision rates
- Paired-Seq has suboptimal labeling and reverse transcription reaction efficiencies when there are too many cells per tube.
- SHARE-Seq requires custom sequencing to read two fragments of the ATAC-seq library, increasing sequencing costs.
- SNARE-Seq uses the Drop-Seq system to encapsulate labeled cells with DNA-barcoded microbeads in nanoliter droplets, resulting in low cell yield (10k per experiment) and a high rate (11.3%) of two or more cells sharing the same barcode.
- The Chromium single-cell multi-omics ATAC + gene expression kit obtains the best joint analysis data, but its cost is high and its throughput is similar to that of SNARE-Seq.
- DSC-seq droplet-based single-cell sequencing
- this application proposes an ultra-high-throughput multimodal single-cell technology, called Parallel-seq, that measures gene expression and chromatin accessibility in the same cell in parallel.
- the present invention provides a single-cell ultra-high-throughput dual-omics technology (single-cell combined fluid labeling (SCIFI)), which can simultaneously measure the gene expression and chromatin accessibility of the same cell.
- SIMFI single-cell combined fluid labeling
- Parallel-Seq only performs four rounds of barcode indexing through one round of ligation reaction and two rounds of amplification reaction
- Parallel-Split-Seq only performs four rounds of barcode indexing through two rounds of ligation reaction and one round of amplification reaction, which realizes the joint analysis of open chromatin and gene expression in the same single cell, and can deconvolve cis-regulatory elements that regulate gene expression.
- a method for constructing a single-cell sequencing library comprising using a transposon to cut open chromatin to obtain a DNA fragment carrying a first linker; adding a reverse transcription primer to reverse transcribe mRNA to obtain a first strand of cDNA carrying a second linker, thereby constructing a chromatin DNA library and a transcriptome library in the same cell.
- the method further comprises placing the cells on a vector, and using a first vector-specific linker to connect the DNA fragment carrying the first linker obtained above and the first strand of cDNA obtained above to the vector respectively.
- the method further comprises synthesizing a second strand of cDNA.
- the method further comprises forming droplets, lysing cells and performing an amplification reaction in the droplets, and preferably, the formed droplets are overloaded with cells.
- the method further comprises purifying DNA and amplifying cDNA and chromatin DNA of the transcriptome library using primers, respectively.
- the method further comprises adding RNase.
- the method further comprises obtaining cells, fixing and permeabilizing the cells.
- a method for constructing a single-cell sequencing library comprises:
- step c) placing the cells on a vector, and using a first vector-specific linker to connect the DNA fragment carrying the first linker obtained in step a) and the first strand of cDNA obtained in step b) to the vector respectively;
- step a) and b) can be performed simultaneously or sequentially.
- step a) can be performed first and then step b), or step b) can be performed first and then step a).
- step a) is performed first and then step b).
- it contains more than 10 transposons, more than 100 transposons, more than 1000 transposons, more than 10000 transposons, and so on.
- the transposon comprises a barcode sequence and a transposase.
- the transposase includes but is not limited to Tn5 transposase, Mu transposase, Tn7 transposase or IS5 transposase.
- the transposase is Tn5 transposase.
- the Tn5 transposase carries a sequence as shown in SEQ ID NO: 1 or 12.
- the barcode sequence comprises a first adapter. Further preferably, the barcode sequence comprises a first index.
- the first adapter comprises a first index and a transposase binding site.
- the first linker comprises at least one linker that is the same or different. Further preferably, the first linker comprises at least 4 linkers that are the same or different. In a specific embodiment of the present invention, it comprises at least 4-96 linkers that are the same or different.
- the barcode sequence is sequentially composed of an overhang, a first index and a transposase binding site from 5′ to 3′, and the overhang is a sequence complementary to a subsequent primer.
- the second linker comprises at least one linker that is the same or different. Further preferably, the second linker comprises at least 4 linkers that are the same or different.
- the reverse transcription primer comprises a second linker, wherein the second linker comprises poly(T) and a first index; and preferably further comprises a random hexamer primer.
- the reverse transcription primer comprises poly(T) and sequences complementary to the first index and subsequent primers.
- the first adapter and the second adapter may contain the same sequence complementary to the subsequent primer.
- the first index comprises at least one of AACAAC, ACCGCA, AGTTGG, CCACGT, CGTGTT, GTTCTC, TGACTA, TCAAGG, AACGGT, AAGCCT, ACATGA, ACTCTA, AGAAGT, AGTACC, ATGCGA, CAATAG, CATCCA, CCTGGA, CGAGAC, CGCTCA, GCGTAA, GGATCG, GTGAGG, TCCTTA, TCTGCC, TTAACC or TTAGTG, or a combination of two, three or more of them.
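As a quick sanity check on an index set such as the first index listed above, the following sketch verifies that the sequences are distinct and of equal length, and reports the minimum pairwise Hamming distance. The sequences are copied from the list above; the function names are illustrative and not part of the application.

```python
from itertools import combinations

# First-index sequences copied from the list in the text.
FIRST_INDEXES = [
    "AACAAC", "ACCGCA", "AGTTGG", "CCACGT", "CGTGTT", "GTTCTC", "TGACTA",
    "TCAAGG", "AACGGT", "AAGCCT", "ACATGA", "ACTCTA", "AGAAGT", "AGTACC",
    "ATGCGA", "CAATAG", "CATCCA", "CCTGGA", "CGAGAC", "CGCTCA", "GCGTAA",
    "GGATCG", "GTGAGG", "TCCTTA", "TCTGCC", "TTAACC", "TTAGTG",
]


def hamming(a: str, b: str) -> int:
    """Count mismatching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def min_pairwise_distance(indexes):
    """Smallest Hamming distance over all pairs of indexes."""
    return min(hamming(a, b) for a, b in combinations(indexes, 2))


if __name__ == "__main__":
    print(len(FIRST_INDEXES), "indexes of length", len(FIRST_INDEXES[0]))
    print("minimum pairwise Hamming distance:",
          min_pairwise_distance(FIRST_INDEXES))
```

A larger minimum distance means that more sequencing errors can be tolerated before one index is mistaken for another.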
- the barcode sequence comprises at least one, two or more combinations of SEQ ID NO: 2 hybridized with SEQ ID NO: 1 or 12 respectively.
- the reverse transcription primer comprises at least one of SEQ ID NO: 3, 4, or a combination of two or more than three.
- the first vector-specific adapter comprises a second index.
- the first vector-specific adapter comprises a UMI.
- the second index comprises AAGACCAA, AAGCTACG, AAGGTCAT, AATAGTGG, AATGCCCTT, ACAATAGC, ACAGGATT, ACCGACCT, ACCTAGAT, ACGAGTCC, ACGGACGA, ACGTTCAA, ACTATCTG, ACTCCGAA, AGAACAGA, AGACGCTT, AGATGCGA, AGCCACTC, AGCGAAGC, AGGTAACG, AGTACATC, AGTGATTC, ATAAGAGG, ATATCACG, ATCGCCGT, ATGACGGA, ATGGAATG, ATTCCTAC, CAACGCCA, CAAGTCTG, CACACATC, CACCTTAT, CAGAACCT, CAGCCGAT, CATACTGT, CATCCACC.
- the first vector-specific adapter comprises a second index, a UMI, and a sequence complementary to a reverse transcription primer or a transposon sequence.
- the first vector-specific adapter is composed of, from 5′ to 3′, a sequence complementary to a reverse transcription primer or a transposon sequence, a UMI, a second index, and a sequence complementary to a sequence contained on the vector.
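The 5′ to 3′ layout described above can be sketched as a simple concatenation. This is only an illustration: the capture and vector sequences below are hypothetical placeholders (the actual sequences are given by the SEQ ID NOs, which are not reproduced here), the 10-nt UMI length is an assumption, and the function names are invented for this example.

```python
import random


def random_umi(length: int, rng: random.Random) -> str:
    """Draw a random nucleotide string to serve as a UMI."""
    return "".join(rng.choice("ACGT") for _ in range(length))


def build_adapter(capture_complement: str, umi: str,
                  second_index: str, vector_complement: str) -> str:
    """Concatenate the four adapter parts in 5'->3' order:
    capture-complement, UMI, second index, vector-complement."""
    parts = (capture_complement, umi, second_index, vector_complement)
    for part in parts:
        if not part or set(part) - set("ACGT"):
            raise ValueError(f"invalid adapter part: {part!r}")
    return "".join(parts)


if __name__ == "__main__":
    rng = random.Random(0)          # seeded only to make the demo repeatable
    umi = random_umi(10, rng)       # assumed UMI length
    adapter = build_adapter(
        "ACGTACGT",                 # hypothetical capture-complement placeholder
        umi,
        "AAGACCAA",                 # one second index from the list in the text
        "TTGGCCAA",                 # hypothetical vector-complement placeholder
    )
    print(adapter)
```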
- the first vector-specific linker comprises SEQ ID NO: 6.
- the vector contains SEQ ID NO: 5.
- the first vector-specific linker comprises SEQ ID NO: 15.
- the vector contains SEQ ID NO: 13.
- the method also includes the steps of forming droplets, lysing cells and performing an amplification reaction in the droplets, preferably, the formed droplets are overloaded with cells. Overloading the droplets so that all functional droplets are used greatly improves the throughput of the microfluidic device. Linear amplification in droplets avoids the purification of unamplified products and can be easily combined with CRISPR screening, DNA methylation analysis, and protein expression analysis, which may lead to single-cell cross-omics sequencing or even whole-omics sequencing of single cells.
- the primers used in the amplification reaction in the droplet include a third index.
- the primer used for the amplification reaction in the droplet comprises SEQ ID NO: 8.
- the step of lysing the droplets is further included after the linear amplification.
- the droplets are broken by using a demulsifier.
- the method comprises using a second vector-specific adapter to connect the DNA fragment carrying the first adapter and the first strand of cDNA obtained above to a vector, respectively.
- the second vector-specific adapter comprises a third index.
- the second vector-specific linker comprises SEQ ID NO: 16.
- the vector contains SEQ ID NO: 14.
- the third index comprises AACCTCTT, AACGTCCGC, AAGAATCG, AAGCGGTG, AAGGAGCT, AATACCGC, AATCTCCA, ACAACTTC, ACACGCAA, ACCACAGT, ACCGTGTA, ACCTTGCC, ACGCATAA, ACGTATGG, ACTAACCA, ACTCAGGT, ACTGTTG, AGAAGTAC, AGAGATGA, AGATTAGG, AGCCTGGT, AGCTCTAA, AGGTGTCT, AGTCCGTT, AGTTCGCA, ATAAGCTC, ATCCATGA, ATCTAGCG, ATGCAACC, ATGTGCAG, ATTGGTAG, CAAGAAGA, CAATGGAC, CACATGCT, CACGGTAG, CAGAGGTT, CAGTATAG, CATCAAGT, CAGTTCC, CCAACAAT, CCAATTAC, CCAGTGAA, CCGATCAG, CCGGTCTT, CGACAACG,
- the method further comprises the step of purifying DNA.
- the primers used in the amplification reaction performed after the DNA purification comprise a fourth index.
- the fourth index comprises one of the P3xx indexes, or a combination of two, three or more of them;
- the fourth index includes one of the N7xx indexes, or a combination of two, three or more of them;
- the fourth index comprises one of the P5xx indexes, or a combination of two, three or more of them;
- the fourth index includes one of the N5xx indexes, or a combination of two, three or more of them.
- the primers used to amplify the transcriptome are SEQ ID NO: 9, 10.
- the primers used to amplify the transcriptome are SEQ ID NO: 20, 18.
- the primers used to amplify the open chromatin fragments are SEQ ID NO: 9, 11.
- the primers used to amplify the open chromatin fragments are SEQ ID NO: 20, 19.
- the carrier comprises a well, a tube or a plate.
- the carrier is an ELISA plate such as a 96-well plate.
- the method further comprises adding RNase.
- RNA is removed from the first-strand cDNA by RNase digestion reaction, and then the second-strand synthesis is performed using random primers, thereby avoiding the destruction of the open chromatin fragments by 0.1N NaOH and the contamination of the RNA-seq library.
- the method further comprises obtaining cells, fixing and permeabilizing the cells.
- a method for constructing a multi-mode single-cell sequencing library comprises the method for constructing a single-cell sequencing library according to the above-mentioned method.
- the third aspect of the present invention provides a method for constructing a transcriptome library, which comprises adding a reverse transcription primer to reverse transcribe mRNA to obtain a first strand of cDNA carrying a second linker; placing cells on a vector, and connecting the obtained first strand of cDNA to the vector using a first vector-specific linker; synthesizing the second strand of cDNA; and purifying and amplifying the cDNA of the transcriptome with primers.
- the reverse transcription primer comprises a second linker, wherein the second linker comprises poly(T) and a first index; and preferably further comprises a random hexamer primer.
- the first vector-specific linker comprises a second index.
- the method further comprises the steps of forming droplets, lysing cells and performing an amplification reaction in the droplets.
- the formed droplets are overloaded with cells
- the primers used in the amplification reaction in the droplet include a third index.
- the method comprises connecting the obtained first-strand cDNA to the vector using a second vector-specific adapter, and preferably, the second vector-specific adapter comprises a third index.
- the primers used in the amplification reaction performed after the DNA purification comprise a fourth index.
- the method further comprises adding RNase.
- the fourth aspect of the present invention provides a method for constructing a chromatin DNA library, which comprises using a transposon to cut open chromatin to obtain a DNA fragment carrying a first linker; placing cells on a vector, and connecting the obtained DNA fragment carrying the first linker to the vector using a first vector-specific linker; purifying the DNA and amplifying the chromatin DNA respectively using primers.
- the transposon comprises a barcode sequence and a transposase; preferably, the barcode sequence comprises a first linker; further preferably, the barcode sequence further comprises a first index.
- the first vector-specific linker comprises a second index.
- the method further comprises the steps of forming droplets, lysing cells and performing an amplification reaction in the droplets.
- the formed droplets are overloaded with cells
- the primers used in the amplification reaction in the droplet include a third index.
- the method comprises connecting the obtained DNA fragment carrying the first linker to the vector using a second vector-specific linker, and preferably, the second vector-specific linker comprises a third index.
- the primers used to amplify the chromatin DNA comprise a fourth index.
- the fifth aspect of the present invention provides a nucleic acid library obtained by the above method.
- the present invention provides a nucleic acid library, wherein the nucleic acid library comprises at least one DNA fragment, and the DNA fragment comprises at least one index and at least one unique molecular identifier.
- the number of indexes is one, two, three, four, five, six, seven, eight, nine, ten or more.
- the index includes a first index, a second index, a third index and/or a fourth index.
- the nucleic acid library comprises at least one DNA fragment that contains, from 5′ to 3′, a fourth index, a fragment DNA, a first index, a second index, and a third index.
- the unique molecular identifier is located between the fourth index and the fragment DNA, between the fragment DNA and the first index, between the first index and the second index, or between the second index and the third index.
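Because each library molecule carries up to four indexes, a cell can be identified by the combination of all of them, and the number of distinguishable combinations is the product of the sizes of the index sets. The sketch below illustrates this arithmetic; the set sizes in the usage lines are hypothetical and the function names are invented for this example.

```python
from math import prod


def composite_barcode(first: str, second: str, third: str, fourth: str) -> str:
    """Join the four per-round indexes into one cell identifier."""
    return "|".join((first, second, third, fourth))


def barcode_space(*index_set_sizes: int) -> int:
    """Number of distinct index combinations across all indexing rounds."""
    if any(n < 1 for n in index_set_sizes):
        raise ValueError("each index set needs at least one sequence")
    return prod(index_set_sizes)


if __name__ == "__main__":
    # Hypothetical set sizes for the first, second, third and fourth index.
    print(barcode_space(8, 96, 96, 12))
    # Example identifier; "N701" is a placeholder name for a fourth index.
    print(composite_barcode("AACAAC", "AAGACCAA", "AACCTCTT", "N701"))
```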
- the seventh aspect of the present invention provides a sequencing method, which comprises constructing the above-mentioned nucleic acid library.
- the eighth aspect of the present invention provides an application of the above-mentioned nucleic acid library, wherein the application includes tumor target screening, disease monitoring or pre-implantation embryo diagnosis.
- the ninth aspect of the present invention provides a method for analyzing chromatin accessibility and transcription in the same cell, wherein the method comprises the steps of constructing a single-cell sequencing library, constructing a transcriptome library, and constructing a chromatin DNA library.
- the tenth aspect of the present invention provides a single-cell multi-omics analysis method, which includes constructing a single-cell sequencing library, constructing a transcriptome library, constructing a chromatin DNA library, and sequencing to obtain chromatin accessibility and/or transcriptome sequence information, and then performing bioinformatics analysis.
- the eleventh aspect of the present invention provides a kit, which includes reagents used to construct the above-mentioned nucleic acid library.
- chromatin accessibility refers to the degree of openness of eukaryotic chromatin DNA to other proteins after nucleosomes, transcription factors and other proteins bind to it. The regions that remain available for binding by other proteins are open chromatin.
- the “carrier” of the present invention can be any object having a solid support surface, and its surface can be modified to couple with cells or nucleic acid molecules. It can be controlled pore glass (CPG), oxalyl-controlled pore glass, TentaGel support (an aminopolyethyleneglycol-derivatized support), polystyrene, Poros (a copolymer of polystyrene/divinylbenzene) or reversibly cross-linked acrylamide. Many other solid supports are commercially available and suitable for the present invention. In some embodiments, it can be polystyrene resin or poly (methyl methacrylate) (PMMA). It can also be a metal.
- CPG controlled pore glass
- PMMA poly (methyl methacrylate)
- the "droplets" of the present invention are oil-in-water or water-in-oil structures. Different droplets may have different identifiers.
- the aqueous mixture is combined with an oil phase.
- the oil phase contains a surfactant.
- the "permeabilization” mentioned in the present invention refers to the technology of changing the permeability of the cell wall and cell membrane without causing cell lysis and destroying the internal organic structure of the cell, so that small molecules and some larger molecules can freely enter and exit the cell. After the permeabilization treatment, the permeability of the cell is improved while the overall structure remains intact, which still has a considerable protective effect on the intracellular enzyme, can ensure that the catalytic effect of the intracellular enzyme is fully exerted, and prolong the service life of the enzyme.
- the "overload” mentioned in the present invention means exceeding the original carrying capacity.
- the original carrying capacity is the conventional carrying capacity in the prior art.
- "overloaded cells in a droplet” means exceeding the amount of cells carried in the original droplet.
- a droplet may be empty, carry a single cell, or be overloaded with cells.
- overloaded cells mean that the number of cells carried in a droplet exceeds one. Preferably, two, three, four, five, six, seven, eight or nine or more cells are carried.
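The definitions above distinguish empty, singly loaded and overloaded droplets. A minimal sketch of how these fractions might be estimated follows, assuming that cells are distributed into droplets according to a Poisson distribution. This is a common modelling assumption that the application itself does not state; `lam` (the mean number of cells per droplet) and the function names are hypothetical.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Probability of exactly k cells in a droplet at mean loading lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def droplet_occupancy(lam: float) -> dict:
    """Fractions of empty, single-cell and overloaded (>1 cell) droplets."""
    if lam < 0:
        raise ValueError("mean loading must be non-negative")
    empty = poisson_pmf(0, lam)
    single = poisson_pmf(1, lam)
    return {"empty": empty, "single": single,
            "overloaded": 1.0 - empty - single}


if __name__ == "__main__":
    for lam in (0.1, 1.0, 5.0):     # illustrative loading levels only
        occ = droplet_occupancy(lam)
        print(lam, {k: round(v, 3) for k, v in occ.items()})
```

Under this model a low mean loading leaves most droplets empty, whereas a high mean loading leaves few droplets unused at the cost of many droplets holding several cells, which is the situation the additional indexes are meant to resolve.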
- the "connector" described in the present invention can be used interchangeably with the adapter in the prior art, and can be used to connect fragmented DNA with an index, or to connect an index with an index, or to connect fragmented DNA with fragmented DNA. It is preferably a nucleotide sequence with a length of 3-1000 bases.
- the “index” described in the present invention can be used interchangeably with terms such as index and barcode in the prior art.
- the index can be a sequence or a combination of several sequences. It is preferably a nucleotide sequence with a length of 3-1000 bases.
- the "unique molecular identifier" mentioned in the present invention is a Unique Molecular Identifier, or UMI for short, which is a randomly designed nucleotide sequence that can specifically identify the molecules it is coupled to. However, not all coupled molecules have a unique UMI. In a specific embodiment, it is combined with other indexes to form a unique molecular identifier.
- UMI Unique Molecular Identifier
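A minimal sketch of how UMIs are commonly used downstream: reads that share the same cell barcode, gene and UMI are collapsed into one molecule. As noted above, a UMI is only unique in combination with the other indexes, which is why the cell barcode is part of the key. The tuple layout and the example values are hypothetical.

```python
from collections import defaultdict


def count_molecules(reads):
    """Collapse reads to molecules.

    reads: iterable of (cell_barcode, gene, umi) tuples.
    Returns {(cell_barcode, gene): number of distinct UMIs}.
    """
    umis_seen = defaultdict(set)
    for cell_barcode, gene, umi in reads:
        umis_seen[(cell_barcode, gene)].add(umi)
    return {key: len(umis) for key, umis in umis_seen.items()}


if __name__ == "__main__":
    example_reads = [
        ("cellA", "GAPDH", "ACGTACGTAC"),
        ("cellA", "GAPDH", "ACGTACGTAC"),   # PCR duplicate, counted once
        ("cellA", "GAPDH", "TTGCATGCAA"),
        ("cellB", "GAPDH", "ACGTACGTAC"),   # same UMI, different cell
    ]
    print(count_molecules(example_reads))
```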
- Complementarity refers to nucleotide sequences that are related by the base pairing rules.
- for example, the sequence 5'-AGT-3' is complementary to the sequence 5'-ACT-3'.
- Complementarity can be partial or complete. Partial complementarity occurs when one or more nucleic acid bases do not match according to the base pairing rules. Complete or total complementarity between nucleic acids occurs when each nucleic acid base matches another base under the base pairing rules. The degree of complementarity between nucleic acid chains has a significant effect on the efficiency and strength of hybridization between nucleic acid chains.
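The base pairing rules above can be checked with a few lines of code. The sketch reproduces the 5'-AGT-3' / 5'-ACT-3' example and counts mismatches for partial complementarity; the function names are illustrative.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, read 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def complementarity_mismatches(a: str, b: str) -> int:
    """Number of positions at which a fails to pair with b (antiparallel)."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, reverse_complement(b)))


if __name__ == "__main__":
    print(reverse_complement("AGT"))                 # the example in the text
    print(complementarity_mismatches("AGT", "ACT"))  # complete complementarity
    print(complementarity_mismatches("AGT", "ACC"))  # partial complementarity
```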
- the "single cell” mentioned in the present invention refers to a single cell or a cell, which can come from a blood sample, a cell culture, or a specific tissue, organ or tumor, etc. Then, it is separated into single cells by conventional separation methods in the prior art.
- doublet or “doublets” mentioned in the present invention refers to the situation where two or more cells share a common identifier, such as an index, a linker, a unique molecular identifier, etc. or a combination thereof.
- nucleic acid refers to DNA, RNA, single-stranded, double-stranded, or more highly aggregated hybridization motifs and any chemical modifications thereof. Modifications include, but are not limited to, those that provide chemical groups incorporating additional charge, polarizability, hydrogen bonding, electrostatic interaction, and points of attachment and action to the nucleic acid ligand bases or to the nucleic acid ligand as a whole.
- Such modifications include, but are not limited to, peptide nucleic acids (PNA), phosphodiester group modifications (e.g., phosphorothioate, methylphosphonate), 2'-position sugar modifications, 5-position pyrimidine modifications, 8-position purine modifications, modifications at exocyclic amines, substitutions of 4-thiouridine, substitutions of 5-bromo or 5-iodo-uracil, backbone modifications, methylation, unusual base pairing combinations such as iso bases, isocytidine and isoguanidine, etc.
- Nucleic acids may also contain non-natural bases, such as nitroindole. Modifications may also include 3' and 5' modifications, including but not limited to capping with fluorophores (e.g., quantum dots) or other moieties.
- the terms “comprising” or “including” described in the present invention are open-ended terms.
- the protein or nucleic acid may be composed of the sequence, or may have additional amino acids or nucleotides at one or both ends of the protein or nucleic acid, but still have the activity described in the present invention.
- the second strand synthesis step in the cell is added to reduce the effect of cross-linked protein inhibition and capture more transcripts.
- Linear amplification based on droplet indexing is achieved and the efficiency of cDNA capture is improved.
- a PCR anchor adapter is provided for cDNA that is different from chromatin fragments to avoid ATAC-seq library contamination of RNA-seq library.
- Parallel-seq overloads droplets with multiple cells to fully utilize the generated droplets, and performs forward and backward indexing to distinguish cells within a droplet, greatly expanding the barcode space. Moreover, the length of the barcode region is significantly reduced, allowing it to read open fragments within the 150nt sequencing read length through the barcode and fixed nucleotide regions.
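To illustrate why indexing before droplet formation makes overloading tolerable, the sketch below estimates how often a cell shares its pre-index with at least one other cell in the same droplet. It assumes that each cell independently carries one of `n_preindexes` equally likely pre-index combinations; this uniform model, the parameter names and all numbers are illustrative assumptions rather than values from the application.

```python
def collision_probability(cells_in_droplet: int, n_preindexes: int) -> float:
    """Probability that a given cell shares its pre-index with at least
    one of the other cells in the same droplet (uniform, independent labels)."""
    if cells_in_droplet < 1 or n_preindexes < 1:
        raise ValueError("both arguments must be at least 1")
    others = cells_in_droplet - 1
    return 1.0 - (1.0 - 1.0 / n_preindexes) ** others


if __name__ == "__main__":
    for k in (2, 5, 10):                   # hypothetical cells per droplet
        for n in (96, 96 * 96):            # hypothetical pre-index space sizes
            p = collision_probability(k, n)
            print(f"cells={k:2d} preindexes={n:5d} collision={p:.4%}")
```

Under these assumptions the collision rate falls roughly in proportion to the size of the pre-index space, so adding an indexing round allows more cells per droplet at the same collision rate.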
- Parallel-Seq first hashes cells with sample-specific barcodes during transposition and reverse transcription, allowing it to evaluate multiple samples in parallel in one experiment and be scalable.
- Parallel-Seq outperforms existing methods in data quality and has increased throughput (36 million cells per experiment), which provides a powerful tool for building affordable large-scale cell maps.
- Parallel-seq can easily handle more samples in an experiment and can be expanded to other omics such as DNA methylation, protein expression, and CRISPR screening.
- Figure 1 Parallel-Seq experimental design diagram, using indexing and droplet overloading to analyze scATAC and scRNA in the same cell, where pool/split represents mixing/dispersion.
- Figure 2 Parallel-Seq was performed using a mixture of NIH/3T3 (mouse), HEK293T (human), and K562 (human) cells, and the results are mapped to the UMI counts of scRNA-seq (top) and scATAC-seq (bottom) of the human and mouse genomes, where mm10 represents the mm10 version of the mouse reference genome.
- Figure 3 Insert length distribution of the scATAC-seq subset of Parallel-Seq.
- Figure 5 Scatter plot showing the log2(count) correlation between Parallel-Seq scATAC-seq and ENCODE DNase-seq in K562 cells.
- Figure 6 Scatter plot showing the log2(TPM+1) correlation between clustered Parallel-Seq scRNA-seq and ENCODE nuclear RNA-seq in K562 cells.
- Figure 7 Comparison of chromatin accessibility captured by Parallel-Seq and ENCODE DNase-seq, and RNA captured by Parallel-Seq and ENCODE RNA-seq in K562 cells.
- Figure 8 Uniform Manifold Approximation and Projection (UMAP) visualization of Parallel-Seq paired gene expression data from a mixture of 3T3, 293T, and K562 cells.
- UMAP Uniform Manifold Approximation and Projection
- Figure 9 Uniform Manifold Approximation and Projection (UMAP) visualization of Parallel-Seq paired chromatin accessibility data from a mixture of 3T3, 293T, and K562 cells.
- Figure 10 Box plots show the number of uniquely mapped RNA reads and the number of uniquely mapped ATAC reads for sci-CAR, SNARE-seq, Paired-Seq, SHARE-seq, and Parallel-Seq.
- the horizontal axis RNA library box diagrams are sci-CAR, SNARE-Seq, Paired-Seq, SHARE-Seq, and Parallel-Seq from left to right
- the ATAC library box diagrams are sci-CAR, SNARE-Seq, Paired-Seq, SHARE-Seq, and Parallel-Seq from left to right.
- Figure 11 The box plot shows the number of genes captured per cell in sci-CAR, SNARE-seq, Paired-Seq, SHARE-seq, and Parallel-Seq.
- the horizontal axis RNA library box diagrams are sci-CAR, SNARE-Seq, Paired-Seq, SHARE-Seq, and Parallel-Seq from left to right.
- Figure 12 Schematic diagram of the Parallel-Split-Seq workflow.
- Figure 13 UMI counts for scRNA-seq (left) and scATAC-seq (right) mapped to the human and mouse genomes. This Parallel-Split-Seq experiment was performed using a mixture of NIH/3T3 (mouse), HEK293T (human), HeLa (human), K562 (human), and THP1 (human) cells.
- Figure 14 Insert length distribution of scATAC-seq fragments in Parallel-Split-Seq and Parallel-Seq.
- Figure 15 Enrichment of scATAC-seq reads around TSSs in Parallel-Split-Seq and Parallel-Seq.
- Figure 16 Scatter plots showing the correlation of log2(TPM+1) between scRNA-seq and ENCODE nuclear RNA-seq from Parallel-Split-Seq in K562 cells (Panel A) and the correlation of log2(count) between scATAC-seq and ENCODE nuclear DNase-seq (Panel B).
- Figure 17 Uniform Manifold Approximation and Projection (UMAP) visualization of Parallel-Split-Seq paired gene expression (left) and chromatin accessibility (right) data from NIH/3T3, HEK293T, HeLa, K562, and THP1 cells.
- Figure 18 Comparative results of chromatin accessibility captured by Parallel-Seq, Parallel-Split-Seq and ENCODE DNase-seq in K562 cells, and comparative results of RNA captured by Parallel-Seq, Parallel-Split-Seq and ENCODE RNA-seq.
- Figure 19 Box plots show the number of uniquely mapped RNA reads and the number of uniquely mapped ATAC reads for sci-CAR, SNARE-seq, Paired-Seq, SHARE-seq, Parallel-Seq, and Parallel-Split-seq.
- the horizontal axis RNA library box diagrams are sci-CAR, SNARE-Seq, Paired-Seq, SHARE-Seq, Parallel-Seq, and Parallel-Split-Seq from left to right
- the ATAC library box diagrams are sci-CAR, SNARE-Seq, Paired-Seq, SHARE-Seq, Parallel-Seq, and Parallel-Split-Seq from left to right.
- Figure 20 The box plot shows the number of genes captured per cell in sci-CAR, SNARE-seq, Paired-Seq, SHARE-seq, Parallel-Seq, and Parallel-Split-seq.
- the horizontal axis RNA library box diagrams are sci-CAR, SNARE-Seq, Paired-Seq, SHARE-Seq, Parallel-Seq, and Parallel-Split-Seq from left to right.
- HEK293T, HeLa-S3 and NIH/3T3 cells were cultured in DMEM (C11995500BT, ThermoFisher) medium supplemented with 10% fetal bovine serum (P30-3302, PAN BIOTECH) at 37°C and 5% CO2.
- the cells were rinsed with PBS (C10010500BT, ThermoFisher) and incubated with 1 mL 0.25% trypsin-EDTA (25200114, ThermoFisher) at 37°C for 3-5 minutes to detach the cells.
- K562 cells were cultured in RPMI 1640 (C11875500BT, ThermoFisher) medium supplemented with 10% fetal bovine serum at 37°C and 5% CO2. Detached HEK293T, HeLa-S3 and NIH/3T3 cells and K562 cell suspensions were collected by centrifugation, washed with PBS and counted using Countstar.
- the tissue was minced into small pieces less than 0.4 mm in a 1.5 mL Eppendorf microcentrifuge tube with scissors.
- the dissociation mixture was incubated at 37°C and rotated horizontally at 90 rpm for 60 minutes.
- the single-cell suspension was filtered through a 70 μm cell strainer (15-1070, BIOLOGIX) and centrifuged at 500 g (centrifugal force) for 5 minutes at 4°C.
- the cells were resuspended in 1 mL PBS and 3 mL red blood cell lysis buffer (4992957, TIANGEN), incubated at room temperature for 5 minutes, and centrifuged at 500 g for 5 minutes at 4°C.
- the cells were resuspended in 500 μL fetal bovine serum. 5 μL of cells were taken, mixed with 5 μL of trypan blue solution (15250061, ThermoFisher), and counted with a C-Chip disposable hemacytometer (DHC-N01N, As One). The single-cell suspension was diluted with fetal bovine serum supplemented with dimethyl sulfoxide (D2650, Sigma Aldrich) to a final concentration of 10%. Cells were cryopreserved, with each tube containing 1x10^6 single cells. Before the experiment, cells were gently thawed at 37°C for 5 min and centrifuged at 500 g for 5 min at 4°C.
- Calcein-AM-positive, 7-AAD-negative, and Annexin V-negative single cells were sorted using MoFloAstrios EQ Cell Sorter (Beckman Coulter). Since tumor cells and T cells are likely to be the major part of the sequencing data, we balanced the samples with less than 5% EpCam-positive cells, 40% T cells, and 55% other single cells.
- Tn5Merev /5Phos/CTGTCTCTTATACACATCT (SEQ ID NO: 21)
- Tn5ME-A, Tn5ME-B and barcoded R1BxME (x represents 1-96) were prepared.
- 10 μM Tn5Merev, 10 μM Tn5ME-A (for Parallel-Split-Seq) or 10 μM Tn5ME-B (for Parallel-Seq) were annealed with 10 μM R1BxME at 95°C for 2 minutes, then cooled gradually to 20°C and then 4°C at 0.1°C/s.
- Count 50k single cells for cell lines, or sort 50k primary cells for each tube of lung cancer sample. Centrifuge the single cells at 500 g for 5 min at 4°C and resuspend them in 250 μL PBS. Add 750 μL PBS containing 1.33% methanol-free formaldehyde (28906, ThermoFisher) and incubate on ice for 10 minutes. Add 50 μL 20% BSA (V0332-100G, VWR) and centrifuge at 1000 g for 3 minutes at 4°C in a swinging-bucket centrifuge, then collect the cells into 1.5 mL microcentrifuge tubes (MCT-150-C, Axygen) and remove the supernatant in two pipetting steps, as in the Omni-ATAC protocol. The results showed that, after adding BSA, more primary cells could be recovered by first pelleting the single cells with swinging-bucket centrifugation.
- 2x RSB was prepared as in Omni-ATAC by mixing 1 mL 1 M Tris-HCl pH 7.4 (T2663-1L, Sigma-Aldrich), 200 μL 5 M NaCl (AM9759, ThermoFisher), 300 μL 1 M MgCl2 (AM9530G, ThermoFisher), and 48.5 mL ultrapure DNase/RNase-free distilled water.
- For the permeabilization buffer, combine 50 μL 2x RSB, 1 μL RiboLock (EO0384, ThermoFisher), 1 μL SUPERase•In RNase inhibitor (AM2696, ThermoFisher), 1 μL 10% Nonidet P40 substitute (1133247301, Sigma-Aldrich), 1 μL 10% Tween 20 (11332465001, Sigma-Aldrich), 1 μL 1% digitonin (D141-100MG, Sigma-Aldrich), 5 μL 20% BSA, and 40 μL ultrapure DNase/RNase-free distilled water per sample.
- Prepare the ATAC-seq reaction solution by mixing 10 μL 5x LM buffer (M0221, Robustnique), 16.5 μL PBS, 0.5 μL RiboLock, 0.5 μL SUPERase•In RNase inhibitor, 0.5 μL 10% Tween 20, 0.5 μL digitonin, and 17.5 μL ultrapure DNase/RNase-free distilled water. Resuspend the permeabilized single cells in 46 μL ATAC-seq reaction solution and add 4 μL barcode-specific transposon to each tube. ATAC-seq reactions were performed at 37°C and 550 r.p.m. with a heated lid.
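As a quick check on the 2x RSB recipe, the final concentrations follow from the dilution relation C1·V1 = C2·V2. The following is a minimal sketch; the variable and function names are illustrative and do not come from the source.

```python
# Stock concentration (M) and volume (mL) of each 2x RSB component, as listed in the recipe.
components = {
    "Tris-HCl pH 7.4": (1.0, 1.0),
    "NaCl": (5.0, 0.2),
    "MgCl2": (1.0, 0.3),
}
water_ml = 48.5

# Total volume is the water plus every stock volume.
total_ml = water_ml + sum(vol for _, vol in components.values())


def final_mM(stock_molar, vol_ml, total_volume_ml):
    """Final concentration in mM after dilution (C1*V1 = C2*V2)."""
    return stock_molar * vol_ml / total_volume_ml * 1000.0


rsb_2x = {name: final_mM(c, v, total_ml) for name, (c, v) in components.items()}
# Working (1x) concentrations are half of the 2x stock.
rsb_1x = {name: conc / 2 for name, conc in rsb_2x.items()}

for name in components:
    print(f"{name}: 2x = {rsb_2x[name]:.1f} mM, 1x = {rsb_1x[name]:.1f} mM")
```

With these volumes the total comes to 50 mL, so the 2x stock is 20 mM Tris-HCl, 20 mM NaCl and 6 mM MgCl2.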
- Parallel Seq uses a ligation reaction to add a second index.
- the ligation adapter contains 7nt complementary strands ligated to the transposon and reverse transcription primers, respectively, as well as a 10nt index strand, an 8nt well-specific adapter, a 10nt UMI, and a universal PCR anchor for droplet linear amplification.
- Prior to intracellular barcode ligation, the ligation adapter was annealed by combining 11 μM of the ligation strand and 12 μM of the barcode strand in a 100 μL reaction volume. The plate was incubated at 95°C for 2 minutes and cooled to 20°C at a rate of -0.1°C per second, and the plate was then divided into 10 ligation plates, with each well containing 10 μL of ligation adapter.
- the second and third indexes were added by ligation reactions.
- the ligation adapters contained a 10nt sequence complementary to the adapter strand, an 8nt well-specific adapter, and a 7nt sequence, which were then ligated.
- the ligation adapters added to the ligation reaction for the third index contained a 10nt index strand, an 8nt well-specific adapter, a 10nt UMI, and a short P3 sequence of the universal PCR primer.
- the second and third round adapters were annealed according to the Parallel-Split-seq protocol and were divided into 10 ligation plates respectively.
- the intracellular connection steps are as follows:
- Ligation reactions were performed according to the Split-seq protocol without RNase inhibitors. Prepare 2 mL 1x NEBuffer 3.1 (B7203S, NEB) and 2 mL ligation solution (500 μL 10x T4 DNA ligation buffer, 100 μL T4 DNA ligase (M0082, Robustnique), 50 μL 10% Triton X-100 and 1350 μL ultrapure DNase/RNase-free distilled water). Resuspend the combined single cells in 1x NEBuffer 3.1 and mix thoroughly with the ligation solution. Add 40 μL of cells from the ligation mixture to each well of the ligation plate. The ligation reaction was rotated at 15 r.p.m. for 1 hour at room temperature.
- Resuspend the cells in the RNase digestion reaction (40 μL 5x RT buffer, 8 μL RNase Cocktail Enzyme Mix (AM2286, ThermoFisher), 8 μL RNase H (Y9220L, Enzymatics) and 144 μL UltraPure DNase/RNase-free distilled water) and incubate at 37°C for 30 min, at 300 r.p.m. for 15 s and then on a mixer for 45 s. Wash out the RNase digestion reaction by adding 790 μL PBS and 10 μL 10% Triton X-100, centrifuge, and remove the supernatant. Do not add BSA in this step: residual BSA will produce fragments with PEG 8000 in the next step.
- Resuspend the cells in the second-strand synthesis reaction mixture (40 μL 5x RT buffer, 48 μL 50% PEG 8000 (B1004SVIAL, NEB), 20 μL 10 mM dNTPs, 2 μL 1 mM dN-P3 short primer (for Parallel-Seq) or dN-P5 short primer (for Parallel-Split-Seq), 5 μL Klenow Exo- (M0212L, NEB) and 85 μL UltraPure DNase/RNase-free distilled water) and incubate at 37°C for 1 hour, at 300 r.p.m. for 15 s and then on a mixer for 45 s.
- Linear amplification was performed as follows: 72 °C for 5 min, 98 °C for 30 sec, then 98 °C for 10 sec, 59 °C for 30 sec, and 72 °C for 1 min, for 12 cycles. Then stored at 15 °C until use.
- ATAC-seq libraries were amplified using SI-PCR Primer B (PN-2000128, 10x Genomics) and N7xx primers.
- RNA-seq libraries were amplified using SI-PCR Primer B (PN-2000128) and P3xx primers. After amplification, ATAC-seq parts were cleaned up using 1.2x SPRI beads and RNA-seq parts were cleaned up using 0.8x SPRI beads.
- PCR amplification mix: 25 μl NEBNext High-Fidelity 2X PCR Master Mix (M0541L, NEB), 2.5 μl N5xx primer, 1.25 μl P5xx primer, 1.25 μl P3xx primer and 15 μl UltraPure DNase/RNase-free distilled water.
- the cycling conditions were 72 °C for 5 min, 98 °C for 30 sec, then 5 cycles of 98 °C for 10 sec, 65 °C for 30 sec and 72 °C for 1 min, followed by a hold at 4 °C.
- the PCR mix was divided into an ATAC-seq part and an RNA-seq part.
- the ATAC-seq part and the RNA-seq part were cleaned up using 1.0x AMPure XP beads (A63881, Beckman Coulter) and 0.8x AMPure XP beads, respectively.
- the PCR products were eluted with 22 μl UltraPure DNase/RNase-free distilled water.
- the second round of PCR amplification was performed by adding 28 μl PCR reaction mixture (for ATAC-seq: 25 μl NEBNext High-Fidelity 2X PCR Master Mix, 1.25 μl N5xx primer, 1.25 μl P3_end primer, 0.5 μl 25x SYBR Green I (S7563, ThermoFisher); for RNA-seq: 25 μl NEBNext High-Fidelity 2X PCR Master Mix, 1.25 μl N5xx primer, 1.25 μl P3_end primer, 0.5 μl 25x SYBR Green I).
- Parallel-Seq libraries were sequenced using the Illumina NovaSeq 6000 sequencing system with 16nt i5 index, 8nt i7 index, and PE150 sequencing.
- Parallel-Split-Seq libraries were sequenced using the Illumina HiSeq X 10 System or NovaSeq 6000 Sequencing System with standard PE150 sequencing with 8nt i5 index and 8nt i7 index.
- RNA-seq library starts at the second strand synthesis annealing site, which is identical to the RNA sequence of the target gene.
- Raw reads were trimmed with cutadapt. Barcodes were parsed by FREE Difference software, allowing only one edit per round of barcoding. Data with embedded end sequences were filtered out from RNA libraries, and data without embedded end sequences were filtered out from ATAC libraries. Data were aligned to hg38, mm10, or the combined genome using STAR.
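To illustrate what allowing one edit per barcoding round can mean in practice, the sketch below matches an observed barcode to a whitelist while tolerating a single substitution. It is an illustrative simplification, not the barcode-parsing software named in the text: it handles substitutions only (no insertions or deletions), and the function names and the four-entry whitelist (taken from the first-index list) are assumptions made for the example.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def correct_barcode(observed, whitelist, max_mismatch=1):
    """Return the unique whitelist barcode within max_mismatch substitutions, else None."""
    if observed in whitelist:
        return observed
    hits = [
        bc for bc in whitelist
        if len(bc) == len(observed) and hamming(observed, bc) <= max_mismatch
    ]
    # Ambiguous matches (more than one candidate) are rejected rather than guessed.
    return hits[0] if len(hits) == 1 else None


# Illustrative whitelist: four of the first-index sequences.
first_index = {"AACAAC", "ACCGCA", "AGTTGG", "CCACGT"}

print(correct_barcode("AACAAC", first_index))  # exact match
print(correct_barcode("AACAAT", first_index))  # one substitution away from AACAAC
print(correct_barcode("GGGGGG", first_index))  # no candidate within one substitution
```

Applying such a correction independently to each barcoding round keeps a single sequencing error from discarding the whole read.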
- For single-cell RNA-seq, a modified Python script from the Split-seq pipeline was used to collapse UMIs and generate digital gene expression matrices. For single-cell ATAC-seq, mitochondrial reads were removed. Enrichment of TSS accessibility was calculated as previously described to assess data quality. Cells with TSS enrichment < 6 were discarded. Tn5 insertions were then counted in 2-kb bins across the genome.
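The ATAC-seq filtering and binning steps can be sketched as follows. This is a minimal illustration under stated assumptions: insertions are given as (cell, chromosome, 0-based position) tuples, TSS enrichment scores are already computed per cell, and a cell scoring exactly at the cutoff of 6 is kept. The function names and input layout are hypothetical and are not taken from the pipeline described above.

```python
from collections import Counter

BIN_SIZE = 2000            # 2-kb bins, as stated in the text
MIN_TSS_ENRICHMENT = 6.0   # cutoff from the text; keeping cells at exactly 6 is an assumption


def keep_cells(tss_enrichment, threshold=MIN_TSS_ENRICHMENT):
    """Return the set of cell barcodes whose TSS enrichment passes the cutoff."""
    return {cell for cell, score in tss_enrichment.items() if score >= threshold}


def bin_insertions(insertions, bin_size=BIN_SIZE):
    """Count Tn5 insertions per (cell, chrom, bin_start) for fixed-width genomic bins."""
    counts = Counter()
    for cell, chrom, pos in insertions:
        bin_start = (pos // bin_size) * bin_size
        counts[(cell, chrom, bin_start)] += 1
    return counts


# Toy input, for illustration only.
scores = {"cellA": 7.2, "cellB": 3.0}
insertions = [
    ("cellA", "chr1", 1999),
    ("cellA", "chr1", 2000),
    ("cellA", "chr1", 3999),
    ("cellB", "chr1", 10),
]

passing = keep_cells(scores)
matrix = bin_insertions(i for i in insertions if i[0] in passing)
print(dict(matrix))
```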
- "X...X" and "N...N" representing bases in the sequences of the present application can represent any natural or modified base type known in the prior art, wherein "X" and "N" can be used interchangeably, including but not limited to A, T, C, G or U.
- V represents A, C or G.
- B represents C, G, T or U.
- X represents an amino acid
- it represents a natural or modified amino acid type known in the prior art.
- transposon-specific barcode sequence Tn5ME-B is shown in SEQ ID NO: 1
- sequence Tn5ME-x (x represents 1-27) with a first index is shown in SEQ ID NO: 2
- XXXXXX in the sequence represents the first index
- Tn5ME-B GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG (SEQ ID NO: 1)
- Tn5ME-x /5Phos/TGCAGTA XXXXXX AGATGTGTATAAGAGACAG (SEQ ID NO: 2)
- R1BxT15VN /5Phos/ TGCAGTAXXXXXXTTTTTTTTTTTTTTTTTVN (SEQ ID NO: 3).
- R1BxN6 /5Phos/TGCAGTA XXXXXX NNNNNN (SEQ ID NO: 4)
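The indexed oligonucleotides are defined as templates in which the run of X characters stands for the first index. A minimal sketch of expanding such a template into concrete sequences is shown below; the helper name is hypothetical, only four of the first-index sequences are used for brevity, and the 5′ phosphate modification is tracked only as a comment.

```python
# Templates with the /5Phos/ modification omitted; XXXXXX marks the first index.
TN5ME_X = "TGCAGTA" + "XXXXXX" + "AGATGTGTATAAGAGACAG"   # SEQ ID NO: 2
R1BXN6 = "TGCAGTA" + "XXXXXX" + "NNNNNN"                 # SEQ ID NO: 4

# Four first-index sequences, used here for illustration.
FIRST_INDEXES = ["AACAAC", "ACCGCA", "AGTTGG", "CCACGT"]


def fill_index(template, index, placeholder="XXXXXX"):
    """Substitute one index into a template containing exactly one placeholder."""
    if template.count(placeholder) != 1:
        raise ValueError("template must contain exactly one placeholder")
    if len(index) != len(placeholder):
        raise ValueError("index length must match placeholder length")
    return template.replace(placeholder, index)


tn5_oligos = {idx: fill_index(TN5ME_X, idx) for idx in FIRST_INDEXES}
for idx, seq in tn5_oligos.items():
    print(idx, seq)
```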
- dscB′ sequence TACTGCACTCAGTGACT (SEQ ID NO: 5)
- the second PCR anchor is attached to the cDNA, wherein the primer used for the second strand synthesis is the p3 short primer.
- p3 short primer CAGACGTGTGCTCTTCCGATCTNNNGGNNNB (SEQ ID NO: 7)
- Lysing cells adding droplet-specific marker p5 adapter, i.e., the third index, for linear amplification in the droplets, as shown in Table 2, wherein the linear amplification primer is shown in (SEQ ID NO: 8), wherein XXXXXXXXXXXXXXXXX is the third index information, which is the specific index of beads in each droplet;
- the purified product in each PCR tube was divided into two parts, and the transcriptome and the open chromatin fragment were amplified using the corresponding primers, wherein the transcriptome was amplified using primers SI-PCR primer B (SEQ ID NO: 9) and P3xx primer (SEQ ID NO: 10), and XXXXXXX in the sequence represents the fourth index of the primer sequence required for amplifying the transcriptome, see P3xx index in Table 2; the open chromatin fragment was amplified using primers SI-PCR primer B (SEQ ID NO: 9) and N7xx primer (SEQ ID NO: 11), and XXXXXXXX in the sequence represents the fourth index of the primer sequence required for amplifying the open chromatin fragment, see N7xx index in Table 1;
- SI-PCR Primer B AATGATACGGCGACCACCGAGA (SEQ ID NO: 9)
- P3xx primer CAAGCAGAAGACGGCATACGAGAT XXXXXXX GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT (SEQ ID NO: 10)
- N7xx primer CAAGCAGAAGACGGCATACGAGAT XXXXXXX GTCTCGTGGGCTCGG (SEQ ID NO: 11)
- The data quality of Parallel-Seq was further compared with sci-CAR, Paired-Seq, SNARE-Seq, and SHARE-Seq.
- Parallel-Seq showed better data quality than the state-of-the-art method SHARE-Seq on both libraries (Figures 10-11), with more ATAC fragments and RNA UMIs, more captured genes, and greater bandwidth than other methods.
- Parallel-Split-Seq was developed by changing where the third index is added in Parallel-Seq: instead of being added during linear amplification in the droplet, the third index is added by an additional round of ligation on the plate. The method still includes linear amplification in the droplet, but no index is added in that step, and the barcode space is 24x96x96x96 ≈ 2.12x10^7 (Figure 12).
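The size of the combinatorial barcode space is simply the product of the number of barcodes used in each indexing round. The sketch below reproduces that product and adds a rough collision estimate; the collision formula assumes that every cell draws a barcode combination uniformly at random, and the cell count passed to it is a hypothetical input, not a figure from the source.

```python
import math

# Number of distinct barcodes in each of the four indexing rounds.
rounds = [24, 96, 96, 96]
barcode_space = math.prod(rounds)
print(barcode_space)  # 21233664, i.e. about 2.12 x 10^7


def expected_collision_fraction(n_cells, n_barcodes):
    """Expected fraction of cells sharing a barcode combination with another cell.

    Assumes uniform, independent barcode assignment (an idealization).
    """
    return 1.0 - (1.0 - 1.0 / n_barcodes) ** (n_cells - 1)


# Hypothetical experiment size, for illustration only.
print(expected_collision_fraction(100_000, barcode_space))
```

Under this idealized model the collision fraction grows roughly linearly with the number of cells as long as the cell count stays far below the barcode space.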
- Parallel-Split-Seq (same steps as in Example 1) was performed using a mixture of NIH/3T3 (mouse), HEK293T (human), HeLa (human), K562 (human) and THP1 (human) cells.
- the specific steps are as follows:
- Tn5ME-A TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG(SEQ ID NO:12)
- each well contains an R2′ sequence (SEQ ID NO: 13) when the second index is added, and each well contains an R3′ sequence (SEQ ID NO: 14) when the third index is added, and ligating the well-specific adapter sequence to the transposed chromatin or the first chain of cDNA, wherein the well-specific adapter sequence with the second index is shown as R2Bx (SEQ ID NO: 15), and XXXXXXX in the sequence represents the second index, as shown in Table 2; the well-specific adapter sequence with the third index is shown as R3Bx (SEQ ID NO: 16), and XXXXXXX in the sequence represents the third index, as shown in Table 2;
- R2′ sequence TACTGCAGCTGAACCTC (SEQ ID NO: 13)
- R3′ sequence TCTCCAAAGCTGTGGAC (SEQ ID NO: 14)
- R2Bx sequence /5Phos/TTGGAGA XXXXXXX GAGGTTCAGC (SEQ ID NO: 15)
- R3Bx sequence CAGACGTGTGCTCTTCCGATCTNNNNNNNNNNNN XXXXXXX GTCCACAGCT (SEQ ID NO: 16).
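Reading the listed sequences together, R2′ and R3′ are the exact reverse complements of the junctions formed when each well-specific adapter is joined to the strand carrying a 5′ phosphate. The sketch below checks this. Describing R2′ and R3′ as bridging (splint) strands is an inference from the sequences rather than a statement in the source, and the helper names are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def revcomp(seq):
    """Reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def splint_for(upstream_3p_end, downstream_5p_end):
    """Strand that base-pairs across a ligation junction (upstream 3' end joined to downstream 5' end)."""
    return revcomp(upstream_3p_end + downstream_5p_end)


# Fixed ends taken from the sequence listing.
TN5_RT_5P = "TGCAGTA"       # 5' end of Tn5ME-x and the RT primers (SEQ ID NO: 2-4)
R2BX_5P = "TTGGAGA"         # 5' end of R2Bx (SEQ ID NO: 15)
R2BX_3P = "GAGGTTCAGC"      # 3' end of R2Bx
R3BX_3P = "GTCCACAGCT"      # 3' end of R3Bx (SEQ ID NO: 16)

R2_PRIME = "TACTGCAGCTGAACCTC"   # SEQ ID NO: 13
R3_PRIME = "TCTCCAAAGCTGTGGAC"   # SEQ ID NO: 14

# Second index: R2Bx 3' end joined to the 5' end of the transposon or RT primer.
print(splint_for(R2BX_3P, TN5_RT_5P) == R2_PRIME)
# Third index: R3Bx 3' end joined to the 5' end of R2Bx.
print(splint_for(R3BX_3P, R2BX_5P) == R3_PRIME)
```

Both comparisons hold for the sequences as listed, which is consistent with each round of ligation being templated by the strand present in the well.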
- Second-strand synthesis with random primers The second PCR anchor point is attached to the cDNA, wherein the primer used for the second-strand synthesis is the p5 short primer.
- P5 short primer ACACGACGCTCTTCCGATCTNNNGGNNNB (SEQ ID NO: 17)
- Each PCR product was purified and divided into two parts.
- the transcriptome and accessible chromatin fragments were amplified using corresponding primers, wherein the transcriptome was amplified using primers p3 end (SEQ ID NO: 20) and P5xx (SEQ ID NO: 18), and XXXXXXX in the P5xx sequence was the fourth index of the amplified transcriptome, as shown in Table 3;
- the open chromatin fragment was amplified using primers p3 end (SEQ ID NO: 20) and N5xx (SEQ ID NO: 19), and XXXXXXXX in the N5xx sequence was the fourth index of the amplified open chromatin fragment, as shown in Table 3.
- N5xx sequence AATGATACGGCGACCACCGAGATCTACAC XXXXXXXX TCGTCGGCAGCGTC (SEQ ID NO: 19)
Abstract
Description
The present invention relates to the technical field of gene sequencing, and in particular to a method for paired analysis of RNA and chromatin accessibility in the same single cell by sequencing.
Single-cell sequencing technology has developed from single-cell RNA-seq to ultra-high-throughput, multimodal single-cell sequencing. G&T-seq detects the genome and transcriptome of the same single cell. scTrio-seq analyzes the relationship between the genome, DNA methylation, and transcriptome of a single mammalian cell, and CITE-seq simultaneously measures epitopes and transcriptomes in single cells.
Among multimodal single-cell sequencing technologies, sci-CAR, SNARE-Seq, Paired-Seq, SHARE-Seq, and the Chromium Single Cell Multiome ATAC + Gene Expression kit can profile chromatin and RNA in the same single cell. These methods dissect tissue heterogeneity and reveal the associated epigenomic regulatory elements. However, sci-CAR has a small barcode combination space and a high collision rate, and Paired-Seq has suboptimal tagmentation and reverse transcription efficiencies when there are too many cells per tube. SHARE-Seq requires custom sequencing to read the two fragments of the ATAC-seq library, which increases sequencing costs. SNARE-Seq uses the Drop-Seq system to encapsulate tagged cells with DNA-barcoded microbeads in nanoliter droplets, resulting in low cell yield (10k per experiment) and a high rate (11.3%) of two or more cells sharing the same barcode. The Chromium Single Cell Multiome ATAC + Gene Expression kit yields the best joint analysis data, but its cost is high and its throughput is similar to that of SNARE-Seq.
In droplet-based single-cell sequencing (dsc-seq) methods, two cells loaded into the same droplet receive the same barcode, forming a doublet, which compromises single-cell data analysis. To avoid doublets, dsc-seq loads far fewer cells than the number of droplets it produces. For example, the 10x Genomics Chromium platform produces about 100k droplets containing barcoded beads and barcoding reagents, but can only recover 10k single cells at a collision rate of about 10%. More than 80% of functional droplets never receive a single cell, which wastes most of the reagents and makes large-scale studies expensive.
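The trade-off between empty droplets and doublets can be illustrated with a simple Poisson loading model. This is a sketch under the assumption that cells are distributed into droplets independently and uniformly; it is meant only to show why most droplets stay empty at low loading, and it is not intended to reproduce the collision rate quoted above, which also depends on factors outside this model. The function names are illustrative.

```python
import math


def poisson_pmf(k, lam):
    """Probability of exactly k events for a Poisson distribution with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def droplet_occupancy(n_cells, n_droplets):
    """Fractions of droplets that are empty, hold one cell, or hold several cells."""
    lam = n_cells / n_droplets          # mean number of cells per droplet
    empty = poisson_pmf(0, lam)
    single = poisson_pmf(1, lam)
    multiple = 1.0 - empty - single
    return {
        "empty": empty,
        "single": single,
        "multiple": multiple,
        # Share of cell-containing droplets that hold more than one cell.
        "multiplet_among_occupied": multiple / (1.0 - empty),
    }


# Loading from the example above: about 10k cells into about 100k droplets.
occ = droplet_occupancy(10_000, 100_000)
for key, value in occ.items():
    print(f"{key}: {value:.4f}")
```

At this loading the model leaves roughly nine in ten droplets empty, in line with the observation that more than 80% of functional droplets never receive a cell. Raising the loading to use those droplets increases the share of multi-cell droplets, which is why additional indexing rounds are needed to tell cells in the same droplet apart.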
Therefore, this application proposes an ultra-high-throughput multimodal single-cell technology, called Parallel-Seq, that measures gene expression and chromatin accessibility in the same cell in parallel.
Summary of the invention
The present invention provides a single-cell ultra-high-throughput dual-omics technology based on single-cell combinatorial fluidic indexing (scifi), which can simultaneously measure gene expression and chromatin accessibility in the same cell. Compared with previous multimodal single-cell analysis methods, Parallel-Seq achieves four rounds of barcode indexing with only one round of ligation and two rounds of amplification, and Parallel-Split-Seq achieves four rounds of barcode indexing with only two rounds of ligation and one round of amplification. Both enable joint analysis of open chromatin and gene expression in the same single cell and allow the cis-regulatory elements that regulate gene expression to be deconvolved. Parallel-Seq and Parallel-Split-Seq were benchmarked with several human and mouse cell lines and applied to primary cells from human lung cancer samples. The results showed that the libraries had good data specificity, high quality, and a large number of captured genes. Moreover, there were very few doublets and the collision rate was extremely low. The method for constructing a single-cell sequencing library in the present application has a very large barcode combination space and can carry out large-scale cell atlas projects at lower cost.
In a first aspect of the present invention, a method for constructing a single-cell sequencing library is provided, the method comprising using a transposon to cut open chromatin to obtain a DNA fragment carrying a first linker, and adding a reverse transcription primer to reverse transcribe mRNA to obtain a first strand of cDNA carrying a second linker, thereby constructing a chromatin DNA library and a transcriptome library in the same cell.
Preferably, the method further comprises placing the cells on a vector, and using a first vector-specific linker to connect the DNA fragment carrying the first linker obtained above and the first strand of cDNA obtained above to the vector, respectively.
Preferably, the method further comprises synthesizing a second strand of cDNA.
Preferably, the method further comprises forming droplets, lysing the cells, and performing an amplification reaction in the droplets; preferably, the droplets formed are overloaded with cells.
Preferably, the method further comprises purifying the DNA and using primers to separately amplify the cDNA of the transcriptome library and the chromatin DNA.
Preferably, the method further comprises adding RNase.
Preferably, the method further comprises obtaining cells, and fixing and permeabilizing the cells.
In a specific embodiment of the present invention, a method for constructing a single-cell sequencing library comprises:
a) using a transposon to cut open chromatin to obtain a DNA fragment carrying a first linker;
b) adding a reverse transcription primer to reverse transcribe the mRNA to obtain a first strand of cDNA carrying a second linker;
c) placing the cells on a vector, and using a first vector-specific linker to connect the DNA fragment carrying the first linker obtained in step a) and the first strand of cDNA obtained in step b) to the vector, respectively;
d) synthesizing the second strand of cDNA;
e) using primers to separately amplify the cDNA of the transcriptome library and the chromatin DNA.
Steps a) and b) can be performed simultaneously or sequentially. For example, step a) can be performed before step b), or step b) can be performed before step a).
Preferably, step a) is performed first and then step b).
Preferably, more than 10 transposons, more than 100 transposons, more than 1,000 transposons, more than 10,000 transposons, and so on, are included.
The transposon comprises a barcode sequence and a transposase.
The transposase includes but is not limited to Tn5 transposase, Mu transposase, Tn7 transposase or IS5 transposase. In a specific embodiment of the present invention, the transposase is Tn5 transposase. The Tn5 transposase carries a sequence as shown in SEQ ID NO: 1 or 12.
The barcode sequence comprises a first linker. Further preferably, the barcode sequence comprises a first index. The first linker comprises a first index and a transposase binding site.
The first linker comprises at least one linker, the linkers being the same or different. Further preferably, the first linker comprises at least 4 linkers that are the same or different. In a specific embodiment of the present invention, it comprises at least 4-96 linkers that are the same or different.
The barcode sequence consists, from 5′ to 3′, of an overhang, a first index and a transposase binding site. The overhang is a sequence complementary to a subsequent primer.
Preferably, the second linker comprises at least one linker, the linkers being the same or different. Further preferably, the second linker comprises at least 4 linkers that are the same or different.
The reverse transcription primer comprises a second linker, and the second linker comprises poly(T) and a first index; preferably, it further comprises a random hexamer primer.
In a specific embodiment of the present invention, the reverse transcription primer comprises poly(T), a first index, and a sequence complementary to a subsequent primer.
In a specific embodiment of the present invention, the first linker and the second linker may contain the same sequence complementary to the subsequent primer.
In a specific embodiment of the present invention, the first index comprises at least one, or a combination of two, three or more, of AACAAC, ACCGCA, AGTTGG, CCACGT, CGTGTT, GTTCTC, TGACTA, TCAAGG, AACGGT, AAGCCT, ACATGA, ACTCTA, AGAAGT, AGTACC, ATGCGA, CAATAG, CATCCA, CCTGGA, CGAGAC, CGCTCA, GCGTAA, GGATCG, GTGAGG, TCCTTA, TCTGCC, TTAACC and TTAGTG.
In a specific embodiment of the present invention, the barcode sequence comprises at least one, or a combination of two, three or more, of the products of hybridizing SEQ ID NO: 2 with SEQ ID NO: 1 or 12, respectively.
In a specific embodiment of the present invention, the reverse transcription primer comprises at least one, or a combination of two, three or more, of SEQ ID NO: 3 and 4.
The first vector-specific linker comprises a second index.
The first vector-specific linker comprises a UMI.
Preferably, the second index comprises at least one, or a combination of two, three or more, of AAGACCAA, AAGCTACG, AAGGTCAT, AATAGTGG, AATGCCTT, ACAATAGC, ACAGGATT, ACCGACCT, ACCTAGAT, ACGAGTCC, ACGGACGA, ACGTTCAA, ACTATCTG, ACTCCGAA, AGAACAGA, AGACGCTT, AGATGCGA, AGCCACTC, AGCGAAGC, AGGTAACG, AGTACATC, AGTGATTC, ATAAGAGG, ATATCACG, ATCGCCGT, ATGACGGA, ATGGAATG, ATTCCTAC, CAACGCCA, CAAGTCTG, CACACATC, CACCTTAT, CAGAACCT, CAGCCGAT, CATACTGT, CATCCACC, CATTGAGC, CCAAGCGT, CCACGACT, CCATTGTC, CCGCATGT, CCTACTCC, CCTCCTTG, CCTTAATG, CGAATATC, CGAGAGCA, CGCCTCAA, CGCGTTAC, CGGACTCT, CGGTTGTT, CGTAGCTT, CGTGCCAA, CTACCGGA, CTAGCAGT, CTCAGCCT, CTCTTCTA, CTGCTGGT, CTGTATTC, CTTCGCTC, GAAGAGTA, GACACCTA, GACGTGAG, GACTTACT, GAGGACAA, GAGTTAAG, GATCCTCG, GCAATCCG, GCAGTGTG, GCCGCTAA, GCGACCAT, GCTAAGAC, GCTGTAGG, GGAACTGG, GGACAGTT, GGATTGCT, GGTCCTAA, GTACCTGT, GTCAAGGA, GTCTGCTT, GTGCTCCA, GTGTGACC, GTTATTGG, TAATTCGG, TACCAATC, TAGACTCC, TAGTCAAC, TCACGTTG, TCAGAATG, TCCAGCTT, TCCTGCGA, TCGGTTCC, TCTTACCT, TGACATGG, TGCCTATA, TGGTGTGG and TGTACTAG.
Preferably, the first vector-specific linker comprises a second index, a UMI, and a sequence complementary to the reverse transcription primer or the transposon sequence.
In a specific embodiment of the present invention, the first vector-specific linker consists, from 5′ to 3′, of a sequence complementary to the reverse transcription primer or the transposon sequence, a UMI, a second index, and a sequence complementary to a sequence contained on the vector.
In a specific embodiment of the present invention, the first vector-specific linker comprises SEQ ID NO: 6.
In a specific embodiment of the present invention, the vector contains SEQ ID NO: 5.
In another specific embodiment of the present invention, the first vector-specific linker comprises SEQ ID NO: 15.
In another specific embodiment of the present invention, the vector contains SEQ ID NO: 13.
所述的方法还包括形成液滴,裂解细胞并在液滴中进行扩增反应的步骤,优选的,形成的液滴中过载细胞。使液滴过载,使所有功能液滴都被使用,大大提高了微流体设备的通量。在液滴中进行线性扩增避免了未扩增产物的纯化,并且可以轻松地结合CRISPR筛选、DNA甲基化分析、蛋白质表达分析,这可能会导致单细胞跨组学测序甚至单个细胞的全组学测序。The method also includes the steps of forming droplets, lysing cells and performing an amplification reaction in the droplets, preferably, the formed droplets are overloaded with cells. Overloading the droplets so that all functional droplets are used greatly improves the throughput of the microfluidic device. Linear amplification in droplets avoids the purification of unamplified products and can be easily combined with CRISPR screening, DNA methylation analysis, and protein expression analysis, which may lead to single-cell cross-omics sequencing or even whole-omics sequencing of single cells.
优选的,所述在液滴中进行扩增反应使用的引物包含第三索引。Preferably, the primers used in the amplification reaction in the droplet include a third index.
在本发明的一个具体实施方式中,在液滴中进行扩增反应使用的引物包含SEQ ID NO:8。In one specific embodiment of the present invention, the primer used for the amplification reaction in the droplet comprises SEQ ID NO: 8.
Preferably, the method further comprises a step of breaking the droplets after the linear amplification. In a specific embodiment of the present invention, the droplets are broken using a demulsifier.
优选的,所述的方法包括利用第二载体特异性接头分别将上述获得的携带第一接头的DNA片段和上述获得的cDNA第一条链连接至载体上。优选的,所述的第二载体特异性接头包含第三索引。Preferably, the method comprises using a second vector-specific adapter to connect the DNA fragment carrying the first adapter and the first strand of cDNA obtained above to a vector, respectively. Preferably, the second vector-specific adapter comprises a third index.
在本发明的一个具体实施方式中,第二载体特异性接头包含SEQ ID NO:16。In a specific embodiment of the present invention, the second vector-specific linker comprises SEQ ID NO: 16.
在本发明的一个具体实施方式中,所述载体上包含SEQ ID NO:14。In a specific embodiment of the present invention, the vector contains SEQ ID NO: 14.
In a specific embodiment of the present invention, the third index comprises at least one of, or a combination of two, or of three or more, of the following: AACCTCTT, AACGTCGC, AAGAATCG, AAGCGGTG, AAGGAGCT, AATACCGC, AATCTCCA, ACAACTTC, ACACGCAA, ACCACAGT, ACCGTGTA, ACCTTGCC, ACGCATAA, ACGTATGG, ACTAACCA, ACTCAGGT, ACTTGTTG, AGAAGTAC, AGAGATGA, AGATTAGG, AGCCTGGT, AGCTCTAA, AGGTGTCT, AGTCCGTT, AGTTCGCA, ATAAGCTC, ATCCATGA, ATCTAGCG, ATGCAACC, ATGTGCAG, ATTGGTAG, CAAGAAGA, CAATGGAC, CACATGCT, CACGGTAG, CAGAGGTT, CAGTATAG, CATCAAGT, CATGTTCC, CCAACAAT, CCAATTAC, CCAGTGAA, CCGATCAG, CCGGTCTT, CGACAACG, CGCCAGTA, CGCGGAAT, CGGAAGGA, CGGTGAGA, CGTAACAC, CGTCTATG, CGTTCTCG, CTACTAAG, CTAGTGCG, CTCTGACA, CTGATGAA, CTGGTACA, CTTACGAG, GAACTCAA, GAATGTTG, GACGAATT, GACTGCCA, GAGCTATT, GAGTCGGA, GATAGAAC, GATGGTCT, GCAGCACT, GCATTCAT, GCCTCTGT, GCGCAGAT, GCTCACAA, GCTTGCGT, GTAATGCA, GTATCGAG, GTCGATCT, GTGAGCGT, GTGGATAG, GTTAGCCA, TAAGGTGG, TACACCGG, TACTCGTC, TAGCTGAG, TCAACAGG, TCACTCAC, TCATAGAC, TCCGTACA, TCGGAGTA, TCGTCGGT, TGAACGCG, TGAGTCTT, TGCGACTG, TGGTTATC, TGTGTAAG, TTAGGAAC, TTCAGTGG, and TTCTATCC.
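Each third index listed above is an 8-nt sequence over A, C, G and T, so a simple check can catch transcription errors in an index list. The following is a minimal sketch and not part of the disclosed method; the function names `is_valid_index` and `hamming` are hypothetical, and only a few of the listed indexes are used for illustration.

```python
VALID_BASES = set("ACGT")
INDEX_LENGTH = 8  # length of each third index in the list above


def is_valid_index(seq, length=INDEX_LENGTH):
    """Return True if seq has the expected length and only A/C/G/T bases."""
    return len(seq) == length and set(seq) <= VALID_BASES


def hamming(a, b):
    """Number of mismatched positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


# Illustrative subset of the third indexes listed above.
third_indexes = ["AACCTCTT", "AACGTCGC", "AAGAATCG", "AAGCGGTG"]

all_valid = all(is_valid_index(s) for s in third_indexes)
all_distinct = len(set(third_indexes)) == len(third_indexes)
```

A sequence with an inserted or missing base fails the length check, and `hamming` can be used to see how many sequencing errors would be needed to turn one index into another.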
优选的,所述的方法还包括纯化DNA的步骤。Preferably, the method further comprises the step of purifying DNA.
所述纯化DNA后进行的扩增反应中的引物包含第四索引。The primers used in the amplification reaction performed after the DNA purification comprise a fourth index.
Preferably, the fourth index comprises at least one P3xx index, or a combination of two, or of three or more, P3xx indexes;
Preferably, the fourth index comprises at least one N7xx index, or a combination of two, or of three or more, N7xx indexes;
Preferably, the fourth index comprises at least one P5xx index, or a combination of two, or of three or more, P5xx indexes;
Preferably, the fourth index comprises at least one N5xx index, or a combination of two, or of three or more, N5xx indexes.
为增加第四索引(例如P3xx索引),扩增转录组所用引物为SEQ ID NO:9、10。To add a fourth index (e.g., P3xx index), the primers used to amplify the transcriptome are SEQ ID NO: 9, 10.
为增加第四索引(例如P5xx),扩增转录组所用引物为SEQ ID NO:20、18。To add the fourth index (e.g. P5xx), the primers used to amplify the transcriptome are SEQ ID NO: 20, 18.
为增加第四索引(例如N7xx),扩增开放染色质片段所用引物为SEQ ID NO:9、11。To add the fourth index (e.g. N7xx), the primers used to amplify the open chromatin fragments are SEQ ID NO: 9, 11.
为增加第四索引(例如N5xx),扩增开放染色质片段所用引物为SEQ ID NO:20、19。To add the fourth index (e.g. N5xx), the primers used to amplify the open chromatin fragments are SEQ ID NO: 20, 19.
在本发明的一个具体实施方式中,所述的载体包括孔、管或平板。In a specific embodiment of the present invention, the carrier comprises a well, a tube or a plate.
Preferably, the carrier is a microplate, for example a 96-well plate.
优选的,所述的方法还包括加入RNA酶。通过RNase酶切反应,从第一链cDNA中去除RNA,然后用随机引物进行第二链合成,避免了开放染色质片段被0.1N NaOH破坏和RNA-seq库被污染。Preferably, the method further comprises adding RNase. RNA is removed from the first-strand cDNA by RNase digestion reaction, and then the second-strand synthesis is performed using random primers, thereby avoiding the destruction of the open chromatin fragments by 0.1N NaOH and the contamination of the RNA-seq library.
优选的,所述的方法还包括获得细胞,将细胞固定并透化。Preferably, the method further comprises obtaining cells, fixing and permeabilizing the cells.
In a second aspect, the present invention provides a method for constructing a multimodal single-cell sequencing library, the method comprising the above-described method for constructing a single-cell sequencing library.
In a third aspect, the present invention provides a method for constructing a transcriptome library, the method comprising: adding a reverse transcription primer to reverse transcribe mRNA to obtain first-strand cDNA carrying a second linker; placing cells on a vector and connecting the obtained first-strand cDNA to the vector using a first vector-specific linker; synthesizing second-strand cDNA; and purifying the cDNA and amplifying the transcriptome cDNA with primers.
优选的,所述的逆转录引物包含第二接头,所述的第二接头包含poly(T)和第一索引;优选还包含随机六聚体引物。Preferably, the reverse transcription primer comprises a second linker, wherein the second linker comprises poly(T) and a first index; and preferably further comprises a random hexamer primer.
优选的,所述的第一载体特异性接头包含第二索引。Preferably, the first vector-specific linker comprises a second index.
优选的,所述的方法还包括形成液滴,裂解细胞并在液滴中进行扩增反应的步骤,Preferably, the method further comprises the steps of forming droplets, lysing cells and performing an amplification reaction in the droplets.
优选的,形成的液滴中过载细胞;Preferably, the formed droplets are overloaded with cells;
优选的,在液滴中进行扩增反应使用的引物包含第三索引。Preferably, the primers used in the amplification reaction in the droplet include a third index.
优选的,所述的方法包括利用第二载体特异性接头将获得的cDNA第一条链连接至载体上,优选的,所述的第二载体特异性接头包含第三索引。Preferably, the method comprises connecting the obtained first-strand cDNA to the vector using a second vector-specific adapter, and preferably, the second vector-specific adapter comprises a third index.
优选的,所述纯化DNA后进行的扩增反应中的引物包含第四索引。Preferably, the primers used in the amplification reaction performed after the DNA purification comprise a fourth index.
优选的,所述的方法还包括加入RNA酶。Preferably, the method further comprises adding RNase.
In a fourth aspect, the present invention provides a method for constructing a chromatin DNA library, the method comprising: cutting open chromatin with a transposon to obtain DNA fragments carrying a first linker; placing cells on a vector and connecting the obtained DNA fragments carrying the first linker to the vector using a first vector-specific linker; and purifying the DNA and amplifying the chromatin DNA with primers.
优选的,所述的转座子包含条形码序列和转座酶;优选的,所述的条形码序列包含第一接头;进一步优选的,所述的条形码序列还包含第一索引。Preferably, the transposon comprises a barcode sequence and a transposase; preferably, the barcode sequence comprises a first linker; further preferably, the barcode sequence further comprises a first index.
优选的,所述的第一载体特异性接头包含第二索引。Preferably, the first vector-specific linker comprises a second index.
优选的,所述的方法还包括形成液滴,裂解细胞并在液滴中进行扩增反应的步骤,Preferably, the method further comprises the steps of forming droplets, lysing cells and performing an amplification reaction in the droplets.
优选的,形成的液滴中过载细胞;Preferably, the formed droplets are overloaded with cells;
优选的,所述在液滴中进行扩增反应使用的引物包含第三索引。Preferably, the primers used in the amplification reaction in the droplet include a third index.
优选的,所述的方法包括利用第二载体特异性接头将获得的携带第一接头的DNA片段连接至载体上,优选的,所述的第二载体特异性接头包含第三索引。Preferably, the method comprises connecting the obtained DNA fragment carrying the first linker to the vector using a second vector-specific linker, and preferably, the second vector-specific linker comprises a third index.
优选的,扩增染色质DNA所用的引物包含第四索引。Preferably, the primers used to amplify the chromatin DNA comprise a fourth index.
本发明的第五方面,提供了一种上述的方法获得的核酸文库。The fifth aspect of the present invention provides a nucleic acid library obtained by the above method.
本发明的第六方面,提供了一种核酸文库,所述的核酸文库包含至少一个片段DNA,所述的片段DNA包含至少一个索引,和至少一个独特分子标识。In a sixth aspect, the present invention provides a nucleic acid library, wherein the nucleic acid library comprises at least one DNA fragment, and the DNA fragment comprises at least one index and at least one unique molecular identifier.
Preferably, the number of indexes is one, two, three, four, five, six, seven, eight, nine, or ten or more.
优选的,所述的索引包括第一索引、第二索引、第三索引和/或第四索引。Preferably, the index includes a first index, a second index, a third index and/or a fourth index.
In a specific embodiment of the present invention, the nucleic acid library comprises at least one molecule that contains, from 5′ to 3′, a fourth index, fragment DNA, a first index, a second index, and a third index.
优选的,所述的独特分子标识位于第四索引与片段DNA之间,片段DNA与第一索引之间,第一索引与第二索引之间或者第二索引与第三索引之间。Preferably, the unique molecular identifier is located between the fourth index and the fragment DNA, between the fragment DNA and the first index, between the first index and the second index, or between the second index and the third index.
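The 5′ to 3′ element order described in this embodiment can be sketched as a small data structure. This is an illustration only: the class name `LibraryMolecule`, the placeholder values, and the idea of joining the four indexes into one composite label are assumptions made for the example, and the UMI is omitted because its position may vary as stated above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LibraryMolecule:
    fourth_index: str
    fragment: str
    first_index: str
    second_index: str
    third_index: str

    def layout(self):
        """Element names in 5'->3' order, as described in the embodiment."""
        return ["fourth_index", "fragment", "first_index",
                "second_index", "third_index"]

    def composite_label(self):
        """Join the four indexes into one label (hypothetical convention)."""
        return "-".join([self.first_index, self.second_index,
                         self.third_index, self.fourth_index])


# Placeholder values; none of these are sequences from the disclosure
# except the two 8-nt third-index examples reused here for illustration.
molecule = LibraryMolecule(
    fourth_index="IDX4",
    fragment="ACGTACGTACGT",
    first_index="IDX1",
    second_index="AACCTCTT",
    third_index="AACGTCGC",
)
label = molecule.composite_label()
```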
本发明的第七方面,提供了一种测序方法,所述的测序方法包括构建上述的核酸文库。The seventh aspect of the present invention provides a sequencing method, which comprises constructing the above-mentioned nucleic acid library.
本发明的第八方面,提供了一种上述的核酸文库的应用,所述的应用包括肿瘤靶点筛选、疾病监测或植入前胚胎诊断。The eighth aspect of the present invention provides an application of the above-mentioned nucleic acid library, wherein the application includes tumor target screening, disease monitoring or pre-implantation embryo diagnosis.
本发明的第九方面,提供了一种同一细胞中分析染色质可接近性和转录的方法,所述的方法包括上述构建单细胞测序文库、上述构建转录组文库、上述构建染色质DNA文库的步骤。The ninth aspect of the present invention provides a method for analyzing chromatin accessibility and transcription in the same cell, wherein the method comprises the steps of constructing a single-cell sequencing library, constructing a transcriptome library, and constructing a chromatin DNA library.
本发明的第十方面,提供了一种单细胞多组学的分析方法,所述的分析方法包括构建单细胞测序文库、上述构建转录组文库、上述构建染色质DNA文库,并进行测序获得染色质可接近性和/或转录组序列信息,然后进行生物信息学分析。The tenth aspect of the present invention provides a single-cell multi-omics analysis method, which includes constructing a single-cell sequencing library, constructing a transcriptome library, constructing a chromatin DNA library, and sequencing to obtain chromatin accessibility and/or transcriptome sequence information, and then performing bioinformatics analysis.
本发明的第十一方面,提供了一种试剂盒,所述的试剂盒包括构建上述核酸文库所用的试剂。 The eleventh aspect of the present invention provides a kit, which includes reagents used to construct the above-mentioned nucleic acid library.
本发明所述的“染色质可接近性”即真核生物染色质DNA在核小体或转录因子等蛋白与其结合后,对其他蛋白能否再结合的开放程度。其中,可以对其他蛋白再结合的区域即为开放染色质。The "chromatin accessibility" of the present invention refers to the degree of openness of eukaryotic chromatin DNA to other proteins after nucleosomes or transcription factors and other proteins bind to it. Among them, the region that can be re-bound to other proteins is open chromatin.
本发明所述的“载体”可以为任何具有固体支持物表面的物体,其表面可以经过修饰与细胞或核酸分子偶联。其可以为孔玻璃(CPG)、草酰-调孔玻璃、TentaGel支持物-一种氨基聚乙二醇衍生化支持物、聚苯乙烯,Poros(一种聚苯乙烯/二乙烯基苯的共聚物)或可逆交联的丙烯酰胺。很多其它固体支持物市售可得且适用于本发明。在一些实施方式中,可以为聚苯乙烯树脂或聚(甲基丙烯酸甲酯)(PMMA)。也可以是金属。The "carrier" of the present invention can be any object having a solid support surface, and its surface can be modified to couple with cells or nucleic acid molecules. It can be porous glass (CPG), oxalyl-adjusted pore glass, TentaGel support-an amino polyethylene glycol derivatized support, polystyrene, Poros (a copolymer of polystyrene/divinylbenzene) or reversibly cross-linked acrylamide. Many other solid supports are commercially available and suitable for the present invention. In some embodiments, it can be polystyrene resin or poly (methyl methacrylate) (PMMA). It can also be a metal.
本发明所述的“液滴”为水包油或油包水结构。不同的液滴可以具有不同的标识。优选为水性混合物与油相合并。优选的,所述的油相为表面活性剂。The "droplets" of the present invention are oil-in-water or water-in-oil structures. Different droplets may have different identifiers. Preferably, the aqueous mixture is combined with an oil phase. Preferably, the oil phase is a surfactant.
本发明所述的“透化”是指在不造成细胞裂解以及不破坏细胞内部有机结构的情况下改变细胞壁和细胞膜的通透性,使得小分子物质和一些较大分子物质能够自由地进出细胞的技术。细胞经过透性化处理后在提高通透性的同时,整体结构保持完整,对胞内酶仍具有相当的保护作用,可保证胞内酶催化作用的充分发挥,并延长酶的使用寿命。The "permeabilization" mentioned in the present invention refers to the technology of changing the permeability of the cell wall and cell membrane without causing cell lysis and destroying the internal organic structure of the cell, so that small molecules and some larger molecules can freely enter and exit the cell. After the permeabilization treatment, the permeability of the cell is improved while the overall structure remains intact, which still has a considerable protective effect on the intracellular enzyme, can ensure that the catalytic effect of the intracellular enzyme is fully exerted, and prolong the service life of the enzyme.
"Overloading" as used in the present invention means exceeding the original loading capacity, that is, the loading that is conventional in the prior art. For example, "overloading cells in a droplet" means loading more cells than a droplet conventionally carries. In the prior art, a droplet may be empty, may carry a single cell, or may be overloaded with cells, where overloaded means that one droplet carries more than one cell. Preferably, a droplet carries two, three, four, five, six, seven, eight, or nine or more cells.
The "linker" described in the present invention is used interchangeably with "adapter" in the prior art. It can be used to connect fragmented DNA with an index, an index with another index, or fragmented DNA with fragmented DNA. It is preferably a nucleotide sequence 3-1000 bases in length.
本发明所述的“索引”与现有技术中的index、barcode等可以互换使用。所述的索引可以为一段序列或几段序列的组合。其优选为一段长度为3-1000个碱基的核苷酸序列。The "index" described in the present invention can be used interchangeably with index, barcode, etc. in the prior art. The index can be a sequence or a combination of several sequences. It is preferably a nucleotide sequence with a length of 3-1000 bases.
本发明所述的“独特分子标识”即Unique Molecular Identifier,简称UMI,其为随机设计的一段核苷酸序列,可以专一性的辨识其偶联的分子,但是并不是所有偶联的分子都具有唯一的UMI,在一个具体实施方式中,其与其他索引组合形成一个唯一的分子标识。The "unique molecular identifier" mentioned in the present invention is a Unique Molecular Identifier, or UMI for short, which is a randomly designed nucleotide sequence that can specifically identify the molecules it is coupled to. However, not all coupled molecules have a unique UMI. In a specific embodiment, it is combined with other indexes to form a unique molecular identifier.
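Because a UMI alone is not guaranteed to be unique, molecules are typically distinguished by the UMI together with the other indexes. The following is a minimal sketch of that idea under stated assumptions: the function name `count_molecules`, the read tuples, and the use of a gene name as part of the key are all hypothetical and are not taken from the disclosure.

```python
from collections import defaultdict


def count_molecules(reads):
    """Count distinct (UMI, gene) pairs for each index combination.

    reads: iterable of (indexes, umi, gene), where indexes is a tuple of
    index sequences or labels.
    """
    seen = defaultdict(set)
    for indexes, umi, gene in reads:
        seen[indexes].add((umi, gene))
    return {label: len(molecules) for label, molecules in seen.items()}


# Hypothetical reads for illustration.
reads = [
    (("B01", "AACCTCTT"), "ACGTAC", "GAPDH"),
    (("B01", "AACCTCTT"), "ACGTAC", "GAPDH"),  # duplicate of the first read
    (("B01", "AACCTCTT"), "TTGCAA", "GAPDH"),  # same gene, different UMI
    (("B02", "AACCTCTT"), "ACGTAC", "GAPDH"),  # same UMI, other index combination
]
counts = count_molecules(reads)
```

In this sketch the same UMI appearing under two different index combinations is counted as two molecules, while an exact repeat within one combination is collapsed.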
"Complementary" as used in the present invention refers to nucleotide sequences that are related by the base-pairing rules. For example, the sequence 5′-AGT-3′ is complementary to the sequence 5′-ACT-3′. Complementarity can be partial or complete. Partial complementarity occurs when one or more nucleic acid bases are not matched according to the base-pairing rules. Complete (total) complementarity between nucleic acids occurs when every nucleic acid base is matched with another base under the base-pairing rules. The degree of complementarity between nucleic acid strands has a significant effect on the efficiency and strength of hybridization between the strands.
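The example above (5′-AGT-3′ and 5′-ACT-3′) follows from antiparallel base pairing, which can be checked with a short sketch. The function names are hypothetical, and only the four standard DNA bases are handled.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def mismatches(a, b):
    """Count unpaired positions when b is aligned antiparallel to a."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(reverse_complement(a), b.upper()))


def is_fully_complementary(a, b):
    """True when every base of a pairs with a base of b."""
    return len(a) == len(b) and mismatches(a, b) == 0
```

A pair with zero mismatches is completely complementary; a pair with one or more mismatches is only partially complementary.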
本发明所述的“单细胞”指单个细胞或一个细胞,其可以来自血液样本、细胞培养物,也可以来自特定组织、器官或肿瘤等等。然后再通过现有技术常规的分离方式,将其分离为单个细胞。The "single cell" mentioned in the present invention refers to a single cell or a cell, which can come from a blood sample, a cell culture, or a specific tissue, organ or tumor, etc. Then, it is separated into single cells by conventional separation methods in the prior art.
The "doublet" or "doublets" described in the present invention refers to the situation where two, or three or more, cells share the same identifier, such as an index, a linker, a unique molecular identifier, or a combination thereof.
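How often cells sharing a droplet would also share a label can be estimated with the birthday-problem formula. This is a rough sketch only: it assumes labels are assigned to cells uniformly and independently at random, and the figure of 96 labels is used purely as an illustrative pool size.

```python
def collision_probability(cells_in_droplet, n_labels):
    """Probability that at least two cells in one droplet share a label."""
    if cells_in_droplet > n_labels:
        return 1.0  # pigeonhole: a shared label is unavoidable
    p_all_distinct = 1.0
    for i in range(cells_in_droplet):
        p_all_distinct *= (n_labels - i) / n_labels
    return 1.0 - p_all_distinct


# Illustrative values: two and five cells per droplet, 96 possible labels.
p_two_cells = collision_probability(2, 96)
p_five_cells = collision_probability(5, 96)
```

Under these assumptions the collision probability grows with the number of cells per droplet and shrinks as the pool of labels grows, which is why a larger combined index space helps when droplets are overloaded.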
本文所述的“核酸”表示DNA、RNA、单链、双链、或更高度聚集的杂交基序及其任意化学修饰。修饰包括但不限于,提供整合入其它电荷、极化性、氢键、静电相互作用、与核酸配体碱基或核酸配体整体的连接点和作用点的化学基团的那些修饰。这类修饰包括但不限于,肽核酸(PNA)、磷酸二酯基团修饰(例如,硫代磷酸酯、甲基膦酸酯)、2′-位糖修饰、5-位嘧啶修饰、8-位嘌呤修饰、环外胺处的修饰、4-硫尿核苷的取代、5-溴或5-碘-尿嘧啶的取代、骨架修饰、甲基化、不常见的碱基配对组合如异碱基(iso bases)、异胞苷和异胍(isoguanidine)等。核酸也可包含非天然碱基,如硝基吲哚。修饰还可包括3′和5′修饰,包括但不限于用荧光团(例如,量子点)或其他部分加帽。As used herein, "nucleic acid" refers to DNA, RNA, single-stranded, double-stranded, or more highly aggregated hybridization motifs and any chemical modifications thereof. Modifications include, but are not limited to, those modifications of chemical groups that provide for integration into other charges, polarizability, hydrogen bonding, electrostatic interactions, connection points and action points with nucleic acid ligand bases or nucleic acid ligands as a whole. Such modifications include, but are not limited to, peptide nucleic acids (PNA), phosphodiester group modifications (e.g., phosphorothioate, methylphosphonate), 2'-position sugar modifications, 5-position pyrimidine modifications, 8-position purine modifications, modifications at exocyclic amines, substitutions of 4-thiouridine, substitutions of 5-bromo or 5-iodo-uracil, backbone modifications, methylation, unusual base pairing combinations such as iso bases, isocytidine and isoguanidine, etc. Nucleic acids may also contain non-natural bases, such as nitroindole. Modifications may also include 3' and 5' modifications, including but not limited to capping with fluorophores (e.g., quantum dots) or other moieties.
The term "and/or" as used in the present invention covers all combinations of the items connected by the term, and each combination should be regarded as having been listed individually herein. For example, "A and/or B" covers "A", "A and B", and "B". As another example, "A, B and/or C" covers "A", "B", "C", "A and B", "A and C", "B and C", and "A and B and C".
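The combinations covered by "and/or" are simply all non-empty subsets of the listed items, which can be enumerated as follows. The function name is hypothetical and the sketch is included only to make the definition concrete.

```python
from itertools import combinations


def and_or_combinations(items):
    """Return every non-empty combination of the given items."""
    return [
        combo
        for size in range(1, len(items) + 1)
        for combo in combinations(items, size)
    ]


two_items = and_or_combinations(["A", "B"])
three_items = and_or_combinations(["A", "B", "C"])
```

For n items this yields 2^n - 1 combinations, so three for "A and/or B" and seven for "A, B and/or C", matching the lists above.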
本发明所述的“包含”或“包括”为开放式写法,当用于描述蛋白质或核酸的序列时,所述蛋白质或核酸可以是由所述序列组成,或者在所述蛋白质或核酸的一端或两端可以具有额外的氨基酸或核苷酸,但仍然具有本发明所述的活性。 The terms “comprising” or “including” described in the present invention are open-ended terms. When used to describe a protein or nucleic acid sequence, the protein or nucleic acid may be composed of the sequence, or may have additional amino acids or nucleotides at one or both ends of the protein or nucleic acid, but still have the activity described in the present invention.
An in-cell second-strand synthesis step is added to reduce the inhibitory effect of cross-linked proteins and to capture more transcripts. Linear amplification based on droplet indexing is achieved, and the cDNA capture efficiency is improved. In addition, the cDNA is given a PCR anchor adapter that differs from that of the chromatin fragments, which prevents the ATAC-seq library from contaminating the RNA-seq library.
Parallel-Seq overloads droplets with multiple cells to make full use of the generated droplets, and uses pre- and post-indexing to distinguish the cells within one droplet, which greatly expands the barcode space. In addition, the barcode region is markedly shorter, so that a 150-nt sequencing read can pass through the barcode and fixed-nucleotide regions and still read the open-chromatin fragment. By design, Parallel-Seq first hashes cells with sample-specific barcodes during transposition and reverse transcription, so that multiple samples can be evaluated in parallel in one experiment and the method is scalable. Parallel-Seq outperforms existing methods in data quality and offers increased throughput (36 million cells per experiment), which provides a powerful tool for building large cell atlases at a reasonable cost. In addition, we applied Parallel-Seq to lung cancer samples and demonstrated its ability to identify cis-regulatory elements in the accessible regions of specific genes. Joint analysis of gene expression and chromatin accessibility was applied to tumor samples, and the joint analysis, together with newly developed analysis methods, was used to identify possible regulatory elements, including enhancers and mutations of oncogenes. Furthermore, Parallel-Seq can readily handle more samples in one experiment and can be extended to other omics, such as DNA methylation, protein expression, and CRISPR screening.
以下,结合附图来详细说明本发明的实施例,其中:The embodiments of the present invention are described in detail below with reference to the accompanying drawings, wherein:
图1:Parallel-Seq的实验设计图,使用索引、液滴过载来分析同一细胞的scATAC和scRNA,其中,pool/split代表混合/分散。Figure 1: Parallel-Seq experimental design diagram, using indexing and droplet overloading to analyze scATAC and scRNA in the same cell, where pool/split represents mixing/dispersion.
图2:使用NIH/3T3(鼠)、HEK293T(人)和K562(人)细胞的混合物进行Parallel-Seq,结果映射到人类和小鼠基因组的scRNA-seq(上)和scATAC-seq(下)的UMI计数图,其中,mm10代表小鼠参考基因组mm10版本。Figure 2: Parallel-Seq was performed using a mixture of NIH/3T3 (mouse), HEK293T (human), and K562 (human) cells, and the results are mapped to the UMI counts of scRNA-seq (top) and scATAC-seq (bottom) of the human and mouse genomes, where mm10 represents the mm10 version of the mouse reference genome.
图3:Parallel-Seq的scATAC-seq部分片段的插入片段长度分布。Figure 3: Insert length distribution of the scATAC-seq subset of Parallel-Seq.
图4:TSSs周围scATAC-seq reads的富集。Figure 4: Enrichment of scATAC-seq reads around TSSs.
Figure 5: Scatter plot showing the correlation of log2(count) between the scATAC-seq data of Parallel-Seq and ENCODE DNase-seq in K562 cells.
Figure 6: Scatter plot showing the correlation of log2(TPM+1) between the aggregated scRNA-seq data of Parallel-Seq and ENCODE nuclear RNA-seq in K562 cells.
图7:在K562细胞中分别采用Parallel-Seq、ENCODE DNase-seq捕获染色质可接近性以及分别采用Parallel-Seq、ENCODE RNA-seq捕获RNA的对比结果。Figure 7: Comparison of chromatin accessibility captured by Parallel-Seq and ENCODE DNase-seq, and RNA captured by Parallel-Seq and ENCODE RNA-seq in K562 cells.
图8:来自3T3、293T和K562细胞混合的Parallel-Seq配对基因表达数据的均匀流形近似和投影(UMAP)可视化。Figure 8: Uniform Manifold Approximation and Projection (UMAP) visualization of Parallel-Seq paired gene expression data from a mixture of 3T3, 293T, and K562 cells.
图9:来自3T3、293T和K562细胞混合的Parallel-Seq配对染色质可接近性数据的均匀流形近似和投影(UMAP)可视化。Figure 9: Uniform Manifold Approximation and Projection (UMAP) visualization of Parallel-Seq paired chromatin accessibility data from a mixture of 3T3, 293T, and K562 cells.
图10:箱线图显示了sci-CAR、SNARE-seq、Paired-Seq、SHARE-seq和Parallel-Seq的唯一映射RNA reads的数量和唯一映射的ATAC reads的数量。其中,横坐标RNA文库框图从左到右依次为sci-CAR、SNARE-Seq、Paired-Seq、SHARE-Seq、Parallel-Seq,ATAC文库框图从左到右依次为sci-CAR、SNARE-Seq、Paired-Seq、SHARE-Seq、Parallel-Seq。 Figure 10: Box plots show the number of uniquely mapped RNA reads and the number of uniquely mapped ATAC reads for sci-CAR, SNARE-seq, Paired-Seq, SHARE-seq, and Parallel-Seq. The horizontal axis RNA library box diagrams are sci-CAR, SNARE-Seq, Paired-Seq, SHARE-Seq, and Parallel-Seq from left to right, and the ATAC library box diagrams are sci-CAR, SNARE-Seq, Paired-Seq, SHARE-Seq, and Parallel-Seq from left to right.
图11:箱线图显示了sci-CAR、SNARE-seq、Paired-Seq、SHARE-seq和Parallel-Seq中每个细胞捕获的的基因数量。其中,横坐标RNA文库框图从左到右依次为sci-CAR、SNARE-Seq、Paired-Seq、SHARE-Seq、Parallel-Seq。Figure 11: The box plot shows the number of genes captured per cell in sci-CAR, SNARE-seq, Paired-Seq, SHARE-seq, and Parallel-Seq. The horizontal axis RNA library box diagrams are sci-CAR, SNARE-Seq, Paired-Seq, SHARE-Seq, and Parallel-Seq from left to right.
图12:Parallel-Split-Seq工作流程示意图。Figure 12: Schematic diagram of the Parallel-Split-Seq workflow.
图13:映射到人类和小鼠基因组的scRNA-seq(左)和scATAC-seq(右)的UMI计数。该实验使用NIH/3T3(鼠)、HEK293T(人)、HeLa(人)、K562(人)和THP1(人)细胞的混合物进行Parallel-Split-Seq。Figure 13: UMI counts for scRNA-seq (left) and scATAC-seq (right) mapped to the human and mouse genomes. This experiment was performed using a mixture of NIH/3T3 (mouse), HEK293T (human), HeLa (human), K562 (human), and THP1 (human) cells for Parallel-Split-Seq.
图14:Parallel-Split-Seq与Parallel-Seq中scATAC-seq片段的插入片段长度分布。Figure 14: Insert length distribution of scATAC-seq fragments in Parallel-Split-Seq and Parallel-Seq.
图15:Parallel-Split-Seq与Parallel-Seq中TSSs周围scATAC-seq reads的富集。Figure 15: Enrichment of scATAC-seq reads around TSSs in Parallel-Split-Seq and Parallel-Seq.
Figure 16: Scatter plots showing, in K562 cells, the correlation of log2(TPM+1) between the scRNA-seq data of Parallel-Split-Seq and ENCODE nuclear RNA-seq (Panel A) and the correlation of log2(count) between the scATAC-seq data and ENCODE nuclear DNase-seq (Panel B).
图17:来自NIH/3T3、HEK293T、HeLa、K562和THP1细胞的Parallel-Split-Seq配对基因表达(左)和染色质可接近性(右)数据的均匀流形近似和投影(UMAP)可视化。Figure 17: Uniform Manifold Approximation and Projection (UMAP) visualization of Parallel-Split-Seq paired gene expression (left) and chromatin accessibility (right) data from NIH/3T3, HEK293T, HeLa, K562, and THP1 cells.
图18:在K562细胞中分别采用Parallel-Seq、Parallel-Split-Seq以及ENCODE DNase-seq捕获染色质可接近性的对比结果以及分别采用Parallel-Seq、Parallel-Split-Seq以及ENCODE RNA-seq捕获RNA的对比结果。Figure 18: Comparative results of chromatin accessibility captured by Parallel-Seq, Parallel-Split-Seq and ENCODE DNase-seq in K562 cells, and comparative results of RNA captured by Parallel-Seq, Parallel-Split-Seq and ENCODE RNA-seq.
图19:箱线图显示了sci-CAR、SNARE-seq、Paired-Seq、SHARE-seq、Parallel-Seq和Parallel-Split-seq的唯一映射RNA reads的数量和唯一映射的ATAC reads的数量。其中,横坐标RNA文库框图从左到右依次为sci-CAR、SNARE-Seq、Paired-Seq、SHARE-Seq、Parallel-Seq和Parallel-Split-Seq,ATAC文库框图从左到右依次为sci-CAR、SNARE-Seq、Paired-Seq、SHARE-Seq、Parallel-Seq和Parallel-Split-Seq。Figure 19: Box plots show the number of uniquely mapped RNA reads and the number of uniquely mapped ATAC reads for sci-CAR, SNARE-seq, Paired-Seq, SHARE-seq, Parallel-Seq, and Parallel-Split-seq. The horizontal axis RNA library box diagrams are sci-CAR, SNARE-Seq, Paired-Seq, SHARE-Seq, Parallel-Seq, and Parallel-Split-Seq from left to right, and the ATAC library box diagrams are sci-CAR, SNARE-Seq, Paired-Seq, SHARE-Seq, Parallel-Seq, and Parallel-Split-Seq from left to right.
图20:箱线图显示了sci-CAR、SNARE-seq、Paired-Seq、SHARE-seq、Parallel-Seq和Parallel-Split-seq中每个细胞捕获的的基因数量。其中,横坐标RNA文库框图从左到右依次为sci-CAR、SNARE-Seq、Paired-Seq、SHARE-Seq、Parallel-Seq和Parallel-Split-Seq。Figure 20: The box plot shows the number of genes captured per cell in sci-CAR, SNARE-seq, Paired-Seq, SHARE-seq, Parallel-Seq, and Parallel-Split-seq. The horizontal axis RNA library box diagrams are sci-CAR, SNARE-Seq, Paired-Seq, SHARE-Seq, Parallel-Seq, and Parallel-Split-Seq from left to right.
下面将结合本发明实施例中的附图,对本发明实施例中的技术方案进行清楚、完整地描述,显然,所描述的实施例仅是本发明的部分实施例,而不是全部。基于本发明中的实施例,本领域普通技术人员在没有做出创造性劳动前提下所获得的所有其他实施例,都属于本发明保护的范围。The following will be combined with the drawings in the embodiments of the present invention to clearly and completely describe the technical solutions in the embodiments of the present invention. Obviously, the described embodiments are only some embodiments of the present invention, not all. Based on the embodiments of the present invention, all other embodiments obtained by ordinary technicians in this field without making creative work are within the scope of protection of the present invention.
实施例中细胞培养方法:Cell culture method in the embodiment:
HEK293T, HeLa-S3 and NIH/3T3 cells were cultured in DMEM (C11995500BT, ThermoFisher) supplemented with 10% fetal bovine serum (P30-3302, PAN BIOTECH) at 37°C and 5% CO2. The cells were rinsed with PBS (C10010500BT, ThermoFisher) and incubated with 1 mL of 0.25% trypsin-EDTA (25200114, ThermoFisher) at 37°C for 3-5 minutes to detach them. K562 cells were cultured in RPMI 1640 (C11875500BT, ThermoFisher) supplemented with 10% fetal bovine serum at 37°C and 5% CO2. The detached HEK293T, HeLa-S3 and NIH/3T3 cells and the K562 cell suspension were collected by centrifugation, washed with PBS and counted using Countstar.
实施例中肺癌样本制备:Lung cancer sample preparation in the embodiment:
Fresh solid tumor tissue from non-small cell lung cancer was collected at the PLA General Hospital and placed in pre-cooled (2-8°C) MACS tissue storage solution (130-100-008, Miltenyi Biotec). The samples had to be completely covered by the MACS storage solution while they were transported from the hospital to the laboratory.
使用了4677μLDMEM/F-12(11320033,ThermoFisher)、250μL 2.5mg/ml Liberase TL(05401020001,Sigma Aldrich)至最终浓度250μg/mL、23μl 2mg/mL elastase(NC9301601,Worthington)至最终浓度9.2μg/mL、50μL 10mg/mL DNA酶(11284932001,Sigma Aldrich)至最终浓度为100μg/mL的分离混合物。A separation mixture of 4677 μL DMEM/F-12 (11320033, ThermoFisher), 250 μL 2.5 mg/ml Liberase TL (05401020001, Sigma Aldrich) to a final concentration of 250 μg/mL, 23 μl 2 mg/mL elastase (NC9301601, Worthington) to a final concentration of 9.2 μg/mL, and 50 μL 10 mg/mL DNase (11284932001, Sigma Aldrich) to a final concentration of 100 μg/mL was used.
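The final concentrations of the dissociation mixture follow from the stated stock concentrations and volumes. The sketch below recomputes them under the assumption that the four listed components make up the whole mixture; the function and variable names are hypothetical. With the values as written, elastase and DNase reproduce the stated 9.2 μg/mL and 100 μg/mL, whereas the Liberase TL stock and volume give 125 μg/mL rather than the stated 250 μg/mL, so one of the Liberase TL figures may have been transcribed incorrectly.

```python
def final_conc_ug_per_ml(stock_mg_per_ml, volume_ul, total_volume_ul):
    """Final concentration in ug/mL after dilution into the total volume."""
    return stock_mg_per_ml * 1000.0 * volume_ul / total_volume_ul


# Volumes (uL) as stated in the text.
components_ul = {
    "DMEM/F-12": 4677,
    "Liberase TL": 250,
    "elastase": 23,
    "DNase": 50,
}
total_ul = sum(components_ul.values())

# Stock concentrations (mg/mL) as stated in the text.
stocks_mg_per_ml = {"Liberase TL": 2.5, "elastase": 2.0, "DNase": 10.0}

final = {
    name: final_conc_ug_per_ml(stock, components_ul[name], total_ul)
    for name, stock in stocks_mg_per_ml.items()
}
```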
The tissue was minced with scissors into pieces smaller than 0.4 mm in a 1.5 mL Eppendorf microcentrifuge tube. The dissociation mixture was incubated at 37°C with horizontal rotation at 90 rpm for 60 minutes. The single-cell suspension was filtered through a 70 μm cell strainer (15-1070, BIOLOGIX) and centrifuged at 500 g for 5 minutes at 4°C. The cells were resuspended in 1 mL PBS and 3 mL red blood cell lysis buffer (4992957, TIANGEN), incubated at room temperature for 5 minutes, and centrifuged at 500 g for 5 minutes at 4°C. The cells were resuspended in 500 μL fetal bovine serum. 5 μL of cells were mixed with 5 μL of trypan blue solution (15250061, ThermoFisher) and counted with a C-Chip disposable hemocytometer (DHC-N01N, As One). The single-cell suspension was diluted with fetal bovine serum supplemented with dimethyl sulfoxide (D2650, Sigma Aldrich) to a final concentration of 10%. The cells were cryopreserved, with each tube containing 1×10^6 single cells. Before the experiment, the cells were gently thawed at 37°C for 5 minutes and centrifuged at 500 g for 5 minutes at 4°C. The cells were resuspended in 80 μL cell staining buffer (420201, BioLegend) and 5 μL Human TruStain FcX (422302, BioLegend) and incubated at 4°C for 5 minutes. 5 μL anti-CD45 PE (304039, BioLegend), 5 μL anti-CD3 BV421 (317344, BioLegend) and 5 μL anti-EpCam PE/Cy7 (324222, BioLegend) antibodies were added, and the cells were incubated at 4°C in the dark for 15 minutes. The stained cells were washed with 1 mL PBS and centrifuged at 500 g for 5 minutes at 4°C. The supernatant was discarded, and the single cells were resuspended in 1 mL PBS containing 0.02 μM Calcein-AM (425201, BioLegend) and incubated at room temperature in the dark for 15 minutes. The cells were then resuspended in 90 μL Annexin V Binding Buffer, 5 μL APC Annexin V (640941, BioLegend) and 5 μL 7-AAD viability staining solution (420404, BioLegend) and incubated at room temperature for 10 minutes. 400 μL PBS was added, and the cells were filtered through a 35 μm BD cell strainer (352235, BD Falcon). Calcein-AM-positive, 7-AAD-negative, Annexin V-negative single cells were sorted with a MoFlo Astrios EQ Cell Sorter (Beckman Coulter). Since tumor cells and T cells are likely to make up the major part of the sequencing data, we balanced the samples with less than 5% EpCam-positive cells, 40% T cells, and 55% other single cells.
The transposons used in the examples were prepared as follows:
Tn5Merev (/5Phos/CTGTCTCTTATACACATCT (SEQ ID NO: 21)), Tn5ME-A, Tn5ME-B and barcoded R1BxME (x represents 1-96) were prepared. 10 μM Tn5Merev was annealed separately with 10 μM Tn5ME-A (for Parallel-Split-Seq) or 10 μM Tn5ME-B (for Parallel-Seq), and with 10 μM R1BxME, by heating at 95 °C for 2 min and cooling at 0.1 °C/s to 20 °C, followed by a hold at 4 °C. 2 μL annealed Tn5ME-B (Parallel-Seq), 2 μL annealed Tn5ME-R1Bx, 2 μL 10x TPS, 4 μL transposase (M0221, Robustnique) and 10 μL UltraPure DNase/RNase-Free Water (10977023, ThermoFisher) were combined and incubated at room temperature for 30 min. The assembled transposons were aliquoted at 4 μL per tube and stored at -20 °C for no more than 1 month.
The cells in the examples were fixed as follows:
For cell lines, 50,000 single cells were counted; for lung cancer samples, 50,000 primary cells were sorted into each tube. The cells were centrifuged at 500 g for 5 min at 4 °C and resuspended in 250 μL PBS. 750 μL PBS containing 1.33% methanol-free formaldehyde (28906, ThermoFisher) was added, and the cells were incubated on ice for 10 min. 50 μL 20% BSA (V0332-100G, VWR) was added, and the cells were centrifuged at 1000 g for 3 min at 4 °C in a swinging-bucket rotor, then centrifuged in a pre-cooled (4 °C) fixed-angle centrifuge to collect the cells on one side of a 1.5 mL microcentrifuge tube (MCT-150-C, Axygen). The supernatant was removed in two pipetting steps, as in Omni-ATAC. Adding BSA and pelleting first in a swinging-bucket rotor was found to recover more primary cells.
The cells in the examples were permeabilized as follows:
2x RSB was prepared as in Omni-ATAC by mixing 1 mL 1 M Tris-HCl pH 7.4 (T2663-1L, Sigma-Aldrich), 200 μL 5 M NaCl (AM9759, ThermoFisher), 300 μL 1 M MgCl2 (AM9530G, ThermoFisher) and 48.5 mL UltraPure DNase/RNase-Free distilled water. During fixation, permeabilization buffer was prepared by combining, per sample, 50 μL 2x RSB, 1 μL RiboLock (EO0384, ThermoFisher), 1 μL SUPERase·In RNase Inhibitor (AM2696, ThermoFisher), 1 μL 10% Nonidet P40 Substitute (1133247301, Sigma-Aldrich), 1 μL 10% Tween 20 (11332465001, Sigma-Aldrich), 1 μL 1% digitonin (D141-100MG, Sigma-Aldrich), 5 μL 20% BSA and 40 μL UltraPure DNase/RNase-Free distilled water. Wash buffer contained 500 μL 2x RSB, 10 μL 10% Tween 20, 1 μL RiboLock and 50 μL 20% BSA; the remaining volume is not specified. Immediately after the fixative was removed, 100 μL permeabilization buffer was added, and the cells were pipetted 8 times and incubated on ice for 5 min. After permeabilization, 1 mL wash buffer was added to each sample. The cells were centrifuged and the supernatant was removed.
Transposition in the examples was performed as follows:
ATAC-seq reaction solution was prepared by mixing 10 μL 5x LM buffer (M0221, Robustnique), 16.5 μL PBS, 0.5 μL RiboLock, 0.5 μL SUPERase·In RNase Inhibitor, 0.5 μL 10% Tween 20, 0.5 μL digitonin and 17.5 μL UltraPure DNase/RNase-Free distilled water. The permeabilized single cells were resuspended in 46 μL ATAC-seq reaction solution, and 4 μL of barcode-specific transposon was added to each tube. The ATAC-seq reaction was carried out at 37 °C with shaking at 550 r.p.m. and a heated lid. After the reaction, 949 μL PBS, 10 μL 10% Triton X-100 (93443, Sigma-Aldrich), 1 μL RiboLock and 50 μL 20% BSA were added to each tube, the cells were centrifuged, and the supernatant was removed.
In-cell reverse transcription in the examples was performed as follows:
For each sample, 16 μL resuspension solution was prepared by mixing 8 μL PBS, 0.5 μL RiboLock, 0.5 μL SUPERase·In RNase Inhibitor and 7 μL nuclease-free water. Reverse transcription mixture was prepared from 8 μL 5x RT buffer, 2 μL 10 mM dNTPs (N0447L, NEB), 0.5 μL RiboLock, 0.25 μL SUPERase·In RNase Inhibitor, 4 μL Maxima H Minus Reverse Transcriptase (EP0753, ThermoFisher) and 5.25 μL UltraPure DNase/RNase-Free distilled water. 20 μL reverse transcription mixture was dispensed into each tube, and 2 μL of barcode-matched random primer and 2 μL of barcode-matched poly(T) reverse transcription primer were added. The transposed cells were resuspended in 16 μL resuspension solution and added to the barcode-matched PCR tubes. After thorough mixing, reverse transcription was carried out at 50 °C for 10 min, followed by 3 thermal cycles (8 °C for 12 s, 15 °C for 45 s, 20 °C for 45 s, 30 °C for 30 s, 42 °C for 120 s and 50 °C for 180 s), a 5 min incubation at 50 °C, and a hold at 4 °C. The reverse transcription reactions were pooled on ice into a single 1.5 mL tube. The cells were centrifuged and the supernatant was removed. The cells were washed again with 1 mL PBS supplemented with 10 μL 10% Triton X-100 and 50 μL 20% BSA, centrifuged, and the supernatant was removed.
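The reverse transcription program above can be written down as a small data structure, which makes the cycling block explicit and lets the programmed hold time be summed. This is a minimal sketch: the tuple layout and names are illustrative, ramp times between temperatures and the final 4 °C hold are not counted.

```python
# Each stage: (label, number of repeats, [(temperature in C, seconds), ...]).
RT_PROGRAM = [
    ("initial extension", 1, [(50, 600)]),
    ("ramped cycling", 3, [(8, 12), (15, 45), (20, 45), (30, 30), (42, 120), (50, 180)]),
    ("final extension", 1, [(50, 300)]),
]


def programmed_seconds(program):
    """Sum of all programmed hold times, ignoring ramping between steps."""
    return sum(
        repeats * sum(seconds for _, seconds in steps)
        for _, repeats, steps in program
    )


total_s = programmed_seconds(RT_PROGRAM)
print(f"{total_s} s of programmed holds ({total_s / 60:.1f} min)")
```

With the values given in the text, one cycle lasts 432 s and the whole program holds for 2196 s, or about 36.6 min before the 4 °C hold.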
The ligation reactions in the examples were performed as follows:
Parallel-Seq uses a ligation reaction to add the second index. The ligation adapter comprises a 7-nt sequence complementary to the transposon and to the reverse transcription primers, to which it is ligated, together with a 10-nt index strand, an 8-nt well-specific adapter, a 10-nt UMI, and a universal PCR anchor for linear amplification in droplets. Before in-cell barcode ligation, the ligation adapters were annealed by combining 11 μM linker strand and 12 μM barcode strand in a 100 μL reaction volume. The plate was incubated at 95 °C for 2 min and cooled to 20 °C at 0.1 °C/s, and was then aliquoted into 10 ligation plates containing 10 μL ligation adapter per well.
For Parallel-Split-Seq, the second and third indexes were both added by ligation. The ligation adapter for the second index comprises a 10-nt sequence complementary to the linker strand, an 8-nt well-specific adapter, and a 7-nt sequence used for the subsequent ligation. The ligation adapter for the third index comprises a 10-nt index strand, an 8-nt well-specific adapter, a 10-nt UMI, and the P3 short sequence of the universal PCR primer. The second- and third-round adapters were annealed according to the Parallel-Split-Seq protocol, and each was aliquoted into 10 ligation plates.
The in-cell ligation steps were as follows:
The ligation reaction was performed according to the Split-seq protocol, without RNase inhibitors. 2 mL 1x NEBuffer 3.1 (B7203S, NEB) and 2 mL ligation solution (500 μL 10x T4 DNA ligase buffer, 100 μL T4 DNA ligase (M0082, Robustnique), 50 μL 10% Triton X-100 and 1350 μL UltraPure DNase/RNase-Free distilled water) were prepared. The pooled single cells were resuspended in 1x NEBuffer 3.1 and mixed thoroughly with the ligation solution. 40 μL of cells in ligation mixture was added to each well of the ligation plate. The ligation reaction was rotated at 15 r.p.m. for 1 h at room temperature. After ligation, 2 μL 500 μM EDTA (AM9260G, ThermoFisher) was added to each well and the wells were pooled. 50 μL 10% Triton X-100 and 50 μL 20% BSA were added to the pooled cells, which were centrifuged and the supernatant removed. The cells were washed again with 940 μL PBS, 10 μL 10% Triton X-100 and 50 μL 20% BSA. For Parallel-Split-Seq, the third index was added by a further ligation reaction after the second index had been added.
RNase digestion in the examples was performed as follows:
The cells were resuspended in RNase digestion reaction (40 μL 5x RT buffer, 8 μL RNase Cocktail Enzyme Mix (AM2286, ThermoFisher), 8 μL RNase H (Y9220L, Enzymatics) and 144 μL UltraPure DNase/RNase-Free distilled water) and incubated at 37 °C for 30 min with shaking at 300 rpm for 15 s followed by 45 s on the mixer. The RNase digestion reaction was washed by adding 790 μL PBS and 10 μL 10% Triton X-100, the cells were centrifuged, and the supernatant was removed. No BSA was added at this step, because residual BSA forms debris with PEG 8000 in the next step.
Second-strand synthesis in the examples was performed as follows:
The cells were incubated in second-strand synthesis reaction mixture (40 μL 5x RT buffer, 48 μL 50% PEG 8000 (B1004SVIAL, NEB), 20 μL 10 mM dNTPs, 2 μL 1 mM dN-P3 short primer (for Parallel-Seq) or dN-P5 short primer (for Parallel-Split-Seq), 5 μL Klenow Exo- (M0212L, NEB) and 85 μL UltraPure DNase/RNase-Free distilled water) at 37 °C for 1 h with shaking at 300 r.p.m. for 15 s followed by 45 s on the mixer. After second-strand synthesis, the cells were washed twice with PBS containing 0.1% Triton X-100 and 1% BSA. The cells were then resuspended in 40 μL 0.5x PBS and counted in trypan blue with a C-Chip disposable hemocytometer.
Overloading with the 10x Chromium ATAC-seq kit
For Parallel-Seq, the cells were centrifuged, resuspended in 7 μL ATAC-seq Buffer B, and made up to 15 μL with 1x Nuclei Buffer. 56.5 μL Barcoding Reagent B, 1.5 μL Reducing Agent B and 2 μL Barcoding Enzyme (PN-1000176, 10x Genomics) were combined with the cells and loaded into one channel of a Chromium Next GEM Chip H (PN-1000162, 10x Genomics). After GEM generation, the droplets were divided into 16 tubes of 6.25 μL each. Linear amplification was performed as follows: 72 °C for 5 min and 98 °C for 30 s, followed by 12 cycles of 98 °C for 10 s, 59 °C for 30 s and 72 °C for 1 min. The products were then held at 15 °C until use.
Parallel-Seq library construction
Post-GEM-incubation cleanup was performed following a scaled-down version of the Chromium Next GEM Single Cell ATAC Reagent Kits v1.1 protocol. 7.8 μL Recovery Agent was added to each tube, and the tube was gently inverted 10 times to mix. After a brief centrifugation, 12.5 μL Dynabeads Cleanup Mix was added, mixed by pipetting 5 times, and incubated at room temperature for 10 min. The product was eluted with 81 μL Elution Solution I and split into two portions, 40 μL for ATAC-seq and 40 μL for RNA-seq. The ATAC-seq portion was cleaned up with 1.2x SPRI beads (B23318, Beckman Coulter) and the RNA-seq portion with 0.8x SPRI beads. The ATAC-seq library was amplified with SI-PCR Primer B (PN-2000128, 10x Genomics) and N7xx primers. The RNA-seq library was amplified with SI-PCR Primer B (PN-2000128) and P3xx primers. After amplification, the ATAC-seq portion was cleaned up with 1.2x SPRI beads and the RNA-seq portion with 0.8x SPRI beads.
Parallel-Split-Seq library construction
The cells from second-strand synthesis were diluted to 800 cells/μL and split into tubes of 2,000 cells each. 2.5 μL 2x lysis buffer (0.25 μL 1 M Tris-HCl pH 8.0, 0.25 μL 10% IGEPAL CA-630 (I8896, Sigma Aldrich), 0.25 μL 10% Tween 20, 0.5 μL 291 mg/mL QIAGEN Protease (19155, QIAGEN) and 1.25 μL UltraPure DNase/RNase-Free distilled water) was added, and the tubes were incubated at 55 °C for 8 h, then at 70 °C for 15 min to inactivate the QIAGEN Protease, and held at 4 °C. 45 μL PCR amplification mix (25 μL NEBNext High-Fidelity 2X PCR Master Mix (M0541L, NEB), 2.5 μL N5xx primer, 1.25 μL P5xx primer, 1.25 μL P3xx primer and 15 μL UltraPure DNase/RNase-Free distilled water) was added to amplify the ATAC-seq and RNA-seq fragments. The cycling conditions were 72 °C for 5 min and 98 °C for 30 s, followed by 5 cycles of 98 °C for 10 s, 65 °C for 30 s and 72 °C for 1 min, and a hold at 4 °C.
The PCR mixture was divided into an ATAC-seq portion and an RNA-seq portion, which were cleaned up with 1.0x AMPure XP beads (A63881, Beckman Coulter) and 0.8x AMPure XP beads, respectively. The PCR products were eluted with 22 μL UltraPure DNase/RNase-Free distilled water. The second round of PCR amplification was performed by adding 28 μL PCR reaction mixture (25 μL NEBNext High-Fidelity 2X PCR Master Mix, 1.25 μL N5xx primer, 1.25 μL P3_end primer and 0.5 μL 25x SYBR Green I (S7563, ThermoFisher) for ATAC-seq; 25 μL NEBNext High-Fidelity 2X PCR Master Mix, 1.25 μL N5xx primer, 1.25 μL P3_end primer and 0.5 μL 25x SYBR Green I for RNA-seq). Each sub-library of the ATAC-seq and RNA-seq libraries was amplified on a QuantStudio 3 Real-Time PCR System (ThermoFisher); amplification was monitored, and each sub-library was stopped when its fluorescence reached approximately 100,000 units. As a rule of thumb, this takes about 6-8 cycles for cell lines or tumor cells and about 7-10 cycles for other primary cells; fewer than 11 cycles is acceptable. The ATAC-seq libraries were cleaned up with 1.0x AMPure XP beads and the RNA-seq libraries with 0.8x AMPure XP beads. The libraries were eluted with 20 μL elution buffer (19086, QIAGEN), and the concentration was determined with the Qubit dsDNA HS Assay Kit (Q32851, ThermoFisher). More than 20 ng of product is expected from each sub-library. Quality control was performed with the Agilent High Sensitivity D1000 ScreenTape Assay (5067-5584 and 5067-5585, Agilent).
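The stopping rule for the second PCR round (stop a sub-library once its fluorescence reaches about 100,000 units, and treat fewer than 11 cycles as acceptable) can be sketched as follows. The function names and the fluorescence trace are hypothetical and only illustrate the rule; on the instrument the decision is made while the run is in progress.

```python
def stop_cycle(fluorescence_by_cycle, threshold=100_000):
    """Return the 1-based cycle at which fluorescence first reaches the
    threshold, or None if it never does within the recorded trace."""
    for cycle, value in enumerate(fluorescence_by_cycle, start=1):
        if value >= threshold:
            return cycle
    return None


def cycle_count_acceptable(cycle, max_cycles_exclusive=11):
    """The text treats fewer than 11 cycles as acceptable."""
    return cycle is not None and cycle < max_cycles_exclusive


# Hypothetical trace for one sub-library (arbitrary fluorescence units).
trace = [4_000, 7_500, 14_000, 27_000, 52_000, 98_000, 185_000]
cycle = stop_cycle(trace)
print(cycle, cycle_count_acceptable(cycle))
```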
Sequencing in the examples was performed as follows:
Parallel-Seq libraries were sequenced on an Illumina NovaSeq 6000 system with PE150 reads, a 16-nt i5 index and an 8-nt i7 index.
Parallel-Split-Seq libraries were sequenced on an Illumina HiSeq X Ten or NovaSeq 6000 system with standard PE150 reads, an 8-nt i5 index and an 8-nt i7 index.
Preprocessing of Parallel-Seq and Parallel-Split-Seq data
For Parallel-Seq, read 1 sequences the cell barcodes and the ligation adapter. To balance the nucleotide composition during sequencing, phase nucleotides were added between barcode2 and the ligation adapter: none for barcode2 indexes 1-24, T for indexes 25-48, CA for indexes 49-72, and ACA for indexes 73-96. For barcode2 indexes 1-24, barcode1, barcode2, barcode3 and barcode4 are located at positions 36-41 of read 1, positions 11-18 of read 1, the i5 index and the i7 index, respectively. For barcode2 indexes 25-48, 49-72 and 73-96, only the position of barcode1 changes, shifting in steps of one nucleotide. The unique molecular identifier is located at positions 1-10 of read 1. The sequence following Tn5ME in read 1 is one Tn5 cut site, and read 2 provides the other Tn5 cut site for ATAC-seq. Read 2 of the RNA-seq library starts at the annealing site of second-strand synthesis and has the same sequence as the RNA of the target gene.
For Parallel-Split-Seq, read 2 sequences the cell barcodes and the ligation adapters. barcode1, barcode2, barcode3 and barcode4 are located at positions 61-66, 36-43 and 11-18 of read 2 and in the i7 index, respectively. The sequence following Tn5ME in read 2 is one Tn5 cut site, and read 1 provides the other end of the fragment for ATAC-seq. Read 1 of the RNA-seq library provides the RNA sequence of the target gene.
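The read layouts described above translate directly into string slices. The sketch below assumes 1-based inclusive positions in the text (so positions 11-18 become the Python slice `[10:18]`), and it reads "one nucleotide step" as a barcode1 shift equal to the number of phase nucleotides (0, 1, 2 or 3); both are interpretations, and the function names are illustrative. In practice the barcode2 index would first be looked up from the barcode2 sequence; here it is passed in directly.

```python
def phase_length(barcode2_index):
    """Number of phase nucleotides for a 1-based barcode2 index (1-96):
    none for 1-24, T for 25-48, CA for 49-72, ACA for 73-96."""
    if not 1 <= barcode2_index <= 96:
        raise ValueError("barcode2 index must be in 1..96")
    return (barcode2_index - 1) // 24


def parse_parallel_seq_read1(read1, barcode2_index):
    """Extract UMI, barcode2 and barcode1 from a Parallel-Seq read 1.
    barcode3 and barcode4 come from the i5 and i7 index reads."""
    shift = phase_length(barcode2_index)
    return {
        "umi": read1[0:10],                          # positions 1-10
        "barcode2": read1[10:18],                    # positions 11-18
        "barcode1": read1[35 + shift:41 + shift],    # positions 36-41, shifted
    }


def parse_parallel_split_seq_read2(read2):
    """Extract the three in-read barcodes from a Parallel-Split-Seq read 2.
    barcode4 comes from the i7 index read."""
    return {
        "barcode3": read2[10:18],   # positions 11-18
        "barcode2": read2[35:43],   # positions 36-43
        "barcode1": read2[60:66],   # positions 61-66
    }


# Synthetic read 1 built from the published constant parts and made-up indexes.
read1 = "ACGTACGTAC" + "GGGGCCCC" + "CA" + "AGTCACTGAG" + "TGCAGTA" + "TTAGGC" + "AGATGTG"
print(parse_parallel_seq_read1(read1, barcode2_index=50))
```

The synthetic read places the phase nucleotides between barcode2 and the constant AGTCACTGAG segment, which is what makes barcode1 land at positions 36-41 when no phase nucleotides are present (10 + 8 + 10 + 7 = 35 preceding bases).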
Raw reads were trimmed with cutadapt. Barcodes were parsed with the FREE Difference software, allowing at most one edit per round of barcoding. Reads containing the mosaic end sequence were removed from the RNA libraries, and reads lacking the mosaic end sequence were removed from the ATAC libraries. Reads were aligned to hg38, mm10 or the combined genome with STAR.
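The idea of tolerating one edit per barcode round can be illustrated with a plain whitelist lookup. This sketch is not the barcode software named in the text; it is a simple stand-in that accepts an observed barcode if exactly one whitelist entry is within one substitution, insertion or deletion of it. The whitelist and function names are hypothetical.

```python
def within_one_edit(a, b):
    """True if a and b differ by at most one substitution, insertion or deletion."""
    if a == b:
        return True
    if abs(len(a) - len(b)) > 1:
        return False
    if len(a) > len(b):
        a, b = b, a  # make a the shorter (or equal-length) string
    i = 0
    while i < len(a) and a[i] == b[i]:
        i += 1  # skip the common prefix
    if len(a) == len(b):
        return a[i + 1:] == b[i + 1:]  # one substitution at position i
    return a[i:] == b[i + 1:]          # one extra base in b at position i


def assign_barcode(observed, whitelist):
    """Return the unique whitelist barcode within one edit, else None."""
    if observed in whitelist:
        return observed
    hits = [bc for bc in whitelist if within_one_edit(observed, bc)]
    return hits[0] if len(hits) == 1 else None


whitelist = ["AACCGGTT", "TTGGCCAA"]  # hypothetical 8-nt indexes
print(assign_barcode("AACCGGTA", whitelist))  # one substitution
print(assign_barcode("AACGGTT", whitelist))   # one deletion
print(assign_barcode("ACGTACGT", whitelist))  # no match
```

Returning None when two whitelist entries are equally close avoids assigning a read to an arbitrary cell; a real pipeline also has to handle the fact that an insertion or deletion shifts every downstream position in a fixed-length read.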
For single-cell RNA-seq, a modified Python script from the Split-seq pipeline was used to collapse UMIs and generate the digital gene expression matrix. For single-cell ATAC-seq, mitochondrial reads were removed. TSS enrichment was calculated as previously described to assess data quality, and cells with a TSS enrichment below 6 were discarded. Tn5 insertions were then counted in 2-kb bins across the genome.
Cells with fewer than 200 expressed genes or fewer than 200 bins were discarded. Scrublet was used to predict doublet probabilities, and doublets were removed at the default threshold.
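The per-cell filters described above can be sketched as two small helpers: one that counts Tn5 insertions in 2-kb bins after dropping mitochondrial reads, and one that applies the thresholds. The mitochondrial contig name `chrM`, the 0-based coordinates, and the choice to require every threshold to pass are assumptions made for illustration; the TSS enrichment score itself is taken as a precomputed input.

```python
from collections import Counter

BIN_SIZE = 2000  # 2-kb bins


def count_insertions_per_bin(insertions, bin_size=BIN_SIZE, mito_contig="chrM"):
    """Count Tn5 insertions per (chromosome, bin index).
    insertions: iterable of (chromosome, 0-based position)."""
    return Counter(
        (chrom, pos // bin_size)
        for chrom, pos in insertions
        if chrom != mito_contig  # mitochondrial reads are removed
    )


def passes_qc(tss_enrichment, n_genes, n_bins,
              min_tss=6, min_genes=200, min_bins=200):
    """Keep a cell only if it meets all three thresholds."""
    return (tss_enrichment >= min_tss
            and n_genes >= min_genes
            and n_bins >= min_bins)


# Hypothetical insertions for one cell.
bins = count_insertions_per_bin(
    [("chr1", 100), ("chr1", 1999), ("chr1", 2000), ("chrM", 5)]
)
print(dict(bins), passes_qc(tss_enrichment=14, n_genes=850, n_bins=420))
```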
For the mixed cell line data, cells in which less than 90% of UMIs mapped to a single species were considered mixed cells.
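The 90% rule for the species-mixing experiment can be expressed as a short classifier. The function name and the handling of cells with no UMIs are illustrative choices; only the 90% cut-off comes from the text.

```python
def classify_species(human_umis, mouse_umis, purity=0.90):
    """Label a cell by the species that accounts for at least `purity`
    of its UMIs; otherwise call it mixed."""
    total = human_umis + mouse_umis
    if total == 0:
        return "empty"
    if human_umis / total >= purity:
        return "human"
    if mouse_umis / total >= purity:
        return "mouse"
    return "mixed"


def mixed_fraction(cells):
    """Fraction of non-empty cells called mixed.
    cells: iterable of (human_umis, mouse_umis)."""
    labels = [classify_species(h, m) for h, m in cells]
    called = [label for label in labels if label != "empty"]
    return called.count("mixed") / len(called) if called else 0.0


print(classify_species(950, 50), classify_species(40, 960), classify_species(600, 400))
```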
The control methods in the examples were as follows:
For the sci-CAR procedure, see: Cao, J. et al. Joint profiling of chromatin accessibility and gene expression in thousands of single cells. Science 361, 1380-1385 (2018).
For the paired-Seq procedure, see: Zhu, C. et al. An ultra high-throughput method for single-cell joint analysis of open chromatin and transcriptome. Nat Struct Mol Biol 26, 1063-1070 (2019).
For the SNARE-Seq procedure, see: Chen, S., Lake, B.B. & Zhang, K. High-throughput sequencing of the transcriptome and chromatin accessibility in the same cell. Nat Biotechnol 37, 1452-1457 (2019).
For the SHARE-Seq procedure, see: Ma, S. et al. Chromatin Potential Identified by Shared Single-Cell Profiling of RNA and Chromatin. Cell 183, 1103-1116.e20 (2020).
In the sequences of the present application, "X...X" and "N...N" each denote any base type, whether natural, modified or otherwise known in the prior art, including but not limited to A, T, C, G or U; "X" and "N" may be used interchangeably. V denotes A, C or G. B denotes C, G, T or U. Where "X" denotes an amino acid, it denotes a natural or modified amino acid type known in the prior art.
Example 1 Parallel-Seq analysis of RNA and open chromatin in the same single cells from multiple samples
The experimental design of Parallel-Seq is shown in Figure 1. The specific steps are as follows:
(1) Parallel-Seq started from 27 different samples of 50,000 single cells each;
(2) The cells of each sample were fixed and permeabilized, and open chromatin was tagged with barcoded Tn5 transposons carrying a transposon-specific barcode. The transposon sequence Tn5ME-B is shown in SEQ ID NO: 1, and the sequence of Tn5ME-x carrying the first index (x represents 1-27) is shown in SEQ ID NO: 2, where XXXXXX represents the first index (see Table 1).
Tn5ME-B: GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG (SEQ ID NO: 1)
Tn5ME-x: /5Phos/TGCAGTAXXXXXXAGATGTGTATAAGAGACAG (SEQ ID NO: 2)
(3) The mRNA of each sample was reverse transcribed with the barcode-matched poly(T) primer R1BxT15VN and random hexamer primer R1BxN6. The reverse transcription primers R1BxT15VN and R1BxN6 (x represents 1-27) are shown in SEQ ID NO: 3 and SEQ ID NO: 4, where XXXXXX represents the first index (see Table 1);
R1BxT15VN: /5Phos/TGCAGTAXXXXXXTTTTTTTTTTTTTTTVN (SEQ ID NO: 3)
R1BxN6: /5Phos/TGCAGTAXXXXXXNNNNNN (SEQ ID NO: 4)
(4) Cells from the different samples were pooled and randomly distributed into a 96-well plate in which each well contained the dscB′ sequence (SEQ ID NO: 5), and the well-specific adapter sequence dscBx was ligated to the transposed chromatin or first-strand cDNA. The well-specific adapter sequence dscBx (x represents 1-96) is shown in SEQ ID NO: 6, where NNNNNNNNNN is the UMI and XXXXXXXX represents the second index (see Table 2). Phase nucleotides were added between the second index and the UMI: none for dscB1-dscB24, T for dscB25-dscB48, CA for dscB49-dscB72, and ACA for dscB73-dscB96;
dscB′ sequence: TACTGCACTCAGTGACT (SEQ ID NO: 5)
dscBx sequence: TCGTCGGCAGCGTCAGATGTGTATAAGAGACAGNNNNNNNNNNXXXXXXXXAGTCACTGAG (SEQ ID NO: 6)
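The relationship between the listed sequences can be checked directly: dscB′ (SEQ ID NO: 5) reads as the reverse complement of the junction formed when the 3′ end of dscBx (AGTCACTGAG) abuts the 5′-phosphorylated TGCAGTA segment shared by Tn5ME-x, R1BxT15VN and R1BxN6. This is consistent with dscB′ bridging the two strands during ligation, although that role is an inference from the sequences rather than a statement in the text. The constant names below are illustrative.

```python
def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


DSCB_PRIME = "TACTGCACTCAGTGACT"   # SEQ ID NO: 5
DSCBX_3P_END = "AGTCACTGAG"        # last 10 nt of SEQ ID NO: 6
FIRST_INDEX_5P_END = "TGCAGTA"     # first 7 nt of SEQ ID NO: 2, 3 and 4

# Junction as it would read 5'->3' after dscBx is joined to the indexed strand.
junction = DSCBX_3P_END + FIRST_INDEX_5P_END

bridges_junction = reverse_complement(junction) == DSCB_PRIME
print(junction, reverse_complement(junction), bridges_junction)
```

The 17 nt of dscB′ therefore split into 10 nt pairing with dscBx and 7 nt pairing with the first-index strand.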
(5) The RNA was digested with RNase;
(6) A second PCR anchor was appended to the cDNA by random-primed second-strand synthesis, using the p3 short primer.
p3 short primer: CAGACGTGTGCTCTTCCGATCTNNNGGNNNB (SEQ ID NO: 7)
(7) All cells were pooled and overloaded into one channel of a Chromium scATAC-seq chip;
(8) The cells were lysed, and a droplet-specific P5 adapter carrying the third index (see Table 2) was added by linear amplification within the droplets. The linear amplification primer is shown in SEQ ID NO: 8, where XXXXXXXXXXXXXXXX is the third index, which is specific to the bead in each droplet;
5'-AATGATACGGCGACCACCGAGATCTACAC-XXXXXXXXXXXXXXXX-TCGTCGGCAGCGTC-3' (SEQ ID NO: 8)
(9) The droplets were further divided into 16 PCR tubes for PCR purification;
(10) The purified product in each PCR tube was divided into two portions, from which the transcriptome and the open chromatin fragments were amplified with the corresponding primers. The transcriptome was amplified with SI-PCR Primer B (SEQ ID NO: 9) and a P3xx primer (SEQ ID NO: 10), where XXXXXXXX represents the fourth index of the transcriptome primer (see the P3xx indexes in Table 2). The open chromatin fragments were amplified with SI-PCR Primer B (SEQ ID NO: 9) and an N7xx primer (SEQ ID NO: 11), where XXXXXXXX represents the fourth index of the open chromatin primer (see the N7xx indexes in Table 1);
SI-PCR Primer B: AATGATACGGCGACCACCGAGA (SEQ ID NO: 9)
P3xx primer: CAAGCAGAAGACGGCATACGAGATXXXXXXXXGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT (SEQ ID NO: 10)
N7xx primer: CAAGCAGAAGACGGCATACGAGATXXXXXXXXGTCTCGTGGGCTCGG (SEQ ID NO: 11)
(11) After sequencing and barcode parsing, the gene expression and chromatin accessibility profiles that share the same combination of the 4 rounds of indexes represent the paired profiles of one single cell. In principle, 4 rounds of indexing expand the barcode space to 96x96x100000x16≈1.47x10^10, which allows Parallel-Seq to profile more than 1 million cells in a single experiment with a very low collision rate.
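The size of the barcode space quoted in step (11) is simply the product of the number of options in each indexing round. The sketch below reproduces that arithmetic; the labels attached to each factor are an interpretation of the four numbers in the text, and the 1,000,000-cell figure is the scale mentioned there.

```python
import math

# Number of distinct options in each of the four indexing rounds,
# as given in the text (96 x 96 x 100000 x 16).
ROUND_SIZES = {
    "first index": 96,
    "second index (ligation wells)": 96,
    "third index (droplet beads)": 100_000,
    "fourth index (PCR tubes)": 16,
}

barcode_space = math.prod(ROUND_SIZES.values())

# Ratio of cells to barcode combinations for a 1,000,000-cell experiment.
cells = 1_000_000
occupancy = cells / barcode_space

print(f"{barcode_space:,} combinations ({barcode_space:.2e})")
print(f"cells per combination at {cells:,} cells: {occupancy:.1e}")
```

The product is 14,745,600,000, which rounds to the 1.47x10^10 given in the text, so even one million cells occupy well under 0.01% of the available combinations.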
Table 1
Table 2
实施例2 Parallel-Seq性能验证Example 2 Parallel-Seq performance verification
用NIH/3T3(小鼠)、HEK293T(人)和K562(人)细胞的混合物进行了Parallel-Seq(步骤同实施例1),经质量筛选获得了2200个细胞的转录组和染色质可接近性,其中scRNA序列部分的平均UMI为7014,scATAC序列部分的平均UMI为10103。来自人类和小鼠细胞的Reads在转录组和染色质谱中都被很好地分离,其中转录组分配给802个小鼠细胞和1398个人类细胞,染色质谱分配给805个小鼠细胞和1398个人类细胞,均有很少的doublet,两个图谱的碰撞率分别为0.2%、0.1%(图2)。Parallel-Seq was performed with a mixture of NIH/3T3 (mouse), HEK293T (human), and K562 (human) cells (same steps as in Example 1), and the transcriptome and chromatin accessibility of 2200 cells were obtained after quality screening, of which the average UMI of the scRNA sequence part was 7014, and the average UMI of the scATAC sequence part was 10103. Reads from human and mouse cells were well separated in both transcriptome and chromatin profiles, with transcriptomes assigned to 802 mouse cells and 1398 human cells, and chromatin profiles assigned to 805 mouse cells and 1398 human cells, with very few doublets, and the collision rates of the two maps were 0.2% and 0.1%, respectively (Figure 2).
The insert size distribution of the aggregated scATAC-seq data showed a clear nucleosome binding pattern (Figure 3), and the TSS enrichment score of the sequencing reads reached 14 (Figure 4), indicating that the scATAC-seq data were of acceptable quality.
The aggregated single-cell chromatin accessibility and transcriptome profiles generated by Parallel-Seq correlated closely with the bulk DNase-seq data (ENCFF156LGK, R=0.79) (Figure 5) and the nuclear RNA-seq data (ENCFF631TDY, R=0.81) (Figure 6) for K562 cells in ENCODE, respectively (Figure 7). In addition, the expression and chromatin accessibility profiles of individual cells clustered together within each cell type and separated from the other cell types (Figures 8-9). Together, these data demonstrate the high specificity and high quality of Parallel-Seq.
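Comparisons of aggregated single-cell profiles with bulk data are typically made by summing counts over cells into a "pseudobulk" profile and correlating it with the bulk signal over the same features. The sketch below shows one way to do this in plain Python. The log counts-per-million normalization, the function names, and the synthetic data are assumptions for illustration; the patent does not state how its R values were computed.

```python
import math
import random


def log_cpm(values):
    """Scale a profile to counts per million and apply log(1 + x)."""
    total = sum(values)
    return [math.log1p(v / total * 1e6) for v in values]


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def pseudobulk_correlation(sc_counts, bulk_signal):
    """Correlate summed single-cell counts (rows = cells) with a bulk profile."""
    pseudobulk = [sum(column) for column in zip(*sc_counts)]
    return pearson_r(log_cpm(pseudobulk), log_cpm(bulk_signal))


# Synthetic data only: 100 cells, each with 200 reads drawn from a made-up bulk profile.
random.seed(0)
n_features = 300
bulk = [random.gammavariate(2.0, 50.0) for _ in range(n_features)]
cells = []
for _ in range(100):
    counts = [0] * n_features
    for feature in random.choices(range(n_features), weights=bulk, k=200):
        counts[feature] += 1
    cells.append(counts)

r = pseudobulk_correlation(cells, bulk)
print(f"R = {r:.2f}")
```

Because the synthetic cells are sampled directly from the bulk profile, the resulting R is high by construction; real data would also reflect assay-specific biases.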
The data quality of Parallel-Seq was further compared with that of sci-CAR, paired-Seq, SNARE-Seq, and SHARE-Seq. For both libraries, Parallel-Seq showed better data quality than the state-of-the-art method SHARE-Seq (Figures 10-11): it yielded more ATAC fragments and RNA UMIs, captured more genes than the other methods, and had greater bandwidth.
Example 3 Parallel-Split-Seq and performance verification
To make the method easier to use and further reduce cost, Parallel-Split-Seq was developed. It changes where the third index of Parallel-Seq is added: instead of adding the third index during linear amplification in droplets, an additional round of ligation on a plate adds it. The in-droplet linear amplification step is retained, but no third index is added in that step. The barcode space is 24 × 96 × 96 × 96 ≈ 2.12 × 10^7 (Figure 12).
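As a quick check of the quoted figure, the snippet below multiplies out the four Parallel-Split-Seq round sizes and compares the result with the Parallel-Seq barcode space quoted in Example 1. It is plain arithmetic on the numbers in the text; the variable names are illustrative.

```python
import math

# Round sizes quoted for Parallel-Split-Seq.
split_seq_space = math.prod([24, 96, 96, 96])
# Round sizes quoted for Parallel-Seq in Example 1.
parallel_seq_space = math.prod([96, 96, 100_000, 16])

ratio = parallel_seq_space / split_seq_space
print(f"Parallel-Split-Seq barcode space: {split_seq_space:.3e}")
print(f"Parallel-Seq space is about {ratio:.0f} times larger")
```

The plate-based variant therefore trades a smaller barcode space for a simpler workflow.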
In this example, Parallel-Split-Seq was performed on a mixture of NIH/3T3 (mouse), HEK293T (human), HeLa (human), K562 (human), and THP1 (human) cells (same steps as in Example 1). The specific steps are as follows:
(1) The cells of each sample are fixed and permeabilized, and open chromatin is labeled with a transposon-specific barcode using barcoded Tn5 transposons; the transposon-specific barcode sequence Tn5ME-A is shown in SEQ ID NO: 12, the Tn5ME-x sequence (x represents 1-27) carrying the first index is shown in SEQ ID NO: 2, and XXXXXX in the sequence represents the first index (see Table 1);
Tn5ME-A: TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG (SEQ ID NO: 12)
(2) The mRNA of each sample is reverse transcribed using the barcode-matched poly(T) primer R1BxT15VN and the random hexamer primer R1BxN6; the reverse transcription primers R1BxT15VN and R1BxN6 (x represents 1-27) are shown in SEQ ID NO: 3 and SEQ ID NO: 4, and XXXXXX in the sequence represents the first index (see Table 1);
(3) The cells of the different samples are combined and randomly distributed into 96-well plates, and two ligation reactions are performed to add the second index and the third index. Each well contains the R2′ sequence (SEQ ID NO: 13) when the second index is added and the R3′ sequence (SEQ ID NO: 14) when the third index is added, and well-specific adapter sequences are ligated to the transposed chromatin or to the first-strand cDNA. The well-specific adapter sequence carrying the second index is shown as R2Bx (SEQ ID NO: 15), where XXXXXXXX represents the second index (see Table 2); the well-specific adapter sequence carrying the third index is shown as R3Bx (SEQ ID NO: 16), where XXXXXXXX represents the third index (see Table 2);
R2′ sequence: TACTGCAGCTGAACCTC (SEQ ID NO: 13)
R3′ sequence: TCTCCAAAGCTGTGGAC (SEQ ID NO: 14)
R2Bx sequence: /5Phos/TTGGAGAXXXXXXXXGAGGTTCAGC (SEQ ID NO: 15)
R3Bx sequence: CAGACGTGTGCTCTTCCGATCTNNNNNNNNNNXXXXXXXXGTCCACAGCT (SEQ ID NO: 16)
(4) RNA is digested with RNase;
(5) Second-strand synthesis with random primers attaches a second PCR anchor to the cDNA; the primer used for second-strand synthesis is the P5 short primer;
P5 short primer: ACACGACGCTCTTCCGATCTNNNGGNNNB (SEQ ID NO: 17)
(6) All cells are pooled, counted, and diluted to 800 cells/µL, then dispensed into PCR tubes at 2.5 µL of cells per tube;
(7) The cells are lysed in the PCR tubes, and the PCR amplification mix (containing P5xx (SEQ ID NO: 18), N5xx (SEQ ID NO: 19), and P3xx (SEQ ID NO: 10)) is added directly for amplification, which adds the fourth index (see Table 2);
(8) Each PCR product is purified and split into two portions, from which the transcriptome and the accessible chromatin fragments are amplified with the corresponding primers. The transcriptome is amplified with the p3 end primer (SEQ ID NO: 20) and P5xx (SEQ ID NO: 18), where XXXXXXXX in the P5xx sequence is the fourth index for the transcriptome (see Table 3); the open chromatin fragments are amplified with the p3 end primer (SEQ ID NO: 20) and N5xx (SEQ ID NO: 19), where XXXXXXXX in the N5xx sequence is the fourth index for the open chromatin fragments (see Table 3).
p3 end sequence: CAAGCAGAAGACGGCATACGAGAT (SEQ ID NO: 20)
P5xx sequence: AATGATACGGCGACCACCGAGATCTACACXXXXXXXXACACTCTTTCCCTACACGACGCTCTTCCGATCT (SEQ ID NO: 18)
N5xx sequence: AATGATACGGCGACCACCGAGATCTACACXXXXXXXXTCGTCGGCAGCGTC (SEQ ID NO: 19)
Table 3
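The primer and adapter sequences above use runs of X as placeholders for an index and runs of N for random bases. The sketch below shows one way such templates might be handled in software: filling a placeholder with a concrete index, and turning a template into a regular expression that recovers the random segment and the index from an oligo. The helper names and the example index `ACGTACGT` are hypothetical (the real indexes are listed in the tables), and the matching is exact, in the orientation written, with no tolerance for sequencing errors.

```python
import re

# Templates copied from the text; X marks index positions, N marks random bases.
P5XX_TEMPLATE = "AATGATACGGCGACCACCGAGATCTACACXXXXXXXXACACTCTTTCCCTACACGACGCTCTTCCGATCT"
R3BX_TEMPLATE = "CAGACGTGTGCTCTTCCGATCTNNNNNNNNNNXXXXXXXXGTCCACAGCT"


def fill_index(template, index):
    """Replace the run of X placeholders in a template with a concrete index."""
    run = re.search(r"X+", template)
    if run is None:
        raise ValueError("template has no X placeholder")
    if len(index) != len(run.group()) or set(index) - set("ACGT"):
        raise ValueError("index must use only A/C/G/T and match the placeholder length")
    return template[:run.start()] + index + template[run.end():]


def template_to_regex(template):
    """Build a regex capturing the N run as 'random' and the X run as 'index'."""
    pattern = re.sub(r"N+", lambda m: f"(?P<random>[ACGT]{{{len(m.group())}}})", template)
    pattern = re.sub(r"X+", lambda m: f"(?P<index>[ACGT]{{{len(m.group())}}})", pattern)
    return re.compile(pattern)


HYPOTHETICAL_INDEX = "ACGTACGT"  # placeholder value, not an index from the patent
primer = fill_index(P5XX_TEMPLATE, HYPOTHETICAL_INDEX)

# A made-up oligo that follows the R3Bx layout.
oligo = "CAGACGTGTGCTCTTCCGATCT" + "AACCGGTTAA" + "TTGCATGC" + "GTCCACAGCT"
match = template_to_regex(R3BX_TEMPLATE).fullmatch(oligo)
print(primer)
print(match.group("random"), match.group("index"))
```

A production demultiplexer would additionally handle reverse-complemented reads and allow mismatches against a whitelist of known indexes.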
The results show that Parallel-Split-Seq has good specificity, a low collision rate, and a high correlation with bulk data (see Figures 13-18). Moreover, the performance of Parallel-Split-Seq is comparable to that of Parallel-Seq and superior to existing methods (see Figures 19-20).
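A rough feel for the expected collision rate can be obtained from the dilution in step (6). The estimate below is a back-of-the-envelope sketch, not a result from the patent. It assumes that the fourth index identifies the PCR tube, that the first three rounds contribute 24 × 96 × 96 combinations, and that cells take those combinations uniformly and independently.

```python
import math

CELLS_PER_UL = 800                 # dilution from step (6)
VOLUME_UL = 2.5                    # volume per PCR tube from step (6)
ROUNDS_BEFORE_PCR = [24, 96, 96]   # assumed sizes of the first three index rounds

cells_per_tube = CELLS_PER_UL * VOLUME_UL
combos_per_tube = math.prod(ROUNDS_BEFORE_PCR)

# Probability that a given cell shares its three-round barcode with
# at least one other cell in the same tube: 1 - (1 - 1/combos)^(cells - 1).
collision = -math.expm1((cells_per_tube - 1) * math.log1p(-1.0 / combos_per_tube))
print(f"{cells_per_tube:.0f} cells per tube, {combos_per_tube} combinations")
print(f"idealized collision fraction: {collision:.2%}")
```

With about 2000 cells per tube, this idealized model gives a collision fraction of roughly 0.9%; loading fewer cells per tube would lower it proportionally.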
The preferred embodiments of the present invention are described in detail above, but the present invention is not limited to the specific details of those embodiments. Within the scope of the technical concept of the present invention, a variety of simple modifications can be made to its technical solution, and all such simple modifications fall within the protection scope of the present invention. It should also be noted that the specific technical features described in the above embodiments can be combined in any suitable manner, provided they do not contradict one another. To avoid unnecessary repetition, the various possible combinations are not described separately.
Claims (18)
The P3xx index, N7xx index, P5xx index or N5xx index is as follows:
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/CN2023/098377 WO2024250155A1 (en) | 2023-06-05 | 2023-06-05 | Method for constructing single cell sequencing library |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2024250155A1 (en) | 2024-12-12 |
Family ID: 93794880
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/CN2023/098377 Pending WO2024250155A1 (en) | 2023-06-05 | 2023-06-05 | Method for constructing single cell sequencing library |
Country Status (1)
| Country | Link |
|---|---|
| WO (1) | WO2024250155A1 (en) |
Citations (7)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN109996892A (en) * | 2016-12-07 | 2019-07-09 | 深圳华大智造科技有限公司 | Construction method and application of single-cell sequencing library |
| WO2021189679A1 (en) * | 2020-03-27 | 2021-09-30 | 中国人民解放军陆军军医大学 | Method for constructing single cell transcriptome sequencing library and use thereof |
| US20220259586A1 (en) * | 2017-05-26 | 2022-08-18 | 10X Genomics, Inc. | Single cell analysis of transposase accessible chromatin |
| US20220356461A1 (en) * | 2019-12-19 | 2022-11-10 | Illumina, Inc. | High-throughput single-cell libraries and methods of making and of using |
| CN115478098A (en) * | 2022-10-10 | 2022-12-16 | 中国科学技术大学 | A single-cell transcriptome and chromatin accessibility dual-omics sequencing library construction method and sequencing method |
| CN115537408A (en) * | 2022-10-08 | 2022-12-30 | 厦门大学 | Single cell multi-omics library and construction method thereof |
| CN116949132A (en) * | 2023-06-05 | 2023-10-27 | 清华大学 | A method for constructing single-cell sequencing libraries |
2023-06-05: WO application PCT/CN2023/098377, patent WO2024250155A1 (en), active, pending
Patent Citations (8)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN109996892A (en) * | 2016-12-07 | 2019-07-09 | 深圳华大智造科技有限公司 | Construction method and application of single-cell sequencing library |
| EP3553180A1 (en) * | 2016-12-07 | 2019-10-16 | MGI Tech Co., Ltd. | Method for constructing single cell sequencing library and use thereof |
| US20220259586A1 (en) * | 2017-05-26 | 2022-08-18 | 10X Genomics, Inc. | Single cell analysis of transposase accessible chromatin |
| US20220356461A1 (en) * | 2019-12-19 | 2022-11-10 | Illumina, Inc. | High-throughput single-cell libraries and methods of making and of using |
| WO2021189679A1 (en) * | 2020-03-27 | 2021-09-30 | 中国人民解放军陆军军医大学 | Method for constructing single cell transcriptome sequencing library and use thereof |
| CN115537408A (en) * | 2022-10-08 | 2022-12-30 | 厦门大学 | Single cell multi-omics library and construction method thereof |
| CN115478098A (en) * | 2022-10-10 | 2022-12-16 | 中国科学技术大学 | A single-cell transcriptome and chromatin accessibility dual-omics sequencing library construction method and sequencing method |
| CN116949132A (en) * | 2023-06-05 | 2023-10-27 | 清华大学 | A method for constructing single-cell sequencing libraries |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 23940045; Country of ref document: EP; Kind code of ref document: A1 |