
WO2025233344A1 - Methods and compositions for barcoding nucleic acids - Google Patents

Methods and compositions for barcoding nucleic acids

Info

Publication number
WO2025233344A1
Authority
WO
WIPO (PCT)
Prior art keywords
sequence
oligonucleotide
primer
barcoded
unique barcode
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
PCT/EP2025/062369
Other languages
French (fr)
Inventor
Sami ELLOUZE
Yannick RONDELEZ
Anis SENOUSSI
Guillaume GINES
Andrew Griffiths
Pablo IBANEZ
Gaël BLIVET-BAILLY
Benjamin LASSUS
Lucie CAVALIÉ
Sophie JIN
Pascaline Mary
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ecole Superieure de Physique et Chimie Industrielles de Ville de Paris ESPCI
Hifibio SAS
Original Assignee
Ecole Superieure de Physique et Chimie Industrielles de Ville de Paris ESPCI
Hifibio SAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ecole Superieure de Physique et Chimie Industrielles de Ville de Paris ESPCI, Hifibio SAS

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6806: Preparing nucleic acids for analysis, e.g. for polymerase chain reaction [PCR] assay

Definitions

  • the present invention is in the field of molecular biology and relates to methods and compositions for barcoding nucleic acids.
  • the invention also encompasses methods for analysing single cells.
  • the invention is also in the field of microfluidics as the methods may be implemented in microfluidic systems.
  • Oligonucleotide barcoding strategies play a key role in single cell analysis. Different strategies have been developed for barcoding and analysing single cells.
  • the method comprises the steps of co-encapsulating each cell with a distinctly barcoded microparticle in a microfluidic droplet, lysing the cells after they are isolated in droplets, capturing the mRNAs originating from the cell on the microparticle to form the STAMPS (Single-cell Transcriptomes Attached to Microparticles) and reverse-transcribing, amplifying and sequencing these STAMPS in a single reaction.
  • these methods using beads for delivering barcoded primers into droplets show some drawbacks.
  • a photocleavable linker is required to release primers from the bead, which complicates bead fabrication and makes it less cost-effective.
  • the use of UV light may introduce damage to DNA or RNA and bias in the results.
  • the Drop-Seq system does not release primers from the bead and the reaction efficiency is low as reactions take place only near the surface of the beads.
  • Rotem et al. (PLoS One 2015, 10(5): e0116328) have described a microfluidic droplet-based approach for labelling mRNA prior to sequencing. The method is based on the electrical coalescence of two adjacent droplets, each containing either the mRNA from a single cell lysate or the unique labels. Reverse transcription reagents were injected after droplet coalescence. Although this method relies on single cell cDNA labelling, the transcriptomic sequence data comes from an aggregate of multiple phenotypically and genotypically uncorrelated cells.
  • compositions according to the present invention are intended to provide a straightforward and efficient solution for single cell analysis as described hereinafter.
  • the present invention provides a method of barcoding a target sequence, the method comprising the steps of: a. providing in a reactor: i. an oligonucleotide comprising a unique barcode sequence flanked by a priming sequence, ii. an amplification mixture comprising a polymerase with strand displacement capability and a nickase, iii. an oligonucleotide primer pair comprising a forward primer and a reverse primer, iv. a single cell comprising the target sequence; b. contacting said oligonucleotide comprising a unique barcode sequence with one oligonucleotide primer of the oligonucleotide primer pair; c.
  • the present invention provides a method of analysing a single cell, the method comprising the steps of: a. providing in a reactor: i. an oligonucleotide comprising a unique barcode sequence flanked by a priming sequence, ii. an amplification mixture comprising a polymerase with strand displacement activity and a nickase, iii. an oligonucleotide primer pair comprising a forward primer and a reverse primer, iv. a single cell comprising a target sequence; b. contacting said oligonucleotide comprising a unique barcode sequence with one oligonucleotide primer of the oligonucleotide primer pair; c.
  • the present invention provides a composition
  • a composition comprising: a. a forward primer comprising a nickase site, a first primer sequence complementary to a first end of an oligonucleotide comprising a unique barcode sequence, a first priming sequence and, optionally, a unique molecular identifier (UMI) sequence and a first adapter sequence; and b. a reverse primer comprising a nickase site, a second primer complementary to a second end of the oligonucleotide comprising a unique barcode sequence, a second priming sequence and, optionally, a unique molecular identifier (UMI) sequence.
  • Figure 1(A) shows a schematic of a first cycle of an isothermal exponential amplification method starting from an oligonucleotide comprising a unique barcode sequence and using two primers to amplify the sequence and generate a high concentration of barcoded primers. These barcoded primers can be subsequently used for a reverse transcription reaction.
  • Figure 2 shows a schematic workflow of the methods according to the present invention.
  • Figure 3 shows a schematic workflow of the methods according to the present invention.
  • Figure 4 shows a schematic of forward and reverse primers and unique barcode molecule design.
  • the forward and reverse pre-primer sequences shown in the figure correspond to the priming sequences.
  • the reverse primer introduces a unique molecule identifier (UMI) and a gene specific primer (GSP).
  • Figure 5 shows a table providing the barcode cluster analysis.
  • Figure 6 shows a graph indicating the fraction of human and mouse reads in each barcode cluster.
  • Figure 7 shows the size of the isothermal strand displacement amplification (iSDA) products when the template used is the RCA product containing 100 to 1000 copies of the linear template vs the linear template (SEQ ID NO.1).
  • Figure 8 shows the size of double-stranded DNA (dsDNA) obtained after iSDA-RT on a range of RNA extract in bulk, and second strand synthesis.
  • Fresh DNBs were used for all conditions (except PC and C7) and were amplified by iSDA.
  • the capture sequence used is specific for β-actin and a PCR was performed for the second strand synthesis.
  • PC: positive control, purchased iSDA final product used as a template, 500 ng RNA; C1: 0 ng RNA; C2: 100 ng RNA; C3: 250 ng RNA; C4: 500 ng RNA; C5: 750 ng RNA; C6: + 0.003% Igepal, 500 ng RNA; C7: old DNB, 500 ng RNA; NC: negative control, no RNA no enzyme mix.
  • Figure 9(A,B) shows the size of double-stranded DNA (dsDNA) obtained after iSDA-RT on cells in bulk, and second strand synthesis, as well as the alignment of the sequences obtained by Sanger sequencing with the theoretical sequence.
  • the capture sequence used is specific for β-actin and a PCR was performed for the second strand synthesis.
  • the primers used for Sanger sequencing (Fwd: GAGCAAGAGAGGCATCCTCAC (SEQ ID NO.262) and Rev: TGACGTGTGCTCTTCCGATC (SEQ ID NO.263)) are the same as those used for PCR; the two sequences obtained were combined to cover the whole sequence.
  • Figure 10(A,B,C) shows the size of double-stranded DNA (dsDNA) obtained after iSDA-RT on cells in 350 pL drops, and second strand synthesis, as well as the alignment of the sequence obtained by Sanger sequencing with the theoretical sequence.
  • the capture sequence used is specific for β-actin and a PCR was performed for the second strand synthesis.
  • the primers used for Sanger sequencing (Fwd and Rev) are the same as those used for PCR; the two sequences obtained were combined to cover the whole sequence.
  • PC1 and PC2 were performed with different incubations; C1: 1 pM DNB + 0.003% Igepal; C2: 1 nM DNB + 0.003% Igepal; C3: 1 pM DNB - Igepal; C4: 1 pM DNB + 0.3% Igepal; C5: 1 pM DNB - tRNA; Bulk: 1 pM DNB + 0.003% Igepal; NTC: no template control (iSDA); NC: negative control, no cells.
  • High-throughput single cell sequencing methods rely on microfluidics for cell barcoding.
  • a key step in these methods is loading droplets with high concentrations of barcoded oligonucleotides to tag nucleic acids of interest.
  • Barcode loading is often achieved using beads on which barcode sequences are synthesized.
  • current barcode designs utilize fixed targeting primers that are not easily adapted for different purposes; therefore, if new targets are identified, a new batch of beads must be synthesized, which is expensive and laborious.
  • scRNAseq single-cell RNA sequencing
  • additional targets cannot be easily added to existing whole transcriptome or multiplexed amplicon beads.
  • Methods for screening cells having a phenotype of interest and recovery of specific cell genotype information are highly desirable since the recovery of single cell specific genotype together with single cell specific phenotype is very challenging.
  • the innovative concept behind the methods and compositions according to the present invention lies in proposing an in situ amplification of a single barcode molecule based on polymerase/nicking cycles that allow the generation of a large concentration of barcoded molecules in the same reactor wherein the single cell is subjected to analysis.
  • the large concentration of barcoded molecules may be subsequently used for barcoding nucleic acids that originate from the single cell.
  • the methods disclosed herein present various advantages. For instance, the generation of a plurality of barcoded primers and subsequent barcoding of the single cell's nucleic acids occur in the same reactor. Therefore, the methods do not require extensive procedures of manipulation of the single cells that are laborious, time-consuming and error-prone, in particular, when the screening of the single cells is carried out on a large scale. Also, the methods of barcoding a target sequence and analysing a single cell disclosed herein do not require the use of beads bearing barcoded sequences/primers.
  • the methods and compositions according to the present invention also allow analysis of a single cell displaying a phenotype of interest as only these single cells, after being isolated in a reactor or encapsulated in a droplet, would trigger the step of generating a plurality of barcoded primers for a target sequence originating from the single cell. Therefore, the methods and compositions disclosed herein allow identification of subpopulations of single cells that are closely associated with a specific phenotype.
  • the present invention relates to a method of barcoding a target sequence, the method comprising the steps of: a. providing in a reactor: i. an oligonucleotide comprising a unique barcode sequence flanked by a priming sequence, ii. an amplification mixture comprising a polymerase with strand displacement capability and a nickase, iii. an oligonucleotide primer pair comprising a forward primer and a reverse primer, iv. a single cell comprising the target sequence; b. contacting said oligonucleotide comprising a unique barcode sequence with one oligonucleotide primer of the oligonucleotide primer pair; c.
  • the present invention relates to a method of analysing a single cell, the method comprising the steps of: a. providing in a reactor: i. an oligonucleotide comprising a unique barcode sequence flanked by a priming sequence, ii. an amplification mixture comprising a polymerase with strand displacement capability and a nickase, iii. an oligonucleotide primer pair comprising a forward primer and a reverse primer, iv. a single cell comprising a target sequence; b. contacting said oligonucleotide comprising a unique barcode sequence with one oligonucleotide primer of the oligonucleotide primer pair; c.
  • nucleic acid and oligonucleotide may be used interchangeably and refer to naturally-occurring or synthetic polymeric forms of nucleotides. Therefore, the nucleic acids and oligonucleotides of the present invention may be formed of naturally-occurring nucleotides, e.g., deoxyribonucleic acid (DNA) or ribonucleic acid (RNA), natural or synthetic modifications of nucleotides or artificial bases.
  • the nucleic acids and oligonucleotides may exist as single- or double-stranded DNA or RNA, or as an RNA/DNA heteroduplex.
  • nucleic acid and oligonucleotide may refer to a short polynucleotide, generally less than or equal to 200 nucleotides in length, preferably between 5 and 150 nucleotides in length, more preferably between 10 and 100 nucleotides in length, even more preferably between 20 and 50 nucleotides in length.
  • a “nucleic acid” or “oligonucleotide” may hybridize to other polynucleotides, therefore serving as a probe for polynucleotide detection, or a primer for polynucleotide chain extension.
  • the indefinite article “a” or “an” may also refer to “one or more” or “at least one”.
  • the term “an oligonucleotide” includes “one or more oligonucleotides”.
  • the term “an oligonucleotide” encompasses one or more oligonucleotides, wherein each oligonucleotide may comprise a unique barcode or a plurality of the same unique barcode sequence.
  • the method of analysing a single cell disclosed herein allows the analysis of one or more single cells.
  • target sequence refers to a nucleic acid or fragment thereof originating from a single cell that is subjected to barcoding and analysis.
  • Target sequences may be DNA or RNA molecules.
  • the target sequence is of mammalian, viral, bacterial, plant or fungal origin. In one embodiment, the target sequence is of mammalian origin, preferably, is of human origin.
  • the target sequence is a nucleic acid selected from a transcriptome, a genome, an exome, transfected or transduced nucleic acid, a mitochondrial DNA, a chloroplast DNA or a modified nucleic acid.
  • modified nucleic acid refers to nucleic acid modified by biological reactions within the cell or by synthetic approaches.
  • a discrete set of genes or transcripts originating from the single cell may be targeted for analysis. Therefore, the method and composition according to the present invention can also barcode at the same time more than one target sequence originating from the single cell.
  • the target sequence is a transcript sequence.
  • the method further comprises reverse transcribing the transcriptome to produce a cDNA library tagged with the amplified unique barcode molecules.
  • a reverse transcriptase may be provided in the reactor.
  • the target sequence is selected from a subpopulation of nucleic acid in the single cell, a modified nucleic acid in the single cell or exposed on the surface of the single cell.
  • the single cell may bear on its surface a target sequence which triggers a barcoding reaction while the single cell is not lysed or is still viable.
  • the plurality of single-stranded barcoded oligonucleotide sequences targets a biomolecule exposed on the surface of the single cell.
  • the methods are used for single cell transcriptome sequencing, single cell genomic sequencing or single cell methylome sequencing.
  • barcode sequence refers to a unique nucleic acid sequence that may be distinguished by its sequence from another nucleic acid sequence, so that a nucleic acid labelled with it may be distinguished from another nucleic acid carrying another barcode sequence.
  • the barcode sequence distinguishes the nucleic acids released from a single cell from nucleic acids released from other single cells, for instance, even after the nucleic acids are pooled together.
  • the barcode sequence may be used to distinguish tens, hundreds or even thousands of nucleic acids arising from different single cells.
  • the barcode within the unique barcode sequence is a short DNA sequence wherein some positions of the DNA sequence may contain degenerate bases.
  • a base labelled N will correspond to an approximately balanced mixture of A, T, C and G. Other mixtures are possible and are indicated by the letters according to a standard nucleic acid notation.
  • the barcode then consists of one or multiple stretches of randomized sequence.
  • a random barcode may be a stretch of 15 N.
  • a "split barcode" may consist of 2 stretches of 12 N separated by a constant spacer sequence.
  • a "structured barcode" may be (WS)10 or NNNNNWNNNNNWNNNNNNN.
  • structured barcodes may be easier to identify in a sequencing read, because they must follow a specific pattern, whereas a random barcode may be any sequence.
  • Another advantage of structured barcodes is that they make it possible to avoid some sequences or sequence patterns. For example, the barcode (WS)12 cannot contain a homopolymer because W is A or T and S is C or G, and this may lead to fewer sequencing errors if the sequencing technique that is used is prone to errors on homopolymers.
  • Another advantage of a structured barcode is that it may be designed to not contain one or more specific subsequences, such as the recognition sequences of the nickases, primer binding site, sequences similar to the gene specific primer, homopolymers and combinations thereof.
  • a structured barcode may be designed to keep a balanced content of GC vs AT (e.g., 50/50) in the barcode. In this way, it is less likely to observe failures or side products during the amplification of the unique barcode sequence in the reactor.
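  • As an illustration of the structured-barcode properties described above, the following Python sketch checks a candidate barcode against a degenerate pattern, measures its longest homopolymer run and its GC fraction, and looks for unwanted subsequences on either strand. The function names, the example barcode and the use of GAGTC (taken here to be the Nt.BstNBI recognition sequence) as the forbidden motif are assumptions made for illustration only; they are not taken from the disclosure.

```python
import re

# Standard IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "W": "AT", "S": "CG", "R": "AG", "Y": "CT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def matches_pattern(barcode: str, pattern: str) -> bool:
    """True if every base of `barcode` is allowed by the degenerate `pattern`."""
    regex = "".join(f"[{IUPAC[code]}]" for code in pattern.upper())
    return re.fullmatch(regex, barcode.upper()) is not None


def longest_homopolymer(seq: str) -> int:
    """Length of the longest run of identical consecutive bases."""
    longest = run = 0
    previous = None
    for base in seq.upper():
        run = run + 1 if base == previous else 1
        previous = base
        longest = max(longest, run)
    return longest


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in `seq`."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def reverse_complement(seq: str) -> str:
    return seq.upper().translate(COMPLEMENT)[::-1]


def contains_forbidden(seq: str, motifs) -> bool:
    """True if any motif occurs in `seq` on either strand."""
    seq = seq.upper()
    return any(m in seq or reverse_complement(m) in seq for m in motifs)


# Hypothetical barcode following the (WS)10 pattern.
candidate = "ACTGAGTCACTGTGACAGTC"
print(matches_pattern(candidate, "WS" * 10))     # pattern check
print(longest_homopolymer(candidate))            # longest run of one base
print(gc_fraction(candidate))                    # GC balance
print(contains_forbidden(candidate, ["GAGTC"]))  # assumed nickase motif
```

  • In this hypothetical example the candidate follows the (WS)10 pattern, has no homopolymer and a 50/50 GC content, yet still contains the assumed nickase motif. A pattern alone therefore does not exclude every unwanted subsequence, and an explicit filter may still be needed.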
  • the unique molecular identifier (UMI) sequence may also contain such a pattern.
  • the barcode sequence may be of any suitable length.
  • the barcode sequence is preferably of a length sufficient to distinguish the barcode sequence from other barcode sequences and avoid barcode collisions.
  • the unique barcode sequence contains at least 5, 10, 15 or more degenerate nucleotides. In one embodiment the unique barcode contains between 5 to 100 degenerate nucleotides, preferably from 10 to 50 degenerate nucleotides, more preferably from 15 to 25 degenerate nucleotides.
  • From the barcode degenerate sequence, i.e. the number of degenerate bases and the specific degeneracy of each base, it is possible to compute the total number of possible barcode sequences and, depending on the number of objects to be barcoded, the probability that two of them receive the same sequence (i.e. a collision).
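  • The computation mentioned above can be sketched in Python as follows. The sketch assumes that every allowed base at a degenerate position is equally likely and that barcodes are drawn independently, so the collision probability reduces to the classical birthday problem. The function names and the example numbers of barcoded objects are illustrative assumptions.

```python
import math

# Number of bases allowed by each standard IUPAC nucleotide code.
DEGENERACY = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "W": 2, "S": 2, "R": 2, "Y": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
}


def barcode_diversity(pattern: str) -> int:
    """Total number of distinct sequences allowed by a degenerate pattern."""
    return math.prod(DEGENERACY[code] for code in pattern.upper())


def collision_probability(n_objects: int, diversity: int) -> float:
    """Probability that at least two of `n_objects` receive the same barcode.

    Assumes independent, uniform draws from `diversity` possible barcodes.
    The product of (1 - i/diversity) is accumulated in log space for accuracy.
    """
    if n_objects > diversity:
        return 1.0
    log_all_distinct = sum(
        math.log1p(-i / diversity) for i in range(n_objects)
    )
    return -math.expm1(log_all_distinct)


print(barcode_diversity("NNNWNNNW"))   # 4**6 * 2**2 possible sequences
print(barcode_diversity("N" * 15))     # 4**15 possible sequences
# Hypothetical experiment: 10,000 barcoded objects, 15 random positions.
print(collision_probability(10_000, barcode_diversity("N" * 15)))
```

  • Under these assumptions, the probability that at least one pair among 10,000 objects shares a 15 N barcode is roughly 0.045; a shorter pattern such as NNNWNNNW, with 16,384 possible sequences, would make collisions almost certain at that scale.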
  • the barcode sequence is NNNWNNNW
  • the barcode sequence may consist of one unique barcode sequence or a plurality of the same unique barcode sequence or may consist of more than one barcode sequence.
  • the different barcode sequences may be taken from a pool of barcode sequences, which themselves have been generated by split-and-pool synthesis. If the barcode sequence consists of more than one barcode sequence, the barcode sequences may be taken from the same or different pools of barcode sequences.
  • the pool of sequences may be selected using any suitable technique, e.g., randomly or such that the sequences allow for error detection and/or correction, for instance, by being separated by a certain distance (e.g., Hamming distance) such that errors in reading of the barcode sequence may be detected, and in some cases, corrected.
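  • The role of the Hamming distance in error detection and correction can be illustrated with a short Python sketch. It assumes substitution errors only and barcodes of equal length; the pool shown is a made-up toy example, and the function names are illustrative. A pool with minimum pairwise distance d allows up to d - 1 substitutions to be detected and up to (d - 1) // 2 to be corrected.

```python
from itertools import combinations
from typing import Optional, Sequence


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def min_pairwise_distance(pool: Sequence[str]) -> int:
    """Smallest Hamming distance between any two barcodes of the pool."""
    return min(hamming(a, b) for a, b in combinations(pool, 2))


def correct(read: str, pool: Sequence[str], max_errors: int) -> Optional[str]:
    """Return the single pool barcode within `max_errors` of `read`, else None."""
    hits = [bc for bc in pool if hamming(read, bc) <= max_errors]
    return hits[0] if len(hits) == 1 else None


# Hypothetical toy pool; real pools would be far larger.
pool = ["AAAAAA", "AAACCC", "CCCAAA", "CCCCCC"]
d = min_pairwise_distance(pool)
print(d, (d - 1) // 2)            # minimum distance, correctable errors
print(correct("AAACCA", pool, 1))  # read with one substitution
print(correct("ACACAC", pool, 1))  # too far from every barcode
```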
  • the pool may have any number of barcode sequences. Methods for joining different barcode sequences taken from one pool or more than one pool are known to the person skilled in the art.
  • the unique barcode molecule may comprise a single barcode or multiple concatenated copies of the barcode.
  • Single barcode molecules containing multiple concatenated repeats may be obtained via reactions such as rolling circle amplification (RCA), loop-mediated isothermal amplification (LAMP) or terminal hairpin and self-priming extension (THSP).
  • a non-hyperbranched reaction is preferably used to obtain the concatenated repeats to ensure that a given barcode is present in only one concatemeric molecule.
  • the oligonucleotide comprising a unique barcode sequence is obtained by a rolling circle amplification.
  • the oligonucleotide comprising a unique barcode sequence comprises a single index or a plurality of indexes linked to each other.
  • index refers to a nucleotide sequence characterising the barcode.
  • the oligonucleotide comprising a unique barcode sequence is a double-stranded nucleic acid, a single-stranded nucleic acid, a partially double-stranded nucleic acid or a partially single-stranded nucleic acid.
  • the oligonucleotide comprising a unique barcode sequence may be floating in the reactor or bound to a surface, e.g., a particle or a surface of the reactor, or the surface of a cell.
  • the unique barcode sequence is comprised within the single-stranded oligonucleotide obtained in step (d) of the methods according to the present invention.
  • This single-stranded oligonucleotide comprising the unique barcode sequence may be referred to as a barcoded primer.
  • barcoded primer refers to at least one molecule of about 20 to about 200 nucleobases in length that may function to prime nucleic acid synthesis.
  • the barcoded primer may be of about 30 to about 150 nucleobases in length, of about 40 to about 100 nucleobases in length, of about 50 to about 90 nucleobases in length, of about 60 to about 80 nucleobases in length, or of about 70 nucleobases in length.
  • a barcoded primer is an oligonucleotide comprising a barcode sequence or a set of barcode sequences and a primer sequence.
  • the barcoded primer further comprises a unique molecular identifier (UMI).
  • UMI is located at the 3'-end or the 5'-end of the barcode sequence.
  • the UMI sequences are well known in the art and are described, for instance, in Kivioja et al. (Nature Methods 2012, 9: 72-74).
  • the UMI has a length ranging from 3 to 30 nucleotides, preferably from 5 to 20 nucleotides, more preferably from 8 to 13 nucleotides.
  • the barcoded primer may comprise one of the following structures: (a) a polymerase chain reaction (PCR) handle, a barcode sequence, an adaptor, a poly(dT) tail, a gene-specific primer (GSP) or a set of gene-specific primers (GSPs); (b) a T7 promoter region, a PCR handle, a barcode sequence, an adaptor and a poly(dT) tail; or (c) a PCR handle, a barcode sequence, an adaptor and a template switching oligonucleotide (TSO) for 5' -end amplification.
  • the barcoded primer may comprise a unique molecule identifier (UMI).
  • the barcoded primer comprises from 5' to 3': a PCR handle, optionally a UMI, a barcode sequence or a set of barcode sequences, an adapter and a capture/targeting sequence.
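  • The 5' to 3' layout described above can be represented as a small data structure. The following Python sketch is only an illustration: the class name, field names and all sequences are hypothetical placeholders, not sequences from the disclosure.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BarcodedPrimer:
    """Segments of a barcoded primer, listed in 5' to 3' order."""
    pcr_handle: str
    barcode: str
    adapter: str
    capture: str              # capture/targeting sequence, e.g. poly(dT) or a GSP
    umi: Optional[str] = None  # optional, placed between handle and barcode

    def sequence(self) -> str:
        """Concatenate the segments from 5' to 3', skipping an absent UMI."""
        parts = [self.pcr_handle, self.umi or "", self.barcode,
                 self.adapter, self.capture]
        return "".join(parts)

    def __len__(self) -> int:
        return len(self.sequence())


# Placeholder sequences chosen only to make the layout visible.
primer = BarcodedPrimer(
    pcr_handle="ACGTACGTAC",
    umi="NNNNNNNN",
    barcode="WSWSWSWSWSWSWSWSWSWS",
    adapter="GATCGATC",
    capture="T" * 20,
)
print(primer.sequence())
print(len(primer))
```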
  • the barcoded primer is an oligonucleotide comprising a barcode sequence or a set of barcode sequences, a first adapter sequence, a second adapter sequence, a primer sequence and a capture sequence.
  • the barcoded primer is an oligonucleotide comprising a complementary sequence of the barcode sequence or the set of barcode sequences, the first adapter sequence, the second adapter sequence, the primer sequence and the capture sequence.
  • an "oligonucleotide primer” refers to a short single-stranded nucleic acid of between 10 and 50 nucleotides in length, designed to perfectly or almost perfectly match a nucleic acid of interest, to be captured and then amplified (e.g., by PCR) or reverse transcribed (e.g., by reverse transcription (RT)).
  • the primer sequences are specific to the nucleic acids they hybridize to, i.e., the primer sequences preferably hybridize under stringent hybridization conditions, more preferably under highly stringent hybridization conditions, and are complementary to or almost complementary to the nucleic acids they hybridize to.
  • the primer sequence serves as a starting point for nucleic acid synthesis, allowing polymerase enzymes such as nucleic acid polymerase to extend the primer sequence and replicate the complementary strand.
  • a primer sequence may be complementary to and hybridize to a target nucleic acid.
  • the primer sequence may be a synthetic primer sequence.
  • the forward primer may contain from 5' to 3': (a) a 5' tail, (b) a restriction enzyme or nickase recognition site, (c) an adaptor sequence, and/or (d) a 3' end complementary or reverse complementary to the unique barcode molecule.
  • the adaptor sequence can, for example, be used for later recovery and amplification of, e.g., the cDNA produced by the methods of the invention.
  • the reverse primer may contain from 5' to 3': (a) a 5' tail, (b) a restriction enzyme or nickase recognition site; (c) a gene specific primer or poly(A) section, (d) a spacer, and (e) a 3' end complementary or reverse complementary to the unique barcode molecule.
  • the spacer can, for example, optionally be a unique molecular identifier (UMI) sequence.
  • the forward primer and/or the reverse primer comprises the reverse complement of the target sequence, i.e., the GSP, oligodT or template switch oligo (TSO), when a T7 promoter is to be used. It may be beneficial that one or both of these primers have a 3' phosphate modification.
  • the forward primer comprises a nickase site, a first primer sequence complementary to a first end of the oligonucleotide comprising a unique barcode sequence and, optionally, a unique molecular identifier (UMI) sequence.
  • the forward primer comprises a nickase site, a first primer sequence complementary to a first end of the oligonucleotide comprising a unique barcode sequence, a first priming sequence, and optionally, a unique molecular identifier (UMI) sequence.
  • the reverse primer comprises a nickase site, a second primer complementary to a second end of the oligonucleotide comprising a unique barcode sequence, a priming sequence and, optionally, a unique molecular identifier (UMI) sequence.
  • the reverse primer comprises a nickase site, a second primer complementary to a second end of the oligonucleotide comprising a unique barcode sequence, a second priming sequence, and optionally, a unique molecular identifier (UMI) sequence.
  • "forward pre-primer" corresponds to the "first priming sequence" and "reverse pre-primer" corresponds to the "second priming sequence".
  • amplification mixture refers to a mixture of reagents that are used in a nucleic acid amplification reaction.
  • An amplification mixture comprises a buffer, deoxynucleotide triphosphates (dNTPs) and a polymerase, but does not comprise primers or a sample to be amplified.
  • the enzymes to be used in the amplification mixture are selected from polymerases, nicking enzymes (nickases), restriction enzymes and exonucleases.
  • the amplification mixture comprises a polymerase and a nickase.
  • the amplification reaction in step (d) of the methods according to the present invention uses at least one nicking enzyme.
  • the use of a nickase is subject to reaction conditions and sequence constraints. Temperature and buffer-wise, the nickase must possess sufficient activity at the incubation temperatures and work in concert with DNA polymerase and other elements in the reaction mixture.
  • the recognition sequence for the nickase can, for example, be the same or be different for the forward primer and the reverse primer.
  • the same nicking enzyme can be used for both the forward and reverse primers, or multiple nicking enzymes can be combined for either the forward or the reverse primer or both primers.
  • Nb.BbvCI and Nt.BsmAI are well-tested examples that work effectively, but many enzymes exist and could be used, such as, for example, Nb.Bpu10I and Nb.Mva1269I (ThermoFisher Scientific; Waltham, MA) or Nt.BspQI, Nt.BstNBI, Nb.BsrDI, Nb.BtsI, Nt.AlwI, Nt.BbvCI, Nb.BsmI, and Nb.BssSI (New England Biolabs; Ipswich, MA).
  • For the primer that contains the reverse complement of the targeting sequence, there are more constraints on the nickase recognition sequence. As the cut has to occur immediately prior to the gene specific primer (GSP) or the polyA sequence on the reverse primer, the nickase must belong to the class of type IIS restriction enzymes or "shifted cleavage" enzymes. In some specific cases, nicking enzymes cutting inside the recognition site can be used, for example, if the base that is left on the amplified unique barcode sequence is a T and the targeting domain is polyT, or if the base that is left is compatible with the 3' end of the gene specific primer (GSP). However, for the general case, type IIS nicking enzymes are preferred. Such enzymes can include, but are not limited to, Nt.BsmAI, Nt.AlwI, Nt.BstNBI, and Nt.BspQI.
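  • The "shifted cleavage" constraint can be illustrated with a short Python sketch that locates the nick made by a type IIS nicking enzyme and returns the part of the strand lying 3' of the nick. The default recognition sequence GAGTC and the 4-nucleotide offset are assumptions corresponding to the commonly cited Nt.BstNBI cut position and should be checked against the supplier's documentation; the example strand and the function names are hypothetical.

```python
from typing import List


def nick_positions(strand: str, site: str = "GAGTC", offset: int = 4) -> List[int]:
    """Indices at which the strand is nicked, read 5' to 3'.

    A nick at index k means the backbone is cut between strand[k-1] and
    strand[k]. The enzyme is assumed to cut `offset` nucleotides 3' of the
    end of its recognition `site`, on the same strand as the site.
    """
    strand = strand.upper()
    cuts = []
    start = strand.find(site)
    while start != -1:
        cut = start + len(site) + offset
        if cut <= len(strand):
            cuts.append(cut)
        start = strand.find(site, start + 1)
    return cuts


def released_segment(strand: str, site: str = "GAGTC", offset: int = 4) -> str:
    """Part of the strand located 3' of the nick; requires exactly one site."""
    cuts = nick_positions(strand, site, offset)
    if len(cuts) != 1:
        raise ValueError(f"expected exactly one nick, found {len(cuts)}")
    return strand.upper()[cuts[0]:]


# Hypothetical layout: 5' tail, recognition site, 4-nt shift, then the
# region that should start immediately after the nick.
tail, shift, downstream = "TTTT", "ACGT", "CCCCGGGG"
strand = tail + "GAGTC" + shift + downstream
print(nick_positions(strand))
print(released_segment(strand))
```

  • In this layout the nick falls exactly at the boundary between the 4-nucleotide shift and the downstream region, which is the situation the type IIS requirement is meant to guarantee.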
  • the nickase recognition sequence is replaced in the reverse primer by a restriction enzyme sequence, with the same constraint on the cutting position.
  • each reverse primer can be used for polymerization only once, before the extension product receives a double stranded cut by the restriction enzyme.
  • the polymerase for use according to the invention is selected from the group consisting of Bst 2.0 DNA polymerase, Bst large fragment DNA polymerase, Klenow fragment (3'->5' exo-), Phi29 DNA polymerase and Vent(exo-) DNA polymerase. More particularly, the polymerase is Vent(exo-) DNA polymerase. More than one polymerase may be used simultaneously.
  • the nicking enzyme (nickase) for use according to the invention is selected from the group consisting of Nb.BbvCI, Nb.BstI, Nb.BssSI, Nb.BsrDI, Nb.BsmI and Nt.BstNBI. More preferably, the nickase is Nb.BsmI and/or Nt.BstNBI. More than one nickase may be used simultaneously.
  • the nicking enzyme may be replaced by a restriction enzyme.
  • Unlike nicking enzymes, which cut only one strand of a DNA duplex, restriction enzymes cut both strands. Thus, when using restriction enzymes instead of nicking enzymes, it may be necessary to protect the templates used in the method of the invention. This protection may be achieved, for instance, by chemical modification of the templates. Such modification comprises backbone modification, such as a phosphorothioate linkage. More than one restriction enzyme, or a combination of nicking enzyme and restriction enzyme, may be used simultaneously.
  • the amplification mixture further comprises an exonuclease.
  • the exonuclease for use according to the invention is preferably selected from the group consisting of RecJf, Exonuclease I, Exonuclease VII and ttRecJ exonuclease. More preferably, the exonuclease is ttRecJ exonuclease, such as the one obtained following the protocol described by Yamagata et al. (PNAS 2002, 99(9): 5908-5912). More than one exonuclease can be used simultaneously.
  • the reactor includes additional reagents.
  • Additional reagents typically include a reverse transcriptase (RT), cell lysing additives, tagging additives, stabilizing additives, additives to adjust viscosity and density of aqueous phase, and/or deoxynucleotide triphosphates (dNTPs).
  • additional reagents are added to the reactor, said additional reagents comprising at least a reverse transcriptase (RT), cell lysing additives, tagging additives, stabilizing additives, additives to adjust viscosity and density of aqueous phase, and/or deoxynucleotide triphosphates (dNTPs).
  • cDNA complementary DNA
  • the reverse transcriptase is selected from the group consisting of Superscriptase I, Superscriptase II, Superscriptase III, Superscriptase IV, Murine Leukemia RT, SmartScribe RT, Maxima H RT, or MultiScribe RT.
  • the reverse transcriptase is at a concentration of 1 to 50 U/µL, preferably 5 to 25 U/µL, for example at 12.5 U/µL.
  • Non-limiting examples of RNase inhibitors include RNase OUT, IN, SuperIN RNase, and those inhibitors targeting a wide range of RNases (e.g., A, B, C, 1 and T1).
  • the lysis buffer is typically 0.36% Igepal CA 630, 50 mM Tris-HCl pH 8.
  • the "tagging additives" enable the tagging of the target sequence with the amplified unique barcode molecules.
  • Tagging additives can include, but are not limited to, reverse transcriptases, transposases, ligases, or any other enzyme used in single cell-omic assays.
  • the "stabilizing additives" enable the stabilization of the reaction components in the methods disclosed herein.
  • Stabilizing additives can include, but are not limited to, bovine serum albumin (BSA), surfactants, solutes, and cosolvents (e.g., betaine, DMSO, urea, trehalose, etc.).
  • the "additives to adjust viscosity and density of the aqueous phase" enable better microfluidic encapsulation in the methods disclosed herein.
  • An additive to adjust viscosity and density of the aqueous phase can, for example, include carboxymethylcellulose.
  • said additional reagents are added into the reactor, in particular into the microfluidic droplet, by injection from a reservoir, for example using electrical forces (picoinjection) after a first droplet incubation step (Abate et al. (2010) Proc. Nat. Acad. Sci. USA 107:19163-19166).
  • said additional reagents are added into the reactor, in particular into the microfluidic droplet, by coalescence with a second reactor, in particular a second microfluidic droplet, comprising said additional reagents but not comprising any target sequence.
  • Droplets can be coalesced by a variety of methods known to the skilled person, including passive droplet coalescence (see Mazutis et al. (2009) Lab on a Chip, 9(18):2665-2672; Mazutis et al. (2012) Lab Chip, 12:1800-1806), droplet coalescence driven by local heating from a focused laser (Baroud et al. (2007) Lab Chip 7:1029-1033) or using electric forces (Chabert et al.
  • Said second reactor, in particular said second microfluidic droplet, can be prepared by the same techniques as those disclosed above for the reactors comprising the target sequence.
  • the step of reverse transcription defined above refers to reverse transcribing the released nucleic acids hybridized to said barcoded primers using the primer sequence in at least some of the reactors.
  • Reverse transcription is performed using the reverse transcriptase (RT) comprised in at least some of the reactors.
  • RNA/DNA duplex comprising a single strand cDNA hybridized to its template RNA.
  • said RNA/DNA duplex is further linked to the barcoded primer comprising the primer sequence used for the reverse transcription.
  • Template switching refers to a technology described originally in 2001, frequently referred to as "SMART" (switching mechanism at the 5' end of the RNA transcript) technology (Takara Bio USA, Inc). This technology has shown promise in generating full-length cDNA libraries, even from single-cell-derived RNA samples (Zhu et al. (2001) Biotechniques 30:892-897). This strategy relies on the intrinsic properties of Moloney murine leukemia virus (MMLV) reverse transcriptase and the use of a unique template switching oligonucleotide (TS oligo, or TSO).
  • the terminal transferase activity of the MMLV reverse transcriptase adds a few additional nucleotides (mostly deoxycytidine) to the 3' end of the newly synthesized cDNA strand. These bases function as a TS oligo-anchoring site.
  • Upon base pairing between the TS oligo and the appended deoxycytidine stretch, the reverse transcriptase "switches" template strands, from cellular RNA to the TS oligo, and continues replication to the 5' end of the TS oligo.
  • the resulting cDNA contains the complete 5' end of the transcript, and universal sequences of choice are added to the reverse transcription product.
  • this approach makes it possible to efficiently amplify the entire full-length transcript pool in a completely sequence-independent manner (Shapiro et al. (2013) Nat. Rev. Genet. 14:618-630).
  • the reactor further comprises cDNAs.
  • At least some of the reactors further comprise cDNAs produced by reverse transcription of nucleic acids from the cells contained in said reactors.
  • said cDNA refers to a single-stranded complementary DNA.
  • said cDNA is comprised in a RNA/DNA duplex.
  • the RNA/DNA duplex refers to the RNA that has been reverse transcribed and is hybridized to the primer sequence of at least one of the primers, which is optionally barcoded, contained in the reactor.
  • the RNA/DNA duplex is linked to the primer, which is optionally barcoded, comprising the primer sequence to which the nucleic acid, preferably mRNA, was hybridized and which was used for reverse transcription.
  • the amplification mixture further comprises a reverse transcriptase, in particular when the method is used for barcoding and sequencing RNA molecules originating from a selected population of single cells.
  • the target sequence is a transcript sequence or a plurality of transcript sequences.
  • the priming sequence comprises oligo(dT), oligo(dT)VN or at least one targeted sequence.
  • target sequence refers to a sequence recognizing a target sequence.
  • the target sequence is a RNA sequence.
  • the priming sequence is complementary to the RNA sequence.
  • the amplification mixture further comprises a single-stranded DNA binding protein.
  • the single strand binding protein may be selected from single-stranded DNA binding protein (SSB), extreme thermostable single-stranded DNA binding protein (ETSSB), gp32 or RecA.
  • the amplification mixture further comprises tRNA.
  • tRNA may be necessary for the amplification reaction to alleviate inhibition by the reverse transcriptase, as the reverse transcriptase may inhibit the primer amplification reaction.
  • the forward primer is present at a concentration of at least 10^5 copies per reactor. In another embodiment, the forward primer is present at a concentration ranging from about 10^5 to about 10^10 copies per reactor, preferably from about 10^6 to about 10^9 copies per reactor.
  • the reverse primer is present at a concentration of at least 10^5 copies per reactor. In another embodiment, the reverse primer is present at a concentration ranging from about 10^5 to about 10^10 copies per reactor, preferably from about 10^6 to about 10^9 copies per reactor.
  • the forward primer is present at a concentration of at least 100 nM.
  • the reverse primer is present at a concentration of at least 25 nM.
  • the step of "extending" carried out in step (c) of the methods according to the present invention refers to the extension of a primer by the addition of nucleotides using a polymerase. If a primer that is annealed to a nucleic acid is extended, the nucleic acid acts as a template for extension reaction.
  • the amplification reaction carried out in step (d) of the methods according to the present invention occurs under isothermal conditions (at a constant temperature).
  • Exemplary isothermal amplifications are nicking enzyme amplification reaction (NEAR) and isothermal strand displacement amplification (iSDA).
  • the temperature during the amplification is suitable for keeping the single cell viable.
  • the barcoding reaction carried out in step (e) of the methods according to the present invention occurs at a temperature different from that of the amplification reaction.
  • the amplification reaction continues until at least about 10^5 to about 10^9 copies of barcoded oligonucleotide sequences have been generated per reactor.
  • the amplification reaction carried out in step (d) of the method according to the present invention may be carried out using asymmetric concentrations of the first oligonucleotide primer and the second oligonucleotide primer.
  • the use of asymmetric concentrations of the first oligonucleotide primer and the second oligonucleotide primer makes it possible to obtain single-stranded barcoded sequences at the end of the amplification reaction.
  • asymmetric concentration refers to unequal or unbalanced concentrations of the first oligonucleotide primer and the second oligonucleotide primer.
  • the forward primer is present at a higher concentration than the reverse primer or vice versa.
  • the forward oligonucleotide primer and the reverse oligonucleotide primer are provided in a concentration ratio between 1:1 and 1:100, preferably, between 1:2 and 1:10, more preferably, between 1:4 and 1:6, or vice versa.
  • the reactor further comprises an initiator template.
  • the term "initiator template" refers to a molecule or compound, other than a reactant, that is capable of initiating a chain reaction, such as an amplification reaction.
  • the initiator template may be exposed on the surface of the single cell.
  • the initiator template may be directly or indirectly associated to a phenotype of interest from the single cell.
  • the initiator template may originate from the single cell or be carried on a compound binding to, or exposed on, the surface of the single cell.
  • An exemplary compound binding the surface of the single cell is an antibody.
  • the initiator template may be directly or indirectly linked to the single cell by its 3' end or its 5' end.
  • the initiator template is linked to a biomolecule of interest by its 3' end or its 5' end.
  • the initiator template is selected from a free nucleic acid, a nucleic acid linked to a peptide, a nucleic acid linked to an antibody, a nucleic acid linked to a cell surface protein, a nucleic acid linked to a chemical compound, a nucleic acid linked to a particle, a nucleic acid originating from a single cell or a nucleic acid in a liposome.
  • the initiator template comprises a primer sequence.
  • single cell refers to an individual cell.
  • a single cell is of mammalian, viral, bacterial, plant or fungal origin.
  • a single cell is of mammalian origin, preferably, is of human origin.
  • the single cell comprises an initiator template.
  • the methods of barcoding and analysing a target sequence may be carried out in any reactor suitable for containing a single cell and components for performing amplification reaction, lysis of the single cell and barcoding of the target sequence.
  • the reactor is selected from a chamber, a droplet, a well or a tube.
  • the reactor has a volume ranging from about 10 pL to about 100 pL.
  • the term “droplet” refers to an isolated portion of a first fluid that is completely surrounded by a second fluid.
  • the term “droplet” refers to a microfluidic droplet. Therefore, the method disclosed herein can be carried out in a microfluidic chip.
  • the droplet has a substantially spherical shape and has a volume suitable for or greater than a mammalian cell.
  • the methods as disclosed herein further comprise the step of lysing the single cell to release the target sequence from the single cell;
  • the step of "lysing" carried out in the methods according to the present invention refers to the disruption or poration of the cell membrane. Lysis may be accomplished by enzymatic, physical, mechanical, electrical, thermal or chemical means, or any combination thereof. Methods of cell lysis are only intended to physically lyse the cell to extract its content and do not affect the integrity of the reactor. Methods of cell lysis are known in the art.
  • cell lysis additives enable and aid cell lysis, preferably without disruption of the reactors, in particular, of the droplets.
  • Cell lysing additives can include, but are not limited to, surfactants, enzymes and stabilizers.
  • the cell lysis additives are compatible with RT activity.
  • the cell lysis additives comprise enzymes selected from the group consisting of lysozyme, lysostaphin, zymolase, mutanolysin, glycanases, proteases, and mannase.
  • the cell lysis additives comprise magnesium chloride, a detergent, a buffered solution and an RNase inhibitor.
  • the magnesium chloride is used at a concentration of between 1 mM and 20 mM.
  • the detergent is selected from the group consisting of Triton-X-100, NP-40, Nonidet P40, Tween-20 and IGEPAL CA 630.
  • the detergent is at a concentration of 0.01% to 1%.
  • Non-limiting examples of the buffered solution include Tris-HCl, Hepes-KOH, Pipes-NaOH, maleic acid, phosphoric acid, citric acid, malic acid, formic acid, lactic acid, succinic acid, acetic acid, pivalic (trimethylacetic) acid, pyridine, piperazine, picolinic acid, L-histidine, MES, Bis-tris, bis-tris propane, ADA, ACES, MOPSO, PIPES, imidazole, MOPS, BES, TES, HEPES, DIPSO, TAPSO, TEA (triethanolamine), N-Ethylmorpholine, POPSO, EPPS, HEPPS, HEPPSO, Tris, tricine, Glycylglycine, bicine, TAPS, morpholine, N-Methyldiethanolamine, AMPD (2-amino-2-methyl-1,3-propanediol), Diethanolamine, AMPSO, bo
  • Enzymatic methods to destabilise cell walls are well-established in the art.
  • the enzymes are generally commercially available and, in most cases, were originally isolated from biological sources. Enzymes commonly used include lysozyme, lysostaphin, zymolase, mutanolysin, glycanases, proteases, and mannase.
  • Nonionic and zwitterionic detergents are milder detergents.
  • the Triton X series of nonionic detergents, the IGEPAL CA 630 nonionic detergent, and 3-[(3-Cholamidopropyl)dimethylammonio]-1-propanesulfonate (CHAPS), a zwitterionic detergent, are commonly used for these purposes.
  • ionic detergents are strong solubilizing agents and tend to denature proteins, thereby destroying protein activity and function. SDS, an ionic detergent that binds to and denatures proteins, is used extensively in the art to disrupt cells.
  • Physical cell lysis refers to the use of sonication, thermal shock (above 40°C, below 10°C), electroporation, freezing, shearing or laser-induced cavitation.
  • the cells are lysed on ice.
  • the cell lysis does not disrupt or destroy the reactors, in particular, the droplets, in the context of the invention.
  • the step of "barcoding" carried out in step (e) of the methods according to the present invention refers to adding a unique genetic sequence, i.e., a barcode sequence, to a nucleic acid, which makes it possible to distinguish said barcoded nucleic acid from a nucleic acid having another added genetic sequence, i.e., another unique barcode sequence. Therefore, barcoding may enable one to pool samples of nucleic acids in order to reduce the cost of sequencing per sample, yet retain the ability to determine from which sample a sequence read is derived.
  • Separate library preparations may be prepared for each sample, and each sample may have its own unique barcode. The separately prepared libraries with unique barcodes may then be pooled and sequenced. Each sequence read of the resulting dataset may be traced back to an original sample via the barcode in the sequence read.
  • the barcoding reaction carried out in step (e) comprises an exponential amplification phase followed by a linear amplification phase.
  • step (f) of the methods according to the present invention refers to any method by which the identity of at least 10, at least 20, at least 50, at least 100 or at least 200 or more consecutive nucleotides of a polynucleotide may be obtained.
  • identification of the barcoded nucleic acid may be carried out by sequencing or by PCR. Methods of analysing a barcoded nucleic acid are known in the art.
  • identification of the target sequence contained in each microreactor can be carried out by sequencing, in particular, by sequencing DNA, the barcoded cDNAs obtained as detailed above, or the tags.
  • the barcoded cDNAs produced by the reverse transcription as defined above are recovered and further used for identification, typically, by subsequent amplification by PCR and sequencing library preparation.
  • the method of the invention further comprises recovering cell cDNAs produced by reverse transcription in at least some of the reactors.
  • Recovering herein refers to isolating the barcoded cDNAs produced by reverse transcription in at least some of the reactors from said plurality of reactors.
  • the barcoded cDNA comprises one or more modified nucleotides or nucleotide analogs, for example for facilitating purification of the barcoded cDNA sequences or molecules.
  • modified nucleotides include derivatives of nucleotides with substitutions at the 2' position of the sugar, in particular with the following chemical modifications: O-methyl group (2'-O-Me) substitution, 2-methoxyethyl group (2'-O-MOE) substitution, fluoro group (2'-fluoro) substitution, chloro group (2'-Cl) substitution, bromo group (2'-Br) substitution, cyanide group (2'-CN) substitution, trifluoromethyl group (2'-CF3) substitution, OCF3 group (2'-OCF3) substitution, OCN group (2'-OCN) substitution, O-alkyl group (2'-O-alkyl) substitution, S-alkyl group (2'-S-alkyl) substitution, N-alkyl group (2'-N-alkyl) substitution, O-alkenyl group (2'-O-alkenyl) substitution, S-alkenyl group (2'-S-alkenyl) substitution, N-alkenyl group (2'-alken
  • modified nucleotides include nucleotides wherein the ribose moiety is used to produce locked nucleic acid (LNA), in which a covalent bridge is formed between the 2' oxygen and the 4' carbon of the ribose, fixing it in the 3'-endo configuration.
  • nucleotide analogs include deoxyinosine.
  • nucleotide analogs include biotin labelled nucleotide.
  • Biotin-11-dCTP can be used as a substrate for the reverse transcriptase to incorporate biotins into the cDNA during polymerization, allowing affinity purification using streptavidin or avidin.
  • the barcoded cDNA is further treated with RNAse A and/or RNAse H.
  • RNAse A is an endoribonuclease that specifically degrades single-stranded RNA at C and U residues.
  • the RNAse A is at a concentration of 10 to 1000 µg/µL, preferably 50 to 200 µg/µL, for example at 100 µg/µL.
  • RNAse H is a family of non-specific endonucleases that catalyze the cleavage of RNA via a hydrolytic mechanism.
  • RNase H ribonuclease activity cleaves the 3'-O-P bond of RNA in a DNA/RNA duplex substrate to produce 3'-hydroxyl and 5'-phosphate terminated products.
  • the RNAse H is at a concentration of 10 to 1000 µg/µL, preferably 50 to 200 µg/µL, for example at 100 µg/µL.
  • the barcoded cDNA is further treated with Proteinase K.
  • Proteinase K is a broad-spectrum serine protease that digests proteins, preferentially after hydrophobic amino acids.
  • the Proteinase K is at a concentration of 0.1 to 5 mg/mL, preferably 0.1 to 1 mg/mL, for example at 0.8 mg/mL.
  • the recovered and treated barcoded cDNA are further amplified by PCR.
  • the PCR primer may contain a tail with an index for multiplexing or a random sequence serving as a UMI.
  • the step of sequencing the barcoded cDNA may comprise performing a next generation sequencing (NGS) protocol on a sequencing library.
  • Any type of NGS protocol can be used, such as the MiSeq Systems (Illumina®), the HiSeq Systems (Illumina®), the NextSeq System (Illumina®), the NovaSeq Systems (Illumina®), the IonTorrent system (ThermoFisher), the IonProton system (ThermoFisher), or the sequencing systems produced by Pacific Biosciences or by Nanopore.
  • the NGS protocol comprises loading an amount of the sequencing library between 1 pM and 20 pM, in particular between 1.5 pM and 20 pM, per flow cell of a reagent kit.
  • the NGS sequencing protocol further comprises the step of adding 5-60% PhiX to the amount of the sequencing library or to the flow cell of the reagent kit.
  • the barcoded cDNAs are further amplified.
  • the amplification step is performed by a polymerase chain reaction (PCR), and/or a linear amplification.
  • the linear amplification precedes the PCR reaction.
  • the linear amplification is an in vitro transcription, followed by reverse transcription.
  • the linear amplification is an isothermal amplification.
  • said amplification step is performed after removing unincorporated barcoded primers. In one embodiment, said amplification step is performed prior to the sequencing step defined herein above.
  • the barcoded cDNA produced after reverse transcription is quantified using qPCR.
  • specific sequences necessary for sequencing are added during amplification or by ligation of adaptors, thereby generating a sequencing library.
  • the present invention relates to a composition
  • a composition comprising: a. a forward primer comprising a nickase site, a first primer sequence complementary to a first end of an oligonucleotide comprising a unique barcode sequence, a first priming sequence and, optionally, a unique molecular identifier (UMI) sequence and a first adapter sequence; and b. a reverse primer comprising a nickase site and a second primer complementary to a second end of the oligonucleotide comprising a unique barcode sequence, a second priming sequence, and optionally, a unique molecular identifier (UMI) sequence.
  • the first oligonucleotide primer is present in the composition at a higher concentration than the second oligonucleotide primer or vice versa.
  • the forward primer is present at a concentration of at least 10^5 copies per reactor. In another embodiment, the forward primer is present at a concentration ranging from about 10^5 to about 10^10 copies per reactor, preferably from about 10^6 to about 10^9 copies per reactor. In one embodiment, the reverse primer is present at a concentration of at least 10^5 copies per reactor. In another embodiment, the reverse primer is present at a concentration ranging from about 10^5 to about 10^10 copies per reactor, preferably from about 10^6 to about 10^9 copies per reactor.
  • the composition disclosed herein may be used to barcode a target sequence originating from a single cell.
  • Embodiments of the methods disclosed herein also apply to the composition disclosed herein.
  • Example 1 Single cell transcriptomic analysis on a mixture of two cell types (human Jurkat cell line and mouse p338dl)
  • barcoded DNA template (barcoded DNA balls): circularization and rolling circle amplification (RCA) (Day 0, 5 hours)
  • ssDNA precursor single stranded DNA
  • Table 1 Components for circularization of precursors and linear ssDNA removal.
  • Circular ssDNA (SEQ ID NO:1) 6 µL 0.5 µM 0.1 µM
  • RCA Next rolling circle amplification
  • Circular ssDNA (SEQ ID NO:1) 0.5 µL 100 nM 2 nM
  • the RCA reaction was diluted with H2O to a stock solution of 1 nM and was stored at -20°C.
  • Reverse primer (SEQ ID NO:3) 2.5 µL 1000 nM 25 nM
  • DNA barcoded balls was adjusted according to the size of the droplets used. The number of cells was adjusted according to the experiment, 5-20% of the total volume was preferred. DMEM, 10% FCS, 1% P/S can be replaced by RPMI, 10% FBS, 1% P/S.
  • PDMS polydimethylsiloxane
  • IBAR-RT reaction Remove the surplus of oil and incubate the mix for 40 minutes at 37°C, 25 minutes at 55°C, 20 minutes at 80°C, and then place on ice (4°C).
  • the PCR mix was prepared (Table 7).
  • the reverse primer included an Illumina index and two different indexes were used for the two pooled fractions.
  • RevP (ex: RSeq_tot_i35_SBS12_v2r) (SEQ ID NO:4) 0.5 µM
  • SEQ ID NO:5 5'AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCTGTCGTCGACAACGGCTCC3'
  • the PCR run was performed as follows: 98°C for 30 seconds, followed by 28-34 cycles of 98°C for 10 seconds and 72°C for 38 seconds, followed by 72°C for 2 minutes, followed by a 4°C hold.
  • the purified DNA was sequenced via MinION using the manufacturer's protocol (e.g., using the Ligation protocol SQK-LSK110).
  • Results The reads were filtered by length, separated into indexes and demultiplexed (grouped by barcode) leading to around 2000 unique barcodes.
  • the barcodes that appeared in the two indexes (indicating breakage of the RCA balls) were removed.
  • the reads within each cluster were assigned to human or mouse variant of the targeted gene.
  • the number of different barcodes as well as the fraction of human and mouse reads for each cluster were computed.
  • the results are provided in FIGs. 4 and 5, showing that most barcodes are associated with reads from a single cell type, and few barcodes are associated with reads from both cell types. This demonstrates that the isothermal barcode amplification reaction happening in a given droplet has enabled tagging of the transcripts coming from the co-encapsulated cell, and only that cell.
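The per-barcode analysis described in the Results (grouping reads by barcode, discarding barcodes found in both indexes, and computing the human and mouse read fractions per barcode) can be sketched as follows. This is a minimal illustration only, not the pipeline used in the Example: the read representation as `(index, barcode, species)` tuples, the function name `species_fractions` and the toy reads are all assumptions made for this sketch.

```python
from collections import defaultdict


def species_fractions(reads):
    """Return {barcode: fraction of human reads} for barcodes seen in one index only.

    `reads` is an iterable of (index, barcode, species) tuples, where species is
    "human" or "mouse". This tuple layout is a hypothetical simplification.
    """
    indexes_by_barcode = defaultdict(set)
    counts = defaultdict(lambda: {"human": 0, "mouse": 0})

    for index, barcode, species in reads:
        indexes_by_barcode[barcode].add(index)
        counts[barcode][species] += 1

    fractions = {}
    for barcode, per_species in counts.items():
        # Barcodes found in both indexes are discarded, as in the Results.
        if len(indexes_by_barcode[barcode]) > 1:
            continue
        total = per_species["human"] + per_species["mouse"]
        fractions[barcode] = per_species["human"] / total
    return fractions


# Toy reads (invented for illustration; not data from the Example).
reads = [
    ("i1", "BC_A", "human"), ("i1", "BC_A", "human"), ("i1", "BC_A", "human"),
    ("i1", "BC_B", "mouse"), ("i1", "BC_B", "mouse"),
    ("i1", "BC_C", "human"), ("i2", "BC_C", "mouse"),  # present in both indexes
    ("i2", "BC_D", "human"), ("i2", "BC_D", "mouse"),
    ("i2", "BC_D", "mouse"), ("i2", "BC_D", "mouse"),
]

fractions = species_fractions(reads)
for barcode in sorted(fractions):
    print(barcode, fractions[barcode])
```

In this toy input, a fraction of 1.0 or 0.0 corresponds to a barcode associated with a single cell type, while an intermediate value marks a barcode associated with reads from both cell types.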

Abstract

Methods and compositions for barcoding a target sequence and analysing a single cell are provided. An in situ amplification of a single barcode molecule based on polymerase/nicking cycles allows the generation of a large concentration of barcoded molecules in the same reactor in which the single cell is subjected to analysis. The large concentration of barcoded molecules may subsequently be used for barcoding a target sequence that originates from the single cell.

Description

METHODS AND COMPOSITIONS FOR BARCODING NUCLEIC ACIDS
FIELD OF THE INVENTION
The present invention is in the field of molecular biology and relates to methods and compositions for barcoding nucleic acids. The invention also encompasses methods for analysing single cells. The invention is also in the field of microfluidics as the methods may be implemented in microfluidic systems.
BACKGROUND OF THE INVENTION
Cellular heterogeneity underlies all biological systems and emerges as a result of epigenetic, transcriptional and posttranslational diversity within and between cell populations. Analyses resolved at the single cell level are important to investigate cell-to-cell variation within a cell population as well as cellular networks to discover new ways to diagnose and treat diseases. In recent years, advances in whole-genome and whole-transcriptome amplification have allowed the gene expression profiling and sequencing of the minute amounts of DNA and RNA present in a single cell. Technologies for single cell genomics, transcriptomics and proteomics include microfluidics, transcriptome in vivo analysis and mass cytometry (Heath et al., Nat. Rev. Drug Discov. 2016, 15(3): 204-216).
Oligonucleotide barcoding strategies play a key role in single cell analysis. Different strategies have been developed for barcoding and analysing single cells.
Klein et al. (Cell 2015, 161(5): 1187-1201) have developed a platform for indexing thousands of single cells for RNA sequencing, termed inDrop, wherein each cell is encapsulated into droplets with lysis buffer, reverse transcription reagents and a hydrogel bead carrying barcoded primers. Macosko et al. (Cell 2015, 161(5): 1202-1214) have introduced Drop-Seq, a method for analysing the mRNA expression in single cells by encapsulating these cells in microfluidic droplets for parallel analysis. The method comprises the steps of co-encapsulating each cell with a distinctly barcoded microparticle in a microfluidic droplet, lysing the cells after they are isolated in droplets, capturing the mRNAs originating from the cell on the microparticle to form the STAMPS (Single-cell Transcriptomes Attached to Microparticles) and reverse-transcribing, amplifying and sequencing these STAMPS in a single reaction. However, these methods using beads for delivering barcoded primers into droplets show some drawbacks. In the case of the inDrop system, a photocleavable linker is required to release primers from the bead, which complicates bead fabrication and makes it less cost effective. Also, the use of UV light may introduce damage to DNA or RNA and bias in the results. By contrast, the Drop-Seq system does not release primers from the bead, and the reaction efficiency is low as reactions take place only near the surface of the beads. Rotem et al. (PLoS One 2015, 10(5): e0116328) have described a microfluidic droplet-based approach for labelling mRNA prior to sequencing. The method is based on electrical coalescence of two adjacent droplets, one containing the mRNA from a single cell lysate and the other containing the unique labels. Reverse transcription reagents were injected after droplet coalescence. Although this method relies on single cell cDNA labelling, the transcriptomic sequence data comes from an aggregate of multiple phenotypically and genotypically uncorrelated cells.
In addition to the methods carried out in droplets, bulk methods are also known in the art (Jaitin et al., Science 2014, 343(6172): 776-779). However, the bulk methods enable the analysis of cell populations with lower throughput than corresponding microfluidic methods.
One challenge in the current single cell technologies is how to increase sensitivity and accuracy. This is especially important for low abundance transcripts for which it is difficult to differentiate signal and experimental noise. Another challenge is how to increase throughput. Transcriptional profiling of rare cell types, e.g., tumor cells, that are present in larger cell populations requires rare cell type enrichment and/or processing larger samples of single cells.
Current methods for single cell analysis, such as the Drop-Seq and inDrop systems, require laborious workflows due to microfluidic handling steps for sample processing, synthesis of barcoded beads and co-encapsulation of single cells with barcoded primer beads.
The methods and compositions according to the present invention are intended to provide a straightforward and efficient solution for single cell analysis as described hereinafter.
SUMMARY OF THE INVENTION
In one aspect, the present invention provides a method of barcoding a target sequence, the method comprising the steps of: a. providing in a reactor: i. an oligonucleotide comprising a unique barcode sequence flanked by a priming sequence, ii. an amplification mixture comprising a polymerase with strand displacement capability and a nickase, iii. an oligonucleotide primer pair comprising a forward primer and a reverse primer, iv. a single cell comprising the target sequence; b. contacting said oligonucleotide comprising a unique barcode sequence with one oligonucleotide primer of the oligonucleotide primer pair; c. extending said one oligonucleotide primer positioned on said oligonucleotide comprising a unique barcode sequence, thereby generating a double-stranded oligonucleotide comprising the unique barcode sequence; d. amplifying, in presence of the nickase, said double-stranded oligonucleotide comprising a unique barcode sequence by using the oligonucleotide primer pair, thereby generating a plurality of single-stranded barcoded oligonucleotide sequences; e. barcoding said target sequence with at least one of said plurality of single-stranded barcoded oligonucleotide sequences to generate a barcoded target sequence.
In another aspect, the present invention provides a method of analysing a single cell, the method comprising the steps of: a. providing in a reactor: i. an oligonucleotide comprising a unique barcode sequence flanked by a priming sequence, ii. an amplification mixture comprising a polymerase with strand displacement activity and a nickase, iii. an oligonucleotide primer pair comprising a forward primer and a reverse primer, iv. a single cell comprising a target sequence; b. contacting said oligonucleotide comprising a unique barcode sequence with one oligonucleotide primer of the oligonucleotide primer pair; c. extending said one oligonucleotide primer positioned on said oligonucleotide comprising a unique barcode sequence, thereby generating a double-stranded oligonucleotide comprising the unique barcode sequence; d. amplifying, in presence of the nickase, said double-stranded oligonucleotide comprising a unique barcode sequence by using the oligonucleotide primer pair, thereby generating a plurality of single-stranded barcoded oligonucleotide sequences; e. barcoding said target sequence with at least one of said plurality of single-stranded barcoded oligonucleotide sequences to generate a barcoded target sequence; and f. analysing said barcoded target sequence.
In a further aspect, the present invention provides a composition comprising: a. a forward primer comprising a nickase site, a first primer sequence complementary to a first end of an oligonucleotide comprising a unique barcode sequence, a first priming sequence and, optionally, a unique molecular identifier (UMI) sequence and a first adapter sequence; and b. a reverse primer comprising a nickase site, a second primer complementary to a second end of the oligonucleotide comprising a unique barcode sequence, a second priming sequence and, optionally, a unique molecular identifier (UMI) sequence.
BRIEF DESCRIPTION OF THE FIGURES
Figure 1(A) shows a schematic of a first cycle of an isothermal exponential amplification method starting from an oligonucleotide comprising a unique barcode sequence and using two primers to amplify the sequence and generate a high concentration of barcoded primers. These barcoded primers can be subsequently used for a reverse transcription reaction. Legend: (1) single copy or multiple copies of barcode nucleotide (DNA); (2) primer #1; (3) polymerization and nicking step; (4) polymerization (strand displacement) and annealing with primer #2; (5) annealing with primer #2 and strand displacement; (6) N1 nicking; (7) primer #2; (8) polymerization and nicking step; (9) N2 nicking; (10) polymerization (strand displacement); (11) barcoded primers; (P') PCR primer (complementary sequence of P); (A1') adapter (complementary sequence of A1); (B') barcode (complementary sequence of B); (A2') adapter (complementary sequence of A2); (C') capture/targeting sequence (complementary sequence of C); (N1) nicking sequence (complementary to N1'); (N2) nicking sequence (complementary to N2'). Figure 1(B) shows a schematic of a first and further cycles of the isothermal exponential amplification method.
Figure 2 shows a schematic workflow of the methods according to the present invention. Legend: (12) primer release in droplet; (13) single or multiple barcode amplification using present invention (claim 1); (14) cell lysis and (D) mRNA release in the droplet; (15) reverse transcription (targeted sequence or with oligodT) and (D') complementary mRNA sequence; (16) barcoded cDNA droplet release; (17) barcoded cDNA; (18) single primer; (P') PCR primer (complementary sequence of P); (A1') adapter (complementary sequence of A1); (B') barcode (complementary sequence of B); (A2') adapter (complementary sequence of A2); (C') capture/targeting sequence (complementary sequence of C); (N1) nicking sequence (complementary to N1'); (N2) nicking sequence (complementary to N2').
Figure 3 shows a schematic workflow of the methods according to the present invention. Legend: (12) primer release in droplet; (13) single or multiple barcode amplification using present invention (claim 1); (14) cell lysis and (D) mRNA release in the droplet; (15) reverse transcription (targeted sequence or with oligodT) and (D') complementary mRNA sequence; (16) barcoded cDNA droplet release; (17) barcoded cDNA; (18) single primer; (19) activator sequence; (20) multiple copies of primer; (P') PCR primer (complementary sequence of P); (A1') adapter (complementary sequence of A1); (B') barcode (complementary sequence of B); (A2') adapter (complementary sequence of A2); (C') capture/targeting sequence (complementary sequence of C); (N1) nicking sequence (complementary to N1'); (N2) nicking sequence (complementary to N2').
Figure 4 shows a schematic of forward and reverse primers and unique barcode molecule design. The forward and reverse pre-primer sequences shown in the figure correspond to the priming sequences. In the figure, the reverse primer introduces a unique molecular identifier (UMI) and a gene specific primer (GSP). Legend: "1": nicking site; end of the primer; "p": phosphorylation.
Figure 5 shows a table providing the barcode cluster analysis.
Figure 6 shows a graph indicating the fraction of human and mouse reads in each barcode cluster.
Figure 7 shows the size of the isothermal strand displacement amplification (iSDA) products when the template used is the RCA product containing 100 to 1000 copies of the linear template vs the linear template (SEQ ID NO.1). On the left part of the Figure, iSDA was performed with a range of linear template from 10 pM to 0.01 pM (initial concentration) vs with a range of RCA product (DNA nanoballs: DNBs) from 10 pM to 0.01 pM (initial concentration). On the right part of the Figure, iSDA was performed with the linear template at 1 pM (no RCA) vs with the RCA product at 1 pM (RCA). A condition without enzymes was performed (NC = negative control) and L stands for ladder.
Figure 8 shows the size of double-stranded DNA (dsDNA) obtained after iSDA-RT on a range of RNA extract in bulk, and second strand synthesis. Fresh DNBs were used for all conditions (except PC and C7) and were amplified by iSDA. The capture sequence used is specific for β-actin and a PCR was performed for the second strand synthesis. Conditions: PC: positive control, purchased iSDA final product used as a template, 500 ng RNA; C1: 0 ng RNA; C2: 100 ng RNA; C3: 250 ng RNA; C4: 500 ng RNA; C5: 750 ng RNA; C6: + 0.003% Igepal, 500 ng RNA; C7: old DNB, 500 ng RNA; NC: negative control, no RNA, no enzyme mix.
Figure 9(A,B) shows the size of double-stranded DNA (dsDNA) obtained after iSDA-RT on cells in bulk, and second strand synthesis, as well as the alignment of the sequences obtained by Sanger sequencing with the theoretical sequence. The capture sequence used is specific for β-actin and a PCR was performed for the second strand synthesis. The primers used for Sanger sequencing (Fwd: GAGCAAGAGAGGCATCCTCAC (SEQ ID NO.262) and Rev: TGACGTGTGCTCTTCCGATC (SEQ ID NO.263)) are the same as those used for PCR: the two sequences obtained were combined to achieve the whole sequence. Conditions: STD: purchased standard for qPCR. Its sequence contains the barcode part and the transcript part and it is used as a positive control for Sanger sequencing. It should align perfectly with the theoretical sequence; PC: positive control, purchased iSDA final product used as a template, 750 ng RNA; C1: 750 ng RNA; C2: 25,000 cells + 0.003% Igepal; C3: 25,000 cells - Igepal; C4: 25,000 cells - tRNA; NC: negative control, no enzyme, no cells; NTC: no template control (qPCR).
Figure 10(A,B,C) shows the size of double-stranded DNA (dsDNA) obtained after iSDA-RT on cells in 350 pL drops, and second strand synthesis, as well as the alignment of the sequence obtained by Sanger sequencing with the theoretical sequence. The capture sequence used is specific for β-actin and a PCR was performed for the second strand synthesis. The primers used for Sanger sequencing (Fwd and Rev) are the same as those used for PCR: the two sequences obtained were combined to achieve the whole sequence. Conditions: STD: purchased standard for qPCR. Its sequence contains the barcode part and the transcript part and it is used as a positive control for Sanger sequencing. It should align perfectly with the theoretical sequence; PC for N=1&2: positive control, purchased iSDA final product used as a template + 0.003% Igepal; PC for N=3: positive control, RT performed on iSDA final product following the RT enzyme protocol (no iSDA). PC1 and PC2 were performed with different incubations; C1: 1 pM DNB + 0.003% Igepal; C2: 1 nM DNB + 0.003% Igepal; C3: 1 pM DNB - Igepal; C4: 1 pM DNB + 0.3% Igepal; C5: 1 pM DNB - tRNA; Bulk: 1 pM DNB + 0.003% Igepal; NTC: no template control (iSDA); NC: negative control, no cells.
DETAILED DESCRIPTION OF THE INVENTION
Single cell sequencing has recently emerged as a powerful tool for mapping cellular heterogeneity in diseased and healthy tissues, yet high-throughput methods are needed for capturing the unbiased diversity of cells. In this context, the use of barcodes for identifying single cells is limited by the cost and technical challenges associated with generating unique sets of barcoded oligonucleotides for each cell.
High-throughput single cell sequencing methods rely on microfluidics for cell barcoding. A key step in these methods is loading droplets with high concentrations of barcoded oligonucleotides to tag nucleic acids of interest. Barcode loading is often achieved using beads on which barcode sequences are synthesized. However, current barcode designs utilize fixed targeting primers that are not easily adapted for different purposes; therefore, if new targets are identified, a new batch of beads must be synthesized, which is expensive and laborious. For instance, adapting beads used for single-cell RNA sequencing (scRNAseq) to target genomic DNA would result in barcodes containing poly-T stretches that prevent common sequencers from reading into downstream sequences. Also, additional targets cannot be easily added to existing whole transcriptome or multiplexed amplicon beads.
Methods for screening cells having a phenotype of interest and recovery of specific cell genotype information are highly desirable since the recovery of single cell specific genotype together with single cell specific phenotype is very challenging.
The innovative concept behind the methods and compositions according to the present invention lies in proposing an in situ amplification of a single barcode molecule based on polymerase/nicking cycles that allow the generation of a large concentration of barcoded molecules in the same reactor wherein the single cell is subjected to analysis. The large concentration of barcoded molecules may be subsequently used for barcoding nucleic acids that originate from the single cell.
The methods disclosed herein present various advantages. For instance, the generation of a plurality of barcoded primers and subsequent barcoding of the single cell's nucleic acids occur in the same reactor. Therefore, the methods do not require extensive procedures of manipulation of the single cells that are laborious, time-consuming and error-prone, in particular, when the screening of the single cells is carried out on a large scale. Also, the methods of barcoding a target sequence and analysing a single cell disclosed herein do not require the use of beads bearing barcoded sequences/primers.
The methods and compositions according to the present invention also allow analysis of a single cell displaying a phenotype of interest as only these single cells, after being isolated in a reactor or encapsulated in a droplet, would trigger the step of generating a plurality of barcoded primers for a target sequence originating from the single cell. Therefore, the methods and compositions disclosed herein allow identification of subpopulations of single cells that are closely associated with a specific phenotype.
According to one aspect, the present invention relates to a method of barcoding a target sequence, the method comprising the steps of: a. providing in a reactor: i. an oligonucleotide comprising a unique barcode sequence flanked by a priming sequence, ii. an amplification mixture comprising a polymerase with strand displacement capability and a nickase, iii. an oligonucleotide primer pair comprising a forward primer and a reverse primer, iv. a single cell comprising the target sequence; b. contacting said oligonucleotide comprising a unique barcode sequence with one oligonucleotide primer of the oligonucleotide primer pair; c. extending said one oligonucleotide primer positioned on said oligonucleotide comprising a unique barcode sequence, thereby generating a double-stranded oligonucleotide comprising the unique barcode sequence; d. amplifying, in presence of the nickase, said double-stranded oligonucleotide comprising a unique barcode sequence by using the oligonucleotide primer pair, thereby generating a plurality of single-stranded barcoded oligonucleotide sequences; e. barcoding said target sequence with at least one of said plurality of single-stranded barcoded oligonucleotide sequences to generate a barcoded target sequence.
According to another aspect, the present invention relates to a method of analysing a single cell, the method comprising the steps of: a. providing in a reactor: i. an oligonucleotide comprising a unique barcode sequence flanked by a priming sequence, ii. an amplification mixture comprising a polymerase with strand displacement capability and a nickase, iii. an oligonucleotide primer pair comprising a forward primer and a reverse primer, iv. a single cell comprising a target sequence; b. contacting said oligonucleotide comprising a unique barcode sequence with one oligonucleotide primer of the oligonucleotide primer pair; c. extending said one oligonucleotide primer positioned on said oligonucleotide comprising a unique barcode sequence, thereby generating a double-stranded oligonucleotide comprising the unique barcode sequence; d. amplifying, in presence of the nickase, said double-stranded oligonucleotide comprising the unique barcode sequence by using the oligonucleotide primer pair, thereby generating a plurality of single-stranded barcoded oligonucleotide sequences; e. barcoding said target sequence with at least one of said plurality of single-stranded barcoded oligonucleotide sequences to generate a barcoded target sequence; and f. analysing said barcoded target sequence.
As used herein, the terms "nucleic acid" and "oligonucleotide" may be used interchangeably and refer to naturally-occurring or synthetic polymeric forms of nucleotides. Therefore, the nucleic acids and oligonucleotides of the present invention may be formed of naturally-occurring nucleotides, e.g., deoxyribonucleic acid (DNA) or ribonucleic acid (RNA), natural or synthetic modifications of nucleotides or artificial bases. The nucleic acids and oligonucleotides may exist as single- or double-stranded DNA or RNA, or as an RNA/DNA heteroduplex. The terms "nucleic acid" and "oligonucleotide" may refer to a short polynucleotide, generally less than or equal to 200 nucleotides in length, preferably between 5 and 150 nucleotides in length, more preferably between 10 and 100 nucleotides in length, even more preferably between 20 and 50 nucleotides in length. A "nucleic acid" or "oligonucleotide" may hybridize to other polynucleotides, therefore serving as a probe for polynucleotide detection, or a primer for polynucleotide chain extension.
As used herein, the indefinite article "a" or "an" may also refer to "one or more" or "at least one". For example, the term "an oligonucleotide" includes "one or more oligonucleotides". In the context of the present invention, the term "an oligonucleotide" encompasses one or more oligonucleotides, wherein each oligonucleotide may comprise a unique barcode or a plurality of the same unique barcode sequence. Likewise, the method of analysing a single cell disclosed herein allows the analysis of one or more single cells.
As used herein, the term "target sequence" refers to a nucleic acid or fragment thereof originating from a single cell that is subjected to barcoding and analysis. Target sequences may be DNA or RNA molecules.
In some embodiments, the target sequence is of mammalian, viral, bacterial, plant or fungal origin. In one embodiment, the target sequence is of mammalian origin, preferably, is of human origin.
In one embodiment, the target sequence is a nucleic acid selected from a transcriptome, a genome, an exome, transfected or transduced nucleic acid, a mitochondrial DNA, a chloroplast DNA or a modified nucleic acid.
As used herein, the term "modified nucleic acid" refers to nucleic acid modified by biological reactions within the cell or by synthetic approaches.
In the context of the present invention, a discrete set of genes or transcripts originating from the single cell may be targeted for analysis. Therefore, the method and composition according to the present invention can also barcode at the same time more than one target sequence originating from the single cell.
According to one embodiment, the target sequence is a transcript sequence.
In some embodiments, the method further comprises reverse transcribing the transcriptome to produce a cDNA library tagged with the amplified unique barcode molecules. To reverse transcribe the transcriptome, a reverse transcriptase may be provided in the reactor.
In one embodiment, the target sequence is selected from a subpopulation of nucleic acid in the single cell, a modified nucleic acid in the single cell or exposed on the surface of the single cell.
In the context of the present invention, the single cell may bear on its surface a target sequence which triggers a barcoding reaction while the single cell is not lysed or is still viable.
In one embodiment, the plurality of single-stranded barcoded oligonucleotide sequences target a biomolecule exposed on the surface of the single cell.
In one embodiment, the methods are used for single cell transcriptome sequencing, single cell genomic sequencing or single cell methylome sequencing.
As used herein, the term "barcode sequence" refers to a unique nucleic acid sequence that may be distinguished by its sequence from another nucleic acid sequence, thus allowing to uniquely label a nucleic acid sequence so that it may be distinguished from another nucleic acid carrying another barcode sequence. The barcode sequence uniquely identifies the nucleic acids released from a single cell from nucleic acids released from other single cells, for instance, even after the nucleic acids are pooled together. The barcode sequence may be used to distinguish tens, hundreds or even thousands of nucleic acids arising from different single cells.
The barcode within the unique barcode sequence is a short DNA sequence wherein some positions of the DNA sequence may contain degenerate bases. By way of an example, a base labelled N will correspond to an approximately balanced mixture of A, T, C and G. Other mixtures are possible and are indicated by the letters according to a standard nucleic acid notation. The barcode then consists of one or multiple stretches of randomized sequence. For example, a random barcode may be a stretch of 15 N, a "split barcode" may consist of 2 stretches of 12 N separated by a constant spacer sequence and a "structured barcode" may be (WS)₁₀ or NNNNNWNNNNNWNNNNN. An advantage of a structured barcode is that it may be easier to identify in a sequencing read, because it must follow a specific pattern, whereas a random barcode may be any sequence. Another advantage of structured barcodes is that they make it possible to avoid some sequences or sequence patterns. For example, the barcode (WS)₁₂ cannot contain a homopolymer because W is A or T and S is C or G, and this may lead to fewer sequencing errors if the sequencing technique that is used is prone to errors on homopolymers. Another advantage of a structured barcode is that it may be designed to not contain one or more specific subsequences, such as the recognition sequences of the nickases, primer binding sites, sequences similar to the gene specific primer, homopolymers and combinations thereof. Additionally, a structured barcode may be designed to keep a balanced content of GC vs AT (e.g., 50/50) in the barcode. In this way, it is less likely to observe failures or side products during the amplification of the unique barcode sequence in the reactor. The unique molecular identifier (UMI) sequence may also contain such a pattern.
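By way of illustration only, the pattern-based checks described above may be sketched in Python as follows. The function names, the random seed and the forbidden motif are hypothetical and are not part of the disclosed methods. The sketch assumes the standard IUPAC degenerate-base codes and uniform sampling at each degenerate position.

```python
import random
import re

# Standard IUPAC degenerate nucleotide codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "W": "AT", "S": "CG", "R": "AG", "Y": "CT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}

def pattern_to_regex(pattern):
    """Compile a degenerate pattern such as 'NNNWNNNW' into a regular expression."""
    return re.compile("".join("[" + IUPAC[c] + "]" for c in pattern))

def random_barcode(pattern, rng):
    """Draw one concrete sequence, choosing uniformly at each position (an assumption)."""
    return "".join(rng.choice(IUPAC[c]) for c in pattern)

def has_homopolymer(seq, min_run=2):
    """Return True if any base occurs min_run or more times in a row."""
    return re.search(r"(.)\1{%d,}" % (min_run - 1), seq) is not None

def contains_forbidden(seq, forbidden):
    """Return True if seq contains any of the forbidden subsequences."""
    return any(site in seq for site in forbidden)

# A (WS) repeat of 12 units: W and S share no base, so adjacent bases always differ.
structured = "WS" * 12
rng = random.Random(0)  # fixed seed, for reproducibility of the sketch only
barcode = random_barcode(structured, rng)

# Hypothetical placeholder motif standing in for a sequence to be avoided.
forbidden_sites = ["GAGTC"]
```

A read can then be screened by calling `pattern_to_regex(structured).fullmatch(read)`, which returns a match only when every position is compatible with the pattern.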
The barcode sequence may be of any suitable length. The barcode sequence is preferably of a length sufficient to distinguish the barcode sequence from other barcode sequences and avoid barcode collisions.
In one embodiment, the unique barcode sequence contains at least 5, 10, 15 or more degenerate nucleotides. In one embodiment, the unique barcode contains from 5 to 100 degenerate nucleotides, preferably from 10 to 50 degenerate nucleotides, more preferably from 15 to 25 degenerate nucleotides. Depending on the degenerate barcode sequence, i.e., the number of degenerate bases and the specific degeneracy of each base, it is possible to compute the total number of possible barcode sequences and, depending on the number of objects to be barcoded, the probability that two of them receive the same sequence (i.e., a collision). For example, if the barcode sequence is NNNWNNNW, the number of possible sequences is $4^6 \times 2^2 = 16384$ and, if this barcode pool is used to tag 100 cells, the probability of collision (at least 2 cells receiving the same barcode sequence) is $1 - \frac{16384!}{(16384-100)!\,16384^{100}} \approx 0.26$. Accordingly, it is possible to adapt the number and type of degenerate bases in the barcode so that the probability of collision is lower than a given value.
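The collision calculation above can be reproduced with a short Python sketch. The function names are hypothetical. The sketch assumes that each cell receives a barcode drawn uniformly and independently from the pool, and it evaluates the probability as a running product rather than through factorials, which avoids very large intermediate numbers.

```python
from math import prod

# Number of bases allowed by each IUPAC code (standard notation).
DEGENERACY = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "W": 2, "S": 2, "R": 2, "Y": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
}

def barcode_space(pattern):
    """Total number of distinct sequences compatible with a degenerate pattern."""
    return prod(DEGENERACY[c] for c in pattern)

def collision_probability(n_sequences, n_cells):
    """Probability that at least two of n_cells receive the same barcode."""
    if n_cells > n_sequences:
        return 1.0
    p_distinct = 1.0
    for i in range(n_cells):
        p_distinct *= (n_sequences - i) / n_sequences
    return 1.0 - p_distinct

def max_cells(n_sequences, threshold):
    """Largest number of cells whose collision probability stays below threshold."""
    p_distinct = 1.0
    n = 0
    while n < n_sequences:
        # Probability that all cells remain distinct after adding one more cell.
        next_distinct = p_distinct * (n_sequences - n) / n_sequences
        if 1.0 - next_distinct >= threshold:
            break
        p_distinct = next_distinct
        n += 1
    return n

space = barcode_space("NNNWNNNW")                 # 16384
p_collision = collision_probability(space, 100)   # approximately 0.26
```

The helper `max_cells` illustrates the last sentence of the paragraph above: for a given pattern, it returns how many cells can be tagged before the collision probability reaches a chosen value.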
The barcode sequence may consist of one unique barcode sequence or a plurality of the same unique barcode sequence or may consist of more than one barcode sequence. The different barcode sequences may be taken from a pool of barcode sequences, which themselves have been generated by split-and-pool synthesis. If the barcode sequence consists of more than one barcode sequence, the barcode sequences may be taken from the same or different pools of barcode sequences. The pool of sequences may be selected using any suitable technique, e.g., randomly or such that the sequences allow for error detection and/or correction, for instance, by being separated by a certain distance (e.g., Hamming distance) such that errors in reading of the barcode sequence may be detected, and in some cases, corrected. The pool may have any number of barcode sequences. Methods for joining different barcode sequences taken from one pool or more than one pool are known to the person skilled in the art.
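As a minimal sketch of distance-based error detection and correction, the following Python example uses a small hypothetical pool of four barcodes. The pool and the function names are illustrative only. It relies on the general property that a pool with minimum pairwise Hamming distance d allows up to d - 1 substitution errors to be detected and up to (d - 1)/2, rounded down, to be corrected.

```python
from itertools import combinations

def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))

def min_pairwise_distance(pool):
    """Smallest Hamming distance between any two barcodes of the pool."""
    return min(hamming(a, b) for a, b in combinations(pool, 2))

def correct(read, pool, max_errors):
    """Return the single pool barcode within max_errors of read, otherwise None."""
    hits = [bc for bc in pool if hamming(read, bc) <= max_errors]
    return hits[0] if len(hits) == 1 else None

# Hypothetical pool with a minimum pairwise distance of 3.
pool = ["ACACAC", "ACAGTG", "TGTCAC", "TGTGTG"]
d_min = min_pairwise_distance(pool)

corrected = correct("ACACAT", pool, max_errors=1)   # one substitution from "ACACAC"
rejected = correct("GGGGGG", pool, max_errors=1)    # too far from every barcode
```

With this pool, a read carrying a single substitution is assigned to its barcode of origin, whereas a read that is not within one error of exactly one barcode is discarded.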
The unique barcode molecule may comprise a single barcode or multiple concatenated copies of the barcode. Single barcode molecules containing multiple concatenated repeats may be obtained via reactions such as rolling circle amplification (RCA), loop-mediated isothermal amplification (LAMP) or terminal hairpin and self-priming extension (THSP). A non-hyperbranched reaction is preferably used to obtain the concatenated repeats to ensure that a given barcode is present in only one concatemeric molecule.
In one embodiment, the oligonucleotide comprising a unique barcode sequence is obtained by a rolling circle amplification.
In one embodiment, the oligonucleotide comprising a unique barcode sequence comprises a single index or a plurality of indexes linked to each other. As used herein, the term "index" refers to a nucleotide sequence characterising the barcode.
In some embodiments, the oligonucleotide comprising a unique barcode sequence is a double-stranded nucleic acid, a single-stranded nucleic acid, a partially double-stranded nucleic acid or a partially single-stranded nucleic acid. The oligonucleotide comprising a unique barcode sequence may be floating in the reactor or bound to a surface, e.g., a particle or a surface of the reactor, or the surface of a cell.
The unique barcode sequence is comprised within the single-stranded oligonucleotide obtained in step (d) of the methods according to the present invention. This single-stranded oligonucleotide comprising the unique barcode sequence may be referred to as a barcoded primer.
As used herein, the term "barcoded primer" refers to at least one molecule of about 20 to about 200 nucleobases in length that may function to prime nucleic acid synthesis. In particular, the barcoded primer may be of about 30 to about 150 nucleobases in length, of about 40 to about 100 nucleobases in length, of about 50 to about 90 nucleobases in length, of about 60 to about 80 or 70 nucleobases in length. In the context of the present invention, a barcoded primer is an oligonucleotide comprising a barcode sequence or a set of barcode sequences and a primer sequence.
In some embodiments, the barcoded primer further comprises a unique molecular identifier (UMI). The UMI is located at the 3'-end or the 5'-end of the barcode sequence. The UMI sequences are well known in the art and are described, for instance, in Kivioja et al. (Nature Methods 2012, 9: 72-74). In one embodiment the UMI has a length ranging from 3 to 30 nucleotides, preferably from 5 to 20 nucleotides, more preferably from 8 to 13 nucleotides.
The barcoded primer may comprise one of the following structures: (a) a polymerase chain reaction (PCR) handle, a barcode sequence, an adaptor, a poly(dT) tail, a gene-specific primer (GSP) or a set of gene-specific primers (GSPs); (b) a T7 promoter region, a PCR handle, a barcode sequence, an adaptor and a poly(dT) tail; or (c) a PCR handle, a barcode sequence, an adaptor and a template switching oligonucleotide (TSO) for 5'-end amplification. Optionally, the barcoded primer may comprise a unique molecular identifier (UMI).
In one embodiment, the barcoded primer comprises from 5' to 3': a PCR handle, optionally a UMI, a barcode sequence or a set of barcode sequences, an adapter and a capture/targeting sequence.
In another embodiment, the barcoded primer is an oligonucleotide comprising a barcode sequence or a set of barcode sequences, a first adapter sequence, a second adapter sequence, a primer sequence and a capture sequence. Also, the barcoded primer is an oligonucleotide comprising a complementary sequence of the barcode sequence or the set of barcode sequences, the first adapter sequence, the second adapter sequence, the primer sequence and the capture sequence. As used herein, an "oligonucleotide primer" refers to a short single-stranded nucleic acid of between 10 and 50 nucleotides in length, designed to perfectly or almost perfectly match a nucleic acid of interest, to be captured and then amplified (e.g., by PCR) or reverse transcribed (e.g., by reverse transcription (RT)). The primer sequences are specific to the nucleic acids they hybridize to, i.e., the primer sequences preferably hybridize under stringent hybridization conditions, more preferably under highly stringent hybridization conditions, and are complementary to or almost complementary to the nucleic acids they hybridize to. The primer sequence serves as a starting point for nucleic acid synthesis, allowing polymerase enzymes such as nucleic acid polymerase to extend the primer sequence and replicate the complementary strand. A primer sequence may be complementary to and hybridize to a target nucleic acid. The primer sequence may be a synthetic primer sequence.
The forward primer may contain from 5' to 3': (a) a 5' tail, (b) a restriction enzyme or nickase recognition site, (c) an adaptor sequence, and/or (d) a 3' end complementary or reverse complementary to the unique barcode molecule. The adaptor sequence can, for example, be used for later recovery and amplification of, e.g., the cDNA produced by the methods of the invention.
The reverse primer may contain from 5' to 3': (a) a 5' tail, (b) a restriction enzyme or nickase recognition site; (c) a gene specific primer or poly(A) section, (d) a spacer, and (e) a 3' end complementary or reverse complementary to the unique barcode molecule. The spacer can, for example, optionally be a unique molecular identifier (UMI) sequence.
The forward primer and/or the reverse primer comprises the reverse complement of the target sequence, i.e., the GSP, oligodT or template switch oligo (TSO), when a T7 promoter is to be used. It may be beneficial that one or both of these primers have a 3' phosphate modification.
In one embodiment, the forward primer comprises a nickase site, a first primer sequence complementary to a first end of the oligonucleotide comprising a unique barcode sequence and, optionally, a unique molecular identifier (UMI) sequence.
In one embodiment, the forward primer comprises a nickase site, a first primer sequence complementary to a first end of the oligonucleotide comprising a unique barcode sequence, a first priming sequence, and optionally, a unique molecular identifier (UMI) sequence. In one embodiment, the reverse primer comprises a nickase site and second primer complementary to a second end of the oligonucleotide comprising a unique barcode sequence, a priming sequence, optionally, a unique molecular identifier (UMI) sequence.
In one embodiment, the reverse primer comprises a nickase site, a second primer complementary to a second end of the oligonucleotide comprising a unique barcode sequence, a second priming sequence, and optionally, a unique molecular identifier (UMI) sequence.
As used herein, the term "forward pre-primer" corresponds to the "first priming sequence" and "reverse pre-primer" corresponds to the "second priming sequence".
As used herein, the term "amplification mixture" refers to a mixture of reagents that are used in a nucleic acid amplification reaction. An amplification mixture comprises a buffer, deoxynucleotide triphosphates (dNTPs) and a polymerase, but does not comprise primers or a sample to be amplified.
The enzymes to be used in the amplification mixture are selected from a list of polymerase, nicking enzyme (nickase), restriction enzymes and exonuclease.
In one embodiment, the amplification mixture comprises a polymerase and a nickase.
The amplification reaction in step (d) of the methods according to the present invention uses at least one nicking enzyme. The use of a nickase is subject to reaction conditions and sequence constraints. Temperature and buffer-wise, the nickase must possess sufficient activity at the incubation temperatures and work in concert with DNA polymerase and other elements in the reaction mixture. The recognition sequence for the nickase can, for example, be the same or be different for the forward primer and the reverse primer. The same nicking enzyme can be used for both the forward and reverse primers, or multiple nicking enzymes can be combined for either the forward or the reverse primer or both primers.
By way of an example, for the forward primer there are no particular restrictions on the choice of nickase, as there are no restrictions on the nickase sequence. Nb.BbvCI and Nt.BsmAI are well-tested examples that work effectively, but many enzymes exist and could be used, such as, for example, Nb.Bpu10I and Nb.Mva1269I (ThermoFisher Scientific; Waltham, MA) or Nt.BspQI, Nt.BstNBI, Nb.BsrDI, Nb.BtsI, Nt.AlwI, Nt.BbvCI, Nb.BsmI, and Nb.BssSI (New England Biolabs; Ipswich, MA). For the primer that contains the reverse complement of the targeting sequence, there are more constraints on the nickase recognition sequence. As the cut has to occur immediately prior to the gene specific primer (GSP) or the polyA sequence on the reverse primer, the nickase must belong to the class of type IIS restriction enzymes or "shifted cleavage" enzymes. In some specific cases, nicking enzymes cutting inside the recognition site can be used, for example, if the base that is left on the amplified unique barcode sequence is a T and the targeting domain is polyT, or if the base that is left is compatible with the 3' end of the gene specific primer (GSP). However, for the general case, type IIS nicking enzymes are preferred. Such enzymes can include, but are not limited to, Nt.BsmAI, Nt.AlwI, Nt.BstNBI, and Nt.BspQI.
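The positional constraint on a "shifted cleavage" nickase can be checked mechanically when designing a primer. The following is a minimal sketch, not the applicants' procedure. It assumes Nt.BstNBI with recognition sequence 5'-GAGTC-3' and a nick four bases downstream on the same strand (values taken from general supplier descriptions, not from this document), treats the primer as a single strand, and uses a hypothetical tail, filler and poly(T) targeting domain.

```python
# Illustrative only. SITE and NICK_OFFSET are assumptions about Nt.BstNBI
# (recognition 5'-GAGTC-3', nick 4 nt downstream on the same strand).
SITE = "GAGTC"
NICK_OFFSET = 4  # bases between the end of the site and the nick


def nick_position(primer: str, site: str = SITE, offset: int = NICK_OFFSET) -> int:
    """Return the 0-based index of the first base 3' of the nick, or -1 if absent."""
    start = primer.find(site)
    if start < 0:
        return -1
    pos = start + len(site) + offset
    return pos if pos <= len(primer) else -1


def nick_is_adjacent_to_target(primer: str, targeting: str) -> bool:
    """True if the nick falls immediately 5' of the targeting sequence."""
    pos = nick_position(primer)
    return pos >= 0 and primer[pos:].startswith(targeting)


# Hypothetical primer: 5' tail + site + 4-nt filler + poly(T) targeting domain.
tail, filler, targeting = "ACGTAC", "ACTG", "T" * 20
primer = tail + SITE + filler + targeting

print(nick_position(primer))
print(nick_is_adjacent_to_target(primer, targeting))
```

With a filler one base shorter, the nick would fall inside the targeting domain and the check would fail, which is the situation the type IIS preference is meant to avoid.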
In certain embodiments, the nickase recognition sequence is replaced in the reverse primer by a restriction enzyme sequence, with the same constraint on the cutting position. In this case, each reverse primer can be used for polymerization only once, before the extension product receives a double stranded cut by the restriction enzyme.
In one embodiment, the polymerase for use according to the invention is selected from the group consisting of Bst 2.0 DNA polymerase, Bst large fragment DNA polymerase, Klenow fragment (3'→5' exo-), Phi29 DNA polymerase and Vent(exo-) DNA polymerase. More particularly, the polymerase is Vent(exo-) DNA polymerase. More than one polymerase may be used simultaneously.
In one embodiment, the nicking enzyme (nickase) for use according to the invention is selected from the group consisting of Nb.BbvCI, Nb.BtsI, Nb.BssSI, Nb.BsrDI, Nb.BsmI and Nt.BstNBI. More preferably, the nickase is Nb.BsmI and/or Nt.BstNBI. More than one nickase may be used simultaneously.
The nicking enzyme may be replaced by a restriction enzyme.
Unlike nicking enzymes, which cut only one strand of a DNA duplex, the restriction enzymes cut the two strands. Thus, when using restriction enzymes instead of nicking enzymes, it may be necessary to protect the templates used in the method of the invention. This protection may be performed, for instance, by performing chemical modification of the templates. Such modification comprises backbone modification, such as phosphorothioate linkage. More than one restriction enzyme or a combination of nicking enzyme and restriction enzyme may be used simultaneously.
In one embodiment, the amplification mixture further comprises an exonuclease. The exonuclease for use according to the invention is preferably selected from the group consisting of RecJf, Exonuclease I, Exonuclease VII and ttRecJ exonuclease. More preferably, the exonuclease is ttRecJ exonuclease, such as the one obtained following the protocol described by Yamagata et al. (PNAS 2002, 99(9): 5908-5912). More than one exonuclease can be used simultaneously. In a particular embodiment, the reactor includes additional reagents.
Additional reagents typically include a reverse transcriptase (RT), cell lysing additives, tagging additives, stabilizing additives, additives to adjust viscosity and density of aqueous phase, and/or deoxynucleotide triphosphates (dNTPs). Accordingly, in a particular embodiment, additional reagents are added to the reactor, said additional reagents comprising at least a reverse transcriptase (RT), cell lysing additives, tagging additives, stabilizing additives, additives to adjust viscosity and density of aqueous phase, and/or deoxynucleotide triphosphates (dNTPs).
The "reverse transcriptase (RT)" in context of the present invention is an enzyme used to generate complementary DNA (cDNA) from an RNA template, in a process termed reverse transcription.
In one embodiment, the reverse transcriptase is selected from the group consisting of Superscriptase I, Superscriptase II, Superscriptase III, Superscriptase IV, Murine Leukemia RT, SmartScribe RT, Maxima H RT, or MultiScribe RT.
In one embodiment, the reverse transcriptase is at a concentration of 1 to 50 U/µL, preferably 5 to 25 U/µL, for example at 12.5 U/µL.
Non-limiting examples of RNase inhibitors include RNase OUT, IN, SuperIN RNase, and those inhibitors targeting a wide range of RNases (e.g., A, B, C, 1 and T1).
In one example the lysis buffer is typically 0.36% Igepal CA 630, 50 mM Tris-HCl pH 8.
In the context of the present invention, the "tagging additives" enable the tagging of the target sequence with the amplified unique barcode molecules. Tagging additives can include, but are not limited to, reverse transcriptases, transposases, ligases, or any other enzyme used in single cell-omic assays.
In the context of the present invention, the "stabilizing additives" enable the stabilization of the reaction components in the methods disclosed herein. Stabilizing additives can include, but are not limited to, bovine serum albumin (BSA), surfactants, solutes, and cosolvents (e.g., betaine, DMSO, urea, trehalose, etc.). In the context of the present invention, the "additives to adjust viscosity and density of the aqueous phase" enable better microfluidic encapsulation in the methods disclosed herein. An additive to adjust viscosity and density of the aqueous phase can, for example, include carboxymethylcellulose.
In a particular embodiment, said additional reagents are added into the reactor, in particular into the microfluidic droplet, by injection from a reservoir, for example using electrical forces (picoinjection) after a first droplet incubation step (Abate et al. (2010) Proc. Nat. Acad. Sci. USA 107:19163-19166).
In another particular embodiment, said additional reagents are added into the reactor, in particular into the microfluidic droplet, by coalescence with a second reactor, in particular a second microfluidic droplet, comprising said additional reagents but not comprising any target sequence. Droplets can be coalesced by a variety of methods known to the skilled person, including passive droplet coalescence (see Mazutis et al. (2009) Lab on a Chip, 9(18):2665-2672; Mazutis et al. (2012) Lab Chip, 12:1800-1806), droplet coalescence driven by local heating from a focused laser (Baroud et al. (2007) Lab Chip 7:1029-1033) or using electric forces (Chabert et al. (2005) Electrophoresis 26:3706-3715; Ahn et al. (2006) Appl. Phys. Lett., 88:264105; Link et al. (2006) Angew. Chem., Int. Ed., 45:2556-2560; Priest et al. (2006) Appl. Phys. Lett. 89:134101) or using magnetophoretic forces or using pneumatic controllers (see Xi et al. (2017) Lab Chip 17:751-771).
Said second reactor, in particular said second microfluidic droplet, can be prepared by the same techniques as those disclosed above for the reactors comprising the target sequence.
By "coalescence" is meant herein the process by which two or more droplets or particles merge during contact to form a single daughter droplet or particle.
The step of reverse transcription defined above refers to reverse transcribing the released nucleic acids hybridized to said barcoded primers using the primer sequence in at least some of the reactors. Reverse transcription is performed using the reverse transcriptase (RT) comprised in at least some of the reactors.
"Reverse Transcription" or "RT reaction" is a process in which single-stranded RNA is reverse transcribed into a single-stranded complementary DNA (cDNA) by using total cellular RNA or poly(A) RNA, a reverse transcriptase enzyme, a primer, dNTPs and an RNase inhibitor. It will be understood by those skilled in the art, that the product of the reverse transcription is a RNA/DNA duplex comprising a single strand cDNA hybridized to its template RNA. As it will be further understood, said RNA/DNA duplex is further linked to the barcoded primer comprising the primer sequence used for the reverse transcription.
"Template switching" refers to a technology described originally in 2001, frequently referred to as "SMART" (switching mechanism at the 5' end of the RNA transcript) technology (Takara Bio USA, Inc). This technology has shown promise in generating full-length cDNA libraries, even from single-cell-derived RNA samples (Zhu et al. (2001) Biotechniques 30:892-897). This strategy relies on the intrinsic properties of Moloney murine leukemia virus (MMLV) reverse transcriptase and the use of a unique template switching oligonucleotide (TS oligo, or TSO). During first-strand synthesis, upon reaching the 5' end of the RNA template, the terminal transferase activity of the MMLV reverse transcriptase adds a few additional nucleotides (mostly deoxycytidine) to the 3' end of the newly synthesized cDNA strand. These bases function as a TS oligo-anchoring site. Upon base pairing between the TS oligo and the appended deoxycytidine stretch, the reverse transcriptase "switches" template strands, from cellular RNA to the TS oligo, and continues replication to the 5' end of the TS oligo. By doing so, the resulting cDNA contains the complete 5' end of the transcript, and universal sequences of choice are added to the reverse transcription product. Along with tagging of the cDNA 3' end by oligo dT primers, this approach makes it possible to efficiently amplify the entire full-length transcript pool in a completely sequence-independent manner (Shapiro et al. (2013) Nat. Rev. Genet. 14:618-630).
Accordingly, it will be understood by those skilled in the art, that after reverse transcribing the nucleic acids, the reactor further comprises cDNAs.
Accordingly, in one embodiment, at least some of the reactors further comprise cDNAs produced by reverse transcription of nucleic acids from the cells contained in said reactors.
In one embodiment, said cDNA refers to a single-stranded complementary DNA.
In a further embodiment, said cDNA is comprised in a RNA/DNA duplex.
In one embodiment, the RNA/DNA duplex refers to the RNA that has been reverse transcribed and is hybridized to the primer sequence of at least one of the primers, which is optionally barcoded, contained in the reactor.
As it will be understood by those skilled in the art, in one embodiment, the RNA/DNA duplex is linked to the primer, which is optionally barcoded, comprising the primer sequence to which the nucleic acid, preferably mRNA, was hybridized and which was used for reverse transcription. In some embodiments, the amplification mixture further comprises a reverse transcriptase, in particular when the method is used for barcoding and sequencing RNA molecules originating from a selected population of single cells.
In one embodiment, the target sequence is a transcript sequence or a plurality of transcript sequences.
In one embodiment, the priming sequence comprises oligo(dT), oligo(dT)VN or at least one targeted sequence.
As used herein, the term "targeted sequence" refers to a sequence recognizing a target sequence.
In one embodiment, the target sequence is a RNA sequence.
In one embodiment, the priming sequence is complementary to the RNA sequence.
In some embodiments, the amplification mixture further comprises a single-stranded DNA binding protein. The single strand binding protein may be selected from single-stranded DNA binding protein (SSB), extreme thermostable single-stranded DNA binding protein (ETSSB), gp32 or RecA.
In some embodiments, the amplification mixture further comprises tRNA. For cellular applications where the transcriptome is analysed, tRNA may be necessary for the amplification reaction to alleviate inhibition by the reverse transcriptase, as the reverse transcriptase may inhibit the primer amplification reaction.
In one embodiment, the forward primer is present at a concentration of at least 10^5 copies per reactor. In another embodiment, the forward primer is present at a concentration ranging from about 10^5 to about 10^10 copies per reactor, preferably from about 10^6 to about 10^9 copies per reactor.
In one embodiment, the reverse primer is present at a concentration of at least 10^5 copies per reactor. In another embodiment, the reverse primer is present at a concentration ranging from about 10^5 to about 10^10 copies per reactor, preferably from about 10^6 to about 10^9 copies per reactor.
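Primer amounts are given both as copies per reactor and as molar concentrations, and the two are related through the reactor volume and Avogadro's number. The sketch below is only an illustration of that conversion. The 100 pL reactor volume (read as picolitres) and the function names are assumptions made for the example, not values fixed by the description.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def copies_per_reactor(conc_molar: float, volume_litres: float) -> float:
    """Number of molecules in a reactor of the given volume at the given molarity."""
    return conc_molar * volume_litres * AVOGADRO


def molar_from_copies(copies: float, volume_litres: float) -> float:
    """Molar concentration corresponding to a copy number in the given volume."""
    return copies / (AVOGADRO * volume_litres)


# Assumed (hypothetical) reactor volume: 100 pL = 100e-12 L.
volume = 100e-12

# A primer at 100 nM in that volume corresponds to roughly 6e6 copies.
print(f"{copies_per_reactor(100e-9, volume):.2e} copies")

# Conversely, 1e5 copies in the same volume is in the low nanomolar range.
print(f"{molar_from_copies(1e5, volume) * 1e9:.2f} nM")
```

Under this assumed volume, the copy-number ranges and the nanomolar thresholds fall within a few orders of magnitude of each other. A different reactor volume shifts the correspondence proportionally.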
In another embodiment, the forward primer is present at a concentration of at least 100 nM. In another embodiment the reverse primer is present at a concentration of at least 25 nM. The step of "extending" carried out in step (c) of the methods according to the present invention refers to the extension of a primer by the addition of nucleotides using a polymerase. If a primer that is annealed to a nucleic acid is extended, the nucleic acid acts as a template for extension reaction.
The amplification reaction carried out in step (d) of the methods according to the present invention occurs under isothermal conditions (at a constant temperature). Exemplary isothermal amplifications are nicking enzyme amplification reaction (NEAR) and isothermal strand displacement amplification (iSDA). Preferably, the temperature during the amplification is suitable for keeping the single cell viable. By contrast, the barcoding reaction carried out in step (f) of the methods according to the present invention occurs at a different temperature.
In the context of the present invention, the amplification reaction continues until generating at least about 10^5 to about 10^9 copies of barcoded oligonucleotide sequences per reactor.
The amplification reaction carried out in step (d) of the method according to the present invention may be carried out using asymmetric concentrations of the first oligonucleotide primer and the second oligonucleotide primer. The use of asymmetric concentrations of the first oligonucleotide primer and the second oligonucleotide primer makes it possible to obtain single-stranded barcoded sequences at the end of the amplification reaction.
As used herein, the expression "asymmetric concentration" refers to unequal or unbalanced concentrations of the first oligonucleotide primer and the second oligonucleotide primer.
In one embodiment, the forward primer is present at a higher concentration than the reverse primer or vice versa.
In one embodiment, the forward oligonucleotide primer and the reverse oligonucleotide primer are provided in a concentration ratio between 1:1 and 1:100, preferably, between 1:2 and 1:10, more preferably, between 1:4 and 1:6, or vice versa.
In one embodiment, the reactor further comprises an initiator template.
As used herein, the term "initiator template" refers to a molecule or compound, other than a reactant, that is capable of initiating a chain reaction, such as an amplification reaction. The initiator template may be exposed on the surface of the single cell. The initiator template may be directly or indirectly associated with a phenotype of interest from the single cell. The initiator template may originate from the single cell or be carried on a compound binding to or exposed on the surface of the single cell. An exemplary compound binding the surface of the single cell may be represented by an antibody. An exemplary compound exposed on the surface of the single cell.
The initiator template may be directly or indirectly linked to the single cell with its 3' end or its 5' end.
In one embodiment, the initiator template is linked to a biomolecule of interest with its 3' end or 5' end.
In one embodiment, the initiator template is selected from a free nucleic acid, a nucleic acid linked to a peptide, a nucleic acid linked to an antibody, a nucleic acid linked to a cell surface protein, a nucleic acid linked to a chemical compound, a nucleic acid linked to a particle, a nucleic acid originating from a single cell or a nucleic acid in a liposome.
In one embodiment, the initiator template comprises a primer sequence.
As used herein, the term "single cell" refers to an individual cell.
In some embodiments a single cell is of mammalian, viral, bacterial, plant or fungal origin.
In one embodiment, a single cell is of mammalian origin, preferably, is of human origin.
In one embodiment, the single cell comprises an initiator template.
In the context of the present invention, the methods of barcoding and analysing a target sequence may be carried out in any reactor suitable for containing a single cell and components for performing amplification reaction, lysis of the single cell and barcoding of the target sequence.
In one embodiment, the reactor is selected from a chamber, a droplet, a well or a tube.
Regardless of its geometric configuration, the reactor has a volume ranging from about 10 pL to about 100 pL.
As used herein, the term "droplet" refers to an isolated portion of a first fluid that is completely surrounded by a second fluid. In the context of the present invention, the term "droplet" refers to a microfluidic droplet. Therefore, the method disclosed herein can be carried out in a microfluidic chip. The droplet has a substantially spherical shape and has a volume suitable for or greater than a mammalian cell.
In one embodiment, the methods as disclosed herein further comprise the step of lysing the single cell to release the target sequence from the single cell.
The step of "lysing" carried out in the methods according to the present invention refers to the disruption or poration of the cell membrane. Lysis may be accomplished by enzymatic, physical, mechanical, electrical, thermal or chemical means, or any combination thereof. Methods of cell lysis are only intended to physically lyse the cell to extract its content and do not affect the integrity of the reactor. Methods of cell lysis are known in the art.
In the context of the present invention, the "cell lysis additives" enable and aid cell lysis, preferably without disruption of the reactors, in particular, of the droplets. Cell lysis additives can, for example, include, but are not limited to, surfactants, enzymes and stabilizers.
Preferably, the cell lysis additives are compatible with RT activity.
In one embodiment, the cell lysis additives comprise enzymes selected from the group consisting of lysozyme, lysostaphin, zymolase, mutanolysin, glycanases, proteases, and mannose.
In one preferred embodiment, the cell lysis additives comprise magnesium chloride, a detergent, a buffered solution and an RNase inhibitor.
In one embodiment, the magnesium chloride is used at a concentration of between 1 mM to 20 mM.
In one embodiment, the detergent is selected from the group consisting of Triton X-100, NP-40, Nonidet P40, Tween-20 and IGEPAL CA 630.
In one embodiment, the detergent is at a concentration of 0.01% to 1%.
Non-limiting examples of the buffered solution include Tris-HCl, Hepes-KOH, Pipes-NaOH, maleic acid, phosphoric acid, citric acid, malic acid, formic acid, lactic acid, succinic acid, acetic acid, pivalic (trimethylacetic) acid, pyridine, piperazine, picolinic acid, L-histidine, MES, Bis-tris, bis-tris propane, ADA, ACES, MOPSO, PIPES, imidazole, MOPS, BES, TES, HEPES, DIPSO, TAPSO, TEA (triethanolamine), N-Ethylmorpholine, POPSO, EPPS, HEPPS, HEPPSO, Tris, tricine, Glycylglycine, bicine, TAPS, morpholine, N-Methyldiethanolamine, AMPD (2-amino-2-methyl-1,3-propanediol), Diethanolamine, AMPSO, boric acid, CHES, glycine, CAPSO, ethanolamine, AMP (2-amino-2-methyl-1-propanol), piperazine, CAPS, 1,3-Diaminopropane, CABS, or piperidine.
"Enzymatic methods" to destabilise cell walls are well-established in the art. The enzymes are generally commercially available and, in most cases, were originally isolated from biological sources. Enzymes commonly used include lysozyme, lysostaphin, zymolase, mutanolysin, glycanases, proteases, and mannose.
As known by those skilled in the art, "chemical cell lysis" is achieved using chemicals such as detergents, which disrupt the lipid barrier surrounding cells by disrupting lipid-lipid, lipid-protein, and protein-protein interactions. The ideal detergent for cell lysis depends on cell type and source. Nonionic and zwitterionic detergents are milder detergents. The Triton X series of nonionic detergents, the IGEPAL CA 630 nonionic detergent, and 3-[(3-Cholamidopropyl)dimethylammonio]-1-propanesulfonate (CHAPS), a zwitterionic detergent, are commonly used for these purposes. In contrast, ionic detergents are strong solubilizing agents and tend to denature proteins, thereby destroying protein activity and function. SDS, an ionic detergent that binds to and denatures proteins, is used extensively in the art to disrupt cells.
"Physical cell lysis" refers to the use of sonication, thermal shock (above 40°C, below 10°C), electroporation, freezing, shearing or laser-induced cavitation.
In one example the cells are lysed on ice.
In one preferred embodiment, the cell lysis does not disrupt or destroy the reactors, in particular, the droplets, in the context of the invention.
The step of "barcoding" carried out in step (e) of the methods according to the present invention refers to adding a unique genetic sequence, i.e., a barcode sequence, to a nucleic acid, which allows said barcoded nucleic acid to be distinguished from a nucleic acid having another added genetic sequence, i.e., another unique barcode sequence. Therefore, barcoding may enable one to pool samples of nucleic acids in order to reduce the cost of sequencing per sample, yet retain the ability to determine from which sample a sequence read is derived. Separate library preparations may be prepared for each sample, and each sample may have its own unique barcode. The separately prepared libraries with unique barcodes may then be pooled and sequenced. Each sequence read of the resulting dataset may be traced back to an original sample via the barcode in the sequence read. In one embodiment, the barcoding reaction carried out in step (e) comprises an exponential amplification phase followed by a linear amplification phase.
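Tracing pooled reads back to their origin via the barcode can be illustrated with a small grouping routine. This is a minimal sketch under stated assumptions, not the analysis pipeline of the invention. It assumes that each read starts with a fixed-length barcode followed by the insert, and the read layout, the barcode length and the toy reads are all hypothetical.

```python
from collections import defaultdict


def group_reads_by_barcode(reads, barcode_length):
    """Group read inserts by their leading barcode.

    Assumes (hypothetically) that the barcode occupies the first
    `barcode_length` bases of every read. Reads too short to carry
    both a barcode and an insert are skipped.
    """
    groups = defaultdict(list)
    for read in reads:
        if len(read) <= barcode_length:
            continue
        barcode, insert = read[:barcode_length], read[barcode_length:]
        groups[barcode].append(insert)
    return dict(groups)


# Toy pooled reads: two barcodes (AAAC, GGTT) and one read that is too short.
pooled = ["AAACTTGACC", "GGTTCATGAA", "AAACGGGTCA", "GGT"]
by_barcode = group_reads_by_barcode(pooled, barcode_length=4)

for barcode, inserts in sorted(by_barcode.items()):
    print(barcode, inserts)
```

A practical implementation would also need to tolerate sequencing errors in the barcode, for example by matching against a list of expected barcodes. That step is omitted here.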
The step of "analysing" carried out in step (f) of the methods according to the present invention refers to any method by which the identity of at least 10, at least 20, at least 50, at least 100 or at least 200 or more consecutive nucleotides of a polynucleotide may be obtained. In particular, identification of the barcoded nucleic acid may be carried out by sequencing or by PCR. Methods of analysing a barcoded nucleic acid are known in the art.
In particular, identification of the target sequence contained in each microreactor can be carried out by sequencing, in particular, by sequencing DNA, the barcoded cDNAs obtained as detailed above, or the tags.
In one embodiment, the barcoded cDNAs produced by the reverse transcription as defined above are recovered and further used for identification, typically, by subsequent amplification by PCR and sequencing library preparation.
Accordingly, in one embodiment, the method of the invention further comprises recovering cell cDNAs produced by reverse transcription in at least some of the reactors.
"Recovering" herein refers to isolating the barcoded cDNAs produced by reverse transcription in at least some of the reactors from said plurality of reactors.
In one embodiment, recovering herein refers to collecting the reactors comprising barcoded cDNA produced by reverse transcription or collecting the aqueous composition contained in said reactors comprising said barcoded cDNA, and separating the barcoded cDNA comprised in the aqueous composition.
In one particular embodiment, recovering herein refers to collecting the microfluidic droplets comprising barcoded cDNA produced by reverse transcription, breaking the microfluidic droplets and separating the barcoded cDNA comprised in the aqueous composition from the oil phase of said microfluidic droplets.
Methods to isolate nucleic acids, in particular cDNA, from microfluidic droplets are known to those skilled in the art and comprise, for example, collecting the microfluidic droplets and breaking the emulsion by, for example, applying an electrical field (electrocoalescence) or by adding a chemical emulsion breaking agent, such as perfluoro-octanol in the case of droplets in fluorinated carrier oils. In one example, the broken emulsion is typically centrifuged for, for example, 10 minutes at 10,000 g at 4°C and the supernatant comprising the barcoded cDNA in the aqueous phase is recovered.
In one embodiment, the barcoded cDNA comprises one or more modified nucleotides or nucleotide analogs, for example for facilitating purification of the barcoded cDNA sequences or molecules.
For example, the nucleotides may be employed as phosphorothioate derivatives (replacement of a non-bridging phosphoryl oxygen atom with a sulfur atom) which have increased resistance to nuclease digestion. 2'-methoxyethyl (MOE) modification (such as the modified backbone commercialized by ISIS Pharmaceuticals) is also effective.
Other examples of modified nucleotides include derivatives of nucleotides with substitutions at the 2' position of the sugar, in particular with the following chemical modifications: O-methyl group (2'-O-Me) substitution, 2-methoxyethyl group (2'-O-MOE) substitution, fluoro group (2'-fluoro) substitution, chloro group (2'-Cl) substitution, bromo group (2'-Br) substitution, cyanide group (2'-CN) substitution, trifluoromethyl group (2'-CF3) substitution, OCF3 group (2'-OCF3) substitution, OCN group (2'-OCN) substitution, O-alkyl group (2'-O-alkyl) substitution, S-alkyl group (2'-S-alkyl) substitution, N-alkyl group (2'-N-alkyl) substitution, O-alkenyl group (2'-O-alkenyl) substitution, S-alkenyl group (2'-S-alkenyl) substitution, N-alkenyl group (2'-N-alkenyl) substitution, SOCH3 group (2'-SOCH3) substitution, SO2CH3 group (2'-SO2CH3) substitution, ONO2 group (2'-ONO2) substitution, NO2 group (2'-NO2) substitution, N3 group (2'-N3) substitution and/or NH2 group (2'-NH2) substitution. Other examples of modified nucleotides include biotin labeled nucleotides.
Other examples of modified nucleotides include nucleotides wherein the ribose moiety is used to produce locked nucleic acid (LNA), in which a covalent bridge is formed between the 2' oxygen and the 4' carbon of the ribose, fixing it in the 3'-endo configuration.
Other examples of nucleotide analogs include deoxyinosine.
Other examples of nucleotide analogs include biotin labelled nucleotides. For example, Biotin-11-dCTP can be used as a substrate for the reverse transcriptase to incorporate biotins into the cDNA during polymerization, allowing affinity purification using streptavidin or avidin.
In one embodiment, the barcoded cDNA is further treated with RNAse A and/or RNAse H. "RNAse A" is an endoribonuclease that specifically degrades single-stranded RNA at C and U residues. In one embodiment, the RNAse A is at a concentration of 10 to 1000 µg/µL, preferably 50 to 200 µg/µL, for example at 100 µg/µL.
"RNAse H" is a family of non-specific endonucleases that catalyze the cleavage of RNA via a hydrolytic mechanism. RNase H ribonuclease activity cleaves the 3'-O-P bond of RNA in a DNA/RNA duplex substrate to produce 3'-hydroxyl and 5'-phosphate terminated products. In one embodiment, the RNAse H is at a concentration of 10 to 1000 µg/µL, preferably 50 to 200 µg/µL, for example at 100 µg/µL.
In one embodiment, the barcoded cDNA is further treated with Proteinase K. "Proteinase K" is a broad-spectrum serine protease and digests proteins, preferentially after hydrophobic amino acids. In one embodiment, the Proteinase K is at a concentration of 0.1 to 5 mg/mL, preferably 0.1 to 1 mg/mL, for example at 0.8 mg/mL.
In one embodiment, the recovered and treated barcoded cDNA are further amplified by PCR. The PCR primer may contain a tail with an index for multiplexing or a random sequence serving as UMI.
In one embodiment, the barcoded cDNAs obtained after reverse transcription are sequenced to allow identification of nucleic acid input contained in the reactor.
In one embodiment, the step of sequencing the barcoded cDNA may comprise performing a next generation sequencing (NGS) protocol on a sequencing library. Any type of NGS protocol can be used such as the MiSeq Systems (Illumina®), the HiSeq Systems (Illumina®), the NextSeq System (Illumina®), the NovaSeq Systems (Illumina®), the IonTorrent system (ThermoFisher), the IonProton system (ThermoFisher), or the sequencing systems produced by Pacific Biosciences or by Nanopore.
In certain embodiments, the NGS protocol comprises loading an amount of the sequencing library between 1 pM and 20 pM, in particular between 1.5 pM and 20 pM, per flow cell of a reagent kit.
In one embodiment, the NGS sequencing protocol further comprises the step of adding 5-60% PhiX to the amount of the sequencing library or to the flow cell of the reagent kit.
In one embodiment, prior to sequencing, the barcoded cDNAs are further amplified. In one embodiment, the amplification step is performed by a polymerase chain reaction (PCR), and/or a linear amplification.
In one embodiment, the linear amplification precedes the PCR reaction.
In one embodiment, the linear amplification is an in vitro transcription, followed by reverse transcription.
In one embodiment, the linear amplification is an isothermal amplification.
In one embodiment, said amplification step is performed after removing unincorporated barcoded primers. In one embodiment, said amplification step is performed prior to the sequencing step defined herein above.
In one embodiment, the barcoded cDNA produced after reverse transcription is quantified using qPCR.
In one embodiment, specific sequences necessary for sequencing are added during amplification or by ligation of adaptors, thereby generating a sequencing library.
According to another aspect, the present invention relates to a composition comprising: a. a forward primer comprising a nickase site, a first primer sequence complementary to a first end of an oligonucleotide comprising a unique barcode sequence, a first priming sequence and, optionally, a unique molecular identifier (UMI) sequence and a first adapter sequence; and b. a reverse primer comprising a nickase site and a second primer complementary to a second end of the oligonucleotide comprising a unique barcode sequence, a second priming sequence, and optionally, a unique molecular identifier (UMI) sequence.
In one embodiment, the first oligonucleotide primer is present in the composition at a higher concentration than the second oligonucleotide primer or vice versa.
In one embodiment, the forward primer is present at a concentration of at least 10^5 copies per reactor. In another embodiment, the forward primer is present at a concentration ranging from about 10^5 to about 10^10 copies per reactor, preferably from about 10^6 to about 10^9 copies per reactor. In one embodiment, the reverse primer is present at a concentration of at least 10^5 copies per reactor. In another embodiment, the reverse primer is present at a concentration ranging from about 10^5 to about 10^10 copies per reactor, preferably from about 10^6 to about 10^9 copies per reactor.
The composition disclosed herein may be used to barcode target sequence originating from a single cell.
Embodiments of the methods disclosed herein also apply to the composition disclosed herein.
EXAMPLES
Example 1: Single cell transcriptomic analysis on a mixture of two cell types (human Jurkat cell line and mouse p338dl)
To demonstrate that single cell transcripts were barcoded in a cell-specific manner, a mixture of human (Jurkat) and mouse (p338dl) cells was subjected to the protocol described below, and human- and mouse-specific transcripts were identified using a single gene-specific primer (GSP) targeting actB. The encapsulation was performed using a flow-focusing microfluidic device so that the droplets typically contained no more than one cell, and typically no more than one unique barcode molecule, suspended in the amplification and tagging mixture, it being understood that the encapsulations are random and governed by Poisson statistics. It was thus expected that any given barcode would be associated only with human reads or only with mouse reads, and not with a mixture of both.
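The Poisson behaviour of random encapsulation can be sketched as follows. The mean occupancies (lambda values) below are hypothetical and are not taken from this document; the sketch only shows how the fraction of droplets holding more than one cell (or more than one barcode molecule) shrinks as the mean occupancy is lowered.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Probability that a droplet holds exactly k objects at mean occupancy lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def occupancy_summary(lam: float) -> dict:
    """Fractions of droplets that are empty, singly occupied, or multiply occupied."""
    p_empty = poisson_pmf(0, lam)
    p_single = poisson_pmf(1, lam)
    return {"empty": p_empty, "single": p_single, "multiple": 1.0 - p_empty - p_single}


def multiplet_fraction_among_occupied(lam: float) -> float:
    """Among droplets that contain at least one object, the fraction with two or more."""
    summary = occupancy_summary(lam)
    return summary["multiple"] / (1.0 - summary["empty"])


# ASSUMPTION: illustrative mean occupancies only.
for lam in (0.05, 0.1, 0.3):
    s = occupancy_summary(lam)
    print(
        f"lambda={lam}: empty={s['empty']:.4f} single={s['single']:.4f} "
        f"multiple={s['multiple']:.4f} "
        f"multiplets among occupied={multiplet_fraction_among_occupied(lam):.4f}"
    )
```

Lower mean occupancy reduces multiplets at the cost of more empty droplets, which is the usual trade-off when loading droplets at limiting dilution.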
Preparation of barcoded DNA template ("barcoded DNA balls"): circularization and rolling circle amplification (RCA) (Day 0, 5 hours)
The library of precursor single-stranded DNA (ssDNA) was ordered as oligos with the degenerate sequence given in SEQ ID NO:1, which corresponds to the ssDNA (5' phosph) in Table 1.
SEQ ID NO:1:
5' P-GTAGATAGACCGTGANNTANNNNTANNNNTANNNNTAAGATCGGAAGAGCACACGTCATGAGACGATGATGG 3'
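The size of the barcode space encoded by the degenerate positions of SEQ ID NO:1 can be estimated with a short calculation. This sketch assumes that each N is an independent, uniformly random choice among A, C, G and T, which is an idealization of real oligo synthesis; the helper name `expected_colliding_pairs` is illustrative.

```python
# SEQ ID NO:1 written without the 5' phosphate annotation.
SEQ_ID_NO_1 = (
    "GTAGATAGACCGTGANNTANNNNTANNNNTANNNNTAAGATCGGAAGAGCACACGTCATGAGACGATGATGG"
)

# Count degenerate positions; each N is ASSUMED to be one of 4 equiprobable bases.
n_random = SEQ_ID_NO_1.count("N")
diversity = 4 ** n_random


def expected_colliding_pairs(n_barcodes: int, space_size: int) -> float:
    """Expected number of barcode pairs sharing the same sequence by chance."""
    return n_barcodes * (n_barcodes - 1) / (2 * space_size)


print(f"degenerate positions: {n_random}")
print(f"theoretical barcode diversity: {diversity:,}")
print(f"expected colliding pairs among 2000 barcodes: "
      f"{expected_colliding_pairs(2000, diversity):.4f}")
```

With a barcode space this large relative to the number of barcodes recovered in an experiment of the scale described in this example, two droplets receiving the same barcode sequence by chance is expected to be rare under the stated assumption.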
To circularize the precursors and remove linear ssDNA, the following components in Table 1 were mixed and incubated for 80 minutes at 60°C and then 10 minutes at 80°C.
Table 1: Components for circularization of precursors and linear ssDNA removal.
Component | Vol. | [Stock] | [Final]
10X Reaction Buffer | 1 µL | 10X | 1X
ssDNA (5' Phosph) (SEQ ID NO:1) | 5 µL | 1 µM | 0.5 µM
ATP | 0.5 µL | 1 mM | 0.05 mM
MnCl2 | 0.5 µL | 50 mM | 2.5 mM
CircLigase | 0.75 µL | 100 units/µL | 7.5 units/µL
Sterile water | 2.25 µL | |
TOTAL | 10 µL | |
After the circularization of precursors and linear ssDNA removal, the following components in Table 2 were mixed and incubated for 5 minutes at 37°C and then 1 minute and 30 seconds at 80°C.
Table 2: Thermolabile exonuclease reaction.
Component | Vol. | [Stock] | [Final]
NEBuffer r3.1 | 3 µL | 10X | 1X
Circular ssDNA (SEQ ID NO:1) | 6 µL | 0.5 µM | 0.1 µM
Thermolabile Exo I | 2.25 µL | 20 units/µL | 1.5 units/µL
Sterile water | 18.75 µL | |
TOTAL | 30 µL | |
The shift between linear/circularized DNA was compared on a 7.5% denaturing PAGE gel, and the yield of the reaction was quantified (generally 75-100% for ssDNA tested (70-100 nt)).
Next, rolling circle amplification (RCA) was performed. For the RCA, the following components in Table 3 were mixed and incubated for 130 minutes at 45°C and then for 10 minutes at 65°C.
Table 3: RCA reaction mixture.
Component | Vol. | [Stock] | [Final]
10X Reaction Buffer | 2.5 µL | 10X | 1X
dNTP | 2.5 µL | 10 mM | 1 mM
DTT | 0.25 µL | 100 mM | 1 mM
Primer (SEQ ID NO.261) | 0.5 µL | 1000 nM | 20 nM
Circular ssDNA (SEQ ID NO:1) | 0.5 µL | 100 nM | 2 nM
EquiPhi29 DNA Pol. | 1.25 µL | 10 units/µL | 0.5 units/µL
Sterile water | 17.5 µL | |
TOTAL | 25 µL | |
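The final concentrations in the reaction tables follow from the usual dilution relation, final = stock × (component volume / total volume). The following sketch checks the Table 3 (RCA) values this way, with volumes in microlitres; the data layout and function name are illustrative and not part of the protocol.

```python
def final_concentration(stock: float, volume_ul: float, total_volume_ul: float) -> float:
    """Dilution relation: final = stock * (component volume / total volume)."""
    return stock * volume_ul / total_volume_ul


TOTAL_UL = 25.0

# (component, volume in microlitres, stock, stated final, unit) from Table 3.
table3 = [
    ("10X Reaction Buffer", 2.5, 10.0, 1.0, "X"),
    ("dNTP", 2.5, 10.0, 1.0, "mM"),
    ("DTT", 0.25, 100.0, 1.0, "mM"),
    ("Primer", 0.5, 1000.0, 20.0, "nM"),
    ("Circular ssDNA", 0.5, 100.0, 2.0, "nM"),
    ("EquiPhi29 DNA Pol.", 1.25, 10.0, 0.5, "units/uL"),
]

checks = {}
for name, vol, stock, stated_final, unit in table3:
    computed = final_concentration(stock, vol, TOTAL_UL)
    checks[name] = abs(computed - stated_final) < 1e-9
    print(f"{name}: computed {computed:g} {unit}, stated {stated_final:g} {unit}")

# Water makes up the remainder of the reaction volume.
water_ul = TOTAL_UL - sum(vol for _, vol, _, _, _ in table3)
print(f"sterile water: {water_ul:g} uL")
```

The same check can be applied to the other reaction tables by substituting their volumes, stock concentrations and totals.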
The RCA reaction was diluted with H2O to a stock solution of 1 nM and was stored at -20°C.
Next, the buffer mix and enzyme mix were prepared for the isothermal barcode amplification reaction-reverse transcription (iBAR-RT). The following solutions in Tables 4 and 5 were prepared on ice and stored at -20°C.
Table 4: Enzyme mix.
Component | Vol. | [Stock] | [Final]
Recombinant albumin | 5 µL | 20 mg/mL | 2 mg/mL
Nt.BsmAI | 15 µL | 5000 units/mL | 1500 units/mL
ET.SSB | 5 µL | 500 µg/mL | 50 µg/mL
Bst 2.0 | 3.75 µL | 8000 units/mL | 600 units/mL
RNaseOut | 6.25 µL | 40 units/µL | 5 units/µL
SSIV | 7.5 µL | 200 units/µL | 30 units/µL
Sterile water | 7.5 µL | |
TOTAL | 50 µL | |
Table 5: Buffer mix.
Component | Vol. | [Stock] | [Final]
TRIS pH 8.4 | 31.25 µL | 1 M | 125 mM
(NH4)2SO4 | 25 µL | 1 M | 100 mM
KCl | 10.42 µL | 3 M | 125 mM
MgSO4 | 12.5 µL | 1 M | 50 mM
Tween 20 | 2.5 µL | 100 % | 1 %
DMSO | 25 µL | 100 % | 10 %
DTT | 50 µL | 100 mM | 20 mM
dNTP | 62.5 µL | 10 mM | 2.5 mM
Sterile water | 30.83 µL | |
TOTAL | 250 µL | |
Next the encapsulation, iBAR-RT, and library preparation mix was prepared. The following components in Table 6 were mixed together on ice and in the following order: H2O, buffers, additives, DNA, and finally enzymes. The mixture was kept on ice.
Table 6: iBAR-RT mix.
Component | Vol. | [Stock] | [Final]
Sterile water | 57.5 µL | |
Buffer Mix | 10 µL | 10X | 1X
tRNA S. cerevisiae | 2 µL | 1000 ng/µL | 20 ng/µL
Forward primer (SEQ ID NO:2) | 1 µL | 10000 nM | 100 nM
DNB (1 nM stock) (SEQ ID NO:1) | 1 µL | 1 pM | 0.01 pM
Reverse primer (SEQ ID NO:3) | 2.5 µL | 1000 nM | 25 nM
Enzyme Mix | 15 µL | 10X | 1.5X
EvaGreen | 0.5 µL | 100 % | 0.5 %
Igepal | 3 µL | 0.1 % | 0.003 %
Cells | 7.5 µL | 3,333 cells/µL | 250 cells/µL (25,000 cells)
TOTAL | 100 µL | |
SEQ ID NO:2:
5' CCATCATCGTCTCATGACGTGTGCT 3' P
SEQ ID NO:3:
5' TCAAATGTGTCTCTAACTGGGACGACATGGAGAATANNNTANNNNTANNNGTAGATAGACCGTGA 3'
(Note: the concentration of DNA barcoded balls was adjusted according to the size of the droplets used. The number of cells was adjusted according to the experiment, 5-20% of the total volume was preferred. DMEM, 10% FCS, 1% P/S can be replaced by RPMI, 10% FBS, 1% P/S.)
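The sequence relationships between the two primers and the barcode oligonucleotide can be checked directly from the listed sequences. The sketch below verifies that SEQ ID NO:2 is the reverse complement of the 3' end of SEQ ID NO:1, and that the 3' end of SEQ ID NO:3 matches the 5' end of SEQ ID NO:1. It also looks for the motif GTCTC in both primers; treating GTCTC as the Nt.BsmAI recognition site is an assumption based on supplier documentation, not a statement from this document.

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (N is kept as N)."""
    return seq.translate(COMPLEMENT)[::-1]


SEQ_ID_NO_1 = (
    "GTAGATAGACCGTGANNTANNNNTANNNNTANNNNTAAGATCGGAAGAGCACACGTCATGAGACGATGATGG"
)
FORWARD_PRIMER = "CCATCATCGTCTCATGACGTGTGCT"  # SEQ ID NO:2
REVERSE_PRIMER = (
    "TCAAATGTGTCTCTAACTGGGACGACATGGAGAATANNNTANNNNTANNNGTAGATAGACCGTGA"
)  # SEQ ID NO:3

# ASSUMPTION: GTCTC is taken as the nickase (Nt.BsmAI) recognition motif.
ASSUMED_NICKASE_SITE = "GTCTC"

# The forward primer is complementary to the 3' end of SEQ ID NO:1.
forward_matches_3prime_end = SEQ_ID_NO_1.endswith(reverse_complement(FORWARD_PRIMER))

# The last 15 nt of the reverse primer are identical to the 5' end of SEQ ID NO:1,
# so they would anneal to the complementary strand.
reverse_3prime_matches_5prime_end = SEQ_ID_NO_1.startswith(REVERSE_PRIMER[-15:])

print("forward primer complementary to 3' end:", forward_matches_3prime_end)
print("reverse primer 3' end matches 5' end:", reverse_3prime_matches_5prime_end)
print("assumed nickase motif in forward primer:", ASSUMED_NICKASE_SITE in FORWARD_PRIMER)
print("assumed nickase motif in reverse primer:", ASSUMED_NICKASE_SITE in REVERSE_PRIMER)
```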
Encapsulation: A standard polydimethylsiloxane (PDMS) microfluidic chip was used to encapsulate together individual cells and the previous mix. The solutions were kept on ice.
iBAR-RT reaction: The surplus oil was removed and the mix was incubated for 40 minutes at 37°C, 25 minutes at 55°C, and 20 minutes at 80°C, and then placed on ice (4°C).
Library preparation by PCR for gene-specific barcoding: The droplet population was split in two and the two emulsions were broken separately using 1H,1H,2H,2H-perfluoro-1-octanol (PFO). The oil was removed, fresh oil containing PFO (25%) was added, and the mix was vortexed and centrifuged. If needed, more PFO was added and the step was repeated.
The PCR mix was prepared (Table 7). The reverse primer included an Illumina index and two different indexes were used for the two pooled fractions.
Table 7: PCR mix.
Component | [Final]
5X Q5 reaction buffer | 1X
EvaGreen (if qPCR) | 0.2X
dNTPs | 200 µM
RevP (e.g., RSeq_tot_i35_SBS12_v2r) (SEQ ID NO:4) | 0.5 µM
ForP (e.g., FSeq_tot_PCR_actB_4) (SEQ ID NO:5) | 0.5 µM
Q5 High-Fidelity DNA Polymerase | 0.02 units/µL
ISDA-RT sample | 0.1X
Sterile water |
TOTAL |
SEQ ID NO:4:
5' CAAGCAGAAGACGGCATACGAGATAAAATGGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT 3'
SEQ ID NO:5:
5' AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCTGTCGTCGACAACGGCTCC 3'
The PCR run was performed as follows: 98°C for 30 seconds, followed by 28-34 cycles of 98°C for 10 seconds and 72°C for 38 seconds, followed by 72°C for 2 minutes, followed by a 4°C hold.
An agarose gel at 1% was prepared, and the gel was run for 40 minutes at 100-150V. The correct band was extracted and purified. The concentration of the purified DNA was measured, and the purified DNA was stored at -20°C.
Sequencing: The purified DNA was sequenced on a MinION using the manufacturer's protocol (e.g., the Ligation protocol SQK-LSK110).
Results: The reads were filtered by length, separated by index and demultiplexed (grouped by barcode), leading to around 2000 unique barcodes. The barcodes that appeared in both indexes (indicating breakage of the RCA balls) were removed. The reads within each cluster were assigned to the human or mouse variant of the targeted gene. The number of different barcodes as well as the fraction of human and mouse reads for each cluster were computed. The results are provided in FIGs. 4 and 5, showing that most barcodes are associated with reads from a single cell type, and few barcodes are associated with reads from the two cell types. This demonstrates that the isothermal barcode amplification reaction happening in one given droplet has enabled tagging the transcripts coming from the co-encapsulated cell, and only that cell.
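The per-barcode analysis described above can be sketched as follows. This is a minimal illustration and not the pipeline actually used: the read representation (index, barcode, species), the function name, and the 0.9 purity threshold are all hypothetical, and upstream steps such as length filtering and barcode extraction are assumed to have been done already.

```python
from collections import Counter, defaultdict


def species_purity(reads, min_purity=0.9):
    """Summarize human/mouse read composition per barcode.

    reads: iterable of (index, barcode, species) with species "human" or "mouse".
    Barcodes seen in more than one index are discarded.
    Returns {barcode: (n_reads, human_fraction, label)}.
    """
    indexes_per_barcode = defaultdict(set)
    species_counts = defaultdict(Counter)
    for index, barcode, species in reads:
        indexes_per_barcode[barcode].add(index)
        species_counts[barcode][species] += 1

    summary = {}
    for barcode, counts in species_counts.items():
        if len(indexes_per_barcode[barcode]) > 1:
            continue  # barcode shared between the two pooled fractions: removed
        n_reads = sum(counts.values())
        human_fraction = counts["human"] / n_reads
        purity = max(human_fraction, 1.0 - human_fraction)
        if purity >= min_purity:  # ASSUMPTION: hypothetical purity threshold
            label = "human" if human_fraction >= 0.5 else "mouse"
        else:
            label = "mixed"
        summary[barcode] = (n_reads, human_fraction, label)
    return summary


# Toy reads, invented purely for illustration.
example_reads = (
    [("i1", "AAA", "human")] * 9
    + [("i1", "AAA", "mouse")]
    + [("i2", "CCC", "mouse")] * 5
    + [("i1", "GGG", "human"), ("i2", "GGG", "human")]
    + [("i2", "TTT", "human"), ("i2", "TTT", "mouse")]
)
for barcode, row in sorted(species_purity(example_reads).items()):
    print(barcode, row)
```

Plotting the human fraction per barcode from such a summary gives the kind of species-mixing view that is commonly used to assess single-cell barcoding specificity.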

Claims

1. A method of barcoding a target sequence, the method comprising the steps of:
   a. providing in a reactor:
      i. an oligonucleotide comprising a unique barcode sequence flanked by a priming sequence,
      ii. an amplification mixture comprising a polymerase with strand displacement capability and a nickase,
      iii. an oligonucleotide primer pair comprising a forward primer and a reverse primer;
      iv. a single cell comprising the target sequence;
   b. contacting said oligonucleotide comprising a unique barcode sequence with one oligonucleotide primer of the oligonucleotide primer pair;
   c. extending said one oligonucleotide primer positioned on said oligonucleotide comprising a unique barcode sequence, thereby generating a double-stranded oligonucleotide comprising the unique barcode sequence;
   d. amplifying, in presence of the nickase, said double-stranded oligonucleotide comprising the unique barcode sequence by using the oligonucleotide primer pair, thereby generating a plurality of single-stranded barcoded oligonucleotide sequences;
   e. barcoding said target sequence with at least one of said plurality of single-stranded barcoded oligonucleotide sequences to generate a barcoded target sequence.
2. A method of analysing a single cell, the method comprising the steps of:
   a. providing in a reactor:
      i. an oligonucleotide comprising a unique barcode sequence flanked by a priming sequence,
      ii. an amplification mixture comprising a polymerase with strand displacement capability and a nickase,
      iii. an oligonucleotide primer pair comprising a forward primer and a reverse primer,
      iv. a single cell comprising a target sequence;
   b. contacting said oligonucleotide comprising a unique barcode sequence with one oligonucleotide primer of the oligonucleotide primer pair,
   c. extending said one oligonucleotide primer positioned on said oligonucleotide comprising a unique barcode sequence, thereby generating a double-stranded oligonucleotide comprising the unique barcode sequence;
   d. amplifying, in presence of the nickase, said double-stranded oligonucleotide comprising the unique barcode sequence by using the oligonucleotide primer pair, thereby generating a plurality of single-stranded barcoded oligonucleotide sequences;
   e. barcoding said target sequence with at least one of said plurality of single-stranded barcoded oligonucleotide sequences to generate a barcoded target sequence; and
   f. analysing said barcoded target sequence.
3. The method according to any one of the claims 1 or 2, wherein said target sequence is a nucleic acid selected from a transcriptome, a genome, an exome, transfected or transduced nucleic acid, a mitochondrial DNA, a chloroplast DNA or a modified nucleic acid.
4. The method according to any one of the claims 1 to 4, wherein said oligonucleotide comprising a unique barcode sequence is obtained by a rolling circle amplification.
5. The method according to any one of the claims 1 to 5, wherein step (e) comprises an exponential amplification phase followed by a linear amplification phase.
6. The method according to any one of the claims 1 to 7, wherein said forward primer comprises a nickase site, a first primer sequence complementary to a first end of the oligonucleotide comprising a unique barcode sequence, a first priming sequence and, optionally, a unique molecular identifier (UMI) sequence.
7. The method according to any one of the claims 1 to 7, wherein said reverse primer comprises a nickase site, a second primer complementary to a second end of the oligonucleotide comprising a unique barcode sequence, a second priming sequence, and optionally, a unique molecular identifier (UMI) sequence.
8. The method according to any one of the claims 1 to 9, wherein said forward primer is present at a higher concentration than said reverse primer or vice versa.
9. The method according to any one of the claims 1 to 10, wherein said forward primer is present at a concentration of at least about 10^5 copies per reactor.
10. The method according to any one of the claims 1 to 10, wherein said reverse primer is present at a concentration of at least about 10^5 copies per reactor.
11. The method according to any one of the claims 1 to 12, wherein the target sequence is a transcript sequence.
12. The method according to any one of the claims 1 to 13, wherein the priming sequence comprises oligo(dT), oligo(dT)VN or at least one targeted sequence.
13. The method according to any one of the claims 1 to 16, further comprising lysing said single cell to release said target sequence from the single cell.
14. The method according to any one of the claims 1 to 16, wherein said plurality of single-stranded barcoded oligonucleotide sequences target a biomolecule exposed on the surface of the single cell.
15. A composition comprising: a. a forward primer comprising a nickase site, a first primer sequence complementary to a first end of an oligonucleotide comprising a unique barcode sequence, a first priming sequence and, optionally, a unique molecular identifier (UMI) sequence and a first adapter sequence; and b. a reverse primer comprising a nickase site, second primer complementary to a second end of the oligonucleotide comprising a unique barcode sequence, a second priming sequence, optionally, a unique molecular identifier (UMI) sequence.

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
EP24174376 2024-05-06
EP24174376.4 2024-05-06

Publications (1)

Publication Number Publication Date
WO2025233344A1 (en) 2025-11-13

Family

ID=91027106

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2025/062369 Pending WO2025233344A1 (en) 2024-05-06 2025-05-06 Methods and compositions for barcoding nucleic acids

Country Status (1)

Country Link
WO (1) WO2025233344A1 (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021088189A1 (en) * 2019-11-08 2021-05-14 天津大学 Oligonucleotide library isothermal amplification method for dna data storage
WO2021155057A1 (en) * 2020-01-29 2021-08-05 Becton, Dickinson And Company Barcoded wells for spatial mapping of single cells through sequencing


Non-Patent Citations (17)

* Cited by examiner, † Cited by third party
Title
ABATE ET AL., PROC. NAT. ACAD. SCI., vol. 107, 2010, pages 19163 - 19166
BAROUD ET AL., LAB CHIP, vol. 7, 2007, pages 1029 - 1033
CHABERT ET AL., ELECTROPHORESIS, vol. 26, 2005, pages 3706 - 3715
DAVID REDIN ET AL: "Droplet Barcode Sequencing for targeted linked-read haplotyping of single DNA molecules", NUCLEIC ACIDS RESEARCH, vol. 45, no. 13, 19 May 2017 (2017-05-19), GB, pages e125 - e125, XP055584526, ISSN: 0305-1048, DOI: 10.1093/nar/gkx436 *
HEATH ET AL., NAT. REV. DRUG DISCOV., vol. 15, no. 3, 2016, pages 204 - 216
JAITIN ET AL., SCIENCES, vol. 343, no. 6172, 2014, pages 776 - 779
KIVIOJA ET AL., NATURE METHODS, vol. 9, 2012, pages 72 - 74
KLEIN ET AL., CELL, vol. 161, no. 5, 2015, pages 1202 - 1214
LINK ET AL., ANGEW. CHEM., INT. ED., vol. 45, 2006, pages 2556 - 2560
MAZUTIS ET AL., LAB CHIP, vol. 12, 2012, pages 1800 - 1806
MAZUTIS ET AL., LAB ON A CHIP, vol. 9, no. 18, 2009, pages 2665 - 2672
PRIEST ET AL., APPL. PHYS. LETT., vol. 89, 2006, pages 134101
ROTEM ET AL., PLOS ONE, vol. 10, no. 5, 2015, pages 0116328
SHAPIRO ET AL., NAT. REV. GENET., vol. 14, 2013, pages 618 - 630
XI ET AL., LAB CHIP, vol. 17, 2017, pages 751 - 771
YAMAGATA ET AL., PNAS, vol. 99, no. 9, 2002, pages 5908 - 5912
ZHU ET AL., BIOTECHNIQUES, vol. 30, 2001, pages 892 - 897
