GAP-FILL LIGATION AND PRIMER EXTENSION REACTIONS EMPLOYING EXONUCLEASE-RESISTANT OR CAPTURE MOIETY MEANS
CROSS-REFERENCING
This application claims the benefit of U.S. provisional application serial no. 63/439,839, filed on January 18, 2023, which application is incorporated herein for all purposes.
BACKGROUND
DNA microscopy methods, which map the relative locations of biomolecules in cells and tissues, rely on the addition of DNA sequences to probes that specifically target proteins or nucleic acids. These probes are for example antibodies coupled with DNA. A few methods exist to perform the DNA analysis to achieve relative locations for proteins. As these reactions are difficult to optimize to 100% completion of each reaction step, any remaining non-extended sequences are still present during the PCR amplification which makes up the sequencing library preparation. These non-extended sequences can interfere during the PCR reaction forming false PCR reaction products. These false PCR reaction products can scramble the spatial relations making data analyses much more difficult.
The present disclosure provides a solution to this problem.
SUMMARY
A method of performing a gap-fill ligation reaction is provided. In some embodiments, the method may comprise: (a) hybridizing a first oligonucleotide and a second oligonucleotide to a population of template molecules that comprise an identifier sequence that varies in the population, wherein the first and second oligonucleotides hybridize to sites that flank the identifier sequence; (b) incubating the product of (a) under gap-fill ligation conditions to produce: (i) full length products that comprise the full sequence of the first oligonucleotide, the complement of an identifier sequence and the second oligonucleotide; and (ii) incomplete products that do not comprise the full sequence of the first oligonucleotide, the complement of an identifier sequence and the second oligonucleotide; and (c) enriching for the full-length products using an exonuclease or by affinity. In some embodiments: the 3' end of one of the first and second oligonucleotides is exonuclease resistant, the 5' end of the other of the first and second oligonucleotides is exonuclease resistant, and enrichment of the full-length products is done by treating the product of step (b) with an exonuclease. Alternatively or in addition, the 5' or 3' end of one of the first or the second oligonucleotides may have a capture moiety, and enrichment of the full-length products may be done using a support that has affinity for the capture moiety.
Kits for performing the method are also provided.
BRIEF DESCRIPTION OF THE FIGURES
The skilled artisan will understand that the drawings, described below, are for illustration purposes only. The drawings are not intended to limit the scope of the present teachings in any way.
Fig. 1 illustrates a problem addressed by the present method.
Fig. 2 illustrates a first embodiment of the present method.
Fig. 3 illustrates a first embodiment of the present method.
Fig. 4 illustrates a second embodiment of the present method.
Fig. 5 illustrates a second embodiment of the present method.
Fig. 6 illustrates a workflow in which the present method may be used.
Fig. 7 illustrates an alternative embodiment of the method.
Fig. 8 shows the spatial links found between cells without using an exonuclease (RecJ). Each black spot is a cell, gray lines are sequence based links between cells.
Fig. 9 shows the links found between cells when the products are digested using an exonuclease (RecJ) before PCR. Each black spot is a cell, and gray lines are sequence-based links between cells. Real individual cells are now more easily identified because the number of false links is much lower. This is especially useful when studying cell-cell interactions.
DEFINITIONS
Unless defined otherwise herein, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Although any methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present invention, the preferred methods and materials are described.
All patents and publications, including all sequences disclosed within such patents and publications, referred to herein are expressly incorporated by reference.
Numeric ranges are inclusive of the numbers defining the range. Unless otherwise indicated, nucleic acids are written left to right in 5' to 3' orientation; amino acid sequences are written left to right in amino to carboxy orientation, respectively.
The headings provided herein are not limitations of the various aspects or embodiments of the invention. Accordingly, the terms defined immediately below are more fully defined by reference to the specification as a whole.
Singleton, et al., DICTIONARY OF MICROBIOLOGY AND MOLECULAR BIOLOGY, 2D ED., John Wiley and Sons, New York (1994), and Hale & Markham, THE HARPER COLLINS DICTIONARY OF BIOLOGY, Harper Perennial, N.Y. (1991) provide one of skill with the general meaning of many of the terms used herein. Still, certain terms are defined below for the sake of clarity and ease of reference.
The term “oligonucleotide” as used herein denotes a single-stranded multimer of nucleotides of from about 2 to 200 nucleotides, up to 500 nucleotides in length. Oligonucleotides may be synthetic or may be made enzymatically, and, in some embodiments, are 30 to 150 nucleotides in length. Oligonucleotides may contain ribonucleotide monomers (i.e., may be oligoribonucleotides) or deoxyribonucleotide monomers. An oligonucleotide may be 10 to 20, 21 to 30, 31 to 40, 41 to 50, 51 to 60, 61 to 70, 71 to 80, 80 to 100, 100 to 150 or 150 to 200 nucleotides in length, for example.
The term “primer” as used herein refers to an oligonucleotide that is capable of acting as a point of initiation of synthesis when placed under conditions in which synthesis of a primer extension product, which is complementary to a nucleic acid strand, is induced, i.e., in the presence of nucleotides and an inducing agent such as a DNA polymerase and at a suitable temperature and pH. The 3' end of a primer is typically complementary to at least 10 nucleotides (e.g., 10-30 nucleotides) of a template.
The term “primer extension product” refers to the product of extension of a primer or the product of extension of a molecule that is itself a primer extension product.
The term “hybridization” or “hybridizes” refers to a process in which a nucleic acid strand anneals to and forms a stable duplex, either a homoduplex or a heteroduplex, under normal hybridization conditions with a second complementary nucleic acid strand and does not form a stable duplex with unrelated nucleic acid molecules under the same normal hybridization conditions. The formation of a duplex is accomplished by annealing two complementary nucleic acid strands in a hybridization reaction. The hybridization reaction can be made to be highly specific by adjustment of the hybridization conditions (often referred to as hybridization stringency) under which the hybridization reaction takes place, such that hybridization between two nucleic acid strands will not form a stable duplex, e.g., a duplex that retains a region of double-strandedness under normal stringency conditions, unless the two nucleic acid strands contain a certain number of nucleotides in specific sequences which are substantially or completely complementary. "Normal hybridization or normal stringency conditions” are readily determined for any given hybridization reaction. See, for example, Ausubel et al., Current Protocols in Molecular Biology, John Wiley & Sons, Inc., New York, or Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press. As used herein, the term "hybridizing” or "hybridization” refers to any process by which a strand of nucleic acid binds with a complementary strand through base pairing.
A nucleic acid is considered to be “selectively hybridizable” to a reference nucleic acid sequence if the two sequences specifically hybridize to one another under moderate to high stringency hybridization and wash conditions. Moderate and high stringency hybridization conditions are known (see, e.g., Ausubel, et al., Short Protocols in Molecular Biology, 3rd ed., Wiley & Sons 1995 and Sambrook et al., Molecular Cloning: A Laboratory Manual, Third Edition, 2001 Cold Spring Harbor, N.Y.). One example of high stringency conditions includes hybridization at about 42 °C in 50% formamide, 5X SSC, 5X Denhardt’s solution, 0.5% SDS and 100 ug/ml denatured carrier DNA followed by washing two times in 2X SSC and 0.5% SDS at room temperature and two additional times in 0.1 X SSC and 0.5% SDS at 42 °C.
The term “sequencing”, as used herein, refers to a method by which the identity of at least 10 consecutive nucleotides (e.g., the identity of at least 20, at least 50, at least 100 or at least 200 or more consecutive nucleotides) of a polynucleotide is obtained.
The term “next-generation sequencing” refers to the so-called parallelized sequencing-by-synthesis or sequencing-by-ligation platforms currently employed by, e.g., Illumina, Life Technologies, BGI Genomics (Complete Genomics technology), and Roche etc. Next-generation sequencing methods may also include nanopore sequencing methods or electronic-detection based methods such as, e.g., Ion Torrent technology commercialized by Life Technologies.
The term “duplex,” or “duplexed,” as used herein, describes two complementary polynucleotides that are base-paired, i.e., hybridized together.
The terms “determining,” “measuring,” “evaluating,” “assessing,” “assaying,” and “analyzing” are used interchangeably herein to refer to forms of measurement, and include determining if an element is present or not. These terms include both quantitative and/or qualitative determinations. Assessing may be relative or absolute.
The term “ligating”, as used herein, refers to the enzymatically catalyzed joining of the terminal nucleotide at the 5’ end of a first DNA molecule to the terminal nucleotide at the 3’ end of a second DNA molecule.
The terms “plurality”, “set” and “population” are used interchangeably to refer to something that contains at least 2 members. In certain cases, a plurality may have at least 10, at least 100, at least 1000, at least 10,000, or at least 100,000 members.
A “primer binding site” refers to a site to which an oligonucleotide hybridizes in a target polynucleotide or fragment. If an oligonucleotide “provides” a binding site for a primer, then the primer may hybridize to that oligonucleotide or its complement.
The term “strand” as used herein refers to a nucleic acid made up of nucleotides linked together by covalent bonds, e.g., phosphodiester bonds.
The term “extending”, as used herein, refers to the extension of a nucleic acid by ligation or the addition of nucleotides using a polymerase. If a nucleic acid that is annealed to a polynucleotide is extended, the polynucleotide acts as a template for an extension reaction. In these embodiments, the nucleic acid may be extended by a template-dependent polymerase or by ligation to an oligonucleotide that is complementary to the polynucleotide, where the polynucleotide acts as a splint.
The term "extending" includes extension at the 3' end or the 5' end. Primer extension, ligation and gap-fill ligation reactions are types of extending.
The term "extendible 5' or 3' end" refers to a 5' phosphate and 3' hydroxyl, respectively, both of which are extensible by ligation. 3' hydroxyls are also extendible by a polymerase.
As used herein, the term “barcoded particles” is intended to refer to both barcoded RCA products and barcoded beads, wherein the particles in a population of barcoded particles are each separately barcoded with a unique particle identifier sequence, i.e., a sequence that is unique to each particle such that the particles can be distinguished from one another by their unique identifier sequences.
As used herein, the term “rolling circle amplification” or “RCA” for short refers to an isothermal amplification that generates linear concatemerized copies of a circular nucleic acid template using a strand-displacing polymerase. RCA is well known in the molecular biology arts and is described in a variety of publications including, but not limited to Lizardi et al (Nat. Genet. 1998 19:225-232), Schweitzer et al (Proc. Natl. Acad. Sci. 2000 97:10113-10119), Wiltshire et al (Clin. Chem. 2000 46:1990-1993) and Schweitzer et al (Curr. Opin. Biotech 2001 12:21-27), which are incorporated by reference herein.
As used herein, the term “rolling circle amplification products” refers to the concatemerized products of a rolling circle amplification reaction.
The term "opposite end" refers to the other end of a nucleic acid molecule. The opposite end to a 3' end is the 5' end and the opposite end to a 5' end is the 3' end.
The term "gap-fill ligation" refers to a reaction in which two oligonucleotides hybridize to nearby sites on a template to define a gap. The gap is filled in by a polymerase and the nick between the primer extension product and the adjacent oligonucleotide is sealed by a ligase. See, e.g., Mignardi et al, Nucleic Acids Res. 2015 43: e151.
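The definition above can be illustrated with a minimal in silico sketch. It assumes, for simplicity, that each oligonucleotide hybridizes to the template over its entire length (i.e., no non-hybridizing tails), that all sequences are written 5' to 3', and that the reaction goes to completion; the function names and sequences are hypothetical and chosen only for illustration.

```python
# Sketch of a gap-fill ligation product, all sequences written 5' to 3'.
# Assumption: both oligonucleotides anneal over their full length (no tails).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def gap_fill_ligate(template: str, upstream_oligo: str, downstream_oligo: str):
    """Return the full-length product, or None if the oligos do not flank a gap.

    upstream_oligo   : oligo whose 3' end is extended by the polymerase
    downstream_oligo : oligo whose 5' end is ligated to the extended strand
    """
    # Sites on the template (5' to 3') to which each oligo anneals.
    up_site = reverse_complement(upstream_oligo)
    down_site = reverse_complement(downstream_oligo)
    down_start = template.find(down_site)
    up_start = template.find(up_site)
    if down_start < 0 or up_start < 0:
        return None  # at least one oligo has no binding site
    gap_start = down_start + len(down_site)
    if up_start < gap_start:
        return None  # sites overlap or are in the wrong order
    gap = template[gap_start:up_start]  # e.g., the identifier sequence
    # Polymerase copies the gap; ligase seals the nick to the downstream oligo.
    return upstream_oligo + reverse_complement(gap) + downstream_oligo


product = gap_fill_ligate("CTGCAAGATTCGTACGT", "ACGTAC", "TTGCAG")
print(product)
```

In this toy case the gap on the template is GATTC, so the product carries its complement (GAATC, read 5' to 3') between the two oligonucleotide sequences.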
The terms “antibody” and “immunoglobulin” include antibodies or immunoglobulins of any isotype and fragments of antibodies which retain specific binding to antigen, including, but not limited to, Fab, Fv, scFv, and Fd fragments, chimeric antibodies, humanized antibodies, minibodies, single-chain antibodies, nanobodies and fusion proteins comprising an antigen-binding portion of an antibody and a non-antibody protein. Also encompassed by the term are Fab’, Fv, F(ab’)2, and/or other antibody fragments that retain specific binding to antigen, and monoclonal antibodies. Antibodies may exist in a variety of other forms including, for example, Fv, Fab, and (Fab')2, as well as bi-functional (i.e., bi-specific) hybrid antibodies (e.g., Lanzavecchia et al., Eur. J. Immunol. 17, 105 (1987)) and in single chains (e.g., Huston et al., Proc. Natl. Acad. Sci. U.S.A., 85, 5879-5883 (1988) and Bird et al., Science, 242, 423-426 (1988)), which are incorporated herein by reference. (See, generally, Hood et al., “Immunology”, Benjamin, N.Y., 2nd ed. (1984), and Hunkapiller and Hood, Nature, 323, 15-16 (1986)).
The terms “antibody-oligonucleotide conjugate” and “capture agent that is linked to an oligonucleotide” and the like refer to a capture agent, e.g., an antibody or aptamer, that is non-covalently (e.g., via a streptavidin/biotin interaction) or covalently (e.g., via a click reaction or the like) linked to a single-stranded oligonucleotide in a way that the capture agent can still bind to its binding site. The oligonucleotide and the capture agent may be linked via a number of different methods, including those that use maleimide or halogen-containing groups, which are cysteine-reactive. The capture agent and the oligonucleotide may be linked proximal to or at the 5’ end of the oligonucleotide, proximal to or at the 3’ end of the oligonucleotide, or anywhere in-between. In some embodiments, the oligonucleotides may be linked to the capture agents by a linker that spaces the oligonucleotide from the capture agents. Oligonucleotides may be linked to capture agents using any convenient method (see, e.g., Gong et al., Bioconjugate Chem. 2016 27: 217-225 and Kazane et al. Proc Natl Acad Sci 2012 109: 3731-3736). In many embodiments, the sequence of an oligonucleotide that is conjugated to a binding agent uniquely identifies the epitope or sequence to which the binding agent binds. For example, if the method is performed using 10 different antibodies, then each antibody is tethered to a different sequence that identifies the epitope to which the antibody binds. This feature allows the method to be multiplexed and, in some embodiments, at least 5, at least 10, at least 20 or at least 50 different antibodies that bind to different markers in or on the surface of a cell can be used in the method. Each antibody is conjugated to a different antibody identifier sequence, and the antibody identifier sequences allow the binding events for a particular antibody to be mapped.
Other definitions of terms may appear throughout the specification.
DETAILED DESCRIPTION
Certain gap-fill ligation reactions generate products that interfere with later reactions, e.g., PCR. This problem is illustrated in Fig. 1. With reference to Fig. 1, certain gap-fill ligation methods involve: hybridizing a first oligonucleotide 2 and a second oligonucleotide 4 to a population of template molecules 6 that comprise an identifier sequence 8 ("NNN", which may be 4-20 nt in length in some cases) that varies in the population, wherein the first and second oligonucleotides hybridize to sites that flank the identifier sequence 8. The distance between one end of the first oligonucleotide and the other end of the second oligonucleotide (i.e., the "gap") may be in the range of 4-100 nucleotides (e.g., 5-30 nucleotides) although this distance could be longer or shorter in some cases. As illustrated, first oligonucleotide 2 may provide a forward primer binding site ("FP"), a variable (e.g., random) sequence ("xxx", which may be 4-20 nt in length in some cases) and an end that hybridizes to the template (which may be 10-30 nucleotides in length in many cases). Likewise, second oligonucleotide 4 may provide a reverse primer binding site ("RP") and an end that hybridizes to the template (which may be 10-30 nucleotides in length in many cases). As illustrated, the reverse primer binding site (RP) may be in a tail of the primer or it may be in the sequence that hybridizes to the template, in which case the complement of the reverse primer binding site will be in the underlying template.
In the next step of the method, the product is incubated under gap-fill ligation conditions to produce: (i) full length products that comprise the full sequence of the first oligonucleotide 2 (including the FP sequence and the random sequence, xxx), the complement of an identifier sequence 8' (NNN') and the second oligonucleotide 4 (including the RP sequence); and (ii) incomplete products that do not comprise the full sequence of the first oligonucleotide 2, the complement of an identifier sequence 8' and the second oligonucleotide 4. The nature of the incomplete products depends on whether the first oligonucleotide or the second oligonucleotide is being extended by the polymerase (see the top panels of Fig. 1). In embodiments in which the first oligonucleotide is extended by the polymerase, the incomplete products may be composed of an unextended first oligonucleotide 2, a partially extended oligonucleotide 2, a fully extended oligonucleotide 2 that has not ligated to the second oligonucleotide 4, or any combination thereof. In embodiments in which the second oligonucleotide is extended by the polymerase, the incomplete products may comprise an unextended second oligonucleotide 4, a partially extended oligonucleotide 4, a fully extended oligonucleotide 4 that has not ligated to the first oligonucleotide 2, or any combination thereof.
The problem is that the incomplete products interfere with later amplification reactions, e.g., PCR, in which a pair of primers that bind to the PCR primer sites or their complements are used. In particular, the incomplete products (or their copies made during PCR) may act as PCR primers themselves, which leads to confounding results.
In some embodiments, this problem can be addressed by making the 3' end of one of the first and second oligonucleotides (i.e., the oligonucleotide that is not extended by the polymerase) exonuclease resistant and the 5' end of the other of the first and second oligonucleotides (i.e., the oligonucleotide that the extended oligonucleotide is ligated to) exonuclease resistant. In these embodiments, both ends of the full-length products should be exonuclease resistant and, as such, the full-length products can be enriched by treating the product of step (b) with an exonuclease. This embodiment is illustrated in Fig. 2. In the illustrated embodiment one end of the first oligonucleotide is made exonuclease resistant by attaching it to an antibody and the opposite end of the second oligonucleotide has an exonuclease-resistant nucleotide or linkage (as indicated by the asterisk). In this figure, either oligonucleotide can be extended in the gap-fill reaction. Fig. 3 illustrates this method in a different way.
The exonuclease used in the method may be any one or a combination of exonucleases, including, e.g., both exonuclease I and exonuclease III, although one or more other exonucleases, e.g., exonuclease T, exonuclease V, exonuclease VII, T5 exonuclease, T7 exonuclease, RecJ exonuclease, etc., could be used. As would be apparent, the selected exonuclease can be specific either for the 5' end or the 3' end of DNA, and a mixture could be used. Lambda exonuclease preferentially degrades phosphorylated 5' ends and acts much less efficiently on non-phosphorylated 5' ends. As such, lambda exonuclease can be used in many embodiments, particularly when the full-length product has a free 5' end (i.e., one not tethered to an antibody) that is exonuclease resistant but the partial products do not.
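The enrichment logic described above can be sketched as a simple model in which a molecule survives a 5'-to-3' exonuclease only if its 5' end is protected and survives a 3'-to-5' exonuclease only if its 3' end is protected. The scenario below is one hypothetical configuration (first oligonucleotide tethered to an antibody at its 5' end and extended by the polymerase; second oligonucleotide carrying an exonuclease-resistant 3' end); the class and function names are assumptions made for illustration, and real exonuclease digestion is not all-or-nothing.

```python
# Toy model of exonuclease enrichment; names and scenario are illustrative only.
from dataclasses import dataclass


@dataclass(frozen=True)
class Product:
    name: str
    five_prime_protected: bool   # e.g., tethered to an antibody
    three_prime_protected: bool  # e.g., phosphorothioate linkage


def survives(product: Product, exo_5to3: bool = True, exo_3to5: bool = True) -> bool:
    """Return True if the product resists every exonuclease activity present."""
    if exo_5to3 and not product.five_prime_protected:
        return False
    if exo_3to5 and not product.three_prime_protected:
        return False
    return True


def enrich(products, exo_5to3: bool = True, exo_3to5: bool = True):
    """Keep only the products that survive the chosen exonuclease treatment."""
    return [p for p in products if survives(p, exo_5to3, exo_3to5)]


reaction = [
    Product("full_length", True, True),              # both ends protected
    Product("unextended_first_oligo", True, False),  # free 3' hydroxyl
    Product("partially_extended_first_oligo", True, False),
    Product("unligated_second_oligo", False, True),  # free 5' phosphate
]

print([p.name for p in enrich(reaction)])
```

In this model, only the ligated product has both ends protected, so it is the only species that remains when exonuclease activities of both polarities are present.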
In another embodiment, the problem described above can be addressed by adding a capture moiety to the 5' or 3' end of one of the first or the second oligonucleotides. In these embodiments, enrichment of the full-length products can be done using a support that has affinity for the capture moiety. This embodiment is illustrated in Fig. 4, where the 5' end of one of the oligonucleotides is attached to an antibody and the 3' end of the other oligonucleotide has a capture moiety, e.g., a biotin.
As would be apparent, unless otherwise stated an oligonucleotide that is modified at one end (e.g., by conjugation to an antibody, a capture moiety or by an exonuclease resistant linkage or nucleotide) may have hydroxyl or phosphate at the other (whichever is appropriate).
In some embodiments and as illustrated, one of the first and second oligonucleotides may be conjugated to an antibody or aptamer. In these embodiments, the oligonucleotide that is attached to the antibody or aptamer may have a sequence barcode that identifies the antibody or aptamer to which it is conjugated, e.g., a sequence of 3-10 nucleotides that identifies the antigen to which the antibody or aptamer binds. The method can be multiplexed in these embodiments, and the method may be performed using at least 10, or at least 20 and up to 50 or 100 or more different antibodies or aptamers, each conjugated to an oligonucleotide that contains a sequence barcode that identifies the antibody or aptamer to which it is conjugated.
In some embodiments, the antibody or aptamer is bound to a cell. In any embodiment, the cell may be in solution, on a support (e.g., a slide), in a three-dimensional sample of tissue, or in a tissue section. In some embodiments, the antibody or aptamer may be bound to an antigen in a cell or on the surface of the cell (such that the antibody or aptamer coats the surface of the cells). A sample containing cells that are in solution may be a sample of cultured cells that have been grown as a cell suspension, for example. In other embodiments, disassociated cells (which cells may have been produced by disassociating cultured cells or cells that are in a solid tissue, e.g., a soft tissue such as liver or spleen, using trypsin or the like) may be used. In particular embodiments, cells can be found in blood, e.g., cells that are in whole blood or a sub-population of cells thereof. Sub-populations of cells in whole blood include platelets, red blood cells (erythrocytes) and white blood cells (i.e., peripheral blood leukocytes, which are made up of neutrophils, lymphocytes, eosinophils, basophils and monocytes). These five types of white blood cells can be further divided into two groups, granulocytes (which are also known as polymorphonuclear leukocytes and include neutrophils, eosinophils and basophils) and mononuclear leukocytes (which include monocytes and lymphocytes). Lymphocytes can be further divided into T cells, B cells and NK cells. Peripheral blood cells are found in the circulating pool of blood and not sequestered within the lymphatic system, spleen, liver, or bone marrow. If cells that are immobilized on a support are used, then the sample may be made by, e.g., growing cells on a planar surface, depositing cells on a planar surface, e.g., by centrifugation, or by cutting a three-dimensional object that contains cells into sections and mounting the sections onto a planar surface, i.e., producing a tissue section.
In some embodiments, the template may be a nucleic acid barcoded particle, for example, a nucleic acid barcoded bead. Nucleic acid barcoded particles are uniquely barcoded by surface-tethered oligonucleotides that have unique particle identifier sequences. As used herein, the term “population of barcoded particles” refers to particles, e.g., small beads, metallic particles or the like, that are coated in oligonucleotides, where the surface-tethered oligonucleotides on each particle have a unique sequence that is different to the sequence that is in the oligonucleotides that are tethered to other particles in the population. In other words, if there are 1,000 barcoded particles, the oligonucleotides that are tethered to each particle will have a unique sequence (referred to herein as a unique molecular identifier “UMI” or unique identifier “UID”). The UID for one particle is different to the UIDs for other particles. Methods for making barcoded particles are known. See, e.g., Kanagal-Shamanna et al (Methods Mol Biol 2016 1392: 33-42) and Shao et al (PlosOne 2011 0024910) and Delley et al. (Scientific Reports 2021 11: 10857).
In other embodiments, the template may be a barcoded RCA product. Barcoded RCA products each contain a unique sequence that is in the repeated sequence. In other words, if there are 1,000 RCA products, each product will have a unique sequence (referred to herein as a unique molecular identifier “UMI” or unique identifier “UID”). The UID for one particle is different to the UIDs for other particles. The RCA product can be made by, e.g., synthesizing initial oligonucleotides that have a degenerate sequence, circularizing the initial oligonucleotides using a splint, and amplifying the circularized oligonucleotides by RCA. In some embodiments, the initial oligonucleotides may contain a degenerate (e.g., random) sequence of 6-10 nucleotides, or even more random nucleotides dependent on the number of unique RCA products required. Amplification of circularized oligonucleotides that have a degenerate sequence should produce a population of RCA products that each have a unique identifier (i.e., a sequence that is different from the other RCA products in the population). Methods for generating RCA products that have unique identifiers are described in Wu et al (Nat. Comm. 2019 10: 3854) and US20160281134, for example, and are readily adapted for use herein. In some embodiments, the different oligonucleotides that are used to make the first and second sets of RCA products are made separately and then mixed together. In other embodiments, the different oligonucleotides may be made in parallel on a planar support in the form of an array and then cleaved from the array. Examples of such methods are described in, e.g., Cleary et al. (Nature Methods 2004 1: 241-248) and LeProust et al. (Nucleic Acids Research 2010 38: 2522-2540).
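The relationship between the length of the degenerate sequence and the number of distinguishable RCA products can be estimated with simple arithmetic. The sketch below assumes that every position is an equimolar mix of the four bases and that products draw their sequences independently; the 100-fold excess factor and the function names are hypothetical choices for illustration, not values taken from this disclosure.

```python
# Back-of-envelope estimate of identifier diversity; assumptions noted above.
import math


def possible_sequences(n_random: int) -> int:
    """Number of distinct sequences for a fully degenerate stretch of n bases."""
    return 4 ** n_random


def min_random_length(n_products: int, excess: int = 100) -> int:
    """Shortest degenerate length whose diversity exceeds n_products * excess."""
    n = 1
    while possible_sequences(n) < n_products * excess:
        n += 1
    return n


def approx_prob_all_unique(n_products: int, n_random: int) -> float:
    """Birthday-problem approximation that no two products share a sequence."""
    space = possible_sequences(n_random)
    return math.exp(-n_products * (n_products - 1) / (2 * space))


for n in (6, 8, 10):
    print(n, possible_sequences(n))
print(min_random_length(1000))
print(round(approx_prob_all_unique(1000, 10), 3))
```

Under these assumptions, 6 to 10 random nucleotides give 4,096 to 1,048,576 possible sequences, which is why more random nucleotides may be chosen when a larger number of unique RCA products is required.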
In some embodiments, one or both of the first and second oligonucleotides may further comprise a random sequence, as illustrated in Fig. 1. This sequence, which may be referred to as a UMI or unique molecular identifier, may be used to count molecules in embodiments in which the products are sequenced.
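As a minimal sketch of how a UMI may be used to count molecules, the example below collapses sequence reads that share the same UMI so that PCR duplicates are counted once. It assumes error-free UMIs and that each read has already been reduced to a (marker, UMI) pair; the marker names and the function name are hypothetical.

```python
# Counting molecules by distinct UMI; input format is an assumption.
from collections import defaultdict


def count_molecules(reads):
    """Return the number of distinct UMIs observed for each key.

    reads: iterable of (key, umi) pairs, where key might be an antibody
    identifier and umi the random sequence carried by the oligonucleotide.
    """
    umis_by_key = defaultdict(set)
    for key, umi in reads:
        umis_by_key[key].add(umi)  # duplicate UMIs collapse into one entry
    return {key: len(umis) for key, umis in umis_by_key.items()}


reads = [
    ("marker_A", "AAAA"),
    ("marker_A", "AAAA"),  # PCR duplicate of the read above
    ("marker_A", "CCCC"),
    ("marker_B", "GGGG"),
]
print(count_molecules(reads))
```

Here three reads for marker_A correspond to two original molecules, because two of the reads carry the same UMI.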
In some embodiments, the enriching is done using an exonuclease, and the 5' or 3' end of one of the first and second oligonucleotides is made exonuclease resistant by tethering it to an antibody and the opposite end of the other of the first and second oligonucleotides comprises an exonuclease-resistant linkage or nucleotide. This embodiment is illustrated in Fig. 2. For example, one end of one of the oligonucleotides may be attached to an antibody whereas the opposite end of the other oligonucleotide may comprise a phosphorothioate linkage, a 2'-OMe RNA residue or a 2'-deoxyadenosine-5'-(α-thio) residue, for example.
In some embodiments, the capture moiety may be a biotin moiety (e.g., biotin or desthiobiotin) and the support may comprise avidin or streptavidin. In these embodiments, the enriching is done by affinity. As illustrated in Fig. 4, the 5' or 3' end of one of the first and second oligonucleotides may be conjugated to an antibody and the opposite end of the other of the first and second oligonucleotides comprises the capture moiety. In these embodiments, the opposite end of the other of the first and second oligonucleotides may comprise a biotin moiety. Fig. 5 further illustrates an embodiment of this method.
As illustrated in the figures, the full length products of the reaction may contain a binding site for a forward primer (FP), an antibody identifier sequence, an optional unique molecular identifier (UMI), the complement of a variable sequence (e.g., a random sequence) from a template, and a binding site for a reverse primer (RP). In some embodiments, the method may further comprise amplifying the full-length products to produce amplification products. In these embodiments, the PCR may use a pair of primers, one primer hybridizing to the FP sequence or its complement and the other primer hybridizing to the RP sequence or its complement. In these embodiments, the method may comprise sequencing the amplification products. The sequence reads may contain an antibody identifier sequence, a unique molecular identifier, and the complement of a variable sequence (e.g., a random sequence) from the template.
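The read structure described above can be illustrated with a small parsing sketch. It assumes a simplified layout in which the read consists only of the FP sequence, a fixed-length antibody identifier, a fixed-length UMI, the copied identifier and the RP sequence, in that order, with no additional constant segments and no sequencing errors; the sequences, lengths and function name are hypothetical.

```python
# Parsing a simplified full-length product read; layout is an assumption.
def parse_read(read: str, fp: str, rp: str, antibody_len: int, umi_len: int):
    """Split a read into its parts, or return None if it is not full length.

    fp, rp : primer-site sequences as they appear at the start and end of the read
    """
    if not (read.startswith(fp) and read.endswith(rp)):
        return None  # missing a primer site, e.g., an incomplete product
    body = read[len(fp):len(read) - len(rp)]
    if len(body) <= antibody_len + umi_len:
        return None  # nothing left for the copied identifier
    return {
        "antibody_id": body[:antibody_len],
        "umi": body[antibody_len:antibody_len + umi_len],
        "identifier_complement": body[antibody_len + umi_len:],
    }


read = "AAAC" + "GGT" + "TCAG" + "CATCAT" + "TTTG"
print(parse_read(read, fp="AAAC", rp="TTTG", antibody_len=3, umi_len=4))
```

A read that lacks either primer site returns None in this sketch, which mirrors the fact that only full-length products carry both the FP and RP sequences.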
In some embodiments, the identifier sequence that varies in the population of template molecules may be a random sequence, e.g., a random sequence of 4-20 nucleotides. In these embodiments, the method may provide a way to copy the complement of a unique identifier sequence from a barcoded bead or barcoded RCA product, as described above, onto the end of an oligonucleotide, e.g., an oligonucleotide that is conjugated to an antibody.
In some embodiments, the method may be incorporated into the workflow described in WO2022208327, which is summarized in Fig. 6. WO2022208327 is incorporated by reference herein.
The method of Fig. 6 is further described in WO2022208327. This method may make use of a probe system comprising: (a) a population of nucleic acid molecules that have an extendible 5' or 3' end; (b) a first set of barcoded particles that each have a nucleotide sequence comprising: (i) a binding sequence that is complementary to the extendible end of the nucleic acid molecules of (a), (ii) a unique particle identifier sequence, and (iii) a first template sequence; (c) a second set of barcoded particles that each have a nucleotide sequence comprising: (i) the first template sequence, and (ii) a unique particle identifier sequence, wherein extension of the nucleic acid molecules of (a) using the first set of barcoded particles of (b) as a template produces extension products that contain the complement of a unique particle identifier sequence of a particle of (b)(ii) and the complement of the first template sequence. The barcoded particles of (b) and/or (c) may be rolling circle amplification (RCA) products or barcoded nanoparticles (barcoded beads).
The present method may be incorporated into the primer extension steps in the method shown in Fig. 6. Specifically, one or both of the primer extension steps shown in Fig. 6 may be done using the above-described gap-fill ligation protocol. As would be apparent, one would need to add the downstream oligonucleotides (modified as necessary) to the workflow. In these embodiments, the method may comprise: i. hybridizing the first set of barcoded particles of (b) with the population of nucleic acid molecules of (a); ii. extending, using the present gap-fill ligation method, the hybridized nucleic acid molecules using the nucleotide sequence of the first set of barcoded particles as a template to produce first extension products (i.e., first gap-fill ligation products) that contain (1) the complement of a unique particle identifier sequence from a barcoded particle in the first set of barcoded particles and (2) the complement of the first template sequence; iii. removing the first set of barcoded particles; iv. hybridizing the first extension products with the second set of barcoded particles, wherein the complement of the first template sequence in the first extension products hybridizes to the first template sequence in the second set of barcoded particles; and v. extending, using the present gap-fill ligation method, the first extension products using the nucleotide sequence of the second set of barcoded particles as a template to produce second gap-fill ligation products that contain: a unique particle identifier sequence from a barcoded particle in the first set of barcoded particles, the first template sequence, and a unique particle identifier sequence from a barcoded particle in the second set of barcoded particles.
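The two rounds described above can be pictured with a toy string model, given here as a minimal Python sketch under stated assumptions. Sequences are treated as error-free strings written 5' to 3'; every sequence, the anchor length, the name `gap_fill_ligate`, and the presence of a second downstream site on the second particle are hypothetical choices made for illustration. The product strand carries the complements of the copied template sequences.

```python
from typing import Optional

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA string written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def gap_fill_ligate(upstream: str, downstream: str, template: str,
                    anchor_len: int = 10) -> Optional[str]:
    """Toy gap-fill ligation: fill the gap between two annealed oligos.

    The 3' end of `upstream` and all of `downstream` must both be
    complementary to `template`, with `downstream` annealed 3' of the gap
    on the product strand. Returns None if either oligo does not anneal.
    """
    if not upstream or not downstream:
        return None
    copy = revcomp(template)              # strand a full copy would yield
    anchor = upstream[-anchor_len:]       # 3' end of the upstream oligo
    start = copy.find(anchor)
    if start < 0:
        return None
    gap_start = start + len(anchor)
    gap_end = copy.find(downstream, gap_start)
    if gap_end < 0:
        return None                       # no downstream oligo: incomplete
    return upstream + copy[gap_start:gap_end] + downstream


# Hypothetical sequences (assumptions made for this illustration only).
T1, BIND, D2_SITE = "GATTACAGGC", "TTGACCGTAA", "CATCGTAGCA"
UPI_A, UPI_B = "ACGGTCAT", "TTCAGGAC"     # particle identifiers
probe = "CCATGG" + revcomp(BIND)          # extendible 3' end anneals to BIND
template1 = T1 + UPI_A + BIND             # first barcoded particle
template2 = D2_SITE + UPI_B + T1          # second barcoded particle

round1 = gap_fill_ligate(probe, revcomp(T1), template1)
round2 = gap_fill_ligate(round1, revcomp(D2_SITE), template2)
```

In this model a strand that fails to acquire its downstream oligonucleotide returns `None`, which corresponds to the incomplete products that the enrichment step is intended to remove.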
Alternative embodiment
An alternative solution to the problem is provided, which, in some cases, may be implemented independently of or in addition to the foregoing method. This embodiment involves a primer extension reaction, not gap-fill ligation. In these embodiments, the method may comprise: (a) hybridizing a first oligonucleotide to a population of template molecules that comprise a primer binding site and an adjacent identifier sequence that varies in the population, wherein the 5' end of the first oligonucleotide is exonuclease resistant and the 3' end hybridizes to a site that flanks the primer binding site and the identifier sequence; (b) incubating the product of (a) under primer extension conditions that include one or more exonuclease resistant or biotinylated dNTPs to produce: (i) full-length products that comprise the first oligonucleotide, the complement of an identifier sequence and the complement of the PCR primer site; and (ii) incomplete products that do not comprise the full sequence of the first oligonucleotide, the complement of an identifier sequence and the complement of the PCR primer site; and (c) enriching for the full-length products of (b)(i) using an exonuclease if exonuclease resistant dNTPs are used in step (b) or by affinity if biotinylated nucleotides are used in step (b). This method is schematically illustrated in Fig. 7.
Many of the options and details laid out above may be incorporated into this alternative method. For example, the first oligonucleotide may be conjugated to an antibody and, in these embodiments, the first oligonucleotide may have a sequence barcode that identifies the antibody to which it is conjugated. Similarly, the antibody may be bound to a cell, e.g., a cell that is in solution. In these embodiments, the antibody may be bound to an antigen on the surface of the cell. In addition, the template may be a barcoded RCA product or a barcoded bead. In some embodiments, the sequence that varies in the population is a random sequence. In some embodiments, the first oligonucleotide further comprises a random sequence which may act as a molecular counter.
In some embodiments, the 5' end of the first oligonucleotide is made exonuclease resistant by tethering it to an antibody and the enriching may be done using an exonuclease. In some embodiments, the dNTPs may comprise a 2'-OMe dNTP or a deoxyadenosine-5'-(α-thio)-triphosphate, although other exonuclease resistant nucleotides could be used. As would be apparent, if biotinylated dNTPs are used, the enriching may be done by affinity, e.g., using a support that comprises streptavidin.
Kits
Also provided by this disclosure are kits for practicing the subject methods, as described above. In certain embodiments, the kit may comprise any of the components listed above, e.g., a first oligonucleotide, a second oligonucleotide and a population of template molecules that comprise an identifier sequence that varies in the population, wherein the first and second oligonucleotides hybridize to sites that flank the identifier sequence (i.e., contain sequences that are complementary to sequences on either side of the identifier sequence in the template molecule, where the 5' end of one oligonucleotide hybridizes to a sequence in the template molecule that is upstream of the identifier sequence and the 3' end of the other oligonucleotide hybridizes to a sequence in the template molecule that is downstream of the identifier sequence, as illustrated in Fig. 1), where the 3' end of one of the first and second oligonucleotides is exonuclease resistant and the 5' end of the other of the first and second oligonucleotides is exonuclease resistant, or the 5' or 3' end of one of the first or the second oligonucleotides has a capture moiety. In these embodiments, the kit may further comprise dNTPs, polymerase and ligase (for performing a gap-fill ligation reaction), as well as one or more exonucleases or a support that has affinity for the capture moiety. In some embodiments, one end of the first or second oligonucleotides may be attached to an antibody or other agent, which makes that end exonuclease resistant.
The kit may additionally contain other agents, including buffers and other components described above. The various components of the kit may be present in separate containers or certain compatible components may be pre-combined into a single container, as desired. In addition to the above-mentioned components, the subject kit may further include instructions for using the components of the kit to practice the subject method.
EXAMPLES
The following examples are put forth so as to provide those of ordinary skill in the art with additional disclosure and description of how to make and use the present invention, and are not intended to limit the scope of what the inventors regard as their invention nor are they intended to represent that the experiments below are all or the only experiments performed.
EXAMPLE 1
A sample of PBMCs was fixed in PFA, washed, and blocked with human IgG and salmon sperm DNA. The cells were then bound with antibody-DNA probes and subjected to two rounds of co-localization by DNA pixels (Molecular Pixelation) using the gap-fill ligation method described in WO2022208327, except that the downstream oligonucleotides were modified to have a 5' end that is exonuclease resistant, as described herein. The Molecular Pixelation assay contains two steps in which an identifier sequence from a pixelated template (e.g., an RCA product or barcoded bead) that coats a cell is copied onto the end of an oligonucleotide that is attached to an antibody, as illustrated in Fig. 2.
The antibody was conjugated to an oligonucleotide via the 3' end of the oligonucleotide. The oligonucleotide contained a free 5' end having a phosphate. After the two rounds of DNA pixelation, using RCA products as DNA pixels, the sample was split into two parts. One part was subjected to RecJ exonuclease (New England Biolabs) treatment and the other served as a control.
In perfect data, there should be one cluster for each input cell in the assay (for example, 100 cells should yield 100 clusters). The number of false connections between cells is indicative of the level of chimeric PCR products or false connections derived from the PCR reaction.
In the untreated control sample, there were 404 cells with 3780 false connections between cells (see Fig. 8). In the RecJ treated sample, there were 238 cells with a total of 291 false connections. RecJ treatment lowered the number of false links between cells by about 10-fold (see Fig. 9).
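The arithmetic behind the reported reduction can be checked with the short Python sketch below, which uses only the counts stated above. The raw ratio of false connections is about 13-fold. Normalizing by the number of cells in each sample, which is an assumption made here for illustration and not an analysis described in this example, gives about 7.7-fold. Both figures are of the same order as the approximately 10-fold reduction stated above.

```python
# Counts reported in Example 1.
control_cells, control_false = 404, 3780
treated_cells, treated_false = 238, 291

# Raw ratio of false connections, control versus RecJ treated.
raw_fold = control_false / treated_false

# Per-cell normalization (an illustrative assumption, not from the source).
control_per_cell = control_false / control_cells
treated_per_cell = treated_false / treated_cells
per_cell_fold = control_per_cell / treated_per_cell

print(f"raw fold reduction:      {raw_fold:.1f}")
print(f"per-cell fold reduction: {per_cell_fold:.1f}")
```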