
WO2025240905A1 - Imaging-free high-resolution spatial macromolecule abundance reconstruction - Google Patents

Imaging-free high-resolution spatial macromolecule abundance reconstruction

Info

Publication number
WO2025240905A1
Authority
WO
WIPO (PCT)
Prior art keywords
nucleic acid
array
oligonucleotides
bead
population
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
PCT/US2025/029834
Other languages
French (fr)
Inventor
Fei Chen
Chenlei HU
Mehdi BORJI
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Broad Institute Inc
Harvard University
Original Assignee
Broad Institute Inc
Harvard University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Broad Institute Inc, Harvard University

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B25/00 ICT specially adapted for hybridisation; ICT specially adapted for gene or protein expression
    • G16B25/30 Microarray design
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6813 Hybridisation assays
    • C12Q1/6841 In situ hybridisation
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B25/00 ICT specially adapted for hybridisation; ICT specially adapted for gene or protein expression
    • G16B25/10 Gene or protein expression profiling; Expression-ratio estimation or normalisation


Definitions

  • the invention relates generally to methods and compositions for spatial assessment of macromolecule abundance (e.g., RNA expression, DNA abundance, protein abundance), e.g., in a tissue or other biological sample.
  • RNA expression in a tissue sample includes traditional histological approaches, in which sections of tissue are fixed, stained, and assessed, e.g., for the presence of individual transcripts across the viewable region of the fixed tissue section on a microscope slide, as well as certain more recent in situ techniques for transcriptome monitoring. Many such techniques are laborious to apply, offer a low degree of multiplexing with a high degree of technical difficulty, and/or provide only low resolution of spatial capture across an array (i.e., only approximately 100-200 µm resolution). A need therefore exists for improved approaches for spatial macromolecule (e.g., RNA expression, DNA and/or protein abundance) profiling at resolutions approaching single-cell resolution.
  • the current disclosure relates, at least in part, to imaging-free compositions and methods for assessing macromolecule abundance (e.g., RNA expression levels) in a tissue or other biological sample, which provide deep macromolecule-identifying sequence coverage at high resolution across multiple locations assessed within a tissue sample.
  • imaging-free high-resolution reconstructions of macromolecule abundance are obtained via use of dimensionality reduction methods that reconstruct the physical locations of macromolecules and associated abundance, using diffusion data (e.g., diffusion patterns).
  • compositions and methods referred to elsewhere herein as "Slide-seq" are employed and/or adapted to provide imaging-free high-resolution reconstructions of macromolecule abundance.
  • the instant disclosure provides a method for generating a spatial representation of macromolecule abundance from a sample, the method involving: (i) contacting first oligonucleotides bound to a solid support and present in a positional array with a sample, wherein the first oligonucleotides include: an array location identifier sequence that is common to all first oligonucleotides in a given element in the positional array and a macromolecule-specific capture sequence, under conditions suitable for oligonucleotide-macromolecule binding; (ii) obtaining sequence information for a population of macromolecules bound to the first oligonucleotides and the array location identifier sequence of the respective first oligonucleotides bound to each of the macromolecules of the population of macromolecules for which sequence information is obtained; and (iii) generating a computational reconstruction of the spatial locations of the population of macromolecules bound to the first oligonucleotides from inputs minimally comprising
  • the first oligonucleotides bound to the solid support and present in the positional array have resolution of 100 micrometers or less between individual elements of the positional array.
  • the macromolecule is RNA, DNA, protein, or combinations thereof.
  • the RNA is a poly-A-tailed RNA. In an embodiment, the RNA is a mRNA.
  • the macromolecule-specific capture sequence includes a poly-dT tail of sufficient length to allow for capture of poly-A-tailed RNAs via hybridization.
  • the macromolecule-specific capture sequence includes a gene-specific sequence or a transcript-specific sequence.
  • the DNA is a genomic DNA or a barcode DNA.
  • the macromolecule-specific capture sequence is a component of a loaded transposase.
  • the positional array possesses resolution of 50 micrometers or less between individual elements of the positional array. In an embodiment, the positional array possesses resolution of 30 micrometers or less between individual elements of the positional array. In an embodiment, the positional array possesses resolution of 20 micrometers or less between individual elements of the positional array. In an embodiment, the positional array possesses resolution of 10 micrometers or less between individual elements of the positional array.
  • the sample is a tissue sample.
  • the tissue sample is obtained from a tissue of a brain, a lung, a liver, a kidney, a pancreas, or a heart.
  • the sample is obtained from a mammal.
  • the biological sample is obtained from a human.
  • the sample is fixed.
  • the tissue sample may be fixed with paraffin.
  • the sample may be fixed using formalin-fixation and paraffin embedding (FFPE).
  • the solid support is a slide. In an embodiment, the solid support is a glass slide.
  • the first oligonucleotides are bound to the solid support using a capture material.
  • the capture material may be applied as a liquid.
  • the capture material may be applied using a brush or aerosol spray.
  • the capture material may be a liquid electrical tape.
  • the capture material may dry to form a vinyl polymer.
  • the vinyl polymer is polyvinyl hexane.
  • the obtaining sequence information of step (ii) involves a next-generation sequencing approach.
  • the next-generation sequencing approach may be solid-phase, reversible dye-terminator sequencing; massively parallel signature sequencing; pyro-sequencing; sequencing-by-ligation; ion semiconductor sequencing; Nanopore sequencing; or DNA nanoball sequencing.
  • the next-generation sequencing approach includes solid-phase, reversible dye-terminator sequencing.
  • the generating of the computational reconstruction of the spatial locations of the population of macromolecules of step (iii) includes performing a dimensionality reduction analysis.
  • the macromolecules include a population of second oligonucleotides capable of binding to the first oligonucleotides, optionally where the second oligonucleotides are attached to a bead.
  • the generating of the computational reconstruction of the spatial locations of the population of macromolecules of step (iii) involves performing Uniform Manifold Approximation and Projection (UMAP) reduction, t-distributed stochastic neighbor embedding (t-SNE) reduction, and/or multidimensional scaling (MDS) reduction.
  • the generating of the computational reconstruction of the spatial locations of the population of macromolecules of step (iii) involves performing Uniform Manifold Approximation and Projection (UMAP) reduction.
  • the disclosure provides a method for generating a spatial representation of mRNA abundance from a sample, the method involving: (i) contacting first oligonucleotides bound to a solid support and present in a positional array having resolution of 100 micrometers or less between individual elements of the positional array with a sample under conditions suitable for oligonucleotide-mRNA hybridization, wherein the first oligonucleotides include: an array location identifier sequence that is common to all first oligonucleotides in a given element in the positional array, and a poly-dT tail of sufficient length to allow for capture of poly-A-tailed mRNAs via hybridization; (ii) obtaining sequence information for a population of poly-A-tailed mRNAs bound to the first oligonucleotides and the array location identifier sequence of the respective first oligonucleotides bound to each of the poly-A-tailed mRNAs for which sequence information is obtained
  • the generating of the computational reconstruction of the spatial locations of the population of poly-A-tailed RNAs of step (iii) comprises performing a dimensionality reduction analysis.
  • the poly-A-tailed RNAs comprise a population of second oligonucleotides capable of binding to the first oligonucleotides, optionally wherein the second oligonucleotides are attached to a bead.
  • the conditions suitable for oligonucleotide-mRNA hybridization involve incubation in 6X SSC buffer.
  • the 6X SSC buffer is supplemented with detergent.
  • the disclosure provides a method for generating a spatial representation of macromolecule abundance from a sample, the method involving: (i) generating a well array having a plurality of wells, wherein each well of the array can hold exactly one bead; (ii) depositing beads comprising macromolecule capture oligonucleotides into the wells of the well array (optionally, by evaporation in a centrifuge); (iii) brushing the well array to remove all of the beads not present in the wells; (iv) depositing the sample onto the well array and centrifuging, thereby forcing the sample into the wells of the well array; (v) adding a digestion buffer, thereby lysing the sample and causing the macromolecules of the sample to transfer onto the beads in the wells; (vi) obtaining sequence information for a population of macromolecules bound to the macromolecule capture oligonucleotides of the beads and an associated capture oligonucleotide bead identification sequence for each
  • the generating of the computational reconstruction of the spatial locations of the population of macromolecules of step (vii) comprises performing a dimensionality reduction analysis.
  • the macromolecules comprise a population of second oligonucleotides capable of binding to the macromolecule capture oligonucleotides, optionally wherein the second oligonucleotides are attached to a bead.
  • the method further includes performing reverse transcription upon the sample in the wells of the well array.
  • the method further includes separating oligonucleotides from beads by sonication or by photocleavage.
  • the disclosure provides a method for generating a spatial representation of macromolecule abundance from a sample, the method involving: (i) adhering clusters of oligonucleotides in an array to a solid support; (ii) contacting the array with a tissue sample; (iii) obtaining sequence information for a population of macromolecules bound to the oligonucleotide clusters and a respective associated oligonucleotide cluster identification sequence for each macromolecule of the population of macromolecules for which sequence information is obtained; and (iv) generating a computational reconstruction of the spatial locations of the population of macromolecules bound to the clusters of oligonucleotides from inputs minimally comprising the obtained oligonucleotide cluster identification sequences and molecular diffusion patterns of the macromolecules of the population of macromolecules for which sequence information is obtained, relative to the clusters of oligonucleotides present in the array. [0036] In one embodiment, the generating of the computational reconstruction
  • the macromolecules comprise a population of second oligonucleotides capable of binding to the clusters of oligonucleotides, optionally wherein the second oligonucleotides are attached to a bead.
  • the array includes barcoded clusters of oligonucleotides on the solid support.
  • the obtaining sequence information step (iii) involves performance of long-read sequencing.
  • the disclosure provides a method for generating a spatial representation of macromolecule abundance from a tissue sample of a subject, the method involving: (i) obtaining the tissue sample from the subject; (ii) preparing a cryosection of the tissue sample and adhering said cryosection to a solid support; (iii) forming an array of barcoded oligonucleotide clusters and/or an array of beads attached to barcoded oligonucleotides and contacting the cryosection adhered to the solid support with the array; (iv) obtaining sequence information for a population of macromolecules bound to the array(s), wherein the sequence information comprises macromolecule identification information and associated positional identification information of the barcoded oligonucleotides; and (v) generating a computational reconstruction of the spatial locations of the population of macromolecules bound to the bead oligonucleotides from inputs minimally comprising the obtained sequence information and molecular diffusion patterns of the macromolecules of the population of
  • the generating of the computational reconstruction of the spatial locations of the population of macromolecules of step (v) comprises performing a dimensionality reduction analysis.
  • the macromolecules comprise a population of second oligonucleotides capable of binding to the barcoded oligonucleotides, optionally wherein the second oligonucleotides are attached to a bead.
  • an array (puck) is physically transferred from one surface to another.
  • a gel encasement is formed on top of the array (puck), thereby allowing beads to be picked up off the surface of the array (puck) without altering bead positions relative to each other.
  • the beads or array are used for capture of oligonucleotides.
  • the beads or array include or bind oligonucleotide-conjugated antibodies.
  • the beads or array include or bind nucleic acid hybridization probes.
  • the hybridization probes are RNA hybridization probes.
  • the hybridization probes are DNA hybridization probes.
  • the hybridization probes are capable of specific hybridization to transcriptome or genome sequence(s) of the tissue sample.
  • the hybridization probes include unique molecular identifiers (UMIs).
  • UMIs of the hybridization probes are counted via sequencing to assess the levels of hybridization probe-bound macromolecules.
  • the hybridization probe-bound macromolecules are proteins, exons, transcripts, nucleic acid sequences including single nucleotide polymorphisms (SNPs) and/or genomic regions.
  • the hybridization probes are released from the array or tissue.
  • the hybridization probes are released from the array or tissue by a method of: (a) cleavage and/or degradation of a photolabile and/or photocleavable group; (b) T7 RNA polymerase transcription; (c) enzymatic cleavage (optionally, the enzymatic cleavage is RNAseH cleavage of bound RNA or RNAse cleavage of an RNA base in the hybridization probes); and/or (d) chemical cleavage (optionally, the chemical cleavage is disulfide cleavage).
  • the beads or array possess primers capable of specific binding to a selection of one or more target transcripts.
  • the one or more target transcripts are selected from among T Cell receptor transcript sequences; transcripts of low-expressing proteins (optionally, the low-expressing proteins are transcription factors); and synthetic transcripts (optionally, the synthetic transcripts are guide-RNAs).
  • the disclosure provides a method for generating a spatial representation of macromolecule abundance from a sample involving: (i) contacting a tissue with a first monomer or linear polymer, a cross-linking agent including a second monomer or polymer, wherein the cross-linking agent is capable of crosslinking with the first monomer or linear polymer when combined, and a nucleic acid primer or probe including a modification capable of binding the nucleic acid primer or probe to the first monomer or linear polymer, the cross-linking agent, or both, wherein the nucleic acid primer or probe includes: a matrix location identifier sequence that is common to all nucleic acid primers or probes in a given element in a matrix and a target nucleic acid molecule-specific capture sequence; (ii) crosslinking the cross-linking agent with the first monomer or linear polymer, thereby forming the matrix; (iii) binding the nucleic acid primer or probe to the first monomer or linear polymer, the cross-link
  • the generating of the computational reconstruction of the spatial locations of the population of target nucleic acid molecules of step (vi) includes performing a dimensionality reduction analysis.
  • the target nucleic acid molecules include a population of second oligonucleotides capable of binding to the nucleic acid primers or probes, optionally wherein the second oligonucleotides are attached to a bead.
  • FIGs. 1A to 1G show that the adapted "Slide-seq"-like and "Slide-tag"-like approaches of the instant disclosure, which, in specific aspects, perform dimensionality reduction analysis upon macromolecule abundance and associated positional array data, thereby generating latent space representations of macromolecule abundance, enabled imaging-free spatial transcriptomics via computational reconstruction.
  • FIG. 1A shows a schematic of the instant method, where a mosaic barcode array of uniformly mixed capture beads (poly(dT)) and fiducial beads (poly(dA)) was used for imaging-free spatial transcriptomics.
  • Both capture and fiducial beads were DNA-barcoded with a unique spatial barcode for each bead and a unique molecular identifier (UMI) for each oligonucleotide molecule.
  • Fiducial beads' DNA barcodes were photocleaved and diffused to proximate capture beads.
  • a diffusion matrix was abstracted from the sequencing result of the capture bead barcode and fiducial bead barcode conjugation product. UMAP reduces the high-dimensional diffusion matrix to a two-dimensional embedding space and reconstructs the spatial location of beads.
  • FIG. 1B shows a simulated reconstruction.
  • FIG. 1B, at left, is an image with the pattern of the letter "H".
  • FIG. 1B, at middle, shows beads sampled from the original image.
  • FIG. 1B, at right, shows reconstructed bead locations obtained through a simulated diffusion matrix and UMAP embedding. Each bead was colored the same as in the middle image to show the recovered pattern.
  • FIG. 1C, at left, shows simulated reconstruction error as a function of the ratio of fiducial to capture beads.
  • FIG. 1C shows simulated reconstruction error as a function of diffusion distance and number of molecules carrying diffusion information (UMIs) per bead.
  • FIG. 1D shows a schematic reconstruction for "Slide-seq". Molecules of mRNA are captured by capture beads, followed by UV-induced fiducial bead barcode diffusion and conjugation.
  • FIG. 1E, at left, shows the diffusion pattern of a capture bead barcode on its associated fiducial bead barcodes, shaded by the number of unique joint UMIs. Inset details the diffusion center.
  • FIG. 1E shows the diffusion pattern of a capture bead barcode on its associated fiducial bead barcodes, shaded by the number of unique joint UMIs. Inset details the diffusion center.
  • FIG. 1E shows the ensemble average of kernel density estimation (KDE) on the capture bead barcode's diffusion distribution in the x-axis direction. Dots represent the half-maximum, corresponding to the full width at half maximum (FWHM) measured as 123.1 µm.
  • FIG. 1F shows the spatial location of capture beads on a mouse hippocampus with reconstruction through UMAP embedding, colored by decomposed cell types. Locations were scaled to the original array size with a diameter of 3 mm.
  • FIG. 1G at left, shows the schematic of RMS error at different measurement lengths. Error was defined as the difference between the distance of two beads measured in ground truth (dashed line) and in reconstruction (solid line).
  • FIG. 1G shows RMS error of different measurement lengths. Data shown in FIG. 1F (top line) and two biological replicates (middle and bottom line) were presented. Solid lines represent mean values across beads and shaded areas represent one standard deviation.
  • FIG. 1H, at left, shows spatial expression of Atp2b1, a Cornu Ammonis 1 (CA1) layer marker of the hippocampus, in ground truth and reconstruction.
  • FIG. 1H, at right, shows representative plots of CA1 layer width by profiling the expression intensity of Atp2b1 along a perpendicular line in ground truth (dashed line) and reconstruction (solid line). Scale bars: 500 µm.
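
The FWHM reported for FIG. 1E can be illustrated with a minimal sketch. It assumes the density curve (for example, a KDE evaluated on a grid) is already available as two equal-length lists; the function name `fwhm` and the linear interpolation at the half-maximum crossings are illustrative choices, not details taken from the disclosure.

```python
import math


def fwhm(xs, ys):
    """Full width at half maximum of a sampled, single-peaked curve.

    xs, ys: equal-length lists; xs must be increasing. The half-maximum
    crossings on each side of the peak are found by linear interpolation.
    """
    peak = max(ys)
    half = peak / 2.0
    i_peak = ys.index(peak)

    def crossing(i0, i1):
        # x position where the segment (i0, i1) crosses the half-maximum
        x0, y0, x1, y1 = xs[i0], ys[i0], xs[i1], ys[i1]
        return x0 + (half - y0) * (x1 - x0) / (y1 - y0)

    left = right = None
    for i in range(i_peak, 0, -1):            # walk left from the peak
        if ys[i - 1] < half <= ys[i]:
            left = crossing(i - 1, i)
            break
    for i in range(i_peak, len(ys) - 1):      # walk right from the peak
        if ys[i + 1] < half <= ys[i]:
            right = crossing(i, i + 1)
            break
    if left is None or right is None:
        raise ValueError("curve does not fall below half maximum on both sides")
    return right - left


# Usage: a unit-variance Gaussian has FWHM = 2*sqrt(2*ln 2), about 2.355.
grid = [i * 0.01 - 5.0 for i in range(1001)]
density = [math.exp(-x * x / 2.0) for x in grid]
width = fwhm(grid, density)
```

For a Gaussian profile the same relation, FWHM = 2·sqrt(2·ln 2)·σ, converts a measured width into a standard deviation; whether the measured diffusion profile is Gaussian is not stated in the text.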
  • FIGs. 2A to 2L demonstrate that reconstruction enabled diverse spatial transcriptomics measurements at scale.
  • FIG. 2A shows a schematic of reconstruction for spatial transcriptomics with single-nucleus resolution.
  • a barcode array was exposed to UV for five seconds to allow fiducial bead barcode diffusion and conjugation with the capture bead barcode. After the tissue section, the array was exposed to UV for 1 minute to cleave a majority of the fiducial bead barcodes and tag nuclei. The sample then proceeded through "Slide-tags" for spatial genomics profiling.
  • FIG. 2B shows uniform manifold approximation and projection (UMAP) embedding of snRNA-seq profiles from a coronal mouse hippocampus section, colored by cell type annotations shown in FIG.
  • FIG. 2C shows the spatial location of nuclei mapped based on reconstructed locations of capture beads, colored by cell type annotations. Locations were scaled to the original array size with a diameter of 3 mm.
  • FIG. 2D shows the RMS length measurement error of nuclei in reconstruction versus ground truth. Solid line shows the mean value and shaded area shows one standard deviation.
  • FIG. 2E, at left, shows the spatial location of beads with profiled gene expression from a P1 mouse head section. Beads were colored by decomposed cell type annotations from robust cell type decomposition (RCTD).
  • FIG. 2E, at right, shows hematoxylin and eosin staining of an adjacent slice from the same sample. Dashed line indicates the 1.2 cm bead array used for this experiment.
  • FIG. 2F shows UMAP representing gene expression of the mouse sample captured by beads, colored by decomposed cell types from RCTD.
  • FIG. 2G shows spatial distribution of labeled cell types, colored the same as cell type annotations in FIG. 2F. Gray beads represent other cell types, plotted for contrasting.
  • Neuronal cells include olfactory sensory neurons, intermediate neuronal progenitors, CNS neurons, and neural crest PNS neurons from clusters in FIG. 2E ("Chond", chondrocytes; "Fibro", fibroblasts).
  • FIG. 2H shows marker gene spatial expression of cell types shown in FIG. 2G, shaded by relative expression level.
  • FIG. 2I shows subclustering of neuronal cells, colored by subtype annotations.
  • FIG. 2J shows results obtained for a marker gene of each neuronal subtype shown at FIG. 2I, colored with corresponding color gradients based on relative expression level. All beads of neuronal cell type were plotted.
  • FIG. 2K shows, for beads categorized under the epithelial type, the top 20 spatially differentially expressed genes ranked by nonparametric C-SIDE, with Moran's I statistics calculated. Higher scores on both metrics signify more spatially variable expression.
  • FIG. 2L shows spatially differential expression of four epithelial genes from FIG. 2K. All beads of epithelial type were positioned, shaded by relative expression level of each gene. Scale bars: 500 µm.
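
Moran's I, the spatial autocorrelation statistic mentioned for FIG. 2K, can be sketched as follows. The binary weighting (beads within `radius` of each other count as neighbors) and the function name `morans_i` are assumptions made for illustration; the text does not say how the weights were defined.

```python
def morans_i(values, coords, radius):
    """Moran's I with binary distance-band weights (an illustrative choice).

    values: expression level per bead.
    coords: (x, y) per bead, same order as values.
    radius: beads closer than or equal to this distance are neighbors.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    denom = sum(d * d for d in dev)
    num = 0.0
    weight_sum = 0.0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            dx = coords[i][0] - coords[j][0]
            dy = coords[i][1] - coords[j][1]
            if dx * dx + dy * dy <= radius * radius:
                num += dev[i] * dev[j]
                weight_sum += 1.0
    if denom == 0 or weight_sum == 0:
        raise ValueError("constant values or no neighbor pairs")
    return (n / weight_sum) * (num / denom)


# Usage: six beads on a line; clustered expression gives positive I,
# alternating expression gives negative I.
line = [(float(i), 0.0) for i in range(6)]
clustered = morans_i([1, 1, 1, 0, 0, 0], line, radius=1.0)
alternating = morans_i([1, 0, 1, 0, 1, 0], line, radius=1.0)
```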
  • FIGs. 3A to 3G show a simulation of diffusion and reconstruction with UMAP.
  • FIG. 3A shows simulated locations of capture beads and fiducial beads in a 3 mm circle.
  • FIG. 3B shows a simulated diffusion pattern of a capture bead on its associated fiducial beads, shaded by simulated UMI counts. The distribution plots on the top and right represent the diffusion distribution on the x and y axis, respectively.
  • FIG. 3C shows simulated locations of capture beads, colored by a two-dimensional color gradient depending on the locations.
  • FIG. 3D shows UMAP reconstructed locations of capture beads, colored the same as in FIG. 3C.
  • FIG. 3E shows absolute error of capture beads plotted in ground truth locations.
  • FIG. 3F shows displacement vectors of capture beads.
  • FIG. 3G shows a histogram plot of capture beads’ absolute error.
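
The simulated diffusion pattern of FIG. 3B can be approximated with a minimal sketch. It assumes an isotropic Gaussian kernel and returns expected (not sampled) UMI counts; the kernel shape, the function names, and the parameter `total_umis` are hypothetical and are not specified in the text. The only value taken from the document is the 123.1 µm FWHM reported for FIG. 1E, which under the Gaussian assumption corresponds to σ of roughly 52.3 µm.

```python
import math


def sigma_from_fwhm(fwhm_um):
    # Gaussian relation: FWHM = 2 * sqrt(2 * ln 2) * sigma
    return fwhm_um / (2.0 * math.sqrt(2.0 * math.log(2.0)))


def expected_umi_counts(capture_xy, fiducial_xys, sigma_um, total_umis):
    """Expected joint-UMI counts between one capture bead and each fiducial bead.

    Counts are proportional to a Gaussian of the bead-to-bead distance and
    are normalized so they sum to total_umis (an assumed per-bead budget).
    """
    cx, cy = capture_xy
    weights = [
        math.exp(-((fx - cx) ** 2 + (fy - cy) ** 2) / (2.0 * sigma_um ** 2))
        for fx, fy in fiducial_xys
    ]
    total = sum(weights)
    return [total_umis * w / total for w in weights]


# Usage: fiducial beads 0, 50 and 200 µm away from a capture bead.
sigma = sigma_from_fwhm(123.1)
counts = expected_umi_counts((0.0, 0.0), [(0.0, 0.0), (50.0, 0.0), (200.0, 0.0)],
                             sigma, total_umis=100)
```

A stochastic simulation would draw counts around these expectations; the deterministic version is kept here so that the distance dependence is easy to inspect.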
  • FIGs. 4A to 4J demonstrate "Slide-seq" adapted by use of reconstruction metrics as disclosed herein.
  • FIG. 4A shows absolute error of capture beads plotted in ground truth locations.
  • FIG. 4B shows displacement vectors of capture beads.
  • Each arrow starts from the capture bead’s ground truth location and ends at the reconstruction location.
  • FIG. 4C shows a histogram plot of anchor beads’ absolute errors.
  • FIG. 4D at the left, shows UMAP representing gene expression from a coronal mouse hippocampus section captured by beads, colored by decomposed cell types from RCTD.
  • FIG. 4D shows the spatial location of capture beads in ground truth, colored by decomposed cell types.
  • FIG. 4E shows relative RMS error of measurement lengths as a function of measurement length. Data shown in FIG. 1F (top line) and two biological replicates (middle and bottom line) are presented. Solid lines represent average values across beads and shaded areas represent one standard deviation.
  • FIG. 4G shows a neighborhood enrichment analysis between cell type pairs in reconstruction (left) and ground truth (right).
  • FIG. 4I shows barcode matching between the Slide-seq library and the in situ sequencing barcode list or reconstruction barcode list. Bead barcodes with >20 UMI counts were matched with Hamming distance ≤ 1. Left rectangle represents total barcodes from in situ sequencing and shaded darker represents barcodes matched with Slide-seq library barcodes (shown as lower rectangle).
  • FIG. 4J shows a violin plot of UMI count per bead with the same Slide-seq library matched to reconstruction results and in situ sequencing results. Scale bars: 500 µm.
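
The barcode matching described for FIG. 4I can be sketched as below. The >20 UMI threshold comes from the text; the Hamming threshold is read here as "at most 1" (the comparator is garbled in the source), and dropping barcodes that match more than one reference entry is an added assumption. The function names are hypothetical.

```python
def hamming(a, b):
    """Number of mismatched positions between two equal-length barcodes."""
    if len(a) != len(b):
        raise ValueError("barcodes must have equal length")
    return sum(x != y for x, y in zip(a, b))


def match_barcodes(library_umis, reference, min_umis=20, max_dist=1):
    """Map library barcodes to reference barcodes.

    library_umis: dict of barcode -> UMI count from the sequencing library.
    reference: list of barcodes (for example, from in situ sequencing or
               from the reconstruction).
    Only barcodes with more than min_umis UMIs are considered, and a match
    is kept only when exactly one reference barcode lies within max_dist.
    """
    matches = {}
    for barcode, umis in library_umis.items():
        if umis <= min_umis:
            continue
        hits = [r for r in reference
                if len(r) == len(barcode) and hamming(barcode, r) <= max_dist]
        if len(hits) == 1:
            matches[barcode] = hits[0]
    return matches


# Usage with toy 4-base barcodes.
library = {"ACGT": 30, "ACGA": 25, "TTTT": 50, "ACGG": 5}
matched = match_barcodes(library, ["ACGT", "GGGG"])
```

The pairwise scan is quadratic in the number of barcodes; a real library of this kind would need an indexed lookup, which is outside the scope of this sketch.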
  • FIGs. 5A to 5G demonstrate "Slide-tags" reconstruction metrics.
  • FIG. 5A shows the absolute error of capture beads plotted in ground truth locations.
  • FIG. 5B shows displacement vectors of capture beads. Each arrow starts from the capture bead’s ground truth location and ends at the reconstruction location.
  • FIG. 5C shows a histogram plot of capture bead locations’ absolute errors.
  • FIG. 5D shows the spatial representation of reconstruction error on each nucleus.
  • FIG. 5E shows displacement vectors of located nuclei. Each arrow starts from the nuclei's ground truth location and ends at the reconstruction location.
  • FIG. 5F shows a histogram plot of nuclei locations’ absolute errors.
  • FIG. 5G shows RMS error of measurement lengths between bead pairs as a function of measurement length. Solid lines represent average values, and shaded areas represent one standard deviation.
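
The RMS error of measurement lengths (FIG. 1G, FIG. 5G) is defined in the text as the difference between the distance of two beads in ground truth and in reconstruction. A minimal sketch of that metric, binned by ground-truth distance, follows; the binning scheme and the name `rms_error_by_length` are illustrative assumptions. Because only pairwise distances are compared, the metric does not depend on how the reconstruction is rotated or translated.

```python
import math
from itertools import combinations


def rms_error_by_length(truth, recon, bin_width):
    """RMS of (reconstructed distance - true distance), binned by true distance.

    truth, recon: lists of (x, y) for the same beads in the same order.
    Returns {bin_start: rms_error}, where bin_start is the lower edge of
    each ground-truth distance bin.
    """
    sums = {}
    for i, j in combinations(range(len(truth)), 2):
        d_true = math.dist(truth[i], truth[j])
        d_rec = math.dist(recon[i], recon[j])
        b = int(d_true // bin_width)
        total, count = sums.get(b, (0.0, 0))
        sums[b] = (total + (d_rec - d_true) ** 2, count + 1)
    return {b * bin_width: math.sqrt(total / count)
            for b, (total, count) in sorted(sums.items())}


# Usage: a rotated copy has zero error; a copy scaled by 2 does not.
truth = [(0, 0), (3, 4), (6, 8)]
rotated = [(0, 0), (-4, 3), (-8, 6)]
scaled = [(0, 0), (6, 8), (12, 16)]
err_rotated = rms_error_by_length(truth, rotated, bin_width=10)
err_scaled = rms_error_by_length(truth, scaled, bin_width=10)
```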
  • FIG. 6 shows exemplary methods for constructing spatial profiles of genetic information.
  • FIG. 7 shows scale-limiting steps in spatial genomics.
  • FIG. 8 shows an exemplary method of spatial mapping and an exemplary mathematical basis for performing spatial reconstruction without imaging, via use of molecular diffusion measurements (in certain aspects of the instant “Slide-tags” approach, molecular diffusion of oligonucleotides (optionally oligonucleotide-linked macromolecules) released from array elements can be monitored and used for performing such spatial reconstructions).
  • FIG. 9 shows diffusion of capture beads and fiducial beads on a mosaic barcode array.
  • FIG. 10 shows an exemplary computational reconstruction with diffusion matrix.
  • the diffusion matrix is high-dimensional, and each capture bead is a point in fiducial bead space.
  • the high-dimensional diffusion matrix is generated from 2D physical space and therefore has an intrinsic 2D manifold. Dimensionality reduction was used to learn this low-dimensional manifold.
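
A diffusion matrix of the kind described above can be assembled from sequencing reads with a short sketch. It assumes each read has already been parsed into a (capture barcode, fiducial barcode, UMI) triple; that input format and the function name `build_diffusion_matrix` are hypothetical. Each row is one capture bead expressed in fiducial bead space, with entries counting unique joint UMIs.

```python
from collections import defaultdict


def build_diffusion_matrix(reads):
    """Count unique UMIs per (capture bead, fiducial bead) pair.

    reads: iterable of (capture_barcode, fiducial_barcode, umi) triples.
    Returns (capture_barcodes, fiducial_barcodes, matrix), where
    matrix[i][j] is the number of distinct UMIs linking capture bead i
    to fiducial bead j.
    """
    unique_reads = set(reads)                 # collapse PCR duplicates
    counts = defaultdict(lambda: defaultdict(int))
    for capture, fiducial, _umi in unique_reads:
        counts[capture][fiducial] += 1
    captures = sorted(counts)
    fiducials = sorted({fiducial for _c, fiducial, _u in unique_reads})
    matrix = [[counts[c].get(f, 0) for f in fiducials] for c in captures]
    return captures, fiducials, matrix


# Usage with toy reads; the duplicated first read is counted once.
reads = [
    ("C1", "F1", "u1"), ("C1", "F1", "u1"), ("C1", "F1", "u2"),
    ("C1", "F2", "u3"), ("C2", "F2", "u4"),
]
captures, fiducials, matrix = build_diffusion_matrix(reads)
```

The rows of such a matrix would then be passed to a dimensionality reduction method (UMAP in the text) to obtain two-dimensional bead coordinates; that step is not shown here. A dense list of lists is used for clarity, whereas a real array with many beads would call for a sparse representation.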
  • FIG. 11A shows a simulation of diffusion-based reconstruction (min_dist: effective minimum distance between embedded points).
  • FIG. 11B shows a simulation of diffusion-based reconstruction (min_dist: effective minimum distance between embedded points).
  • FIG. 12 shows the effect of parameters in simulation.
  • FIG. 13 diagrams the "Slide-seq" method and "Slide-tags" method.
  • FIG. 14A shows reconstruction with "Slide-seq". In situ indexing was performed for ground truth.
  • FIG. 14B shows reconstruction with "Slide-seq". In situ indexing was performed for ground truth.
  • FIG. 15 shows errors in reconstruction with "Slide-seq".
  • FIG. 16 shows reconstruction error on neighborhood analysis (0.997 Pearson correlation coefficient).
  • FIG. 17 shows reconstruction with "Slide-tags" at single-nucleus resolution.
  • FIG. 18 shows error in reconstruction with "Slide-tags". Ground truth and reconstruction results are identical.
  • FIG. 19 shows the advantages of computational reconstruction in comparison to reconstruction via sequencing RNA.
  • FIG. 20A shows an image obtained using spatial reconstruction with "Slide-seq", performed upon a P1 mouse section.
  • FIG. 20B shows cell type information for the spatial reconstruction performed upon the P1 mouse section.
  • FIG. 21 shows spatially represented cell types and marker genes.
  • FIGs. 22A, 22B, and 22C show fine structure of olfactory epithelium from computational reconstruction.
  • Cbr2: xenobiotic metabolism, marker of sustentacular cells; Gap43: immature OSNs.
  • Cbr2: xenobiotic metabolism; a coronal section is shown.
  • Reg3g is a respiratory epithelium marker.
  • FIG. 23 shows the advantages of computational reconstruction, and the scalability of spatial transcriptomics. Reconstruction may be performed on any methods involving "Slide-seq" and "Slide-tags”.
  • FIGs. 24A and 24B show experimental schematics of reconstruction with Slide-seq. Structure of the library at each stage of the preparation for reconstruction with Slide-seq.
  • FIG. 24A shows beads and 5 sec UV cleave for reconstruction and extension parts of the process.
  • FIG. 24B shows 2 min UV for tagging nuclei and PCR for reconstruction library parts of the process. Diagram shades align with text shades in sequences.
  • FIG. 25 depicts an example of a computer system and associated devices for spatial assessment of macromolecule abundance (e.g., RNA expression, DNA abundance, protein abundance) according to the techniques described herein.
  • the present disclosure is directed, at least in part, to the discovery that imaging-free high-resolution reconstructions of macromolecule abundance could be obtained from a biological sample (e.g., a tissue sample) via use of dimensionality reduction methods that reconstruct the physical locations of macromolecules and associated abundance, using diffusion data (e.g., diffusion patterns).
  • the methods and compositions of the disclosure feature a tightly packed spatially barcoded microbead array (e.g., an array of 10 µm diameter beads packed at an inter-bead spacing of 20 µm or less, where each bead possesses a bead-specific barcode within bead-attached capture oligonucleotides) created via application of a capture material to a solid support (e.g., application of a liquid electrical tape to a glass slide, followed by application of a layer of microbeads), which can be used, e.g., to capture cellular transcriptomes (or other macromolecules) of biological samples (e.g., cryosectioned tissue), in a manner that is both spatially resolvable at high resolution (e.g., at resolutions of 20 µm between image features) and with deep coverage (i.e., high-resolution reconstructions of relative expression for individual transcripts can be generated using the methods and compositions of the instant disclosure, for a large number (i.e.
  • the instant disclosure enables imaging-free spatially resolved capture of nucleic acids for sequencing from cells and tissues with approximately 10 µm (single cell) resolution.
  • Art-recognized spatial profiling technologies pre-dating the "Slide-seq" approach of WO 2019/213254 have relied upon either targeted in situ techniques, which were laborious and offered only a low degree of multiplexing with a high degree of technical difficulty, or have offered only very low resolution on spatial capture arrays (resolutions of approximately 100-200 µm).
  • the instant disclosure provides a level of resolution via reconstruction that is superior in lateral resolution and in capture area, to most art-recognized methods.
  • the instant disclosure provides methods and compositions that are easily adoptable and allow for whole transcriptomic profiling of complex tissues.
  • compositions and methods described herein use a spatially barcoded array of oligonucleotide-laden beads to capture mRNA from tissue sections.
  • Exemplified beads are synthesized with a unique or sufficiently unique bead barcode as previously described, e.g., in WO 2016/040476, wherein an exemplary sufficiently unique bead barcode is one that is a member of a population of barcode sequences that is sufficiently degenerate to a population (e.g., of beads) that a majority of individual components (e.g.
  • beads) of the barcoded population each possesses a unique barcode sequence, where the remainder (minority) of the population may possess barcodes that are redundant with those of other members within the remainder population, yet such redundancy can either be eliminated or otherwise adjusted for (e.g., normalized, averaged across/between redundant members, etc.) with only minor impact upon, e.g., the image resolution obtained when employing such a barcoded population.
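  • As an illustrative aside (not part of the original disclosure), the notion of a "sufficiently unique" barcode population can be made concrete with a rough estimate. The sketch below assumes that barcodes are drawn independently and uniformly at random from all sequences of a given length; the function name, the barcode length of 12, and the bead count of 100,000 are hypothetical values chosen only for illustration.

```python
def expected_unique_fraction(barcode_length: int, n_beads: int, alphabet: int = 4) -> float:
    """Expected fraction of beads whose barcode is shared with no other bead.

    Assumption (hypothetical): each bead draws its barcode independently and
    uniformly from all alphabet**barcode_length possible sequences.
    """
    n_barcodes = alphabet ** barcode_length
    # A given bead is unique if none of the other n_beads - 1 beads drew its barcode.
    return (1.0 - 1.0 / n_barcodes) ** (n_beads - 1)


# Hypothetical example: 12-nt barcodes on 100,000 beads.
fraction = expected_unique_fraction(barcode_length=12, n_beads=100_000)
print(f"expected unique fraction: {fraction:.4f}")
```

  Under these assumptions the great majority of beads carry a unique barcode, and the small redundant remainder is the minority that the text describes as being eliminated or otherwise adjusted for.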
  • the approach of the instant disclosure enables the imaging-free localization of cell types and gene expression patterns in a biological sample with an approximate 10-micron resolution in an unbiased manner.
  • 11,339,390 has employed a combination of UMIs and unique event identifiers (UEIs) to generate a hierarchy of physical co-localization among groups of template nucleic acid molecules in a biological sample
  • the imaging-free adapted "Slide-seq" approach of the instant disclosure provides a method that was demonstrated to enable facile generation of large volumes of unbiased spatial transcriptomes with approximately 10-20 µm spatial resolution, comparable to the size of individual cells, through generation of high resolution reconstructions of macromolecule abundance obtained via use of dimensionality reduction methods that reconstruct the physical locations of macromolecules and associated abundance, using diffusion data (e.g., diffusion patterns).
  • RNA is transferred from freshly frozen tissue sections onto a surface covered in polystyrene beads presenting barcoded DNA oligonucleotides, without any need for a priori knowledge of respective bead locations within a starting bead array, because the instant disclosure features latent space representation(s) of the population of macromolecules bound to such oligonucleotides/beads and oriented for array location based upon bead identifier sequences (i.e., barcodes), with such spatial orientation of beads based upon bead identifier sequences performed after macromolecules have been bound, beads/captured macromolecules have been mixed, and sequence information has been obtained, via use of a dimensionality reduction analysis that models and makes use of diffusion data to perform such orientation of array elements, which thereby generates an imaging-free spatial representation of macromolecule abundance from the biological sample.
  • Sequencing of the bead-anchored RNA therefore allows for the assignment of beads to known cell types derived from scRNAseq data, revealing the spatial organization of cell types in the tissue with approximately 10 µm resolution, or greater (e.g., 5 µm resolution or greater, 2 µm resolution or greater, etc.).
  • "Slide-seq" was initially applied to systematically characterize spatial gene expression patterns in the Purkinje layer of the mouse cerebellum, identifying several genes not previously associated with Purkinje cell compartments. Applying "Slide-seq" to a model of traumatic brain injury also allowed for the characterization of underlying genetic programs varying over time and space in response to injury.
  • compositions and methods build upon the successes and advantages of "Slide-seq" and related adaptations thereof, yet significantly improve cost and efficiency parameters for the end-user of a "Slide-seq" array (or, for that matter, for any user of a barcoded array of oligonucleotides capable of capturing macromolecules from a sample).
  • Generating a latent space representation of the population of macromolecules bound to the oligonucleotides having array location identifier sequences by performing a dimensionality reduction analysis frees the "Slide-seq" or other arrayed process from the need for an imaging-based a priori determination of which oligonucleotides within an array are found at which positional locations (elements) in the array.
  • the instant compositions and methods disclosed herein therefore enable imaging-free, high resolution spatial representation of macromolecule abundance from a biological sample via use of dimensionality reduction methods that reconstruct the physical locations of macromolecules and associated abundance, using diffusion data (e.g., diffusion patterns).
  • Tissue organization arises from the coordinated molecular programs of cells.
  • Spatial genomics maps cells and their molecular programs within the spatial context of tissues.
  • current methods measure spatial information through imaging or direct registration, which often require specialized equipment and are limited in scale.
  • an imaging-free spatial transcriptomics method was developed that uses molecular diffusion patterns to computationally reconstruct spatial data.
  • a simple experimental protocol on two-dimensional barcode arrays was used to establish an interaction network between barcodes via molecular diffusion. Sequencing these interactions generates a high dimensional matrix of interactions between different spatial barcodes. Then, dimensionality reduction to regenerate a two-dimensional manifold is performed, which represents the spatial locations of the barcode arrays.
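  • The workflow just described (diffusion between barcodes, a high dimensional interaction matrix, then dimensionality reduction back to two dimensions) can be sketched in a small simulation. This is a minimal, noise-free illustration and not the implementation of the disclosure: it assumes a Gaussian decay of diffusion signal with distance, a known diffusion width `sigma`, and it substitutes classical multidimensional scaling (via NumPy) for whatever dimensionality reduction method is used in practice. All function names, bead counts, and parameter values are hypothetical.

```python
import numpy as np


def simulate_diffusion(capture_xy, fiducial_xy, sigma):
    """Expected (noise-free) counts: Gaussian decay with capture-fiducial distance."""
    d2 = ((capture_xy[:, None, :] - fiducial_xy[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))


def reconstruct_2d(diffusion, sigma):
    """Recover 2D coordinates, up to rotation/reflection/shift, from the matrix."""
    gram = diffusion @ diffusion.T                 # overlap between capture-bead profiles
    norm = np.sqrt(np.diag(gram))
    cos = np.clip(gram / np.outer(norm, norm), 1e-12, 1.0)
    # Two Gaussian profiles of width sigma overlap as exp(-d^2 / (4 sigma^2)).
    dist2 = -4.0 * sigma ** 2 * np.log(cos)
    n = dist2.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    b = -0.5 * centering @ dist2 @ centering       # classical MDS double centering
    vals, vecs = np.linalg.eigh(b)                 # eigenvalues in ascending order
    top = np.argsort(vals)[::-1][:2]               # keep the two largest
    return vecs[:, top] * np.sqrt(np.maximum(vals[top], 0.0))


def pairwise(xy):
    """Matrix of Euclidean distances between all rows of xy."""
    return np.sqrt(((xy[:, None, :] - xy[None, :, :]) ** 2).sum(-1))


rng = np.random.default_rng(0)
capture_xy = rng.uniform(0, 1, size=(120, 2))      # hidden ground-truth positions
fiducial_xy = rng.uniform(0, 1, size=(1500, 2))    # never passed to the reconstruction
sigma = 0.25                                       # assumed diffusion width

matrix = simulate_diffusion(capture_xy, fiducial_xy, sigma)   # 120 x 1500
embedding = reconstruct_2d(matrix, sigma)                     # 120 x 2

# Compare pairwise distances, which are invariant to rotation and reflection.
iu = np.triu_indices(len(capture_xy), k=1)
corr = np.corrcoef(pairwise(capture_xy)[iu], pairwise(embedding)[iu])[0, 1]
print(f"distance correlation between truth and reconstruction: {corr:.3f}")
```

  The reconstruction sees only the capture-by-fiducial matrix, never the bead coordinates, which mirrors the imaging-free premise. Real data would add sampling noise and a diffusion range that is small relative to the array, which is where neighborhood-preserving manifold methods become more appropriate than the global method used in this sketch.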
  • imaging which requires specialized techniques and equipment, introduces several limitations on spatial transcriptomic approaches, such as throughput, adaptability and the constrained size of detectable areas (Moses & Pachter, 2022).
  • arrays may be deterministically printed through lithography or physical methods, but such methods require complex equipment and high upfront costs (Stahl et al., 2016; Liu et al., 2020).
  • An imaging-free spatial transcriptomic technique, such as that disclosed herein, can enhance the throughput and accessibility of experiments, and enable larger scale detection for comprehensive studies of tissues.
  • an imaging-free spatial transcriptomics method has been developed that computationally reconstructs the spatial locations of barcode arrays used in spatial transcriptomics measurements with high resolution and fidelity.
  • This imaging-free approach was initially implemented on two-dimensional (2D) barcode arrays, along with ground truth imaging for error estimation. Then, a dimensionality reduction method was utilized to reconstruct the spatial locations.
  • This imaging-free strategy was remarkably demonstrated to integrate with existing barcode array-based spatial transcriptomics methods without perturbing spatial structures.
  • the methods disclosed herein facilitate higher throughput generation of barcode arrays and are accessible to laboratories lacking specialized imaging equipment. Furthermore, the techniques disclosed herein have been applied to a tissue sample on a centimeter scale, demonstrating the potential of these methods for large-scale spatial transcriptomics.
  • the term “approximately” or “about” refers to a range of values that fall within 25%, 20%, 19%, 18%, 17%, 16%, 15%, 14%, 13%, 12%, 11%, 10%, 9%, 8%, 7%, 6%, 5%, 4%, 3%, 2%, 1%, or less in either direction (greater than or less than) of the stated reference value unless otherwise stated or otherwise evident from the context (except where such number would exceed 100% of a possible value).
  • amplicon when used in reference to a nucleic acid, means the product of copying the nucleic acid, wherein the product has a nucleotide sequence that is the same as or complementary to at least a portion of the nucleotide sequence of the nucleic acid.
  • An amplicon can be produced by any of a variety of amplification methods that use the nucleic acid, or an amplicon thereof, as a template including, for example, polymerase extension, polymerase chain reaction (PCR), rolling circle amplification (RCA), multiple displacement amplification (MDA), ligation extension, or ligation chain reaction.
  • An amplicon can be a nucleic acid molecule having a single copy of a particular nucleotide sequence (e.g. a PCR product) or multiple copies of the nucleotide sequence (e.g., a concatemeric product of RCA).
  • a first amplicon of a target nucleic acid is typically a complementary copy.
  • Subsequent amplicons are copies that are created, after generation of the first amplicon, from the target nucleic acid or from the first amplicon.
  • a subsequent amplicon can have a sequence that is substantially complementary to the target nucleic acid or substantially identical to the target nucleic acid.
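  • As a small illustration of the complementarity relationships described above, the sketch below computes the reverse complement of a DNA sequence: a first amplicon is a complementary copy of the target, and copying that first amplicon yields a sequence identical to the target. The sequence and the helper name are hypothetical and used only for illustration.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


target = "ATGCAAC"                                    # hypothetical target sequence
first_amplicon = reverse_complement(target)           # complementary copy of the target
second_amplicon = reverse_complement(first_amplicon)  # copy made from the first amplicon

print(first_amplicon)             # GTTGCAT
print(second_amplicon == target)  # True
```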
  • the term "array” refers to a population of features or sites that can be differentiated from each other according to relative location. Different molecules that are at different sites of an array can be differentiated from each other according to the locations of the sites in the array.
  • An individual site of an array can include one or more molecules of a particular type. For example, a site can include a single target nucleic acid molecule having a particular sequence or a site can include several nucleic acid molecules having the same sequence (and/or complementary sequence, thereof). The sites of an array can be different features located on the same substrate.
  • Exemplary features include without limitation, wells in a substrate, beads (or other particles) in or on a substrate, projections from a substrate, ridges on a substrate, or channels in a substrate.
  • the sites of an array can be separate substrates each bearing a different molecule. Different molecules attached to separate substrates can be identified according to the locations of the substrates on a surface to which the substrates are associated or according to the locations of the substrates in a liquid or gel.
  • Exemplary arrays in which separate substrates are located on a surface include, without limitation, those having beads in wells, beads arranged upon a flat surface (e.g., a slide), optionally beads captured upon a flat surface (e.g., a layer of beads adhered to or otherwise stably associated with a slide (e.g., a layer of beads adsorbed to a slide-attached elastomeric surface)), etc.
  • an analyte such as a nucleic acid
  • a material such as a gel or solid support
  • a covalent bond is characterized by the sharing of pairs of electrons between atoms.
  • a non-covalent bond is a chemical bond that does not involve the sharing of pairs of electrons and can include, for example, hydrogen bonds, ionic bonds, van der Waals forces, hydrophilic interactions and hydrophobic interactions.
  • barcode sequence is intended to mean a series of nucleotides in a nucleic acid that can be used to identify the nucleic acid, a characteristic of the nucleic acid (e.g., the identity and optionally the location of a bead to which the nucleic acid is attached), or a manipulation that has been carried out on the nucleic acid.
  • the barcode sequence can be a naturally occurring sequence or a sequence that does not occur naturally in the organism from which the barcoded nucleic acid was obtained.
  • a barcode sequence can be unique to a single nucleic acid species in a population or a barcode sequence can be shared by several different nucleic acid species in a population (e.g., all nucleic acid species attached to a single bead might possess the same barcode sequence, while different beads present a different shared barcode sequence that serves to identify each such different bead).
  • each nucleic acid probe in a population can include different barcode sequences from all other nucleic acid probes in the population.
  • each nucleic acid probe in a population can include different barcode sequences from some or most other nucleic acid probes in a population.
  • each probe in a population can have a barcode that is present for several different probes in the population even though the probes with the common barcode differ from each other at other sequence regions along their length.
  • one or more barcode sequences that are used with a biological specimen are not present in the genome, transcriptome or other nucleic acids of the biological specimen.
  • barcode sequences can have less than 80%, 70%, 60%, 50% or 40% sequence identity to the nucleic acid sequences in a particular biological specimen.
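  • One way to screen candidate barcodes against such identity thresholds is sketched below. This is a simplified illustration: sequence identity is ordinarily computed over an alignment, whereas this sketch only compares ungapped, same-length windows on a single strand. The function names and example sequences are hypothetical.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of matching positions between two equal-length sequences (ungapped)."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def max_identity_to_specimen(barcode: str, specimen: str) -> float:
    """Highest ungapped identity of the barcode against every same-length window.

    Raises ValueError if the specimen sequence is shorter than the barcode.
    """
    k = len(barcode)
    return max(
        percent_identity(barcode, specimen[i:i + k])
        for i in range(len(specimen) - k + 1)
    )


print(percent_identity("ACGTACGT", "ACGTTTTT"))      # 62.5
print(max_identity_to_specimen("ACGT", "TTACGTTT"))  # 100.0
```

  A candidate barcode would be retained under, for example, the 80% threshold if `max_identity_to_specimen` returned a value below 80 for the relevant specimen sequences.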
  • beads can include small discrete particles.
  • the composition of the beads can vary, depending upon the class of capture probe, the method of synthesis, and other factors. In certain embodiments of the instant disclosure, the sizes of the beads of the instant disclosure tend to range from 1 µm to 100 µm in diameter (with all subranges within this range expressly contemplated), e.g., depending upon the extent of image resolution desired, nature of the solid support to be used for spatial bead array construction, sequencing processes (e.g., flow cell sequencing) to be employed, as well as other factors.
  • biological specimen is intended to mean one or more cell, tissue, organism or portion thereof.
  • a biological specimen can be obtained from any of a variety of organisms. Exemplary organisms include, but are not limited to, a mammal such as a rodent, mouse, rat, rabbit, guinea pig, ungulate, horse, sheep, pig, goat, cow, cat, dog, primate (i.e.
  • Target nucleic acids can also be derived from a prokaryote such as a bacterium, e.g., Escherichia coli, Staphylococci or Mycoplasma pneumoniae; an archaeon; a virus such as Hepatitis C virus or human immunodeficiency virus; or a viroid.
  • Specimens can be derived from a homogeneous culture or population of the above organisms or alternatively from a collection of several different organisms, for example, in a community or ecosystem.
  • cleavage site is intended to mean a location in a nucleic acid molecule that is susceptible to bond breakage.
  • the location can be specific to a particular chemical, enzymatic or physical process that results in bond breakage.
  • the location can be a nucleotide that is abasic or a nucleotide that has a base that is susceptible to being removed to create an abasic site. Examples of nucleotides that are susceptible to being removed include uracil and 8-oxo-guanine as set forth in further detail herein below.
  • the location can also be at or near a recognition sequence for a restriction endonuclease such as a nicking enzyme.
  • control or “reference” is meant a standard of comparison. Methods to select and test control samples are within the ability of those in the art. Determination of statistical significance is within the ability of those skilled in the art, e.g., the number of standard deviations from the mean that constitute a positive result.
  • the term “cryosection” refers to a piece of tissue, e.g., a biopsy, that has been obtained from a subject, snap frozen, embedded in optimal cutting temperature embedding material, frozen, and cut into thin sections.
  • the thin sections can be directly applied to an array of beads captured upon a solid support (e.g., a slide), or the thin sections can be fixed (e.g., in methanol or paraformaldehyde) and applied to a bead-presenting planar surface, e.g., a slide upon which a layer of microbeads has been attached/arrayed.
  • nucleic acids have nucleotide sequences that are not the same as each other.
  • Two or more nucleic acids can have nucleotide sequences that are different along their entire length.
  • two or more nucleic acids can have nucleotide sequences that are different along a substantial portion of their length.
  • two or more nucleic acids can have target nucleotide sequence portions that are different for the two or more molecules while also having a universal sequence portion that is the same on the two or more molecules.
  • Two beads can be different from each other by virtue of being attached to different nucleic acids.
  • each when used in reference to a collection of items, is intended to identify an individual item in the collection but does not necessarily refer to every item in the collection. Exceptions can occur if explicit disclosure or context clearly dictates otherwise.
  • the term "extend,” when used in reference to a nucleic acid, is intended to mean addition of at least one nucleotide or oligonucleotide to the nucleic acid.
  • one or more nucleotides can be added to the 3' end of a nucleic acid, for example, via polymerase catalysis (e.g., DNA polymerase, RNA polymerase or reverse transcriptase). Chemical or enzymatic methods can be used to add one or more nucleotides to the 3' or 5' end of a nucleic acid.
  • One or more oligonucleotides can be added to the 3' or 5' end of a nucleic acid, for example, via chemical or enzymatic (e.g., ligase catalysis) methods.
  • a nucleic acid can be extended in a template directed manner, whereby the product of extension is complementary to a template nucleic acid that is hybridized to the nucleic acid that is extended.
  • the term "feature” means a location in an array for a particular species of molecule.
  • a feature can contain only a single molecule or it can contain a population of several molecules of the same species.
  • Features of an array are typically discrete. The discrete features can be contiguous, or they can have spaces between each other. The size of the features and/or spacing between the features can vary such that arrays can be high density, medium density or low density. High density arrays are characterized as having sites separated by less than about 15 µm. Medium density arrays have sites separated by about 15 to 30 µm, while low density arrays have sites separated by greater than 30 µm.
  • An array useful herein can have, for example, sites that are separated by less than 100 µm, 50 µm, 10 µm, 5 µm, 1 µm, or 0.5 µm.
  • An apparatus or method of the present disclosure can be used to detect an array at a resolution sufficient to distinguish sites at the above densities or density ranges.
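  • The density categories defined above can be expressed as a simple lookup. The sketch below treats the "about 15" and "about 30" micrometer boundaries as hard cut-offs, which is a simplifying assumption; the function name is hypothetical.

```python
def classify_array_density(spacing_um: float) -> str:
    """Classify an array by site-to-site separation, given in micrometers."""
    if spacing_um <= 0:
        raise ValueError("spacing must be positive")
    if spacing_um < 15:
        return "high"      # sites separated by less than about 15 micrometers
    if spacing_um <= 30:
        return "medium"    # sites separated by about 15 to 30 micrometers
    return "low"           # sites separated by greater than 30 micrometers


for spacing in (10, 20, 50):
    print(spacing, classify_array_density(spacing))
```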
  • isolated refers to material that is free to varying degrees from components which normally accompany it as found in its native state.
  • Isolate denotes a degree of separation from an original source or surroundings.
  • Purify denotes a degree of separation that is higher than isolation.
  • NGS next-generation sequencing
  • conventional sequencing methods e.g., standard Sanger or Maxam-Gilbert sequencing methods. These unprecedented speeds are achieved by performing and reading out thousands to millions of sequencing reactions in parallel.
  • NGS sequencing platforms include, but are not limited to, the following: Massively Parallel Signature Sequencing (Lynx Therapeutics); 454 pyrosequencing (454 Life Sciences/Roche Diagnostics); solid-phase, reversible dye-terminator sequencing (Solexa/Illumina™); SOLiD™ technology (Applied Biosystems); Ion semiconductor sequencing (Ion Torrent™); and DNA nanoball sequencing (Complete Genomics). Descriptions of certain NGS platforms can be found in the following: Shendure, et al., "Next-generation DNA sequencing," Nature Biotechnology, 2008, vol. 26, No. 10, pp. 1135-1145; Mardis, "The impact of next-generation sequencing technology on genetics," Trends in Genetics, 2007, vol. 24, No. 3, pp.
  • nucleic acid and “nucleotide” are intended to be consistent with their use in the art and to include naturally occurring species or functional analogs thereof. Particularly useful functional analogs of nucleic acids are capable of hybridizing to a nucleic acid in a sequence specific fashion or capable of being used as a template for replication of a particular nucleotide sequence.
  • Naturally occurring nucleic acids generally have a backbone containing phosphodiester bonds.
  • An analog structure can have an alternate backbone linkage including any of a variety of those known in the art.
  • Naturally occurring nucleic acids generally have a deoxyribose sugar (e.g. found in deoxyribonucleic acid (DNA)) or a ribose sugar (e.g. found in ribonucleic acid (RNA)).
  • a nucleic acid can contain nucleotides having any of a variety of analogs of these sugar moieties that are known in the art.
  • a nucleic acid can include native or non-native nucleotides.
  • a native deoxyribonucleic acid can have one or more bases selected from the group consisting of adenine, thymine, cytosine or guanine and a ribonucleic acid can have one or more bases selected from the group consisting of uracil, adenine, cytosine or guanine.
  • Useful non-native bases that can be included in a nucleic acid or nucleotide are known in the art.
  • poly T or poly A when used in reference to a nucleic acid sequence, is intended to mean a series of two or more thymine (T) or adenine (A) bases, respectively.
  • a poly T or poly A can include at least about 2, 5, 8, 10, 12, 15, 18, 20 or more of the T or A bases, respectively.
  • a poly T or poly A can include at most about 30, 20, 18, 15, 12, 10, 8, 5 or 2 of the T or A bases, respectively.
  • the term "random" can be used to refer to the spatial arrangement or composition of locations on a surface.
  • the first relating to the spacing and relative location of features (also called “sites") and the second relating to identity or predetermined knowledge of the particular species of molecule that is present at a particular feature.
  • features of an array can be randomly spaced such that nearest neighbor features have variable spacing between each other.
  • the spacing between features can be ordered, for example, forming a regular pattern such as a rectilinear grid or hexagonal grid.
  • Exemplary solid supports include, but are not limited to, glass and modified or functionalized glass, plastics (including acrylics, polystyrene and copolymers of styrene and other materials, polypropylene, polyethylene, polybutylene, polyurethanes, Teflon™, cyclic olefins, polyimides etc.), nylon, ceramics, resins, Zeonor®, silica or silica-based materials including silicon and modified silicon, carbon, metals, inorganic glasses, optical fiber bundles, and polymers.
  • Particularly useful solid supports for some embodiments are slides and beads capable of assorting/packing upon the surface of a slide (e.g., beads to which a large number of oligonucleotides are attached).
  • the term "spatial tag" is intended to mean a nucleic acid having a sequence that is indicative of a location.
  • the nucleic acid is a synthetic molecule having a sequence that is not found in one or more biological specimen that will be used with the nucleic acid.
  • the nucleic acid molecule can be naturally derived or the sequence of the nucleic acid can be naturally occurring, for example, in a biological specimen that is used with the nucleic acid.
  • the location indicated by a spatial tag can be a location in or on a biological specimen, in or on a solid support or a combination thereof.
  • a barcode sequence can function as a spatial tag.
  • the identification of the tag that serves as a spatial tag is only determined after a population of beads (each possessing a distinct barcode sequence) has been arrayed upon a solid support (optionally randomly arrayed upon a solid support) and sequencing of such a bead-associated barcode sequence has been determined in situ upon the solid support.
  • subject includes humans and mammals (e.g., mice, rats, pigs, cats, dogs, and horses).
  • subjects are mammals, particularly primates, especially humans.
  • subjects are livestock such as cattle, sheep, goats, cows, swine, and the like; poultry such as chickens, ducks, geese, turkeys, and the like; and domesticated animals particularly pets such as dogs and cats.
  • subject mammals will be, for example, rodents (e.g., mice, rats, hamsters), rabbits, primates, or swine such as inbred pigs and the like.
  • tissue is intended to mean an aggregation of cells, and, optionally, intercellular matter. Typically, the cells in a tissue are not free floating in solution and instead are attached to each other to form a multicellular structure. Exemplary tissue types include muscle, nerve, epidermal and connective tissues.
  • the term "universal sequence” refers to a series of nucleotides that is common to two or more nucleic acid molecules even if the molecules also have regions of sequence that differ from each other.
  • a universal sequence that is present in different members of a collection of molecules can allow capture of multiple different nucleic acids using a population of universal capture nucleic acids that are complementary to the universal sequence.
  • a universal sequence present in different members of a collection of molecules can allow the replication or amplification of multiple different nucleic acids using a population of universal primers that are complementary to the universal sequence.
  • a universal capture nucleic acid or a universal primer includes a sequence that can hybridize specifically to a universal sequence.
  • Target nucleic acid molecules may be modified to attach universal adapters, for example, at one or both ends of the different target sequences.
  • Ranges can be expressed herein as from “about” one particular value, and/or to “about” another particular value. When such a range is expressed, another aspect includes from the one particular value and/or to the other particular value. Similarly, when values are expressed as approximations, by use of the antecedent “about,” it is understood that the particular value forms another aspect. It is further understood that the endpoints of each of the ranges are significant both in relation to the other endpoint, and independently of the other endpoint. It is also understood that there are a number of values disclosed herein, and that each value is also herein disclosed as “about” that particular value in addition to the value itself.
  • data are provided in a number of different formats and that these data represent endpoints and starting points and ranges for any combination of the data points. For example, if a particular data point “10” and a particular data point “15” are disclosed, it is understood that greater than, greater than or equal to, less than, less than or equal to, and equal to 10 and 15 are considered disclosed as well as between 10 and 15. It is also understood that each unit between two particular units is also disclosed. For example, if 10 and 15 are disclosed, then 11, 12, 13, and 14 are also disclosed.
  • a nested sub-range of an exemplary range of 1 to 50 may comprise 1 to 10, 1 to 20, 1 to 30, and 1 to 40 in one direction, or 50 to 40, 50 to 30, 50 to 20, and 50 to 10 in the other direction.
  • transitional term “comprising,” which is synonymous with “including,” “containing,” or “characterized by,” is inclusive or open-ended and does not exclude additional, unrecited elements or method steps.
  • the transitional phrase “consisting of” excludes any element, step, or ingredient not specified in the claim.
  • the transitional phrase “consisting essentially of” limits the scope of a claim to the specified materials or steps “and those that do not materially affect the basic and novel characteristic(s)” of the claimed invention.
  • the present disclosure provides a method for generating and using a spatially tagged array of oligonucleotides (e.g., an array of microbead-attached oligonucleotides) to perform deep expression profiling upon biological samples, e.g., cryosectioned tissue samples, with high resolution of reconstructions.
  • the method can include the steps of (a) attaching different nucleic acid probes to array elements (optionally bound to a solid support) and/or to beads that are then captured upon a solid support to produce randomly located probe-possessing beads on the solid support, wherein the different nucleic acid probes each includes a barcode sequence (that is shared by all such nucleic acid probes of a single bead and/or array element), and wherein each of the randomly located beads ideally includes a barcode sequence(s) that is different from other randomly located beads on the solid support; (b) contacting a biological specimen with the solid support that has the array of probes/oligonucleotides and/or randomly located beads thereon; (c) hybridizing the probes presented by the array (e.g., the randomly located beads) to target nucleic acids from portions of the biological specimen that are proximal to the randomly located array elements (e.g., beads); (d) extending the probes of the randomly located beads to produce extended probes that include the barcode sequence
  • any of a variety of solid supports can be used in a method, composition or apparatus of the present disclosure.
  • Particularly useful solid supports are those used for nucleic acid arrays. Examples include glass, modified glass, functionalized glass, inorganic glasses, microspheres (e.g., inert and/or magnetic particles), plastics, polysaccharides, nylon, nitrocellulose, ceramics, resins, silica, silica-based materials, carbon, metals, an optical fiber or optical fiber bundles, polymers and multiwell (e.g. microtiter) plates.
  • Exemplary plastics include acrylics, polystyrene, copolymers of styrene and other materials, polypropylene, polyethylene, polybutylene, polyurethanes and Teflon™.
  • Exemplary silica-based materials include silicon and various forms of modified silicon.
  • a solid support can be within or part of a vessel such as a well, tube, channel, cuvette, Petri plate, bottle or the like.
  • the vessel is a flow-cell, for example, as described in WO 2014/142841 A1; U.S. Pat. App. Pub. No. 2010/0111768 A1 and U.S. Pat. No. 8,951,781 or Bentley et al., Nature 456:53-59 (2008), each of which is incorporated herein by reference.
  • Exemplary flow-cells are those that are commercially available from Illumina®, Inc.
  • the vessel is a well in a multiwell plate or microtiter plate.
  • a solid support can include a gel coating. Attachment, e.g., of nucleic acids to a solid support via a gel is exemplified by flow cells available commercially from Illumina Inc. (San Diego, CA) or described in US Pat. App. Pub. Nos. 2011/0059865 A1, 2014/0079923 A1, or 2015/0005447 A1; or PCT Publ. No. WO 2008/093098, each of which is incorporated herein by reference.
  • Exemplary gels that can be used in the methods and apparatus set forth herein include, but are not limited to, those having a colloidal structure, such as agarose; polymer mesh structure, such as gelatin; or cross-linked polymer structure, such as polyacrylamide, SFA (see, for example, US Pat. App. Pub. No. 2011/0059865 A1, which is incorporated herein by reference) or PAZAM (see, for example, US Pat. App. Publ. Nos. 2014/0079923 A1, or 2015/0005447 A1, each of which is incorporated herein by reference).
  • a solid support can be configured as an array of features to which beads can be attached.
  • the features can be present in any of a variety of desired formats.
  • the features can be wells, pits, channels, ridges, raised regions, pegs, posts or the like.
  • Exemplary features include wells that are present in substrates used for commercial sequencing platforms sold by 454 LifeSciences (a subsidiary of Roche, Basel Switzerland) or Ion Torrent (a subsidiary of Life Technologies, Carlsbad California).
  • substrates having wells include, for example, etched fiber optics and other substrates described in US Pat Nos.
  • wells of a substrate can include gel material (with or without beads) as set forth in US Pat. App. Publ. No. 2014/0243224 A1, which is incorporated herein by reference.
  • Features can appear on a solid support as a grid of spots or patches.
  • the features can be located in a repeating pattern or in an irregular, non-repeating pattern.
  • repeating patterns can include hexagonal patterns, rectilinear patterns, grid patterns, patterns having reflective symmetry, patterns having rotational symmetry, or the like.
  • Asymmetric patterns can also be useful.
  • the pitch of an array can be the same between different pairs of nearest neighbor features or the pitch can vary between different pairs of nearest neighbor features.
  • features on a solid support can each have an area that is larger than about 100 nm², 250 nm², 500 nm², 1 µm², 2.5 µm², 5 µm², 10 µm² or 50 µm².
  • features can each have an area that is smaller than about 50 µm², 25 µm², 10 µm², 5 µm², 1 µm², 500 nm², or 100 nm².
  • the preceding ranges can describe the apparent area of a bead or other particle on a solid support when viewed or imaged from above.
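  • As a worked illustration of the "apparent area when viewed from above" reading, the projected area of a spherical bead is π·(d/2)². The short sketch below assumes ideal spheres; the function name `apparent_area_um2` and the sample diameters are illustrative choices, not values prescribed by the disclosure.

```python
import math


def apparent_area_um2(diameter_um: float) -> float:
    """Projected (top-down) area of an ideal spherical bead, in square micrometres."""
    radius_um = diameter_um / 2.0
    return math.pi * radius_um ** 2


# Illustrative diameters only (hypothetical sample values).
for diameter in (0.2, 5.0, 10.0, 20.0):
    print(f"{diameter:>5} um bead -> {apparent_area_um2(diameter):.3f} um^2")
```

  • Under this assumption a 10 µm bead presents roughly 78.5 µm² from above, which falls in the "larger than about 50 µm²" category listed above.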
  • Suitable bead compositions include those used in peptide, nucleic acid and organic moiety synthesis, including, but not limited to, plastics, ceramics, glass, polystyrene, methylstyrene, acrylic polymers, paramagnetic materials, thoria sol, carbon graphite, titanium dioxide, latex or cross-linked dextrans such as Sepharose, cellulose, nylon, cross-linked micelles and Teflon®.
  • “Microsphere Detection Guide” from Bangs Laboratories, Fishers, IN is a helpful guide, which is incorporated herein by reference in its entirety.
  • the beads need not be spherical; irregular particles may be used.
  • the beads may be porous, thus increasing the surface area of the bead available for either capture probe attachment or tag attachment.
  • the bead sizes can range from nanometers, for example, 100 nm, to millimeters, for example, 1 mm, with beads from about 0.2 µm to about 200 µm commonly employed, and from about 5 to about 20 µm being within the range currently exemplified, although in some embodiments smaller or larger beads may be used.
  • the beads can be made to include universal primers, and the beads can then be loaded onto an array, thereby forming universal arrays for use in a method set forth herein.
  • the solid supports typically used for bead arrays can be used without beads.
  • nucleic acids, such as probes or primers can be attached directly to the wells or to gel material in wells.
  • the instant methods can employ an array of beads, wherein different nucleic acid probes are attached to different beads in the array.
  • each bead can be attached to a different nucleic acid probe and the beads can be randomly distributed on the solid support in order to effectively attach the different nucleic acid probes to the solid support.
  • the solid support can include wells having dimensions that accommodate no more than a single bead. In such a configuration, the beads may be attached to the wells due to forces resulting from the fit of the beads in the wells.
  • attachment chemistries or capture materials (e.g., liquid electrical tape) can be used to adhere or otherwise stably associate the beads with a solid support, optionally including holding the beads in wells that may or may not be present on a solid support.
  • Nucleic acid probes that are attached to beads can include barcode sequences.
  • a population of the beads can be configured such that each bead is attached to only one type of barcode (e.g., a spatial barcode) and many different beads each with a different barcode are present in the population.
  • randomly distributing the beads to a solid support will result in randomly locating the nucleic acid probe-presenting beads (and their respective barcode sequences) on the solid support.
  • the number of different barcodes in a population of beads can exceed the capacity of the solid support in order to produce an array that is not redundant with respect to the population of barcodes on the solid support.
  • the capacity of the solid support will be determined in some embodiments by the number of features (e.g. single-bead occupancy wells) that attach or otherwise accommodate a bead.
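  • The relationship between barcode diversity and array capacity can be made concrete with a short calculation. The sketch below assumes that each bead carries a barcode drawn uniformly and independently from the barcode pool, which is a simplifying assumption rather than a statement about any particular synthesis; the function name and the bead count of 100,000 are hypothetical.

```python
def expected_nonunique_fraction(n_beads: int, n_barcodes: int) -> float:
    """Expected fraction of beads that share their barcode with at least one other bead.

    Assumes every bead draws its barcode uniformly and independently from
    n_barcodes possibilities (an idealisation).
    """
    if n_beads < 2:
        return 0.0
    # Probability that none of the other (n_beads - 1) beads drew the same barcode.
    p_unique = (1.0 - 1.0 / n_barcodes) ** (n_beads - 1)
    return 1.0 - p_unique


# Hypothetical array of 100,000 beads drawn from a 4**12 barcode space.
fraction = expected_nonunique_fraction(100_000, 4 ** 12)
print(f"expected non-unique fraction: {fraction:.4%}")
```

  • Under these assumptions, a barcode pool much larger than the number of occupied features keeps the expected share of duplicated barcodes well below one percent.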
  • nucleic acid primers may be present on a bead or other solid support of the instant disclosure prior to contacting the bead or other solid support with nucleic acid probes.
  • the primers can be attached at the features, whereas interstitial areas outside of the features substantially lack any of the primers.
  • Nucleic acid probes can be captured at preformed features on a bead or other solid support, and optionally amplified on the bead or other solid support, e.g., using methods set forth in U.S. Patent Nos. 8,895,249 and 8,778,849 and/or U.S. Patent Application Publication No.
  • exemplary capture moieties include, but are not limited to, chemical moieties capable of reacting with a nucleic acid probe to create a covalent bond or receptors capable of binding non-covalently to a ligand on a nucleic acid probe.
  • a step of attaching nucleic acid probes to a bead or other solid support can be carried out by providing a fluid that contains a mixture of different nucleic acid probes and contacting this fluidic mixture with the bead or other solid support.
  • the contact can result in the fluidic mixture being in contact with a surface to which many different nucleic acid probes from the fluidic mixture will attach.
  • the probes have random access to the surface (whether the surface has preformed features configured to attach the probes or a uniform surface configured for attachment). Accordingly, the probes can be randomly located on the bead or other solid support.
  • the total number and variety of different probes that end up attached to a surface can be selected for a particular application or use. For example, in embodiments where a fluidic mixture of different nucleic acid probes is contacted with a bead or other solid support for purposes of attaching the probes to the support, the number of different probe species can exceed the occupancy of the bead or other solid support for probes. Thus, the number and variety of different probes that attach to the bead or other solid support can be equivalent to the probe occupancy of the bead or other solid support.
  • the number and variety of different probe species on the bead or other solid support can be less than the occupancy (i.e., there will be redundancy of probe species such that the bead or other solid support may contain multiple features having the same probe species).
  • redundancy can be achieved, for example, by contacting the bead or other solid support with a fluidic mixture that contains a number and variety of probe species that is substantially lower than the probe occupancy of the bead or other solid support.
  • Attachment of the nucleic acid probes can be mediated by hybridization of the nucleic acid probes to complementary primers that are attached to the bead or other solid support, chemical bond formation between a reactive moiety on the nucleic acid probe and the bead or other solid support (examples are set forth in U.S. Patent Nos. 8,895,249 and 8,778,849, and in U.S. Patent Application Publication No. 2014/0243224 A1, each of which is incorporated herein by reference), affinity interactions of a moiety on the nucleic acid probe with a bead- or other solid support-bound moiety (e.g., receptor-ligand pairs such as streptavidin-biotin, antibody-epitope, lectin-carbohydrate and the like), physical interactions of the nucleic acid probes with the bead or other solid support (e.g., hydrogen bonding, ionic forces, van der Waals forces and the like), or other interactions known in the art to attach nucleic acids to surfaces.
  • One or more features on a bead or other solid support can each include a single molecule of a particular probe.
  • the features can be configured, in some embodiments, to accommodate no more than a single nucleic acid probe molecule. However, whether or not the feature can accommodate more than one nucleic acid probe molecule, the feature may nonetheless include no more than a single nucleic acid probe molecule.
  • an individual feature can include a plurality of nucleic acid probe molecules, for example, an ensemble of nucleic acid probe molecules having the same sequence as each other. In particular embodiments, the ensemble can be produced by amplification from a single nucleic acid probe template to produce amplicons, for example, as a cluster attached to the surface.
  • a method set forth herein can use any of a variety of amplification techniques.
  • Exemplary techniques that can be used include, but are not limited to, polymerase chain reaction (PCR), rolling circle amplification (RCA), multiple displacement amplification (MDA), or random prime amplification (RPA).
  • the amplification can be carried out in solution, for example, when features of an array are capable of containing amplicons in a volume having a desired capacity.
  • an amplification technique used in a method of the present disclosure will be carried out on solid phase.
  • one or more primer species e.g., universal primers for one or more universal primer binding site present in a nucleic acid probe
  • one or both of the primers used for amplification can be attached to a bead or other solid support (e.g., via a gel).
  • Formats that utilize two species of primers attached to a bead or other solid support are often referred to as bridge amplification because double stranded amplicons form a bridge-like structure between the two surface attached primers that flank the template sequence that has been copied.
  • Exemplary reagents and conditions that can be used for bridge amplification are described, for example, in U.S. Patent Nos. 5,641,658; 7,115,400; and 8,895,249; and/or U.S. Patent Application Publication Nos.
  • Solid-phase PCR amplification can also be carried out with one of the amplification primers attached to a bead or other solid support and the second primer in solution.
  • An exemplary format that uses a combination of a surface attached primer and soluble primer is the format used in emulsion PCR as described, for example, in Dressman et al., Proc. Natl. Acad. Sci. USA 100:8817-8822 (2003), WO 05/010145, or U.S. Patent Application Publication Nos.
  • Emulsion PCR is illustrative of the format and it will be understood that for purposes of the methods set forth herein the use of an emulsion is optional and indeed for several embodiments an emulsion is not used.
  • RCA techniques can be modified for use in a method of the present disclosure.
  • Exemplary components that can be used in an RCA reaction and principles by which RCA produces amplicons are described, for example, in Lizardi et al., Nat. Genet. 19:225-232 (1998) and U.S. Patent Application Publication No. 2007/0099208 A1, each of which is incorporated herein by reference.
  • Primers used for RCA can be in solution or attached to a bead or other solid support.
  • the primers can be one or more of the universal primers described herein.
  • MDA techniques can be modified for use in a method of the present disclosure. Some basic principles and useful conditions for MDA are described, for example, in Dean et al., Proc Natl. Acad. Sci. USA 99:5261-66 (2002); Lü et al., Genome Research 13:294-307 (2003); Walker et al., Molecular Methods for Virus Detection, Academic Press, Inc., 1995; Walker et al., Nucl. Acids Res. 20:1691-96 (1992); US 5,455,166; US 5,130,238; and US 6,214,587, each of which is incorporated herein by reference.
  • Primers used for MDA can be in solution or attached to a bead or other solid support at an amplification site. Again, the primers can be one or more of the universal primers described herein.
  • Nucleic acid probes that are used in a method set forth herein or present in an apparatus or composition of the present disclosure can include barcode sequences, and for embodiments that include a plurality of different nucleic acid probes, each of the probes can include a different barcode sequence from other probes in the plurality. Barcode sequences can be any of a variety of lengths.
  • a barcode sequence can be at least 2, 4, 6, 8, 10, 12, 15, 20 or more nucleotides in length. Alternatively or additionally, the length of the barcode sequence can be at most 20, 15, 12, 10, 8, 6, 4 or fewer nucleotides. Examples of barcode sequences that can be used are set forth, for example in, U.S. Patent Application Publication No. 2014/0342921 A1 and U.S. Patent No. 8,460,865, each of which is incorporated herein by reference.
  • nucleic acid detection methods include, but are not limited to nucleic acid sequencing of a probe, hybridization of nucleic acids to a probe, ligation of nucleic acids that are hybridized to a probe, extension of nucleic acids that are hybridized to a probe, extension of a first nucleic acid that is hybridized to a probe followed by ligation of the extended nucleic acid to a second nucleic acid that is hybridized to the probe, or other methods known in the art such as those set forth in U.S. Patent No. 8,288,103 or 8,486,625, each of which is incorporated herein by reference.
  • compositions and methods of the instant disclosure largely remove the need for sequencing-by-synthesis (SBS) techniques, as the a priori need for imaging of barcode locations is replaced by generating a latent space representation of macromolecules at array locations by performing a dimensionality reduction analysis.
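  • The idea of recovering array locations through dimensionality reduction rather than imaging can be illustrated with a toy example. This passage does not state how pairwise relationships between array elements are measured or which dimensionality-reduction technique is used, so the sketch below makes two loud assumptions: a noise-free matrix of pairwise bead-to-bead distances is already available, and classical multidimensional scaling (MDS) is substituted as a stand-in technique. All names, the bead count and the layout are hypothetical.

```python
import math
import random


def classical_mds_2d(dist, iters=200, seed=0):
    """Embed n points in 2D from an n x n pairwise distance matrix (classical MDS).

    Uses power iteration with deflation to find the two leading eigenvectors
    of the double-centred Gram matrix; a stand-in technique for illustration.
    """
    n = len(dist)
    sq = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(r) / n for r in sq]
    grand_mean = sum(row_mean) / n
    # Double centring: B = -1/2 * J * D^2 * J
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean)
          for j in range(n)] for i in range(n)]
    rng = random.Random(seed)
    axes = []
    for _ in range(2):
        v = [rng.random() for _ in range(n)]
        for _ in range(iters):
            w = [sum(bij * vj for bij, vj in zip(b_row, v)) for b_row in b]
            norm = math.sqrt(sum(x * x for x in w))
            v = [x / norm for x in w]
        bv = [sum(bij * vj for bij, vj in zip(b_row, v)) for b_row in b]
        lam = sum(x * y for x, y in zip(v, bv))  # Rayleigh quotient = eigenvalue
        axes.append([math.sqrt(max(lam, 0.0)) * x for x in v])
        # Deflate so the next pass finds the following eigenvector.
        for i in range(n):
            for j in range(n):
                b[i][j] -= lam * v[i] * v[j]
    return list(zip(axes[0], axes[1]))


# Hypothetical ground truth: 25 beads scattered over a 3000 x 1000 area.
layout_rng = random.Random(1)
true_xy = [(layout_rng.uniform(0, 3000), layout_rng.uniform(0, 1000)) for _ in range(25)]
dist = [[math.dist(p, q) for q in true_xy] for p in true_xy]

embedded = classical_mds_2d(dist)
max_err = max(abs(math.dist(embedded[i], embedded[j]) - dist[i][j])
              for i in range(25) for j in range(25))
print(f"largest pairwise-distance error after embedding: {max_err:.2e}")
```

  • The recovered coordinates match the original layout only up to rotation, reflection and translation, which is why the check compares pairwise distances rather than raw coordinates. Real data would be noisy and would likely call for a different technique than this toy stand-in.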
  • Exemplified bead-attached oligonucleotides of the instant disclosure include an oligonucleotide spatial barcode designed to be unique to each bead within a bead array (or at least wherein the majority of such barcodes are unique to a bead within a bead array - e.g., it is expressly contemplated here and elsewhere herein that a bead array possessing only a small fraction of beads (e.g., even up to 10%, 20%, 30% or 40% or more of total beads) having non-unique spatial barcodes (e.g., attributable to a relative lack of degeneracy within the bead population, e.g., due to a probabilistically determinable lack of sequence degeneracy calculated as possible within the bead population, as then compared to the number of sites across which the bead population is ultimately distributed and/or due to an artifact such as non-randomness of bead association occurring during pool-and-split
  • Exemplified bead-attached oligonucleotides of the instant disclosure also include a linker (optionally a cleavable linker); a poly-dT sequence (herein, as a 3’ tail); a Unique Molecular Identifier (UMI) which differs for each priming site (as described below and as known in the art, e.g., see WO 2016/040476); a spatial barcode as described above and elsewhere herein; and a common sequence (“PCR handle”) to enable PCR amplification after “single-cell transcriptomes attached to microparticles” (STAMP) formation.
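  • The element list above can be pictured as a simple record. The sketch below is illustrative only: the 5'-to-3' ordering (PCR handle, spatial barcode, UMI, poly-dT tail), the handle sequence, the 8-nt UMI length and the 30-nt poly-dT length are all assumptions introduced here, and the linker is omitted because it is not a nucleotide sequence. Only the 12-nt barcode length follows from the 12 split-pool cycles described in this disclosure.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class BeadOligo:
    """One bead-attached capture oligonucleotide (hypothetical layout)."""
    pcr_handle: str       # common sequence shared by all oligos
    spatial_barcode: str  # shared by every oligo on the same bead
    umi: str              # differs between priming sites on a bead
    poly_dt: str          # 3' tail that captures poly-A RNA

    def sequence(self) -> str:
        # Assumed 5'->3' order; the source lists the elements without fixing it here.
        return self.pcr_handle + self.spatial_barcode + self.umi + self.poly_dt


def make_oligo(spatial_barcode: str, rng: random.Random,
               umi_len: int = 8, poly_dt_len: int = 30,
               pcr_handle: str = "ACGT" * 5) -> BeadOligo:
    """Build one oligo with a random UMI; all default lengths are placeholders."""
    umi = "".join(rng.choice("ACGT") for _ in range(umi_len))
    return BeadOligo(pcr_handle, spatial_barcode, umi, "T" * poly_dt_len)


rng = random.Random(0)
bead_oligos = [make_oligo("ACGTACGTACGT", rng) for _ in range(5)]
for oligo in bead_oligos:
    print(oligo.spatial_barcode, oligo.umi)
```

  • Every oligo on the simulated bead prints the same spatial barcode with an independently drawn UMI, mirroring the distinction between the bead-level barcode and the per-priming-site identifier.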
  • mRNAs bind to poly-dT-presenting primers on their companion microparticle.
  • the mRNAs are reverse transcribed into cDNAs, generating a set of beads called STAMPs.
  • the barcoded STAMPs can then be amplified in pools for high-throughput mRNA-seq to analyze any desired number of beads, where each bead roughly corresponds to an approximately bead-sized area of cellular transcriptomes derived from the cryosectioned tissue sample (in the instant disclosure, 10 µm beads were used to produce resolutions approximating single cell feature sizes, as exemplified herein).
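  • After pooled amplification and sequencing, reads are typically collapsed so that PCR duplicates of one captured molecule are counted once. The sketch below shows one minimal way to do that, assuming each read has already been parsed into a (spatial barcode, UMI, gene) triple; the input format, the function name and the sample reads are hypothetical, and no error correction of barcodes or UMIs is attempted.

```python
from collections import defaultdict


def count_umis(reads):
    """Collapse reads into per-bead, per-gene counts of distinct UMIs.

    reads: iterable of (spatial_barcode, umi, gene) triples (assumed pre-parsed).
    Returns {spatial_barcode: {gene: number_of_distinct_umis}}.
    """
    umis_seen = defaultdict(set)
    for barcode, umi, gene in reads:
        umis_seen[(barcode, gene)].add(umi)
    counts = defaultdict(dict)
    for (barcode, gene), umis in umis_seen.items():
        counts[barcode][gene] = len(umis)
    return dict(counts)


# Hypothetical reads: the first two are PCR duplicates of the same molecule.
example_reads = [
    ("AAACCCGGGTTT", "ACGTACGT", "Actb"),
    ("AAACCCGGGTTT", "ACGTACGT", "Actb"),
    ("AAACCCGGGTTT", "TTTTACGT", "Actb"),
    ("CCCAAATTTGGG", "ACGTACGT", "Gapdh"),
]
print(count_umis(example_reads))
```

  • The result is a bead-by-gene count table keyed by spatial barcode, which is the form of data a downstream spatial analysis would consume.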
  • oligonucleotide sequences designed for capture of a broader range of macromolecules as described here and elsewhere herein can be used.
  • oligonucleotide-directed capture of other types of macromolecules is also contemplated for the bead-attached oligonucleotides of the instant disclosure; for instance, a gene-specific capture sequence can be incorporated into oligonucleotide sequences (e.g., for purpose of capturing a full range of cell/tissue-associated RNAs including non-poly-A-tailed RNAs, such as tRNAs, miRNAs, etc., or for purpose of specifically capturing DNAs) and/or a loaded transposase can be used to capture, for example, DNA, and/or a specific sequence can be included to allow for specific capture of a DNA-barcoded antibody signal (not only allowing for assessment of protein distribution across a test sample using the compositions and methods of the instant disclosure, but also thereby, e.g., allowing for linkage of the spatial distributions of proteins to RNA expression).
  • Exemplary split-and-pool synthesis of the bead barcode: To generate the cell barcode, the pool of microparticles (here, microbeads) is repeatedly split into four equally sized oligonucleotide synthesis reactions, to which one of the four DNA bases is added, and then pooled together after each cycle, in a total of 12 split-pool cycles.
  • the barcode synthesized on any individual bead reflects that bead’s unique (or sufficiently unique) path through the series of synthesis reactions. The result is a pool of microparticles, each possessing one of 4¹² (16,777,216) possible sequences on its entire complement of primers.
  • Extension of the split-pool process can provide for, e.g., production of an even greater number of possible spatial barcode sequences for use in the compositions and methods of the instant disclosure.
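  • The split-pool procedure described above can be sketched as a small simulation. The code assumes idealised conditions (perfectly random pooling, exactly equal splits, error-free base addition); the function name and the bead count are hypothetical.

```python
import random


def split_pool_barcodes(n_beads: int, n_cycles: int = 12, rng=None):
    """Simulate split-pool synthesis: one base is added per bead per cycle.

    Each cycle shuffles the pooled beads, deals them into four equally sized
    reactions (one per base), then pools them again.
    """
    rng = rng or random.Random(0)
    barcodes = [""] * n_beads
    for _ in range(n_cycles):
        order = list(range(n_beads))
        rng.shuffle(order)                 # pooling mixes the beads
        for position, bead in enumerate(order):
            base = "ACGT"[position % 4]    # four equally sized reactions
            barcodes[bead] += base
    return barcodes


barcodes = split_pool_barcodes(1000, rng=random.Random(0))
print("possible 12-nt barcodes:", 4 ** 12)
print("distinct barcodes among 1000 simulated beads:", len(set(barcodes)))
```

  • Because the number of possible sequences grows as 4 raised to the number of cycles, each additional cycle multiplies the barcode space by four.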
  • functional use of spatial barcodes does not require complete non-redundancy of spatial barcodes among all beads of a bead array.
  • the linker of a bead-attached oligonucleotide is a chemically-cleavable, straight-chain polymer.
  • the linker is a photolabile optionally substituted hydrocarbon polymer.
  • the linker of a bead-attached oligonucleotide is a non-cleavable, straight-chain polymer.
  • the linker is a non-cleavable, optionally substituted hydrocarbon polymer.
  • the linker is a polyethylene glycol.
  • the linker is a PEG-C3 to PEG-24.
  • a nucleic acid probe used in a composition or method set forth herein can include a target capture moiety.
  • the target capture moiety is a target capture sequence.
  • the target capture sequence is generally complementary to a target sequence such that target capture occurs by formation of a probe-target hybrid complex.
  • a target capture sequence can be any of a variety of lengths including, for example, lengths exemplified above in the context of barcode sequences.
  • a plurality of different nucleic acid probes can include different target capture sequences that hybridize to different target nucleic acid sequences from a biological specimen. Different target capture sequences can be used to selectively bind to one or more desired target nucleic acids from a biological specimen.
  • the different nucleic acid probes can include a target capture sequence that is common to all or a subset of the probes on a solid support.
  • the nucleic acid probes on a solid support can have a poly A or poly T sequence.
  • Such probes or amplicons thereof can hybridize to mRNA molecules, cDNA molecules or amplicons thereof that have poly A or poly T tails. Although the mRNA or cDNA species will have different target sequences, capture will be mediated by the common poly A or poly T sequence regions.
  • target nucleic acids can be captured and analyzed in a method set forth herein including, but not limited to, messenger RNA (mRNA), copy DNA (cDNA), genomic DNA (gDNA), ribosomal RNA (rRNA) or transfer RNA (tRNA).
  • target sequences can be selected from databases and appropriate capture sequences designed using techniques and databases known in the art.
  • a method set forth herein can include a step of hybridizing nucleic acid probes, that are on a supported bead array, to target nucleic acids that are from portions of the biological specimen that are proximal to the probes.
  • a target nucleic acid will flow or diffuse from a region of the biological specimen to an area of the probe-presenting bead array that is in proximity with that region of the specimen.
  • the target nucleic acid will interact with nucleic acid probes that are proximal to the region of the specimen from which the target nucleic acid was released.
  • a target-probe hybrid complex can form where the target nucleic acid encounters a complementary target capture sequence on a nucleic acid probe. The location of the target-probe hybrid complex will generally correlate with the region of the biological specimen from where the target nucleic acid was derived.
  • the beads will include a plurality of nucleic acid probes
  • the biological specimen will release a plurality of target nucleic acids
  • a plurality of target-probe hybrids will be formed on the beads.
  • the sequences of the target nucleic acids and their locations on the bead array will provide spatial information about the nucleic acid content of the biological specimen.
  • the target nucleic acids need not be released. Rather, the target nucleic acids may remain in contact with the biological specimen, for example, when they are attached to an exposed surface of the biological specimen in a way that the target nucleic acids can also bind to appropriate nucleic acid probes on the beads.
  • a method of the present disclosure can include a step of extending bead-attached probes to which target nucleic acids are hybridized.
  • the resulting extended probes will include the barcode sequences and sequences from the target nucleic acids (albeit in complementary form).
  • the extended probes are thus spatially tagged versions of the target nucleic acids from the biological specimen.
  • the sequences of the extended probes identify what nucleic acids are in the biological specimen and where in the biological specimen the target nucleic acids are located. It will be understood that other sequence elements that are present in the nucleic acid probes can also be included in the extended probes (see, e.g., description as provided elsewhere herein).
  • Such elements include, for example, primer binding sites, cleavage sites, other tag sequences (e.g., sample identification tags), capture sequences, recognition sites for nucleic acid binding proteins or nucleic acid enzymes, or the like.
  • Extension of probes can be carried out using methods exemplified herein or otherwise known in the art for amplification of nucleic acids or sequencing of nucleic acids.
  • one or more nucleotides can be added to the 3' end of a nucleic acid, for example, via polymerase catalysis (e.g., DNA polymerase, RNA polymerase or reverse transcriptase).
  • Chemical or enzymatic methods can be used to add one or more nucleotides to the 3' or 5' end of a nucleic acid.
  • One or more oligonucleotides can be added to the 3' or 5' end of a nucleic acid, for example, via chemical or enzymatic (e.g., ligase catalysis) methods.
  • a nucleic acid can be extended in a template directed manner, whereby the product of extension is complementary to a template nucleic acid that is hybridized to the nucleic acid that is extended.
  • a DNA primer is extended by a reverse transcriptase using an RNA template, thereby producing a cDNA.
  • an extended probe made in a method set forth herein can be a reverse transcribed DNA molecule.
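  • Template-directed extension by reverse transcription can be illustrated with a short function that writes out the cDNA complementary to an RNA template. The sketch assumes an idealised, full-length, error-free copy; the function name and the example template are hypothetical.

```python
def reverse_transcribe(rna_template: str) -> str:
    """Return the cDNA (written 5'->3') complementary to an RNA template (5'->3')."""
    pairing = {"A": "T", "U": "A", "G": "C", "C": "G"}
    # The new strand is antiparallel to the template, so read the template backwards.
    return "".join(pairing[base] for base in reversed(rna_template.upper()))


# Hypothetical short mRNA ending in a poly-A tail.
mrna = "AUGGCCAAAAAA"
print(reverse_transcribe(mrna))  # TTTTTTGGCCAT
```

  • The cDNA begins with a run of T residues because synthesis starts from the end that paired with the poly-A tail, consistent with priming from a poly-dT capture sequence.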
  • Exemplary methods for extending nucleic acids are set forth in US Pat. App. Publ. No. US 2005/0037393 A1 or US Pat. No. 8,288,103 or 8,486,625, each of which is incorporated herein by reference.
  • an extended probe can include at least 1, 2, 5, 10, 25, 50, 100, 200, 500, 1000 or more nucleotides that are copied from a target nucleic acid.
  • the length of the extension product can be controlled, for example, using reversibly terminated nucleotides in the extension reaction and running a limited number of extension cycles.
  • an extended probe produced in a method set forth herein can include no more than 1000, 500, 200, 100, 50, 25, 10, 5, 2 or 1 nucleotides that are copied from a target nucleic acid.
  • extended probes can be any length within or outside of the ranges set forth above.
  • probes used in a method, composition or apparatus set forth herein need not be nucleic acids. Other molecules can be used such as proteins, carbohydrates, small molecules, particles or the like.
  • Probes can be a combination of a nucleic acid component (e.g., having a barcode, primer binding site, cleavage site and/or other sequence element set forth herein) and another moiety (e.g., a moiety that captures or modifies a target nucleic acid).
  • a method of the present disclosure can include a step of removing one or more extended probes from a bead.
  • the probes will have included a cleavage site such that the product of extending the probes will also include the cleavage site.
  • a cleavage site can be introduced into a probe during a modification step.
  • a cleavage site can be introduced into an extended probe during the extension step.
  • Exemplary cleavage sites include, but are not limited to, moieties that are susceptible to a chemical, enzymatic or physical process that results in bond breakage.
  • the location can be a nucleotide sequence that is recognized by an endonuclease.
  • Suitable endonucleases and their recognition sequences are well known in the art and in many cases are even commercially available (e.g., from New England Biolabs, Beverly, MA; ThermoFisher, Waltham, MA; or Sigma Aldrich, St. Louis, MO).
  • a particularly useful endonuclease will break a bond in a nucleic acid strand at a site that is 3'-remote to its binding site in the nucleic acid, examples of which include Type II or Type IIS restriction endonucleases.
  • an endonuclease will cut only one strand in a duplex nucleic acid (e.g., a nicking enzyme). Examples of endonucleases that cleave only one strand include Nt.BstNBI and Nt.AlwI.
  • An abasic site may be created at a uracil nucleotide on one strand of a nucleic acid.
  • the enzyme uracil DNA glycosylase (UDG) may be used to remove the uracil base, generating an abasic site on the strand.
  • the nucleic acid strand that has the abasic site may then be cleaved at the abasic site by treatment with endonuclease (e.g., EndoIV endonuclease, AP lyase, FPG glycosylase/AP lyase, EndoVIII glycosylase/AP lyase), heat or alkali.
  • a fluidic mixture can include at most 1 × 10⁹, 1 × 10⁸, 1 × 10⁷, 1 × 10⁶, 1 × 10⁵, 1 × 10⁴, 1 × 10³, 100, 10 or fewer different modified probes.
  • the fluidic mixture can be manipulated to allow detection of the modified nucleic acid probes.
  • the modified nucleic acid probes can be separated spatially on a second solid support (i.e., different from the bead array and/or adhered solid support from which the nucleic acid probes were released after having been contacted with a biological specimen and modified), or the probes can be separated temporally in a fluid stream.
  • Modified nucleic acid probes can be separated on a bead or other solid support in a capture or detection method commonly employed for microarray-based techniques or nucleic acid sequencing techniques such as those set forth previously and/or otherwise described herein.
  • modified probes can be attached to a microarray by hybridization to complementary nucleic acids.
  • the modified probes can be attached to beads or to a flow cell surface and optionally amplified as is carried out in many nucleic acid sequencing platforms.
  • Modified probes can be separated in a fluid stream using a microfluidic device, droplet manipulation device, or flow cytometer. Typically, detection is carried out on these separation devices, but detection is not necessary in all embodiments.
  • the number of bead-attached oligonucleotides present upon an individual bead can vary across a wide range, e.g., from tens to thousands, or millions, or more. Due to the transcriptome profiling nature of the instant disclosure, it is generally preferred to pack as many capture oligonucleotides as spatially and sterically (as well as economically) possible onto an individual bead (i.e., thousands, tens of thousands, or more, of oligonucleotides per individual bead), provided that mRNA capture from a contacted tissue is optimized. It is contemplated that optimization of the oligonucleotide-per-bead metric can be readily performed by one of ordinary skill in the art.
  • oligonucleotides of the instant disclosure can possess any number of other art-recognized features while remaining within the scope of the instant disclosure.
  • a capture material is employed to associate a bead array with a solid support (e.g., a glass slide).
  • the capture material is a liquid electrical tape.
  • An exemplary liquid electrical tape of the instant disclosure is Permatex™ liquid electrical tape, which is a weatherproof protectant for wiring and electrical connections.
  • Liquid capture material such as liquid tape can be applied as a liquid, which then dries to a vinyl polymer that resists dirt, dust, chemicals, and moisture, ensuring that applied beads are attached to a capture material-coated slide in a dry condition.
  • RNAs i.e., the transcriptome of cells found within cryosectioned tissues
  • microbeads where each microbead respectively possesses thousands of oligonucleotides capable of capturing oligoribonucleotides, e.g., transcripts
  • diffusive properties - is what imparts the instant methods and compositions with extremely high resolution (i.e., resolution at 10-50 µm spacing across a two-dimensional image of a section) of assessment of the cellular transcriptomes (or other macromolecules) of assayed tissue sections.
  • beads of the instant disclosure can be applied to a capture material-coated solid support, either immediately upon deposit of capture material to the solid support, or following an initial drying period for the capture material.
  • Capture materials of the instant disclosure can be applied by any of a number of methods, including brushed onto the solid support, sprayed onto the solid support, or the like, or via submersion of the solid support in the capture material.
  • a brush top applicator can allow coverage without gaps and can enable access to tight spaces, which offers advantages in certain embodiments over forms of capture material (i.e., tape) that are applied in a non-liquid state.
  • liquid electrical tape has been exemplified as a capture material for use in the methods and compositions of the instant disclosure
  • other capture materials are also contemplated for such use, including any art-recognized glue or other reagent that is (a) spreadable and/or depositable upon a solid surface (e.g., upon a slide, optionally a slide that allows for light transmission through the slide, e.g., a microscope slide) and (b) capable of binding or otherwise capturing a population of beads of 1-100 µm size.
  • Exemplary other capture materials include latex such as cis-1,4-polyisoprene and other rubbers, as well as elastomers (which are generally defined as polymers that possess viscoelasticity (i.e., both viscosity and elasticity), very weak inter-molecular forces, and generally low Young's modulus and high failure strain compared with other materials), including artificial elastomers (e.g., neoprene) and/or silicone elastomers.
  • Acrylate polymers (e.g., scotch tape) are also expressly contemplated, e.g., for use as a capture material of the instant disclosure.
  • a tissue section is employed.
  • the tissue can be derived from a multicellular organism.
  • Exemplary multicellular organisms include, but are not limited to a mammal, plant, algae, nematode, insect, fish, reptile, amphibian, fungi or Plasmodium falciparum.
  • Exemplary species are set forth previously herein or known in the art.
  • the tissue can be freshly excised from an organism, or it may have been previously preserved for example by freezing, embedding in a material such as paraffin (e.g., formalin fixed paraffin embedded samples), formalin fixation, infiltration, dehydration or the like.
  • a tissue section can be cryosectioned, using techniques and compositions as described herein and as known in the art.
  • a tissue can be permeabilized and the cells of the tissue lysed. Any of a variety of art-recognized lysis treatments can be used. Target nucleic acids that are released from a tissue that is permeabilized can be captured by nucleic acid probes, as described herein and as known in the art.
  • a tissue can be prepared in any convenient or desired way for its use in a method, composition or apparatus herein. Fresh, frozen, fixed or unfixed tissues can be used. A tissue can be fixed or embedded using methods described herein or known in the art.
  • a tissue sample for use herein can be fixed by deep freezing at temperature suitable to maintain or preserve the integrity of the tissue structure, e.g., less than -20° C.
  • a tissue can be prepared using formalin-fixation and paraffin embedding (FFPE) methods which are known in the art.
  • Other fixatives and/or embedding materials can be used as desired.
  • a fixed or embedded tissue sample can be sectioned, i.e., thinly sliced, using known methods.
  • a tissue sample can be sectioned using a chilled microtome or cryostat, set at a temperature suitable to maintain both the structural integrity of the tissue sample and the chemical properties of the nucleic acids in the sample.
  • Exemplary additional fixatives that are expressly contemplated include alcohol fixation (e.g., methanol fixation, ethanol fixation), glutaraldehyde fixation and paraformaldehyde fixation.
  • a tissue sample will be treated to remove embedding material (e.g., to remove paraffin or formalin) from the sample prior to release, capture or modification of nucleic acids. This can be achieved by contacting the sample with an appropriate solvent (e.g., xylene and ethanol washes). Treatment can occur prior to contacting the tissue sample with a solid support-captured bead array as set forth herein or the treatment can occur while the tissue sample is on the solid support-captured bead array.
  • the thickness of a tissue sample or other biological specimen that is contacted with a bead array in a method, composition or apparatus set forth herein can be any suitable thickness desired.
  • the thickness will be at least 0.1 µm, 0.25 µm, 0.5 µm, 0.75 µm, 1 µm, 5 µm, 10 µm, 50 µm, 100 µm or thicker.
  • the thickness of a tissue sample that is contacted with a bead array will be no more than 100 µm, 50 µm, 10 µm, 5 µm, 1 µm, 0.5 µm, 0.25 µm, 0.1 µm or thinner.
  • a particularly relevant source for a tissue sample is a human being.
  • the sample can be derived from an organ, including for example, an organ of the central nervous system such as brain, brainstem, cerebellum, spinal cord, cranial nerve, or spinal nerve; an organ of the musculoskeletal system such as muscle, bone, tendon or ligament; an organ of the digestive system such as salivary gland, pharynx, esophagus, stomach, small intestine, large intestine, liver, gallbladder or pancreas; an organ of the respiratory system such as larynx, trachea, bronchi, lungs or diaphragm; an organ of the urinary system such as kidney, ureter, bladder or urethra; a reproductive organ such as ovary, fallopian tube, uterus, vagina, placenta, testicle, epididymis, vas deferens, seminal vesicle, prostate, penis or scrotum; an organ of
  • a sample from a human can be considered (or suspected) healthy or diseased when used. In some cases, two samples can be used: a first being considered diseased and a second being considered as healthy (e.g., for use as a healthy control).
  • Any of a variety of conditions can be evaluated, including but not limited to, an autoimmune disease, cancer, cystic fibrosis, aneuploidy, pathogenic infection, psychological condition, hepatitis, diabetes, sexually transmitted disease, heart disease, stroke, cardiovascular disease, multiple sclerosis or muscular dystrophy.
  • Certain contemplated conditions include genetic conditions or conditions associated with pathogens having identifiable genetic signatures.
  • compositions and methods can be applied to obtain spatially-resolvable abundance data for a wide range of macromolecules, including not only poly-A-tailed RNAs/transcripts, but also, e.g., non-poly-A-tailed RNAs (e.g., tRNAs, miRNAs, etc.; optionally specifically captured using sequence-specific oligonucleotide sequences), DNAs (including, e.g., capture via gene-specific oligonucleotides, loaded transposases, etc.), and proteins (including, e.g., DNA-barcoded antibodies, optionally where a DNA barcode effectively tags a capture antibody for detection, allowing for direct comparison of spatial distribution(s) of antibodies and/or antibody-captured proteins with spatially-resolvable expression profiling that also can be performed
  • RNA including, e.g., transcripts, tRNAs, rRNAs, miRNAs, etc.
  • DNAs including, e.g., genomic DNAs, barcode DNAs, etc.
  • proteins including, e.g., antibodies that are tagged for binding and detection and/or other forms of protein, optionally including proteins captured by antibodies.
  • proteins can be profiled using a library of DNA-barcoded antibodies to stain a tissue, before capturing proteins on the spatial array (refer to Cellular Indexing of Transcriptome and Epitopes by sequencing (CITE-seq), which combines unbiased genome-wide expression profiling with the measurement of specific protein markers in thousands of single cells using droplet microfluidics).
  • monoclonal antibodies are conjugated to oligonucleotides containing unique antibody identifier sequences; a cell suspension is then labeled with the oligo-tagged antibodies and single cells are subsequently encapsulated into nanoliter-sized aqueous droplets in a microfluidic apparatus.
  • antibody and cDNA molecules are indexed with the same unique (or sufficiently unique) barcode and are converted into libraries that are amplified independently and mixed in appropriate proportions for sequencing in the same lane.
  • proteins may be adsorbed onto the beads nonspecifically, or through chemical capture (such as amine reactive chemistry or crosslinkers), the beads may be sorted into wells and the proteins quantitated by standard measures (antibodies, ELISA, etc.), and then followed by sequencing of the paired bead sequences and the spatial locations reconstructed.
  • a solid support-captured bead array is washed after exposure of the bead array to a cryosectioned tissue (optionally, the cryosectioned tissue is removed prior to or during application of a wash solution).
  • a solid support-captured bead array of the instant disclosure can be submerged in a buffered salt solution (or other stabilizing solution) after contacting the bead array with a cryosectioned tissue sample.
  • buffered salt solutions include saline-sodium citrate (SSC), for example at a NaCl concentration of about 0.2 M to 5 M NaCl, optionally at about 0.5 to 3 M NaCl, optionally at about 1 M NaCl.
  • RNA i.e., transcript
  • SSC has been exemplified in the processes of the instant disclosure, use of other types of buffered solutions is expressly contemplated, including, e.g., PBS, Tris buffered saline and/or Tris buffer, as well as, more broadly, any aqueous buffer possessing a pH between 4 and 10 and salt between 0-1 osmolarity.
  • Wash solutions can contain various additives, such as surfactants (e.g., detergents), enzymes (e.g., proteases and collagenases), cleavage reagents, or the like, to facilitate removal of the specimen.
  • the solid support is treated with a solution comprising a proteinase enzyme.
  • the solution can include cellulase, hemicellulase or chitinase enzymes (e.g., if desiring to remove a tissue sample from a plant or fungal source).
  • the temperature of a wash solution will be at least 30°C, 35°C, 50°C, 60°C or 90°C. Conditions can be selected for removal of a biological specimen while not denaturing hybrid complexes formed between target nucleic acids and solid support-attached nucleic acid probes.
  • Some of the methods and compositions provided herein employ methods of sequencing nucleic acids.
  • a number of DNA sequencing techniques are known in the art, including fluorescence-based sequencing methodologies (See, e.g., Birren et al., Genome Analysis Analyzing DNA, 1, Cold Spring Harbor, N.Y., which is incorporated herein by reference in its entirety).
  • automated sequencing techniques understood in the art are utilized.
  • parallel sequencing of partitioned amplicons can be utilized (PCT Publication No. WO2006084132, which is incorporated herein by reference in its entirety).
  • DNA sequencing is achieved by parallel oligonucleotide extension (See, e.g., U.S. Pat. No.
  • Next-generation sequencing (NGS) methods can be employed in certain aspects of the instant disclosure to obtain a high volume of sequence information (such as is particularly required to perform deep sequencing of bead-associated RNAs following capture of RNAs from cryosections) in a highly efficient and cost-effective manner.
  • NGS methods share the common feature of massively parallel, high-throughput strategies, with the goal of lower costs in comparison to older sequencing methods (see, e.g., Voelkerding et al, Clinical Chem., 55: 641-658, 2009; MacLean et al, Nature Rev. Microbiol, 7: 287-296; which are incorporated herein by reference in their entireties).
  • NGS methods can be broadly divided into those that typically use template amplification and those that do not.
  • Amplification-utilizing methods include pyrosequencing commercialized by Roche as the 454 technology platforms (e.g., GS 20 and GS FLX), the Solexa platform commercialized by Illumina, and the Supported Oligonucleotide Ligation and Detection (SOLiD™) platform commercialized by Applied Biosystems.
  • Non-amplification approaches, also known as single-molecule sequencing, are exemplified by the HeliScope platform commercialized by Helicos Biosciences, SMRT sequencing commercialized by Pacific Biosciences, and emerging platforms marketed by VisiGen and Oxford Nanopore Technologies Ltd.
  • template DNA is fragmented, end-repaired, ligated to adaptors, and clonally amplified in-situ by capturing single template molecules with beads bearing oligonucleotides complementary to the adaptors.
  • Each bead bearing a single template type is compartmentalized into a water-in-oil microvesicle, and the template is clonally amplified using a technique referred to as emulsion PCR.
  • the emulsion is disrupted after amplification and beads are deposited into individual wells of a picotitre plate functioning as a flow cell during the sequencing reactions. Ordered, iterative introduction of each of the four dNTP reagents occurs in the flow cell in the presence of sequencing enzymes and luminescent reporter such as luciferase.
  • an appropriate dNTP is added to the 3' end of the sequencing primer
  • the resulting production of ATP causes a burst of luminescence within the well, which is recorded using a CCD camera. It is possible to achieve read lengths greater than or equal to 400 bases, and 10^6 sequence reads can be achieved, resulting in up to 500 million base pairs (Mb) of sequence.
  • sequencing data are produced in the form of shorter-length reads.
  • single-stranded fragmented DNA is end-repaired to generate 5'-phosphorylated blunt ends, followed by Klenow-mediated addition of a single A base to the 3' end of the fragments.
  • A-addition facilitates addition of T-overhang adaptor oligonucleotides, which are subsequently used to capture the template-adaptor molecules on the surface of a flow cell that is studded with oligonucleotide anchors.
  • the anchor is used as a PCR primer, but because of the length of the template and its proximity to other nearby anchor oligonucleotides, extension by PCR results in the "arching over" of the molecule to hybridize with an adjacent anchor oligonucleotide to form a bridge structure on the surface of the flow cell.
  • These loops of DNA are denatured and cleaved. Forward strands are then sequenced with reversible dye terminators.
  • sequence of incorporated nucleotides is determined by detection of post-incorporation fluorescence, with each fluorophore and block removed prior to the next cycle of dNTP addition. Sequence read length ranges from 36 nucleotides to over 50 nucleotides, with overall output exceeding 1 billion nucleotide pairs per analytical run.
  • Sequencing nucleic acid molecules using SOLiD technology can initially involve fragmentation of the template, ligation to oligonucleotide adaptors, attachment to beads, and clonal amplification by emulsion PCR. Following this, beads bearing the template are immobilized on a derivatized surface of a glass flow-cell, and a primer complementary to the adaptor oligonucleotide is annealed.
  • interrogation probes have 16 possible combinations of the two bases at the 3' end of each probe, and one of four fluors at the 5' end. Fluor color, and thus identity of each probe, corresponds to specified color-space coding schemes. Multiple rounds (usually 7) of probe annealing, ligation, and fluor detection are followed by denaturation, and then a second round of sequencing using a primer that is offset by one base relative to the initial primer. In this manner, the template sequence can be computationally re-constructed, and template bases are interrogated twice, resulting in increased accuracy. Sequence read length averages 35 nucleotides, and overall output exceeds 4 billion bases per sequencing run.
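  • As an illustration of how 16 dinucleotide combinations can map onto four fluors, the following is a minimal sketch of a two-base color-space code. The source does not spell out the coding scheme, so the specific mapping (each color is the XOR of 2-bit base codes with A=0, C=1, G=2, T=3) and the function names `encode_colors` and `decode_colors` are assumptions made for illustration only.

```python
# Hypothetical two-base color-space coding sketch (mapping is an assumption).
BASE_TO_BITS = {"A": 0, "C": 1, "G": 2, "T": 3}
BITS_TO_BASE = "ACGT"


def encode_colors(seq):
    """Return one color (0-3) per adjacent base pair of seq."""
    return [BASE_TO_BITS[a] ^ BASE_TO_BITS[b] for a, b in zip(seq, seq[1:])]


def decode_colors(first_base, colors):
    """Rebuild the base sequence from a known first base and the color calls."""
    bases = [first_base]
    for color in colors:
        # XOR is its own inverse, so the previous base plus the color gives the next base.
        bases.append(BITS_TO_BASE[BASE_TO_BITS[bases[-1]] ^ color])
    return "".join(bases)
```

  • Because every base contributes to two adjacent colors under this kind of code, each template base is effectively interrogated twice, which is consistent with the accuracy benefit described above.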
  • nanopore sequencing is employed (see, e.g., Astier et al, J. Am. Chem. Soc. 2006 Feb 8; 128(5): 1705-10, which is incorporated by reference).
  • the theory behind nanopore sequencing has to do with what occurs when a nanopore is immersed in a conducting fluid and a potential (voltage) is applied across it. Under these conditions a slight electric current due to conduction of ions through the nanopore can be observed, and the amount of current is exceedingly sensitive to the size of the nanopore.
  • the Ion Torrent technology is a method of DNA sequencing based on the detection of hydrogen ions that are released during the polymerization of DNA (see, e.g., Science 327(5970): 1190 (2010); U.S. Pat. Appl. Pub. Nos. 20090026082, 20090127589, 20100301398, 20100197507, 20100188073, and 20100137143, which are incorporated herein by reference in their entireties).
  • a microwell contains a template DNA strand to be sequenced. Beneath the layer of microwells is a hypersensitive ISFET ion sensor. All layers are contained within a CMOS semiconductor chip, similar to that used in the electronics industry.
  • Certain aspects of the instant disclosure feature use of dimensionality reduction analyses, applied to an assortment of (i) macromolecule abundance data; and (ii) associated array element identification information (e.g., barcodes that identify position within an array), also employing diffusion data (e.g., diffusion patterns), to generate latent space representations of bound macromolecules of a biological sample.
  • Exemplified dimensionality reduction analyses include the unsupervised dimensionality reduction analyses Uniform Manifold Approximation and Projection (UMAP) reduction, t-distributed stochastic neighbor embedding (t-SNE) reduction, multidimensional scaling (MDS) reduction, and variational autoencoders, though other dimensionality reduction analyses known in the art may be employed (additionally or alternatively).
  • dimensionality reduction analysis involves performing non-linear cell trajectory reconstruction on latent space to construct an inferred maximum likelihood progression trajectory between a first phenotypic state and a second phenotypic state.
  • performing non-linear cell trajectory reconstruction involves applying a reverse graph embedding algorithm to the latent space.
  • the present disclosure also provides computer systems that are programmed to implement methods of the disclosure.
  • a computer system is programmed or otherwise configured to, for example: generate a latent space representation of the spatial distribution of macromolecule abundance.
  • the computer system can regulate various aspects of methods and systems of the present disclosure, such as, for example, generating a latent space representation of macromolecule abundance.
  • Methods and systems of the present disclosure can be implemented by way of one or more algorithms.
  • An algorithm can be implemented by way of software upon execution by a central processing unit. The algorithm can, for example, generate a latent space representation of macromolecule abundance.
  • the instructions comprise a description of how to create a tissue cryosection, form a spatially-defined (or simply spatially definable, pending performance of a step that defines the spatial resolution of the bead array) bead array, contact a tissue cryosection with a spatially-defined bead array and/or obtain captured, tissue cryosection-derived transcript sequence from the spatially-defined bead array.
  • the kit may further comprise a description of selecting an individual suitable for treatment based on identifying whether that subject has a certain pattern of expression of one or more transcripts in a cryosection sample.
  • the instructions generally include information as to dosage, dosing schedule, and route of administration for the intended use/treatment.
  • Instructions supplied in the kits of the instant disclosure are typically written instructions on a label or package insert (e.g., a paper sheet included in the kit), but machine-readable instructions (e.g., instructions carried on a magnetic or optical storage disk) are also acceptable.
  • the label or package insert indicates that the composition is used for staging a cryosection and/or diagnosing a specific expression pattern in a cryosection. Instructions may be provided for practicing any of the methods described herein.
  • kits of this disclosure are in suitable packaging.
  • suitable packaging includes, but is not limited to, vials, bottles, jars, flexible packaging (e.g., sealed Mylar or plastic bags), and the like.
  • the container may further comprise a pharmaceutically active agent.
  • Kits may optionally provide additional components such as buffers and interpretive information. Normally, the kit comprises a container and a label or package insert(s) on or associated with the container.
  • the practice of the present disclosure employs, unless otherwise indicated, conventional techniques of chemistry, molecular biology, microbiology, recombinant DNA, genetics, immunology, cell biology, cell culture and transgenic biology, which are within the skill of the art. See, e.g., Maniatis et al., 1982, Molecular Cloning (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.); Sambrook et al., 1989, Molecular Cloning, 2nd Ed.
  • Mouse brain samples were obtained following guidelines in accordance with the U.S. National Institutes of Health Guide for the Care and Use of Laboratory Animals under protocol number 0120-09-16 and approved by the Broad Institutional Animal Care and Use Committee. Wild-type C57BL/6 mice, maintained on a 12-hour light/dark cycle, were anesthetized by administration of isoflurane in a gas chamber flowing 3% isoflurane for 1 minute. Blood was cleared from the brain using transcardial perfusion with a chilled pH 7.4 HEPES buffer (110 mM NaCl, 10 mM HEPES, 25 mM glucose, 75 mM sucrose, 7.5 mM MgCl2, 2.5 mM KCl). The brain was removed, frozen in liquid nitrogen vapor for 3 minutes and stored at -80 °C.
  • the C57BL/6 mouse embryo at P1 was purchased from Zyagen™. The sample was stored at -80 °C before use.
  • a 12 µm thick section from the frozen P1 mouse sample was mounted onto a glass slide.
  • the Leica™ ST5010 Autostainer XL (Leica Biosystems™)
  • hematoxylin and eosin (H&E) staining. Sections were immersed in xylene, sequentially processed through 100% and 95% ethanol series, and then stained with hematoxylin. Eosin staining was applied and the section was again processed through 100% and 95% ethanol series, xylene, dehydrated, and covered using the Leica™ CB6030 Fully Automated Glass Coverslipper.
  • the slide was imaged with Leica Aperio VERSA Brightfield, Fluorescence & FISH Digital Pathology Scanner.
  • Bead barcodes were synthesized as in Slide-seqV2 (ref. 17). Sequences of beads used in reconstruction for "Slide-seq" and "Slide-tags":
  • the coverslip gasket filled with beads was centrifuged at 850 g for at least 30 min at 40 °C until the surface was dry, for the beads to stick to glass coverslips sprayed with Gorilla Glue® and Plastic Dip®. Excess beads were washed off to form a monolayer bead array. Pucks were sequenced using a sequencing-by-ligation approach with a monobase-encoding strategy.
  • arrays were incubated in RT solution (115 µl water, 40 µl Maxima 5× RT buffer (Thermo Fisher®, EP0751), 20 µl of 10 mM dNTPs (NEB®, N0477L), 5 µl RNase inhibitor (Lucigen®, 30281), 10 µl of 50 µM template switch oligonucleotide (Qiagen®,
  • the bead array was then placed under an ultraviolet (365 nm) light source (0.42 mW mm⁻², Thorlabs, M365LP1-C5, Thorlabs, LEDD1B) for 2 min (Slide-seq for mouse hippocampus section) or 5 s (Slide-seq for P1 mouse sample, as the capture beads used were also photocleavable and so were cleaved for less time). Then the bead array was incubated at room temperature for 10 min for cleaved oligonucleotides to diffuse.
  • After dipping the bead array into 1 mL of diffusion buffer to wash out free oligonucleotides, the bead array was put into a 1.5 mL centrifuge tube containing 200 µL extension buffer (1× NEBuffer 2, 1 mM dNTP, 25 units Klenow exo- (NEB M0212L)). The bead array was incubated at 37 °C for 1 h. Then Slide-seq V2 was continued from adding the tissue clearing buffer to cDNA PCR on the dissociated beads.
  • a total of 200 µl tissue digestion buffer (200 mM Tris-Cl pH 8, 400 mM NaCl, 4% SDS, 10 mM EDTA and 32 U ml⁻¹ proteinase K (NEB®, P8107S)) was added directly to the Klenow extension solution, and the mixture was incubated at 37 °C for 30 min. Beads were pipetted up and down to detach from the surface. Then, 200 µl wash buffer (10 mM Tris pH 8.0, 1 mM EDTA and 0.01% Tween-20) was added to the 400 µl tissue clearing and RT solution mix, and the tube was centrifuged for 3 min at 3,000 g. The supernatant was then removed from the bead pellet, and the beads were resuspended in 200 µl wash buffer and centrifuged again. This was repeated a total of three times.
  • the beads were then pelleted and resuspended in library PCR mix (22 µl water, 25 µl of Terra Direct PCR mix buffer (Takara® Biosciences, 639270), 1 µl Terra polymerase (Takara® Biosciences, 639270), 1 µl of 100 µM TruSeq® PCR handle primer (IDT):
  • CTACACGACGCTCTTCCGATCT (SEQ ID NO: 6) and 1 µl of 100 µM SMART® PCR primer (IDT)): AAGCAGTGGTATCAACGCAGAGT (SEQ ID NO: 7), and PCR was performed according to the following program: 95 °C for 3 min; four cycles of 98 °C for 20 s, 65 °C for 45 s and 72 °C for 3 min; nine cycles of 98 °C for 20 s, 67 °C for 20 s and 72 °C for 3 min; 72 °C for 5 min; hold at 4 °C. After cDNA PCR, the beads were spun down at 3,000 RCF for 2 min.
  • PCR mix included: 1× KAPA (Roche KK2612), 100 nM P5-TruSeq® Read 1 primer 5’-
  • the diffusion and extension steps were performed before doing Slide-tags. Similar to the process in reconstruction with Slide-seq, the bead array was immersed in diffusion buffer and exposed to an ultraviolet (365 nm) light source for 5 s, incubated at room temperature for 10 min, dipped in 1 mL diffusion buffer, put in 200 µL extension buffer as above and incubated at 37 °C for 1 h. After extension, the bead array was dried, and used for a Slide-tags protocol until nuclei extraction. Fresh frozen tissues were cryosectioned to 20 µm and then placed onto the array.
  • the array was placed onto the glass slide in 6-10 µl dissociation buffer (82 mM Na2SO4, 30 mM K2SO4, 10 mM glucose, 10 mM HEPES, 5 mM MgCl2) and exposed to an ultraviolet (365 nm) light source (0.42 mW mm⁻², Thorlabs, M365LP1-C5, Thorlabs, LEDD1B) for 30 s. After photocleavage, the puck was incubated for 7.5 min for the oligos to tag nuclei. Nuclei were extracted and loaded into the 10x Genomics® Chromium® controller using the Chromium® Next GEM Single Cell 3' Kit v3.1 (10x Genomics, PN-1000268).
  • Fiducial and capture beads were simulated as uniformly distributed within a circle, with each bead’s color determined by its location in the image of pattern ‘H’ or in a two-dimensional color gradient.
  • the diffusion of a fixed number of barcodes, described by unique molecular identifiers (UMIs), from each fiducial bead was assumed to follow a Gaussian distribution.
  • $F_{ij}$, the number of copies of fiducial bead barcode $j$ captured by capture bead $i$, follows a binomial distribution with the probability determined by the distance between beads:
  • $P_{ij} = C \exp\left(-\frac{d_{ij}^{2}}{2\sigma^{2}}\right)$, where $d_{ij}$ is the Euclidean distance between the fiducial and capture beads, $\sigma$ is the standard deviation in the Gaussian distribution, and $C$ is for normalization.
  • the diffusion based pairwise count matrix was then generated for reconstruction.
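  • The simulation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used for the reported results: the capture probability is assumed to decay as a Gaussian of the bead-to-bead distance, the normalization is assumed to make the probabilities for each fiducial bead sum to one over all capture beads (so every UMI lands on some capture bead), and each UMI is drawn independently, which gives each entry a binomial marginal. The function names and default values are hypothetical.

```python
import math
import random


def sample_in_circle(n, radius=1.0, seed=0):
    """Draw n points uniformly from a disc of the given radius."""
    rng = random.Random(seed)
    points = []
    for _ in range(n):
        r = radius * math.sqrt(rng.random())  # sqrt keeps the area density uniform
        theta = 2.0 * math.pi * rng.random()
        points.append((r * math.cos(theta), r * math.sin(theta)))
    return points


def simulate_diffusion_counts(fiducial_xy, capture_xy, umis_per_fiducial, sigma, seed=0):
    """Return counts[i][j]: UMIs of fiducial bead j captured by capture bead i."""
    rng = random.Random(seed)
    n_capture = len(capture_xy)
    counts = [[0] * len(fiducial_xy) for _ in range(n_capture)]
    for j, (fx, fy) in enumerate(fiducial_xy):
        # Unnormalised Gaussian weight of every capture bead as seen from fiducial bead j.
        weights = [
            math.exp(-((cx - fx) ** 2 + (cy - fy) ** 2) / (2.0 * sigma ** 2))
            for cx, cy in capture_xy
        ]
        if sum(weights) == 0.0:
            continue  # all capture beads are too far away to receive any UMI
        # random.choices normalises the weights, playing the role of the constant C.
        for i in rng.choices(range(n_capture), weights=weights, k=umis_per_fiducial):
            counts[i][j] += 1
    return counts
```

  • The resulting capture-by-fiducial matrix has the same shape as the pairwise count matrix used for reconstruction, so it can serve as synthetic input when testing a reconstruction pipeline.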
  • reads were initially filtered out if their constant sequences had Hamming distances greater than 3 when compared to the universal primer sequence. From the remaining reads, the capture bead barcode, capture bead UMI, fiducial bead barcode, and fiducial bead UMI were extracted. To determine the read threshold for reliable bead barcodes, rank plots were generated for both capture bead barcodes and fiducial bead barcodes. Barcodes above the read threshold were collapsed with a Hamming distance of 1, resulting in a whitelist of barcodes. Barcodes with reads below the threshold were matched to the whitelist with a Hamming distance of 1.
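  • The barcode filtering and collapsing steps above can be sketched as follows. This is one possible reading of the procedure rather than the actual pipeline: the greedy, abundance-ordered collapsing, the treatment of barcodes exactly at the threshold as reliable, and the dropping of ambiguous below-threshold barcodes are all assumptions, and the function names are hypothetical.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def passes_primer_filter(constant_seq, universal_primer, max_mismatch=3):
    """Keep a read only if its constant region is within max_mismatch of the primer."""
    return hamming(constant_seq, universal_primer) <= max_mismatch


def build_whitelist(read_counts, read_threshold, max_dist=1):
    """Collapse reliable barcodes, then rescue low-count barcodes onto the whitelist.

    read_counts maps barcode -> number of reads. Returns (whitelist, assignment),
    where whitelist maps representative barcode -> pooled reads and assignment
    maps every retained barcode to its representative.
    """
    # Visit barcodes from most to least abundant so that each cluster is
    # named after its dominant member.
    ranked = sorted(read_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    whitelist, assignment, low_count = {}, {}, []
    for barcode, n in ranked:
        if n < read_threshold:
            low_count.append((barcode, n))
            continue
        parent = next((w for w in whitelist if hamming(w, barcode) <= max_dist), None)
        if parent is None:
            whitelist[barcode] = n
            assignment[barcode] = barcode
        else:
            whitelist[parent] += n
            assignment[barcode] = parent
    for barcode, n in low_count:
        matches = [w for w in whitelist if hamming(w, barcode) <= max_dist]
        if len(matches) == 1:  # assumption: ambiguous or unmatched barcodes are dropped
            whitelist[matches[0]] += n
            assignment[barcode] = matches[0]
    return whitelist, assignment
```

  • The pairwise comparison is quadratic in the number of barcodes, which is acceptable for a sketch; a production pipeline would more likely index barcodes by their one-mismatch neighbors.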
  • Diffusion matrices were used as input for Uniform Manifold Approximation and Projection (UMAP) to reduce them to a two-dimensional space. Coordinates in the two-dimensional space were directly used as reconstructed locations. The UMAP parameters that were tuned were: larger n_neighbors and larger min_dist for a uniform distribution of beads, larger n_epochs for convergence, and the cosine metric, which can better represent the high-dimensional distance from the diffusion matrix. With experimental data, it was also found that a log1p transformation of the diffusion matrix improved reconstruction accuracy.
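  • To make the preprocessing concrete, the sketch below applies a log1p transform to a small count matrix and computes the pairwise cosine distances that a cosine-metric embedding would operate on. It does not perform the UMAP embedding itself, and the function names are hypothetical. The source does not name a UMAP implementation; with the umap-learn package, for example, one would typically pass the transformed matrix with `metric="cosine"` instead of computing distances by hand.

```python
import math


def log1p_rows(counts):
    """Apply log(1 + x) to every entry, damping the influence of very high counts."""
    return [[math.log1p(c) for c in row] for row in counts]


def cosine_distance_matrix(rows):
    """Pairwise cosine distance (1 - cosine similarity) between row vectors."""
    norms = [math.sqrt(sum(v * v for v in row)) for row in rows]
    n = len(rows)
    dist = [[0.0] * n for _ in range(n)]
    for a in range(n):
        for b in range(a + 1, n):
            if norms[a] == 0.0 or norms[b] == 0.0:
                d = 1.0  # assumption: an all-zero bead is treated as maximally dissimilar
            else:
                dot = sum(x * y for x, y in zip(rows[a], rows[b]))
                d = 1.0 - dot / (norms[a] * norms[b])
            dist[a][b] = dist[b][a] = max(d, 0.0)  # clamp tiny negative rounding errors
    return dist
```

  • Beads that share many fiducial barcodes in similar proportions end up close in cosine distance regardless of their total UMI counts, which is one plausible reason this metric suits diffusion matrices.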
  • UMAP Uniform Manifold Approximation and Projection
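The preprocessing and embedding step could look roughly like the sketch below. The two helper functions use only the standard library; `reconstruct_with_umap` assumes the third-party umap-learn package is installed, and its default parameter values are placeholders chosen for illustration, not the values used in the disclosure.

```python
import math


def log1p_matrix(counts):
    """Apply log(1 + x) to every entry of the count matrix."""
    return [[math.log1p(c) for c in row] for row in counts]


def cosine_distance(u, v):
    """1 - cosine similarity; defined as 1.0 when either row is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def reconstruct_with_umap(counts, n_neighbors=25, min_dist=0.99, n_epochs=1000):
    """Embed capture beads in 2-D. Needs the third-party umap-learn package.

    The default values are placeholders, not the parameters used in the source.
    """
    import umap  # imported lazily so the helpers above work without it

    reducer = umap.UMAP(
        n_components=2,
        metric="cosine",
        n_neighbors=n_neighbors,
        min_dist=min_dist,
        n_epochs=n_epochs,
    )
    return reducer.fit_transform(log1p_matrix(counts))
```

`cosine_distance` is included only to show what the cosine metric compares: two capture beads are close when their fiducial-barcode count profiles point in the same direction, regardless of total UMI depth.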
  • the UMAP computation was expedited through the use of 24 parallel threads. The parameters for this 1.2 cm sample were changed because the diffusion distance (σ) was the same while the overall array size increased, which means that the relative connectivity decreased and the minimum distance between beads decreased.
  • the diffusion distribution of a capture bead barcode was represented by the position of its associated fiducial bead barcodes, which were color-coded according to the conjugation UMI count.
  • the diffusion distributions of capture beads with high total UMI counts (first 3000 beads ranked by total UMI counts) along the X axis were fitted using Kernel Density Estimation (KDE) and then averaged to derive the ensemble KDE diffusion distribution.
  • KDE Kernel Density Estimation
  • the empirical Full Width at Half Maximum (FWHM) of the diffusion distribution was calculated based on the ensemble KDE.
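As a sketch of how an ensemble KDE and its FWHM might be computed, the following uses a fixed-bandwidth Gaussian kernel evaluated on a regular grid. The bandwidth, the grid, and the function names are assumptions for illustration; the source does not specify the KDE settings.

```python
import math


def gaussian_kde(samples, bandwidth, grid):
    """Evaluate a 1-D Gaussian kernel density estimate on the grid points."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)
        for x in grid
    ]


def ensemble_kde(per_bead_samples, bandwidth, grid):
    """Average the per-bead KDE curves into one ensemble curve."""
    curves = [gaussian_kde(s, bandwidth, grid) for s in per_bead_samples]
    return [sum(column) / len(curves) for column in zip(*curves)]


def fwhm(grid, density):
    """Width of the main peak at half its maximum, with linear interpolation."""
    half = max(density) / 2.0
    peak = density.index(max(density))

    def crossing(below, above):
        # Interpolate between a point below half-max and its neighbour above it.
        frac = (half - density[below]) / (density[above] - density[below])
        return grid[below] + frac * (grid[above] - grid[below])

    left = peak
    while left > 0 and density[left - 1] >= half:
        left -= 1
    right = peak
    while right < len(grid) - 1 and density[right + 1] >= half:
        right += 1
    x_left = grid[0] if left == 0 else crossing(left - 1, left)
    x_right = grid[-1] if right == len(grid) - 1 else crossing(right + 1, right)
    return x_right - x_left
```

For a purely Gaussian curve with standard deviation σ, this returns approximately 2.355σ, which is a convenient sanity check before applying it to heavy-tailed empirical distributions.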
  • Atp2b1 The spatial distribution of Atp2b1 was profiled in both reconstruction and ground truth. A line perpendicular to the expression pattern of Atp2b1 in the CA1 region was drawn to characterize the expression density of Atp2b1 along this line. The widths of the CA1 region were then determined by the Full Width at Half Maximum (FWHM) of the Atp2b1 expression distribution. This analysis of the CA1 width was conducted across three biological replicates.
  • FWHM Full Width at Half Maximum
  • nuclei locations were first transformed from reconstruction according to the registration between bead locations in reconstruction and ground truth. Nuclei locations from the reconstruction and the ground truth were directly compared to calculate the absolute error. For the analysis of the measurement length error, the pairwise distances between nuclei were examined, employing the same methodology used for the measurement length error of beads.
  • the UMAP reconstruction code was optimized by performing multiple parameter sweeps to settle on the set of parameters (n_neighbors, min_dist, local_connectivity, etc.) that yielded the best results and consistency across multiple runs.
  • a Leiden initialization step was performed. Drawing on the finding that a spatial barcode counts matrix can be Leiden clustered into spatially resolved clusters (Liao et al.; DOI: 10.1101/2024.08.06.606834), the “Leiden initialization” algorithm was implemented, in which Leiden clustering partitions the counts matrix into discrete clusters; those clusters are then used to build a separate matrix (with dimensions cluster by cluster) that counts the edges of all the beads within a cluster that connect to beads outside of that cluster.
  • UMAP of this smaller matrix gives an initial embedding in which each bead is in the correct position relative to the beads of other clusters, so UMAP converges in significantly fewer steps and more reliably (to a local optimum that matches the known shape of a circle).
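The cluster-level bookkeeping in the Leiden initialization can be sketched as follows. The sketch assumes cluster labels and a 2-D embedding of the cluster-by-cluster matrix are already available (the Leiden clustering and UMAP calls themselves are not shown), and the function names and the jitter parameter are hypothetical.

```python
import random


def cluster_edge_matrix(edges, labels, n_clusters):
    """Count edge weight between every pair of distinct clusters.

    edges:  iterable of (bead_a, bead_b, weight) from the bead-level graph
    labels: dict mapping bead id -> cluster index in range(n_clusters)
    """
    matrix = [[0] * n_clusters for _ in range(n_clusters)]
    for bead_a, bead_b, weight in edges:
        ca, cb = labels[bead_a], labels[bead_b]
        if ca != cb:  # only edges that leave the cluster are counted
            matrix[ca][cb] += weight
            matrix[cb][ca] += weight
    return matrix


def initial_bead_positions(cluster_xy, labels, jitter, seed=0):
    """Place each bead at its cluster's embedded position plus small jitter.

    cluster_xy: list of (x, y), one per cluster, e.g. from embedding the
    cluster-by-cluster matrix. The jitter keeps beads from coinciding.
    """
    rng = random.Random(seed)
    init = {}
    for bead, cluster in labels.items():
        cx, cy = cluster_xy[cluster]
        init[bead] = (
            cx + rng.uniform(-jitter, jitter),
            cy + rng.uniform(-jitter, jitter),
        )
    return init
```

The resulting per-bead coordinates would then serve as the starting layout for the full-resolution embedding, in place of a random or spectral initialization.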
  • tissue sample size of tissue sample to which the arrays and processes of the current disclosure can be successfully applied.
  • the current disclosure can be applied to significantly larger tissue samples than previously exemplified - e.g., tissue sample sections that are 5 cm across, 7 cm across, 10 cm across, or even larger, can be readily imaged for macromolecule abundance using the compositions and methods disclosed herein.
  • a 7 cm puck array, used to contact a 7 cm or larger tissue section
  • Successful imaging of the 7 cm puck was accomplished by making certain changes to the spatial reconstruction process, including rewriting of the KNN code.
  • compositions and methods were therefore applied to significantly larger tissue sections than previously exemplified 1.2 cm pucks, with imaging of pucks of 5 cm or more diameter, 7 cm or more diameter, 8 cm or more diameter, 9 cm or more diameter, 10 cm or more diameter, 11 cm or more diameter, 12 cm or more diameter, 13 cm or more diameter, 14 cm or more diameter, 15 cm or more diameter, 16 cm or more diameter, or larger, now demonstrated.
  • a 1.2 cm circular bead array was used to profile the spatial transcriptomics of the P1 mouse section, although the tissue covered only a portion of the bead array. To differentiate between the tissue-covered and uncovered regions of the bead array, segmentation was performed based on the UMI (Unique Molecular Identifier) count per bead. Kernel Density Estimation (KDE) was employed to estimate the UMI count density across the array, and a threshold for UMI counts was established. Only the beads covered by tissue were retained for further analysis, to save computational memory.
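One way to read the segmentation step is as a UMI-weighted spatial kernel density evaluated at each bead, followed by a cutoff. The sketch below follows that reading; it is an assumption, as are the bandwidth, the threshold, and the function names, and the quadratic loop is only practical for small examples.

```python
import math


def umi_density(coords, umis, bandwidth):
    """UMI-weighted Gaussian kernel density evaluated at every bead position."""
    densities = []
    for xi, yi in coords:
        total = 0.0
        for (xj, yj), u in zip(coords, umis):
            d2 = (xi - xj) ** 2 + (yi - yj) ** 2
            total += u * math.exp(-d2 / (2.0 * bandwidth ** 2))
        densities.append(total)
    return densities


def tissue_mask(coords, umis, bandwidth, threshold):
    """True for beads whose local UMI density reaches the threshold."""
    return [d >= threshold for d in umi_density(coords, umis, bandwidth)]


def keep_tissue_beads(coords, umis, bandwidth, threshold):
    """Return only the (coordinate, UMI count) pairs that lie under tissue."""
    mask = tissue_mask(coords, umis, bandwidth, threshold)
    return [(c, u) for c, u, keep in zip(coords, umis, mask) if keep]
```

Smoothing before thresholding means an isolated low-count bead inside the tissue is kept, while an isolated high-count bead outside it is less likely to be.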
  • UMI Unique Molecular Identifier
  • Neuronal cell types including CNS neurons, neural crest and PNS neurons, olfactory sensory neurons, and intermediate neuronal progenitors, were isolated and analyzed again with UMAP embedding and unsupervised clustering. Highly variable genes were found for each subcluster. Olfactory epithelium enriched genes were listed by calculating the ratio of the mean expression level in the olfactory epithelium region to that in the entire section.
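The enrichment score described above is a ratio of two means, which can be written down directly. In this sketch the gene names in the usage are placeholders, and genes with no expression anywhere (or an empty region) are skipped rather than scored; that handling is an assumption.

```python
def enrichment_ratio(expression, in_region):
    """Mean expression inside the region divided by the mean over all beads.

    expression: per-bead expression values for one gene
    in_region:  per-bead booleans marking the region of interest
    Returns None when the ratio is undefined.
    """
    region_values = [e for e, flag in zip(expression, in_region) if flag]
    overall_mean = sum(expression) / len(expression)
    if not region_values or overall_mean == 0:
        return None
    return (sum(region_values) / len(region_values)) / overall_mean


def rank_enriched_genes(expression_by_gene, in_region):
    """Sort genes from most to least enriched, dropping undefined ratios."""
    scored = [
        (gene, enrichment_ratio(values, in_region))
        for gene, values in expression_by_gene.items()
    ]
    scored = [(gene, ratio) for gene, ratio in scored if ratio is not None]
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

A ratio above 1 indicates higher average expression in the region than in the section as a whole.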
  • Slide-seq is a high resolution spatial transcriptomic approach that utilizes arrays of barcoded beads (10-micron polystyrene beads) for spatial capture of macromolecules (e.g., RNAs).
  • barcoded beads 10-micron polystyrene beads
  • macromolecules e.g., RNAs.
  • the "Slide-seq" approach has been implemented with inclusion of an imaging step designed to detect the sequence and position of individual array elements (e.g., via barcoding of, e.g., clusters of oligonucleotides, probes and/or beads), thereby allowing for immediate placement of macromolecule abundance data at a position within a surveyed array.
  • this imaging/position detection step can be expensive and laborious for an end-user to implement.
  • capture and fiducial beads were uniformly sampled from a circular area with color pattern of a letter H (FIG. 1B, FIG. 3A).
  • the diffusion of barcodes from each fiducial bead was simulated to follow a Gaussian distribution (FIG. 3B). Due to this diffusion, capture beads exhibited proximity-dependent capture of barcodes from fiducial beads, generating a neighboring matrix between bead barcodes (FIG. 1A). Specifically, capture beads registered higher count values for barcodes from fiducial beads that were closer, while distant fiducial beads were associated with zero counts.
  • Example 3 Implementing Spatial Transcriptomics through Computational Reconstruction
  • the array mixed the original barcoded poly(dT) beads for capturing mRNA with barcoded poly(dA) fiducial beads to enable diffusion-based reconstruction (FIG. 1A, FIG. 1D, FIG. 9, FIG. 14A, FIG. 14B).
  • in situ sequencing was performed to spatially index the array, using the standard approach as previously described, to generate ground truth positions before reconstruction.
  • the oligonucleotide barcodes on poly(dA) beads were cleaved with UV and were captured by nearby poly(dT) beads (capture beads) (FIG. 1D).
  • fluorescence beads were cleaved with UV and were captured by nearby poly(dT) beads (capture beads) (FIG. 1D).
  • ground truth positions of the same array were generated before reconstruction using in situ sequencing.
  • the distribution of capture bead barcodes on fiducial bead barcodes followed a heavy-tailed distribution, with the full width at half maximum (FWHM) around 123.1 µm (FIG. 1E).
  • the intuition here is that most commonly, the pairwise distance measurements (e.g., length, neighborhoods, spatial proximity) are being quantified, and thus, the error of pairwise distance measurements are taken into consideration.
  • the root-mean-square (RMS) error of length measurements was quantified across all pairwise length measurements as a function of measurement length. RMS error was close to 10 µm (the bead size) at local scale measurements (about 100 µm), as nearby beads were usually displaced in the same direction, and plateaued at ~20 µm beyond 1000 µm (representing ~2% error in measurement lengths) (FIG. 4E).
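The length-error metric can be illustrated with a short sketch that bins all bead pairs by their ground-truth distance and reports the RMS difference between reconstructed and true distances in each bin. The bin edges and function name are assumptions; the reconstruction is assumed to be already registered to the ground truth.

```python
import bisect
import itertools
import math


def rms_length_error_by_bin(truth, recon, bin_edges):
    """RMS of (reconstructed - true) pairwise distance, binned by true distance.

    truth, recon: lists of (x, y) for the same beads, in the same order
    bin_edges:    increasing edges; bin k covers [edges[k], edges[k + 1])
    Returns one value per bin, or None for bins without any bead pair.
    """
    n_bins = len(bin_edges) - 1
    squared_sum = [0.0] * n_bins
    n_pairs = [0] * n_bins
    for a, b in itertools.combinations(range(len(truth)), 2):
        d_true = math.dist(truth[a], truth[b])
        d_recon = math.dist(recon[a], recon[b])
        k = bisect.bisect_right(bin_edges, d_true) - 1
        if 0 <= k < n_bins:
            squared_sum[k] += (d_recon - d_true) ** 2
            n_pairs[k] += 1
    return [
        math.sqrt(s / n) if n else None
        for s, n in zip(squared_sum, n_pairs)
    ]
```

Shifting every bead by the same offset leaves all pairwise distances unchanged, so this metric reports zero error in that case; that is the sense in which displacement of neighbouring beads in the same direction contributes little to local length error.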
  • Example 5 Computational Reconstruction Enabled Spatial Transcriptomics At Large Scale
  • the instantly disclosed reconstruction technique is purely performed through molecular biology reactions, and, thus, is not limited by imaging throughput.
  • reconstruction on a 1.2-centimeter P1 mouse cranial section was performed with "Slide-seq" adapted to use the current reconstruction techniques.
  • Transcriptomes were spatially profiled across different tissue types, including brain, muscle, and the upper respiratory system, with a single section (FIG. 2E, FIGs. 20A-20B).
  • FIG. 2E When compared with hematoxylin and eosin staining of an adjacent section, reconstruction successfully identified the compartmentalization of different tissues and elucidated fine structural details (FIG. 2E, FIGs. 20A-20B).
  • Decomposed cell types were assigned to each bead with robust cell type decomposition (RCTD; FIG. 2F).
  • RCTD cell type decomposition
  • FIG. 2F To assess the fidelity of reconstructed tissue structure, the spatial distribution of certain cell types that exhibit unique spatial localization were represented: adipocytes around anterior cervical region; neuronal cells in central nervous systems (CNS), peripheral nervous systems (PNS), and olfactory sensory region (FIG. 2G, FIG. 21). The locations of these cell types were highly correlated with the spatial expression pattern of their marker genes (FIG. 2H, FIG. 21).
  • fibroblasts and osteoblasts corresponded with the distribution of type I collagen, which is abundantly present in tendons and bones.
  • beads that were assigned with neuronal cell types were gathered and subjected to further clustering. Such subclustering revealed distinctions between CNS, PNS, olfactory neurons, which were highly correlated with the expression pattern of their respective marker genes. Cortical neurons were found with this higher resolution of clustering (FIGs. 2I-2J).
  • the relative error in length measurements is particularly relevant because most spatial analyses depend on distance measurements, such as quantifying intercellular distances, defining cellular neighborhoods and identifying spatially varying gene expression.
  • empirical analyses of biological structures, such as the mouse hippocampus CA1 region, confirm that locally concordant errors have negligible effects on measurements of local structures or cellular neighborhoods. Though minor uncertainties may arise in absolute registration across sections, they are unlikely to affect overall conclusions.
  • Processes described herein may be performed singly or collectively by one or more computer systems, such as one or more computer system(s) executing software to perform spatial assessment of macromolecule abundance (e.g., RNA expression, DNA abundance, protein abundance) according to the techniques described herein.
  • FIG. 25 depicts an example of a computer system and associated devices to perform spatial assessment of macromolecule abundance according to the techniques described herein.
  • a computer system may also be referred to herein as a data processing device/system, computing device/system/node, or simply a computer.
  • the computer system may be based on one or more of various system architectures and/or instruction set architectures, such as those offered by Intel Corporation (Santa Clara, California, USA) or Apple computer (Cupertino, CA) as examples.
  • FIG. 25 shows a computer system 100 in communication with external device(s) 112.
  • Computer system 100 includes one or more processor(s) 102, for instance central processing unit(s) (CPUs) or 103, for example a graphics processing unit(s) (GPUs).
  • a processor 102 or 103 can include functional components used in the execution of instructions, such as functional components to fetch program instructions from locations such as cache or main memory, decode program instructions, and execute program instructions, access memory for instruction execution, and write results of the executed instructions.
  • a processor 102 or 103 can also include register(s) to be used by one or more of the functional components.
  • Computer system 100 also includes memory 104, input/output (I/O) devices 108, and I/O interfaces 110, which may be coupled to processor(s) 102 and/or 103 and each other via one or more buses and/or other connections.
  • Bus connections represent one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures.
  • such architectures include the Industry Standard Architecture (ISA), the Micro Channel Architecture (MCA), the Enhanced ISA (EISA), the Video Electronics Standards Association (VESA) local bus, and the Peripheral Component Interconnect (PCI).
  • ISA Industry Standard Architecture
  • MCA Micro Channel Architecture
  • EISA Enhanced ISA
  • VESA Video Electronics Standards Association
  • PCI Peripheral Component Interconnect
  • Memory 104 can be or include main or system memory (e.g. Random Access Memory) used in the execution of program instructions, storage device(s) such as hard drive(s), flash media, or optical media as examples, and/or cache memory, as examples.
  • Memory 104 can include, for instance, a cache, such as a shared cache, which may be coupled to local caches (examples include L1 cache, L2 cache, etc.) of processor(s) 102.
  • memory 104 may be or include at least one computer program product having a set (e.g., at least one) of program modules, instructions, code or the like that is/are configured to carry out functions of embodiments described herein when executed by one or more processors.
  • Memory 104 can store an operating system 105 and other computer programs 106, such as one or more computer programs/applications that execute to perform aspects described herein.
  • programs/applications can include computer readable program instructions that may be configured to carry out functions of embodiments or aspects described herein.
  • I/O devices 108 include but are not limited to microphones, speakers, Global Positioning System (GPS) devices, cameras, lights, accelerometers, gyroscopes, magnetometers, sensor devices configured to sense light, proximity, heart rate, body and/or ambient temperature, blood pressure, and/or skin resistance, and activity monitors.
  • GPS Global Positioning System
  • An I/O device may be incorporated into the computer system as shown, though in some embodiments an I/O device may be regarded as an external device 112 coupled to the computer system through one or more I/O interfaces 110.
  • Computer system 100 may communicate with one or more external devices 112 via one or more I/O interfaces 110.
  • Example external devices include a keyboard, a pointing device, a display, and/or any other devices that enable a user to interact with computer system 100.
  • Other example external devices include any device that enables computer system 100 to communicate with one or more other computing systems or peripheral devices such as a printer.
  • a network interface/adapter is an example I/O interface that enables computer system 100 to communicate with one or more networks, such as a local area network (LAN), a general wide area network (WAN), and/or a public network (e.g., the Internet), providing communication with other computing devices or systems, storage devices, or the like.
  • Ethernet-based (such as Wi-Fi) interfaces and Bluetooth® adapters are just examples of the currently available types of network adapters used in computer systems.
  • Particular external device(s) 112 may include one or more data storage devices, which may store one or more programs, one or more computer readable program instructions, and/or data, etc.
  • Computer system 100 may include and/or be coupled to and in communication with (e.g. as an external device of the computer system) removable/non-removable, volatile/non-volatile computer system storage media.
  • a non-removable, nonvolatile magnetic media typically called a “hard drive”
  • a magnetic disk drive for reading from and writing to a removable, non-volatile magnetic disk (e.g., a “floppy disk”)
  • an optical disk drive for reading from or writing to a removable, non-volatile optical disk, such as a CD-ROM, DVD-ROM or other optical media.
  • aspects of the present invention may be a system, a method, and/or a computer program product, any of which may be configured to perform or facilitate aspects described herein.
  • aspects of the present invention may take the form of a computer program product, which may be embodied as computer readable medium(s).
  • a computer readable medium may be a tangible storage device/medium having computer readable program code/instructions stored thereon.
  • Example computer readable medium(s) include, but are not limited to, electronic, magnetic, optical, or semiconductor storage devices or systems, or any combination of the foregoing.
  • Example embodiments of a computer readable medium include a hard drive or other mass-storage device, an electrical connection having wires, random access memory (RAM), read-only memory (ROM), erasable-programmable read-only memory such as EPROM or flash memory, an optical fiber, a portable computer disk/diskette, such as a compact disc read-only memory (CD-ROM) or Digital Versatile Disc (DVD), an optical storage device, a magnetic storage device, or any combination of the foregoing.
  • the computer readable medium may be readable by a processor, processing unit, or the like, to obtain data (e.g. instructions) from the medium for execution.
  • a computer program product is or includes one or more computer readable media that includes/stores computer readable program code to provide and facilitate one or more aspects described herein.
  • program instruction contained or stored in/on a computer readable medium can be obtained and executed by any of various suitable components such as a processor of a computer system to cause the computer system to behave and function in a particular manner.
  • Such program instructions for carrying out operations to perform, achieve, or facilitate aspects described herein may be written in, or compiled from code written in, any desired programming language.
  • such programming language includes object-oriented and/or procedural programming languages such as C, C++, C#, Java, Perl, Python, etc.
  • Program code can include one or more program instructions obtained for execution by one or more processors.
  • Computer program instructions may be provided to one or more processors of, e.g., one or more computer systems, to produce a machine, such that the program instructions, when executed by the one or more processors, perform, achieve, or facilitate aspects of the present invention, such as actions or functions described in flowcharts and/or block diagrams described herein.
  • each block, or combinations of blocks, of the flowchart illustrations and/or block diagrams depicted and described herein can be implemented, in some embodiments, by computer program instructions.
  • a method for generating a spatial representation of macromolecule abundance from a sample comprising: (i) contacting first oligonucleotides bound to a solid support and present in a positional array with a sample under conditions suitable for first oligonucleotide-macromolecule binding, wherein the first oligonucleotides comprise: an array location identifier sequence that is common to all first oligonucleotides in a given element in the positional array; and a macromolecule-specific capture sequence; (ii) obtaining sequence information for a population of macromolecules bound to the first oligonucleotides and the array location identifier sequence of the respective first oligonucleotides bound to each of the macromolecules of the population of macromolecules for which sequence information is obtained; and (iii) generating a computational reconstruction of the spatial locations of the population of macromolecules bound to the first oligonucleotides from inputs minimally comprising the obtained sequence information for the array
  • (A2) For the method denoted as (A1), wherein the macromolecules are selected from the group consisting of RNA, DNA, protein, and combinations thereof.
  • RNA is a poly-A-tailed RNA, optionally a mRNA.
  • (A4) For the method denoted as any one of (A1) through (A3), wherein the macromolecule-specific capture sequence comprises a poly-dT tail of sufficient length to allow for capture of poly-A-tailed RNAs via hybridization.
  • (A5) For the method denoted as any one of (A1) through (A4), wherein the macromolecule-specific capture sequence comprises a gene-specific sequence or a transcript-specific sequence.
  • (A7) For the method denoted as any one of (A1) through (A6), wherein the macromolecule-specific capture sequence is a component of a loaded transposase.
  • (A8) For the method denoted as any one of (A1) through (A7), wherein the positional array possesses a resolution of 50 micrometers or less between individual elements of the positional array, optionally wherein the positional array possesses a resolution of 30 micrometers or less between individual elements of the positional array, optionally wherein the positional array possesses a resolution of 20 micrometers or less between individual elements of the positional array, optionally wherein the positional array possesses a resolution of 10 micrometers or less between individual elements of the positional array.
  • tissue sample is obtained from a tissue selected from the group consisting of brain, lung, liver, kidney, pancreas, and heart.
  • (A11) For the method denoted as any one of (A1) through (A10), wherein the sample is obtained from a mammal, optionally a human.
  • (A12) For the method denoted as any one of (A1) through (A11), wherein the sample is fixed, optionally wherein the tissue sample is fixed with paraffin, optionally wherein the tissue sample is fixed using formalin-fixation and paraffin embedding (FFPE).
  • FFPE formalin-fixation and paraffin embedding
  • (A14) For the method denoted as any one of (A1) through (A13), wherein the first oligonucleotides are bound to the solid support using a capture material, optionally wherein the capture material is applied as a liquid, optionally wherein the capture material is applied using a brush or aerosol spray, optionally wherein the capture material is a liquid electrical tape, optionally wherein the capture material dries to form a vinyl polymer, optionally wherein the vinyl polymer is polyvinyl hexane.
  • step (ii) comprises a next-generation sequencing approach, optionally wherein the next-generation sequencing approach is selected from the group consisting of solid-phase, reversible dye-terminator sequencing; massively parallel signature sequencing; pyrosequencing; sequencing-by-ligation; ion semiconductor sequencing; Nanopore sequencing; and DNA nanoball sequencing, optionally wherein the next-generation sequencing approach comprises solid-phase, reversible dye-terminator sequencing.
  • (A17) For the method denoted as any one of (A1) through (A16), wherein the macromolecules comprise a population of second oligonucleotides capable of binding to the first oligonucleotides, optionally wherein the second oligonucleotides are attached to a bead.
  • (A18) For the method denoted as any one of (A1) through (A17), wherein the generating of the computational reconstruction of the spatial locations of the population of macromolecules of step (iii) comprises performing Uniform Manifold Approximation and Projection (UMAP) reduction, t-distributed stochastic neighbor embedding (t-SNE) reduction, and/or multidimensional scaling (MDS) reduction, optionally wherein the generating of the computational reconstruction of the spatial locations of the population of macromolecules of step (iii) comprises performing Uniform Manifold Approximation and Projection (UMAP) reduction.
  • UMAP Uniform Manifold Approximation and Projection
  • t-SNE t-distributed stochastic neighbor embedding
  • MDS multidimensional scaling
  • the first oligonucleotides bound to the solid support and present in the positional array have a resolution of 100 micrometers or less between individual elements of the positional array.
  • a method for generating a spatial representation of mRNA abundance from a sample comprising: (i) contacting first oligonucleotides bound to a solid support and present in a positional array having a resolution of 100 micrometers or less between individual elements of the positional array with a sample, wherein the first oligonucleotides comprise: an array location identifier sequence that is common to all first oligonucleotides in a given element in the positional array; and a poly-dT tail of sufficient length to allow for capture of poly-A-tailed mRNAs via hybridization, under conditions suitable for oligonucleotide-mRNA hybridization; (ii) obtaining sequence information for a population of poly-A-tailed mRNAs bound to the first oligonucleotides and the array location identifier sequence of the respective first oligonucleotides bound to each of the poly-A-tailed mRNAs for which sequence information is obtained;
  • step (B2) For the method denoted as (B1), wherein the generating of the computational reconstruction of the spatial locations of the population of poly-A-tailed RNAs of step (iii) comprises performing a dimensionality reduction analysis.
  • poly-A-tailed RNAs comprise a population of second oligonucleotides capable of binding to the first oligonucleotides, optionally wherein the second oligonucleotides are attached to a bead.
  • (B4) For the method denoted as any one of (B1) through (B3), further comprising performing reverse transcription upon hybridized poly-A-tailed mRNAs immediately after hybridizing said poly-A-tailed mRNAs to the solid support-bound oligonucleotides, optionally performing reverse transcription before a digestion step is performed.
  • (C1) A method for generating a spatial representation of macromolecule abundance from a sample, the method comprising: (i) generating a well array having a plurality of wells, wherein each well of the well array can hold exactly one bead; (ii) depositing beads comprising macromolecule capture oligonucleotides into the wells of the well array, optionally depositing by evaporation in a centrifuge; (iii) brushing the well array to remove all of the beads not present in the wells; (iv) depositing the sample onto the well array and centrifuging, thereby forcing the biological sample into the wells of the well array; (v) adding a digestion buffer, thereby lysing the sample and causing the macromolecules of the sample to transfer onto the beads in the wells; (vi) obtaining sequence information for a population of macromolecules bound to the macromolecule capture oligonucleotides of the beads and an associated capture oligonucleotide bead identification sequence for each macro
  • (C3) For the method denoted as (C1) or (C2), wherein the macromolecules comprise a population of second oligonucleotides capable of binding to the macromolecule capture oligonucleotides, optionally wherein the second oligonucleotides are attached to a bead.
  • (C4) For the method denoted as any one of (C1) through (C3), further comprising performing reverse transcription upon the sample in the wells of the well array, optionally further comprising separating oligonucleotides from beads by sonication or by photocleavage.
  • (D1) A method for generating a spatial representation of macromolecule abundance from a sample, the method comprising: (i) adhering clusters of oligonucleotides in an array to a solid support; (ii) contacting the array with a tissue sample; (iii) obtaining sequence information for a population of macromolecules bound to the oligonucleotide clusters and a respective associated oligonucleotide cluster identification sequence for each macromolecule of the population of macromolecules for which sequence information is obtained; and (iv) generating a computational reconstruction of the spatial locations of the population of macromolecules bound to the clusters of oligonucleotides from inputs minimally comprising: the obtained oligonucleotide cluster identification sequences and molecular diffusion patterns of the macromolecules of the population of macromolecules for which sequence information is obtained, relative to the clusters of oligonucleotides present in the array.
  • step (D2) For the method denoted as (D1), wherein the generating of the computational reconstruction of the spatial locations of the population of macromolecules of step (iv) comprises performing a dimensionality reduction analysis.
  • (D3) For the method denoted as (D1) or (D2), wherein the macromolecules comprise a population of second oligonucleotides capable of binding to the clusters of oligonucleotides, optionally wherein the second oligonucleotides are attached to a bead.
  • (D4) For the method denoted as any one of (D1) through (D3), wherein the array comprises barcoded clusters of oligonucleotides on the solid support.
  • (D5) For the method denoted as any one of (D1) through (D4), wherein the obtaining step (iii) comprises performance of long-read sequencing.
  • (E1) A method for generating a spatial representation of macromolecule abundance from a tissue sample of a subject comprising: (i) obtaining the tissue sample from the subject; (ii) preparing a cryosection of the tissue sample and adhering said cryosection to a solid support; (iii) forming an array of barcoded oligonucleotide clusters and/or an array of beads attached to barcoded oligonucleotides and contacting the cryosection adhered to the solid support with the array; (iv) obtaining sequence information for a population of macromolecules bound to the array(s), wherein the sequence information comprises macromolecule identification information and associated positional identification information of the barcoded oligonucleotides; and (v) generating a computational reconstruction of the spatial locations of the population of macromolecules bound to the bead oligonucleotides from inputs minimally comprising: the obtained sequence information and molecular diffusion patterns of the macromolecules of the population of macromolecules for
  • (E3) For the method denoted as (E1) or (E2), wherein the macromolecules comprise a population of second oligonucleotides capable of binding to the barcoded oligonucleotides, optionally wherein the second oligonucleotides are attached to a bead.
  • (E4) For the method denoted as any one of (A1) through (A19), (B1) through (B5), (C1) through (C4), (D1) through (D5), or (E1) through (E3), wherein the array is physically transferred from one surface to another, optionally wherein a gel encasement is formed on top of the array, thereby allowing beads to be picked up off the surface of the array without altering bead positions relative to each other.
  • (E5) For the method denoted as any one of (A1) through (A19), (B1) through (B5), (C1) through (C4), (D1) through (D5), or (E1) through (E4), wherein the beads or array are used for capture of oligonucleotides.
  • (E6) For the method denoted as any one of (A1) through (A19), (B1) through (B5), (C1) through (C4), (D1) through (D5), or (E1) through (E5), wherein the beads or array comprise or bind oligonucleotide-conjugated antibodies.
  • (E7) For the method denoted as any one of (A1) through (A19), (B1) through (B5), (C1) through (C4), (D1) through (D5), or (E1) through (E6), wherein the beads or array comprise or bind nucleic acid hybridization probes.
  • nucleic acid hybridization probes comprise unique molecular identifiers (UMIs), optionally wherein the UMIs of the hybridization probes are counted via sequencing to assess the levels of hybridization probe-bound macromolecules, optionally wherein the hybridization probe-bound macromolecules are selected from the group consisting of proteins, exons, transcripts, nucleic acid sequences comprising single nucleotide polymorphisms (SNPs) and/or genomic regions.
  • (E12) For the method denoted as (E7), wherein the nucleic acid hybridization probes are released from the array or tissue, optionally wherein the nucleic acid hybridization probes are released from the array or tissue by a method selected from the group consisting of: (a) cleavage and/or degradation of a photolabile and/or photocleavable group; (b) T7 RNA polymerase transcription; (c) enzymatic cleavage, optionally RNAseH cleavage of bound RNA or RNAse cleavage of an RNA base in the hybridization probes; and/or (d) chemical cleavage, optionally disulfide cleavage.
  • (E13) For the method denoted as any one of (A1) through (A19), (B1) through (B5), (C1) through (C4), (D1) through (D5), or (E1) through (E12), wherein the beads or array possess primers capable of specific binding to a selection of one or more target transcripts, optionally wherein the one or more target transcripts are selected from the group consisting of T Cell receptor transcript sequences; transcripts of low-expressing proteins, optionally wherein the low-expressing proteins are transcription factors; and synthetic transcripts, optionally wherein the synthetic transcripts are guide-RNAs.
  • (E14) For the method denoted as any one of (A1) through (A19), (B1) through (B5), (C1) through (C4), (D1) through (D5), or (E1) through (E13), wherein the generating a computational reconstruction of the spatial locations comprises a Uniform Manifold Approximation and Projection (UMAP) reduction, a t-distributed stochastic neighbor embedding (t-SNE) reduction, or a multidimensional scaling (MDS) reduction.
  • (E15) For the method denoted as any one of (A1) through (A19), (B1) through (B5), (C1) through (C4), (D1) through (D5), or (E1) through (E14), wherein a tissue sample of 5 cm or more in diameter is imaged, 7 cm or more in diameter is imaged, 8 cm or more in diameter is imaged, 9 cm or more in diameter is imaged, 10 cm or more in diameter is imaged, 11 cm or more in diameter is imaged, 12 cm or more in diameter is imaged, 13 cm or more in diameter is imaged, 14 cm or more in diameter is imaged, 15 cm or more in diameter is imaged, 16 cm or more in diameter is imaged, or larger tissue sample is imaged.
  • (F1) A method for generating a spatial representation of macromolecule abundance from a tissue sample comprising: (i) contacting the tissue sample with a first monomer or linear polymer; a cross-linking agent comprising a second monomer or polymer, wherein the cross-linking agent is capable of crosslinking with the first monomer or linear polymer when combined; and a nucleic acid primer or probe comprising a modification capable of binding the primer or probe to the first monomer or linear polymer, the cross-linking agent, or both, wherein the primer or probe comprises: a matrix location identifier sequence that is common to all primers or probes in a given element in a matrix and a target nucleic acid molecule-specific capture sequence; (ii) crosslinking the cross-linking agent with the first monomer or linear polymer, thereby forming the matrix; (iii) binding the nucleic acid primer or probe to the first monomer or linear polymer, the cross-linking agent, or both; (iv) incubating
  • (F2) For the method denoted as (F1), wherein the generating of the computational reconstruction of the spatial locations of the population of target nucleic acid molecules of step (vi) comprises performing a dimensionality reduction analysis.
  • (F3) For the method denoted as (F1) or (F2), wherein the target nucleic acid molecules comprise a population of second oligonucleotides capable of binding to the nucleic acid primers or probes, optionally wherein the second oligonucleotides are attached to a bead.
  • a computer-implemented method for reconstructing spatial locations of macromolecules distributed in an array comprising: contacting one or more first oligonucleotides bound to a solid support and present in the array with a sample under conditions suitable for first oligonucleotide-macromolecule binding, wherein the one or more first oligonucleotides comprise an array location identifier sequence that is common to all of the one or more first oligonucleotides in a given element in the array; and a macromolecule-specific capture sequence; obtaining sequence information for a population of macromolecules bound to the one or more first oligonucleotides and the array location identifier sequence of the respective one or more first oligonucleotides bound to each of the macromolecules of the population of macromolecules for which sequence information is obtained; and reconstructing spatial locations of the population of macromolecules bound to the one or more first oligonucleotides from inputs minimally comprising the obtained sequence information for the array
  • (G2) For the computer-implemented method denoted as (G1), wherein reconstructing spatial locations of the population of macromolecules further comprises applying a linear or nonlinear dimensionality reduction method to reduce the high-dimensional sequence information from the population of macromolecules into a two-dimensional (2D) embedding space.
  • (G3) For the computer-implemented method denoted as (G1) or (G2), wherein the nonlinear dimensionality reduction method is Uniform Manifold Approximation and Projection (UMAP).
  • a computer program product comprising: a computer readable storage medium readable by at least one processor and storing instructions for execution by the at least one processor for performing a method comprising: contacting one or more first oligonucleotides bound to a solid support and present in the array with a sample under conditions suitable for first oligonucleotide-macromolecule binding, wherein the one or more first oligonucleotides comprise an array location identifier sequence that is common to all of the one or more first oligonucleotides in a given element in the array; and a macromolecule-specific capture sequence; obtaining sequence information for a population of macromolecules bound to the one or more first oligonucleotides and the array location identifier sequence of the respective one or more first oligonucleotides bound to each of the macromolecules of the population of macromolecules for which sequence information is obtained; and reconstructing spatial locations of the population of macromolecules bound to the one or more first oligonucle

Landscapes

  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Organic Chemistry (AREA)
  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Genetics & Genomics (AREA)
  • Biophysics (AREA)
  • Biotechnology (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Molecular Biology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Wood Science & Technology (AREA)
  • Zoology (AREA)
  • Biochemistry (AREA)
  • Microbiology (AREA)
  • General Engineering & Computer Science (AREA)
  • Immunology (AREA)
  • Analytical Chemistry (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Medical Informatics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Theoretical Computer Science (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

Compositions and methods for assessing relative macromolecule abundance (e.g., RNA expression levels) in a spatially-defined manner across a biological sample are provided, specifically obtaining deep transcriptomic coverage at high-resolution across multiple locations assessed across the biological sample, via imaging-free high resolution reconstruction of macromolecule abundance obtained using dimensionality reduction methods that reconstruct the physical locations of macromolecules and associated abundance, with reference to diffusion data (e.g., diffusion patterns).

Description

IMAGING-FREE HIGH-RESOLUTION SPATIAL MACROMOLECULE ABUNDANCE RECONSTRUCTION
CROSS-REFERENCE TO RELATED APPLICATION
[0001] The present application is related to and claims priority under 35 U.S.C. § 119(e) to U.S. provisional patent application No. 63/649,062, entitled “IMAGING-FREE HIGH-RESOLUTION SPATIAL MACROMOLECULE ABUNDANCE RECONSTRUCTION,” filed May 17, 2024. The entire content of the aforementioned patent application is incorporated herein by this reference.
STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH
[0002] This invention was made with government support under Grant Nos. HG010647 and CA276865 awarded by the National Institutes of Health. The government has certain rights in the invention.
FIELD OF THE INVENTION
[0003] The invention relates generally to methods and compositions for spatial assessment of macromolecule abundance (e.g., RNA expression, DNA abundance, protein abundance), e.g., in a tissue or other biological sample.
SEQUENCE LISTING
[0004] The instant application contains a Sequence Listing which has been filed electronically in Extensible Markup Language format and is hereby incorporated by reference in its entirety. Said XML file, created on May 15, 2025, is named 808978_004830.xml and is 16,267 Bytes in size.
BACKGROUND OF THE INVENTION
[0005] Approaches for spatial monitoring of RNA expression in a tissue sample include traditional histological approaches, in which sections of tissue are fixed, stained, and assessed, e.g., for the presence of individual transcripts across the viewable region of the fixed tissue section on a microscope slide, as well as certain more recent in situ techniques for transcriptome monitoring. Many such techniques have been afflicted by being laborious in application, offering a low degree of multiplexing with a high degree of technical difficulty and/or providing only low resolution of spatial capture across an array (i.e., providing only approximately 100-200 µm resolution). A need therefore exists for improved approaches for spatial macromolecule (e.g., RNA expression, DNA and/or protein abundance) profiling at resolutions approaching single cell resolution.
BRIEF SUMMARY OF THE INVENTION
[0006] The current disclosure relates, at least in part, to imaging-free compositions and methods for assessing macromolecule abundance (e.g., RNA expression levels) in a tissue or other biological sample, which provide deep macromolecule-identifying sequence coverage at high-resolution across multiple locations assessed within a tissue sample. In certain aspects, imaging-free high-resolution reconstructions of macromolecule abundance are obtained via use of dimensionality reduction methods that reconstruct the physical locations of macromolecules and associated abundance, using diffusion data (e.g., diffusion patterns). In some aspects, compositions and methods referred to elsewhere herein as "Slide-seq" (see, e.g., WO 2019/213254), as well as other compositions and processes for obtaining spatial macromolecule abundance (see, e.g., WO 2021/096814; U.S. Patent Application Publication No. 2022/0177963; and WO 2022/174054), are employed and/or adapted to provide imaging-free high resolution reconstructions of macromolecule abundance.
[0007] In one aspect, the instant disclosure provides a method for generating a spatial representation of macromolecule abundance from a sample, the method involving: (i) contacting first oligonucleotides bound to a solid support and present in a positional array with a sample, wherein the first oligonucleotides include: an array location identifier sequence that is common to all first oligonucleotides in a given element in the positional array and a macromolecule-specific capture sequence, under conditions suitable for oligonucleotide-macromolecule binding; (ii) obtaining sequence information for a population of macromolecules bound to the first oligonucleotides and the array location identifier sequence of the respective first oligonucleotides bound to each of the macromolecules of the population of macromolecules for which sequence information is obtained; and (iii) generating a computational reconstruction of the spatial locations of the population of macromolecules bound to the first oligonucleotides from inputs minimally comprising the obtained sequence information for the array location identifier sequences and molecular diffusion patterns of the macromolecules of the population of macromolecules for which sequence information is obtained, relative to the first oligonucleotides present in the positional array, thereby generating a spatial representation of macromolecule abundance from the sample.
[0008] In one embodiment, the first oligonucleotides bound to the solid support and present in the positional array have resolution of 100 micrometers or less between individual elements of the positional array.
[0009] In an embodiment, the macromolecule is RNA, DNA, protein, or combinations thereof.
[0010] In certain embodiments, the RNA is a poly-A-tailed RNA. In an embodiment, the RNA is a mRNA.
[0011] In an embodiment, the macromolecule-specific capture sequence includes a poly-dT tail of sufficient length to allow for capture of poly-A-tailed RNAs via hybridization.
[0012] In an embodiment, the macromolecule-specific capture sequence includes a gene-specific sequence or a transcript-specific sequence.
[0013] In certain embodiments, the DNA is a genomic DNA or a barcode DNA.
[0014] In some embodiments, the macromolecule-specific capture sequence is a component of a loaded transposase.
[0015] In certain embodiments, the positional array possesses resolution of 50 micrometers or less between individual elements of the positional array. In an embodiment, the positional array possesses resolution of 30 micrometers or less between individual elements of the positional array. In an embodiment, the positional array possesses resolution of 20 micrometers or less between individual elements of the positional array. In an embodiment, the positional array possesses resolution of 10 micrometers or less between individual elements of the positional array.
[0016] In some embodiments, the sample is a tissue sample.
[0017] In one embodiment, the tissue sample is obtained from a tissue of a brain, a lung, a liver, a kidney, a pancreas, or a heart.
[0018] In some embodiments, the sample is obtained from a mammal. In an embodiment, the biological sample is obtained from a human.
[0019] In certain embodiments, the sample is fixed. The tissue sample may be fixed with paraffin. The sample may be fixed using formalin-fixation and paraffin embedding (FFPE).
[0020] In some embodiments, the solid support is a slide. In an embodiment, the solid support is a glass slide.
[0021] In some embodiments, the first oligonucleotides are bound to the solid support using a capture material. The capture material may be applied as a liquid. The capture material may be applied using a brush or aerosol spray. The capture material may be a liquid electrical tape. The capture material may dry to form a vinyl polymer. In an embodiment, the vinyl polymer is polyvinyl hexane.
[0022] In one embodiment, the obtaining sequence information of step (ii) involves a next-generation sequencing approach. The next-generation sequencing approach may be solid-phase, reversible dye-terminator sequencing; massively parallel signature sequencing; pyro-sequencing; sequencing-by-ligation; ion semiconductor sequencing; Nanopore sequencing; or DNA nanoball sequencing. In an embodiment, the next-generation sequencing approach includes solid-phase, reversible dye-terminator sequencing.
[0023] In some embodiments, the generating of the computational reconstruction of the spatial locations of the population of macromolecules of step (iii) includes performing a dimensionality reduction analysis.
[0024] In some embodiments, the macromolecules include a population of second oligonucleotides capable of binding to the first oligonucleotides, optionally where the second oligonucleotides are attached to a bead.
[0025] In some embodiments, the generating of the computational reconstruction of the spatial locations of the population of macromolecules of step (iii) involves performing Uniform Manifold Approximation and Projection (UMAP) reduction, t-distributed stochastic neighbor embedding (t-SNE) reduction, and/or multidimensional scaling (MDS) reduction. In an embodiment, the generating of the computational reconstruction of the spatial locations of the population of macromolecules of step (iii) involves performing Uniform Manifold Approximation and Projection (UMAP) reduction.
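By way of illustration only, the following is a minimal sketch of one of the named reductions, classical multidimensional scaling, written in plain Python. It assumes a symmetric matrix of pairwise bead distances is already available; how such distances would be derived from sequencing data is not specified here, and the function name `classical_mds_2d`, the toy `beads` coordinates, and the power-iteration eigen-solver are hypothetical choices made for this example rather than the implementation of the disclosure. The recovered layout matches the input only up to rotation and reflection.

```python
import math
import random


def classical_mds_2d(dist, iters=500):
    """Embed points in 2D from a symmetric pairwise distance matrix."""
    n = len(dist)
    sq = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in sq]
    total_mean = sum(row_mean) / n
    # Double centering turns squared distances into a Gram matrix B.
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + total_mean)
          for j in range(n)] for i in range(n)]

    def top_eigenpair(m):
        # Power iteration; adequate for small, well-separated toy inputs.
        rng = random.Random(0)
        v = [rng.random() for _ in range(n)]
        for _ in range(iters):
            w = [sum(m[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            v = [x / norm for x in w]
        lam = sum(v[i] * sum(m[i][j] * v[j] for j in range(n))
                  for i in range(n))
        return lam, v

    coords = [[0.0, 0.0] for _ in range(n)]
    for axis in range(2):
        lam, v = top_eigenpair(b)
        for i in range(n):
            coords[i][axis] = math.sqrt(max(lam, 0.0)) * v[i]
        # Deflate B so the next pass finds the following eigenvector.
        b = [[b[i][j] - lam * v[i] * v[j] for j in range(n)]
             for i in range(n)]
    return coords


# Hypothetical bead positions (arbitrary units), used only to build distances.
beads = [(0.0, 0.0), (3.0, 0.0), (6.0, 1.0), (1.0, 2.0), (5.0, 3.0)]
dist = [[math.dist(p, q) for q in beads] for p in beads]
coords = classical_mds_2d(dist)
```

In this toy case the pairwise distances between the embedded points reproduce the input distances, which is the property a reconstruction of relative bead positions relies on. UMAP and t-SNE are nonlinear alternatives and are not shown.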
[0026] In an aspect, the disclosure provides a method for generating a spatial representation of mRNA abundance from a sample, the method involving: (i) contacting first oligonucleotides bound to a solid support and present in a positional array having resolution of 100 micrometers or less between individual elements of the positional array with a sample under conditions suitable for oligonucleotide-mRNA hybridization, wherein the first oligonucleotides include: an array location identifier sequence that is common to all first oligonucleotides in a given element in the positional array, and a poly-dT tail of sufficient length to allow for capture of poly-A-tailed mRNAs via hybridization; (ii) obtaining sequence information for a population of poly-A-tailed mRNAs bound to the first oligonucleotides and the array location identifier sequence of the respective first oligonucleotides bound to each of the poly-A-tailed mRNAs for which sequence information is obtained; and (iii) generating a computational reconstruction of the spatial locations of the population of poly-A-tailed RNAs bound to the first oligonucleotides from inputs minimally comprising the obtained sequence information for the array location identifier sequences and molecular diffusion patterns of the poly-A-tailed RNAs of the population of poly-A-tailed RNAs for which sequence information is obtained, relative to the first oligonucleotides present in the positional array, thereby generating the spatial representation of mRNA abundance from the sample.
[0027] In one embodiment, the generating of the computational reconstruction of the spatial locations of the population of poly-A-tailed RNAs of step (iii) comprises performing a dimensionality reduction analysis.
[0028] In one embodiment, the poly-A-tailed RNAs comprise a population of second oligonucleotides capable of binding to the first oligonucleotides, optionally wherein the second oligonucleotides are attached to a bead.
[0029] In one embodiment, the method further includes performing reverse transcription upon hybridized poly-A-tailed mRNAs immediately after hybridizing said poly-A-tailed mRNAs to the solid support-bound first oligonucleotides. Optionally, reverse transcription is performed upon hybridized poly-A-tailed mRNAs before a digestion step is performed.
[0030] In another embodiment, the conditions suitable for oligonucleotide-mRNA hybridization involve incubation in 6X SSC buffer. Optionally, the 6X SSC buffer is supplemented with detergent.
[0031] In another aspect, the disclosure provides a method for generating a spatial representation of macromolecule abundance from a sample, the method involving: (i) generating a well array having a plurality of wells, wherein each well of the array can hold exactly one bead; (ii) depositing beads comprising macromolecule capture oligonucleotides into the wells of the well array (optionally, by evaporation in a centrifuge); (iii) brushing the well array to remove all of the beads not present in the wells; (iv) depositing the sample onto the well array and centrifuging, thereby forcing the sample into the wells of the well array; (v) adding a digestion buffer, thereby lysing the sample and causing the macromolecules of the sample to transfer onto the beads in the wells; (vi) obtaining sequence information for a population of macromolecules bound to the macromolecule capture oligonucleotides of the beads and an associated capture oligonucleotide bead identification sequence for each macromolecule of the population of macromolecules for which sequence information is obtained; and (vii) generating a computational reconstruction of the spatial locations of the population of macromolecules bound to the macromolecule capture oligonucleotides from inputs minimally comprising the obtained sequence information for the bead identification sequences and molecular diffusion patterns of the macromolecules of the population of macromolecules for which sequence information is obtained, relative to the macromolecule capture oligonucleotides present in the well array, thereby generating the spatial representation of macromolecule abundance from the sample.
[0032] In one embodiment, the generating of the computational reconstruction of the spatial locations of the population of macromolecules of step (vii) comprises performing a dimensionality reduction analysis.
[0033] In one embodiment, the macromolecules comprise a population of second oligonucleotides capable of binding to the macromolecule capture oligonucleotides, optionally wherein the second oligonucleotides are attached to a bead.
[0034] In one embodiment, the method further includes performing reverse transcription upon the sample in the wells of the well array. Optionally, the method further includes separating oligonucleotides from beads by sonication or by photocleavage.
[0035] In another aspect, the disclosure provides a method for generating a spatial representation of macromolecule abundance from a sample, the method involving: (i) adhering clusters of oligonucleotides in an array to a solid support; (ii) contacting the array with a tissue sample; (iii) obtaining sequence information for a population of macromolecules bound to the oligonucleotide clusters and a respective associated oligonucleotide cluster identification sequence for each macromolecule of the population of macromolecules for which sequence information is obtained; and (iv) generating a computational reconstruction of the spatial locations of the population of macromolecules bound to the clusters of oligonucleotides from inputs minimally comprising the obtained oligonucleotide cluster identification sequences and molecular diffusion patterns of the macromolecules of the population of macromolecules for which sequence information is obtained, relative to the clusters of oligonucleotides present in the array.
[0036] In one embodiment, the generating of the computational reconstruction of the spatial locations of the population of macromolecules of step (iv) comprises performing a dimensionality reduction analysis.
[0037] In one embodiment, the macromolecules comprise a population of second oligonucleotides capable of binding to the clusters of oligonucleotides, optionally wherein the second oligonucleotides are attached to a bead.
[0038] In one embodiment, the array includes barcoded clusters of oligonucleotides on the solid support.
[0039] In another embodiment, the obtaining sequence information step (iii) involves performance of long-read sequencing.
[0040] In another aspect, the disclosure provides a method for generating a spatial representation of macromolecule abundance from a tissue sample of a subject, the method involving: (i) obtaining the tissue sample from the subject; (ii) preparing a cryosection of the tissue sample and adhering said cryosection to a solid support; (iii) forming an array of barcoded oligonucleotide clusters and/or an array of beads attached to barcoded oligonucleotides and contacting the cryosection adhered to the solid support with the array; (iv) obtaining sequence information for a population of macromolecules bound to the array(s), wherein the sequence information comprises macromolecule identification information and associated positional identification information of the barcoded oligonucleotides; and (v) generating a computational reconstruction of the spatial locations of the population of macromolecules bound to the bead oligonucleotides from inputs minimally comprising the obtained sequence information and molecular diffusion patterns of the macromolecules of the population of macromolecules for which sequence information is obtained, relative to the bead oligonucleotides present in the array.
[0041] In one embodiment, the generating of the computational reconstruction of the spatial locations of the population of macromolecules of step (v) comprises performing a dimensionality reduction analysis.
[0042] In one embodiment, the macromolecules comprise a population of second oligonucleotides capable of binding to the barcoded oligonucleotides, optionally wherein the second oligonucleotides are attached to a bead.
[0043] In one embodiment, an array (puck) is physically transferred from one surface to another. Optionally, a gel encasement is formed on top of the array (puck), thereby allowing beads to be picked up off the surface of the array (puck) without altering bead positions relative to each other.
[0044] In another embodiment, the beads or array are used for capture of oligonucleotides.
[0045] In certain embodiments, the beads or array include or bind oligonucleotide-conjugated antibodies.
[0046] In some embodiments, the beads or array include or bind nucleic acid hybridization probes.
[0047] In certain embodiments, the hybridization probes are RNA hybridization probes.
[0048] In related embodiments, the hybridization probes are DNA hybridization probes.
[0049] In certain embodiments, the hybridization probes are capable of specific hybridization to transcriptome or genome sequence(s) of the tissue sample.
[0050] In some embodiments, the hybridization probes include unique molecular identifiers (UMIs). Optionally, the UMIs of the hybridization probes are counted via sequencing to assess the levels of hybridization probe-bound macromolecules. Optionally, the hybridization probe-bound macromolecules are proteins, exons, transcripts, nucleic acid sequences including single nucleotide polymorphisms (SNPs) and/or genomic regions.
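As a hedged illustration of the UMI counting described above, the sketch below collapses sequencing reads that share the same target and UMI, so that each distinct UMI is counted once per target. The function name `count_umis`, the `(target, umi)` read format, and the example sequences are assumptions introduced for this example only.

```python
from collections import defaultdict


def count_umis(reads):
    """Count distinct UMIs per probe target; duplicate reads count once."""
    seen = defaultdict(set)
    for target, umi in reads:
        seen[target].add(umi)
    return {target: len(umis) for target, umis in seen.items()}


# Hypothetical reads: (probe target, UMI sequence).
reads = [
    ("GeneA", "AACG"),
    ("GeneA", "AACG"),  # duplicate read of the same molecule
    ("GeneA", "TTGC"),
    ("GeneB", "CCAT"),
]
counts = count_umis(reads)
```

Here `counts` reports two molecules for `GeneA` and one for `GeneB`, even though `GeneA` was read three times.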
[0051] In certain embodiments, the hybridization probes are released from the array or tissue. Optionally, the hybridization probes are released from the array or tissue by a method of: (a) cleavage and/or degradation of a photolabile and/or photocleavable group; (b) T7 RNA polymerase transcription; (c) enzymatic cleavage (optionally, the enzymatic cleavage is RNAseH cleavage of bound RNA or RNAse cleavage of an RNA base in the hybridization probes); and/or (d) chemical cleavage (optionally, the chemical cleavage is disulfide cleavage).
[0052] In some embodiments, the beads or array possess primers capable of specific binding to a selection of one or more target transcripts. Optionally, the one or more target transcripts are selected from among T Cell receptor transcript sequences; transcripts of low-expressing proteins (optionally, the low-expressing proteins are transcription factors); and synthetic transcripts (optionally, the synthetic transcripts are guide-RNAs).
[0053] In related embodiments, the dimensionality reduction analysis involves a Uniform Manifold Approximation and Projection (UMAP) reduction, a t-distributed stochastic neighbor embedding (t-SNE) reduction, or a multidimensional scaling (MDS) reduction.
[0054] In some embodiments, the generating a computational reconstruction of the spatial locations comprises a Uniform Manifold Approximation and Projection (UMAP) reduction, a t-distributed stochastic neighbor embedding (t-SNE) reduction, or a multidimensional scaling (MDS) reduction.
[0055] In another aspect, the disclosure provides a method for generating a spatial representation of macromolecule abundance from a sample involving: (i) contacting a tissue with a first monomer or linear polymer, a cross-linking agent including a second monomer or polymer, wherein the cross-linking agent is capable of crosslinking with the first monomer or linear polymer when combined, and a nucleic acid primer or probe including a modification capable of binding the nucleic acid primer or probe to the first monomer or linear polymer, the cross-linking agent, or both, wherein the nucleic acid primer or probe includes: a matrix location identifier sequence that is common to all nucleic acid primers or probes in a given element in a matrix and a target nucleic acid molecule-specific capture sequence; (ii) crosslinking the cross-linking agent with the first monomer or linear polymer, thereby forming the matrix; (iii) binding the nucleic acid primer or probe to the first monomer or linear polymer, the cross-linking agent, or both; (iv) incubating the matrix and nucleic acid primer or probe with the tissue under conditions suitable for annealing of the nucleic acid primer or probe to a target nucleic acid molecule of or associated with the tissue, thereby forming a primer-bound or probe-bound target nucleic acid molecule, thereby binding a target nucleic acid molecule of or associated with the tissue; (v) obtaining sequence information for a population of target nucleic acid molecules bound to the nucleic acid primers or probes and the matrix location identifier sequence of the nucleic acid primers or probes bound to each of the target nucleic acid molecules sequenced; and (vi) generating a computational reconstruction of the spatial locations of the population of target nucleic acid molecules bound to the nucleic acid primers or probes from inputs minimally comprising the obtained sequence information for the matrix location identifier sequences, and molecular diffusion patterns of the target nucleic acid molecules of the population of target nucleic acid molecules for which sequence information is obtained, relative to the nucleic acid primers or probes present in the matrix.
[0056] In some embodiments, the generating of the computational reconstruction of the spatial locations of the population of target nucleic acid molecules of step (vi) includes performing a dimensionality reduction analysis.
[0057] In some embodiments, the target nucleic acid molecules include a population of second oligonucleotides capable of binding to the nucleic acid primers or probes, optionally wherein the second oligonucleotides are attached to a bead.
BRIEF DESCRIPTION OF THE DRAWINGS
[0058] The following detailed description, given by way of example, but not intended to limit the disclosure solely to the specific embodiments described, may best be understood in conjunction with the accompanying drawings, in which:
[0059] FIGs. 1A to 1H show that the adapted “Slide-seq”-like and "Slide-tags"-like approaches of the instant disclosure, which, in specific aspects, perform dimensionality reduction analysis upon macromolecule abundance and associated positional array data, thereby generating latent space representations of macromolecule abundance, enabled imaging-free spatial transcriptomics via computational reconstruction. FIG. 1A shows a schematic of the instant method, where a mosaic barcode array of uniformly mixed capture beads (poly(dT)) and fiducial beads (poly(dA)) was used for imaging-free spatial transcriptomics. Both capture and fiducial beads were DNA-barcoded with a unique spatial barcode for each bead and a unique molecular identifier (UMI) for each oligonucleotide molecule. Fiducial beads' DNA barcodes were photocleaved and diffused to proximate capture beads. A diffusion matrix was abstracted from the sequencing result of the capture bead barcode and fiducial bead barcode conjugation product. UMAP reduces the high dimensional diffusion matrix to a two-dimensional embedding space and reconstructs the spatial location of beads. With computationally reconstructed spatial locations, the same array was used with "Slide-seq"/"Slide-tags" to profile tissue spatial genomics ("bc", spatial barcode; "pc", photocleavable linker). FIG. 1B shows a simulated reconstruction. FIG. 1B, at left, is an image with the pattern of the letter "H". FIG. 1B, at middle, shows beads sampled from the original image. FIG. 1B, at right, shows reconstructed bead locations obtained through a simulated diffusion matrix and UMAP embedding. Each bead was colored the same as in the middle image to show the recovered pattern. FIG. 1C, at left, shows simulated reconstruction error as a function of the ratio of fiducial to capture beads. The error was quantified as the median displacement distance of capture beads, with the whole array's diameter defined as 3 mm. The shaded area represented one standard deviation (N = 5 repeated simulations). FIG. 1C, at right, shows simulated reconstruction error as a function of diffusion distance and number of molecules carrying diffusion information (UMIs) per bead. FIG. 1D shows a schematic of reconstruction for "Slide-seq". Molecules of mRNA are captured by capture beads, followed by UV-induced fiducial bead barcode diffusion and conjugation. FIG. 1E, at left, shows the diffusion pattern of a capture bead barcode on its associated fiducial bead barcodes, shaded by the number of unique joint UMIs. Inset details the diffusion center. FIG. 1E, at right, further shows the ensemble average of kernel density estimation (KDE) on the capture bead barcode's diffusion distribution in the x-axis direction. Dots represent the half-maximum, corresponding to the full width at half maximum (FWHM) measured as 123.1 µm. FIG. 1F shows the spatial location of capture beads on a mouse hippocampus with reconstruction through UMAP embedding, colored by decomposed cell types. Locations were scaled to the original array size with a diameter of 3 mm. FIG. 1G, at left, shows the schematic of RMS error at different measurement lengths. Error was defined as the difference between the distance of two beads measured in ground truth (dashed line) and in reconstruction (solid line). Errors of measurement lengths in the same range were averaged by root-mean-square (RMS). FIG. 1G, at right, shows RMS error of different measurement lengths. Data shown in FIG. 1F (top line) and two biological replicates (middle and bottom line) were presented. Solid lines represent mean values across beads and shaded areas represent one standard deviation. FIG. 1H, at left, shows spatial expression of Atp2b1, a Cornu Ammonis 1 (CA1) layer marker of the hippocampus, in ground truth and reconstruction. FIG. 1H, at right, shows representative plots of CA1 layer width by profiling the expression intensity of Atp2b1 along a perpendicular line in ground truth (dashed line) and reconstruction (solid line). Scale bars: 500 µm.
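By way of non-limiting illustration, the RMS length measurement error described for FIG. 1G may be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation used to generate the figures: the function name `rms_length_error`, the bin width, and the example coordinates are hypothetical, and the reconstructed coordinates are assumed to already be expressed in the same units and scale as the ground truth.

```python
import math
from itertools import combinations


def rms_length_error(truth, recon, bin_width):
    """Return {bin_start: RMS error}, binning bead pairs by ground-truth length.

    For each bead pair, the error is the difference between the pair's
    distance in the reconstruction and its distance in the ground truth.
    """
    squared = {}  # bin index -> list of squared length errors
    for i, j in combinations(range(len(truth)), 2):
        d_truth = math.dist(truth[i], truth[j])
        d_recon = math.dist(recon[i], recon[j])
        b = int(d_truth // bin_width)
        squared.setdefault(b, []).append((d_recon - d_truth) ** 2)
    return {b * bin_width: math.sqrt(sum(v) / len(v))
            for b, v in sorted(squared.items())}


# Hypothetical bead coordinates (micrometres); the "reconstruction" here is
# simply the ground truth stretched by 10%, so every length is off by 10%.
truth = [(0.0, 0.0), (100.0, 0.0), (0.0, 200.0), (300.0, 400.0)]
recon = [(1.1 * x, 1.1 * y) for x, y in truth]
errors = rms_length_error(truth, recon, bin_width=250.0)
perfect = rms_length_error(truth, truth, bin_width=250.0)
```

Because the metric compares pairwise lengths rather than absolute positions, it is unaffected by any rotation, reflection, or translation of the reconstruction relative to the ground truth.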
[0060] FIGs. 2A to 2L demonstrate that reconstruction enabled diverse spatial transcriptomics measurements at scale. FIG. 2A shows a schematic of reconstruction for spatial transcriptomics with single-nucleus resolution. A barcode array was exposed to UV for five seconds to allow fiducial bead barcode diffusion and conjugation with capture bead barcode. After the tissue section, the array was UV exposed for 1 minute to cleave a majority of fiducial bead barcodes and tag nuclei. The sample then proceeded to "Slide-tags" for spatial genomics profiling. FIG. 2B shows uniform manifold approximation and projection (UMAP) embedding of snRNA-seq profiles from a coronal mouse hippocampus section, colored by cell type annotations shown in FIG. 2C. FIG. 2C shows the spatial location of nuclei mapped based on reconstructed locations of capture beads, colored by cell type annotations. Locations were scaled to the original array size with a diameter of 3 mm. FIG. 2D shows the RMS length measurement error of nuclei in reconstruction versus ground truth. Solid line shows the mean value and shaded area shows one standard deviation. FIG. 2E, at left, shows the spatial location of beads with profiled gene expression from a P1 mouse head section. Beads were colored by decomposed cell type annotations from robust cell type decomposition (RCTD). FIG. 2E, at right, shows hematoxylin and eosin staining of an adjacent slice from the same sample. Dashed line indicates the 1.2 cm bead array used for this experiment. FIG. 2F shows UMAP representing gene expression of the mouse sample captured by beads, colored by decomposed cell types from RCTD. FIG. 2G shows spatial distribution of labeled cell types, colored the same as cell type annotations in FIG. 2F. Gray beads represent other cell types, plotted for contrast. Neuronal cells include olfactory sensory neurons, intermediate neuronal progenitors, CNS neurons, and neural crest PNS neurons from clusters in FIG. 2E ("Chond", chondrocytes; "Fibro", fibroblasts). FIG. 2H shows marker gene spatial expression of cell types shown in FIG. 2G, shaded by relative expression level. FIG. 2I shows subclustering of neuronal cells, colored by subtype annotations. Gray beads are other subclusters of neuronal cells. FIG. 2J shows results obtained for a marker gene of each neuronal subtype shown at FIG. 2I, colored with corresponding color gradients based on relative expression level. All beads of neuronal cell type were plotted. In FIG. 2K, within beads categorized under the epithelial type, the top 20 spatially differentially expressed genes ranked by nonparametric C-SIDE were plotted, with Moran’s I statistics calculated. Higher scores on both metrics signify more spatially variable expression. FIG. 2L shows spatially differential expression of four epithelial genes from FIG. 2K. All beads of epithelial type were positioned, shaded by relative expression level of each gene. Scale bars: 500 µm.
[0061] FIGs. 3A to 3G show a simulation of diffusion and reconstruction with UMAP. FIG. 3A shows simulated locations of capture beads and fiducial beads in a 3 mm circle. FIG. 3B shows a simulated diffusion pattern of a capture bead on its associated fiducial beads, shaded by simulated UMI counts. The distribution plots on the top and right represent the diffusion distribution on the x and y axis, respectively. FIG. 3C shows simulated locations of capture beads, colored by a two-dimensional color gradient depending on the locations. FIG. 3D shows UMAP reconstructed locations of capture beads, colored the same as in FIG. 3C. FIG. 3E shows absolute error of capture beads plotted in ground truth locations. FIG. 3F shows displacement vectors of capture beads. Each arrow starts from the capture bead’s ground truth location and ends at the reconstruction location. FIG. 3G shows a histogram plot of capture beads’ absolute error.
[0062] FIGs. 4A to 4J demonstrate "Slide-seq" adapted by use of reconstruction metrics as disclosed herein. FIG. 4A shows absolute error of capture beads plotted in ground truth locations. FIG. 4B shows displacement vectors of capture beads. Each arrow starts from the capture bead’s ground truth location and ends at the reconstruction location. FIG. 4C shows a histogram plot of anchor beads’ absolute errors. FIG. 4D, at the left, shows UMAP representing gene expression from a coronal mouse hippocampus section captured by beads, colored by decomposed cell types from RCTD. FIG. 4D, at the right, shows the spatial location of capture beads in ground truth, colored by decomposed cell types. FIG. 4E shows relative RMS error of measurement lengths as a function of measurement length. Data shown in FIG. 1F (top line) and two biological replicates (middle and bottom line) are presented. Solid lines represent average values across beads and shaded areas represent one standard deviation. FIG. 4F shows CA1 width measured in ground truth and reconstruction (N = 3 biological replicates). Data shown in FIG. 1F (blue) and two biological replicates (orange and green) are shown. Gray lines show the mean width of each group. FIG. 4G shows a neighborhood enrichment analysis between cell type pairs in reconstruction (left) and ground truth (right). The enrichment scores were plotted in the same color scale; a higher score represents greater enrichment in the neighborhoods. FIG. 4H shows a comparison of matched bead barcodes and unique RNA molecules (UMIs) in reconstruction (top) or ground truth (bottom) (N = 3 biological replicates). Solid lines: mean values. Values were normalized to the ground truth for direct comparison across replicates. Scale bars: 500 µm. FIG. 4I shows barcode matching between Slide-seq library and in situ sequencing barcode list or reconstruction barcode list. Bead barcodes with >20 UMI counts were matched with Hamming distance <1. Left rectangle represents total barcodes from in situ sequencing and darker shading represents barcodes matched with Slide-seq library barcodes (shown as lower rectangle). Right rectangle represents total barcodes from reconstruction and darker shading represents barcodes matched with Slide-seq library barcodes. FIG. 4J shows a violin plot of UMI count per bead with the same Slide-seq library matched to reconstruction results and in situ sequencing results. Scale bars: 500 µm.
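For illustration only, barcode matching of the kind described for FIG. 4I (a UMI-count filter followed by Hamming-distance matching against a reference barcode list) may be sketched as below. The function names, the example barcodes, and the threshold values passed in the usage lines are hypothetical; the thresholds are exposed as parameters rather than fixed, and ambiguous matches are simply skipped, which is an assumption of this sketch rather than a statement of the disclosed procedure.

```python
def hamming(a, b):
    """Number of mismatched positions between two equal-length barcodes."""
    if len(a) != len(b):
        raise ValueError("barcodes must have equal length")
    return sum(x != y for x, y in zip(a, b))


def match_barcodes(library_counts, reference, min_umis, max_dist):
    """Map library barcodes to reference barcodes.

    A library barcode is kept only if it has at least `min_umis` UMIs and
    exactly one reference barcode lies within `max_dist` mismatches.
    """
    matches = {}
    for barcode, n_umis in library_counts.items():
        if n_umis < min_umis:
            continue  # too few UMIs to trust this barcode
        hits = [ref for ref in reference if hamming(barcode, ref) <= max_dist]
        if len(hits) == 1:  # skip unmatched and ambiguous barcodes
            matches[barcode] = hits[0]
    return matches


# Hypothetical library (barcode -> UMI count) and reference barcode list.
library_counts = {"ACGTAC": 35, "ACGTAA": 50, "TTTTTT": 5, "GGGGGG": 40}
reference = ["ACGTAC", "GGGGCC"]
matched = match_barcodes(library_counts, reference, min_umis=20, max_dist=1)
```

The brute-force comparison above scales with the product of the two list sizes; for arrays with very large bead counts, an indexed lookup would likely be preferable.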
[0063] FIGs. 5A to 5G demonstrate "Slide-tags" reconstruction metrics. FIG. 5A shows the absolute error of capture beads plotted in ground truth locations. FIG. 5B shows displacement vectors of capture beads. Each arrow starts from the capture bead’s ground truth location and ends at the reconstruction location. FIG. 5C shows a histogram plot of capture bead locations’ absolute errors. FIG. 5D shows the spatial representation of reconstruction error on each nucleus. FIG. 5E shows displacement vectors of located nuclei. Each arrow starts from the nuclei's ground truth location and ends at the reconstruction location. FIG. 5F shows a histogram plot of nuclei locations’ absolute errors. FIG. 5G shows RMS error of measurement lengths between bead pairs as a function of measurement length. Solid lines represent average values, and shaded areas represent one standard deviation.
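As a non-limiting sketch, the displacement vectors and absolute errors referred to in the figure descriptions above can be computed as follows. The function names and coordinates are hypothetical, and the sketch assumes that the reconstructed coordinates have already been aligned to the ground-truth frame (same rotation, reflection, scale, and origin), since a low-dimensional embedding does not by itself fix those quantities.

```python
import math
from statistics import median


def displacement_vectors(truth, recon):
    """Vector from each ground-truth location to its reconstructed location."""
    return [(rx - tx, ry - ty) for (tx, ty), (rx, ry) in zip(truth, recon)]


def absolute_errors(truth, recon):
    """Euclidean length of each displacement vector."""
    return [math.hypot(dx, dy) for dx, dy in displacement_vectors(truth, recon)]


# Hypothetical bead positions in micrometres, already in a common frame.
truth = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
recon = [(3.0, 4.0), (10.0, 0.0), (0.0, 16.0)]
errs = absolute_errors(truth, recon)
median_error = median(errs)  # median displacement distance across beads
```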
[0064] FIG. 6 shows exemplary methods for constructing spatial profiles of genetic information.
[0065] FIG. 7 shows scale-limiting steps in spatial genomics.
[0066] FIG. 8 shows an exemplary method of spatial mapping and an exemplary mathematical basis for performing spatial reconstruction without imaging, via use of molecular diffusion measurements (in certain aspects of the instant “Slide-tags” approach, molecular diffusion of oligonucleotides (optionally oligonucleotide-linked macromolecules) released from array elements can be monitored and used for performing such spatial reconstructions).
[0067] FIG. 9 shows diffusion of capture beads and fiducial beads on a mosaic barcode array.
[0068] FIG. 10 shows an exemplary computational reconstruction with diffusion matrix. The diffusion matrix is high dimensional, and each capture bead is a dot in fiducial bead space. The high dimensional diffusion matrix is generated from 2D physical space and has an intrinsic 2D manifold. Dimensionality reduction was used to learn the low dimensional manifold.
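The relationship between 2D physical space and the high dimensional diffusion matrix may be illustrated with a small simulation. The following is a sketch under stated assumptions only: beads are sampled uniformly in a disk, diffusion is modeled with an isotropic Gaussian kernel, expected (noise-free) counts are used, and all function names and parameter values are hypothetical. The optional `embed_2d` helper relies on the third-party `umap-learn` and `numpy` packages with library-default settings, which are not asserted to be the settings used in the disclosure.

```python
import math
import random


def sample_disk(n, radius, rng):
    """Draw n points uniformly from a disk centred at the origin."""
    points = []
    while len(points) < n:
        x, y = rng.uniform(-radius, radius), rng.uniform(-radius, radius)
        if x * x + y * y <= radius * radius:
            points.append((x, y))
    return points


def diffusion_matrix(capture, fiducial, sigma, umis_per_bead):
    """Expected joint-UMI counts; rows are capture beads, columns fiducial beads."""
    rows = []
    for cx, cy in capture:
        weights = [math.exp(-((cx - fx) ** 2 + (cy - fy) ** 2) / (2 * sigma ** 2))
                   for fx, fy in fiducial]
        total = sum(weights) or 1.0  # guard against an all-zero row
        rows.append([umis_per_bead * w / total for w in weights])
    return rows


def embed_2d(matrix):
    """Optional: learn a 2D embedding with umap-learn (imported lazily)."""
    import numpy as np
    import umap
    return umap.UMAP(n_components=2).fit_transform(np.asarray(matrix))


rng = random.Random(0)
capture = sample_disk(200, 1500.0, rng)   # 3 mm diameter array, in micrometres
fiducial = sample_disk(200, 1500.0, rng)
M = diffusion_matrix(capture, fiducial, sigma=100.0, umis_per_bead=100.0)
```

Each row of `M` is one capture bead expressed as a point in fiducial bead space; nearby capture beads share similar rows, which is the structure a dimensionality reduction method can exploit to recover a 2D layout.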
[0069] FIG. 11A shows a simulation of diffusion-based reconstruction (min_dist: effective minimum distance between embedded points). FIG. 11B shows a simulation of diffusion-based reconstruction (min_dist: effective minimum distance between embedded points).
[0070] FIG. 12 shows the effect of parameters in simulation.
[0071] FIG. 13 diagrams the "Slide-seq" method and "Slide-tags" method.
[0072] FIG. 14A shows reconstruction with "Slide-seq". In situ indexing was performed for ground truth. FIG. 14B shows reconstruction with "Slide-seq". In situ indexing was performed for ground truth.
[0073] FIG. 15 shows errors in reconstruction with "Slide-seq".
[0074] FIG. 16 shows reconstruction error on neighborhood analysis (Pearson correlation coefficient of 0.997).
[0075] FIG. 17 shows reconstruction with "Slide-tags" at single-nucleus resolution.
[0076] FIG. 18 shows error in reconstruction with "Slide-tags". Ground truth and reconstruction results are identical.
[0077] FIG. 19 shows the advantages of computational reconstruction in comparison to reconstruction via sequencing RNA.
[0078] FIG. 20A shows an image obtained using spatial reconstruction with "Slide-seq", performed upon a P1 mouse section. FIG. 20B shows cell type information for the spatial reconstruction performed upon the P1 mouse section.
[0079] FIG. 21 shows spatially represented cell types and marker genes.
[0080] FIGs. 22A, 22B, and 22C show fine structure of olfactory epithelium from computational reconstruction. In FIG. 22A, Cbr2 (xenobiotic metabolism) is a marker of sustentacular cells, and Gap43 is a marker of immature OSNs. In FIG. 22B, Cbr2 (xenobiotic metabolism) is shown in a coronal section. In FIG. 22C, Reg3g is a respiratory epithelium marker.
[0081] FIG. 23 shows the advantages of computational reconstruction, and the scalability of spatial transcriptomics. Reconstruction may be performed on any methods involving "Slide-seq" and "Slide-tags".
[0082] FIGs. 24A and 24B show experimental schematics of reconstruction with Slide-seq. Structure of the library at each stage of the preparation for reconstruction with Slide-seq. FIG. 24A shows beads and 5 sec UV cleave for reconstruction and extension parts of the process. FIG. 24B shows 2 min UV for tagging nuclei and PCR for reconstruction library parts of the process. Diagram shades align with text shades in sequences.
[0083] FIG. 25 depicts an example of a computer system and associated devices for spatial assessment of macromolecule abundance (e.g., RNA expression, DNA abundance, protein abundance) according to the techniques described herein.
DETAILED DESCRIPTION OF THE INVENTION
[0084] The present disclosure is directed, at least in part, to the discovery that imaging-free high-resolution reconstructions of macromolecule abundance could be obtained from a biological sample (e.g., a tissue sample) via use of dimensionality reduction methods that reconstruct the physical locations of macromolecules and associated abundance, using diffusion data (e.g., diffusion patterns). In certain aspects, the methods and compositions of the disclosure feature a tightly packed spatially barcoded microbead array (e.g., an array of 10 µm diameter beads packed at an inter-bead spacing of 20 µm or less, where each bead possesses a bead-specific barcode within bead-attached capture oligonucleotides) created via application of a capture material to a solid support (e.g., application of a liquid electrical tape to a glass slide, followed by application of a layer of microbeads), which can be used, e.g., to capture cellular transcriptomes (or other macromolecules) of biological samples (e.g., cryosectioned tissue), in a manner that is both spatially resolvable at high resolution (e.g., at resolutions of 20 µm between image features) and with deep coverage (i.e., high-resolution reconstructions of relative expression for individual transcripts can be generated using the methods and compositions of the instant disclosure, for a large number (i.e., tens, hundreds or even thousands) of transcripts, across an individual cryosectioned tissue sample), without requiring imaging.
[0085] In certain aspects, the instant disclosure enables imaging-free spatially resolved capture of nucleic acids for sequencing from cells and tissues with approximate 10 µm (single cell) resolution. Art-recognized spatial profiling technologies pre-dating the "Slide-seq" approach of WO 2019/213254 have relied upon either targeted in situ techniques, which were laborious and offered only a low degree of multiplexing with a high degree of technical difficulty, or spatial capture arrays offering only very low resolution (resolutions of approximately 100-200 µm). The instant disclosure provides a level of resolution via reconstruction that is superior, in lateral resolution and in capture area, to most art-recognized methods. By performing reconstructions upon mRNA capture and subsequent high-throughput sequencing (e.g., Illumina™ bead-based sequencing) data, the instant disclosure provides methods and compositions that are easily adoptable and allow for whole transcriptomic profiling of complex tissues.
[0086] In certain embodiments, the compositions and methods described herein use a spatially barcoded array of oligonucleotide-laden beads to capture mRNA from tissue sections. Exemplified beads are synthesized with a unique or sufficiently unique bead barcode as previously described, e.g., in WO 2016/040476, wherein an exemplary sufficiently unique bead barcode is one that is a member of a population of barcode sequences that is sufficiently degenerate to a population (e.g., of beads) that a majority of individual components (e.g. beads) of the barcoded population each possesses a unique barcode sequence, where the remainder (minority) of the population may possess barcodes that are redundant with those of other members within the remainder population, yet such redundancy can either be eliminated or otherwise adjusted for (e.g., normalized, averaged across/between redundant members, etc.) with only minor impact upon, e.g., the image resolution obtained when employing such a barcoded population. The approach of the instant disclosure enables the imaging-free localization of cell types and gene expression patterns in a biological sample with an approximate 10-micron resolution in an unbiased manner.
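The notion of a "sufficiently unique" barcode population can be made concrete with a short calculation. The sketch below assumes, purely for illustration, that each bead receives a barcode drawn uniformly at random and independently from all possible sequences of a given length; under that assumption, the probability that a given bead shares its barcode with none of the other n - 1 beads is (1 - 1/N)^(n - 1), where N is the number of possible barcodes. The barcode length and bead count used are hypothetical and are not taken from the disclosure.

```python
def barcode_space(length, alphabet_size=4):
    """Number of distinct barcode sequences of the given length."""
    return alphabet_size ** length


def expected_unique_fraction(n_beads, n_barcodes):
    """Expected fraction of beads whose barcode is shared with no other bead.

    Assumes barcodes are assigned uniformly at random and independently.
    """
    if n_beads < 1 or n_barcodes < 1:
        raise ValueError("n_beads and n_barcodes must be positive")
    return (1.0 - 1.0 / n_barcodes) ** (n_beads - 1)


# Hypothetical example: one million beads with 14-nucleotide barcodes.
space = barcode_space(14)
fraction = expected_unique_fraction(1_000_000, space)
```

Under these illustrative assumptions, roughly 99.6% of beads would be expected to carry a barcode found on no other bead, so only a small minority would need to be eliminated or otherwise adjusted for.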
[0087] The spatial organization of cells in tissue has a profound influence on their physiology, yet a high-throughput, imaging-free sequencing-based readout of gene expression with cellular resolution has been previously lacking. While, e.g., the "DNA microscopy" approach of U.S. Patent No. 11,339,390 has employed a combination of UMIs and unique event identifiers (UEIs) to generate a hierarchy of physical co-localization among groups of template nucleic acid molecules in a biological sample, the imaging-free adapted "Slide-seq" approach of the instant disclosure provides a method that was demonstrated to enable facile generation of large volumes of unbiased spatial transcriptomes with approximate 10-20 µm spatial resolution, comparable to the size of individual cells, through generation of high resolution reconstructions of macromolecule abundance obtained via use of dimensionality reduction methods that reconstruct the physical locations of macromolecules and associated abundance, using diffusion data (e.g., diffusion patterns). 
In certain aspects, to perform the adapted "Slide-seq" process of the disclosure, RNA is transferred from freshly frozen tissue sections onto a surface covered in polystyrene beads presenting barcoded DNA oligonucleotides, without any need for a priori knowledge of respective bead locations within a starting bead array, because the instant disclosure features latent space representation(s) of the population of macromolecules bound to such oligonucleotides/beads and oriented for array location based upon bead identifier sequences (i.e., barcodes), with such spatial orientation of beads based upon bead identifier sequences performed after macromolecules have been bound, beads/captured macromolecules have been mixed, and sequence information has been obtained, via use of a dimensionality reduction analysis that models and makes use of diffusion data to perform such orientation of array elements, which thereby generates an imaging-free spatial representation of macromolecule abundance from the biological sample. Sequencing of the bead-anchored RNA therefore allows for the assignment of beads to known cell types derived from scRNAseq data, revealing the spatial organization of cell types in the tissue with approximate 10 µm resolution, or greater (e.g., 5 µm resolution or greater, 2 µm resolution or greater, etc.). "Slide-seq" was initially applied to systematically characterize spatial gene expression patterns in the Purkinje layer of the mouse cerebellum, identifying several genes not previously associated with Purkinje cell compartments. Applying "Slide-seq" to a model of traumatic brain injury also allowed for the characterization of underlying genetic programs varying over time and space in response to injury. 
The "Slide-seq" approach and spatial reconstruction implementations such as those disclosed herein have provided improved methodologies to identify novel molecular patterns within tissues at high resolution and can accommodate large volumes of tissue, thereby enabling the generation of high-resolution transcriptome atlases at scale, among other applications.
[0088] The function of complex tissues is fundamentally tied to the organization of a tissue's resident cell types. The "Slide-seq" approach of, e.g., WO 2019/213254, provided an unbiased method for exploring genome-wide, spatial distributions of gene expression, as well as for other macromolecules. Advantages over other, art-recognized methods for spatial detection of analyte profiles (e.g., expression profiles) included, without limitation, optimization of the "Slide-seq" process for success - e.g., in certain aspects, it was identified as very important for "Slide-seq" that reverse transcription was performed immediately after the hybridization step, prior to a digestion step. When it was attempted, performing a digestion step first did not work. In addition, numerous different hybridization buffers were employed in different experiments, and several did not work. For example, using reverse transcriptase buffer during hybridization resulted in greatly reduced library sizes. A 6X SSC buffer was identified as effective, though use of 6X SSC supplemented with detergent (e.g., Triton-X) resulted in increased capture of RNA. "Slide-seq" analysis methods also provided improved methods for identifying (a) genes with correlated spatial distributions; (b) cell types with correlated spatial distributions; and (c) genes with significantly non-random spatial distributions. For "Slide-seq", methods for attachment of beads to a solid support resulted from a process of non-routine optimization. Several surface coverings were attempted, such as acrylamide and polydimethylsiloxane (PDMS), and the liquid electrical tape (vinyl polymer)-coated surface was identified as a preferred embodiment. In the "Slide-seq" process, methods for collecting cDNA from beads were also the result of non-routine optimization. Beads were attached reversibly to the liquid tape surface. 
Optionally, it was specifically contemplated that photocleavable beads could also be used, where the cDNA could be released from the beads using UV light. Inclusion and use of photolabile beads and/or photolabile conjugates was also expressly contemplated.
[0089] The instant compositions and methods build upon the successes and advantages of "Slide-seq" and related adaptations thereof, yet significantly improve cost and efficiency parameters for the end-user of a "Slide-seq" array (or, for that matter, for any user of a barcoded array of oligonucleotides capable of capturing macromolecules from a sample). Generating a latent space representation of the population of macromolecules bound to the oligonucleotides having array location identifier sequences by performing a dimensionality reduction analysis frees the "Slide-seq" or other arrayed process from the need for an imaging-based a priori determination of which oligonucleotides within an array are found at which positional locations (elements) in the array. The instant compositions and methods disclosed herein therefore enable imaging-free, high resolution spatial representation of macromolecule abundance from a biological sample via use of dimensionality reduction methods that reconstruct the physical locations of macromolecules and associated abundance, using diffusion data (e.g., diffusion patterns).
[0090] Tissue organization arises from the coordinated molecular programs of cells. Spatial genomics maps cells and their molecular programs within the spatial context of tissues. However, current methods measure spatial information through imaging or direct registration, which often require specialized equipment and are limited in scale. Here, an imaging-free spatial transcriptomics method was developed that uses molecular diffusion patterns to computationally reconstruct spatial data. To do so, a simple experimental protocol on two-dimensional barcode arrays was used to establish an interaction network between barcodes via molecular diffusion. Sequencing these interactions generates a high dimensional matrix of interactions between different spatial barcodes. Then, dimensionality reduction is performed to regenerate a two-dimensional manifold, which represents the spatial locations of the barcode arrays. Surprisingly, it was found that the UMAP algorithm, with minimal modifications, can faithfully reconstruct the arrays. This method is compatible, with high fidelity, with the capture array-based spatial transcriptomics/genomics methods Slide-seq and Slide-tags. The fidelity of the reconstruction was systematically explored through comparisons with experimentally derived ground truth data, and it was demonstrated that reconstruction generates high quality spatial genomics data. This technique was also scaled to reconstruct high-resolution spatial information over areas up to 7 centimeters in size. Further scaling to larger-sized tissue sections (e.g., to sizes including a complete cross-section of a target organ of even larger mammals, e.g., human brain) is also expressly contemplated, using the same techniques employed herein to scale puck/tissue sample sizes to 7 centimeters. This computational reconstruction method effectively converts spatial genomics measurements to molecular biology, enabling spatial transcriptomics with high accessibility and scalability.
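By way of illustration, the step in which sequenced barcode interactions are turned into a matrix may be sketched as follows. The read format (capture barcode, fiducial barcode, UMI), the function names, and the example records are hypothetical; the sketch only shows one plausible way to collapse duplicate UMIs and tabulate unique joint UMIs per barcode pair, and is not asserted to be the processing pipeline of the disclosure.

```python
from collections import defaultdict


def count_joint_umis(reads):
    """Count unique UMIs per (capture barcode, fiducial barcode) pair.

    `reads` is an iterable of (capture_barcode, fiducial_barcode, umi) tuples;
    repeated observations of the same UMI for a pair are counted once.
    """
    umis = defaultdict(set)
    for capture_bc, fiducial_bc, umi in reads:
        umis[(capture_bc, fiducial_bc)].add(umi)
    return {pair: len(seen) for pair, seen in umis.items()}


def to_dense(counts):
    """Arrange pair counts as a capture-by-fiducial matrix (list of rows)."""
    captures = sorted({c for c, _ in counts})
    fiducials = sorted({f for _, f in counts})
    row = {c: i for i, c in enumerate(captures)}
    col = {f: j for j, f in enumerate(fiducials)}
    matrix = [[0] * len(fiducials) for _ in captures]
    for (c, f), n in counts.items():
        matrix[row[c]][col[f]] = n
    return captures, fiducials, matrix


# Hypothetical reads; the second record duplicates the first (same UMI).
reads = [
    ("C1", "F1", "u1"), ("C1", "F1", "u1"), ("C1", "F1", "u2"),
    ("C1", "F2", "u3"), ("C2", "F2", "u4"),
]
counts = count_joint_umis(reads)
captures, fiducials, matrix = to_dense(counts)
```

In practice such a matrix is very sparse, because each capture bead interacts only with fiducial beads in its physical neighborhood, so a sparse representation would likely be used in place of the dense list of rows shown here.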
[0091] Tissue functions arise from the coordinated activities of cells. Interactions at multiple scales - from the level of molecules to communication between cells, to signaling across tissues - are all necessary for this coordinated function. Spatial transcriptomic technologies, which enable spatial localization of gene expression profiles within tissue contexts, represent a powerful set of tools to study tissue function and cell-cell interactions (Chen et al., 2015; Shah et al., 2016; Stahl et al., 2016; Wang et al., 2018; Rodriques et al., 2019). Currently, imaging plays a key role in spatial transcriptomics, either for directly locating RNA molecules within tissues or for indexing capture arrays. However, imaging, which requires specialized techniques and equipment, introduces several limitations on spatial transcriptomic approaches, such as throughput, adaptability and the constrained size of detectable areas (Moses & Pachter, 2022). Alternatively, arrays may be deterministically printed through lithography or physical methods, but such methods require complex equipment and high upfront costs (Stahl et al., 2016; Liu et al., 2020). An imaging-free spatial transcriptomic technique, such as that disclosed herein, can enhance the throughput and accessibility of experiments, and enable larger scale detection for comprehensive studies of tissues.
[0092] In theory, spatial information can be inferred independently of imaging. For example, molecular proximity measurements measure interactions (e.g., HiC; Lieberman-Aiden et al., 2009). Beyond proximity, spatial locations can be inferred directly from pairwise distance measurements. For example, the location of a mobile phone is accurately determined by measuring its distances to three different satellites. Similarly, the geographical layout of the United States can be mathematically reconstructed based solely on the pairwise distances between cities (Singer, 2008). Moreover, the concept of distance can be generalized from Euclidean distance to more complicated spatial variation. Human genetic variations, for example, tend to correlate geographically; the two-dimensional representation of European genetic variations has been shown to mimic the actual geography of Europe, revealing the geographic information that is inherent within genetic variations (Novembre et al., 2008). At the molecular level, diffusion patterns, which highlight neighboring information, have been used for reconstructing the spatial locations of molecules (Glaser et al., 2015; Boulgakov et al., 2020; Weinstein et al., 2019; Weinstein et al., 2023). These efforts to determine molecular locations without traditional imaging have broadened the methodologies for conducting spatial measurements. However, they have been established either only in theoretical simulations (Hoffecker et al., 2019; Greenstreet et al., 2023) or in simplified experimental systems (Weinstein et al., 2019).
[0093] As disclosed herein, an imaging-free spatial transcriptomics method has been developed that computationally reconstructs the spatial locations of barcode arrays used in spatial transcriptomics measurements with high resolution and fidelity. This imaging-free approach was initially implemented on two-dimensional (2D) barcode arrays, along with ground truth imaging for error estimation. Then, a dimensionality reduction method was utilized to reconstruct the spatial locations. This imaging-free strategy was remarkably demonstrated to integrate with existing barcode array-based spatial transcriptomics methods without perturbing spatial structures. The methods disclosed herein facilitate higher throughput generation of barcode arrays and are accessible to laboratories lacking specialized imaging equipment. Furthermore, the techniques disclosed herein have been applied to a tissue sample on a centimeter scale, demonstrating the potential of these methods for large-scale spatial transcriptomics.
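The general principle noted in paragraph [0092], that a location can be inferred from distance measurements alone, can be illustrated with a planar example. The following is a simplified sketch, assuming exact distances to three non-collinear reference points at known 2D positions; the function name and coordinates are hypothetical. It is offered as an analogy only and is not the reconstruction method of the disclosure, which does not rely on reference points at known positions.

```python
import math


def trilaterate(anchors, distances):
    """Locate a 2D point from its distances to three known anchor points.

    Subtracting the first circle equation from the other two yields a
    2x2 linear system, which is solved here by Cramer's rule.
    """
    (x1, y1), (x2, y2), (x3, y3) = anchors
    r1, r2, r3 = distances
    a11, a12 = 2 * (x2 - x1), 2 * (y2 - y1)
    a21, a22 = 2 * (x3 - x1), 2 * (y3 - y1)
    b1 = r1 ** 2 - r2 ** 2 + x2 ** 2 + y2 ** 2 - x1 ** 2 - y1 ** 2
    b2 = r1 ** 2 - r3 ** 2 + x3 ** 2 + y3 ** 2 - x1 ** 2 - y1 ** 2
    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        raise ValueError("anchors are collinear; location is not determined")
    return ((b1 * a22 - a12 * b2) / det, (a11 * b2 - a21 * b1) / det)


# Hypothetical anchors and a hidden target used only to generate distances.
anchors = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]
target = (1.0, 1.0)
distances = [math.dist(target, a) for a in anchors]
estimate = trilaterate(anchors, distances)
```

The diffusion-based setting differs in that no bead position is known in advance, which is why the disclosure turns to dimensionality reduction over the full interaction matrix rather than a closed-form solution of this kind.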
[0094] Other advantages of the instant disclosure are also described throughout the instant document, including those that would be apparent to a skilled artisan.
[0095] Various expressly contemplated components of certain compositions and methods of the instant disclosure are considered in additional detail below.
Definitions
[0096] Unless specifically stated or obvious from context, as used herein, the term “about” is understood as within a range of normal tolerance in the art, for example within 2 standard deviations of the mean. “About” can be understood as within 10%, 9%, 8%, 7%, 6%, 5%, 4%, 3%, 2%, 1%, 0.5%, 0.1%, 0.05%, or 0.01% of the stated value.
[0097] In certain embodiments, the term "approximately" or "about" refers to a range of values that fall within 25%, 20%, 19%, 18%, 17%, 16%, 15%, 14%, 13%, 12%, 11%, 10%, 9%, 8%, 7%, 6%, 5%, 4%, 3%, 2%, 1%, or less in either direction (greater than or less than) of the stated reference value unless otherwise stated or otherwise evident from the context (except where such number would exceed 100% of a possible value).
[0098] Unless otherwise clear from context, all numerical values provided herein are modified by the term “about.”
[0099] As used herein, the term "amplicon," when used in reference to a nucleic acid, means the product of copying the nucleic acid, wherein the product has a nucleotide sequence that is the same as or complementary to at least a portion of the nucleotide sequence of the nucleic acid. An amplicon can be produced by any of a variety of amplification methods that use the nucleic acid, or an amplicon thereof, as a template including, for example, polymerase extension, polymerase chain reaction (PCR), rolling circle amplification (RCA), multiple displacement amplification (MDA), ligation extension, or ligation chain reaction. An amplicon can be a nucleic acid molecule having a single copy of a particular nucleotide sequence (e.g. a PCR product) or multiple copies of the nucleotide sequence (e.g., a concatemeric product of RCA). A first amplicon of a target nucleic acid is typically a complementary copy. Subsequent amplicons are copies that are created, after generation of the first amplicon, from the target nucleic acid or from the first amplicon. A subsequent amplicon can have a sequence that is substantially complementary to the target nucleic acid or substantially identical to the target nucleic acid.
[0100] As used herein, the term "array" refers to a population of features or sites that can be differentiated from each other according to relative location. Different molecules that are at different sites of an array can be differentiated from each other according to the locations of the sites in the array. An individual site of an array can include one or more molecules of a particular type. For example, a site can include a single target nucleic acid molecule having a particular sequence or a site can include several nucleic acid molecules having the same sequence (and/or complementary sequence, thereof). The sites of an array can be different features located on the same substrate. Exemplary features include without limitation, wells in a substrate, beads (or other particles) in or on a substrate, projections from a substrate, ridges on a substrate, or channels in a substrate. The sites of an array can be separate substrates each bearing a different molecule. Different molecules attached to separate substrates can be identified according to the locations of the substrates on a surface to which the substrates are associated or according to the locations of the substrates in a liquid or gel. Exemplary arrays in which separate substrates are located on a surface include, without limitation, those having beads in wells, beads arranged upon a flat surface (e.g., a slide), optionally beads captured upon a flat surface (e.g., a layer of beads adhered to or otherwise stably associated with a slide (e.g., a layer of beads adsorbed to a slide-attached elastomeric surface)), etc.
[0101] As used herein, the term "attached" refers to the state of two things being joined, fastened, adhered, connected or bound to each other. For example, an analyte, such as a nucleic acid, can be attached to a material, such as a gel or solid support, by a covalent or non-covalent bond. A covalent bond is characterized by the sharing of pairs of electrons between atoms. A non-covalent bond is a chemical bond that does not involve the sharing of pairs of electrons and can include, for example, hydrogen bonds, ionic bonds, van der Waals forces, hydrophilic interactions and hydrophobic interactions.
[0102] As used herein, the term "barcode sequence" is intended to mean a series of nucleotides in a nucleic acid that can be used to identify the nucleic acid, a characteristic of the nucleic acid (e.g., the identity and optionally the location of a bead to which the nucleic acid is attached), or a manipulation that has been carried out on the nucleic acid. The barcode sequence can be a naturally occurring sequence or a sequence that does not occur naturally in the organism from which the barcoded nucleic acid was obtained. A barcode sequence can be unique to a single nucleic acid species in a population or a barcode sequence can be shared by several different nucleic acid species in a population (e.g., all nucleic acid species attached to a single bead might possess the same barcode sequence, while different beads present a different shared barcode sequence that serves to identify each such different bead). By way of further example, each nucleic acid probe in a population can include different barcode sequences from all other nucleic acid probes in the population. Alternatively, each nucleic acid probe in a population can include different barcode sequences from some or most other nucleic acid probes in a population. For example, each probe in a population can have a barcode that is present for several different probes in the population even though the probes with the common barcode differ from each other at other sequence regions along their length. In particular embodiments, one or more barcode sequences that are used with a biological specimen (e.g., a tissue sample) are not present in the genome, transcriptome or other nucleic acids of the biological specimen. For example, barcode sequences can have less than 80%, 70%, 60%, 50% or 40% sequence identity to the nucleic acid sequences in a particular biological specimen.
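By way of non-limiting illustration, the sequence-identity criterion described above can be screened computationally. The following is a minimal sketch, assuming an ungapped comparison of a barcode against every same-length window of a reference sequence; a practical pipeline would more likely use a dedicated aligner and would also check the reverse complement. The function names, example sequences and default threshold are hypothetical and are not taken from the disclosure.

```python
def max_ungapped_identity(barcode: str, reference: str) -> float:
    """Return the highest fraction of matching bases between the barcode
    and any same-length window of the reference (no gaps allowed)."""
    k = len(barcode)
    if k == 0 or len(reference) < k:
        return 0.0
    best = 0.0
    for start in range(len(reference) - k + 1):
        window = reference[start:start + k]
        matches = sum(1 for a, b in zip(barcode, window) if a == b)
        best = max(best, matches / k)
    return best


def passes_identity_screen(barcode, references, threshold=0.8):
    """True if the barcode stays below the identity threshold
    (assumed here to be 80%) against every reference sequence."""
    return all(max_ungapped_identity(barcode, ref) < threshold
               for ref in references)


# Hypothetical usage: a barcode found verbatim in the specimen fails the screen.
print(passes_identity_screen("ACGT", ["TTACGTTT"]))  # False
print(passes_identity_screen("AAAA", ["CCCCCC"]))    # True
```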
[0103] As used herein, "beads", "microbeads", "microspheres" or "particles" or grammatical equivalents can include small discrete particles. The composition of the beads can vary, depending upon the class of capture probe, the method of synthesis, and other factors. In certain embodiments of the instant disclosure, the sizes of the beads of the instant disclosure tend to range from 1 µm to 100 µm in diameter (with all subranges within this range expressly contemplated), e.g., depending upon the extent of image resolution desired, nature of the solid support to be used for spatial bead array construction, sequencing processes (e.g., flow cell sequencing) to be employed, as well as other factors.
[0104] As used herein, the term "biological specimen" is intended to mean one or more cell, tissue, organism or portion thereof. A biological specimen can be obtained from any of a variety of organisms. Exemplary organisms include, but are not limited to, a mammal such as a rodent, mouse, rat, rabbit, guinea pig, ungulate, horse, sheep, pig, goat, cow, cat, dog, primate (i.e. human or non-human primate); a plant such as Arabidopsis thaliana, corn, sorghum, oat, wheat, rice, canola, or soybean; an algae such as Chlamydomonas reinhardtii; a nematode such as Caenorhabditis elegans; an insect such as Drosophila melanogaster, mosquito, fruit fly, honey bee or spider; a fish such as zebrafish; a reptile; an amphibian such as a frog or Xenopus laevis; a Dictyostelium discoideum; a fungi such as Pneumocystis carinii, Takifugu rubripes, yeast (e.g., Saccharomyces cerevisiae, Schizosaccharomyces pombe) or a Plasmodium falciparum. Target nucleic acids can also be derived from a prokaryote such as a bacterium, e.g., Escherichia coli, Staphylococci or Mycoplasma pneumoniae; an archaeon; a virus such as Hepatitis C virus or human immunodeficiency virus; or a viroid. Specimens can be derived from a homogeneous culture or population of the above organisms or alternatively from a collection of several different organisms, for example, in a community or ecosystem.
[0105] As used herein, the term "cleavage site" is intended to mean a location in a nucleic acid molecule that is susceptible to bond breakage. The location can be specific to a particular chemical, enzymatic or physical process that results in bond breakage. For example, the location can be a nucleotide that is abasic or a nucleotide that has a base that is susceptible to being removed to create an abasic site. Examples of nucleotides that are susceptible to being removed include uracil and 8-oxo-guanine as set forth in further detail herein below. The location can also be at or near a recognition sequence for a restriction endonuclease such as a nicking enzyme.
[0106] By “control” or “reference” is meant a standard of comparison. Methods to select and test control samples are within the ability of those in the art. Determination of statistical significance is within the ability of those skilled in the art, e.g., the number of standard deviations from the mean that constitute a positive result.
[0107] As used herein, the term “cryosection” refers to a piece of tissue, e.g., a biopsy, that has been obtained from a subject, snap frozen, embedded in optimal cutting temperature embedding material, frozen, and cut into thin sections. In certain embodiments, the thin sections can be directly applied to an array of beads captured upon a solid support (e.g., a slide), or the thin sections can be fixed (e.g., in methanol or paraformaldehyde) and applied to a bead-presenting planar surface, e.g., a slide upon which a layer of microbeads has been attached/arrayed.
[0108] As used herein, the term "different", when used in reference to nucleic acids, means that the nucleic acids have nucleotide sequences that are not the same as each other. Two or more nucleic acids can have nucleotide sequences that are different along their entire length. Alternatively, two or more nucleic acids can have nucleotide sequences that are different along a substantial portion of their length. For example, two or more nucleic acids can have target nucleotide sequence portions that are different for the two or more molecules while also having a universal sequence portion that is the same on the two or more molecules. Two beads can be different from each other by virtue of being attached to different nucleic acids.
[0109] As used herein, the term "each," when used in reference to a collection of items, is intended to identify an individual item in the collection but does not necessarily refer to every item in the collection. Exceptions can occur if explicit disclosure or context clearly dictates otherwise.
[0110] As used herein, the term "extend," when used in reference to a nucleic acid, is intended to mean addition of at least one nucleotide or oligonucleotide to the nucleic acid. In particular embodiments, one or more nucleotides can be added to the 3' end of a nucleic acid, for example, via polymerase catalysis (e.g., DNA polymerase, RNA polymerase or reverse transcriptase). Chemical or enzymatic methods can be used to add one or more nucleotides to the 3' or 5' end of a nucleic acid. One or more oligonucleotides can be added to the 3' or 5' end of a nucleic acid, for example, via chemical or enzymatic (e.g., ligase catalysis) methods. A nucleic acid can be extended in a template directed manner, whereby the product of extension is complementary to a template nucleic acid that is hybridized to the nucleic acid that is extended.
[0111] As used herein, the term "feature" means a location in an array for a particular species of molecule. A feature can contain only a single molecule or it can contain a population of several molecules of the same species. Features of an array are typically discrete. The discrete features can be contiguous, or they can have spaces between each other. The size of the features and/or spacing between the features can vary such that arrays can be high density, medium density or low density. High density arrays are characterized as having sites separated by less than about 15 µm. Medium density arrays have sites separated by about 15 to 30 µm, while low density arrays have sites separated by greater than 30 µm. An array useful herein can have, for example, sites that are separated by less than 100 µm, 50 µm, 10 µm, 5 µm, 1 µm, or 0.5 µm. An apparatus or method of the present disclosure can be used to detect an array at a resolution sufficient to distinguish sites at the above densities or density ranges.
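By way of non-limiting illustration, the density classes defined above can be expressed as a simple lookup on site separation in micrometers. This is a minimal sketch; the function name is hypothetical, and because the thresholds are stated only as approximate ("about"), the treatment of the exact boundary values 15 and 30 as medium density is an assumption.

```python
def classify_array_density(site_separation_um: float) -> str:
    """Classify an array by the separation between sites, in micrometers.

    Thresholds follow the definition of "feature": high density below
    about 15, medium density from about 15 to 30, low density above 30.
    Assigning the boundary values to "medium" is an assumption.
    """
    if site_separation_um <= 0:
        raise ValueError("site separation must be positive")
    if site_separation_um < 15:
        return "high"
    if site_separation_um <= 30:
        return "medium"
    return "low"


# Hypothetical usage with a 10 micrometer site separation.
print(classify_array_density(10))  # high
```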
[0112] The terms “isolated,” “purified,” or “biologically pure” refer to material that is free to varying degrees from components which normally accompany it as found in its native state. “Isolate” denotes a degree of separation from an original source or surroundings. “Purify” denotes a degree of separation that is higher than isolation.
[0113] As used herein, the term "next-generation sequencing" or "NGS" can refer to sequencing technologies that have the capacity to sequence polynucleotides at speeds that were unprecedented using conventional sequencing methods (e.g., standard Sanger or Maxam-Gilbert sequencing methods). These unprecedented speeds are achieved by performing and reading out thousands to millions of sequencing reactions in parallel. NGS sequencing platforms include, but are not limited to, the following: Massively Parallel Signature Sequencing (Lynx Therapeutics); 454 pyrosequencing (454 Life Sciences/Roche Diagnostics); solid-phase, reversible dye-terminator sequencing (Solexa/Illumina™); SOLiD™ technology (Applied Biosystems); Ion semiconductor sequencing (Ion Torrent™); and DNA nanoball sequencing (Complete Genomics). Descriptions of certain NGS platforms can be found in the following: Shendure, et al., "Next-generation DNA sequencing," Nature, 2008, vol. 26, No. 10, 1135-1145; Mardis, "The impact of next-generation sequencing technology on genetics," Trends in Genetics, 2007, vol. 24, No. 3, pp. 133-141; Su, et al., "Next-generation sequencing and its applications in molecular diagnostics" Expert Rev Mol Diagn, 2011, 11(3):333-43; and Zhang et al., "The impact of next-generation sequencing on genomics", J Genet Genomics, 2011, 38(3):95-109.
[0114] As used herein, the terms "nucleic acid" and "nucleotide" are intended to be consistent with their use in the art and to include naturally occurring species or functional analogs thereof. Particularly useful functional analogs of nucleic acids are capable of hybridizing to a nucleic acid in a sequence specific fashion or capable of being used as a template for replication of a particular nucleotide sequence.
[0115] Naturally occurring nucleic acids generally have a backbone containing phosphodiester bonds. An analog structure can have an alternate backbone linkage including any of a variety of those known in the art. Naturally occurring nucleic acids generally have a deoxyribose sugar (e.g. found in deoxyribonucleic acid (DNA)) or a ribose sugar (e.g. found in ribonucleic acid (RNA)). A nucleic acid can contain nucleotides having any of a variety of analogs of these sugar moieties that are known in the art. A nucleic acid can include native or non-native nucleotides. In this regard, a native deoxyribonucleic acid can have one or more bases selected from the group consisting of adenine, thymine, cytosine or guanine and a ribonucleic acid can have one or more bases selected from the group consisting of uracil, adenine, cytosine or guanine. Useful non-native bases that can be included in a nucleic acid or nucleotide are known in the art. The terms "probe" or "target," when used in reference to a nucleic acid or sequence of a nucleic acid, are intended as semantic identifiers for the nucleic acid or sequence in the context of a method or composition set forth herein and do not necessarily limit the structure or function of the nucleic acid or sequence beyond what is otherwise explicitly indicated. The terms "probe" and "target" can be similarly applied to other analytes such as proteins, small molecules, cells or the like.
[0116] As used herein, the term "poly T or poly A," when used in reference to a nucleic acid sequence, is intended to mean a series of two or more thymine (T) or adenine (A) bases, respectively. A poly T or poly A can include at least about 2, 5, 8, 10, 12, 15, 18, 20 or more of the T or A bases, respectively. Alternatively or additionally, a poly T or poly A can include at most about 30, 20, 18, 15, 12, 10, 8, 5 or 2 of the T or A bases, respectively.
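By way of non-limiting illustration, a poly T or poly A stretch meeting a chosen minimum length can be located in a sequence with a regular expression. This is a minimal sketch using only the Python standard library; the function name, the example sequence and the chosen minimum lengths are hypothetical.

```python
import re


def find_homopolymer_runs(sequence: str, base: str = "T", min_length: int = 2):
    """Return (start, length) for every run of `base` at least `min_length` long.

    `base` must be "T" (poly T) or "A" (poly A); positions are 0-based.
    """
    if base not in ("A", "T"):
        raise ValueError("base must be 'A' or 'T'")
    if min_length < 1:
        raise ValueError("min_length must be at least 1")
    # Builds a pattern such as "T{5,}" (five or more consecutive T bases).
    pattern = re.compile(f"{base}{{{min_length},}}")
    return [(m.start(), m.end() - m.start())
            for m in pattern.finditer(sequence.upper())]


# Hypothetical usage: one run of six T bases starts at position 3.
print(find_homopolymer_runs("ACGTTTTTTGCATT", base="T", min_length=5))  # [(3, 6)]
```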
[0117] As used herein, the term "random" can be used to refer to the spatial arrangement or composition of locations on a surface. For example, there are at least two types of order for an array described herein, the first relating to the spacing and relative location of features (also called "sites") and the second relating to identity or predetermined knowledge of the particular species of molecule that is present at a particular feature. Accordingly, features of an array can be randomly spaced such that nearest neighbor features have variable spacing between each other. Alternatively, the spacing between features can be ordered, for example, forming a regular pattern such as a rectilinear grid or hexagonal grid. In another respect, features of an array can be random with respect to the identity or predetermined knowledge of the species of analyte (e.g., nucleic acid of a particular sequence) that occupies each feature independent of whether spacing produces a random pattern or ordered pattern. An array set forth herein can be ordered in one respect and random in another. For example, in some embodiments set forth herein, a surface is contacted with a population of nucleic acids under conditions where the nucleic acids attach at sites that are ordered with respect to their relative locations but 'randomly located' with respect to knowledge of the sequence for the nucleic acid species present at any particular site. Reference to "randomly distributing" nucleic acids at locations on a surface is intended to refer to the absence of knowledge or absence of predetermination regarding which nucleic acid will be captured at which location (regardless of whether the locations are arranged in an ordered pattern or not).
[0118] As used herein, the term "solid support" refers to a rigid substrate that is insoluble in aqueous liquid. The substrate can be non-porous or porous. The substrate can optionally be capable of taking up a liquid (e.g., due to porosity) but will typically be sufficiently rigid that the substrate does not swell substantially when taking up the liquid and does not contract substantially when the liquid is removed by drying. A nonporous solid support is generally impermeable to liquids or gases. Exemplary solid supports include, but are not limited to, glass and modified or functionalized glass, plastics (including acrylics, polystyrene and copolymers of styrene and other materials, polypropylene, polyethylene, polybutylene, polyurethanes, Teflon™, cyclic olefins, polyimides etc.), nylon, ceramics, resins, Zeonor®, silica or silica-based materials including silicon and modified silicon, carbon, metals, inorganic glasses, optical fiber bundles, and polymers. Particularly useful solid supports for some embodiments are slides and beads capable of assorting/packing upon the surface of a slide (e.g., beads to which a large number of oligonucleotides are attached).
[0119] As used herein, the term "spatial tag" is intended to mean a nucleic acid having a sequence that is indicative of a location. Typically, the nucleic acid is a synthetic molecule having a sequence that is not found in one or more biological specimen that will be used with the nucleic acid. However, in some embodiments the nucleic acid molecule can be naturally derived or the sequence of the nucleic acid can be naturally occurring, for example, in a biological specimen that is used with the nucleic acid. The location indicated by a spatial tag can be a location in or on a biological specimen, in or on a solid support or a combination thereof. A barcode sequence can function as a spatial tag. In certain embodiments, the identification of the tag that serves as a spatial tag is only determined after a population of beads (each possessing a distinct barcode sequence) has been arrayed upon a solid support (optionally randomly arrayed upon a solid support) and sequencing of such a bead-associated barcode sequence has been determined in situ upon the solid support.
[0120] As used herein, the term "subject" includes humans and mammals (e.g., mice, rats, pigs, cats, dogs, and horses). In many embodiments, subjects are mammals, particularly primates, especially humans. In some embodiments, subjects are livestock such as cattle, sheep, goats, cows, swine, and the like; poultry such as chickens, ducks, geese, turkeys, and the like; and domesticated animals particularly pets such as dogs and cats. In some embodiments (e.g., particularly in research contexts) subject mammals will be, for example, rodents (e.g., mice, rats, hamsters), rabbits, primates, or swine such as inbred pigs and the like.
[0121] As used herein, the term "tissue" is intended to mean an aggregation of cells, and, optionally, intercellular matter. Typically, the cells in a tissue are not free floating in solution and instead are attached to each other to form a multicellular structure. Exemplary tissue types include muscle, nerve, epidermal and connective tissues.
[0122] As used herein, the term "universal sequence" refers to a series of nucleotides that is common to two or more nucleic acid molecules even if the molecules also have regions of sequence that differ from each other. A universal sequence that is present in different members of a collection of molecules can allow capture of multiple different nucleic acids using a population of universal capture nucleic acids that are complementary to the universal sequence. Similarly, a universal sequence present in different members of a collection of molecules can allow the replication or amplification of multiple different nucleic acids using a population of universal primers that are complementary to the universal sequence. Thus, a universal capture nucleic acid or a universal primer includes a sequence that can hybridize specifically to a universal sequence. Target nucleic acid molecules may be modified to attach universal adapters, for example, at one or both ends of the different target sequences.
[0123] Unless specifically stated or obvious from context, as used herein, the term "or" is understood to be inclusive. Unless specifically stated or obvious from context, as used herein, the terms "a", "an", and "the" are understood to be singular or plural.
[0124] Ranges can be expressed herein as from "about" one particular value, and/or to "about" another particular value. When such a range is expressed, another aspect includes from the one particular value and/or to the other particular value. Similarly, when values are expressed as approximations, by use of the antecedent "about," it is understood that the particular value forms another aspect. It is further understood that the endpoints of each of the ranges are significant both in relation to the other endpoint, and independently of the other endpoint. It is also understood that there are a number of values disclosed herein, and that each value is also herein disclosed as "about" that particular value in addition to the value itself. It is also understood that throughout the application, data are provided in a number of different formats and that these data represent endpoints and starting points and ranges for any combination of the data points. For example, if a particular data point "10" and a particular data point "15" are disclosed, it is understood that greater than, greater than or equal to, less than, less than or equal to, and equal to 10 and 15 are considered disclosed as well as between 10 and 15. It is also understood that each unit between two particular units is also disclosed. For example, if 10 and 15 are disclosed, then 11, 12, 13, and 14 are also disclosed.
[0125] Ranges provided herein are understood to be shorthand for all of the values within the range. For example, a range of 1 to 50 is understood to include any number, combination of numbers, or sub-range from the group consisting of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, or 50 as well as all intervening decimal values between the aforementioned integers such as, for example, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, and 1.9. With respect to sub-ranges, "nested sub-ranges" that extend from either end point of the range are specifically contemplated. For example, a nested sub-range of an exemplary range of 1 to 50 may comprise 1 to 10, 1 to 20, 1 to 30, and 1 to 40 in one direction, or 50 to 40, 50 to 30, 50 to 20, and 50 to 10 in the other direction.
[0126] The transitional term "comprising," which is synonymous with "including," "containing," or "characterized by," is inclusive or open-ended and does not exclude additional, unrecited elements or method steps. By contrast, the transitional phrase "consisting of" excludes any element, step, or ingredient not specified in the claim. The transitional phrase "consisting essentially of" limits the scope of a claim to the specified materials or steps "and those that do not materially affect the basic and novel characteristic(s)" of the claimed invention.
[0127] The embodiments set forth below and recited in the claims can be understood in view of the above definitions.
[0128] Other features and advantages of the disclosure will be apparent from the following description of the preferred embodiments thereof, and from the claims. Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure belongs. Although methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present disclosure, suitable methods and materials are described below. All published foreign patents and patent applications cited herein are incorporated herein by reference. All other published references, documents, manuscripts and scientific literature cited herein are incorporated herein by reference. In the case of conflict, the present specification, including definitions, will control. In addition, the materials, methods, and examples are illustrative only and not intended to be limiting.
Solid Supports
[0129] In certain aspects, the present disclosure provides a method for generating and using a spatially tagged array of oligonucleotides (e.g., an array of microbead-attached oligonucleotides) to perform deep expression profiling upon biological samples, e.g., cryosectioned tissue samples, with high resolution of reconstructions. The method can include the steps of (a) attaching different nucleic acid probes to array elements (optionally bound to a solid support) and/or to beads that are then captured upon a solid support to produce randomly located probe-possessing beads on the solid support, wherein the different nucleic acid probes each includes a barcode sequence (that is shared by all such nucleic acid probes of a single bead and/or array element), and wherein each of the randomly located beads ideally includes a barcode sequence(s) that is different from other randomly located beads on the solid support; (b) contacting a biological specimen with the solid support that has the array of probes/oligonucleotides and/or randomly located beads thereon; (c) hybridizing the probes presented by the array (e.g., the randomly located beads) to target nucleic acids from portions of the biological specimen that are proximal to the randomly located array elements (e.g., beads); (d) extending the probes of the randomly located beads to produce extended probes that include the barcode sequences and sequences from the target nucleic acids, thereby spatially tagging the nucleic acids of the biological specimen; (e) obtaining sequence information for captured macromolecules, including barcode sequences and sequences from the target nucleic acids; and (f) generating a latent space representation of the population of macromolecules bound to the oligonucleotides having array location identifier sequences by performing a dimensionality reduction analysis, thereby generating an imaging-free spatial representation of macromolecule abundance from the biological sample.
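By way of non-limiting illustration, steps (e) and (f) above can be sketched as tallying sequenced (barcode, target) pairs into a count matrix and then reducing its dimensionality. The disclosure does not name a particular dimensionality reduction technique at this point, so the principal component analysis by power iteration used below is an assumption chosen only because it needs nothing beyond the Python standard library. The function names, the read representation and the toy data are likewise hypothetical.

```python
import math
import random
from collections import Counter


def count_matrix(tagged_reads):
    """Tally (barcode, target) pairs into a barcode-by-target count matrix."""
    counts = Counter(tagged_reads)
    barcodes = sorted({b for b, _ in counts})
    targets = sorted({t for _, t in counts})
    matrix = [[float(counts[(b, t)]) for t in targets] for b in barcodes]
    return barcodes, targets, matrix


def pca_embed(matrix, n_components=2, n_iter=100, seed=0):
    """Project rows onto leading principal components (power iteration).

    PCA stands in here for whichever dimensionality reduction is used.
    """
    rng = random.Random(seed)
    n_rows, n_cols = len(matrix), len(matrix[0])
    means = [sum(row[j] for row in matrix) / n_rows for j in range(n_cols)]
    resid = [[row[j] - means[j] for j in range(n_cols)] for row in matrix]
    embedding = [[] for _ in range(n_rows)]
    for _component in range(n_components):
        v = [rng.uniform(-1.0, 1.0) for _ in range(n_cols)]
        for _step in range(n_iter):
            scores = [sum(r[j] * v[j] for j in range(n_cols)) for r in resid]
            w = [sum(resid[i][j] * scores[i] for i in range(n_rows))
                 for j in range(n_cols)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:
                break  # no variance left to explain
            v = [x / norm for x in w]
        scores = [sum(r[j] * v[j] for j in range(n_cols)) for r in resid]
        for i in range(n_rows):
            embedding[i].append(scores[i])
            # Deflate: remove the component just extracted.
            for j in range(n_cols):
                resid[i][j] -= scores[i] * v[j]
    return embedding


# Hypothetical toy data: two beads dominated by geneX, two by geneY.
reads = ([("AAA", "geneX")] * 5 + [("AAC", "geneX")] * 4 + [("AAC", "geneY")]
         + [("GGG", "geneY")] * 5 + [("GGT", "geneY")] * 4 + [("GGT", "geneX")])
barcodes, targets, matrix = count_matrix(reads)
for barcode, coords in zip(barcodes, pca_embed(matrix)):
    print(barcode, [round(c, 3) for c in coords])
```

In this toy example the first latent coordinate separates the geneX-dominated beads from the geneY-dominated beads. The sign of each coordinate is arbitrary, as is usual for principal components.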
[0130] Any of a variety of solid supports can be used in a method, composition or apparatus of the present disclosure. Particularly useful solid supports are those used for nucleic acid arrays. Examples include glass, modified glass, functionalized glass, inorganic glasses, microspheres (e.g., inert and/or magnetic particles), plastics, polysaccharides, nylon, nitrocellulose, ceramics, resins, silica, silica-based materials, carbon, metals, an optical fiber or optical fiber bundles, polymers and multiwell (e.g. microtiter) plates. Exemplary plastics include acrylics, polystyrene, copolymers of styrene and other materials, polypropylene, polyethylene, polybutylene, polyurethanes and Teflon™. Exemplary silica-based materials include silicon and various forms of modified silicon.
[0131] In particular embodiments, a solid support can be within or part of a vessel such as a well, tube, channel, cuvette, Petri plate, bottle or the like. Optionally, the vessel is a flow-cell, for example, as described in WO 2014/142841 A1; U.S. Pat. App. Pub. No. 2010/0111768 A1 and U.S. Pat. No. 8,951,781 or Bentley et al., Nature 456:53-59 (2008), each of which is incorporated herein by reference. Exemplary flow-cells are those that are commercially available from Illumina®, Inc. (San Diego, CA) for use with a sequencing platform such as a Genome Analyzer®, MiSeq®, NextSeq® or HiSeq® platform. Optionally, the vessel is a well in a multiwell plate or microtiter plate.
[0132] In certain embodiments, a solid support can include a gel coating. Attachment, e.g., of nucleic acids to a solid support via a gel is exemplified by flow cells available commercially from Illumina Inc. (San Diego, CA) or described in US Pat. App. Pub. Nos. 2011/0059865 A1, 2014/0079923 A1, or 2015/0005447 A1; or PCT Publ. No. WO 2008/093098, each of which is incorporated herein by reference. Exemplary gels that can be used in the methods and apparatus set forth herein include, but are not limited to, those having a colloidal structure, such as agarose; polymer mesh structure, such as gelatin; or cross-linked polymer structure, such as polyacrylamide, SFA (see, for example, US Pat. App. Pub. No. 2011/0059865 A1, which is incorporated herein by reference) or PAZAM (see, for example, US Pat. App. Publ. Nos. 2014/0079923 A1, or 2015/0005447 A1, each of which is incorporated herein by reference).
[0133] In some embodiments, a solid support can be configured as an array of features to which beads can be attached. The features can be present in any of a variety of desired formats. For example, the features can be wells, pits, channels, ridges, raised regions, pegs, posts or the like. Exemplary features include wells that are present in substrates used for commercial sequencing platforms sold by 454 LifeSciences (a subsidiary of Roche, Basel Switzerland) or Ion Torrent (a subsidiary of Life Technologies, Carlsbad California). Other substrates having wells include, for example, etched fiber optics and other substrates described in US Pat. Nos. 6,266,459; 6,355,431; 6,770,441; 6,859,570; 6,210,891; 6,258,568; 6,274,320; US Pat. App. Publ. Nos. 2009/0026082 A1; 2009/0127589 A1; 2010/0137143 A1; 2010/0282617 A1 or PCT Publication No. WO 00/63437, each of which is incorporated herein by reference. In some embodiments, wells of a substrate can include gel material (with or without beads) as set forth in US Pat. App. Publ. No. 2014/0243224 A1, which is incorporated herein by reference.
[0134] Features can appear on a solid support as a grid of spots or patches. The features can be located in a repeating pattern or in an irregular, non-repeating pattern. Optionally, repeating patterns can include hexagonal patterns, rectilinear patterns, grid patterns, patterns having reflective symmetry, patterns having rotational symmetry, or the like. Asymmetric patterns can also be useful. The pitch of an array can be the same between different pairs of nearest neighbor features or the pitch can vary between different pairs of nearest neighbor features.
[0135] In particular embodiments, features on a solid support can each have an area that is larger than about 100 nm², 250 nm², 500 nm², 1 µm², 2.5 µm², 5 µm², 10 µm² or 50 µm². Alternatively or additionally, features can each have an area that is smaller than about 50 µm², 25 µm², 10 µm², 5 µm², 1 µm², 500 nm², or 100 nm². The preceding ranges can describe the apparent area of a bead or other particle on a solid support when viewed or imaged from above.
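By way of non-limiting illustration, the apparent area of a bead viewed from above can be related to its diameter. The following is a minimal sketch, assuming a spherical bead whose projection is a circle of area π(d/2)²; irregular particles would need a different treatment. The function names are hypothetical.

```python
import math


def apparent_area_um2(diameter_um: float) -> float:
    """Projected (top-down) area, in square micrometers, of a spherical bead."""
    if diameter_um <= 0:
        raise ValueError("diameter must be positive")
    return math.pi * (diameter_um / 2.0) ** 2


def diameter_for_area_um(area_um2: float) -> float:
    """Diameter, in micrometers, of a spherical bead with the given projected area."""
    if area_um2 <= 0:
        raise ValueError("area must be positive")
    return 2.0 * math.sqrt(area_um2 / math.pi)


# Hypothetical usage: projected area of a 10 micrometer bead, and the
# diameter corresponding to a 50 square micrometer feature.
print(round(apparent_area_um2(10.0), 2))
print(round(diameter_for_area_um(50.0), 2))
```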
Beads
[0136] Certain aspects of the instant disclosure employ a collection of beads or other particles, to which oligonucleotides are attached. Suitable bead compositions include those used in peptide, nucleic acid and organic moiety synthesis, including, but not limited to, plastics, ceramics, glass, polystyrene, methylstyrene, acrylic polymers, paramagnetic materials, thoria sol, carbon graphite, titanium dioxide, latex or cross-linked dextrans such as Sepharose, cellulose, nylon, cross-linked micelles and Teflon®. "Microsphere Detection Guide" from Bangs Laboratories, Fishers, IN is a helpful guide, which is incorporated herein by reference in its entirety. The beads need not be spherical; irregular particles may be used. In addition, the beads may be porous, thus increasing the surface area of the bead available for either capture probe attachment or tag attachment. The bead sizes can range from nanometers, for example, 100 nm, to millimeters, for example, 1 mm, with beads from about 0.2 µm to about 200 µm commonly employed, and from about 5 to about 20 µm being within the range currently exemplified, although in some embodiments smaller or larger beads may be used.
[0137] The particles can be suspended in a solution or they can be located on the surface of a substrate (e.g., arrayed upon the surface of a solid support, such as a glass slide). Art-recognized examples of arrays having beads located on a surface include those wherein beads are located in wells such as a BeadChip array (Illumina Inc., San Diego CA), substrates used in sequencing platforms from 454 LifeSciences (a subsidiary of Roche, Basel Switzerland) or substrates used in sequencing platforms from Ion Torrent (a subsidiary of Life Technologies, Carlsbad California). Other solid supports having beads located on a surface are described in US Pat. Nos. 6,266,459; 6,355,431; 6,770,441; 6,859,570; 6,210,891; 6,258,568; or 6,274,320; US Pat. App. Publ. Nos. 2009/0026082 A1; 2009/0127589 A1; 2010/0137143 A1; or 2010/0282617 A1 or PCT Publication No. WO 00/63437, each of which is incorporated herein by reference. Several of the above references describe methods for attaching nucleic acid probes to beads prior to loading the beads in or on a solid support. As such, the collection of beads can include different beads each having a unique (or sufficiently unique and/or near-unique, as described elsewhere herein) probe attached. It will, however, be understood that the beads can be made to include universal primers, and the beads can then be loaded onto an array, thereby forming universal arrays for use in a method set forth herein. The solid supports typically used for bead arrays can be used without beads. For example, nucleic acids, such as probes or primers can be attached directly to the wells or to gel material in wells. Thus, the above references are illustrative of materials, compositions or apparatus that can be modified for use in the methods and compositions set forth herein.
[0138] Accordingly, the instant methods can employ an array of beads, wherein different nucleic acid probes are attached to different beads in the array. In this embodiment, each bead can be attached to a different nucleic acid probe and the beads can be randomly distributed on the solid support in order to effectively attach the different nucleic acid probes to the solid support. Optionally, the solid support can include wells having dimensions that accommodate no more than a single bead. In such a configuration, the beads may be attached to the wells due to forces resulting from the fit of the beads in the wells. As described elsewhere herein, it is also possible to use attachment chemistries or capture materials (e.g., liquid electrical tape) to adhere or otherwise stably associate the beads with a solid support, optionally including holding the beads in wells that may or may not be present on a solid support.
[0139] Nucleic acid probes that are attached to beads can include barcode sequences. A population of the beads can be configured such that each bead is attached to only one type of barcode (e.g., a spatial barcode) and many different beads each with a different barcode are present in the population. In this embodiment, randomly distributing the beads to a solid support will result in randomly locating the nucleic acid probe-presenting beads (and their respective barcode sequences) on the solid support. In some cases, there can be multiple beads with the same barcode sequence such that there is redundancy in the population. However, randomly distributing a redundancy-comprising population of beads on a solid support - especially one that has a capacity that is greater than the number of unique barcodes in the bead population - will tend to result in redundancy of barcodes on the solid support, which will tend to reduce image resolution in the context of the instant disclosure (i.e., where the precise location of a barcoded bead cannot be resolved due to redundancy of barcode use within an arrayed population of beads, it is contemplated that such redundant locations will simply be eliminated from an ultimate image produced by methods of the instant disclosure, or other modes of adjustment (e.g., normalization and/or averaging of values) may also be employed to address such redundancies). Alternatively, in preferred embodiments, the number of different barcodes in a population of beads can exceed the capacity of the solid support in order to produce an array that is not redundant with respect to the population of barcodes on the solid support. The capacity of the solid support will be determined in some embodiments by the number of features (e.g. single-bead occupancy wells) that attach or otherwise accommodate a bead.
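The redundancy consideration described above can be illustrated with a short calculation. The following is a minimal sketch, not part of the disclosed methods, and it assumes an idealized model in which each bead independently carries one of a fixed number of equally likely barcodes. The function name and parameters are hypothetical.

```python
def expected_unique_fraction(num_barcodes: int, num_beads: int) -> float:
    """Expected fraction of arrayed beads whose barcode is shared by no other bead.

    Assumes (idealization) that every bead draws its barcode independently and
    uniformly from `num_barcodes` possibilities. A given bead is unique when
    none of the other (num_beads - 1) beads drew the same barcode.
    """
    if num_barcodes < 1 or num_beads < 1:
        raise ValueError("num_barcodes and num_beads must be positive")
    return (1.0 - 1.0 / num_barcodes) ** (num_beads - 1)


if __name__ == "__main__":
    # Barcode diversity well above array capacity: most beads are unique.
    print(expected_unique_fraction(10_000_000, 1_000_000))
    # Barcode diversity below array capacity: redundancy dominates.
    print(expected_unique_fraction(100_000, 1_000_000))
```

Under this model, raising barcode diversity relative to the number of arrayed beads drives the unique fraction toward one, which is consistent with the stated preference for a barcode population that exceeds the capacity of the solid support.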
[0140] A bead or other nucleic acid-presenting solid support of the instant disclosure can include, or can be made by the methods set forth herein to attach, a plurality of different nucleic acid probes. For example, a bead or other nucleic acid-presenting solid support can include at least 10, 100, 1 x 10³, 1 x 10⁴, 1 x 10⁵, 1 x 10⁶, 1 x 10⁷, 1 x 10⁸, 1 x 10⁹ or more different probes. Alternatively or additionally, a bead or other nucleic acid-presenting solid support can include at most 1 x 10⁹, 1 x 10⁸, 1 x 10⁷, 1 x 10⁶, 1 x 10⁵, 1 x 10⁴, 1 x 10³, 100, or fewer different probes. It will be understood that each of the different probes can be present in several copies, for example, when the probes have been amplified to form a cluster. Thus, the above ranges can describe the number of different nucleic acid clusters on a bead or other nucleic acid-presenting solid support of the instant disclosure. It will also be understood that the above ranges can describe the number of different barcodes, target capture sequences, or other sequence elements set forth herein as being unique (or sufficiently unique) to particular nucleic acid probes. Alternatively or additionally, the ranges can describe the number of extended probes or modified probes created on a bead or other nucleic acid-presenting solid support of the instant disclosure using a method set forth herein.
[0141] Features may be present on a bead or other solid support of the instant disclosure prior to contacting the bead or other solid support with nucleic acid probes. For example, in embodiments where probes are attached to a bead or other solid support via hybridization to primers, the primers can be attached at the features, whereas interstitial areas outside of the features substantially lack any of the primers. Nucleic acid probes can be captured at preformed features on a bead or other solid support, and optionally amplified on the bead or other solid support, e.g., using methods set forth in U.S. Patent Nos. 8,895,249 and 8,778,849 and/or U.S. Patent Application Publication No. 2014/0243224 A1, each of which is incorporated herein by reference. Alternatively, a bead or other solid support may have a lawn of primers or may otherwise lack features. In this case, a feature can be formed by virtue of attachment of a nucleic acid probe on the bead or other solid support. Optionally, the captured nucleic acid probe can be amplified on the bead or other solid support such that the resulting cluster becomes a feature. Although attachment is exemplified above as capture between a primer and a complementary portion of a probe, it will be understood that capture moieties other than primers can be present at pre-formed features or as a lawn. Other exemplary capture moieties include, but are not limited to, chemical moieties capable of reacting with a nucleic acid probe to create a covalent bond or receptors capable of binding non-covalently to a ligand on a nucleic acid probe.
[0142] A step of attaching nucleic acid probes to a bead or other solid support can be carried out by providing a fluid that contains a mixture of different nucleic acid probes and contacting this fluidic mixture with the bead or other solid support. The contact can result in the fluidic mixture being in contact with a surface to which many different nucleic acid probes from the fluidic mixture will attach. Thus, the probes have random access to the surface (whether the surface has preformed features configured to attach the probes or a uniform surface configured for attachment). Accordingly, the probes can be randomly located on the bead or other solid support.
[0143] The total number and variety of different probes that end up attached to a surface can be selected for a particular application or use. For example, in embodiments where a fluidic mixture of different nucleic acid probes is contacted with a bead or other solid support for purposes of attaching the probes to the support, the number of different probe species can exceed the occupancy of the bead or other solid support for probes. Thus, the number and variety of different probes that attach to the bead or other solid support can be equivalent to the probe occupancy of the bead or other solid support.
[0144] Alternatively, the number and variety of different probe species on the bead or other solid support can be less than the occupancy (i.e., there will be redundancy of probe species such that the bead or other solid support may contain multiple features having the same probe species). Such redundancy can be achieved, for example, by contacting the bead or other solid support with a fluidic mixture that contains a number and variety of probe species that is substantially lower than the probe occupancy of the bead or other solid support.
[0145] Attachment of the nucleic acid probes can be mediated by hybridization of the nucleic acid probes to complementary primers that are attached to the bead or other solid support, chemical bond formation between a reactive moiety on the nucleic acid probe and the bead or other solid support (examples are set forth in U.S. Patent Nos. 8,895,249 and 8,778,849, and in U.S. Patent Application Publication No. 2014/0243224 A1, each of which is incorporated herein by reference), affinity interactions of a moiety on the nucleic acid probe with a bead- or other solid support-bound moiety (e.g., between known receptor-ligand pairs such as streptavidin-biotin, antibody-epitope, lectin-carbohydrate and the like), physical interactions of the nucleic acid probes with the bead or other solid support (e.g., hydrogen bonding, ionic forces, van der Waals forces and the like), or other interactions known in the art to attach nucleic acids to surfaces.
[0146] In some embodiments, attachment of a nucleic acid probe is non-specific with regard to any sequence differences between the nucleic acid probe and other nucleic acid probes that are or will be attached to the bead or other solid support. For example, different probes can have a universal sequence that complements surface-attached primers or the different probes can have a common moiety that mediates attachment to the surface. Alternatively, each of the different probes (or a subpopulation of different probes) can have a unique (or sufficiently unique) sequence that complements a unique (or sufficiently unique) primer on the bead or other solid support or they can have a unique (or sufficiently unique) moiety that interacts with one or more different reactive moiety on the bead or other solid support. In such cases, the unique (or sufficiently unique) primers or unique (or sufficiently unique) moieties can, optionally, be attached at predefined locations in order to selectively capture particular probes, or particular types of probes, at the respective predefined locations.
[0147] One or more features on a bead or other solid support can each include a single molecule of a particular probe. The features can be configured, in some embodiments, to accommodate no more than a single nucleic acid probe molecule. However, whether or not the feature can accommodate more than one nucleic acid probe molecule, the feature may nonetheless include no more than a single nucleic acid probe molecule. Alternatively, an individual feature can include a plurality of nucleic acid probe molecules, for example, an ensemble of nucleic acid probe molecules having the same sequence as each other. In particular embodiments, the ensemble can be produced by amplification from a single nucleic acid probe template to produce amplicons, for example, as a cluster attached to the surface.
[0148] A method set forth herein can use any of a variety of amplification techniques. Exemplary techniques that can be used include, but are not limited to, polymerase chain reaction (PCR), rolling circle amplification (RCA), multiple displacement amplification (MDA), or random prime amplification (RPA). In some embodiments the amplification can be carried out in solution, for example, when features of an array are capable of containing amplicons in a volume having a desired capacity. In certain embodiments, an amplification technique used in a method of the present disclosure will be carried out on solid phase. For example, one or more primer species (e.g., universal primers for one or more universal primer binding site present in a nucleic acid probe) can be attached to a bead or other solid support. In PCR embodiments, one or both of the primers used for amplification can be attached to a bead or other solid support (e.g., via a gel). Formats that utilize two species of primers attached to a bead or other solid support are often referred to as bridge amplification because double stranded amplicons form a bridge-like structure between the two surface attached primers that flank the template sequence that has been copied. Exemplary reagents and conditions that can be used for bridge amplification are described, for example, in U.S. Patent Nos. 5,641,658; 7,115,400; and 8,895,249; and/or U.S. Patent Application Publication Nos. 2002/0055100 A1, 2004/0096853 A1, 2004/0002090 A1, 2007/0128624 A1 and 2008/0009420 A1, each of which is incorporated herein by reference. Solid-phase PCR amplification can also be carried out with one of the amplification primers attached to a bead or other solid support and the second primer in solution. An exemplary format that uses a combination of a surface attached primer and soluble primer is the format used in emulsion PCR as described, for example, in Dressman et al., Proc. Natl. Acad. Sci. USA 100:8817-8822 (2003), WO 05/010145, or U.S. Patent Application Publication Nos. 2005/0130173 A1 or 2005/0064460 A1, each of which is incorporated herein by reference. Emulsion PCR is illustrative of the format and it will be understood that for purposes of the methods set forth herein the use of an emulsion is optional and indeed for several embodiments an emulsion is not used.
[0149] RCA techniques can be modified for use in a method of the present disclosure. Exemplary components that can be used in an RCA reaction and principles by which RCA produces amplicons are described, for example, in Lizardi et al., Nat. Genet. 19:225-232 (1998) and U.S. Patent Application Publication No. 2007/0099208 A1, each of which is incorporated herein by reference. Primers used for RCA can be in solution or attached to a bead or other solid support. The primers can be one or more of the universal primers described herein.
[0150] MDA techniques can be modified for use in a method of the present disclosure. Some basic principles and useful conditions for MDA are described, for example, in Dean et al., Proc. Natl. Acad. Sci. USA 99:5261-66 (2002); Lage et al., Genome Research 13:294-307 (2003); Walker et al., Molecular Methods for Virus Detection, Academic Press, Inc., 1995; Walker et al., Nucl. Acids Res. 20:1691-96 (1992); US 5,455,166; US 5,130,238; and US 6,214,587, each of which is incorporated herein by reference. Primers used for MDA can be in solution or attached to a bead or other solid support at an amplification site. Again, the primers can be one or more of the universal primers described herein.
[0151] In particular embodiments, a combination of the above-exemplified amplification techniques can be used. For example, RCA and MDA can be used in a combination wherein RCA is used to generate a concatemeric amplicon in solution (e.g., using solution-phase primers). The amplicon can then be used as a template for MDA using primers that are attached to a bead or other solid support (e.g., universal primers). In this example, amplicons produced after the combined RCA and MDA steps will be attached to the bead or other solid support.
[0152] Nucleic acid probes that are used in a method set forth herein or present in an apparatus or composition of the present disclosure can include barcode sequences, and for embodiments that include a plurality of different nucleic acid probes, each of the probes can include a different barcode sequence from other probes in the plurality. Barcode sequences can be any of a variety of lengths.
[0153] Longer sequences can generally accommodate a larger number and variety of barcodes for a population. Generally, all probes in a plurality will have the same length barcode (albeit with different sequences), but it is also possible to use different length barcodes for different probes. A barcode sequence can be at least 2, 4, 6, 8, 10, 12, 15, 20 or more nucleotides in length. Alternatively or additionally, the length of the barcode sequence can be at most 20, 15, 12, 10, 8, 6, 4 or fewer nucleotides. Examples of barcode sequences that can be used are set forth, for example, in U.S. Patent Application Publication No. 2014/0342921 A1 and U.S. Patent No. 8,460,865, each of which is incorporated herein by reference.
[0154] A method of the present disclosure can include a step of performing a nucleic acid detection reaction to determine barcode sequences of nucleic acid probes that were originally associated with a bead or other solid support array element. In many embodiments, the probes were randomly located on the bead or other solid support and the nucleic acid detection reaction provides information that can be reconstructed as described elsewhere herein, to locate each of the different probes and their original locations. Exemplary nucleic acid detection methods include, but are not limited to nucleic acid sequencing of a probe, hybridization of nucleic acids to a probe, ligation of nucleic acids that are hybridized to a probe, extension of nucleic acids that are hybridized to a probe, extension of a first nucleic acid that is hybridized to a probe followed by ligation of the extended nucleic acid to a second nucleic acid that is hybridized to the probe, or other methods known in the art such as those set forth in U.S. Patent No. 8,288,103 or 8,486,625, each of which is incorporated herein by reference.
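The relationship between barcode length and barcode diversity noted in paragraph [0153] can be made concrete as follows. This is an illustrative sketch only, assuming a four-letter DNA alphabet and that every possible sequence of a given length is usable as a barcode; the function name is hypothetical.

```python
def min_barcode_length(num_barcodes: int, alphabet_size: int = 4) -> int:
    """Smallest length L such that alphabet_size ** L >= num_barcodes.

    Uses integer arithmetic only, so there is no floating-point rounding
    at exact powers of the alphabet size.
    """
    if num_barcodes < 1:
        raise ValueError("num_barcodes must be positive")
    if alphabet_size < 2:
        raise ValueError("alphabet_size must be at least 2")
    length, capacity = 0, 1
    while capacity < num_barcodes:
        capacity *= alphabet_size
        length += 1
    return length


if __name__ == "__main__":
    # Number of nucleotides needed to give one million beads distinct barcodes.
    print(min_barcode_length(1_000_000))
```

In practice a longer barcode than this minimum would typically be chosen, since random assignment of barcodes to beads produces collisions well before the sequence space is exhausted.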
[0155] The compositions and methods of the instant disclosure largely remove the need for sequencing-by-synthesis (SBS) techniques, as the a priori need for imaging of barcode locations is replaced by generating a latent space representation of macromolecules at array locations by performing a dimensionality reduction analysis.
[0156] A method of the present disclosure can include a step of contacting a biological specimen (i.e., a cryosectioned tissue sample) with a bead or other solid support that has nucleic acid probes attached thereto. In some embodiments, the nucleic acid probes are randomly located on the bead or other solid support. As disclosed herein, the identity and location of the nucleic acid probes need not have been decoded prior to contacting the biological specimen with the bead or other solid support.
Bead-Attached Oligonucleotides
[0157] Certain aspects of the instant disclosure employ a nucleotide- or oligonucleotide-adorned bead, where the bead-attached oligonucleotide includes one or more of the following: a linker; an identical sequence for use as a sequencing priming site; a uniform or near-uniform nucleotide or oligonucleotide sequence; a Unique Molecular Identifier that differs for each priming site; an oligonucleotide redundant sequence for capturing polyadenylated mRNAs and priming reverse transcription (i.e., a poly-T sequence); and at least one oligonucleotide barcode that provides a substrate for spatial identification of an individual bead’s position within a bead array. Exemplified bead-attached oligonucleotides of the instant disclosure include an oligonucleotide spatial barcode designed to be unique to each bead within a bead array (or at least wherein the majority of such barcodes are unique to a bead within a bead array - e.g., it is expressly contemplated here and elsewhere herein that a bead array possessing only a small fraction of beads (e.g., even up to 10%, 20%, 30% or 40% or more of total beads) having non-unique spatial barcodes (e.g., attributable to a relative lack of degeneracy within the bead population, e.g., due to a probabilistically determinable lack of sequence degeneracy calculated as possible within the bead population, as then compared to the number of sites across which the bead population is ultimately distributed and/or due to an artifact such as non-randomness of bead association occurring during pool-and-split rounds of oligonucleotide synthesis, etc.) could still yield high resolution transcriptome expression images, even while removing (or otherwise adjusting for) any beads that turn out to be redundant in barcode within the array). This spatial barcode provides a substrate for identification.
Exemplified bead-attached oligonucleotides of the instant disclosure also include a linker (optionally a cleavable linker); a poly-dT sequence (herein, as a 3’ tail); a Unique Molecular Identifier (UMI) which differs for each priming site (as described below and as known in the art, e.g., see WO 2016/040476); a spatial barcode as described above and elsewhere herein; and a common sequence (“PCR handle”) to enable PCR amplification after “single-cell transcriptomes attached to microparticles” (STAMP) formation. As set forth in WO 2016/040476, mRNAs bind to poly-dT-presenting primers on their companion microparticle. At steps where an mRNA sequence is to be identified, the mRNAs are reverse transcribed into cDNAs, generating a set of beads called STAMPs. The barcoded STAMPs can then be amplified in pools for high-throughput mRNA-seq to analyze any desired number of beads (where each bead roughly corresponds to an approximately bead-sized area of cellular transcriptomes derived from the cryosectioned tissue sample; in the instant disclosure, 10 µm beads were used to produce resolutions approximating single cell feature sizes, as exemplified herein).
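The elements of an exemplified bead-attached oligonucleotide can be sketched as a simple data structure. The following is a hypothetical illustration: the 5'-to-3' order of handle, spatial barcode and UMI, the poly-dT length, and all names are assumptions made for the example (the disclosure specifies only that the poly-dT sequence is a 3' tail), while the 12-nucleotide barcode and 8-nucleotide UMI lengths follow the exemplary syntheses described elsewhere herein.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class BeadOligo:
    """One bead-attached oligonucleotide, written 5' to 3' (assumed layout)."""
    pcr_handle: str         # common sequence shared by all oligos
    spatial_barcode: str    # shared by all oligos on one bead
    umi: str                # differs between oligos on the same bead
    poly_t_length: int = 30  # assumed tail length, not taken from the source

    def sequence(self) -> str:
        # Assumed order: handle, spatial barcode, UMI, then the poly-dT 3' tail.
        return (self.pcr_handle + self.spatial_barcode + self.umi
                + "T" * self.poly_t_length)


def split_oligo(seq: str, handle_len: int,
                barcode_len: int = 12, umi_len: int = 8) -> Tuple[str, str, str]:
    """Recover (spatial_barcode, umi, tail) from a sequence with the layout above."""
    barcode_end = handle_len + barcode_len
    umi_end = barcode_end + umi_len
    if len(seq) < umi_end:
        raise ValueError("sequence is shorter than handle + barcode + UMI")
    return seq[handle_len:barcode_end], seq[barcode_end:umi_end], seq[umi_end:]


if __name__ == "__main__":
    oligo = BeadOligo("ACGTACGT", "AAAACCCCGGGG", "TTGCATGC", poly_t_length=5)
    print(oligo.sequence())
    print(split_oligo(oligo.sequence(), handle_len=8))
```

Fixed-position slicing is used because every element in this assumed layout has a known length; a real read structure would need to match the actual oligonucleotide design.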
[0158] It is expressly contemplated that, instead of or in addition to the above-referenced poly-dT-presenting primers, oligonucleotide sequences designed for capture of a broader range of macromolecules as described here and elsewhere herein, can be used. In particular, oligonucleotide-directed capture of other types of macromolecules is also contemplated for the bead-attached oligonucleotides of the instant disclosure; for instance, a gene-specific capture sequence can be incorporated into oligonucleotide sequences (e.g., for purpose of capturing a full range of cell/tissue-associated RNAs including non-poly-A-tailed RNAs, such as tRNAs, miRNAs, etc., or for purpose of specifically capturing DNAs) and/or a loaded transposase can be used to capture, for example, DNA, and/or a specific sequence can be included to allow for specific capture of a DNA-barcoded antibody signal (not only allowing for assessment of protein distribution across a test sample using the compositions and methods of the instant disclosure, but also thereby, e.g., allowing for linkage of the spatial distributions of proteins to RNA expression).
[0159] Exemplary split-and-pool synthesis of the bead barcode: To generate the bead barcode, the pool of microparticles (here, microbeads) is repeatedly split into four equally sized oligonucleotide synthesis reactions, to which one of the four DNA bases is added, and then pooled together after each cycle, in a total of 12 split-pool cycles. The barcode synthesized on any individual bead reflects that bead’s unique (or sufficiently unique) path through the series of synthesis reactions. The result is a pool of microparticles, each possessing one of 4¹² (16,777,216) possible sequences on its entire complement of primers. Extension of the split-pool process can provide for, e.g., production of an even greater number of possible spatial barcode sequences for use in the compositions and methods of the instant disclosure. However, as noted above, functional use of spatial barcodes does not require complete non-redundancy of spatial barcodes among all beads of a bead array. Rather, provided that the majority of such barcodes are unique to a bead within a bead array, it is expressly contemplated that a bead array possessing only a small fraction of beads (e.g., even up to 10%, 20%, 30% or 40% or more of total beads) having non-unique spatial barcodes (e.g., attributable to an artifact such as non-randomness of bead association having occurred during pool-and-split rounds of oligonucleotide synthesis, or simply to the likelihood that an array of a million beads derived from a ten million-fold complex library would still be expected to include a number of beads having redundant spatial barcodes in pairwise comparisons) could still yield high resolution transcriptome expression images, where removal or other adjustment (averaging or other such adjustment) of any beads that turn out to be redundant in barcode within the array could be simply performed, e.g., during in silico spatial location assignment and/or image generation.
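The split-and-pool procedure and the in silico removal of redundant barcodes can be sketched as a small simulation. This is an illustrative model under stated assumptions, not the disclosed synthesis protocol: each bead is assigned to one of the four reactions independently at random in every cycle (rather than splitting the pool into exactly equal parts), and all function names are hypothetical.

```python
import random
from collections import Counter
from typing import List

BASES = "ACGT"


def split_pool_barcodes(num_beads: int, cycles: int = 12, seed: int = 0) -> List[str]:
    """Simulate split-and-pool synthesis; returns one barcode per bead."""
    rng = random.Random(seed)
    barcodes = [""] * num_beads
    for _ in range(cycles):
        # Split: each bead enters one of four reactions and gains that base.
        # Pool: all beads are recombined before the next cycle.
        barcodes = [bc + rng.choice(BASES) for bc in barcodes]
    return barcodes


def drop_redundant(barcodes: List[str]) -> List[str]:
    """Keep only barcodes that occur on exactly one bead in the array."""
    counts = Counter(barcodes)
    return [bc for bc in barcodes if counts[bc] == 1]


if __name__ == "__main__":
    beads = split_pool_barcodes(num_beads=100_000, cycles=12, seed=1)
    unique = drop_redundant(beads)
    print(f"{len(unique)} of {len(beads)} beads carry a barcode found on no other bead")
```

Dropping every bead whose barcode recurs corresponds to the removal option described above; averaging or other adjustment of redundant beads would be an alternative design.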
[0160] Exemplary synthesis of a unique molecular identifier (UMI). Following the completion of the “split-and-pool” synthesis cycles described above for generation of spatial barcodes, all microparticles are together subjected to eight rounds of degenerate synthesis with all four DNA bases available during each cycle, such that each individual primer receives one of 4⁸ (65,536) possible sequences (UMIs). A UMI is thereby provided that allows distinguishing between, e.g., individual bead-attached oligonucleotides upon the same bead that otherwise share a common spatial barcode (being that such oligonucleotides are attached to the same bead and therefore receive the same spatial barcode).
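One common use of UMIs, sketched below, is to count distinct captured molecules per bead by collapsing reads that share a spatial barcode, a target identity and a UMI. This is a hedged illustration rather than the disclosed analysis: the tuple layout of a read, the gene labels, and the function name are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

Read = Tuple[str, str, str]  # (spatial_barcode, umi, gene), an assumed layout


def count_molecules(reads: Iterable[Read]) -> Dict[Tuple[str, str], int]:
    """Count distinct UMIs observed for each (spatial_barcode, gene) pair.

    Reads with the same barcode, gene and UMI are treated as amplification
    copies of one captured molecule and are counted once.
    """
    seen: Dict[Tuple[str, str], Set[str]] = defaultdict(set)
    for barcode, umi, gene in reads:
        seen[(barcode, gene)].add(umi)
    return {key: len(umis) for key, umis in seen.items()}


if __name__ == "__main__":
    reads = [
        ("AAAACCCCGGGG", "ACGTACGT", "geneA"),
        ("AAAACCCCGGGG", "ACGTACGT", "geneA"),  # duplicate of the read above
        ("AAAACCCCGGGG", "TTTTACGT", "geneA"),
        ("CCCCAAAAGGGG", "ACGTACGT", "geneB"),
    ]
    print(count_molecules(reads))
```

The sketch uses exact UMI matching; tolerance for sequencing errors within UMIs is outside its scope.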
[0161] In some embodiments of the instant disclosure, the linker of a bead-attached oligonucleotide is a chemically-cleavable, straight-chain polymer. Optionally, the linker is a photolabile optionally substituted hydrocarbon polymer. In certain embodiments, the linker of a bead-attached oligonucleotide is a non-cleavable, straight-chain polymer. Optionally, the linker is a non-cleavable, optionally substituted hydrocarbon polymer. In certain embodiments, the linker is a polyethylene glycol. In one embodiment, the linker is a PEG-C3 to PEG-24.
[0162] A nucleic acid probe used in a composition or method set forth herein can include a target capture moiety. In particular embodiments, the target capture moiety is a target capture sequence. The target capture sequence is generally complementary to a target sequence such that target capture occurs by formation of a probe-target hybrid complex. A target capture sequence can be any of a variety of lengths including, for example, lengths exemplified above in the context of barcode sequences.
[0163] In certain embodiments, a plurality of different nucleic acid probes can include different target capture sequences that hybridize to different target nucleic acid sequences from a biological specimen. Different target capture sequences can be used to selectively bind to one or more desired target nucleic acids from a biological specimen. In some cases, the different nucleic acid probes can include a target capture sequence that is common to all or a subset of the probes on a solid support. For example, the nucleic acid probes on a solid support can have a poly A or poly T sequence. Such probes or amplicons thereof can hybridize to mRNA molecules, cDNA molecules or amplicons thereof that have poly A or poly T tails. Although the mRNA or cDNA species will have different target sequences, capture will be mediated by the common poly A or poly T sequence regions.
[0164] Any of a variety of target nucleic acids can be captured and analyzed in a method set forth herein including, but not limited to, messenger RNA (mRNA), copy DNA (cDNA), genomic DNA (gDNA), ribosomal RNA (rRNA) or transfer RNA (tRNA). Particular target sequences can be selected from databases and appropriate capture sequences designed using techniques and databases known in the art.
[0165] A method set forth herein can include a step of hybridizing nucleic acid probes, that are on a supported bead array, to target nucleic acids that are from portions of the biological specimen that are proximal to the probes. Generally, a target nucleic acid will flow or diffuse from a region of the biological specimen to an area of the probe-presenting bead array that is in proximity with that region of the specimen. Here, the target nucleic acid will interact with nucleic acid probes that are proximal to the region of the specimen from which the target nucleic acid was released. A target-probe hybrid complex can form where the target nucleic acid encounters a complementary target capture sequence on a nucleic acid probe. The location of the target-probe hybrid complex will generally correlate with the region of the biological specimen from where the target nucleic acid was derived. In certain embodiments, the beads will include a plurality of nucleic acid probes, the biological specimen will release a plurality of target nucleic acids and a plurality of target-probe hybrids will be formed on the beads. The sequences of the target nucleic acids and their locations on the bead array will provide spatial information about the nucleic acid content of the biological specimen. Although the example above is described in the context of target nucleic acids that are released from a biological specimen, it will be understood that the target nucleic acids need not be released. Rather, the target nucleic acids may remain in contact with the biological specimen, for example, when they are attached to an exposed surface of the biological specimen in a way that the target nucleic acids can also bind to appropriate nucleic acid probes on the beads.
[0166] A method of the present disclosure can include a step of extending bead-attached probes to which target nucleic acids are hybridized. In embodiments where the probes include barcode sequences, the resulting extended probes will include the barcode sequences and sequences from the target nucleic acids (albeit in complementary form). The extended probes are thus spatially tagged versions of the target nucleic acids from the biological specimen. The sequences of the extended probes identify what nucleic acids are in the biological specimen and where in the biological specimen the target nucleic acids are located. It will be understood that other sequence elements that are present in the nucleic acid probes can also be included in the extended probes (see, e.g., description as provided elsewhere herein). Such elements include, for example, primer binding sites, cleavage sites, other tag sequences (e.g., sample identification tags), capture sequences, recognition sites for nucleic acid binding proteins or nucleic acid enzymes, or the like.
[0167] Extension of probes can be carried out using methods exemplified herein or otherwise known in the art for amplification of nucleic acids or sequencing of nucleic acids. In particular embodiments, one or more nucleotides can be added to the 3' end of a nucleic acid, for example, via polymerase catalysis (e.g., DNA polymerase, RNA polymerase or reverse transcriptase). Chemical or enzymatic methods can be used to add one or more nucleotide to the 3' or 5' end of a nucleic acid. One or more oligonucleotides can be added to the 3' or 5' end of a nucleic acid, for example, via chemical or enzymatic (e.g., ligase catalysis) methods. A nucleic acid can be extended in a template directed manner, whereby the product of extension is complementary to a template nucleic acid that is hybridized to the nucleic acid that is extended. In some embodiments, a DNA primer is extended by a reverse transcriptase using an RNA template, thereby producing a cDNA. Thus, an extended probe made in a method set forth herein can be a reverse transcribed DNA molecule. Exemplary methods for extending nucleic acids are set forth in US Pat. App. Publ. No. US 2005/0037393 A1 or US Pat. No. 8,288,103 or 8,486,625, each of which is incorporated herein by reference.
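Template-directed extension of a poly-dT probe along a polyadenylated RNA can be illustrated in sequence terms. The sketch below is a simplified model under stated assumptions: the probe is written 5' to 3' and ends in poly-T, the RNA is written 5' to 3' and ends in a poly-A tail, the whole tail is treated as paired with the probe, and the optional `max_nt` limit is a hypothetical parameter standing in for any control of extension length. The function name is likewise hypothetical.

```python
from typing import Optional

# Complement of each RNA template base as it appears in the DNA copy.
RNA_TO_CDNA = str.maketrans("ACGU", "TGCA")


def extend_probe(probe: str, mrna: str, max_nt: Optional[int] = None) -> str:
    """Return the extended probe (5' to 3') after copying the RNA template.

    Simplification: all trailing A residues of `mrna` are treated as the
    poly-A tail, so a transcript body that itself ends in A loses those bases.
    """
    body = mrna.rstrip("A")                    # template upstream of the poly-A tail
    cdna = body[::-1].translate(RNA_TO_CDNA)   # reverse complement, as DNA
    if max_nt is not None:
        cdna = cdna[:max_nt]                   # limit the number of copied nucleotides
    return probe + cdna


if __name__ == "__main__":
    probe = "AAAACCCCGGGG" + "TTGCATGC" + "TTTT"  # barcode + UMI + short poly-T
    print(extend_probe(probe, "AUGCGUAAAA"))
    print(extend_probe(probe, "AUGCGUAAAA", max_nt=3))
```

The extended probe retains the barcode at its 5' end and carries the target sequence in complementary form at its 3' end, which is the property that ties a captured sequence to an array location.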
[0168] All or part of a target nucleic acid that is hybridized to a nucleic acid probe can be copied by extension. For example, an extended probe can include at least 1, 2, 5, 10, 25, 50, 100, 200, 500, 1000 or more nucleotides that are copied from a target nucleic acid. The length of the extension product can be controlled, for example, using reversibly terminated nucleotides in the extension reaction and running a limited number of extension cycles.
[0169] Accordingly, an extended probe produced in a method set forth herein can include no more than 1000, 500, 200, 100, 50, 25, 10, 5, 2 or 1 nucleotides that are copied from a target nucleic acid. Of course, extended probes can be any length within or outside of the ranges set forth above.
[0170] It will be understood that probes used in a method, composition or apparatus set forth herein need not be nucleic acids. Other molecules can be used such as proteins, carbohydrates, small molecules, particles or the like. Probes can be a combination of a nucleic acid component (e.g., having a barcode, primer binding site, cleavage site and/or other sequence element set forth herein) and another moiety (e.g., a moiety that captures or modifies a target nucleic acid).
[0171] A method of the present disclosure can include a step of removing one or more extended probes from a bead. In particular embodiments, the probes will have included a cleavage site such that the product of extending the probes will also include the cleavage site. Alternatively, a cleavage site can be introduced into a probe during a modification step. For example, a cleavage site can be introduced into an extended probe during the extension step.
[0172] Exemplary cleavage sites include, but are not limited to, moieties that are susceptible to a chemical, enzymatic or physical process that results in bond breakage. For example, the location can be a nucleotide sequence that is recognized by an endonuclease. Suitable endonucleases and their recognition sequences are well known in the art and in many cases are even commercially available (e.g., from New England Biolabs, Beverley MA; ThermoFisher, Waltham, MA or Sigma Aldrich, St. Louis MO). A particularly useful endonuclease will break a bond in a nucleic acid strand at a site that is 3'-remote to its binding site in the nucleic acid, examples of which include Type II or Type IIS restriction endonucleases. In some embodiments an endonuclease will cut only one strand in a duplex nucleic acid (e.g., a nicking enzyme). Examples of endonucleases that cleave only one strand include Nt.BstNBI and Nt.AlwI.
[0173] In some embodiments, a cleavage site is an abasic site or a nucleotide that has a base that is susceptible to being removed to create an abasic site. Examples of nucleotides that are susceptible to being removed to form an abasic site include uracil and 8-oxo-guanine. Abasic sites can be created by hydrolysis of nucleotide residues using chemical or enzymatic reagents. Once formed, abasic sites may be cleaved (e.g., by treatment with an endonuclease or other single-stranded cleaving enzyme, exposure to heat or alkali), providing a means for site-specific cleavage of a nucleic acid. An abasic site may be created at a uracil nucleotide on one strand of a nucleic acid. The enzyme uracil DNA glycosylase (UDG) may be used to remove the uracil base, generating an abasic site on the strand. The nucleic acid strand that has the abasic site may then be cleaved at the abasic site by treatment with endonuclease (e.g., EndoIV endonuclease, AP lyase, FPG glycosylase/AP lyase, EndoVIII glycosylase/AP lyase), heat or alkali. In a particular embodiment, the USER™ reagent available from New England Biolabs is used for the creation of a single nucleotide gap at a uracil base in a nucleic acid.
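As a rough illustration of the uracil-based cleavage described above, the following sketch treats a strand as a string in which `U` marks a deoxyuridine and models the single-nucleotide gap by splitting the strand at each `U`. The function name `user_cleave` and the example sequence are hypothetical, and the enzymology is reduced to string handling.

```python
def user_cleave(strand):
    """Split a strand (5'->3') at each deoxyuridine, dropping the U itself.

    Models UDG removing the uracil base followed by cleavage at the resulting
    abasic site, leaving a one-nucleotide gap between the fragments.
    """
    return [fragment for fragment in strand.split("U") if fragment]


# Toy extended probe with one cleavable uracil between a bead-proximal
# linker and the portion to be released.
print(user_cleave("ACGTUGGCA"))   # ['ACGT', 'GGCA']
```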
[0174] Abasic sites may also be generated at non-natural/modified deoxyribonucleotides other than uracil and cleaved in an analogous manner by treatment with endonuclease, heat or alkali. For example, 8-oxo-guanine can be converted to an abasic site by exposure to FPG glycosylase. Deoxyinosine can be converted to an abasic site by exposure to AlkA glycosylase. The abasic sites thus generated may then be cleaved, typically by treatment with a suitable endonuclease (e.g., EndoIV or AP lyase).
[0175] Other examples of cleavage sites and methods that can be used to cleave nucleic acids are set forth, for example, in US Pat. No. 7,960,120, which is incorporated herein by reference.
[0176] Modified nucleic acid probes (e.g., extended nucleic acid probes) that are released from a solid support can be pooled to form a fluidic mixture. The mixture can include, for example, at least 10, 100, 1 x 10^3, 1 x 10^4, 1 x 10^5, 1 x 10^6, 1 x 10^7, 1 x 10^8, 1 x 10^9 or more different modified probes. Alternatively or additionally, a fluidic mixture can include at most 1 x 10^9, 1 x 10^8, 1 x 10^7, 1 x 10^6, 1 x 10^5, 1 x 10^4, 1 x 10^3, 100, 10 or fewer different modified probes. The fluidic mixture can be manipulated to allow detection of the modified nucleic acid probes. For example, the modified nucleic acid probes can be separated spatially on a second solid support (i.e., different from the bead array and/or adhered solid support from which the nucleic acid probes were released after having been contacted with a biological specimen and modified), or the probes can be separated temporally in a fluid stream.
[0177] Modified nucleic acid probes (e.g., extended nucleic acid probes) can be separated on a bead or other solid support in a capture or detection method commonly employed for microarray-based techniques or nucleic acid sequencing techniques such as those set forth previously and/or otherwise described herein. For example, modified probes can be attached to a microarray by hybridization to complementary nucleic acids. The modified probes can be attached to beads or to a flow cell surface and optionally amplified as is carried out in many nucleic acid sequencing platforms. Modified probes can be separated in a fluid stream using a microfluidic device, droplet manipulation device, or flow cytometer. Typically, detection is carried out on these separation devices, but detection is not necessary in all embodiments.
[0178] The number of bead-attached oligonucleotides present upon an individual bead can vary across a wide range, e.g., from tens to thousands, or millions, or more. Due to the transcriptome profiling nature of the instant disclosure, it is generally preferred to pack as many capture oligonucleotides as spatially and sterically (as well as economically) possible onto an individual bead (i.e., thousands, tens of thousands, or more, of oligonucleotides per individual bead), provided that mRNA capture from a contacted tissue is optimized. It is contemplated that optimization of the oligonucleotide-per-bead metric can be readily performed by one of ordinary skill in the art.
[0179] It is further expressly contemplated that in addition to the above-described sequence features, oligonucleotides of the instant disclosure can possess any number of other art-recognized features while remaining within the scope of the instant disclosure.
Capture Material
[0180] In certain aspects of the instant disclosure, a capture material is employed to associate a bead array with a solid support (e.g., a glass slide). In some embodiments, the capture material is a liquid electrical tape. An exemplary liquid electrical tape of the instant disclosure is Permatex™ liquid electrical tape, which is a weatherproof protectant for wiring and electrical connections. Liquid capture material such as liquid tape can be applied as a liquid, which then dries to a vinyl polymer that resists dirt, dust, chemicals, and moisture, ensuring that applied beads are attached to a capture material-coated slide in a dry condition. Without wishing to be bound by theory, it is believed that one advantage of the instant methods is that the oligonucleotide-coated beads used in certain embodiments of the invention, which are attached to a solid support (e.g., a slide surface via use, e.g., of electrical tape as a capture material) are maintained in a dry state that optimizes transfer of RNA (or other macromolecule) from a cryosection of a tissue to a bead-coated surface (again without wishing to be bound by theory, such transfer is currently believed to occur via capillary action at the scale of the microbead-cryosection interface surface). 
It is believed that this highly efficient and direct transfer of cellular RNAs (i.e., the transcriptome of cells found within cryosectioned tissues) or other macromolecules to microbeads (where each microbead respectively possesses thousands of oligonucleotides capable of capturing oligoribonucleotides, e.g., transcripts) arrayed upon a solid support - where the transfer occurs upon an otherwise dry surface, therefore limiting and/or eliminating diffusive properties - is what imparts the instant methods and compositions with extremely high resolution (i.e., resolution at 10-50 µm spacing across a two-dimensional image of a section) of assessment of the cellular transcriptomes (or other macromolecules) of assayed tissue sections.
[0181] It is contemplated that beads of the instant disclosure can be applied to a capture material-coated solid support, either immediately upon deposit of capture material to the solid support, or following an initial drying period for the capture material. Capture materials of the instant disclosure can be applied by any of a number of methods, including brushed onto the solid support, sprayed onto the solid support, or the like, or via submersion of the solid support in the capture material. For certain forms of liquid capture material, use of a brush top applicator can allow coverage without gaps and can enable access to tight spaces, which offers advantages in certain embodiments over forms of capture material (i.e., tape) that are applied in a non-liquid state.
[0182] While liquid electrical tape has been exemplified as a capture material for use in the methods and compositions of the instant disclosure, other capture materials are also contemplated for such use, including any art-recognized glue or other reagent that is (a) spreadable and/or depositable upon a solid surface (e.g., upon a slide, optionally a slide that allows for light transmission through the slide, e.g., a microscope slide) and (b) capable of binding or otherwise capturing a population of beads of 1-100 µm size. Exemplary other capture materials that are expressly contemplated include latex such as cis-1,4-polyisoprene and other rubbers, as well as elastomers (which are generally defined as polymers that possess viscoelasticity (i.e., both viscosity and elasticity), very weak inter-molecular forces, and generally low Young's modulus and high failure strain compared with other materials), including artificial elastomers (e.g., neoprene) and/or silicone elastomers. Acrylate polymers (e.g., scotch tape) are also expressly contemplated, e.g., for use as a capture material of the instant disclosure.
Tissue Samples and Cryosectioning
[0183] In some embodiments, a tissue section is employed. The tissue can be derived from a multicellular organism. Exemplary multicellular organisms include, but are not limited to a mammal, plant, algae, nematode, insect, fish, reptile, amphibian, fungi or Plasmodium falciparum. Exemplary species are set forth previously herein or known in the art. The tissue can be freshly excised from an organism, or it may have been previously preserved for example by freezing, embedding in a material such as paraffin (e.g., formalin fixed paraffin embedded samples), formalin fixation, infiltration, dehydration or the like. Optionally, a tissue section can be cryosectioned, using techniques and compositions as described herein and as known in the art. As a further option, a tissue can be permeabilized and the cells of the tissue lysed. Any of a variety of art-recognized lysis treatments can be used. Target nucleic acids that are released from a tissue that is permeabilized can be captured by nucleic acid probes, as described herein and as known in the art.
[0184] A tissue can be prepared in any convenient or desired way for its use in a method, composition or apparatus herein. Fresh, frozen, fixed or unfixed tissues can be used. A tissue can be fixed or embedded using methods described herein or known in the art.
[0185] A tissue sample for use herein, can be fixed by deep freezing at temperature suitable to maintain or preserve the integrity of the tissue structure, e.g., less than -20° C. In another example, a tissue can be prepared using formalin-fixation and paraffin embedding (FFPE) methods which are known in the art. Other fixatives and/or embedding materials can be used as desired. A fixed or embedded tissue sample can be sectioned, i.e., thinly sliced, using known methods. For example, a tissue sample can be sectioned using a chilled microtome or cryostat, set at a temperature suitable to maintain both the structural integrity of the tissue sample and the chemical properties of the nucleic acids in the sample. Exemplary additional fixatives that are expressly contemplated include alcohol fixation (e.g., methanol fixation, ethanol fixation), glutaraldehyde fixation and paraformaldehyde fixation.
[0186] In some embodiments, a tissue sample will be treated to remove embedding material (e.g., to remove paraffin or formalin) from the sample prior to release, capture or modification of nucleic acids. This can be achieved by contacting the sample with an appropriate solvent (e.g., xylene and ethanol washes). Treatment can occur prior to contacting the tissue sample with a solid support-captured bead array as set forth herein or the treatment can occur while the tissue sample is on the solid support-captured bead array.
[0187] Exemplary methods for manipulating tissues for use with solid supports to which nucleic acids are attached are set forth in US Pat. App. Publ. No. 2014/0066318 A1, which is incorporated herein by reference.
[0188] The thickness of a tissue sample or other biological specimen that is contacted with a bead array in a method, composition or apparatus set forth herein can be any suitable thickness desired. In representative embodiments, the thickness will be at least 0.1 µm, 0.25 µm, 0.5 µm, 0.75 µm, 1 µm, 5 µm, 10 µm, 50 µm, 100 µm or thicker. Alternatively or additionally, the thickness of a tissue sample that is contacted with a bead array will be no more than 100 µm, 50 µm, 10 µm, 5 µm, 1 µm, 0.5 µm, 0.25 µm, 0.1 µm or thinner.
[0189] A particularly relevant source for a tissue sample is a human being. The sample can be derived from an organ, including for example, an organ of the central nervous system such as brain, brainstem, cerebellum, spinal cord, cranial nerve, or spinal nerve; an organ of the musculoskeletal system such as muscle, bone, tendon or ligament; an organ of the digestive system such as salivary gland, pharynx, esophagus, stomach, small intestine, large intestine, liver, gallbladder or pancreas; an organ of the respiratory system such as larynx, trachea, bronchi, lungs or diaphragm; an organ of the urinary system such as kidney, ureter, bladder or urethra; a reproductive organ such as ovary, fallopian tube, uterus, vagina, placenta, testicle, epididymis, vas deferens, seminal vesicle, prostate, penis or scrotum; an organ of the endocrine system such as pituitary gland, pineal gland, thyroid gland, parathyroid gland, or adrenal gland; an organ of the circulatory system such as heart, artery, vein or capillary; an organ of the lymphatic system such as lymphatic vessel, lymph node, bone marrow, thymus or spleen; a sensory organ such as eye, ear, nose, or tongue; or an organ of the integument such as skin, subcutaneous tissue or mammary gland. In some embodiments, a tissue sample is obtained from a bodily fluid or excreta such as blood, lymph, tears, sweat, saliva, semen, vaginal secretion, ear wax, fecal matter or urine.
[0190] A sample from a human can be considered (or suspected) healthy or diseased when used. In some cases, two samples can be used: a first being considered diseased and a second being considered as healthy (e.g., for use as a healthy control). Any of a variety of conditions can be evaluated, including but not limited to, an autoimmune disease, cancer, cystic fibrosis, aneuploidy, pathogenic infection, psychological condition, hepatitis, diabetes, sexually transmitted disease, heart disease, stroke, cardiovascular disease, multiple sclerosis or muscular dystrophy. Certain contemplated conditions include genetic conditions or conditions associated with pathogens having identifiable genetic signatures.
Macromolecules
[0191] In addition to the poly-A-tailed RNAs captured by poly-dT sequences in certain exemplified embodiments of the instant disclosure, it is expressly contemplated that the instant compositions and methods can be applied to obtain spatially-resolvable abundance data for a wide range of macromolecules, including not only poly-A-tailed RNAs/transcripts, but also, e.g., non-poly-A-tailed RNAs (e.g., tRNAs, miRNAs, etc.; optionally specifically captured using sequence-specific oligonucleotide sequences), DNAs (including, e.g., capture via gene-specific oligonucleotides, loaded transposases, etc.), and proteins (including, e.g., DNA-barcoded antibodies, optionally where a DNA barcode effectively tags a capture antibody for detection, allowing for direct comparison of spatial distribution(s) of antibodies and/or antibody-captured proteins with spatially-resolvable expression profiling that also can be performed upon the test sample via use of the compositions and methods of the instant disclosure). Accordingly, the range of macromolecules expressly contemplated for capture using the compositions and methods of the instant disclosure includes all forms of RNA (including, e.g., transcripts, tRNAs, rRNAs, miRNAs, etc.), DNAs (including, e.g., genomic DNAs, barcode DNAs, etc.) and proteins (including, e.g., antibodies that are tagged for binding and detection and/or other forms of protein, optionally including proteins captured by antibodies). In one embodiment, proteins can be profiled using a library of DNA-barcoded antibodies to stain a tissue, before capturing proteins on the spatial array (refer to Cellular Indexing of Transcriptomes and Epitopes by sequencing (CITE-seq), which combines unbiased genome-wide expression profiling with the measurement of specific protein markers in thousands of single cells using droplet microfluidics.
In brief, monoclonal antibodies are conjugated to oligonucleotides containing unique antibody identifier sequences; a cell suspension is then labeled with the oligo-tagged antibodies and single cells are subsequently encapsulated into nanoliter-sized aqueous droplets in a microfluidic apparatus. In each droplet, antibody and cDNA molecules are indexed with the same unique (or sufficiently unique) barcode and are converted into libraries that are amplified independently and mixed in appropriate proportions for sequencing in the same lane. Stoeckius and Smibert. Protocol Exchange (2017) doi: 10.1038/protex.2017.068). Additionally, proteins may be adsorbed onto the beads nonspecifically, or through chemical capture (such as amine reactive chemistry or crosslinkers), the beads may be sorted into wells and the proteins quantitated by standard measures (antibodies, ELISA, etc.), and then followed by sequencing of the paired bead sequences and the spatial locations reconstructed.
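One way to picture how DNA-barcoded antibodies yield spatially resolvable protein abundance is to count antibody-derived tags per bead barcode. The sketch below is an assumption-laden illustration: the read tuple layout `(bead_barcode, umi, antibody_tag)`, the tag-to-protein table, and the function name `count_antibody_tags` are all hypothetical and do not describe the CITE-seq software or the method of the disclosure.

```python
from collections import defaultdict


def count_antibody_tags(reads, tag_to_protein):
    """Count unique molecules per (bead barcode, protein).

    reads          -- iterable of (bead_barcode, umi, antibody_tag) tuples
    tag_to_protein -- hypothetical lookup from antibody tag to protein name
    Duplicate (bead, umi, tag) triples are treated as amplification copies
    and counted once; unknown tags are ignored.
    """
    seen = set()
    counts = defaultdict(lambda: defaultdict(int))
    for bead, umi, tag in reads:
        protein = tag_to_protein.get(tag)
        if protein is None or (bead, umi, tag) in seen:
            continue
        seen.add((bead, umi, tag))
        counts[bead][protein] += 1
    return {bead: dict(per_protein) for bead, per_protein in counts.items()}


# Toy data: tags and barcodes are made up for illustration.
tags = {"AAGG": "CD3", "CCTT": "CD19"}
reads = [
    ("BEAD1", "UMI1", "AAGG"),
    ("BEAD1", "UMI1", "AAGG"),   # PCR duplicate, counted once
    ("BEAD1", "UMI2", "AAGG"),
    ("BEAD2", "UMI1", "CCTT"),
    ("BEAD2", "UMI3", "GGGG"),   # unknown tag, ignored
]
print(count_antibody_tags(reads, tags))
```

Once each bead barcode is assigned a position, a table of this kind gives protein counts per location.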
Application of Wash Solution to Bead Array (Optional)
[0192] In certain embodiments, a solid support-captured bead array is washed after exposure of the bead array to a cryosectioned tissue (optionally, the cryosectioned tissue is removed prior to or during application of a wash solution). For example, a solid support-captured bead array of the instant disclosure can be submerged in a buffered salt solution (or other stabilizing solution) after contacting the bead array with a cryosectioned tissue sample. Exemplified buffered salt solutions include saline-sodium citrate (SSC), for example at a NaCl concentration of about 0.2 M to 5 M NaCl, optionally at about 0.5 to 3 M NaCl, optionally at about 1 M NaCl. Without wishing to be bound by theory, as exemplified, exposure of a transcriptome-bound bead array to a saline solution (or other stabilizing solution) is believed to stabilize bead-attached capture probe-sample RNA (i.e., transcript) interactions, likely by blocking RNA degradation and/or other degradative processes. While SSC has been exemplified in the processes of the instant disclosure, use of other types of buffered solutions is expressly contemplated, including, e.g., PBS, Tris buffered saline and/or Tris buffer, as well as, more broadly, any aqueous buffer possessing a pH between 4 and 10 and salt between 0-1 osmolarity.
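For bench planning, the NaCl concentrations above can be converted into an SSC strength. The sketch below assumes the conventional laboratory recipe in which 20x SSC contains 3.0 M NaCl and 0.3 M sodium citrate; that recipe, the function names, and the example volume are assumptions and are not stated in the disclosure.

```python
# Assumption (common lab recipe, not from the disclosure):
# 20x SSC = 3.0 M NaCl + 0.3 M sodium citrate, so 1x SSC = 0.15 M NaCl.
NACL_M_PER_1X_SSC = 3.0 / 20


def ssc_fold_for_nacl(nacl_molar):
    """Return the SSC strength (in 'x') that gives the requested NaCl molarity."""
    return nacl_molar / NACL_M_PER_1X_SSC


def stock_volume_ml(nacl_molar, final_volume_ml, stock_fold=20.0):
    """Volume of SSC stock to dilute to final_volume_ml for a target NaCl molarity."""
    fold = ssc_fold_for_nacl(nacl_molar)
    if fold > stock_fold + 1e-9:
        raise ValueError("target NaCl exceeds what this stock can provide")
    return final_volume_ml * fold / stock_fold


print(round(ssc_fold_for_nacl(1.0), 2))       # about 6.67x SSC for 1 M NaCl
print(round(stock_volume_ml(1.0, 50.0), 2))   # mL of 20x stock per 50 mL wash
```

Under this recipe, the upper end of the stated range (5 M NaCl) is above a 20x stock, so such a wash would have to be prepared by adding NaCl directly.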
[0193] Wash solutions can contain various additives, such as surfactants (e.g., detergents), enzymes (e.g., proteases and collagenases), cleavage reagents, or the like, to facilitate removal of the specimen. In some embodiments, the solid support is treated with a solution comprising a proteinase enzyme. Alternatively or additionally, the solution can include cellulase, hemicellulase or chitinase enzymes (e.g., if desiring to remove a tissue sample from a plant or fungal source). In some cases, the temperature of a wash solution will be at least 30°C, 35°C, 50°C, 60°C or 90°C. Conditions can be selected for removal of a biological specimen while not denaturing hybrid complexes formed between target nucleic acids and solid support-attached nucleic acid probes.
Sequencing Methods
[0194] Some of the methods and compositions provided herein employ methods of sequencing nucleic acids. A number of DNA sequencing techniques are known in the art, including fluorescence-based sequencing methodologies (See, e.g., Birren et al., Genome Analysis: Analyzing DNA, 1, Cold Spring Harbor, N.Y., which is incorporated herein by reference in its entirety). In some embodiments, automated sequencing techniques understood in the art are utilized. In some embodiments, parallel sequencing of partitioned amplicons can be utilized (PCT Publication No. WO2006084132, which is incorporated herein by reference in its entirety). In some embodiments, DNA sequencing is achieved by parallel oligonucleotide extension (See, e.g., U.S. Pat. No. 5,750,341; U.S. Pat. No. 6,306,597, which are incorporated herein by reference in their entireties). Additional examples of sequencing techniques include the Church polony technology (Mitra et al, 2003, Analytical Biochemistry 320, 55-65; Shendure et al, 2005 Science 309, 1728-1732; U.S. Pat. No. 6,432,360, U.S. Pat. No. 6,485,944, U.S. Pat. No. 6,511,803, which are incorporated by reference), the 454 picotiter pyrosequencing technology (Margulies et al, 2005 Nature 437, 376-380; US 20050130173, which are incorporated herein by reference in their entireties), the Solexa single base addition technology (Bennett et al, 2005, Pharmacogenomics, 6, 373-382; U.S. Pat. No. 6,787,308; U.S. Pat. No. 6,833,246, which are incorporated herein by reference in their entireties), the Lynx massively parallel signature sequencing technology (Brenner et al. (2000). Nat. Biotechnol. 18:630-634; U.S. Pat. No. 5,695,934; U.S. Pat. No. 5,714,330, which are incorporated herein by reference in their entireties), and the Adessi PCR colony technology (Adessi et al. (2000). Nucleic Acid Res. 28, E87; WO 00018957, which are incorporated herein by reference in their entireties).
[0195] Next-generation sequencing (NGS) methods can be employed in certain aspects of the instant disclosure to obtain a high volume of sequence information (such as are particularly required to perform deep sequencing of bead-associated RNAs following capture of RNAs from cryosections) in a highly efficient and cost-effective manner. NGS methods share the common feature of massively parallel, high-throughput strategies, with the goal of lower costs in comparison to older sequencing methods (see, e.g., Voelkerding et al, Clinical Chem., 55: 641-658, 2009; MacLean et al, Nature Rev. Microbiol, 7: 287-296; which are incorporated herein by reference in their entireties). NGS methods can be broadly divided into those that typically use template amplification and those that do not. Amplification-utilizing methods include pyrosequencing commercialized by Roche as the 454 technology platforms (e.g., GS 20 and GS FLX), the Solexa platform commercialized by Illumina, and the Supported Oligonucleotide Ligation and Detection (SOLiD™) platform commercialized by Applied Biosystems. Non-amplification approaches, also known as single-molecule sequencing, are exemplified by the HeliScope platform commercialized by Helicos Biosciences, SMRT sequencing commercialized by Pacific Biosciences, and emerging platforms marketed by VisiGen and Oxford Nanopore Technologies Ltd.
[0196] In pyrosequencing (U.S. Pat. No. 6,210,891; U.S. Pat. No. 6,258,568, which are incorporated herein by reference in their entireties), template DNA is fragmented, end-repaired, ligated to adaptors, and clonally amplified in-situ by capturing single template molecules with beads bearing oligonucleotides complementary to the adaptors. Each bead bearing a single template type is compartmentalized into a water-in-oil microvesicle, and the template is clonally amplified using a technique referred to as emulsion PCR. The emulsion is disrupted after amplification and beads are deposited into individual wells of a picotitre plate functioning as a flow cell during the sequencing reactions. Ordered, iterative introduction of each of the four dNTP reagents occurs in the flow cell in the presence of sequencing enzymes and luminescent reporter such as luciferase. In the event that an appropriate dNTP is added to the 3' end of the sequencing primer, the resulting production of ATP causes a burst of luminescence within the well, which is recorded using a CCD camera. It is possible to achieve read lengths greater than or equal to 400 bases, and 10^6 sequence reads can be achieved, resulting in up to 500 million base pairs (Mb) of sequence.
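The ordered, iterative nucleotide flows described above can be pictured as a flowgram, in which each flow reports how many bases were incorporated. The following is a simplified, noise-free sketch; the function name `simulate_flowgram`, the flow order `TACG`, and the toy sequence are assumptions chosen for illustration.

```python
def simulate_flowgram(synthesized, flow_order="TACG"):
    """Return (nucleotide, incorporations) for each flow, in order.

    synthesized -- the strand built by the polymerase, written 5'->3'
    flow_order  -- hypothetical cyclic order in which dNTPs are introduced
    A flow that matches a homopolymer run incorporates the whole run, so the
    idealized light signal is proportional to the run length.
    """
    unknown = set(synthesized) - set(flow_order)
    if unknown:
        raise ValueError(f"bases not covered by flow order: {sorted(unknown)}")
    flows, pos, flow_index = [], 0, 0
    while pos < len(synthesized):
        base = flow_order[flow_index % len(flow_order)]
        run = 0
        while pos < len(synthesized) and synthesized[pos] == base:
            run += 1
            pos += 1
        flows.append((base, run))
        flow_index += 1
    return flows


flows = simulate_flowgram("TTAGC")
print(flows)
# Reading the flowgram back recovers the sequence.
print("".join(base * count for base, count in flows))

# Throughput arithmetic from the figures in the text: 10^6 reads of 400 bases.
print(10**6 * 400 / 1e6, "Mb")
```

With the figures given in the text, 10^6 reads of 400 bases correspond to 400 Mb, which is consistent with the stated ceiling of 500 Mb when reads exceed 400 bases.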
[0197] In the Solexa/Illumina platform (Voelkerding et al, Clinical Chem., 55: 641-658, 2009; MacLean et al, Nature Rev. Microbiol, 7:287-296; U.S. Pat. No. 6,833,246; U.S. Pat. No. 7,115,400; U.S. Pat. No. 6,969,488, which are incorporated herein by reference in their entireties), sequencing data are produced in the form of shorter-length reads. In this method, single-stranded fragmented DNA is end-repaired to generate 5'-phosphorylated blunt ends, followed by Klenow-mediated addition of a single A base to the 3' end of the fragments. A-addition facilitates addition of T-overhang adaptor oligonucleotides, which are subsequently used to capture the template-adaptor molecules on the surface of a flow cell that is studded with oligonucleotide anchors. The anchor is used as a PCR primer, but because of the length of the template and its proximity to other nearby anchor oligonucleotides, extension by PCR results in the "arching over" of the molecule to hybridize with an adjacent anchor oligonucleotide to form a bridge structure on the surface of the flow cell. These loops of DNA are denatured and cleaved. Forward strands are then sequenced with reversible dye terminators. The sequence of incorporated nucleotides is determined by detection of post-incorporation fluorescence, with each fluorophore and block removed prior to the next cycle of dNTP addition. Sequence read length ranges from 36 nucleotides to over 50 nucleotides, with overall output exceeding 1 billion nucleotide pairs per analytical run.
[0198] Sequencing nucleic acid molecules using SOLiD technology (Voelkerding et al, Clinical Chem., 55: 641-658, 2009; U.S. Patent No. 5,912,148; and U.S. Patent No. 6,130,073, which are incorporated herein by reference in their entireties) can initially involve fragmentation of the template, ligation to oligonucleotide adaptors, attachment to beads, and clonal amplification by emulsion PCR. Following this, beads bearing the template are immobilized on a derivatized surface of a glass flow-cell, and a primer complementary to the adaptor oligonucleotide is annealed. However, rather than utilizing this primer for 3' extension, it is instead used to provide a 5' phosphate group for ligation to interrogation probes containing two probe-specific bases followed by 6 degenerate bases and one of four fluorescent labels. In the SOLiD system, interrogation probes have 16 possible combinations of the two bases at the 3' end of each probe, and one of four fluors at the 5' end. Fluor color, and thus identity of each probe, corresponds to specified color-space coding schemes. Multiple rounds (usually 7) of probe annealing, ligation, and fluor detection are followed by denaturation, and then a second round of sequencing using a primer that is offset by one base relative to the initial primer. In this manner, the template sequence can be computationally re-constructed, and template bases are interrogated twice, resulting in increased accuracy. Sequence read length averages 35 nucleotides, and overall output exceeds 4 billion bases per sequencing run.
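The mapping of 16 dinucleotides onto four colors, and the computational reconstruction it permits, can be sketched as follows. The sketch uses the commonly described two-base color code in which each base is given a 2-bit value and the color is the XOR of adjacent values; the text above does not spell out the coding scheme, so this choice, together with the function names, should be read as an assumption.

```python
# Assumed 2-bit base values; the color of a dinucleotide is the XOR of the two.
BASE_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
CODE_BASE = "ACGT"


def encode_color_space(sequence):
    """Return (first_base, colors), one color per overlapping dinucleotide."""
    colors = [BASE_CODE[a] ^ BASE_CODE[b] for a, b in zip(sequence, sequence[1:])]
    return sequence[0], colors


def decode_color_space(first_base, colors):
    """Rebuild the base sequence from a known first base and the colors."""
    bases = [first_base]
    for color in colors:
        bases.append(CODE_BASE[BASE_CODE[bases[-1]] ^ color])
    return "".join(bases)


first, colors = encode_color_space("ATGGC")
print(first, colors)                      # A [3, 1, 0, 3]
print(decode_color_space(first, colors))  # ATGGC
```

Because every base contributes to two adjacent colors, a single miscalled color changes all downstream decoded bases, whereas a true variant changes two adjacent colors in a consistent way; this is one way to see why interrogating each base twice helps accuracy.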
[0199] In certain embodiments, nanopore sequencing is employed (see, e.g., Astier et al, J. Am. Chem. Soc. 2006 Feb 8; 128(5): 1705-10, which is incorporated by reference). The theory behind nanopore sequencing has to do with what occurs when a nanopore is immersed in a conducting fluid and a potential (voltage) is applied across it. Under these conditions a slight electric current due to conduction of ions through the nanopore can be observed, and the amount of current is exceedingly sensitive to the size of the nanopore. As each base of a nucleic acid passes through the nanopore (or as individual nucleotides pass through the nanopore in the case of exonuclease-based techniques), this causes a change in the magnitude of the current through the nanopore that is distinct for each of the four bases, thereby allowing the sequence of the DNA molecule to be determined.
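The idea that each base produces a distinct current level can be illustrated with a toy base caller that assigns every measurement to the nearest reference level. The reference currents below are invented placeholder values, not measured data, and the one-level-per-base model is a deliberate simplification; the function name `call_bases` is likewise hypothetical.

```python
# Hypothetical reference current levels in picoamperes (placeholders only).
REFERENCE_LEVELS_PA = {"A": 50.0, "C": 40.0, "G": 60.0, "T": 30.0}


def call_bases(currents_pa, levels=REFERENCE_LEVELS_PA):
    """Assign each current measurement to the base with the nearest level."""
    return "".join(
        min(levels, key=lambda base: abs(levels[base] - current))
        for current in currents_pa
    )


# Toy trace: one noisy measurement per translocated nucleotide.
print(call_bases([49.2, 31.0, 58.7, 41.1]))   # ATGC
```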
[0200] The Ion Torrent technology is a method of DNA sequencing based on the detection of hydrogen ions that are released during the polymerization of DNA (see, e.g., Science 327(5970): 1190 (2010); U.S. Pat. Appl. Pub. Nos. 20090026082, 20090127589, 20100301398, 20100197507, 20100188073, and 20100137143, which are incorporated herein by reference in their entireties). A microwell contains a template DNA strand to be sequenced. Beneath the layer of microwells is a hypersensitive ISFET ion sensor. All layers are contained within a CMOS semiconductor chip, similar to that used in the electronics industry. When a dNTP is incorporated into the growing complementary strand a hydrogen ion is released, which triggers a hypersensitive ion sensor. If homopolymer repeats are present in the template sequence, multiple dNTP molecules will be incorporated in a single cycle. This leads to a corresponding number of released hydrogens and a proportionally higher electronic signal. This technology differs from other sequencing technologies in that no modified nucleotides or optics are used. The per base accuracy of the Ion Torrent sequencer is approximately 99.6% for 50 base reads, with approximately 100 Mb generated per run. The read-length is 100 base pairs. The accuracy for homopolymer repeats of 5 repeats in length is approximately 98%. The benefits of ion semiconductor sequencing are rapid sequencing speed and low upfront and operating costs.
Latent Space Representations of Macromolecules
[0201] Certain aspects of the instant disclosure feature use of dimensionality reduction analyses, applied to an assortment of (i) macromolecule abundance data; and (ii) associated array element identification information (e.g., barcodes that identify position within an array), also employing diffusion data (e.g., diffusion patterns), to generate latent space representations of bound macromolecules of a biological sample. Exemplified dimensionality reduction analyses include the supervised dimensionality reduction analyses Uniform Manifold Approximation and Projection (UMAP) reduction, t-distributed stochastic neighbor embedding (t-SNE) reduction, multidimensional scaling (MDS) reduction, and variational autoencoders, though other dimensionality reduction analyses known in the art may be employed (additionally or alternatively). In certain embodiments, dimensionality reduction analysis involves performing non-linear cell trajectory reconstruction on latent space to construct an inferred maximum likelihood progression trajectory between a first phenotypic state and a second phenotypic state. In some embodiments, performing non-linear cell trajectory reconstruction involves applying a reverse graph embedding algorithm to the latent space.
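As a concrete illustration of one of the named reductions, the sketch below implements classical (metric) multidimensional scaling: a matrix of pairwise dissimilarities is double-centered and its leading eigenvectors give low-dimensional coordinates. It is a minimal sketch under stated assumptions. The input here is a toy distance matrix standing in for dissimilarities that would, in practice, be derived from abundance and diffusion data; power iteration with deflation is used in place of a library eigensolver so the example stays self-contained; the function name `classical_mds` and all values are hypothetical.

```python
import math


def classical_mds(dist, dims=2, iters=500):
    """Embed n items in `dims` dimensions from a symmetric distance matrix."""
    n = len(dist)
    sq = [[d * d for d in row] for row in dist]
    row_mean = [sum(row) / n for row in sq]
    grand_mean = sum(row_mean) / n
    # Double centering (symmetry assumed, so column means equal row means):
    # B = -1/2 * (D^2 - row mean - column mean + grand mean)
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean)
          for j in range(n)] for i in range(n)]
    coords = [[0.0] * dims for _ in range(n)]
    for k in range(dims):
        v = [math.sin(i + 1 + k) for i in range(n)]   # deterministic start
        for _ in range(iters):                         # power iteration
            w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        bv = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        vv = sum(x * x for x in v)
        eigenvalue = sum(x * y for x, y in zip(v, bv)) / vv if vv else 0.0
        scale = math.sqrt(max(eigenvalue, 0.0))
        for i in range(n):
            coords[i][k] = v[i] * scale
        # Deflate so the next pass finds the next-largest eigenpair.
        for i in range(n):
            for j in range(n):
                b[i][j] -= eigenvalue * v[i] * v[j] / vv if vv else 0.0
    return coords


# Toy check: four points at the corners of a 3 x 4 rectangle.
points = [(0, 0), (3, 0), (0, 4), (3, 4)]
dist = [[math.dist(p, q) for q in points] for p in points]
embedding = classical_mds(dist)
print([[round(c, 3) for c in xy] for xy in embedding])
```

The recovered coordinates match the original layout only up to rotation, reflection and translation, which is why comparisons are made on pairwise distances and not on raw coordinates.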
[0202] The present disclosure also provides computer systems that are programmed to implement methods of the disclosure. In certain embodiments, a computer system is programmed or otherwise configured to, for example: generate a latent space representation of spatial representation of macromolecule abundance. The computer system can regulate various aspects of methods and systems of the present disclosure, such as, for example, generating a latent space representation of macromolecule abundance. Methods and systems of the present disclosure can be implemented by way of one or more algorithms. An algorithm can be implemented by way of software upon execution by a central processing unit. The algorithm can, for example, generate a latent space representation of macromolecule abundance.
Porous Matrices
[0203] Certain aspects of the disclosure contemplate use of monomer or polymer components in proportions capable of forming a matrix yet retaining porosity sufficient to allow for efficient enzymatic activity to occur upon matrix-attached nucleic acid primers or probes in situ. In general, such matrix components include cross-linking agents at very low concentrations as compared to other monomers or linear polymers, relative to commonly used amounts of cross-linking agents in polymeric matrices (e.g., bis-acrylamide and acrylamide, respectively, in acrylamide gel matrix formation), and are described in detail, e.g., in WO 2022/174054.
Kits
[0204] The instant disclosure also provides kits containing agents of this disclosure for use in the methods of the present disclosure. Kits of the instant disclosure may include one or more containers comprising an agent (e.g., a capture material, such as liquid electrical tape) and/or composition (e.g., a slide-captured bead array) of this disclosure. In some embodiments, the kits further include instructions for use in accordance with the methods of this disclosure. In some embodiments, these instructions comprise a description of administration of the agent to diagnose, e.g., a disease and/or malignancy. In some embodiments, the instructions comprise a description of how to create a tissue cryosection, form a spatially-defined (or simply spatially definable, pending performance of a step that defines the spatial resolution of the bead array) bead array, contact a tissue cryosection with a spatially-defined bead array and/or obtain captured, tissue cryosection-derived transcript sequence from the spatially-defined bead array. The kit may further comprise a description of selecting an individual suitable for treatment based on identifying whether that subject has a certain pattern of expression of one or more transcripts in a cryosection sample.
[0205] The instructions generally include information as to dosage, dosing schedule, and route of administration for the intended use/treatment. Instructions supplied in the kits of the instant disclosure are typically written instructions on a label or package insert (e.g., a paper sheet included in the kit), but machine-readable instructions (e.g., instructions carried on a magnetic or optical storage disk) are also acceptable.
[0206] The label or package insert indicates that the composition is used for staging a cryosection and/or diagnosing a specific expression pattern in a cryosection. Instructions may be provided for practicing any of the methods described herein.
[0207] The kits of this disclosure are in suitable packaging. Suitable packaging includes, but is not limited to, vials, bottles, jars, flexible packaging (e.g., sealed Mylar or plastic bags), and the like. The container may further comprise a pharmaceutically active agent.
[0208] Kits may optionally provide additional components such as buffers and interpretive information. Normally, the kit comprises a container and a label or package insert(s) on or associated with the container.
[0209] The practice of the present disclosure employs, unless otherwise indicated, conventional techniques of chemistry, molecular biology, microbiology, recombinant DNA, genetics, immunology, cell biology, cell culture and transgenic biology, which are within the skill of the art. See, e.g., Maniatis et al., 1982, Molecular Cloning (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.); Sambrook et al., 1989, Molecular Cloning, 2nd Ed. (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.); Sambrook and Russell, 2001, Molecular Cloning, 3rd Ed. (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.); Ausubel et al., 1992, Current Protocols in Molecular Biology (John Wiley & Sons, including periodic updates); Glover, 1985, DNA Cloning (IRL Press, Oxford); Anand, 1992; Guthrie and Fink, 1991; Harlow and Lane, 1988, Antibodies, (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.); Jakoby and Pastan, 1979; Nucleic Acid Hybridization (B. D. Hames & S. J. Higgins eds. 1984); Transcription And Translation (B. D. Hames & S. J. Higgins eds. 1984); Culture Of Animal Cells (R. I. Freshney, Alan R. Liss, Inc., 1987); Immobilized Cells And Enzymes (IRL Press, 1986); B. Perbal, A Practical Guide To Molecular Cloning (1984); the treatise, Methods In Enzymology (Academic Press, Inc., N.Y.); Gene Transfer Vectors For Mammalian Cells (J. H. Miller and M. P. Calos eds., 1987, Cold Spring Harbor Laboratory); Methods In Enzymology, Vols. 154 and 155 (Wu et al. eds.), Immunochemical Methods In Cell And Molecular Biology (Mayer and Walker, eds., Academic Press, London, 1987); Handbook Of Experimental Immunology, Volumes I-IV (D. M. Weir and C. C. Blackwell, eds., 1986); Roitt, Essential Immunology, 6th Edition, Blackwell Scientific Publications, Oxford, 1988; Hogan et al., Manipulating the Mouse Embryo, (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1986); Westerfield, M., The zebrafish book. A guide for the laboratory use of zebrafish (Danio rerio), (4th Ed., Univ. of Oregon Press, Eugene, 2000). Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure belongs. Although methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present disclosure, suitable methods and materials are described below. All publications, patent applications, patents, and other references mentioned herein are incorporated by reference in their entirety. In case of conflict, the present specification, including definitions, will control. In addition, the materials, methods, and examples are illustrative only and not intended to be limiting.
[0210] Reference will now be made in detail to exemplary embodiments of the disclosure. While the disclosure will be described in conjunction with the exemplary embodiments, it will be understood that it is not intended to limit the disclosure to those embodiments. To the contrary, it is intended to cover alternatives, modifications, and equivalents as may be included within the spirit and scope of the disclosure as defined by the appended claims. Standard techniques well known in the art or the techniques specifically described below were utilized.
EXAMPLES
Example 1: Materials and Methods
Sample information and processing
[0211] Mouse brain samples were obtained following guidelines in accordance with the U.S. National Institutes of Health Guide for the Care and Use of Laboratory Animals under protocol number 0120-09-16 and approved by the Broad Institutional Animal Care and Use Committee. Wild-type C57BL/6 mice, maintained on a 12-hour light/dark cycle, were anesthetized by administration of isoflurane in a gas chamber flowing 3% isoflurane for 1 minute. Blood was cleared from the brain using transcardial perfusion with a chilled pH 7.4 HEPES buffer (110 mM NaCl, 10 mM HEPES, 25 mM glucose, 75 mM sucrose, 7.5 mM MgCl2, 2.5 mM KCl). Brain was removed, frozen in liquid nitrogen vapor for 3 minutes and stored at -80 °C.
[0212] The C57BL/6 mouse embryo at P1 (MF-104-P1-BL) was purchased from Zyagen™. The sample was stored at -80 °C before use.
Histological processing
[0213] A 12 µm thick section from the frozen P1 mouse sample was mounted onto a glass slide. The Leica™ ST5010 Autostainer XL (Leica Biosystems™) was used with hematoxylin and eosin (H&E) staining. Sections were immersed in xylene, sequentially processed through 100% and 95% ethanol series, and then stained with hematoxylin. Eosin staining was applied and the section was again processed through 100% and 95% ethanol series, xylene, dehydrated, and covered using the Leica™ CB6030 Fully Automated Glass Coverslipper. The slide was imaged with Leica Aperio VERSA Brightfield, Fluorescence & FISH Digital Pathology Scanner.
Barcoded beads and array production
[0214] Bead barcodes were synthesized as in Slide-seqV2 (ref. 17). Sequences of beads used in reconstruction for "Slide-seq" and "Slide-tags":
(1) Capture beads: 5'-TTTTCTACACGACGCTCTTCCGATCTJJJJJJJJTCTTCAGCGTTCCCGAGAJJJJJJJNNNNNNNVVT30-3' (SEQ ID NO: 1);
(2) Fiducial beads: 5'-TTT-pc-GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTJJJJJJJJTCTTCAGCGTTCCCGAGAJJJJJJJNNNNNNNVVA30-3' (SEQ ID NO: 2), where 'pc' in the sequence denotes a photocleavable linker.
(3) Sequence of capture beads used in reconstruction of P1 mouse sample: 5'-PEG-pc-TTTCTACACGACGCTCTTCCGATCTJJJJJJJJTCTTCAGCGTTCCCGAGAJJJJJJJNNNNNNNVVT30-3' (SEQ ID NO: 3), where 'PEG' in the sequence denotes a polyethylene glycol (PEG) linker.
[0215] In "Slide-seq" experiments, capture beads and fiducial beads were mixed at a ratio of 3:1. In "Slide-tags" experiments, capture beads and fiducial beads were mixed at a ratio of 1:3. After mixing of the beads, array preparation and in situ sequencing were performed as described previously. Beads were pelleted and resuspended in water with 10% DMSO at a concentration between 20,000 and 50,000 beads per microliter. Then 10 µl bead solution was pipetted into each 3 mm well on the gasket. For a 1.2 cm array, 200 µl bead solution was used. The coverslip gasket filled with beads was centrifuged at 850 g for at least 30 min at 40 °C until the surface was dry, for the beads to stick to glass coverslips sprayed with Gorilla Glue® and Plastic Dip®. Excess beads were washed off to form a monolayer bead array. Pucks were sequenced using a sequencing-by-ligation approach with a monobase-encoding strategy.
Reconstruction with Slide-seq procedure
[0216] For reconstruction with Slide-seq, the protocol of Slide-seq V2 was performed until the step of reverse transcription (RT). First, 10 µm frozen tissue sections were put on arrays and immersed in 200 µl hybridization buffer (6× SSC with 2 U µl⁻¹ Lucigen NxGen® RNase inhibitor) for 30 min at room temperature to allow for binding of RNA to the oligonucleotides on the beads. Then, arrays were incubated in RT solution (115 µl water, 40 µl Maxima 5× RT buffer (Thermo Fisher®, EP0751), 20 µl of 10 mM dNTPs (NEB®, N0477L), 5 µl RNase inhibitor (Lucigen®, 30281), 10 µl of 50 µM template switch oligonucleotide (Qiagen®, 339414YCO0076714): AAGCAGTGGTATCAACGCAGAGTGAATrG+GrG and 10 µl Maxima™ H Minus reverse transcriptase (Thermo Fisher®, EP0751)) for 1.5 h at 52 °C. After RT finished, the bead array was put on a glass slide in 10 µL diffusion buffer (2× SSC, 20% formamide). The bead array was then placed under an ultraviolet (365 nm) light source (0.42 mW mm⁻², Thorlabs, M365LP1-C5, Thorlabs, LEDD1B) for 2 min (Slide-seq for mouse hippocampus section) or 5 s (Slide-seq for P1 mouse sample, as the capture beads used were also photocleavable so were cleaved for less time). Then the bead array was incubated at room temperature for 10 min for cleaved oligonucleotides to diffuse. After dipping the bead array into 1 mL diffusion buffer to wash out free oligonucleotides, the bead array was put into a 1.5 mL centrifuge tube of 200 µL extension buffer (1× NEBuffer 2, 1 mM dNTP, 25 units Klenow exo- (NEB M0212L)). The bead array was incubated at 37 °C for 1 h. Then Slide-seq V2 was continued from adding the tissue clearing buffer to cDNA PCR on the dissociated beads. A total of 200 µl tissue digestion buffer (200 mM Tris-Cl pH 8, 400 mM NaCl, 4% SDS, 10 mM EDTA and 32 U ml⁻¹ proteinase K (NEB®, P8107S)) was added directly to the Klenow extension solution, and the mixture was incubated at 37 °C for 30 min. Beads were pipetted up and down to detach from the surface. Then, 200 µl wash buffer (10 mM Tris pH 8.0, 1 mM EDTA and 0.01% Tween-20) was added to the 400 µl tissue clearing and RT solution mix, and the tube was centrifuged for 3 min at 3,000 g. The supernatant was then removed from the bead pellet, and the beads were resuspended in 200 µl wash buffer and centrifuged again. This was repeated a total of three times.
[0217] The supernatant was then removed from the bead pellet, and the bead pellet was resuspended in 200 µl of 0.1 N NaOH and incubated for 5 min at room temperature. Then beads were resuspended in 200 µl of H2O. Second-strand synthesis was then performed on the beads by incubating the pellet in 200 µl of second-strand synthesis mix (133 µl water, 40 µl Maxima™ 5× RT buffer, 20 µl of 10 mM dNTPs, 2 µl of 1 mM dN-SMRT oligonucleotide and 5 µl Klenow enzyme (NEB®, M0210)) at 37 °C for 1 h. After second-strand synthesis, 200 µl wash buffer was added, and the beads were centrifuged for 3 min at 3,000 g. The supernatant was then removed from the bead pellet, and the beads were resuspended in 200 µl wash buffer and centrifuged again. This was repeated a total of three times. Water (200 µl) was added to the bead pellet, and the beads were moved into a 200-µl PCR strip tube, pelleted in a minifuge and resuspended in 200 µl water. The beads were then pelleted and resuspended in library PCR mix (22 µl water, 25 µl of Terra Direct PCR mix buffer (Takara® Biosciences, 639270), 1 µl Terra polymerase (Takara® Biosciences, 639270), 1 µl of 100 µM TruSeq® PCR handle primer (IDT): CTACACGACGCTCTTCCGATCT (SEQ ID NO: 6) and 1 µl of 100 µM SMART® PCR primer (IDT): AAGCAGTGGTATCAACGCAGAGT (SEQ ID NO: 7)), and PCR was performed according to the following program: 95 °C for 3 min; four cycles of 98 °C for 20 s, 65 °C for 45 s and 72 °C for 3 min; nine cycles of 98 °C for 20 s, 67 °C for 20 s and 72 °C for 3 min; 72 °C for 5 min; hold at 4 °C. After cDNA PCR, the beads were spun down for 2 min at 3,000 RCF. Then supernatant was collected for cDNA purification and beads were resuspended in PCR mix again for the reconstruction library. The PCR mix included: 1× KAPA (Roche KK2612), 100 nM P5-TruSeq® Read 1 primer 5'-AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT-3' (SEQ ID NO: 4), 100 nM P7-TruSeq® Read 2 primer 5'-CAAGCAGAAGACGGCATACGAGANNNNNNNNGGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT-3' (SEQ ID NO: 5).
[0218] The reconstruction library PCR followed the program: 3 min at 95 °C, 13 cycles of (20 s at 98 °C, 15 s at 65 °C, 15 s at 72 °C), and 1 min at 72 °C. After the PCR program, supernatant was collected after centrifugation and purified with 1.5× SPRI cleanup (Beckman Coulter®, A63881), generating a reconstruction library ready for sequencing. Meanwhile, the mRNA library was prepared as described previously. Sequences used are shown in FIGs. 24A and 24B.
Reconstruction with Slide-tags procedure
[0219] For reconstruction with Slide-tags, the diffusion and extension steps were performed before doing Slide-tags. Similar to the process in reconstruction with Slide-seq, the bead array was immersed in diffusion buffer and exposed to an ultraviolet (365 nm) light source for 5 s, incubated at room temperature for 10 min, dipped in 1 mL diffusion buffer, put in 200 µL extension buffer as above and incubated at 37 °C for 1 h. After extension, the bead array was dried, and used for a Slide-tags protocol until nuclei extraction. Fresh frozen tissues were cryosectioned to 20 µm and then placed onto the array. The array was placed onto the glass slide in 6-10 µl dissociation buffer (82 mM Na2SO4, 30 mM K2SO4, 10 mM glucose, 10 mM HEPES, 5 mM MgCl2) and exposed to an ultraviolet (365 nm) light source (0.42 mW mm⁻², Thorlabs, M365LP1-C5, Thorlabs, LEDD1B) for 30 s. After photocleavage, the puck was incubated for 7.5 min for the oligos to tag nuclei. Nuclei were extracted and loaded into the 10x Genomics® Chromium® controller using the Chromium® Next GEM Single Cell 3' Kit v3.1 (10x Genomics, PN-1000268). Afterwards, extracted nuclei followed the Slide-tags protocol (ref. 19) for gene expression while beads were dissociated, washed as in the Slide-tags protocol (ref. 19) with 200 µL wash buffer (10 mM Tris pH 8.0, 1 mM EDTA, and 0.01% Tween-20) and resuspended in 200 µL PCR mix. PCR mix and program are both the same as in reconstruction with Slide-seq. The PCR mix included 1× KAPA (Roche® KK2612), 100 nM P5-TruSeq® Read 1 primer (SEQ ID NO: 4) and 100 nM P7-TruSeq® Read 2 primer (SEQ ID NO: 5).
[0220] The reconstruction library PCR followed the program: 3 min at 95 °C, 13 cycles of (20 s at 98 °C, 15 s at 65 °C, 15 s at 72 °C), and 1 min at 72 °C. After the PCR program, supernatant was collected after centrifugation and purified with 1.5× SPRI cleanup (Beckman Coulter®, A63881), generating a reconstruction library ready for sequencing. Sequences used are shown in FIGs. 24A and 24B.
Sequencing
[0221] The gene expression libraries and reconstruction libraries were sequenced on the Illumina® NextSeq 1000 instrument using a P2 100 cycle kit (Illumina®, 20046811). For the P1 mouse sample, the gene expression library was sequenced on an Illumina® NovaSeq instrument using the S Prime platform.
Simulating diffusion matrix
[0222] Fiducial and capture beads were simulated to be located uniformly in a circle, with each bead's color determined by its location in the image of pattern 'H' or in a two-dimensional color gradient. The diffusion of a fixed number of barcodes, described by unique molecular identifiers (UMIs), from each fiducial bead was assumed to follow a Gaussian distribution. $Y_{ij}$, the number of fiducial bead barcode $j$ captured by capture bead $i$, follows a binomial distribution with the probability determined by the distance between beads:
$$Y_{ij} \sim \mathrm{Binomial}(\mathrm{UMI},\, p_{ij})$$
$$p_{ij} = C \exp\left(-\frac{d_{ij}^{2}}{2\sigma^{2}}\right)$$
where $d_{ij}$ is the Euclidean distance between the fiducial and capture beads, $\sigma$ is the standard deviation in the Gaussian distribution, and $C$ is for normalization. The diffusion-based pairwise count matrix was then generated for reconstruction.
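The simulation described above can be sketched in a few lines of NumPy. This is an illustrative sketch only and not the code used in the disclosure: the function name `simulate_diffusion_matrix`, the disc-sampling helper, and in particular the choice of the normalization C (here chosen so that each fiducial bead's capture probabilities sum to 1 over all capture beads) are assumptions made for the example.

```python
import numpy as np

def simulate_diffusion_matrix(n_capture=1000, n_fiducial=1000,
                              diameter_um=3000.0, sigma_um=300.0,
                              umi=300, seed=0):
    """Simulate a capture-by-fiducial count matrix from Gaussian diffusion."""
    rng = np.random.default_rng(seed)

    def sample_disc(n):
        # Uniform sampling inside a disc: radius scales with sqrt(u).
        r = (diameter_um / 2.0) * np.sqrt(rng.random(n))
        theta = 2.0 * np.pi * rng.random(n)
        return np.column_stack((r * np.cos(theta), r * np.sin(theta)))

    capture_xy = sample_disc(n_capture)
    fiducial_xy = sample_disc(n_fiducial)

    # Squared Euclidean distance d_ij^2 between capture bead i and fiducial bead j.
    diff = capture_xy[:, None, :] - fiducial_xy[None, :, :]
    d2 = np.sum(diff ** 2, axis=-1)

    # Gaussian kernel exp(-d_ij^2 / (2 sigma^2)).
    weights = np.exp(-d2 / (2.0 * sigma_um ** 2))

    # ASSUMPTION: C normalizes each fiducial bead's probabilities (each column)
    # to sum to 1 over capture beads; the source only says C "is for normalization".
    p = weights / weights.sum(axis=0, keepdims=True)

    # Y_ij ~ Binomial(UMI, p_ij), drawn independently for every bead pair.
    counts = rng.binomial(umi, p)
    return capture_xy, fiducial_xy, counts
```

With the default arguments the sketch mirrors the simulated setting of 1000 capture beads, 1000 fiducial beads, a 3000 µm circle, a 300 µm diffusion distance and UMI of 300; the returned `counts` array plays the role of the diffusion matrix.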
[0223] For the simulated diffusion matrix, 1000 capture beads and 1000 fiducial beads were located uniformly within a circle of diameter 3000 µm. Diffusion distance σ was 300 µm and UMI was 300 when simulating the diffusion matrix.
Sequencing data processing and diffusion matrix generation
[0224] In the processing of FASTQ files from the reconstruction library sequencing, reads were initially filtered out if their constant sequences had Hamming distances greater than 3 when compared to the universal primer sequence. From the remaining reads, capture bead barcode, capture bead UMI, fiducial bead barcode, and fiducial bead UMI were extracted. To determine the read threshold for reliable bead barcodes, rank plots were generated for both capture bead barcodes and fiducial bead barcodes. Barcodes above the read threshold were collapsed with a Hamming distance of 1, resulting in a whitelist of barcodes. Barcodes with reads below the threshold were matched to the whitelist with a Hamming distance of 1. Paired reads with both capture bead barcode and fiducial bead barcode in the whitelist were compiled along with the UMI information.
[0225] Data frames of capture bead barcode, fiducial bead barcode, and UMI were grouped by the combinations of capture bead barcode and fiducial bead barcode. Within these groups, the unique UMIs were counted, with these counts serving as elements within diffusion matrices. Diffusion matrices in sparse matrix data structure were generated by taking capture bead barcodes as rows and fiducial bead barcodes as columns (or fiducial bead barcodes as rows and capture bead barcodes as columns for reconstruction with Slide-tags). The element values in these matrices corresponded to the count of unique UMIs associated with each barcode pair.
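The grouping and unique-UMI counting step can be illustrated with a short, standard-library-only sketch. The function name `build_diffusion_counts` and the input layout (an iterable of whitelisted `(capture_barcode, fiducial_barcode, umi)` tuples) are hypothetical and chosen only for this example; the output is a set of coordinate-format triplets rather than any particular sparse matrix class.

```python
from collections import defaultdict

def build_diffusion_counts(reads):
    """Count unique UMIs per (capture barcode, fiducial barcode) pair.

    `reads` is an iterable of (capture_barcode, fiducial_barcode, umi) tuples
    that are assumed to have already passed whitelist matching.
    """
    # Collect the distinct UMIs observed for every barcode pair.
    umis_per_pair = defaultdict(set)
    for capture_bc, fiducial_bc, umi in reads:
        umis_per_pair[(capture_bc, fiducial_bc)].add(umi)

    # Assign row indices to capture barcodes and column indices to fiducial barcodes.
    capture_index = {bc: i for i, bc in
                     enumerate(sorted({c for c, _ in umis_per_pair}))}
    fiducial_index = {bc: j for j, bc in
                      enumerate(sorted({f for _, f in umis_per_pair}))}

    # Emit coordinate-format triplets: row, column, unique-UMI count.
    rows, cols, vals = [], [], []
    for (capture_bc, fiducial_bc), umis in sorted(umis_per_pair.items()):
        rows.append(capture_index[capture_bc])
        cols.append(fiducial_index[fiducial_bc])
        vals.append(len(umis))
    return capture_index, fiducial_index, (rows, cols, vals)
```

The triplets can be handed to a sparse constructor such as `scipy.sparse.coo_matrix((vals, (rows, cols)), shape=(len(capture_index), len(fiducial_index)))`; swapping the roles of rows and columns gives the fiducial-by-capture orientation described for reconstruction with Slide-tags.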
Computational reconstruction with UMAP
[0226] Diffusion matrices were used as input for Uniform Manifold Approximation and Projection (UMAP) to reduce to a two-dimensional space. Coordinates in the two-dimensional space were directly used as reconstructed locations. UMAP parameters that were tuned were: larger n_neighbors and larger min_dist for uniform distribution of beads, larger n_epochs for convergence, and the cosine metric, which can better represent the high-dimensional distance from the diffusion matrix. With experimental data, it was also found that a log1p transformation of the diffusion matrix improved reconstruction accuracy.
[0227] For the simulated diffusion matrix, UMAP was used to embed the diffusion matrix directly in a two-dimensional space with the following parameters: cosine metric, n_neighbors=25, min_dist=0.99, n_epochs=10000, learning_rate=1.
[0228] For reconstruction with Slide-seq or Slide-tags, UMAP was used to embed the log1p transformed diffusion matrix in a two-dimensional space with the following parameters: cosine metric, n_neighbors=25, min_dist=0.99, n_epochs=50000, learning_rate=1. The UMAP computation was expedited through the use of 24 parallel threads.
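A minimal sketch of this embedding step is given below, assuming the open-source umap-learn package and a dense count array; neither assumption is stated in the source, which elsewhere mentions the cuML implementation for parameter testing. The names `UMAP_PARAMS`, `log1p_transform` and `reconstruct_positions` are illustrative, and thread-count configuration is omitted.

```python
import numpy as np

# Parameter values quoted above for reconstruction with Slide-seq or Slide-tags.
UMAP_PARAMS = dict(
    n_components=2,
    metric="cosine",
    n_neighbors=25,
    min_dist=0.99,
    n_epochs=50000,
    learning_rate=1.0,
)

def log1p_transform(diffusion_counts):
    """Apply log(1 + x) element-wise to a dense diffusion count matrix."""
    return np.log1p(np.asarray(diffusion_counts, dtype=float))

def reconstruct_positions(diffusion_counts, **overrides):
    """Embed rows of the diffusion matrix in 2D; rows are the beads to place.

    ASSUMPTION: umap-learn is installed. It is imported lazily so that the
    transformation helper above can be used without it.
    """
    import umap  # provided by the umap-learn package
    params = {**UMAP_PARAMS, **overrides}
    reducer = umap.UMAP(**params)
    return reducer.fit_transform(log1p_transform(diffusion_counts))
```

For the simulated matrix, the description above corresponds to skipping the log1p step and passing `n_epochs=10000`; the sketch does not attempt to reproduce that variant.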
[0229] For reconstruction with the P1 mouse sample, UMAP was used to embed the log1p transformed diffusion matrix in a two-dimensional space with the following parameters: cosine metric, n_neighbors=45, min_dist=0.4, n_epochs=10000, learning_rate=1. The UMAP computation was expedited through the use of 24 parallel threads. The parameters for this 1.2 cm sample were changed because the diffusion distance (σ) was the same while the whole size increased, which means the relative connectivity decreased and the minimum distance between beads decreased.
Comparing reconstruction with other dimensionality reduction methods in simulation
[0230] The same simulated diffusion matrix was used for testing and comparing other dimensionality reduction methods.
[0231] PCA was performed with n_components=2 and other default parameters; MDS was performed with n_components=2 and other default parameters; Isomap was performed with n_components=2, n_neighbors=100, max_iter=100000; t-SNE was performed with n_components=2, random_state=0, perplexity=50.
Evaluation of reconstruction results
[0232] To estimate the absolute errors of bead locations, reconstructed locations were registered to ground truth (from simulation or in situ sequencing) using Procrustes analysis, which applies only rigid transformation (scaling, reflection, rotation, translation) on the reconstructed locations. The reconstruction error of each bead was calculated directly by comparing the locations in ground truth and in transformed reconstruction results. The absolute errors were presented through spatial distributions, displacement vectors, and distribution histograms. In the displacement vectors, only vectors with a length of less than 282.8 µm were included. In distribution histograms, errors were displayed within the indicated ranges.
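The registration and per-bead error calculation can be sketched as follows. This is a generic least-squares Procrustes alignment written with NumPy, offered as an illustration rather than as the implementation used in the disclosure; the function name `procrustes_align` is hypothetical. It fits translation, uniform scaling, and an orthogonal matrix (rotation, possibly with reflection) mapping the reconstruction onto the ground truth.

```python
import numpy as np

def procrustes_align(reconstructed, truth):
    """Align reconstructed 2D positions to ground truth and return per-bead errors."""
    X = np.asarray(reconstructed, dtype=float)
    Y = np.asarray(truth, dtype=float)

    # Remove translation by centering both point sets.
    mu_x, mu_y = X.mean(axis=0), Y.mean(axis=0)
    X0, Y0 = X - mu_x, Y - mu_y

    # Optimal orthogonal matrix (rotation or reflection) from the SVD
    # of the cross-covariance between the two centered point sets.
    U, S, Vt = np.linalg.svd(X0.T @ Y0)
    R = U @ Vt

    # Optimal uniform scale for the chosen orthogonal matrix.
    scale = S.sum() / np.sum(X0 ** 2)

    # Map the reconstruction into ground-truth coordinates.
    aligned = scale * (X0 @ R) + mu_y

    # Absolute error of each bead: distance between aligned and true position.
    errors = np.linalg.norm(aligned - Y, axis=1)
    return aligned, errors
```

Because the scale is fitted against ground-truth coordinates, the returned errors are in the units of the ground truth (micrometers when the true positions are given in micrometers).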
[0233] To estimate the measurement length errors, pairwise distances between capture beads were calculated in both reconstruction and ground truth. The differences between the distances in reconstruction and ground truth were calculated as the measurement length errors. The measurement lengths were categorized into bins with a width of 30 µm, and the errors within each bin were averaged using the root mean square (RMS) method. The standard deviation for the measurement length errors within each bin was also determined. To obtain relative errors, the RMS errors were divided by the corresponding measurement lengths.
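The binned measurement length error can be sketched as below. Two details are assumptions made for the example and are not specified in the source: pairs are binned by their ground-truth distance, and the "corresponding measurement length" used for the relative error is taken to be the bin center. The function name `measurement_length_errors` is hypothetical, and the all-pairs computation is only practical for modest numbers of beads.

```python
import numpy as np

def measurement_length_errors(reconstructed, truth, bin_width=30.0):
    """Return bin centers, RMS error, error std and relative error per length bin."""
    X = np.asarray(reconstructed, dtype=float)
    Y = np.asarray(truth, dtype=float)

    # All unordered bead pairs (i < j).
    i, j = np.triu_indices(len(Y), k=1)
    d_true = np.linalg.norm(Y[i] - Y[j], axis=1)
    d_rec = np.linalg.norm(X[i] - X[j], axis=1)
    err = d_rec - d_true

    # ASSUMPTION: bins are defined on the ground-truth distance.
    bins = (d_true // bin_width).astype(int)
    n_bins = bins.max() + 1
    count = np.bincount(bins, minlength=n_bins)
    sum_err = np.bincount(bins, weights=err, minlength=n_bins)
    sum_sq = np.bincount(bins, weights=err ** 2, minlength=n_bins)

    # Empty bins yield NaN rather than raising a warning.
    with np.errstate(invalid="ignore", divide="ignore"):
        mean = sum_err / count
        rms = np.sqrt(sum_sq / count)
        std = np.sqrt(np.maximum(sum_sq / count - mean ** 2, 0.0))

    # ASSUMPTION: the bin center stands in for the measurement length.
    centers = (np.arange(n_bins) + 0.5) * bin_width
    relative = rms / centers
    return centers, rms, std, relative
```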
[0234] For quality control of UMAP reconstruction without ground truth, the uniformity and circularity of the UMAP output were checked and compared to those of a uniformly distributed array with the same number of beads.
Effect of UMAP parameters
[0235] The effect of UMAP parameters and normalization on reconstruction results was tested using the data from reconstruction with Slide-seq. The effects of UMAP n_epochs, metric ('euclidean' or 'cosine'), and log1p transformation were tested using UMAP in the cuML package with GPU (v.24.02.00), with other parameters kept as: n_neighbors=25 and min_dist=0.99. Results were visualized with colors determined by ground truth locations and absolute errors were calculated based on ground truth. The effects of UMAP n_neighbors and min_dist were tested using UMAP in the cuML package with GPU, with other variables as: metric='cosine' and n_epochs=50000. Results were visualized with colors determined by ground truth locations and absolute errors were calculated based on ground truth.
Analysis of diffusion in ground truth
[0236] The diffusion distribution of a capture bead barcode was represented by the position of its associated fiducial bead barcodes, which were color-coded according to the conjugation UMI count. The diffusion distributions of capture beads with high total UMI counts (first 3000 beads ranked by total UMI counts) along the X axis were fitted using Kernel Density Estimation (KDE) and then averaged to derive the ensemble KDE diffusion distribution. The empirical Full Width at Half Maximum (FWHM) of the diffusion distribution was calculated based on the ensemble KDE.
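A minimal sketch of the FWHM calculation on a kernel density estimate is shown below. The Gaussian kernel, the fixed bandwidth argument, and the linear interpolation of the half-maximum crossings are choices made for this example; the source does not state which KDE settings were used. The function names `gaussian_kde_1d` and `fwhm` are hypothetical, and `fwhm` assumes a single-peaked curve.

```python
import numpy as np

def gaussian_kde_1d(samples, grid, bandwidth):
    """Evaluate a fixed-bandwidth Gaussian KDE of `samples` on `grid`."""
    samples = np.asarray(samples, dtype=float)
    grid = np.asarray(grid, dtype=float)
    z = (grid[:, None] - samples[None, :]) / bandwidth
    kernel = np.exp(-0.5 * z ** 2) / (bandwidth * np.sqrt(2.0 * np.pi))
    return kernel.mean(axis=1)

def fwhm(x, y):
    """Full width at half maximum of a single-peaked sampled curve y(x)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    half = y.max() / 2.0
    above = np.where(y >= half)[0]
    left, right = above[0], above[-1]

    def crossing(x0, y0, x1, y1):
        # Linear interpolation of the point where the curve equals `half`.
        return x0 + (half - y0) * (x1 - x0) / (y1 - y0)

    x_left = x[left] if left == 0 else crossing(
        x[left - 1], y[left - 1], x[left], y[left])
    x_right = x[right] if right == len(x) - 1 else crossing(
        x[right], y[right], x[right + 1], y[right + 1])
    return x_right - x_left
```

An ensemble distribution of the kind described above could be formed by evaluating `gaussian_kde_1d` on a shared grid for each selected capture bead (with fiducial positions expressed relative to that bead), averaging the resulting curves, and passing the average to `fwhm`. For a purely Gaussian profile the result approaches 2·sqrt(2·ln 2)·σ, about 2.355σ.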
Slide-seq gene expression data analysis
[0237] After sequencing, gene expression results were processed using the Slide-seq pipeline without bead barcode matching. The gene expression data of the beads was visualized using UMAP with the top 40 principal components and the number of nearest neighbors set to 10. Then robust cell type decomposition (RCTD; ref. 18) was applied to decompose the cell type associated with each capture bead barcode. The single-cell reference dataset used for comparison was the same as the one for the mouse hippocampus mentioned in the RCTD study.
CA1 width analysis
[0238] The spatial distribution of Atp2b1 was profiled in both reconstruction and ground truth. A line perpendicular to the expression pattern of Atp2b1 in the CA1 region was drawn to characterize the expression density of Atp2b1 along this line. The widths of the CA1 region were then determined by the Full Width at Half Maximum (FWHM) of the Atp2b1 expression distribution. This analysis of the CA1 width was conducted across three biological replicates.
Neighborhood enrichment analysis
[0239] The neighborhood enrichment analysis was performed using the squidpy package in python (v.1.2.2). Cell types were defined by RCTD and locations were from reconstruction and ground truth respectively. Results from reconstruction and ground truth were presented in the same scale. Pearson correlation was calculated for the z-score of neighborhood enrichment from reconstruction and ground truth, with a ‘two-sided’ alternative hypothesis.
Matched bead and UMI per bead comparison
[0240] With the same Slide-seq library, each read was matched to a whitelist of barcodes from either in situ sequencing or reconstruction. Among all the beads matched with some gene expression reads, beads with fewer than 20 matched UMIs were filtered out before downstream analysis. The number of matched beads was calculated for in situ sequencing and reconstruction respectively. Also, the distributions of UMIs per bead for in situ sequencing and reconstruction were shown with violin plots.
Slide-tags gene expression data analysis and nuclei positioning
[0241] In reconstruction with Slide-tags, single-nucleus gene expression data and spatial barcode data were analyzed with Cell Ranger v.6.1.2 and CellBender v.0.2.0 according to the Slide-tags pipeline. Nuclei positioning was performed using bead locations from the reconstructed data and from the ground truth, respectively.
Evaluation of nuclei positioning error
[0242] To evaluate the error in nuclei positioning, nuclei locations were first transformed from reconstruction according to the registration between bead locations in reconstruction and ground truth. Nuclei locations from the reconstruction and the ground truth were directly compared to calculate the absolute error. For the analysis of the measurement length error, the pairwise distances between nuclei were examined, employing the same methodology used for the measurement length error of beads.
Optimization of Uniform Manifold Approximation and Projection (UMAP)
[0243] The UMAP reconstruction code was optimized by performing multiple parameter sweeps to settle on the set of parameters (n_neighbors, min_dist, local_connectivity, etc.) that yielded the best results and consistency across multiple runs.
[0244] Next, a Leiden initialization step was performed. Drawing on the finding that a spatial barcode counts matrix can be Leiden clustered into spatially resolved clusters (Liao et al.; DOI: 10.1101/2024.08.06.606834), a “Leiden initialization” algorithm was implemented in which Leiden clustering partitions the counts matrix into discrete clusters, and those clusters are used to build a separate matrix (with dimensions cluster by cluster) that counts, for all the beads within a cluster, the edges that connect to beads outside of that cluster. UMAP of this smaller matrix gives an initial embedding where each bead is in the correct relative position compared to the beads of other clusters, and so UMAP converges in significantly fewer steps and converges more reliably (to a local optimum that matches the known shape of a circle).
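The cluster-by-cluster matrix and the resulting bead initialization can be sketched as follows, under stated assumptions. The sketch takes cluster labels as given (for example from a Leiden implementation, which is not shown), assumes a bead-by-bead adjacency matrix is available (for example from a k-nearest-neighbor graph; the source does not specify how edges are defined), and uses dense arrays that would not scale to very large arrays. The function names and the jitter-based placement of beads around their cluster position are hypothetical.

```python
import numpy as np

def cluster_connectivity(adjacency, labels):
    """Count edges between clusters from a bead-by-bead adjacency matrix."""
    A = np.asarray(adjacency, dtype=float)
    labels = np.asarray(labels)
    n_clusters = int(labels.max()) + 1

    # One-hot membership matrix: beads x clusters.
    onehot = np.eye(n_clusters)[labels]

    # Entry (a, b) sums the edges between beads of cluster a and cluster b.
    M = onehot.T @ A @ onehot

    # Keep only edges that leave a cluster.
    np.fill_diagonal(M, 0.0)
    return M

def initial_positions(cluster_xy, labels, jitter=1.0, seed=0):
    """Place every bead near the 2D position of its cluster.

    ASSUMPTION: `cluster_xy` is a 2D embedding of the cluster-by-cluster
    matrix (e.g., from UMAP); the Gaussian jitter is an illustrative choice.
    """
    rng = np.random.default_rng(seed)
    cluster_xy = np.asarray(cluster_xy, dtype=float)
    labels = np.asarray(labels)
    noise = rng.normal(scale=jitter, size=(len(labels), 2))
    return cluster_xy[labels] + noise
```

In umap-learn, an array of this kind can be supplied through the `init` argument of `umap.UMAP`, which accepts an array of initial coordinates.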
[0245] One significant advantage of the currently disclosed “Slide-tags” approach for spatial reconstruction of macromolecule abundance in a tissue sample is the size of tissue sample to which the arrays and processes of the current disclosure can be successfully applied. In particular, the current disclosure can be applied to significantly larger tissue samples than previously exemplified - e.g., tissue sample sections that are 5 cm across, 7 cm across, 10 cm across, or even larger, can be readily imaged for macromolecule abundance using the compositions and methods disclosed herein. As proof-of-principle, a 7 cm puck (array, used to contact a 7 cm or larger tissue section) was successfully imaged using the approaches disclosed herein. Successful imaging of the 7 cm puck was accomplished by making certain changes to the spatial reconstruction process, including rewriting of the KNN code. Instead of a tree-based approximation algorithm (which eventually would not support running on the counts matrix of size ~8,000,000 by ~24,000,000), a novel KNN algorithm that directly computes cosine distances was designed and employed. Through sparse_dot_topn, very fast computation of k-Nearest Neighbors was achieved. This novel KNN algorithm supported significantly larger sizes of matrices through chunking. Because it uses a direct calculation of the metric and is not tree based, inaccuracies that can occur with the approximation algorithm are reduced. The currently disclosed compositions and methods were therefore applied to significantly larger tissue sections than previously exemplified 1.2 cm pucks, with imaging of pucks of 5 cm or more diameter, 7 cm or more diameter, 8 cm or more diameter, 9 cm or more diameter, 10 cm or more diameter, 11 cm or more diameter, 12 cm or more diameter, 13 cm or more diameter, 14 cm or more diameter, 15 cm or more diameter, 16 cm or more diameter, or larger, now demonstrated.
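The idea of an exact, chunked cosine k-nearest-neighbor search can be illustrated with the dense NumPy sketch below. It is a stand-in for exposition only: the disclosure describes a sparse implementation built on sparse_dot_topn, whereas this example uses dense matrix products, and the function name `cosine_knn_chunked` and its `chunk_size` parameter are hypothetical. It requires k to be smaller than the number of rows.

```python
import numpy as np

def cosine_knn_chunked(X, k, chunk_size=1024):
    """Exact cosine k-nearest neighbors of each row of X, computed in row chunks."""
    X = np.asarray(X, dtype=float)

    # Normalize rows so that a dot product equals cosine similarity.
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    Xn = X / np.maximum(norms, 1e-12)

    n = Xn.shape[0]
    indices = np.empty((n, k), dtype=np.int64)
    distances = np.empty((n, k), dtype=float)

    for start in range(0, n, chunk_size):
        stop = min(start + chunk_size, n)

        # Similarities between this chunk of rows and all rows.
        sim = Xn[start:stop] @ Xn.T

        # Exclude each row from its own neighbor list.
        sim[np.arange(stop - start), np.arange(start, stop)] = -np.inf

        # Select the k most similar rows, then sort them by similarity.
        top = np.argpartition(-sim, k - 1, axis=1)[:, :k]
        top_sim = np.take_along_axis(sim, top, axis=1)
        order = np.argsort(-top_sim, axis=1)

        indices[start:stop] = np.take_along_axis(top, order, axis=1)
        distances[start:stop] = 1.0 - np.take_along_axis(top_sim, order, axis=1)

    return indices, distances
```

Only one chunk of similarities is held in memory at a time, which is the property that lets a direct computation scale with matrix size; the result does not depend on the chunk size chosen.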
Segmentation of P1 mouse bead array
[0246] A 1.2 cm circular bead array was used to profile the spatial transcriptomics of the P1 mouse section although the tissue covered only a portion of the bead array. To differentiate between the tissue-covered and uncovered regions of the bead array, segmentation was performed based on the UMI (Unique Molecular Identifier) count per bead. Kernel Density Estimation (KDE) was employed to estimate the UMI count density across the array, and a threshold for UMI counts was established. Only the beads covered by tissue were retained for further analysis, to save computational memory.
P1 mouse gene expression analysis
[0247] After sequencing, the gene expression results were processed using the Slide-seq pipeline. The gene expression data of the beads were visualized using UMAP, with the top 40 principal components and the number of nearest neighbors set to 10. Then RCTD was applied to decompose the cell type associated with each capture bead barcode with a reference from P1 mouse single cell data. In the reference, cell types that were not present in the tissue section being profiled were excluded.
[0248] Neuronal cell types, including CNS neurons, neural crest and PNS neurons, olfactory sensory neurons, and intermediate neuronal progenitors, were isolated and analyzed again with UMAP embedding and unsupervised clustering. Highly variable genes were found for each subcluster. Olfactory epithelium enriched genes were listed by calculating the ratio of the mean expression level in the olfactory epithelium region to that in the entire section.
[0249] To identify genes with spatial differential expression, nonparametric cell type-specific inference of differential expression (C-SIDE) analysis was performed, focusing on epithelial and muscle cells. Furthermore, Moran's I statistics were calculated to verify the patterned gene expression detected through C-SIDE.
Example 2: Computational Reconstruction of Spatial Locations Through Dimensionality Reduction
[0250] "Slide-seq" is a high resolution spatial transcriptomic approach that utilizes arrays of barcoded beads (10-micron polystyrene beads) for spatial capture of macromolecules (e.g., RNAs). Previously, the "Slide-seq" approach has been implemented with inclusion of an imaging step designed to detect the sequence and position of individual array elements (e.g., via barcoding of, e.g., clusters of oligonucleotides, probes and/or beads), thereby allowing for immediate placement of macromolecule abundance data at a position within a surveyed array. However, this imaging/position detection step can be expensive and laborious for an end-user to implement. It was newly contemplated herein that examination of oligonucleotide and/or macromolecule diffusion data across a "Slide-seq" or other array format might allow for reconstruction of array element positions (e.g., bead positioning) simply from measured macromolecule abundance data, particularly where deep sampling of macromolecule abundance data is obtained. It was specifically theorized that diffusional interactions could be simulated between beads on a 2D array. It was therefore examined whether computational reconstruction could map spatial locations using diffusion-based proximity data. Initially, a framework to generate simulated diffusion data across a "Slide-seq" array was implemented.
[0251] To visualize the reconstruction effect, capture and fiducial beads were uniformly sampled from a circular area with a color pattern of a letter H (FIG. 1B, FIG. 3A). The diffusion of barcodes from each fiducial bead was simulated to follow a Gaussian distribution (FIG. 3B). Due to this diffusion, capture beads exhibited proximity-dependent capture of barcodes from fiducial beads, generating a neighboring matrix between bead barcodes (FIG. 1A). Specifically, capture beads registered higher count values for barcodes from fiducial beads that were closer, while distant fiducial beads were associated with zero counts. While each capture barcode can be characterized by count values on its associated fiducial barcodes, physically adjacent capture beads capture similar barcodes from similar fiducial beads and are closer in the high-dimensional space of fiducial bead barcodes. It was reasoned that dimensionality reduction algorithms might be able to reconstruct the latent two-dimensional representation of physical space from this high-dimensional interaction matrix. In particular, Uniform Manifold Approximation and Projection (UMAP), a nonlinear dimensionality reduction method, was performed, which reduces the high-dimensional capture bead barcode by fiducial bead barcode data into a two-dimensional embedding space, while preserving the similarity of capture bead barcodes in high-dimensional space (FIGs. 3A-3G). While all parameter settings produced clustering by proximity, a range of parameters was found in which locations of capture beads in this two-dimensional embedding were highly similar to their physical locations, indicating UMAP learned the intrinsic two-dimensional manifold embedded within the high-dimensional data (FIG. 1B, FIGs. 3C and 3D). The accuracy of reconstruction was quantitatively assessed by comparing the reconstructed locations of the capture beads to their known physical positions, following a rigid transformation for alignment. 
The median error in simulation was found to be equivalent to 1.6% of the array's diameter (FIGs. 3E-3G). Other dimensionality reduction methods, encompassing both linear and nonlinear, were assessed using identical simulation data. While all methods showed promise in discerning the order of bead arrangement, they incurred larger absolute errors when compared to the results of UMAP.
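The simulated diffusion data described in this paragraph can be sketched as follows. This is an illustrative reconstruction of the setup, not the code used for the figures: beads are sampled uniformly from a disk, each capture bead draws a fixed number of UMIs from fiducial beads with Gaussian distance weighting, and the names `sample_disk`, `simulate_diffusion_counts`, `sigma`, and `umis_per_bead` are assumptions.

```python
import numpy as np

def sample_disk(n, radius, rng):
    """Draw n points uniformly from a disk of the given radius."""
    r = radius * np.sqrt(rng.random(n))  # sqrt gives uniform area density
    theta = 2.0 * np.pi * rng.random(n)
    return np.column_stack((r * np.cos(theta), r * np.sin(theta)))

def simulate_diffusion_counts(capture_xy, fiducial_xy, sigma, umis_per_bead, rng):
    """Capture-by-fiducial count matrix under Gaussian barcode diffusion."""
    d2 = ((capture_xy[:, None, :] - fiducial_xy[None, :, :]) ** 2).sum(axis=-1)
    weights = np.exp(-d2 / (2.0 * sigma**2))
    # Each capture bead samples its UMIs from fiducials in proportion to weight.
    probs = weights / weights.sum(axis=1, keepdims=True)
    return np.vstack([rng.multinomial(umis_per_bead, p) for p in probs])
```

The resulting matrix has one row per capture bead and one column per fiducial bead, which is the high-dimensional input that a dimensionality reduction method such as UMAP would then embed in two dimensions.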
[0252] To assess the practical applicability of this computational reconstruction method in experiments, this method’s dynamic range was explored by tuning various parameters in simulation. Reconstruction errors were analyzed with different ratios of fiducial to capture bead numbers. With the ratio ranging from 1:5 to 5:1, the simulation displayed median errors less than 2% of the array size (FIG. 1C, FIG. 12). This range of ratios was subsequently considered for the design of experimental mixed arrays. The effects of changing captured unique molecular identifiers (UMIs) per capture bead and diffusion distance (σ) were also simulated (FIG. 1C, FIG. 12). Both excessively narrow and excessively wide diffusion distances led to larger errors, whereas a greater number of UMIs per bead enabled better positioning accuracy. A wide range of parameters demonstrated feasibility for computational reconstruction, for example, diffusion distance (σ) between 2% and 6% of the array size and a minimum of 40 UMIs per bead.
Example 3: Implementing Spatial Transcriptomics through Computational Reconstruction
[0253] Implementation and validation of the reconstruction strategy was then attempted experimentally by performing the "Slide-seq" assay on an array that was both spatially indexed and reconstructed by the new approach. The array mixed the original barcoded poly(dT) beads for capturing mRNA with barcoded poly(dA) fiducial beads to enable diffusion-based reconstruction (FIG. 1A, FIG. 1D, FIG. 9, FIG. 14A, FIG. 14B). Next, in situ sequencing was performed to spatially index the array, using the standard approach as previously described, to generate ground truth positions before reconstruction. To obtain spatial coordinates of beads by reconstruction, the oligonucleotide barcodes on poly(dA) beads (fiducial beads, also called diffusible beads) were cleaved with UV and were captured by nearby poly(dT) beads (capture beads) (FIG. 1D). To validate the reconstruction data, ground truth positions of the same array were generated before reconstruction using in situ sequencing. The distribution of capture bead barcodes on fiducial bead barcodes followed a heavy-tailed distribution, with the full width at half maximum (FWHM) around 123.1 µm (FIG. 1E).
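The FWHM quoted above is a width read off a measured profile. As a minimal sketch of how such a value could be computed, assuming a single-peaked profile sampled on an ordered grid, the hypothetical helper `fwhm` below interpolates linearly at the half-maximum crossings. The source does not state how its FWHM values were obtained.

```python
import numpy as np

def fwhm(x, y):
    """Full width at half maximum of a single-peaked profile y(x)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    half = y.max() / 2.0
    above = np.flatnonzero(y >= half)
    lo, hi = above[0], above[-1]

    def crossing(i, j):
        # x position where the segment between samples i and j reaches `half`.
        return x[i] + (half - y[i]) * (x[j] - x[i]) / (y[j] - y[i])

    left = x[lo] if lo == 0 else crossing(lo - 1, lo)
    right = x[hi] if hi == len(x) - 1 else crossing(hi, hi + 1)
    return right - left
```

For a purely Gaussian profile with standard deviation σ this returns approximately 2√(2 ln 2)·σ, or about 2.355σ; a heavy-tailed profile such as the one reported here does not follow that relation exactly.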
[0254] Given that this approach generated diffusion of spatial information from fiducial beads to capture beads, it was next examined whether relative spatial locations could be reconstructed. UMAP was applied upon the high-dimensional diffusion information to reconstruct the relative locations of anchor beads in two-dimensional embedding space, without any spatial information input. In addition to the two main UMAP parameters (n_neighbors, min_dist) that were tuned in the simulation, it was found that increasing the number of epochs, using the cosine metric for computing high-dimensional distance, and applying a log1p transformation of the diffusion matrix improved the reconstruction accuracy. Compared to ground truth from in situ sequencing, reconstructed locations recovered the global arrangement of capture beads. While it was observed that a few beads (20 out of about 16000) were positioned significantly away (>200 µm) from ground truth positions, most of these were attributable to in situ sequencing errors or barcode collisions.
[0255] In view of the current approach's reconstruction of bead locations, spatial transcriptomics data captured by the array was then examined. Slide-seqV2 was performed using the same reconstructed array with the capture beads (poly(dT)) on a mouse hippocampal transcript with no modifications to the protocol. Individual bead profiles were clustered and assigned cell types by robust cell type decomposition (RCTD) (FIG. 4D). Spatial representation of cell types with reconstructed capture bead locations demonstrated the known structures of the hippocampus (FIG. 1F, FIG. 4D). This was especially clear when examining cell-type distributions and marker gene plots, which were virtually indistinguishable between reconstruction and ground truth data.
[0256] Quantification of the accuracy of the reconstruction process was then attempted. Reconstruction accuracy was assessed with three strategies: examining the absolute error, relative error, and histological structure preservation in comparison to ground truth. First, to assess absolute error, each capture bead’s absolute displacement was calculated via a rigid registration between reconstruction result and ground truth. The median value of displacement lengths was 25.9 µm (FIGs. 4A-4C and FIG. 15). The absolute error can be affected by the registration process; this indicated that a more appropriate statistic would be the error in distance measurements between reconstruction and ground truth (FIG. 1G, FIG. 4E, FIG. 15). The intuition here is that most commonly, the pairwise distance measurements (e.g., length, neighborhoods, spatial proximity) are being quantified, and thus, the error of pairwise distance measurements is taken into consideration. The root-mean-square (RMS) error of length measurements was quantified across all pairwise length measurements as a function of measurement length. RMS error was close to 10 µm (the bead size) at local scale measurements (about 100 µm), as nearby beads were usually displaced in the same direction, and plateaued at ~20 µm for measurement lengths >1000 µm (representing <2% error in measurement lengths) (FIG. 4E).
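The two error statistics in this paragraph can be sketched as follows, assuming reconstructed and ground truth coordinates are row-matched arrays already expressed in the same units. The rigid registration shown is an orthogonal Procrustes fit without scaling (reflections allowed, since an embedding has arbitrary handedness). All function names are hypothetical, and the binning by true distance is an assumption about how the length dependence is reported.

```python
import numpy as np

def rigid_align(recon, truth):
    """Least-squares rotation/reflection plus translation of recon onto truth."""
    mu_r, mu_t = recon.mean(axis=0), truth.mean(axis=0)
    U, _, Vt = np.linalg.svd((recon - mu_r).T @ (truth - mu_t))
    return (recon - mu_r) @ (U @ Vt) + mu_t

def median_displacement(recon, truth):
    """Median per-bead displacement after rigid registration (absolute error)."""
    aligned = rigid_align(recon, truth)
    return np.median(np.linalg.norm(aligned - truth, axis=1))

def pairwise_rms_error(recon, truth, bin_edges):
    """RMS error of pairwise distances, binned by the true pairwise distance."""
    i, j = np.triu_indices(len(truth), k=1)
    d_true = np.linalg.norm(truth[i] - truth[j], axis=1)
    d_rec = np.linalg.norm(recon[i] - recon[j], axis=1)
    which = np.digitize(d_true, bin_edges) - 1
    rms = np.full(len(bin_edges) - 1, np.nan)  # NaN marks empty bins
    for b in range(len(bin_edges) - 1):
        sel = which == b
        if sel.any():
            rms[b] = np.sqrt(np.mean((d_rec[sel] - d_true[sel]) ** 2))
    return rms
```

The pairwise statistic needs no registration at all, because distances between beads are unchanged by rotation, reflection, and translation; that is why it is insensitive to registration error.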
[0257] To assess the effect of reconstruction error on histological structure, the width of the CA1 layer in the hippocampus was measured. The widths, characterized by CA1 marker Atp2b1 expression, were similar in ground truth and reconstruction (FWHM is 49.5 µm in ground truth and 43.7 µm in reconstruction) (FIG. 1H, FIG. 15). Neighborhood enrichment analysis (Example 1: Methods) between all pairs of cell types was performed, and the results were found to be highly similar between reconstruction and ground truth (with Pearson correlation coefficient = 0.997) (FIG. 4G, FIG. 16). Lastly, the reconstruction error across three biological replicates in the hippocampus was evaluated, and it was observed that results were largely concordant across replicates, thereby demonstrating the robustness of the reconstruction procedure (FIG. 1G, FIG. 4F, FIG. 15). It was also found that across the three replicates, the gene expression information with matched spatial barcode increased by 1.6 times in reconstruction compared to ground truth (FIG. 4H). These data demonstrated that the reconstruction error was relatively small (~25 µm, 2.5 times the bead size), with a subtle effect on local spatial transcriptomics analysis.
Example 4: Spatial Reconstruction at Single-Nucleus Resolution
[0258] As the computational reconstruction strategy should be adaptable to all array-based spatial technologies, the reconstruction was applied with "Slide-tags". In "Slide-tags", a recently developed single-nucleus spatial technology, nucleic acid spatial barcodes are photocleaved from barcoded arrays and associated with nuclei, followed by single-nucleus RNA sequencing with single-cell indexing. To apply reconstruction to "Slide-tags", arrays with mixtures of photocleavable poly(dA) (fiducial, also referred to as diffusible) and non-cleavable poly(dT) (capture) beads were generated at a ratio of 3 to 1. The fiducial beads (poly(dA)) were photocleaved, diffused, and captured by capture beads while nuclei were tagged at the same time (FIG. 2A, FIG. 17).
[0259] Reconstruction was implemented upon "Slide-tags" contacted with a mouse hippocampal section. In addition to computational reconstruction of the array, in situ sequencing was performed to generate ground truth spatial positions on the same array. High-quality single-nucleus spatial data (2091 genes per cell) were generated, which showed high quality clustering of major cell types in the hippocampus (FIG. 2B, FIG. 17). Next, the bead locations were reconstructed with diffusion information, and each nucleus was positioned based on bead barcodes that tagged the nucleus. Following reconstruction and spatial placement, the spatial representation of different cell types captured the architecture of the hippocampus (FIG. 2C, FIG. 17).
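The paragraph above states only that each nucleus was positioned based on the bead barcodes that tagged it. One simple way to do this, shown here purely as an assumption and not as the disclosed method, is a UMI-weighted centroid of the reconstructed positions of the tagging beads. The function name `position_nuclei` and the matrix layout (nuclei by beads) are hypothetical.

```python
import numpy as np

def position_nuclei(tag_counts, bead_xy):
    """UMI-weighted centroid of the beads whose barcodes tagged each nucleus.

    tag_counts: (n_nuclei, n_beads) spatial-barcode UMI counts per nucleus.
    bead_xy:    (n_beads, 2) reconstructed bead coordinates.
    """
    tag_counts = np.asarray(tag_counts, dtype=float)
    bead_xy = np.asarray(bead_xy, dtype=float)
    totals = tag_counts.sum(axis=1, keepdims=True)
    with np.errstate(invalid="ignore", divide="ignore"):
        # Nuclei with no spatial barcodes come out as NaN and can be filtered.
        return (tag_counts @ bead_xy) / totals
```

A plain centroid is sensitive to stray barcodes far from the true location, so a practical pipeline would likely filter or cluster the tagging beads first; that step is omitted here.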
[0260] The accuracy of the reconstruction of "Slide-tags" was then evaluated. The absolute positioning errors of each bead and each nucleus were calculated by comparing to ground truth locations from in situ sequencing, after applying a rigid transformation (FIGs. 5A-5G, FIG. 18). The median values of bead positioning error (25.4 µm) and nuclei positioning error (27.2 µm) were found to be similar (FIGs. 5A-5G, FIG. 18). To avoid the error introduced by registration, the RMS error of measurement lengths was calculated, which was smaller than 20 µm at local measurement scales (<500 µm) and plateaued at ~20 µm for measurement lengths >1000 µm (representing <3% error in measurement lengths) (FIG. 2D, FIG. 5G, FIG. 18). To assess the effect of reconstruction error on biological structure, the spatial representation of each clustered cell type was compared in reconstruction and ground truth. No detectable difference in the dimensions of brain structures was found (FIG. 18). Lastly, "Slide-seq" and "Slide-tags" showed similar performance with respect to reconstruction error, likely due to the shared characteristics of the reconstruction protocol.
Example 5: Computational Reconstruction Enabled Spatial Transcriptomics At Large Scale
[0261] The instantly disclosed reconstruction technique is purely performed through molecular biology reactions, and, thus, is not limited by imaging throughput. To demonstrate the scalability of the current reconstruction approach for spatial transcriptomics, reconstruction on a 1.2-centimeter P1 mouse cranial section was performed with "Slide-seq" adapted to use the current reconstruction techniques. Transcriptomes were spatially profiled across different tissue types, including brain, muscle, and the upper respiratory system, with a single section (FIG. 2E, FIGs. 20A-20B). When compared with hematoxylin and eosin staining of an adjacent section, reconstruction successfully identified the compartmentalization of different tissues and elucidated fine structural details (FIG. 2E, FIGs. 20A-20B). Decomposed cell types were assigned to each bead with robust cell type decomposition (RCTD; FIG. 2F). To assess the fidelity of reconstructed tissue structure, the spatial distributions of certain cell types that exhibit unique spatial localization were represented: adipocytes around the anterior cervical region; neuronal cells in the central nervous system (CNS), peripheral nervous system (PNS), and olfactory sensory region (FIG. 2G, FIG. 21). The locations of these cell types were highly correlated with the spatial expression pattern of their marker genes (FIG. 2H, FIG. 21). For instance, the locations of fibroblasts and osteoblasts corresponded with the distribution of type I collagen, which is abundantly present in tendons and bones. To examine the spatial transcriptomics data in detail, beads that were assigned neuronal cell types were gathered and subjected to further clustering. Such subclustering revealed distinctions between CNS, PNS, and olfactory neurons, which were highly correlated with the expression pattern of their respective marker genes. 
Cortical neurons were found with this higher resolution of clustering (FIGs. 2I-2J).
[0262] With this comprehensive spatial transcriptomics profiling, it was investigated whether genes with spatially differential expression could be identified. Olfactory epithelium was initially focused upon, due to the olfactory epithelium's well-organized structures and multilayered cell compositions. Spatial expression of previously identified genes was profiled, and the results were found to be consistent with in situ hybridization: carbonyl reductase 2 (Cbr2), a sustentacular cell marker, exhibited higher expression in the olfactory epithelium’s outer layer; regenerating islet-derived protein 3 gamma (Reg3g), a respiratory epithelium marker, exhibited a significant expression decrease at the boundary between respiratory epithelium and olfactory epithelium; and growth associated protein 43 (Gap43), which marks immature olfactory sensory neurons (OSNs), was only expressed in the inner olfactory epithelium layer where immature OSNs are located (FIGs. 22A-22C). Furthermore, by analyzing genes enriched in the olfactory epithelium, additional spatially variable genes in the olfactory epithelium were profiled. The spatial profiling of spatially differentially expressed genes in olfactory epithelium further displayed the high fidelity of the reconstruction method at the level of fine structural details.
[0263] As the data set covered a wide range of tissue types, it was examined whether cell type-specific differential gene expression could be identified across the entire section. Nonparametric cell type-specific inference of differential expression (C-SIDE) of epithelial specific genes was performed to identify genes with spatially variable expression. Moran’s I statistic, a spatial autocorrelation measurement, was also calculated for genes with high C-SIDE Z-scores. The results from both analyses indicated a nonrandom pattern in the variable expression of these genes (FIG. 2K). Spatial representation of selected genes displayed their distinct expression patterns across various regions (FIG. 2L). For example, BPI fold containing family A member 2 (Bpifa2) was prominently expressed in the parotid gland epithelial cells. In addition, spatial differential expression in muscle cells was analyzed in a similar manner, and 277 region specific muscle genes were found. These analyses demonstrate the reconstruction methods disclosed herein are capable of discovering spatially relevant genes and revealing the complexity of gene expression patterns.
[0264] Molecular diffusion based computational reconstruction approaches, such as those disclosed herein, have now been demonstrated to enable imaging-free spatial transcriptomics with high fidelity. In particular, UMAP was identified as particularly capable of reconstructing a two-dimensional barcode array given a diffusive interaction process between barcodes. In this process, UMAP successfully learns the intrinsic two-dimensional manifold embedded within the high-dimensional diffusion data. The rationale of using a dimensionality reduction method for this reconstruction task was that the bead barcode diffusion process generated information in a high dimensional space in which physically closer bead barcodes were also closer. 
Thus, to reconstruct the physical locations, a method should preserve the high dimensional distances while reducing relationships to a low dimensional space, which matches the aim of dimensionality reduction methods. Intuitively, nonlinear dimensionality reduction methods may be preferred due to the nonlinear nature of the diffusion process. Furthermore, in contrast to t-Distributed Stochastic Neighbor Embedding (t-SNE), which primarily conserves local distances, UMAP is adept at preserving a broader spectrum of distances. Empirically, UMAP outperforms a variety of other dimensionality reduction algorithms with the highest accuracy and the shortest running time.
[0265] The error of reconstruction was systematically assessed. Although the median absolute error is 25 µm, it is crucial to contextualize its impact on experiments using reconstructed data. Reconstruction introduces slight nonrigid deformations; however, these deformations are smooth, preserving local information. Thus, although a pixel may be 25 µm from its ground truth location, nearby pixels experience consistent, concordant displacements due to the smooth nature of the deformation (FIG. 4B, FIG. 5B, FIG. 5E). The assessment of absolute error by direct comparison between the actual and reconstructed bead locations may overestimate errors, as reconstruction deformation may be locally smooth. Therefore, evaluating the relative error in distances between pairs of beads provided a more reliable metric. Within the range of several hundred micrometers, bead displacements tended to be concordant. Thus, for small measurement lengths (<500 µm), the error remained small, typically under 20 µm. Lastly, the distortion of reconstruction was evaluated when analyzing biological structure such as the mouse hippocampus. It was found that such locally concordant errors exerted a minor effect on local structures or neighboring cell analyses. These deformations are comparable in magnitude and nature to tissue distortions commonly observed during processing steps like freezing, fixation and sectioning. Local rigid registration further reduces displacements in Slide-seq and Slide-tags. Thus, computational reconstruction through dimensionality reduction generated high-fidelity, high-resolution spatial transcriptomics data.
[0266] The registration process contributes to absolute error, making relative error in distance measurements a more accurate metric. 
The relative error in length measurements is particularly relevant because most spatial analyses depend on distance measurements, such as quantifying intercellular distances, defining cellular neighborhoods and identifying spatially varying gene expression. Here, empirical analyses of biological structures, such as the mouse hippocampus CA1 region, confirm that locally concordant errors have negligible effects on measurements of local structures or cellular neighborhoods. Though minor uncertainties may arise in absolute registration across sections, they are unlikely to affect overall conclusions.
[0267] Despite the relatively high fidelity of the reconstruction strategy given its simplicity, it nonetheless introduced some variation into the spatial transcriptomics data. Higher accuracy in reconstruction could potentially be attained by solving it as an assignment problem with predefined locations; for instance, by imaging the bead array in bright field conditions and identifying the locations of beads while the corresponding barcodes remain unidentified. Lastly, while reconstruction decouples the scale of spatial transcriptomics from imaging, the sequencing cost of reconstruction grows linearly with the array area. While costs are currently modest for reconstruction, they may continue to decrease with the exponential decrease in sequencing costs and adoption of novel sequencing technologies.
[0268] Computational reconstruction, as disclosed and applied herein, can therefore enable spatial transcriptomics at a large scale and high throughput. An elegant aspect of molecular biology tools supporting modern day genomics (e.g., RNA-seq, epigenomics) has been the ability to distribute through open-source protocols and readily available enzymes. Here, it was demonstrated that spatial transcriptomics can be effectively converted into a molecular biology tool, as opposed to one that requires specialized equipment, thereby allowing for widespread accessibility for the scientific community. While the technology is currently demonstrated with "Slide-seq" and "Slide-tags" approaches, the experimental and computational approaches disclosed herein are likely highly generalizable to array-based capture technologies, as well as potentially directly within tissues in 3D contexts. Lastly, decoupling from microscopy allows for performance of spatial genomics at spatial scales not limited by microscopy, including, e.g., spatial scales to reach the dimensions of entire human organs.
[0269] Processes described herein may be performed singly or collectively by one or more computer systems, such as one or more computer system(s) executing software to perform spatial assessment of macromolecule abundance (e.g., RNA expression, DNA abundance, protein abundance) according to the techniques described herein. FIG. 25 depicts an example of a computer system and associated devices to perform spatial assessment of macromolecule abundance according to the techniques described herein. A computer system may also be referred to herein as a data processing device/system, computing device/system/node, or simply a computer. The computer system may be based on one or more of various system architectures and/or instruction set architectures, such as those offered by Intel Corporation (Santa Clara, California, USA) or Apple computer (Cupertino, CA) as examples. FIG. 25 shows a computer system 100 in communication with external device(s) 112. Computer system 100 includes one or more processor(s) 102, for instance central processing unit(s) (CPUs), or 103, for example graphics processing unit(s) (GPUs). A processor 102 or 103 can include functional components used in the execution of instructions, such as functional components to fetch program instructions from locations such as cache or main memory, decode program instructions, and execute program instructions, access memory for instruction execution, and write results of the executed instructions. A processor 102 or 103 can also include register(s) to be used by one or more of the functional components. Computer system 100 also includes memory 104, input/output (I/O) devices 108, and I/O interfaces 110, which may be coupled to processor(s) 102 and/or 103 and each other via one or more buses and/or other connections. 
Bus connections represent one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, and not limitation, such architectures include the Industry Standard Architecture (ISA), the Micro Channel Architecture (MCA), the Enhanced ISA (EISA), the Video Electronics Standards Association (VESA) local bus, and the Peripheral Component Interconnect (PCI).
[0270] Memory 104 can be or include main or system memory (e.g. Random Access Memory) used in the execution of program instructions, storage device(s) such as hard drive(s), flash media, or optical media as examples, and/or cache memory, as examples. Memory 104 can include, for instance, a cache, such as a shared cache, which may be coupled to local caches (examples include L1 cache, L2 cache, etc.) of processor(s) 102. Additionally, memory 104 may be or include at least one computer program product having a set (e.g., at least one) of program modules, instructions, code or the like that is/are configured to carry out functions of embodiments described herein when executed by one or more processors.
[0271] Memory 104 can store an operating system 105 and other computer programs 106, such as one or more computer programs/applications that execute to perform aspects described herein. Specifically, programs/applications can include computer readable program instructions that may be configured to carry out functions of embodiments or aspects described herein.
[0272] Examples of I/O devices 108 include but are not limited to microphones, speakers, Global Positioning System (GPS) devices, cameras, lights, accelerometers, gyroscopes, magnetometers, sensor devices configured to sense light, proximity, heart rate, body and/or ambient temperature, blood pressure, and/or skin resistance, and activity monitors. An I/O device may be incorporated into the computer system as shown, though in some embodiments an I/O device may be regarded as an external device 112 coupled to the computer system through one or more I/O interfaces 110.
[0273] Computer system 100 may communicate with one or more external devices 112 via one or more I/O interfaces 110. Example external devices include a keyboard, a pointing device, a display, and/or any other devices that enable a user to interact with computer system 100. Other example external devices include any device that enables computer system 100 to communicate with one or more other computing systems or peripheral devices such as a printer. A network interface/adapter is an example I/O interface that enables computer system 100 to communicate with one or more networks, such as a local area network (LAN), a general wide area network (WAN), and/or a public network (e.g., the Internet), providing communication with other computing devices or systems, storage devices, or the like. Ethernet-based (such as Wi-Fi) interfaces and Bluetooth® adapters are just examples of the currently available types of network adapters used in computer systems.
[0274] The communication between I/O interfaces 110 and external devices 112 can occur across wired and/or wireless communications link(s) 111, such as Ethernet-based wired or wireless connections. Example wireless connections include cellular, Wi-Fi, Bluetooth®, proximity-based, near-field, or other types of wireless connections. More generally, communications link(s) 111 may be any appropriate wireless and/or wired communication link(s) for communicating data.
[0275] Particular external device(s) 112 may include one or more data storage devices, which may store one or more programs, one or more computer readable program instructions, and/or data, etc. Computer system 100 may include and/or be coupled to and in communication with (e.g. as an external device of the computer system) removable/non-removable, volatile/non-volatile computer system storage media. For example, it may include and/or be coupled to a non-removable, non-volatile magnetic media (typically called a “hard drive”), a magnetic disk drive for reading from and writing to a removable, non-volatile magnetic disk (e.g., a “floppy disk”), and/or an optical disk drive for reading from or writing to a removable, non-volatile optical disk, such as a CD-ROM, DVD-ROM or other optical media.
[0276] Computer system 100 may be operational with numerous other general purpose or special purpose computing system environments or configurations. Computer system 100 may take any of various forms, well-known examples of which include, but are not limited to, personal computer (PC) system(s), server computer system(s), such as messaging server(s), thin client(s), thick client(s), workstation(s), laptop(s), handheld device(s), mobile device(s)/computer(s) such as smartphone(s), tablet(s), and wearable device(s), multiprocessor system(s), microprocessor-based system(s), telephony device(s), network appliance(s) (such as edge appliance(s)), virtualization device(s), storage controller(s), set top box(es), programmable consumer electronic(s), network PC(s), minicomputer system(s), mainframe computer system(s), and distributed cloud computing environment(s) that include any of the above systems or devices, and the like.
[0277] Aspects of the present invention may be a system, a method, and/or a computer program product, any of which may be configured to perform or facilitate aspects described herein.
[0278] In some embodiments, aspects of the present invention may take the form of a computer program product, which may be embodied as computer readable medium(s). A computer readable medium may be a tangible storage device/medium having computer readable program code/instructions stored thereon. Example computer readable medium(s) include, but are not limited to, electronic, magnetic, optical, or semiconductor storage devices or systems, or any combination of the foregoing. Example embodiments of a computer readable medium include a hard drive or other mass-storage device, an electrical connection having wires, random access memory (RAM), read-only memory (ROM), erasable-programmable read-only memory such as EPROM or flash memory, an optical fiber, a portable computer disk/diskette, such as a compact disc read-only memory (CD-ROM) or Digital Versatile Disc (DVD), an optical storage device, a magnetic storage device, or any combination of the foregoing. The computer readable medium may be readable by a processor, processing unit, or the like, to obtain data (e.g. instructions) from the medium for execution. In a particular example, a computer program product is or includes one or more computer readable media that includes/ stores computer readable program code to provide and facilitate one or more aspects described herein.
[0279] As noted, program instructions contained or stored in/on a computer readable medium can be obtained and executed by any of various suitable components such as a processor of a computer system to cause the computer system to behave and function in a particular manner. Such program instructions for carrying out operations to perform, achieve, or facilitate aspects described herein may be written in, or compiled from code written in, any desired programming language. In some embodiments, such programming language includes object-oriented and/or procedural programming languages such as C, C++, C#, Java, Perl, Python, etc.
[0280] Program code can include one or more program instructions obtained for execution by one or more processors. Computer program instructions may be provided to one or more processors of, e.g., one or more computer systems, to produce a machine, such that the program instructions, when executed by the one or more processors, perform, achieve, or facilitate aspects of the present invention, such as actions or functions described in flowcharts and/or block diagrams described herein. Thus, each block, or combinations of blocks, of the flowchart illustrations and/or block diagrams depicted and described herein can be implemented, in some embodiments, by computer program instructions.
[0281] All patents and publications mentioned in the specification are indicative of the levels of skill of those skilled in the art to which the disclosure pertains. All references cited in this disclosure are incorporated by reference to the same extent as if each reference had been incorporated by reference in its entirety individually.
[0282] One skilled in the art would readily appreciate that the present disclosure is well adapted to carry out the objects and obtain the ends and advantages mentioned, as well as those inherent therein. The methods and compositions described herein as presently representative of preferred embodiments are exemplary and are not intended as limitations on the scope of the disclosure. Changes therein and other uses will occur to those skilled in the art; these are encompassed within the spirit of the disclosure as defined by the scope of the claims.
[0283] In addition, where features or aspects of the disclosure are described in terms of Markush groups or other grouping of alternatives, those skilled in the art will recognize that the disclosure is also thereby described in terms of any individual member or subgroup of members of the Markush group or other group.
[0284] The use of the terms "a" and "an" and "the" and similar referents in the context of describing the disclosure (especially in the context of the following claims) are to be construed to cover both the singular and the plural, unless otherwise indicated herein or clearly contradicted by context. The terms "comprising," "having," "including," and "containing" are to be construed as open-ended terms (i.e., meaning "including, but not limited to,") unless otherwise noted. Recitation of ranges of values herein are merely intended to serve as a shorthand method of referring individually to each separate value falling within the range, unless otherwise indicated herein, and each separate value is incorporated into the specification as if it were individually recited herein.
[0285] All methods described herein can be performed in any suitable order unless otherwise indicated herein or otherwise clearly contradicted by context. The use of any and all examples, or exemplary language (e.g., "such as") provided herein, is intended merely to better illuminate the disclosure and does not pose a limitation on the scope of the disclosure unless otherwise claimed. No language in the specification should be construed as indicating any non-claimed element as essential to the practice of the disclosure.
[0286] Embodiments of this disclosure are described herein, including the best mode known to the inventors for carrying out the disclosed invention. Variations of those embodiments may become apparent to those of ordinary skill in the art upon reading the foregoing description.
[0287] The disclosure illustratively described herein suitably can be practiced in the absence of any element or elements, limitation or limitations that are not specifically disclosed herein. Thus, for example, in each instance herein any of the terms "comprising", "consisting essentially of", and "consisting of" may be replaced with either of the other two terms. The terms and expressions which have been employed are used as terms of description and not of limitation, and there is no intention in the use of such terms and expressions of excluding any equivalents of the features shown and described or portions thereof, but it is recognized that various modifications are possible within the scope of the invention claimed. Thus, it should be understood that although the present disclosure provides preferred embodiments, optional features, modification and variation of the concepts herein disclosed may be resorted to by those skilled in the art, and that such modifications and variations are considered to be within the scope of this disclosure as defined by the description and the appended claims.
[0288] It will be readily apparent to one skilled in the art that varying substitutions and modifications can be made to the invention disclosed herein without departing from the scope and spirit of the invention. Thus, such additional embodiments are within the scope of the present disclosure and the following claims. The present disclosure teaches one skilled in the art to test various combinations and/or substitutions of chemical modifications described herein toward generating conjugates possessing improved contrast, diagnostic and/or imaging activity. Therefore, the specific embodiments described herein are not limiting and one skilled in the art can readily appreciate that specific combinations of the modifications described herein can be tested without undue experimentation toward identifying conjugates possessing improved contrast, diagnostic and/or imaging activity.
[0289] The inventors expect skilled artisans to employ such variations as appropriate, and the inventors intend for the disclosure to be practiced otherwise than as specifically described herein. Accordingly, this disclosure includes all modifications and equivalents of the subject matter recited in the claims appended hereto as permitted by applicable law. Moreover, any combination of the above-described elements in all possible variations thereof is encompassed by the disclosure unless otherwise indicated herein or otherwise clearly contradicted by context. Those skilled in the art will recognize or be able to ascertain using no more than routine experimentation, many equivalents to the specific embodiments of the disclosure described herein. Such equivalents are intended to be encompassed by the following claims.
[0290] Features described above as well as those claimed below may be combined in various ways without departing from the scope hereof. The following examples illustrate some possible, nonlimiting combinations:
[0291] (A1) A method for generating a spatial representation of macromolecule abundance from a sample comprising: (i) contacting first oligonucleotides bound to a solid support and present in a positional array with a sample under conditions suitable for first oligonucleotide-macromolecule binding, wherein the first oligonucleotides comprise: an array location identifier sequence that is common to all first oligonucleotides in a given element in the positional array; and a macromolecule-specific capture sequence; (ii) obtaining sequence information for a population of macromolecules bound to the first oligonucleotides and the array location identifier sequence of the respective first oligonucleotides bound to each of the macromolecules of the population of macromolecules for which sequence information is obtained; and (iii) generating a computational reconstruction of the spatial locations of the population of macromolecules bound to the first oligonucleotides from inputs minimally comprising the obtained sequence information for the array location identifier sequences and molecular diffusion patterns of the macromolecules of the population of macromolecules for which sequence information is obtained, relative to the first oligonucleotides present in the positional array, thereby generating a spatial representation of macromolecule abundance from the sample.
[0292] (A2) For the method denoted as (A1), wherein the macromolecules are selected from the group consisting of RNA, DNA, protein, and combinations thereof.
[0293] (A3) For the method denoted as (A2), wherein the RNA is a poly-A-tailed RNA, optionally a mRNA.
[0294] (A4) For the method denoted as any one of (A1) through (A3), wherein the macromolecule-specific capture sequence comprises a poly-dT tail of sufficient length to allow for capture of poly-A-tailed RNAs via hybridization.
[0295] (A5) For the method denoted as any one of (A1) through (A4), wherein the macromolecule-specific capture sequence comprises a gene-specific sequence or a transcript-specific sequence.
[0296] (A6) For the method denoted as (A2), wherein the DNA is selected from the group consisting of a genomic DNA and a barcode DNA.
[0297] (A7) For the method denoted as any one of (A1) through (A6), wherein the macromolecule-specific capture sequence is a component of a loaded transposase.
[0298] (A8) For the method denoted as any one of (A1) through (A7), wherein the positional array possesses a resolution of 50 micrometers or less between individual elements of the positional array, optionally wherein the positional array possesses a resolution of 30 micrometers or less between individual elements of the positional array, optionally wherein the positional array possesses a resolution of 20 micrometers or less between individual elements of the positional array, optionally wherein the positional array possesses a resolution of 10 micrometers or less between individual elements of the positional array.
[0299] (A9) For the method denoted as any one of (A1) through (A8), wherein the sample is a tissue sample.
[0300] (A10) For the method denoted as (A9), wherein the tissue sample is obtained from a tissue selected from the group consisting of brain, lung, liver, kidney, pancreas, and heart.
[0301] (A11) For the method denoted as any one of (A1) through (A10), wherein the sample is obtained from a mammal, optionally a human.
[0302] (A12) For the method denoted as any one of (A1) through (A11), wherein the sample is fixed, optionally wherein the tissue sample is fixed with paraffin, optionally wherein the tissue sample is fixed using formalin-fixation and paraffin embedding (FFPE).
[0303] (A13) For the method denoted as any one of (A1) through (A12), wherein the solid support is a slide, optionally wherein the solid support is a glass slide.
[0304] (A14) For the method denoted as any one of (A1) through (A13), wherein the first oligonucleotides are bound to the solid support using a capture material, optionally wherein the capture material is applied as a liquid, optionally wherein the capture material is applied using a brush or aerosol spray, optionally wherein the capture material is a liquid electrical tape, optionally wherein the capture material dries to form a vinyl polymer, optionally wherein the vinyl polymer is polyvinyl hexane.
[0305] (A15) For the method denoted as any one of (A1) through (A14), wherein the obtaining sequence information of step (ii) comprises a next-generation sequencing approach, optionally wherein the next-generation sequencing approach is selected from the group consisting of solid-phase, reversible dye-terminator sequencing; massively parallel signature sequencing; pyrosequencing; sequencing-by-ligation; ion semiconductor sequencing; Nanopore sequencing; and DNA nanoball sequencing, optionally wherein the next-generation sequencing approach comprises solid-phase, reversible dye-terminator sequencing.
[0306] (A16) For the method denoted as any one of (A1) through (A15), wherein the generating of the computational reconstruction of the spatial locations of the population of macromolecules of step (iii) comprises performing a dimensionality reduction analysis.
[0307] (A17) For the method denoted as any one of (A1) through (A16), wherein the macromolecules comprise a population of second oligonucleotides capable of binding to the first oligonucleotides, optionally wherein the second oligonucleotides are attached to a bead.
[0308] (A18) For the method denoted as any one of (A1) through (A17), wherein the generating of the computational reconstruction of the spatial locations of the population of macromolecules of step (iii) comprises performing Uniform Manifold Approximation and Projection (UMAP) reduction, t-distributed stochastic neighbor embedding (t-SNE) reduction, and/or multidimensional scaling (MDS) reduction, optionally wherein the generating of the computational reconstruction of the spatial locations of the population of macromolecules of step (iii) comprises performing Uniform Manifold Approximation and Projection (UMAP) reduction.
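As an illustration of the multidimensional scaling (MDS) option named in (A18), the following Python sketch shows how relative positions might be recovered from diffusion signal alone. It is a minimal, idealized example and not the disclosed implementation: it assumes a noise-free Gaussian diffusion kernel with a known length scale `sigma`, and the names `classical_mds`, `pairwise_sq_dist`, `true_xy`, and `signal` are hypothetical. The simulated positions are used only to generate the signal and to check the result.

```python
import numpy as np

def classical_mds(sq_dist, n_components=2):
    """Embed points from a matrix of squared dissimilarities (classical MDS)."""
    n = sq_dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ sq_dist @ centering   # double-centred Gram matrix
    eigvals, eigvecs = np.linalg.eigh(gram)         # eigenvalues in ascending order
    top = np.argsort(eigvals)[::-1][:n_components]  # indices of the largest eigenvalues
    return eigvecs[:, top] * np.sqrt(np.clip(eigvals[top], 0.0, None))

def pairwise_sq_dist(points):
    """Squared Euclidean distance between every pair of rows."""
    diff = points[:, None, :] - points[None, :, :]
    return (diff ** 2).sum(axis=-1)

rng = np.random.default_rng(0)
true_xy = rng.uniform(0.0, 10.0, size=(40, 2))  # hypothetical element positions
sigma = 3.0                                     # assumed diffusion length scale

# Idealized, noise-free diffusion signal shared between elements i and j.
signal = np.exp(-pairwise_sq_dist(true_xy) / (2.0 * sigma ** 2))

# Invert the assumed Gaussian kernel to obtain squared dissimilarities.
sq_dissimilarity = -2.0 * sigma ** 2 * np.log(signal)

# The embedding matches the true layout up to rotation, reflection, and translation.
embedding = classical_mds(sq_dissimilarity)
```

Because an embedding of this kind is determined only up to a rigid transformation, the check below compares pairwise distances rather than raw coordinates. Real sequencing counts are sparse and noisy, which is one reason a neighbor-graph method such as UMAP may be preferred in practice.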
[0309] (A19) For the method denoted as any one of (A1) through (A18), wherein the first oligonucleotides bound to the solid support and present in the positional array have a resolution of 100 micrometers or less between individual elements of the positional array.
[0310] (B1) A method for generating a spatial representation of mRNA abundance from a sample, the method comprising: (i) contacting first oligonucleotides bound to a solid support and present in a positional array having a resolution of 100 micrometers or less between individual elements of the positional array with a sample, wherein the first oligonucleotides comprise: an array location identifier sequence that is common to all first oligonucleotides in a given element in the positional array; and a poly-dT tail of sufficient length to allow for capture of poly-A-tailed mRNAs via hybridization, under conditions suitable for oligonucleotide-mRNA hybridization; (ii) obtaining sequence information for a population of poly-A-tailed mRNAs bound to the first oligonucleotides and the array location identifier sequence of the respective first oligonucleotides bound to each of the poly-A-tailed mRNAs for which sequence information is obtained; and (iii) generating a computational reconstruction of the spatial locations of the population of poly-A-tailed RNAs bound to the first oligonucleotides from inputs minimally comprising the obtained sequence information for the array location identifier sequences; and molecular diffusion patterns of the poly-A-tailed RNAs of the population of poly-A-tailed RNAs for which sequence information is obtained, relative to the first oligonucleotides present in the positional array, thereby generating the spatial representation of mRNA abundance from the sample.
[0311] (B2) For the method denoted as (B1), wherein the generating of the computational reconstruction of the spatial locations of the population of poly-A-tailed RNAs of step (iii) comprises performing a dimensionality reduction analysis.
[0312] (B3) For the method denoted as (B1) or (B2), wherein the poly-A-tailed RNAs comprise a population of second oligonucleotides capable of binding to the first oligonucleotides, optionally wherein the second oligonucleotides are attached to a bead.
[0313] (B4) For the method denoted as any one of (B1) through (B3), further comprising performing reverse transcription upon hybridized poly-A-tailed mRNAs immediately after hybridizing said poly-A-tailed mRNAs to the solid support-bound oligonucleotides, optionally performing reverse transcription before a digestion step is performed.
[0314] (B5) For the method denoted as any one of (B1) through (B4), wherein the conditions suitable for oligonucleotide-mRNA hybridization comprise incubation in 6X SSC buffer, optionally wherein the 6X SSC buffer is supplemented with detergent.
[0315] (C1) A method for generating a spatial representation of macromolecule abundance from a sample, the method comprising: (i) generating a well array having a plurality of wells, wherein each well of the well array can hold exactly one bead; (ii) depositing beads comprising macromolecule capture oligonucleotides into the wells of the well array, optionally depositing by evaporation in a centrifuge; (iii) brushing the well array to remove all of the beads not present in the wells; (iv) depositing the sample onto the well array and centrifuging, thereby forcing the biological sample into the wells of the well array; (v) adding a digestion buffer, thereby lysing the sample and causing the macromolecules of the sample to transfer onto the beads in the wells; (vi) obtaining sequence information for a population of macromolecules bound to the macromolecule capture oligonucleotides of the beads and an associated capture oligonucleotide bead identification sequence for each macromolecule of the population of macromolecules for which sequence information is obtained; and (vii) generating a computational reconstruction of the spatial locations of the population of macromolecules bound to the macromolecule capture oligonucleotides from inputs minimally comprising the obtained sequence information for the bead identification sequences and molecular diffusion patterns of the macromolecules of the population of macromolecules for which sequence information is obtained, relative to the macromolecule capture oligonucleotides present in the well array, thereby generating the spatial representation of macromolecule abundance from the sample.
[0316] (C2) For the method denoted as (C1), wherein the generating of the computational reconstruction of the spatial locations of the population of macromolecules of step (vii) comprises performing a dimensionality reduction analysis.
[0317] (C3) For the method denoted as (C1) or (C2), wherein the macromolecules comprise a population of second oligonucleotides capable of binding to the macromolecule capture oligonucleotides, optionally wherein the second oligonucleotides are attached to a bead.
[0318] (C4) For the method denoted as any one of (C1) through (C3), further comprising performing reverse transcription upon the sample in the wells of the well array, optionally further comprising separating oligonucleotides from beads by sonication or by photocleavage.
[0319] (D1) A method for generating a spatial representation of macromolecule abundance from a sample, the method comprising: (i) adhering clusters of oligonucleotides in an array to a solid support; (ii) contacting the array with a tissue sample; (iii) obtaining sequence information for a population of macromolecules bound to the oligonucleotide clusters and a respective associated oligonucleotide cluster identification sequence for each macromolecule of the population of macromolecules for which sequence information is obtained; and (iv) generating a computational reconstruction of the spatial locations of the population of macromolecules bound to the clusters of oligonucleotides from inputs minimally comprising: the obtained oligonucleotide cluster identification sequences and molecular diffusion patterns of the macromolecules of the population of macromolecules for which sequence information is obtained, relative to the clusters of oligonucleotides present in the array.
[0320] (D2) For the method denoted as (D1), wherein the generating of the computational reconstruction of the spatial locations of the population of macromolecules of step (iv) comprises performing a dimensionality reduction analysis.
[0321] (D3) For the method denoted as (D1) or (D2), wherein the macromolecules comprise a population of second oligonucleotides capable of binding to the clusters of oligonucleotides, optionally wherein the second oligonucleotides are attached to a bead.
[0322] (D4) For the method denoted as any one of (D1) through (D3), wherein the array comprises barcoded clusters of oligonucleotides on the solid support.
[0323] (D5) For the method denoted as any one of (D1) through (D4), wherein the obtaining step (iii) comprises performance of long-read sequencing.
[0324] (E1) A method for generating a spatial representation of macromolecule abundance from a tissue sample of a subject, the method comprising: (i) obtaining the tissue sample from the subject; (ii) preparing a cryosection of the tissue sample and adhering said cryosection to a solid support; (iii) forming an array of barcoded oligonucleotide clusters and/or an array of beads attached to barcoded oligonucleotides and contacting the cryosection adhered to the solid support with the array; (iv) obtaining sequence information for a population of macromolecules bound to the array(s), wherein the sequence information comprises macromolecule identification information and associated positional identification information of the barcoded oligonucleotides; and (v) generating a computational reconstruction of the spatial locations of the population of macromolecules bound to the bead oligonucleotides from inputs minimally comprising: the obtained sequence information and molecular diffusion patterns of the macromolecules of the population of macromolecules for which sequence information is obtained, relative to the bead oligonucleotides present in the array.
[0325] (E2) For the method denoted as (E1), wherein the generating of the computational reconstruction of the spatial locations of the population of macromolecules of step (v) comprises performing a dimensionality reduction analysis.
[0326] (E3) For the method denoted as (E1) or (E2), wherein the macromolecules comprise a population of second oligonucleotides capable of binding to the barcoded oligonucleotides, optionally wherein the second oligonucleotides are attached to a bead.
[0327] (E4) For the method denoted as any one of (A1) through (A19), (B1) through (B5), (C1) through (C4), (D1) through (D5), or (E1) through (E3), wherein the array is physically transferred from one surface to another, optionally wherein a gel encasement is formed on top of the array, thereby allowing beads to be picked up off the surface of the array without altering bead positions relative to each other.
[0328] (E5) For the method denoted as any one of (A1) through (A19), (B1) through (B5), (C1) through (C4), (D1) through (D5), or (E1) through (E4), wherein the beads or array are used for capture of oligonucleotides.
[0329] (E6) For the method denoted as any one of (A1) through (A19), (B1) through (B5), (C1) through (C4), (D1) through (D5), or (E1) through (E5), wherein the beads or array comprise or bind oligonucleotide-conjugated antibodies.
[0330] (E7) For the method denoted as any one of (A1) through (A19), (B1) through (B5), (C1) through (C4), (D1) through (D5), or (E1) through (E6), wherein the beads or array comprise or bind nucleic acid hybridization probes.
[0331] (E8) For the method denoted as (E7), wherein the nucleic acid hybridization probes are RNA hybridization probes.
[0332] (E9) For the method denoted as (E7), wherein the nucleic acid hybridization probes are DNA hybridization probes.
[0333] (E10) For the method denoted as (E7), wherein the nucleic acid hybridization probes are capable of specific hybridization to transcriptome or genome sequence(s) of the tissue sample.
[0334] (E11) For the method denoted as (E7), wherein the nucleic acid hybridization probes comprise unique molecular identifiers (UMIs), optionally wherein the UMIs of the hybridization probes are counted via sequencing to assess the levels of hybridization probe-bound macromolecules, optionally wherein the hybridization probe-bound macromolecules are selected from the group consisting of proteins, exons, transcripts, nucleic acid sequences comprising single nucleotide polymorphisms (SNPs) and/or genomic regions.
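The UMI counting described above can be sketched in Python as follows. This is an illustrative example only: the record layout of (location barcode, target, UMI) tuples, the function name `count_umis`, and the toy sequences are assumptions rather than anything specified in the disclosure, and the sketch collapses only exact UMI matches without any sequencing-error correction.

```python
from collections import defaultdict

def count_umis(reads):
    """Count distinct UMIs per (location barcode, target) pair.

    `reads` is an iterable of (location_barcode, target, umi) tuples;
    this record layout is an assumption made for illustration.
    """
    seen = defaultdict(set)
    for location_barcode, target, umi in reads:
        seen[(location_barcode, target)].add(umi)  # duplicate UMIs collapse in the set
    return {key: len(umis) for key, umis in seen.items()}

# Toy reads with made-up barcodes, targets, and UMIs.
reads = [
    ("AACG", "GeneA", "TTAGC"),
    ("AACG", "GeneA", "TTAGC"),  # amplification duplicate: same UMI, counted once
    ("AACG", "GeneA", "CGATA"),
    ("AACG", "GeneB", "GGCTA"),
    ("TTGC", "GeneA", "ACGTT"),
]
counts = count_umis(reads)
```

Counting distinct UMIs rather than raw reads is intended to make the abundance estimate less sensitive to amplification bias, since reads that share a location barcode, target, and UMI are treated as copies of one original molecule.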
[0335] (E12) For the method denoted as (E7), wherein the nucleic acid hybridization probes are released from the array or tissue, optionally wherein the nucleic acid hybridization probes are released from the array or tissue by a method selected from the group consisting of: (a) cleavage and/or degradation of a photolabile and/or photocleavable group; (b) T7 RNA polymerase transcription; (c) enzymatic cleavage, optionally RNAseH cleavage of bound RNA or RNAse cleavage of an RNA base in the hybridization probes; and/or (d) chemical cleavage, optionally disulfide cleavage.
[0336] (E13) For the method denoted as any one of (A1) through (A19), (B1) through (B5), (C1) through (C4), (D1) through (D5), or (E1) through (E12), wherein the beads or array possess primers capable of specific binding to a selection of one or more target transcripts, optionally wherein the one or more target transcripts are selected from the group consisting of T Cell receptor transcript sequences; transcripts of low-expressing proteins, optionally wherein the low-expressing proteins are transcription factors; and synthetic transcripts, optionally wherein the synthetic transcripts are guide-RNAs.
[0337] (E14) For the method denoted as any one of (A1) through (A19), (B1) through (B5), (C1) through (C4), (D1) through (D5), or (E1) through (E13), wherein the generating a computational reconstruction of the spatial locations comprises a Uniform Manifold Approximation and Projection (UMAP) reduction, a t-distributed stochastic neighbor embedding (t-SNE) reduction, or a multidimensional scaling (MDS) reduction.
[0338] (E15) For the method denoted as any one of (A1) through (A19), (B1) through (B5), (C1) through (C4), (D1) through (D5), or (E1) through (E14), wherein a tissue sample of 5 cm or more in diameter is imaged, 7 cm or more in diameter is imaged, 8 cm or more in diameter is imaged, 9 cm or more in diameter is imaged, 10 cm or more in diameter is imaged, 11 cm or more in diameter is imaged, 12 cm or more in diameter is imaged, 13 cm or more in diameter is imaged, 14 cm or more in diameter is imaged, 15 cm or more in diameter is imaged, 16 cm or more in diameter is imaged, or larger tissue sample is imaged.
[0339] (F1) A method for generating a spatial representation of macromolecule abundance from a tissue sample comprising: (i) contacting the tissue sample with a first monomer or linear polymer; a cross-linking agent comprising a second monomer or polymer, wherein the cross-linking agent is capable of crosslinking with the first monomer or linear polymer when combined; and a nucleic acid primer or probe comprising a modification capable of binding the primer or probe to the first monomer or linear polymer, the cross-linking agent, or both, wherein the primer or probe comprises: a matrix location identifier sequence that is common to all primers or probes in a given element in a matrix and a target nucleic acid molecule-specific capture sequence; (ii) crosslinking the cross-linking agent with the first monomer or linear polymer, thereby forming the matrix; (iii) binding the nucleic acid primer or probe to the first monomer or linear polymer, the cross-linking agent, or both; (iv) incubating the matrix and nucleic acid primer or probe with the tissue under conditions suitable for annealing of the nucleic acid primer or probe to a target nucleic acid molecule of or associated with the tissue sample, thereby forming a primer-bound or probe-bound target nucleic acid molecule, thereby binding a target nucleic acid molecule of or associated with the tissue; (v) obtaining sequence information for a population of target nucleic acid molecules bound to the primers or probes and the matrix location identifier sequence of the primers or probes bound to each of the target nucleic acid molecules sequenced; and (vi) generating a computational reconstruction of the spatial locations of the population of target nucleic acid molecules bound to the primers or probes from inputs minimally comprising the obtained sequence information for the matrix location identifier sequences, and molecular diffusion patterns of the target nucleic acid molecules of the population of target nucleic acid molecules for which sequence information is obtained, relative to the primers or probes present in the matrix.
[0340] (F2) For the method denoted as (F1), wherein the generating of the computational reconstruction of the spatial locations of the population of target nucleic acid molecules of step (vi) comprises performing a dimensionality reduction analysis.
[0341] (F3) For the method denoted as (F1) or (F2), wherein the target nucleic acid molecules comprise a population of second oligonucleotides capable of binding to the nucleic acid primers or probes, optionally wherein the second oligonucleotides are attached to a bead.
[0342] (G1) A computer-implemented method for reconstructing spatial locations of macromolecules distributed in an array, comprising: contacting one or more first oligonucleotides bound to a solid support and present in the array with a sample under conditions suitable for first oligonucleotide-macromolecule binding, wherein the one or more first oligonucleotides comprise an array location identifier sequence that is common to all of the one or more first oligonucleotides in a given element in the array; and a macromolecule-specific capture sequence; obtaining sequence information for a population of macromolecules bound to the one or more first oligonucleotides and the array location identifier sequence of the respective one or more first oligonucleotides bound to each of the macromolecules of the population of macromolecules for which sequence information is obtained; and reconstructing spatial locations of the population of macromolecules bound to the one or more first oligonucleotides from inputs minimally comprising the obtained sequence information for the array location identifier sequences and molecular diffusion patterns of the macromolecules of the population of macromolecules for which sequence information is obtained, relative to the first oligonucleotides present in the positional array, thereby generating a spatial representation of macromolecule abundance from the sample.
[0343] (G2) For the computer-implemented method denoted as (G1), wherein reconstructing spatial locations of the population of macromolecules further comprises applying a linear or nonlinear dimensionality reduction method to reduce the high-dimensional sequence information from the population of macromolecules into a two-dimensional (2D) embedding space.
[0344] (G3) For the computer-implemented method denoted as (G1) or (G2), wherein the nonlinear dimensionality reduction method is Uniform Manifold Approximation and Projection (UMAP).
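To make the linear option in (G2) concrete, the following Python sketch projects a high-dimensional count matrix into a two-dimensional embedding using principal component analysis computed by singular value decomposition. It is a sketch under stated assumptions, not the disclosed method: the matrix layout (one row per array location identifier, one column per co-detected barcode or feature), the log transform, the function name `pca_embed`, and the random Poisson placeholder data are all illustrative choices. A nonlinear method such as UMAP would take the same matrix as input.

```python
import numpy as np

def pca_embed(counts, n_components=2):
    """Project rows of a count matrix onto their top principal components."""
    x = np.log1p(counts.astype(float))     # damp the influence of very high counts
    x = x - x.mean(axis=0, keepdims=True)  # centre each column
    u, s, _ = np.linalg.svd(x, full_matrices=False)
    return u[:, :n_components] * s[:n_components]  # principal component scores

rng = np.random.default_rng(1)
# Placeholder data: 30 array elements by 200 features, with no spatial structure.
counts = rng.poisson(lam=5.0, size=(30, 200))
embedding_2d = pca_embed(counts)           # one 2D coordinate per array element
```

The placeholder counts carry no spatial signal, so the resulting coordinates demonstrate only the mechanics of the reduction, not a meaningful reconstruction.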
[0345] (H1) A computer program product comprising: a computer readable storage medium readable by at least one processor and storing instructions for execution by the at least one processor for performing a method comprising: contacting one or more first oligonucleotides bound to a solid support and present in the array with a sample under conditions suitable for first oligonucleotide-macromolecule binding, wherein the one or more first oligonucleotides comprise an array location identifier sequence that is common to all of the one or more first oligonucleotides in a given element in the array; and a macromolecule-specific capture sequence; obtaining sequence information for a population of macromolecules bound to the one or more first oligonucleotides and the array location identifier sequence of the respective one or more first oligonucleotides bound to each of the macromolecules of the population of macromolecules for which sequence information is obtained; and reconstructing spatial locations of the population of macromolecules bound to the one or more first oligonucleotides from inputs minimally comprising the obtained sequence information for the array location identifier sequences and molecular diffusion patterns of the macromolecules of the population of macromolecules for which sequence information is obtained, relative to the first oligonucleotides present in the positional array, thereby generating a spatial representation of macromolecule abundance from the sample.

Claims

We Claim:
1. A method for generating a spatial representation of macromolecule abundance from a sample, the method comprising:
(i) contacting first oligonucleotides bound to a solid support and present in a positional array with a sample under conditions suitable for first oligonucleotide-macromolecule binding, wherein the first oligonucleotides comprise: an array location identifier sequence that is common to all first oligonucleotides in a given element in the positional array; and a macromolecule-specific capture sequence;
(ii) obtaining sequence information for a population of macromolecules bound to the first oligonucleotides and the array location identifier sequence of the respective first oligonucleotides bound to each of the macromolecules of the population of macromolecules for which sequence information is obtained; and
(iii) generating a computational reconstruction of the spatial locations of the population of macromolecules bound to the first oligonucleotides from inputs minimally comprising the obtained sequence information for the array location identifier sequences and molecular diffusion patterns of the macromolecules of the population of macromolecules for which sequence information is obtained, relative to the first oligonucleotides present in the positional array, thereby generating a spatial representation of macromolecule abundance from the sample.
2. The method of claim 1, wherein the first oligonucleotides bound to the solid support and present in the positional array have a resolution of 100 micrometers or less between individual elements of the positional array.
3. The method of claim 1, wherein the macromolecules are selected from the group consisting of RNA, DNA, protein, and combinations thereof.
4. The method of claim 3, wherein the RNA is a poly-A-tailed RNA.
5. The method of claim 1, wherein the macromolecule-specific capture sequence comprises a poly-dT tail of sufficient length to allow for capture of poly-A-tailed RNAs via hybridization.
6. The method of claim 1, wherein the macromolecule-specific capture sequence comprises a gene-specific sequence or a transcript-specific sequence.
7. The method of claim 1, wherein the sample is a tissue sample.
8. The method of claim 1, wherein the sample is fixed.
9. The method of claim 1, wherein the solid support is a slide.
10. The method of claim 1, wherein the generating of the computational reconstruction of the spatial locations of the population of macromolecules of step (iii) comprises performing a dimensionality reduction analysis.
11. The method of claim 1, wherein the macromolecules further comprise a population of second oligonucleotides capable of binding to the first oligonucleotides.
12. The method of claim 1, wherein the generating of the computational reconstruction of the spatial locations of the population of macromolecules of step (iii) comprises performing Uniform Manifold Approximation and Projection (UMAP) reduction, t-distributed stochastic neighbor embedding (t-SNE) reduction, and/or multidimensional scaling (MDS) reduction.
13. A method for generating a spatial representation of mRNA abundance from a sample, the method comprising:
(i) contacting first oligonucleotides bound to a solid support and present in a positional array having a resolution of 100 micrometers or less between individual elements of the positional array with a sample, wherein the first oligonucleotides comprise: an array location identifier sequence that is common to all the first oligonucleotides in a given element in the positional array; and a poly-dT tail of sufficient length to allow for capture of poly-A-tailed mRNAs via hybridization, under conditions suitable for oligonucleotide-mRNA hybridization;
(ii) obtaining sequence information for a population of poly-A-tailed mRNAs bound to the first oligonucleotides and the array location identifier sequence of the respective first oligonucleotides bound to each of the poly-A-tailed mRNAs for which sequence information is obtained; and
(iii) generating a computational reconstruction of the spatial locations of the population of poly-A-tailed RNAs bound to the first oligonucleotides from inputs minimally comprising the obtained sequence information for the array location identifier sequences; and molecular diffusion patterns of the poly-A-tailed RNAs of the population of poly-A-tailed RNAs for which sequence information is obtained, relative to the first oligonucleotides present in the positional array, thereby generating the spatial representation of mRNA abundance from the sample.
14. The method of claim 13, further comprising performing reverse transcription upon hybridized poly-A-tailed mRNAs immediately after hybridizing said poly-A-tailed mRNAs to the solid support-bound oligonucleotides.
15. A method for generating a spatial representation of macromolecule abundance from a tissue sample, the method comprising:
(i) adhering clusters of oligonucleotides in an array to a solid support;
(ii) contacting the array with the tissue sample;
(iii) obtaining sequence information for a population of macromolecules bound to the oligonucleotide clusters and a respective associated oligonucleotide cluster identification sequence for each macromolecule of the population of macromolecules for which sequence information is obtained; and
(iv) generating a computational reconstruction of the spatial locations of the population of macromolecules bound to the clusters of oligonucleotides from inputs minimally comprising: the obtained oligonucleotide cluster identification sequences and molecular diffusion patterns of the macromolecules of the population of macromolecules for which sequence information is obtained, relative to the clusters of oligonucleotides present in the array.
16. The method of claim 15, wherein the array comprises barcoded clusters of oligonucleotides on the solid support.
17. The method of claim 15, wherein the obtaining step (iii) comprises performance of long-read sequencing.
18. The method of claim 1, wherein the array is physically transferred from one surface to another.
19. A method for generating a spatial representation of macromolecule abundance from a tissue sample comprising:
(i) contacting the tissue sample with a first monomer or linear polymer; a cross-linking agent comprising a second monomer or polymer, wherein the cross-linking agent is capable of crosslinking with the first monomer or linear polymer when combined; and a nucleic acid primer or probe comprising a modification capable of binding the nucleic acid primer or probe to the first monomer or linear polymer, the cross-linking agent, or both, wherein the nucleic acid primer or probe comprises: a matrix location identifier sequence that is common to all nucleic acid primers or probes in a given element in a matrix and a target nucleic acid molecule-specific capture sequence;
(ii) crosslinking the cross-linking agent with the first monomer or linear polymer, thereby forming the matrix;
(iii) binding the nucleic acid primer or probe to the first monomer or linear polymer, the cross-linking agent, or both;
(iv) incubating the matrix and nucleic acid primer or probe with the tissue sample under conditions suitable for annealing of the nucleic acid primer or probe to a target nucleic acid molecule of or associated with the tissue sample, thereby forming a primer-bound or probe-bound target nucleic acid molecule, thereby binding a target nucleic acid molecule of or associated with the tissue sample;
(v) obtaining sequence information for a population of target nucleic acid molecules bound to the nucleic acid primers or probes and the matrix location identifier sequence of the nucleic acid primers or probes bound to each of the target nucleic acid molecules sequenced; and
(vi) generating a computational reconstruction of the spatial locations of the population of target nucleic acid molecules bound to the nucleic acid primers or probes from inputs minimally comprising the obtained sequence information for the matrix location identifier sequences, and molecular diffusion patterns of the target nucleic acid molecules of the population of target nucleic acid molecules for which sequence information is obtained, relative to the nucleic acid primers or probes present in the matrix.
20. The method of claim 19, wherein the generating of the computational reconstruction of the spatial locations of the population of nucleic acid molecules of step (vi) comprises performing a dimensionality reduction analysis.
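The dimensionality reduction analysis recited in claims 10, 12, and 20 can be illustrated with a small simulation. The sketch below uses classical multidimensional scaling (MDS), one of the reductions named in claim 12, implemented with NumPy only; UMAP and t-SNE would require third-party packages and are not shown. Everything in the simulation is an assumption made for illustration: the Gaussian diffusion kernel, the values of `sigma` and `depth`, the dense-count regime in which every pair of elements shares counts, and the helper names `counts_to_sq_dist`, `classical_mds`, and `align`. It is not the applicants' implementation.

```python
import numpy as np


def counts_to_sq_dist(counts):
    """Turn symmetric co-occurrence counts into squared pseudo-distances.

    Assumes counts fall off as exp(-d^2 / (2 sigma^2)), so that
    -log(count / max_count) is roughly proportional to d^2.
    """
    ratio = np.clip(counts / counts.max(), 1e-12, 1.0)  # guard against log(0)
    d2 = -np.log(ratio)
    np.fill_diagonal(d2, 0.0)
    return d2


def classical_mds(d2, n_components=2):
    """Classical MDS: double-center squared distances, keep top eigenvectors."""
    n = d2.shape[0]
    j = np.eye(n) - np.ones((n, n)) / n
    b = -0.5 * j @ d2 @ j
    vals, vecs = np.linalg.eigh(b)
    idx = np.argsort(vals)[::-1][:n_components]
    return vecs[:, idx] * np.sqrt(np.maximum(vals[idx], 0.0))


def align(embedding, reference):
    """Best rotation/reflection and scale mapping embedding onto reference."""
    x = embedding - embedding.mean(axis=0)
    y = reference - reference.mean(axis=0)
    u, s, vt = np.linalg.svd(x.T @ y)
    scale = s.sum() / (x ** 2).sum()
    return scale * x @ (u @ vt) + reference.mean(axis=0)


# Simulated ground truth: 60 array elements at hidden 2D positions.
rng = np.random.default_rng(0)
true_xy = rng.uniform(0.0, 1.0, size=(60, 2))
sigma, depth = 0.5, 10000.0  # assumed diffusion width and sequencing depth

# Expected shared counts decay with squared distance (assumed Gaussian kernel).
sq = ((true_xy[:, None, :] - true_xy[None, :, :]) ** 2).sum(axis=-1)
counts = rng.poisson(depth * np.exp(-sq / (2.0 * sigma ** 2))).astype(float)
counts = (counts + counts.T) / 2.0  # symmetrize the noisy counts

# Reconstruct positions from counts alone, then align only for evaluation.
recon = align(classical_mds(counts_to_sq_dist(counts)), true_xy)
rmse = float(np.sqrt(((recon - true_xy) ** 2).sum(axis=1).mean()))
```

The embedding is determined only up to rotation, reflection, and scale, which is why `align` is applied before comparing against the simulated positions. In sparser data, where distant elements share no counts, the log transform above would not be usable and a neighbor-graph method such as UMAP would be a more natural choice.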

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US202463649062P 2024-05-17 2024-05-17
US63/649,062 2024-05-17

Publications (1)

Publication Number Publication Date
WO2025240905A1 (en) 2025-11-20

Family

ID=95981670

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2025/029834 Pending WO2025240905A1 (en) 2024-05-17 2025-05-16 Imaging-free high-resolution spatial macromolecule abundance reconstruction

Country Status (1)

Country Link
WO (1) WO2025240905A1 (en)

Patent Citations (64)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5130238A (en) 1988-06-24 1992-07-14 Cangene Corporation Enhanced nucleic acid amplification process
US5455166A (en) 1991-01-31 1995-10-03 Becton, Dickinson And Company Strand displacement amplification
US6214587B1 (en) 1994-03-16 2001-04-10 Gen-Probe Incorporated Isothermal strand displacement nucleic acid amplification
US5714330A (en) 1994-04-04 1998-02-03 Lynx Therapeutics, Inc. DNA sequencing by stepwise ligation and cleavage
US5641658A (en) 1994-08-03 1997-06-24 Mosaic Technologies, Inc. Method for performing amplification of nucleic acid with two primers bound to a single solid support
US6130073A (en) 1994-08-19 2000-10-10 Perkin-Elmer Corp., Applied Biosystems Division Coupled amplification and ligation method
US5912148A (en) 1994-08-19 1999-06-15 Perkin-Elmer Corporation Applied Biosystems Coupled amplification and ligation method
US5695934A (en) 1994-10-13 1997-12-09 Lynx Therapeutics, Inc. Massively parallel sequencing of sorted polynucleotides
US6306597B1 (en) 1995-04-17 2001-10-23 Lynx Therapeutics, Inc. DNA sequencing by parallel oligonucleotide extensions
US5750341A (en) 1995-04-17 1998-05-12 Lynx Therapeutics, Inc. DNA sequencing by parallel oligonucleotide extensions
US6210891B1 (en) 1996-09-27 2001-04-03 Pyrosequencing Ab Method of sequencing DNA
US6258568B1 (en) 1996-12-23 2001-07-10 Pyrosequencing Ab Method of sequencing DNA based on the detection of the release of pyrophosphate and enzymatic nucleotide degradation
US6266459B1 (en) 1997-03-14 2001-07-24 Trustees Of Tufts College Fiber optic sensor with encoded microspheres
US6859570B2 (en) 1997-03-14 2005-02-22 Trustees Of Tufts College, Tufts University Target analyte sensors utilizing microspheres
US20020055100A1 (en) 1997-04-01 2002-05-09 Kawashima Eric H. Method of nucleic acid sequencing
US6432360B1 (en) 1997-10-10 2002-08-13 President And Fellows Of Harvard College Replica amplification of nucleic acid arrays
US6485944B1 (en) 1997-10-10 2002-11-26 President And Fellows Of Harvard College Replica amplification of nucleic acid arrays
US6511803B1 (en) 1997-10-10 2003-01-28 President And Fellows Of Harvard College Replica amplification of nucleic acid arrays
US6969488B2 (en) 1998-05-22 2005-11-29 Solexa, Inc. System and apparatus for sequential processing of analytes
US8460865B2 (en) 1998-06-24 2013-06-11 Illumina, Inc. Multiplex decoding of array sensors with microspheres
US6787308B2 (en) 1998-07-30 2004-09-07 Solexa Ltd. Arrayed biomolecules and their use in sequencing
US7115400B1 (en) 1998-09-30 2006-10-03 Solexa Ltd. Methods of nucleic acid amplification and sequencing
WO2000018957A1 (en) 1998-09-30 2000-04-06 Applied Research Systems Ars Holding N.V. Methods of nucleic acid amplification and sequencing
WO2000063437A2 (en) 1999-04-20 2000-10-26 Illumina, Inc. Detection of nucleic acid reactions on bead arrays
US8486625B2 (en) 1999-04-20 2013-07-16 Illumina, Inc. Detection of nucleic acid reactions on bead arrays
US6355431B1 (en) 1999-04-20 2002-03-12 Illumina, Inc. Detection of nucleic acid amplification reactions using bead arrays
US6274320B1 (en) 1999-09-16 2001-08-14 Curagen Corporation Method of sequencing a nucleic acid
US6833246B2 (en) 1999-09-29 2004-12-21 Solexa, Ltd. Polynucleotide sequencing
US8288103B2 (en) 2000-02-07 2012-10-16 Illumina, Inc. Multiplex nucleic acid reactions
US6770441B2 (en) 2000-02-10 2004-08-03 Illumina, Inc. Array compositions and methods of making same
US20040096853A1 (en) 2000-12-08 2004-05-20 Pascal Mayer Isothermal amplification of nucleic acids on a solid support
US20050064460A1 (en) 2001-11-16 2005-03-24 Medical Research Council Emulsion compositions
US20040002090A1 (en) 2002-03-05 2004-01-01 Pascal Mayer Methods for detecting genome-wide sequence variations associated with a phenotype
US20050130173A1 (en) 2003-01-29 2005-06-16 Leamon John H. Methods of amplifying and sequencing nucleic acids
US20050037393A1 (en) 2003-06-20 2005-02-17 Illumina, Inc. Methods and compositions for whole genome amplification and genotyping
WO2005010145A2 (en) 2003-07-05 2005-02-03 The Johns Hopkins University Method and compositions for detection and enumeration of genetic variations
US20110059865A1 (en) 2004-01-07 2011-03-10 Mark Edward Brennan Smith Modified Molecular Arrays
US20070099208A1 (en) 2005-06-15 2007-05-03 Radoje Drmanac Single molecule arrays for genetic and chemical analysis
US20070128624A1 (en) 2005-11-01 2007-06-07 Gormley Niall A Method of preparing libraries of template polynucleotides
US20080009420A1 (en) 2006-03-17 2008-01-10 Schroth Gary P Isothermal methods for creating clonal single molecule arrays
US20100111768A1 (en) 2006-03-31 2010-05-06 Solexa, Inc. Systems and devices for sequence by synthesis analysis
US7960120B2 (en) 2006-10-06 2011-06-14 Illumina Cambridge Ltd. Method for pair-wise sequencing a plurality of double stranded target polynucleotides
US20100197507A1 (en) 2006-12-14 2010-08-05 Ion Torrent Systems Incorporated Methods and apparatus for measuring analytes using large scale fet arrays
US20100188073A1 (en) 2006-12-14 2010-07-29 Ion Torrent Systems Incorporated Methods and apparatus for measuring analytes using large scale fet arrays
US20090127589A1 (en) 2006-12-14 2009-05-21 Ion Torrent Systems Incorporated Methods and apparatus for measuring analytes using large scale FET arrays
US20090026082A1 (en) 2006-12-14 2009-01-29 Ion Torrent Systems Incorporated Methods and apparatus for measuring analytes using large scale FET arrays
US20100282617A1 (en) 2006-12-14 2010-11-11 Ion Torrent Systems Incorporated Methods and apparatus for detecting molecular interactions using fet arrays
WO2008093098A2 (en) 2007-02-02 2008-08-07 Illumina Cambridge Limited Methods for indexing samples and sequencing multiple nucleotide templates
US20100137143A1 (en) 2008-10-22 2010-06-03 Ion Torrent Systems Incorporated Methods and apparatus for measuring analytes
US20100301398A1 (en) 2009-05-29 2010-12-02 Ion Torrent Systems Incorporated Methods and apparatus for measuring analytes
US8951781B2 (en) 2011-01-10 2015-02-10 Illumina, Inc. Systems, methods, and apparatuses to image a sample for biological or chemical analysis
US20140066318A1 (en) 2011-04-13 2014-03-06 Spatial Transcriptomics Ab Method and product for localized or spatial detection of nucleic acid in a tissue sample
US8778849B2 (en) 2011-10-28 2014-07-15 Illumina, Inc. Microarray fabrication system and method
US20140079923A1 (en) 2012-06-08 2014-03-20 Wayne N. George Polymer coatings
US8895249B2 (en) 2012-06-15 2014-11-25 Illumina, Inc. Kinetic exclusion amplification of nucleic acid libraries
US20140243224A1 (en) 2013-02-26 2014-08-28 Illumina, Inc. Gel patterned surfaces
WO2014142841A1 (en) 2013-03-13 2014-09-18 Illumina, Inc. Multilayer fluidic devices and methods for their fabrication
US20150005447A1 (en) 2013-07-01 2015-01-01 Illumina, Inc. Catalyst-free surface functionalization and polymer grafting
WO2016040476A1 (en) 2014-09-09 2016-03-17 The Broad Institute, Inc. A droplet-based method and apparatus for composite single-cell nucleic acid analysis
US11339390B2 (en) 2015-09-11 2022-05-24 The Broad Institute, Inc. DNA microscopy methods
WO2019213254A1 (en) 2018-05-02 2019-11-07 The General Hospital Corporation High-resolution spatial macromolecule abundance assessment
WO2021096814A1 (en) 2019-11-11 2021-05-20 The Broad Institute, Inc. High-resolution spatial and quantitative dna assessment
US20220177963A1 (en) 2020-12-07 2022-06-09 The Broad Institute, Inc. Paired macromolecule abundance and t-cell receptor sequencing with high spatial resolution
WO2022174054A1 (en) 2021-02-13 2022-08-18 The General Hospital Corporation Methods and compositions for in situ macromolecule detection and uses thereof

Non-Patent Citations (32)

* Cited by examiner, † Cited by third party
Title
"Immunochemical Methods In Cell And Molecular Biology", 1987, COLD SPRING HARBOR LABORATORY
ADESSI ET AL., NUCLEIC ACID RES., vol. 28, 2000, pages E87
ASTIER ET AL., J. AM. CHEM. SOC., vol. 128, no. 5, 8 February 2006 (2006-02-08), pages 1705 - 10
AUSUBEL ET AL.: "Current Protocols in Molecular Biology", 1992, JOHN WILEY & SONS
B. PERBAL: "A Practical Guide To Molecular Cloning", 1984
BENNETT ET AL., PHARMACOGENOMICS, vol. 6, 2005, pages 373 - 382
BENTLEY ET AL., NATURE, vol. 456, 2008, pages 53 - 59
BRENNER ET AL., NAT. BIOTECHNOL., vol. 18, 2000, pages 630 - 634
DEAN ET AL., PROC NATL. ACAD. SCI. USA, vol. 99, 2002, pages 5261 - 66
DRESSMAN ET AL., PROC. NATL. ACAD. SCI. USA, vol. 100, 2003, pages 8817 - 8822
GLOVER: "DNA Cloning", 1985, IRL PRESS
HARLOWLANE: "Essential Immunology", 1988, COLD SPRING HARBOR LABORATORY PRESS
HOGAN ET AL.: "Handbook Of Experimental Immunology", vol. I- IV, 1986, COLD SPRING HARBOR LABORATORY PRESS
JAKOBYPASTAN: "Nucleic Acid Hybridization", 1979
LAGE ET AL., GENOME RESEARCH, vol. 13, 2003, pages 294 - 307
LIZARDI ET AL., NAT. GENET, vol. 19, 1998, pages 225 - 232
MACLEAN ET AL., NATURE REV. MICROBIOL, vol. 7, pages 287 - 296
MARDIS: "The impact of next-generation sequencing technology on genetics", TRENDS IN GENETICS, vol. 24, no. 3, 2007, pages 133 - 141, XP022498431
MARGULIES ET AL., NATURE, vol. 437, 2005, pages 376 - 380
MITRA ET AL., ANALYTICAL BIOCHEMISTRY, vol. 320, 2003, pages 55 - 65
RUSSELL ANDREW J ET AL: "Slide-tags enables single-nucleus barcoding for multimodal spatial genomics", NATURE, vol. 625, no. 7993, 4 January 2024 (2024-01-04), London, pages 101 - 109, XP093173805, ISSN: 0028-0836, Retrieved from the Internet <URL:https://www.nature.com/articles/s41586-023-06837-4.pdf> DOI: 10.1038/s41586-023-06837-4 *
SAMBROOKRUSSELL: "Molecular Cloning", 2001, COLD SPRING HARBOR LABORATORY PRESS
SCIENCE, vol. 327, no. 5970, 2010, pages 1190
SHENDURE ET AL., SCIENCE, vol. 309, 2005, pages 1728 - 1732
SHENDURE ET AL.: "Next-generation DNA sequencing", NATURE, vol. 26, no. 10, 2008, pages 135 - 145, XP002572506, DOI: 10.1038/nbt1486
STICKELS ROBERT R ET AL: "Highly sensitive spatial transcriptomics at near-cellular resolution with Slide-seqV2", NATURE BIOTECHNOLOGY, vol. 39, no. 3, 31 March 2021 (2021-03-31), pages 313 - 319, XP037407913, ISSN: 1087-0156, DOI: 10.1038/S41587-020-0739-1 *
SU ET AL.: "Next-generation sequencing and its applications in molecular diagnostics", EXPERT REV MOL DIAGN, vol. 11, no. 3, 2011, pages 333 - 43, XP009505883, DOI: 10.1586/erm.11.3
VOELKERDING ET AL., CLINICAL CHEM., vol. 55, 2009, pages 641 - 658
WALKER ET AL., NUCL. ACIDS RES, vol. 20, 1992, pages 1691 - 96
WALKER ET AL.: "Molecular Methods for Virus Detection", 1995, ACADEMIC PRESS, INC.
WESTERFIELD, M.: "The zebrafish book: A guide for the laboratory use of zebrafish (Danio rerio)", 2000, UNIV. OF OREGON PRESS
ZHANG ET AL.: "The impact of next-generation sequencing on genomics", J GENET GENOMICS, vol. 38, no. 3, pages 95 - 109, XP028188028, DOI: 10.1016/j.jgg.2011.02.003
