WO2021067514A1 - Systems and methods for identifying morphological patterns in tissue samples - Google Patents
Systems and methods for identifying morphological patterns in tissue samples
- Publication number
- WO2021067514A1 (PCT/US2020/053655)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- probe
- discrete attribute
- cluster
- spatial
- probe spots
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/762—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using clustering, e.g. of similar faces in social networks
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01N—INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
- G01N1/00—Sampling; Preparing specimens for investigation
- G01N1/28—Preparing specimens for investigation including physical details of (bio-)chemical methods covered elsewhere, e.g. G01N33/50, C12Q
- G01N1/30—Staining; Impregnating ; Fixation; Dehydration; Multistep processes for preparing samples of tissue, cell or nucleic acid material and the like for analysis
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/213—Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
- G06F18/2137—Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods based on criteria of topology preservation, e.g. multidimensional scaling or self-organising maps
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
- G06T7/0012—Biomedical image inspection
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/60—Type of objects
- G06V20/69—Microscopic objects, e.g. biological cells or cellular parts
- G06V20/698—Matching; Classification
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B15/00—ICT specially adapted for analysing two-dimensional or three-dimensional molecular structures, e.g. structural or functional relations or structure alignment
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01N—INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
- G01N1/00—Sampling; Preparing specimens for investigation
- G01N1/28—Preparing specimens for investigation including physical details of (bio-)chemical methods covered elsewhere, e.g. G01N33/50, C12Q
- G01N1/30—Staining; Impregnating ; Fixation; Dehydration; Multistep processes for preparing samples of tissue, cell or nucleic acid material and the like for analysis
- G01N2001/302—Stain compositions
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30004—Biomedical image processing
- G06T2207/30024—Cell structures in vitro; Tissue sections in vitro
Definitions
- This specification describes technologies relating to visualizing patterns in large, complex datasets, such as spatially arranged next generation sequencing data, and using the data to visualize patterns.
- a tissue section (e.g., fresh-frozen tissue section) is imaged for histological purposes and placed on an array containing barcoded capture probes that bind to RNA. Tissue is fixed and permeabilized to release RNA to bind to adjacent capture probes, allowing for the capture of barcoded spatial gene expression information. Spatially barcoded cDNA is then synthesized from captured RNA and sequencing libraries prepared with the spatial barcodes intact. The libraries are then sequenced and data visualized to determine which genes are expressed, and where, as well as in what quantity.
- the present disclosure provides a number of tools for handling the vast amount of sequencing data such techniques produce, as well as tools for identifying morphological patterns in the underlying tissue sample that are associated with specific biological conditions.
- the discrete attribute value dataset comprises one or more spatial projections of a biological sample (e.g., tissue sample).
- the discrete attribute value dataset further comprises one or more two-dimensional images, for a first spatial projection in the one or more spatial projections.
- the method further comprises obtaining a corresponding cluster assignment in a plurality of clusters, of each respective probe spot in the plurality of probe spots of the discrete attribute value dataset.
- the corresponding cluster assignment is based, at least in part, on the corresponding plurality of discrete attribute values of the respective probe spot, or a corresponding plurality of dimension reduction components derived, at least in part, from the corresponding plurality of discrete attribute values of the respective probe spot.
- the method further comprises overlaying on the first two-dimensional image and co-aligned with the first two-dimensional image (i) first indicia for each probe spot in the plurality of probe spots that have been assigned to a first cluster in the plurality of clusters and (ii) second indicia for each probe spot in the plurality of probe spots that have been assigned to a second cluster in the plurality of clusters, thereby identifying the morphological pattern.
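The overlay step above can be illustrated with a minimal sketch. It assumes probe spot positions are already expressed in image pixel coordinates and uses a filled square as the indicia; the function name `overlay_cluster_indicia`, the `palette` mapping, and the `radius` and `alpha` parameters are hypothetical choices for illustration, not names taken from the disclosure.

```python
import numpy as np

def overlay_cluster_indicia(image, spot_xy, cluster_ids, palette, radius=2, alpha=0.6):
    """Blend one filled square marker per probe spot onto a copy of an RGB image.

    image       : (H, W, 3) uint8 array (the two-dimensional tissue image)
    spot_xy     : list of (x, y) pixel coordinates, assumed co-aligned with the image
    cluster_ids : cluster assignment for each spot, same order as spot_xy
    palette     : hypothetical mapping from cluster id to an RGB colour
    """
    out = image.astype(np.float64).copy()
    height, width = out.shape[:2]
    for (x, y), cluster in zip(spot_xy, cluster_ids):
        color = np.asarray(palette[cluster], dtype=np.float64)
        # Clip the marker to the image bounds.
        y0, y1 = max(0, y - radius), min(height, y + radius + 1)
        x0, x1 = max(0, x - radius), min(width, x + radius + 1)
        # Alpha-blend so the underlying tissue stays visible when alpha < 1.
        out[y0:y1, x0:x1] = (1 - alpha) * out[y0:y1, x0:x1] + alpha * color
    return out.round().astype(np.uint8)

# Toy example: a uniform grey "tissue image" and one spot per cluster.
image = np.full((20, 20, 3), 200, dtype=np.uint8)
spots = [(5, 5), (14, 14)]
clusters = [0, 1]
palette = {0: (255, 0, 0), 1: (0, 0, 255)}
blended = overlay_cluster_indicia(image, spots, clusters, palette, radius=1, alpha=1.0)
```

Lowering `alpha` corresponds to the opacity adjustment a viewer might expose so that tissue morphology remains visible beneath the cluster colours.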
- the one or more spatial projections is a plurality of spatial projections of the biological sample, the plurality of spatial projections comprises the first spatial projection for a first tissue section of the biological sample, and the plurality of spatial projections comprises a second spatial projection for a second tissue section of the biological sample.
- the one or more two-dimensional images for the first spatial projection comprises a first plurality of two-dimensional images
- the second spatial projection comprises a second plurality of two-dimensional images.
- each two-dimensional image in the first plurality of two-dimensional images is taken of the first tissue section of the biological sample
- each two-dimensional image in the second plurality of two-dimensional images is taken of a second tissue section of the biological sample.
- the one or more spatial projections is a single spatial projection
- the one or more two-dimensional images of the first spatial projection is a plurality of two-dimensional images
- a first two-dimensional image in the plurality of two-dimensional images is a bright-field image of the first tissue section
- a second two-dimensional image in the plurality of two-dimensional images is a first immunohistochemistry (IHC) image of the first tissue section taken at a first wavelength or a first wavelength range
- a third two-dimensional image in the plurality of two-dimensional images is a second immunohistochemistry (IHC) image of the first tissue section taken at a second wavelength or a second wavelength range that is different than the first wavelength or the first wavelength range.
- the first two-dimensional image is acquired using Haematoxylin and Eosin, a Periodic acid-Schiff reaction stain, a Masson’s trichrome stain, an Alcian blue stain, a van Gieson stain, a reticulin stain, an Azan stain, a Giemsa stain, a Toluidine blue stain, an isamin blue/eosin stain, a Nissl and methylene blue stain, a Sudan black and/or osmium staining of the biological sample.
- the method further comprises storing the first two-dimensional image in a first schema, wherein the first schema comprises a first number of tiles and storing the first two-dimensional image in a second schema, wherein the second schema comprises a second number of tiles, where the second number of tiles is less than the first number of tiles.
- responsive to receiving display instructions from a user, the method further comprises switching from the first schema to the second schema in order to display all or a portion of the first two-dimensional image, or switching from the second schema to the first schema in order to display all or a portion of the first two-dimensional image.
- At least a first tile in the first number of tiles comprises a first predetermined tile size
- at least a second tile in the first number of tiles comprises a second predetermined tile size
- at least a first tile in the second number of tiles comprises a third predetermined tile size.
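The two tile schemas can be sketched as levels of a simple image pyramid. This is only an illustration under stated assumptions: the 512-pixel tile size, the halving of resolution between schemas, and the helper names `tile_grid`, `build_schemas`, and `pick_schema` are all hypothetical and not specified in the text above.

```python
import math

def tile_grid(width, height, tile_size):
    """Number of tile columns and rows needed to cover an image."""
    return math.ceil(width / tile_size), math.ceil(height / tile_size)

def build_schemas(width, height, tile_size=512, levels=2):
    """Schema 0 stores the full-resolution image; each later schema halves it.

    Fewer pixels at the same tile size means fewer tiles, matching the idea of
    a second schema with fewer tiles than the first.
    """
    schemas = []
    for level in range(levels):
        scale = 2 ** level
        w, h = math.ceil(width / scale), math.ceil(height / scale)
        cols, rows = tile_grid(w, h, tile_size)
        schemas.append({"level": level, "scale": scale, "tiles": cols * rows})
    return schemas

def pick_schema(schemas, zoom):
    """Pick the coarsest schema that still has enough resolution for `zoom`.

    `zoom` is screen pixels per full-resolution image pixel (assumed > 0).
    """
    usable = [s for s in schemas if s["scale"] <= 1.0 / zoom]
    return max(usable or schemas[:1], key=lambda s: s["scale"])

schemas = build_schemas(4096, 3072, tile_size=512, levels=2)
zoomed_in = pick_schema(schemas, zoom=1.0)    # full detail needed
zoomed_out = pick_schema(schemas, zoom=0.25)  # overview, coarse tiles suffice
```

Switching schemas on zoom keeps the number of tiles fetched for any view roughly constant, which is the usual motivation for storing the same image at more than one tiling.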
- the obtaining a corresponding cluster assignment comprises clustering all or a subset of the probe spots in the plurality of probe spots across the one or more spatial projections using the discrete attribute values assigned to each respective probe spot in each of the one or more spatial projections as a multi-dimensional vector, where the clustering is configured to load less than the entirety of the discrete attribute value dataset into a non-persistent memory during the clustering thereby allowing the clustering of the discrete attribute value dataset having a size that exceeds storage space in a non-persistent memory allocated to the discrete attribute value dataset.
- the clustering of all or a subset of the probe spots comprises k-means clustering with K set to a predetermined value between one and twenty-five.
- each respective cluster in the plurality of clusters consists of a unique subset of the plurality of probe spots.
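One way to cluster without holding the whole dataset in memory is to stream it in chunks during each k-means iteration. The sketch below is an assumption about how such a scheme could look, not the disclosed implementation; `chunked_kmeans`, `assign`, and the `chunks_fn` callback are hypothetical names, and the initial centroids are passed in explicitly to keep the example deterministic.

```python
import numpy as np

def chunked_kmeans(chunks_fn, centroids, n_iter=10):
    """Lloyd's k-means that reads the data one chunk at a time.

    chunks_fn : zero-argument callable returning a fresh iterator over
                (rows, dims) arrays, e.g. slices read from disk
    centroids : (k, dims) initial centroids
    """
    centroids = np.asarray(centroids, dtype=np.float64).copy()
    k, dim = centroids.shape
    for _ in range(n_iter):
        sums = np.zeros((k, dim))
        counts = np.zeros(k, dtype=np.int64)
        for chunk in chunks_fn():
            # Squared distance from every row in the chunk to every centroid.
            d2 = ((chunk[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
            labels = d2.argmin(axis=1)
            np.add.at(sums, labels, chunk)            # per-cluster running sums
            counts += np.bincount(labels, minlength=k)
        nonempty = counts > 0                          # leave empty clusters unchanged
        centroids[nonempty] = sums[nonempty] / counts[nonempty, None]
    return centroids

def assign(chunk, centroids):
    """Hard assignment: each probe spot belongs to exactly one cluster."""
    d2 = ((chunk[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

# Toy data: six probe spots with two attribute values each, read two rows at a time.
data = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]], dtype=float)
chunks_fn = lambda: (data[i:i + 2] for i in range(0, len(data), 2))
centroids = chunked_kmeans(chunks_fn, data[[0, 3]], n_iter=5)
labels = assign(data, centroids)
```

Because only per-cluster sums and counts persist between chunks, peak memory depends on the chunk size and K rather than on the number of probe spots.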
- each locus in the plurality of loci is a respective feature in a plurality of features
- each discrete attribute value in the corresponding plurality of discrete attribute values is a count of UMIs that map to a corresponding probe spot and that also map to a respective feature in the plurality of features
- each feature in the plurality of features is an open-reading frame, an intron, an exon, an entire gene, an mRNA transcript, a predetermined non-coding part of a reference genome, an enhancer, a repressor, a predetermined sequence encoding a variant allele, or any combination thereof.
- the plurality of loci comprises more than 50 loci, more than 100 loci, more than 250 loci, more than 500 loci, more than 1000 loci, or more than 10000 loci.
- each unique barcode encodes a unique predetermined value selected from the set {1, ..., 1024}, {1, ..., 4096}, {1, ..., 16384}, {1, ..., 65536}, {1, ..., 262144}, {1, ..., 1048576}, {1, ..., 4194304}, {1, ..., 16777216}, {1, ..., 67108864}, or {1, ..., 1 x 10^12}.
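Most of the set sizes listed above are powers of four (1024 = 4^5 through 67108864 = 4^13), which is consistent with reading a nucleotide barcode of length n as a base-4 number. The text does not state this encoding, so the following is a sketch under that assumption; the base ordering A, C, G, T and the function names are hypothetical.

```python
# Assumed digit value for each base; the ordering is an arbitrary choice.
BASE_VALUE = {"A": 0, "C": 1, "G": 2, "T": 3}

def barcode_to_int(barcode):
    """Map a barcode of length n to a value in {1, ..., 4**n}."""
    value = 0
    for base in barcode:
        value = value * 4 + BASE_VALUE[base]
    return value + 1  # shift so the smallest value is 1 rather than 0

def int_to_barcode(value, length):
    """Inverse of barcode_to_int for a barcode of the given length."""
    value -= 1
    bases = []
    for _ in range(length):
        value, digit = divmod(value, 4)
        bases.append("ACGT"[digit])
    return "".join(reversed(bases))

lowest = barcode_to_int("AAAAA")
highest = barcode_to_int("TTTTT")
example = barcode_to_int("ATAAA")
```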
- cells in the first tissue section that map to the probe spots of the first cluster are a first tissue type and cells in the first tissue section that map to the probe spots of the second cluster are a second tissue type.
- the first tissue type is healthy tissue and the second tissue type is diseased tissue.
- the morphological pattern is a spatial arrangement of probe spots assigned to the first cluster relative to probe spots assigned to the second cluster.
- the method further comprises, in response to a first user selection of a first subset of probe spots using the displayed pixel values of the first two-dimensional image, assigning the first subset of probe spots to the first cluster, and, in response to receiving a second user selection of a second subset of probe spots using the displayed pixel values of the first two-dimensional image, assigning the second subset of probe spots to the second cluster.
- the method further comprises, in response to a first user selection of a first subset of probe spots using displayed discrete attribute values of an active list of genes superimposed on the first two-dimensional image, assigning the first subset of probe spots to the first cluster, and, in response to a second user selection of a second subset of probe spots using displayed discrete attribute values of an active list of genes superimposed on the first two-dimensional image, assigning the second subset of probe spots to the second cluster.
- Another aspect of the present disclosure provides a computing system comprising at least one processor and memory storing at least one program to be executed by the at least one processor, the at least one program comprising instructions for identifying a morphological pattern by any of the methods disclosed above.
- Still another aspect of the present disclosure provides a non-transitory computer readable storage medium storing one or more programs for identifying a morphological pattern. The one or more programs are configured for execution by a computer. The one or more programs collectively encode computer executable instructions for performing any of the methods disclosed above.
- Figures 1A and 1B are an example block diagram illustrating a computing device in accordance with some embodiments of the present disclosure.
- Figures 2A and 2B collectively illustrate an example method in accordance with an embodiment of the present disclosure, in which optional steps are indicated by dashed lines.
- Figure 3 illustrates a user interface for obtaining a dataset in accordance with some embodiments.
- Figure 5 illustrates an example display in which a table that comprises the differential value for each respective locus in a plurality of loci for each cluster in a plurality of clusters is displayed in a first panel while each respective probe spot in a plurality of probe spots is displayed in a second panel in accordance with some embodiments.
- Figure 7 illustrates an example of a user interface where a plurality of probe spots is displayed in a panel of the user interface, where the spatial location of each probe spot in the user interface is based upon the physical localization of each probe spot on a substrate, where each probe spot is additionally colored in conjunction with one or more clusters identified based on the discrete attribute value dataset, in accordance with some embodiments of the present disclosure.
- Figures 9A and 9B collectively illustrate examples of the image settings available for fine-tuning the visualization of the probe spot localizations, in accordance with some embodiments of the present disclosure.
- Figure 10 illustrates selection of a single gene for visualization, in accordance with some embodiments of the present disclosure.
- Figures 11A and 11B illustrate adjusting the opacity of the probe spots overlaid on an underlying tissue image and creating one or more custom clusters, in accordance with some embodiments of the present disclosure.
- Figures 12A and 12B collectively illustrate clusters based on t-SNE and UMAP plots in either computational expression space as shown in Figure 12A or in spatial projection space as shown in Figure 12B, in accordance with some embodiments of the present disclosure.
- Figure 13 illustrates subdividing image files into tiles for efficiently storing image information in accordance with some embodiments of the present disclosure.
- Figure 16A illustrates an embodiment in which all of the images of a spatial projection are fluorescence images and are all displayed in accordance with an embodiment of the present disclosure.
- Figures 18A, 18B, 18C, 18D, 18E, and 18F illustrate spatial projections that make use of linked windows in accordance with an embodiment of the present disclosure.
- Figure 19 illustrates details of a spatial probe spot and capture probe in accordance with an embodiment of the present disclosure.
- Figure 20 illustrates an immunofluorescence image, a representation of all or a portion of each subset of sequence reads at each respective position within one or more images that maps to a respective capture spot corresponding to the respective position, as well as composite representations in accordance with embodiments of the present disclosure.
- the capture area is imaged and then cells within the tissue are permeabilized in place, enabling the capture probes to bind to RNA from cells in proximity to (e.g., on top and/or laterally positioned with respect to) the probe spots.
- two-dimensional spatial sequencing is performed by obtaining barcoded cDNA and then sequencing libraries from the bound RNA, and the barcoded cDNA is then separated (e.g., washed) from the substrate.
- the sequencing libraries are run on a sequencer and sequencing read data is generated and applied to a sequencing pipeline.
- Reads from the sequencer are grouped by barcodes and UMIs, aligned to genes in a transcriptome reference, after which the pipeline generates a number of files, including a feature-barcode matrix.
- the barcodes correspond to individual spots within a capture area.
- the value of each entry in the spatial feature-barcode matrix is the number of RNA molecules in proximity to (e.g., on top and/or laterally positioned with respect to) the probe spot affixed with that barcode that align to a particular gene feature.
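The grouping of reads by barcode and UMI into a feature-barcode matrix can be sketched as follows. This is a simplified illustration, assuming reads have already been aligned and reduced to (spot barcode, UMI, feature) tuples; the function name, the tuple layout, and the example gene names are hypothetical, and a real pipeline would also handle sequencing errors and ambiguous alignments.

```python
from collections import defaultdict

def feature_barcode_matrix(reads):
    """Count distinct UMIs per (feature, spot barcode) pair.

    reads : iterable of (spot_barcode, umi, feature) tuples for aligned reads
    Returns (features, barcodes, matrix) where matrix[i][j] is the UMI count
    for features[i] at the spot carrying barcodes[j].
    """
    umis = defaultdict(set)
    for barcode, umi, feature in reads:
        # A set collapses duplicate reads of the same molecule (same UMI).
        umis[(feature, barcode)].add(umi)
    features = sorted({f for f, _ in umis})
    barcodes = sorted({b for _, b in umis})
    matrix = [[len(umis.get((f, b), ())) for b in barcodes] for f in features]
    return features, barcodes, matrix

reads = [
    ("ATAAA-1", "U1", "GeneA"),
    ("ATAAA-1", "U1", "GeneA"),  # PCR duplicate of the read above
    ("ATAAA-1", "U2", "GeneA"),
    ("CCGTA-1", "U3", "GeneA"),
    ("CCGTA-1", "U4", "GeneB"),
]
features, barcodes, matrix = feature_barcode_matrix(reads)
```

Counting distinct UMIs rather than raw reads is what lets each matrix entry approximate a number of RNA molecules.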
- the method then provides for displaying the relative abundance of features (e.g., expression of genes) at each probe spot in the capture area overlaid on the image of the original tissue. This enables users to observe patterns in feature abundance (e.g., gene or protein expression) in the context of tissue samples. Such methods provide for improved pathological examination of patient samples.
- nucleic acid and “nucleotide” are intended to be consistent with their use in the art and to include naturally-occurring species or functional analogs thereof. Particularly useful functional analogs of nucleic acids are capable of hybridizing to a nucleic acid in a sequence-specific fashion or are capable of being used as a template for replication of a particular nucleotide sequence.
- Naturally-occurring nucleic acids generally have a backbone containing phosphodiester bonds.
- An analog structure can have an alternate backbone linkage including any of a variety of those known in the art.
- Naturally-occurring nucleic acids generally have a deoxyribose sugar (e.g., found in deoxyribonucleic acid (DNA)) or a ribose sugar (e.g., found in ribonucleic acid (RNA)).
- a native deoxyribonucleic acid can have one or more bases selected from the group consisting of adenine (A), thymine (T), cytosine (C), or guanine (G), and a ribonucleic acid can have one or more bases selected from the group consisting of uracil (U), adenine (A), cytosine (C), or guanine (G).
- Useful non-native bases that can be included in a nucleic acid or nucleotide are known in the art.
- a “barcode” is a label, or identifier, that conveys or is capable of conveying information (e.g., information about an analyte in a sample, a bead, and/or a capture probe).
- a barcode can be part of an analyte, or independent of an analyte.
- a barcode can be attached to an analyte.
- a particular barcode can be unique relative to other barcodes.
- Barcodes can have a variety of different formats.
- barcodes can include polynucleotide barcodes, random nucleic acid and/or amino acid sequences, and synthetic nucleic acid and/or amino acid sequences.
- a barcode can be attached to an analyte or to another moiety or structure in a reversible or irreversible manner.
- a barcode can be added to, for example, a fragment of a deoxyribonucleic acid (DNA) or ribonucleic acid (RNA) sample before or during sequencing of the sample.
- Barcodes can allow for identification and/or quantification of individual sequencing-reads (e.g., a barcode can be or can include a unique molecular identifier or “UMI”).
- Barcodes can spatially-resolve molecular components found in biological samples, for example, a barcode can be or can include a “spatial barcode”.
- a barcode includes both a UMI and a spatial barcode.
- the UMI and barcode are separate entities.
- a barcode includes two or more sub-barcodes that together function as a single barcode.
- a polynucleotide barcode can include two or more polynucleotide sequences (e.g., sub-barcodes) that are separated by one or more non-barcode sequences. More details on barcodes and UMIs are disclosed in United States Provisional Patent Application No. 62/980,073, entitled “Pipeline for Analysis of Analytes,” filed February 21, 2020, attorney docket number 104371-5033-PR01, which is hereby incorporated by reference.
- the terms “chip” and “substrate” are used interchangeably and refer to any surface onto which capture probes can be affixed (e.g., a solid array, a bead, a coverslip, etc.). More details on suitable substrates are disclosed in United States Provisional Patent Application No. 62/980,073, entitled “Pipeline for Analysis of Analytes,” filed February 21, 2020, attorney docket number 104371-5033-PR01, which is hereby incorporated by reference.
- a “biological sample” is obtained from the subject for analysis using any of a variety of techniques including, but not limited to, biopsy, surgery, and laser capture microscopy (LCM), and generally includes tissues or organs and/or other biological material from the subject.
- Biological samples can include one or more diseased cells.
- a diseased cell can have altered metabolic properties, gene expression, protein expression, and/or morphologic features. Examples of diseases include inflammatory disorders, metabolic disorders, nervous system disorders, neurological disorders and cancer. Cancer cells can be derived from solid tumors, hematological malignancies, cell lines, or obtained as circulating tumor cells.
- FIG. 1A is a block diagram illustrating a visualization system 100 in accordance with some implementations.
- the device 100 in some implementations includes one or more processing units CPU(s) 102 (also referred to as processors), one or more network interfaces 104, a user interface 106 comprising a display 108 and an input module 110, a non-persistent memory 111, a persistent memory 112, and one or more communication buses 114 for interconnecting these components.
- the one or more communication buses 114 optionally include circuitry (sometimes called a chipset) that interconnects and controls communications between system components.
- the non-persistent memory 111 typically includes high-speed random access memory, such as DRAM, SRAM, DDR RAM, ROM, EEPROM, flash memory, whereas the persistent memory 112 typically includes CD-ROM, digital versatile disks (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, magnetic disk storage devices, optical disk storage devices, flash memory devices, or other non-volatile solid state storage devices.
- the persistent memory 112 optionally includes one or more storage devices remotely located from the CPU(s) 102.
- the persistent memory 112, and the non-volatile memory device(s) within the non-persistent memory 111, comprise a non-transitory computer readable storage medium.
- the non-persistent memory 111 or alternatively the non-transitory computer readable storage medium stores the following programs, modules and data structures, or a subset thereof, sometimes in conjunction with the persistent memory 112:
- an optional operating system 116 which includes procedures for handling various basic system services and for performing hardware dependent tasks;
- an optional clustering module 152 for clustering a discrete attribute value dataset 120 using the discrete attribute values 124 for each locus 122 in the plurality of loci for each respective probe spot 126 in the plurality of probe spots for each image 125 for each spatial projection 121, or principal component values 164 derived therefrom, thereby assigning respective probe spots to clusters 158 in a plurality of clusters in a clustered dataset 128;
- clustered dataset 128 comprising a plurality of clusters 158, each cluster 158 including a subset of probe spots 126, and each respective cluster 158 including a differential value 162 for each locus 122 across the probe spots 126 of the subset of probe spots for the respective cluster 158.
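The passage above does not define how a differential value is computed. As one plausible illustration only, the sketch below uses the log2 ratio of a locus's mean count inside a cluster to its mean count in all other probe spots; this choice of statistic, the pseudocount, and the function name `differential_values` are assumptions rather than the disclosed method.

```python
import numpy as np

def differential_values(counts, labels, pseudocount=1.0):
    """Per-cluster, per-locus log2 fold change (an assumed statistic).

    counts : (spots, loci) array of discrete attribute values
    labels : (spots,) cluster assignment per probe spot
    Returns {cluster id: (loci,) array}. Requires at least two clusters,
    otherwise the "outside" group is empty and the result is undefined.
    """
    counts = np.asarray(counts, dtype=np.float64)
    labels = np.asarray(labels)
    result = {}
    for cluster in np.unique(labels):
        inside = counts[labels == cluster].mean(axis=0)
        outside = counts[labels != cluster].mean(axis=0)
        # The pseudocount avoids division by zero for unexpressed loci.
        result[int(cluster)] = np.log2((inside + pseudocount) / (outside + pseudocount))
    return result

counts = [[7, 1], [7, 1], [1, 3], [1, 3]]  # four probe spots, two loci
labels = [0, 0, 1, 1]
diff = differential_values(counts, labels)
```

A positive value marks a locus that is more abundant inside the cluster than elsewhere, which is the kind of per-locus, per-cluster quantity a table of differential values would display.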
- one or more of the above identified elements are stored in one or more of the previously mentioned memory devices, and correspond to a set of instructions for performing a function described above.
- the non-persistent memory 111 optionally stores a subset of the modules and data structures identified above. Furthermore, in some embodiments, the memory stores additional modules and data structures not described above. In some embodiments, one or more of the above identified elements is stored in a computer system, other than that of visualization system 100, that is addressable by visualization system 100 so that visualization system 100 may retrieve all or a portion of such data when needed.
- Figure 1A illustrates that the clustered dataset 128 includes a plurality of clusters 158 comprising cluster 1 (158-1), cluster 2 (158-2) and other clusters up to cluster P (158-P), where P is a positive integer.
- Cluster 1 (158-1) is stored in association with probe spot 1 for cluster 1 (126-1-1), probe spot 2 for cluster 1 (126-2-1), and subsequent probe spots up to probe spot Q for cluster 1 (126-Q-1), where Q is a positive integer.
- the cluster attribute value for probe spot 1 is stored in association with the probe spot 1 for cluster 1 (126-1-1)
- the cluster attribute value for the probe spot 2 (160-2-1) is stored in association with the probe spot 2 for cluster 1 (126-2-1)
- the cluster attribute value for the probe spot Q (160-Q-1) is stored in association with the probe spot Q for cluster 1 (126-Q-1).
- the clustered dataset 128 also includes differential value for locus 1 for cluster 1 (162-1-1) and subsequent differential values up to differential value for locus M for cluster 1 (162-1-M).
- Cluster 2 (158-2) and other clusters up to cluster P (158-P) in the clustered dataset 128 can include information similar to that in cluster 1 (158-1), and each cluster in the clustered dataset 128 is therefore not described in detail.
- a discrete attribute value dataset 120, which is stored in the persistent memory 112, includes discrete attribute value dataset 120-1 and other discrete attribute value datasets up to discrete attribute value dataset 120-X.
- persistent memory 112 stores one or more discrete attribute value datasets 120.
- Each discrete attribute value dataset 120 comprises one or more spatial projections 121.
- a discrete attribute value dataset 120 comprises a single spatial projection 121.
- a discrete attribute value dataset 120 comprises a plurality of spatial projections.
- Each spatial projection 121 has an independent set of images 125, and a distinct set of probe locations 123.
- a discrete attribute value dataset 120 contains a single feature barcode matrix. In other words, the probe set used in each of the spatial projections 121 in a given discrete attribute value dataset 120 is the same.
- the probe set used in each of the images 125 of a particular spatial projection 121 is the same. Accordingly, in some embodiments, the probes of a probe set contain a suffix, or other form of indicator, that indicates the spatial projection 121 from which a given probe spot (and subsequent measurements) originated. For instance, the barcode (probe) ATAAA-1 from spatial projection (capture area) 1 (121-1-1) will be different from ATAAA-2 from spatial projection (capture area) 2 (121-1-2).
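The suffix convention in the ATAAA-1 / ATAAA-2 example can be handled with a small parsing helper. This sketch assumes the indicator is always a trailing integer after the last hyphen, as in the example; the function names are hypothetical.

```python
def split_barcode(tagged):
    """Split 'ATAAA-1' into the barcode sequence and its spatial projection index."""
    sequence, _, suffix = tagged.rpartition("-")
    if not sequence or not suffix.isdigit():
        raise ValueError(f"barcode {tagged!r} has no numeric projection suffix")
    return sequence, int(suffix)

def group_by_projection(tagged_barcodes):
    """Collect barcode sequences under the spatial projection they came from."""
    groups = {}
    for tagged in tagged_barcodes:
        sequence, projection = split_barcode(tagged)
        groups.setdefault(projection, []).append(sequence)
    return groups

# The same sequence from two capture areas stays distinguishable by its suffix.
groups = group_by_projection(["ATAAA-1", "ATAAA-2", "CCGTA-1"])
```

Keeping the suffix as part of the matrix column key lets a single feature-barcode matrix hold probe spots from several capture areas without collisions.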
- the sample has been stained with a Masson’s trichrome stain (nuclei and other basophilic structures are stained blue, cytoplasm, muscle, erythrocytes and keratin are stained bright-red, collagen is stained green or blue, depending on which variant of the technique is used) and the image is a bright-field microscopy image.
- the sample has been stained with an Alcian blue stain (a mucin stain that stains certain types of mucin blue, and stains cartilage blue and can be used with H&E, and with van Gieson stains) and the image is a bright-field microscopy image.
- the sample has been stained with a van Gieson stain (stains collagen red, nuclei blue, and erythrocytes and cytoplasm yellow, and can be combined with an elastin stain that stains elastin blue/black) and the image is a bright-field microscopy image.
- the sample has been stained with an immunofluorescence (IF) stain (e.g., an immunofluorescence label conjugated to an antibody).
- biological samples are stained as described in I. Introduction; (d) Biological samples; (ii) Preparation of biological samples; (6) staining of United States Provisional Patent Application No. 62/938,336, entitled “Pipeline for Analysis of Analytes,” filed November 21, 2019, attorney docket number 104371-5033-PR, which is hereby incorporated by reference in its entirety.
- an image 125 is an immunohistochemistry (IHC) image.
- IHC imaging relies upon a staining technique using antibody labels.
- One form of immunohistochemistry (IHC) imaging is immunofluorescence (IF) imaging.
- in IF imaging, primary antibodies are used that specifically label a protein in the biological sample, and then a fluorescently labelled secondary antibody or other form of probe is used to bind to the primary antibody, to show where the first (primary) antibody has bound.
- a light microscope equipped with fluorescence is used to visualize the staining. The fluorescent label is excited at one wavelength of light, and emits light at a different wavelength.
- in addition to brightfield imaging or instead of brightfield imaging, fluorescence imaging is used to acquire one or more spatial images of the sample.
- fluorescence imaging refers to imaging that relies on the excitation and re-emission of light by fluorophores, regardless of whether they're added experimentally to the sample and bound to antibodies (or other compounds) or simply natural features of the sample.
- IHC imaging, and in particular IF imaging is just one form of fluorescence imaging.
- each respective image 125 in a single spatial projection represents a different channel in a plurality of channels, where each such channel in the plurality of channels represents an independent (e.g., different) wavelength or a different wavelength range (e.g., corresponding to a different emission wavelength).
- the images 125 of a single spatial projection will have been taken of a tissue (e.g., the same tissue section) by a microscope at multiple wavelengths, where each such wavelength corresponds to the excitation frequency of a different kind of substance (containing a fluorophore) within or spatially associated with the sample.
- This substance can be a natural feature of the sample (e.g., a type of molecule that is naturally within the sample), or one that has been added to the sample.
- One manner in which such substances are added to the sample is in the form of probes that excite at specific wavelengths.
- probes can be directly added to the sample, or they can be conjugated to antibodies that are specific for some sort of antigen occurring within the sample, such as one that is exhibited by a particular protein.
- each of the images 125 of a given spatial projection will have the same dimensions and position relative to a single set of capture spot locations associated with the spatial projection.
- Each respective spatial projection in a discrete attribute value dataset will have its own set of capture spot locations associated with the respective spatial projection.
- each spatial projection represents images that are taken from an independent target (e.g., different tissue sections, etc.).
- both a bright-field microscopy image and a set of fluorescence images are taken of a biological sample and are in the same spatial projection for the biological sample.
- Each respective spatial projection in a discrete attribute value dataset 120 will have its own set of probe spot locations associated with the respective spatial projection.
- Example probe spot dimensions and density is disclosed in United States Provisional Application No.
- both a bright-field microscopy image and a set of fluorescence images are taken of a biological sample and are in the same spatial projection 121.
- an image 125 comprises, for each respective probe spot 126 in a plurality of probe spots (associated with the corresponding dataset), a discrete attribute value 124 for each locus 122 in a plurality of loci.
- a discrete attribute value dataset 120-1 includes information related to probe spot 1 (126-1-1-1), probe spot 2 (126-1-1-2) and other probe spots up to probe spot Y (126-1-1-Y) for each image 125 of each spatial projection 121.
- the probe spot 1 (126-1-1-1) includes a discrete attribute value 124-1-1-1 of locus 1 for probe spot 1 (122-1-1-1), a discrete attribute value 124-1-1-2 of locus 2 for probe spot 1 (122-1-1-1), and other discrete attribute values up to discrete attribute value 124-1-1-M of locus M for probe spot 1 (122-1-1-1).
- each locus is a different locus in a reference genome.
- the dataset further stores a plurality of principal component values 164 and/or a two-dimensional data point and/or a category 170 assignment for each respective probe spot 126 in the plurality of probe spots.
- Figure 1B illustrates, by way of example, principal component value 1 164-1-1 through principal component value N 164-1-N stored for probe spot 126-1, where N is a positive integer.
- the principal components are computed for the discrete attribute values of each respective probe spot, from each image 125, for each spatial projection 121 of the discrete attribute dataset 120.
- the principal component is taken across the variance observed in the discrete attribute value of the probe spot in each of the eleven images, where the assumption is made that the equivalent probe spot is known in the two projections.
- the principal components are computed for only a subset of the discrete attribute values of a probe spot across each spatial projection 121 of the discrete attribute dataset 120. In other words, the discrete attribute value for the probe spot in only a subset of the images is used.
- the principal components are computed for a select set of loci 122 (rather than all the loci), from each image 125, across each spatial projection 121 of the discrete attribute dataset 120.
- the principal components are computed for the discrete attribute values of the probe spot, from a subset of images, across each spatial projection 121 of the discrete attribute dataset 120.
- the principal components are computed for the discrete attribute values of each instance of a probe spot across each image 125 for a single spatial projection 121 of the discrete attribute dataset 120. In some alternative embodiments, the principal components are computed for the discrete attribute values of each instance of a probe spot across each image 125 across a subset of the spatial projections 121 of the discrete attribute dataset 120. In some embodiments, a user selects this subset.
- the principal components are computed for the discrete attribute values of each instance of a probe spot across a subset of the images 125 across each spatial projection 121 of the discrete attribute dataset 120.
- a single channel (single image type) is user selected, and the principal components are computed for the discrete attribute values of each instance of a probe spot across this single channel across each spatial projection 121 of the discrete attribute dataset 120.
- Figure 1B also illustrates how, in some embodiments, each probe spot is given a cluster assignment 158 (e.g., cluster assignment 158-1 for probe spot 1).
- the clustering is based on discrete attribute values across all the images of all the spatial projections of a dataset.
- some subset of the images, or some subset of the projections is used to perform the clustering.
- Figure 1B also illustrates one or more category assignments 170-1, ... 170-Q, where Q is a positive integer, for each probe spot (e.g., category assignment 170-1-1, ... 170-Q-1, for probe spot 1).
- a category assignment includes multiple classes 172 (e.g., class 172-1, ..., 172-M, such as class 172-1-1, ..., 172-M-1 for probe spot 1, where M is a positive integer).
- the discrete attribute value dataset 120 stores a two-dimensional data point 166 for each respective probe spot 126 in the plurality of probe spots (e.g., two-dimensional data point 166-1 for probe spot 1 in Figure 1B) but does not store the plurality of principal component values 164.
- each probe spot represents a plurality of cells.
- each probe spot represents a different individual cell (e.g., for liquid biopsy analysis where cells are clearly distinct on a substrate).
- each locus represents a number of mRNA measured in the different probe spot that maps to a respective gene in the genome of the cell, and the dataset further comprises the total RNA counts per probe spot.
- one or more spatial projections each comprise one or more layer maps 182.
- a layer map 182 is in the form of a binary map/probability map.
- a layer map 182 is in the same space (orientation) as the one or more images 125 of the spatial projection 121 of a discrete attribute value dataset 120.
- the layer maps 182 provide a way to import various kinds of data into the visualization module 119 for co-display with images 125 of spatial projections 121 within a discrete attribute value dataset 120. For example, in the case where there is software external to the visualization module 119 that measures signal intensity in an image 125, such signal intensity measurements can be imported as a layer map 182.
- the layer map 182 comprises a two-dimensional array of pixel values, where the two-dimensional array has the same dimensions (pixelspace) as the two-dimensional array of pixel values of images 125 of the corresponding spatial projection 121.
- each pixel in the layer map 182 contains information on measured signal intensity.
- each layer map 182 represents an analysis of a different stain, or a different combination of stains in a plurality of stains.
- each such stain or combination of stains is coded into the corresponding layer map and defines a tissue type and the probability of a specific tissue type by the greyscale value/color assigned to each specific pixel in the respective layer map 182. Because the pixelspace of the information in the respective layer map is the same as the one or more images 125 in the spatial projection 121 of the corresponding discrete attribute value dataset, each layer map 182 is readily overlaid onto the one or more images 125 of the corresponding discrete attribute value dataset. Moreover, additional operations can be performed to utilize the image coded in the layer map 182, such as selecting clusters based on a probability threshold.
- the information coded into the pixelspace of a layer map 182 is the result of processing of the native pixel values of a corresponding image 125 in the discrete attribute value dataset.
- a layer map 182 may represent a transformation of the corresponding image 125, where the transformation is a cell segmentation of probe spots in the corresponding image 125 based on different combinations of fluorescent markers present in the corresponding image 125, identification of tissue structures (e.g., glands) in the corresponding image 125, and/or identification of pathology (healthy vs diseased) in the corresponding image 125.
- the layer map 182 represents a transformation of the corresponding image 125, where the transformation is an output of a trained machine learning algorithm, in which the corresponding image 125 is the input to the trained machine learning algorithm, and where the machine learning algorithm has been trained using annotated training sets of images.
- machine learning algorithms for processing images are disclosed in United States Provisional Patent Application No.: 62/977,565, entitled “Systems and Methods for Machine Learning Patterns in Biological Samples,” filed February 2020, which is hereby incorporated by reference.
- a layer map 182 is derived from more than one image 125 of a corresponding spatial projection.
- a layer map is derived from two images of a corresponding spatial projection.
- a layer map comprises a Boolean combination of the corresponding pixel values of two images 125, for instance the summation or subtraction of the corresponding pixel values from the two images.
- a layer map comprises a Boolean combination of the corresponding probe spot values (e.g., discrete attribute values 124) of two images 125, for instance the summation or subtraction of the corresponding pixel values from the two images for each locus for each probe spot.
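The two-image layer maps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `combine_layers` and `threshold_mask`, the nested-list image representation, and the threshold value of 128 are hypothetical choices for the example and are not taken from the disclosure.

```python
def combine_layers(img_a, img_b, op="sum"):
    """Combine two 2-D pixel arrays that share the same pixelspace."""
    same_shape = len(img_a) == len(img_b) and all(
        len(ra) == len(rb) for ra, rb in zip(img_a, img_b)
    )
    if not same_shape:
        raise ValueError("layer maps must share the same pixelspace")
    if op == "sum":
        return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(img_a, img_b)]
    if op == "subtract":
        return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(img_a, img_b)]
    raise ValueError(f"unsupported op: {op}")


def threshold_mask(layer, threshold):
    """Turn a greyscale layer map into a binary map (1 where value >= threshold)."""
    return [[1 if value >= threshold else 0 for value in row] for row in layer]


# Two hypothetical single-channel images of the same spatial projection.
channel_1 = [[10, 200], [30, 0]]
channel_2 = [[20, 50], [10, 0]]

layer = combine_layers(channel_1, channel_2, op="sum")
mask = threshold_mask(layer, threshold=128)
```

Because both inputs and the resulting layer map have identical dimensions, the binary `mask` can be overlaid directly on either source image.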
- Although Figures 1A, 1B, and 1C depict a “visualization system 100,” the figures are intended more as a functional description of the various features that may be present in computer systems than as a structural schematic of the implementations described herein. In practice, and as recognized by those of ordinary skill in the art, items shown separately could be combined and some items could be separated. Moreover, although Figure 1A depicts certain data and modules in non-persistent memory 111, some or all of these data and modules may be in persistent memory 112. Further, while discrete attribute value dataset 120 is depicted as resident in persistent memory 112, a portion of discrete attribute value dataset 120 is, in fact, resident in non-persistent memory 111 at various stages of the disclosed methods.
- A non-limiting example of a visualization system is collectively illustrated in Figures 1A, 1B, and 1C.
- the persistent memory and/or the non-persistent memory can be on a single computer, distributed across a network of computers, be represented by one or more virtual machines, or be part of a cloud computing architecture.
- Each two-dimensional image in the one or more two-dimensional images is (a) taken of a first tissue section, obtained from the biological sample, overlaid on a substrate having the plurality of probe spots arranged in the spatial arrangement and (b) comprises at least 100,000 pixel values.
- each two-dimensional image comprises at least 200,000 pixel values, at least 300,000 pixel values, at least 500,000 pixel values, at least 1 million pixel values, at least 2 million pixel values, at least 3 million pixel values, at least 4 million pixel values, at least 5 million pixel values, or at least 8 million pixel values.
- the discrete attribute value dataset comprises (iii) a corresponding plurality of discrete attribute values 124 for each respective probe spot 126 in the plurality of probe spots obtained from two-dimensional spatial sequencing of the first tissue section.
- Each respective discrete attribute value in the corresponding plurality of discrete attribute values is for a different locus in a plurality of loci.
- each corresponding plurality of discrete attribute values comprises at least 25 discrete attribute values, at least 50 discrete attribute values, at least 100 discrete attribute values, at least 500 discrete attribute values, or at least 1000 discrete attribute values.
- the discrete attribute value dataset 120 comprises a corresponding discrete attribute value 124 for each locus 122 in a plurality of loci for each respective probe spot 126 in a plurality of probe spots for each image in a set of images for each spatial projection in the set of spatial projections.
- in some embodiments in which there are multiple images in a single projection, it is possible for there to be only a single set of discrete attribute values.
- a discrete attribute value dataset 120 has a file size of more than 1 megabyte, more than 5 megabytes, more than 100 megabytes, more than 500 megabytes, or more than 1000 megabytes. In some embodiments, a discrete attribute value dataset 120 has a file size of between 0.5 gigabytes and 25 gigabytes. In some embodiments, a discrete attribute value dataset 120 has a file size of between 0.5 gigabytes and 100 gigabytes. In some embodiments, each set of images is, in fact, a single image. In some embodiments, each set of images is, in fact, a plurality of images.
- each set of images has an independent number of images meaning that there is no requirement that each spatial projection 121 in a discrete attribute dataset 120 has the same number of images.
- the set of images of a particular spatial projection consists of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, or more than 20 images.
- one or more of the images include an image of other analytes, such as proteins in a biological sample.
- images in the set of images are acquired by exciting a target sample using a different wavelength or a different wavelength range.
- an image is acquired using transmission light microscopy (e.g., bright field transmission light microscopy, dark field transmission light microscopy, oblique illumination transmission light microscopy, dispersion staining transmission light microscopy, phase contrast transmission light microscopy, differential interference contrast transmission light microscopy, emission imaging, etc.).
- each of the images in the set of images for a spatial projection is acquired by using a different bandpass filter that blocks out light other than a particular wavelength or set of wavelengths.
- the set of images of a projection are images created using fluorescence imaging, for example, by making use of various immunohistochemistry (IHC) probes that excite at various different wavelengths.
- each set of images corresponds to a different tissue section in a collection of tissue sections taken from a biological sample
- each respective spatial projection e.g., each tissue section or sub-portion of the tissue section
- each respective spatial projection is inputted into the disclosed discrete attribute value dataset 120 with 1, 2, 3, 4, ... Q different images 125 associated with it, where Q is a positive integer.
- the biological sample is attached to a substrate (e.g., a slide, a coverslip, a semiconductor wafer, a chip, etc.). Attachment of the biological sample can be irreversible or reversible, depending upon the nature of the sample and subsequent steps in the analytical method.
- the transcriptome reference has sequences for 10 or more genes, 25 or more genes, 50 or more genes, 500 or more genes, 750 or more genes, 1000 or more genes, 2000 or more genes, 5000 or more genes or 10000 or more genes against which each sequence read is aligned.
- each entry corresponds to a number of RNA molecules proximal to (e.g., on top of) a respective probe spot (e.g., each RNA molecule has been bound by a barcode corresponding to the respective probe spot) that align to a particular locus (e.g., gene feature).
- each capture area of an image 125 is indicated (e.g., outlined) by a plurality of printed fiduciary dots (e.g., to identify the location of each capture area).
- each plurality of printed fiduciary dots (e.g., dots 706 in Figure 7)
- the fiduciary positions are stored in the discrete attribute value dataset 120 (e.g., a .cloupe file) as an additional projection, akin to the other spots in a .cloupe dataset. These fiduciary positions are viewable for spatial datasets by selecting “Fiduciary Spots” from the Image Settings panel, discussed herein, as shown in Figure 9B.
- each such mRNA represents a different gene and the discrete attribute value dataset 120 includes discrete attribute values for between 5 and 20,000 different genes, or variants of different genes or open reading frames of different genes, in each probe spot of each image 125 of each spatial projection 121 represented by the dataset.
- V(D)J sequences are spatially quantified using, for example, clustering and/or t-SNE (where such cluster and/or t-SNE plots can be displayed in linked windows); see United States Patent Application No. 15/984,324, entitled “Systems and Methods for Clonotype Screening,” filed May 19, 2018, which is hereby incorporated by reference.
- mRNA for more than 50, more than 100, more than 500, or more than 1000 different genetic loci are localized to a single probe spot, and for each such respective genetic locus, one or more UMI are identified, meaning that there were one or more mRNA molecules encoding the respective genetic locus.
- more than ten, more than one hundred, more than one thousand, or more than ten thousand UMI for a respective genetic locus are localized to a single probe spot.
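To make the per-probe-spot, per-locus counts concrete, the following Python sketch tallies distinct UMIs for each (probe spot barcode, locus) pair. The record layout `(spot_barcode, gene, umi)`, the function name `count_umis`, and the sample identifiers are assumptions made for illustration; the disclosure does not prescribe this layout.

```python
from collections import Counter


def count_umis(records):
    """Count distinct UMIs per (probe spot barcode, locus) pair.

    `records` is an iterable of (spot_barcode, gene, umi) tuples, one per
    aligned sequence read. Reads that repeat the same UMI for the same spot
    and gene are collapsed so that each molecule is counted once.
    """
    unique_molecules = set(records)
    return Counter((spot, gene) for spot, gene, _umi in unique_molecules)


# Hypothetical aligned reads: the first two share a UMI and count once.
reads = [
    ("SPOT1", "GENE_A", "umi1"),
    ("SPOT1", "GENE_A", "umi1"),
    ("SPOT1", "GENE_A", "umi2"),
    ("SPOT2", "GENE_B", "umi3"),
]
counts = count_umis(reads)
```

Pairs that never appear in the reads simply report a count of zero, which is consistent with most entries of such a matrix being empty.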
- the file size of the native image 125 is between 0.5 gigabyte and 10 gigabytes. In some embodiments, the file size of the native image 125 is between 0.5 gigabyte and 25 gigabytes. In some embodiments, the native image includes between 1 million and 25 million pixels. In some embodiments, each probe spot is represented by five or more, ten or more, 100 or more, 1000 or more contiguous pixels in a native image 125. In some embodiments, each probe spot is represented by between 1000 and 250,000 contiguous pixels in a native image 125.
- each native image 125 is in any file format including but not limited to JPEG/JFIF, TIFF, Exif, PDF, EPS, GIF, BMP, PNG, PPM, PGM, PBM, PNM, WebP, HDR raster formats, HEIF, BAT, BPG, DEEP, DRW, ECW, FITS, FLIF, ICO, ILBM, IMG, PAM, PCX, PGF, JPEG XR, Layered Image File Format, PLBM, SGI, SID, CD5, CPT, PSD, PSP, XCF, PDN, CGM, SVG, PostScript, PCT, WMF, EMF, SWF, XAML, and/or RAW.
- an image is represented as an array (e.g., matrix) comprising a plurality of pixels, such that the location of each respective pixel in the plurality of pixels in the array (e.g., matrix) corresponds to its original location in the image.
- an image is represented as a vector comprising a plurality of pixels, such that each respective pixel in the plurality of pixels in the vector comprises spatial information corresponding to its original location in the image.
- an image 125 is a color image (e.g., 3 x 8 bit, 2424 x 2424 pixel resolution). In some embodiments, an image 125 is a monochrome image (e.g., 14 bit, 2424 x 2424 pixel resolution).
- the biological sample is stained with Haematoxylin and Eosin, a Periodic acid-Schiff reaction stain (stains carbohydrates and carbohydrate rich macromolecules a deep red color), a Masson’s trichrome stain (nuclei and other basophilic structures are stained blue, cytoplasm, muscle, erythrocytes and keratin are stained bright-red, collagen is stained green or blue, depending on which variant of the technique is used), an Alcian blue stain (a mucin stain that stains certain types of mucin blue, and stains cartilage blue and can be used with H&E, and with van Gieson stains), a van Gieson stain (stains collagen red, nuclei blue, and erythrocytes and cytoplasm yellow, and can be combined with an elastin stain that stains elastin blue/black), a reticulin stain, an Azan stain, a Giemsa stain, a Toluidine blue stain, an is
- the discrete attribute value 124 for a given locus 122 for a given probe spot 126 in a given image 125 is a number in the set $\{0, 1, \ldots, 100\}$. In some embodiments, the discrete attribute value 124 for a given locus 122 for a given probe spot 126 in a given image 125 is a number in the set $\{0, 1, \ldots, 50\}$. In some embodiments, the discrete attribute value 124 for a given locus 122 for a given probe spot 126 in a given image 125 is a number in the set $\{0, 1, \ldots, 30\}$. In some embodiments, the discrete attribute value 124 for a given locus 122 for a given probe spot 126 in a given image 125 is a number in the set $\{0, 1, \ldots, N\}$, where N is a positive integer.
- the discrete attribute value dataset 120 includes discrete attribute values for 25 or more, 50 or more, 100 or more, 250 or more, 1000 or more, 3000 or more, 5000 or more, 10,000 or more, or 15,000 or more loci 122 in each probe spot 126 in each image 125 in each spatial projection 121 represented by the dataset.
- the discrete attribute value dataset 120 includes discrete attribute values 124 for the loci of 500 or more probe spots, 5000 or more probe spots, 100,000 or more probe spots, 250,000 or more probe spots, 500,000 or more probe spots, 1,000,000 or more probe spots, 10 million or more probe spots, or 50 million or more probe spots for each image 125 of each spatial projection 121 in the discrete attribute value dataset 120.
- the discrete attribute value dataset 120 includes discrete attribute values for 50 or more, 100 or more, 250 or more, 500 or more, 1000 or more, 3000 or more, 5000 or more, 10,000 or more, or 15,000 or more analytes in each probe spot 126 in each image 125 in each spatial projection 121 represented by the dataset.
- the systems and methods of the present disclosure support very large discrete attribute value datasets 120 that may have difficulty being stored in the persistent memory 112 of conventional devices due to persistent memory 112 size limitations in conventional devices.
- the systems and methods of the present disclosure are designed for data in which the sparsity of the dataset is significantly more than twenty percent (e.g., at least 40% of the values in the dataset are zero, at least 50% of the values in the dataset are zero, at least 60% of the values in the dataset are zero, at least 70% of the values in the dataset are zero, at least 80% of the values in the dataset are zero, or at least 90% of the values in the dataset are zero).
- the number of zero-valued elements divided by the total number of elements is called the sparsity of the matrix (which is equal to 1 minus the density of the matrix).
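The sparsity definition above translates directly into code. The sketch below is a minimal illustration; the function names `sparsity` and `density` and the small count matrix are hypothetical and chosen only for the example.

```python
def sparsity(matrix):
    """Fraction of elements equal to zero: zero-valued elements / total elements."""
    total = sum(len(row) for row in matrix)
    if total == 0:
        raise ValueError("matrix has no elements")
    zeros = sum(1 for row in matrix for value in row if value == 0)
    return zeros / total


def density(matrix):
    """Density is 1 minus the sparsity."""
    return 1.0 - sparsity(matrix)


# Hypothetical 4 x 4 count matrix (rows: probe spots, columns: loci).
counts = [
    [0, 0, 3, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
    [2, 0, 0, 0],
]
```

Here 13 of the 16 entries are zero, so the sparsity is 13/16 and the density is 3/16.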
- the discrete attribute value dataset 120 is represented in a compressed sparse matrix representation that may be searched both on a locus 122 basis and on a probe spot 126 basis.
- the discrete attribute value dataset 120 redundantly represents the corresponding discrete attribute value 124 for each locus 122 in a plurality of loci for each respective probe spot 126 in a plurality of probe spots of an image 125 in a spatial projection 121 in both a compressed sparse row format and a compressed sparse column format in which loci for a respective probe spot that have a null discrete attribute data value are optionally discarded.
- the average density of the gene barcode matrices that are used in the systems and methods of the present disclosure is on the order of two percent.
- loci (e.g., genes)
- the sparse matrix allows the dataset to fit in persistent memory 112.
- the memory footprint is still too high once the data for half a million probe spots 126 or more is used.
- both the row-oriented and column-oriented sparse-matrix representations of the data are stored in persistent memory 112 in some embodiments in compressed blocks (e.g., bgzf blocks) to support quick differential-expression analysis, which requires examination of the data (e.g., the discrete attribute values of loci) for individual probe spots.
- access to the discrete attribute data for gene 3 works by looking at the address in the dataset for gene 3, which thereby identifies the block in which the data for gene 3 resides.
- the address of the individual probe spot is first needed.
- the discrete attribute value dataset 120 is stored in compressed sparse row (CSR) format.
- here, the term “compressed sparse row” is used interchangeably with the term “compressed sparse column” (CSC) format.
- the CSR format stores a sparse m x n matrix M in row form using three (one-dimensional) arrays (A, IA, JA).
- NNZ denotes the number of nonzero entries in M (note that zero-based indices shall be used here)
- the array A is of length NNZ and holds all the nonzero entries of M in left-to-right top-to-bottom (“row-major”) order.
- the array IA is of length m + 1. It is defined by this recursive definition: $IA[0] = 0$, and $IA[i] = IA[i-1] + (\text{number of nonzero elements in row } i-1 \text{ of } M)$.
- [00190] is a 4 x 4 matrix with 4 nonzero elements, hence
- the discrete attribute value dataset 120 is also stored in compressed sparse column (CSC or CCS) format.
- a CSC is similar to CSR except that values are read first by column, a row index is stored for each value, and column pointers are stored.
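The CSR and CSC layouts described above can be sketched as follows. This is an illustrative reading rather than the disclosed implementation: the helper names `to_csr`, `to_csc`, and `csr_row` are hypothetical, and the 4 x 4 matrix `M` is an assumed example with four nonzero entries, since the matrix of the original example is not reproduced in this text.

```python
def to_csr(matrix):
    """Return the three CSR arrays (A, IA, JA) using zero-based indices."""
    A, IA, JA = [], [0], []
    for row in matrix:
        for col, value in enumerate(row):
            if value != 0:
                A.append(value)   # nonzero values in row-major order
                JA.append(col)    # column index of each stored value
        IA.append(len(A))         # IA[i + 1] = IA[i] + nonzeros in row i
    return A, IA, JA


def to_csc(matrix):
    """CSC reads values by column; it equals the CSR form of the transpose.

    Returns (values, column pointers, row indices).
    """
    transposed = [list(column) for column in zip(*matrix)]
    return to_csr(transposed)


def csr_row(A, IA, JA, i):
    """Return the (column, value) pairs of row i without scanning other rows."""
    return [(JA[k], A[k]) for k in range(IA[i], IA[i + 1])]


# Assumed example: a 4 x 4 matrix with 4 nonzero elements.
M = [
    [0, 0, 0, 0],
    [5, 8, 0, 0],
    [0, 0, 3, 0],
    [0, 6, 0, 0],
]
A, IA, JA = to_csr(M)
values, col_ptr, row_idx = to_csc(M)
```

Storing both forms lets a lookup by row use `IA` and a lookup by column use `col_ptr`, at the cost of holding the nonzero values twice.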
- the discrete attribute value dataset 120 is compressed in accordance with a blocked compression algorithm. In some such embodiments, this involves compressing the A and JA data structures but not the IA data structures using a block compression algorithm such as bgzf and storing this in persistent memory 112. Moreover, an index for compressed A and an index for compressed JA enable random seeks of the compressed data.
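The idea of compressing an array in blocks and keeping an index for random seeks can be sketched as below. Note the assumptions: Python's standard `zlib` module stands in for bgzf, the block size of four values is arbitrary, and the names `compress_blocks` and `read_value` are hypothetical. The sketch shows the indexing principle only, not the actual file layout.

```python
import struct
import zlib


def compress_blocks(values, block_size=4):
    """Compress non-negative integers in fixed-size blocks.

    Returns (blob, index), where index[b] is the (byte offset, byte length)
    of compressed block b inside blob.
    """
    blob = bytearray()
    index = []
    for start in range(0, len(values), block_size):
        chunk = values[start:start + block_size]
        raw = struct.pack(f"<{len(chunk)}I", *chunk)  # little-endian uint32
        packed = zlib.compress(raw)
        index.append((len(blob), len(packed)))
        blob.extend(packed)
    return bytes(blob), index


def read_value(blob, index, position, block_size=4):
    """Random access: decompress only the block that holds `position`."""
    offset, length = index[position // block_size]
    raw = zlib.decompress(blob[offset:offset + length])
    return struct.unpack_from("<I", raw, 4 * (position % block_size))[0]


# Hypothetical contents of a CSR value array.
A = [5, 8, 3, 6, 1, 9, 2, 7, 4]
blob, index = compress_blocks(A)
```

A lookup such as `read_value(blob, index, 5)` touches one compressed block rather than the whole array, which is the property the index is meant to provide.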
- the discrete attribute value dataset 120 represents a whole transcriptome sequencing (RNA-seq) experiment that quantifies gene expression from a probe spot in counts of transcript reads mapped to the genes.
- a discrete attribute value dataset 120 represents a sequencing experiment in which baits are used to selectively filter and pull down a gene set of interest as disclosed, for example, in United States Provisional Patent Application No. 62/979,889, entitled “Capturing Targeted Genetic Targets Using a Hybridization/Capture Approach,” filed February 21, 2020, attorney docket number 104371-5028-PR02, which is hereby incorporated by reference.
- the clustering is done prior to implementation of the disclosed methods.
- the discrete attribute value dataset 120 already includes the cluster assignments for each probe spot in the discrete attribute dataset.
- the principal components are computed using the discrete attribute values of each instance of a probe spot across each image 125 across each spatial projection 121 of the discrete attribute dataset 120.
- the principal components are computed for only a subset of the discrete attribute values of each instance of a probe spot across each image 125 across each spatial projection 121 of the discrete attribute dataset 120.
- the principal components are computed for a select set of loci 122 (rather than all the loci) across each image 125 across each spatial projection 121 of the discrete attribute dataset 120.
- the principal components are computed for the discrete attribute values of each instance of a probe spot across each image 125 for a single spatial projection 121 of the discrete attribute dataset 120. In some alternative embodiments, the principal components are computed for the discrete attribute values of each instance of a probe spot across each image 125 across a subset of the spatial projections 121 of the discrete attribute dataset 120. In some embodiments, a user selects this subset. In some alternative embodiments, the principal components are computed for the discrete attribute values of each instance of a probe spot across a subset of the images 125 across each spatial projection 121 of the discrete attribute dataset 120. For instance, in some embodiments, a single channel (single image type) is user selected and the principal components are computed for the discrete attribute values of each instance of a probe spot across this single channel across each spatial projection 121 of the discrete attribute dataset 120.
- Principal component analysis is a mathematical procedure that reduces a number of correlated variables into a smaller number of uncorrelated variables called “principal components.”
- the first principal component is selected such that it accounts for as much of the variability in the data as possible, and each succeeding component accounts for as much of the remaining variability as possible.
- the purpose of PCA is to discover or to reduce the dimensionality of the dataset, and to identify new meaningful underlying variables.
- PCA is accomplished by establishing actual data in a covariance matrix or a correlation matrix.
- the mathematical technique used in PCA is called eigen analysis: one solves for the eigenvalues and eigenvectors of a square symmetric matrix with sums of squares and cross products. The eigenvector associated with the largest eigenvalue has the same direction as the first principal component.
- the eigenvector associated with the second largest eigenvalue determines the direction of the second principal component.
- the sum of the eigenvalues equals the trace of the square matrix and the maximum number of eigenvectors equals the number of rows (or columns) of this matrix. See, for example, Duda, Hart, and Stork, Pattern Classification, Second Edition, John Wiley & Sons, Inc., NY, 2000, pp. 115-116, which is hereby incorporated by reference.
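The eigen analysis outlined above can be illustrated with a short, dependency-free Python sketch. It builds a sample covariance matrix and uses power iteration, a swapped-in numerical technique chosen for brevity, to find the eigenvector with the largest eigenvalue, which gives the direction of the first principal component. The function names, the iteration count, and the toy data are assumptions; the sketch also assumes the starting vector is not orthogonal to the leading eigenvector and that the data are not constant.

```python
def covariance_matrix(rows):
    """Sample covariance of the columns of `rows` (observations x variables)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    return [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]


def first_principal_component(cov, iterations=500):
    """Power iteration: returns (largest eigenvalue, unit eigenvector)."""
    d = len(cov)
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    # Rayleigh quotient v^T C v gives the eigenvalue for the unit vector v.
    eigenvalue = sum(
        v[i] * sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)
    )
    return eigenvalue, v


# Toy data: four observations of two strongly correlated variables.
data = [[2.0, 1.9], [4.0, 4.1], [6.0, 6.2], [8.0, 7.8]]
cov = covariance_matrix(data)
eigenvalue, direction = first_principal_component(cov)
trace = sum(cov[i][i] for i in range(len(cov)))
```

Since the eigenvalues sum to the trace, the ratio `eigenvalue / trace` is the share of total variance captured by the first principal component; for this 2 x 2 case the second eigenvalue is simply `trace - eigenvalue`.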
- principal components are not clustered but, rather the discrete attribute values of each instance of a probe spot across each image 125 across each spatial projection 121 of the discrete attribute dataset 120 are clustered instead.
- only a subset of the discrete attribute values of each instance of a probe spot across each image 125 across each spatial projection 121 of the discrete attribute dataset 120 are clustered.
- the discrete attribute values for a select set of loci 124 (rather than all the loci) across each image 125 across each spatial projection 121 of the discrete attribute dataset 120 are clustered.
- the discrete attribute values of each instance of a probe spot across each image 125 for a single spatial projection 121 of the discrete attribute dataset 120 are clustered. In some alternative embodiments, the discrete attribute values of each instance of a probe spot across each image 125 across a subset of the spatial projections 121 of the discrete attribute dataset 120 are clustered. In some embodiments, a user selects this subset. In some alternative embodiments, the discrete attribute values of each instance of a probe spot across a subset of the images 125 across each spatial projection 121 of the discrete attribute dataset 120 are clustered.
- a single channel (single image type) is user selected and the discrete attribute values of each instance of a probe spot across this single channel across each spatial projection 121 of the discrete attribute dataset 120 are clustered.
- the above-described clustering, e.g., of the principal component values and/or the discrete attribute values
- the discrete attribute value dataset 120 includes the cluster assignment 158 of each probe spot, as illustrated in Figure 1B.
- the cluster assignment of each probe spot 126 is not performed prior to storing the discrete attribute value dataset 120 but rather all the principal component analysis computation of the principal component values 164 is performed prior to storing the discrete attribute value dataset 120.
- clustering is performed by the clustering module 152 of Figure 1A.
- $\vec{X}_{10} = \{X_1, X_2, X_3, X_4, X_5, X_6, X_7, X_8, X_9, X_{10}\}$
- $X_i$ is the discrete attribute value 124 for the locus $i$ 122 associated with the probe spot 126 in a given spatial projection.
- the discrete attribute value dataset 120 includes mRNA data from one or more probe spot types (classes, e.g., diseased state and non-diseased state), two or more probe spot types, or three or more probe spot types.
- probe spots of like type will tend to have like values for mRNA across the set of loci (mRNA) and therefore cluster together.
- the discrete attribute value dataset 120 includes class a: probe spots from subjects that have a disease
- class b: probe spots from subjects that do not have a disease
- an ideal clustering classifier will cluster the discrete attribute value dataset 120 into two groups, with one cluster group uniquely representing class a and the other cluster group uniquely representing class b.
- each probe spot 126 is associated with ten principal component values that collectively represent the variation in the discrete attribute values of a large number of loci 122 of a given probe spot with respect to the discrete attribute values of corresponding loci 122 of other probe spots in the dataset. This can be for a single image 125, across all or a subset of images in a single spatial projection 121, or across all or a subset of the images in all or a subset of a plurality of spatial projections 121 in a discrete attribute value dataset 120. In such instances, each probe spot 126 can be expressed as a vector:
- $\vec{X}_{10} = \{X_1, X_2, X_3, X_4, X_5, X_6, X_7, X_8, X_9, X_{10}\}$
- $X_i$ is the principal component value 164 $i$ associated with the probe spot 126.
- the discrete attribute value dataset 120 includes mRNA data from one or more probe spot types (e.g., diseased state and non-diseased state), two or more probe spot types, or three or more probe spot types.
- probe spots of like type will tend to have like values for mRNA across the set of loci (mRNA) and therefore cluster together.
- the discrete attribute value dataset 120 includes class a: probe spots from subjects that have a disease
- class b: probe spots from subjects that do not have a disease
- an ideal clustering classifier will cluster the discrete attribute value dataset 120 into two groups, with one cluster group uniquely representing class a and the other cluster group uniquely representing class b.
- s(x, x') is a symmetric function whose value is large when x and x' are somehow “similar.”
- An example of a nonmetric similarity function s(x, x') is provided on page 216 of Duda 1973.
- clustering requires a criterion function that measures the clustering quality of any partition of the data. Partitions of the dataset that extremize the criterion function are used to cluster the data. See page 217 of Duda 1973. Criterion functions are discussed in Section 6.8 of Duda 1973.
- each respective vector in the plurality of vectors comprises the discrete attribute values 124 across the loci 122 of a corresponding probe spot 126 (or principal components derived therefrom) includes, but is not limited to, hierarchical clustering (agglomerative clustering using nearest-neighbor algorithm, farthest-neighbor algorithm, the average linkage algorithm, the centroid algorithm, or the sum-of-squares algorithm), k-means clustering, fuzzy k-means clustering algorithm, and Jarvis-Patrick clustering.
- the clustering module 152 clusters the discrete attribute value dataset 120 using the discrete attribute value 124 for each locus 122 in the plurality of loci for each respective probe spot 126 in the plurality of probe spots, or principal component values 164 derived from the discrete attribute values 124, across one or more images in one or more spatial projections in the discrete attribute value dataset 120 thereby assigning each respective probe spot 126 in the plurality of probe spots to a corresponding cluster 158 in a plurality of clusters and thereby assigning a cluster attribute value to each respective probe spot in the plurality of probe spots of each image used in the analysis.
- k-means clustering is used.
- the goal of k-means clustering is to cluster the discrete attribute value dataset 120 based upon the principal components or the discrete attribute values of individual probe spots into K partitions.
- K is a number between 2 and 50 inclusive.
- the number K is set to a predetermined number such as 10.
- the number K is optimized for a particular discrete attribute value dataset 120.
- a user sets the number K using the visualization module 119.
- Figure 4 illustrates an instance in which the multichannel-aggr dataset 120, constituting data from a plurality of probe spots has been clustered into eleven clusters 158.
- the user selects in advance how many clusters the clustering algorithm will compute prior to clustering. In some embodiments, no predetermined number of clusters is selected. Instead, clustering is performed until predetermined convergence criteria are achieved. In embodiments where a predetermined number of clusters is determined, k-means clustering of the present disclosure is then initialized with K cluster centers $\mu_1, \ldots, \mu_K$ randomly initialized in two-dimensional space.
- a vector X is constructed of each principal component value 164 associated with the respective probe spot 126.
- K is equal to 10
- ten such vectors X are selected to be the centers of the ten clusters.
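- each remaining vector $\vec{X}_i$, corresponding to the probe spots 126 that were not selected to be cluster centers, is assigned to its closest cluster center: $$C_k = \{\, i : k = \arg\min_{j} \lVert \vec{X}_i - \mu_j \rVert^2 \,\}$$ where $C_k$ is the set of examples closest to $\mu_k$ using the objective function: $$J = \sum_{i} \sum_{k=1}^{K} r_{ik} \lVert \vec{X}_i - \mu_k \rVert^2$$ where $\mu_1, \ldots, \mu_K$ are the K cluster centers and $r_{ik} \in \{0, 1\}$ is an indicator denoting whether a probe spot 126 $\vec{X}_i$ belongs to a cluster $k$.
- new cluster centers $\mu_k$ are recomputed (mean / centroid of the set $C_k$): $$\mu_k = \frac{1}{|C_k|} \sum_{i \in C_k} \vec{X}_i$$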
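The alternating assignment and centroid-update steps of k-means can be sketched as follows. This is an illustrative implementation of standard k-means (Lloyd's algorithm), assuming each probe spot is a row vector of principal component values; the function name `kmeans`, its parameters, and the toy data are hypothetical, and this is not the clustering module 152 itself.

```python
import numpy as np

def kmeans(X, K, n_iter=100, seed=0):
    """Plain k-means: K data vectors are picked at random as initial centers."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=K, replace=False)].copy()
    for _ in range(n_iter):
        # Assignment step: squared Euclidean distance to every center.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Update step: each center becomes the mean of its assigned vectors.
        new_centers = centers.copy()
        for k in range(K):
            members = X[labels == k]
            if len(members):
                new_centers[k] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    # Final assignment and objective (sum of squared distances to own center).
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    labels = d2.argmin(axis=1)
    objective = d2[np.arange(len(X)), labels].sum()
    return labels, centers, objective

# Toy data: two well-separated groups of 20 vectors each.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.1, (20, 2)), rng.normal(5.0, 0.1, (20, 2))])
labels, centers, objective = kmeans(X, K=2)
```

The per-vector distance to its own centroid, available from `d2`, is one candidate for the kind of per-probe-spot score that could be stored as a cluster attribute value.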
- the k-means clustering computes a score for each respective probe spot 126 that takes into account the distance between the respective probe spot and the centroid of the cluster 158 to which the respective probe spot has been assigned. In some embodiments, this score is stored as the cluster attribute value 160 for the probe spot 126.
- Once the clusters are identified, as illustrated in Figure 4, individual clusters can be selected to display. For instance, referring to Figure 4, affordances 440 are individually selected or deselected to display or remove from the display the corresponding cluster 158.
- each respective cluster 158 in the plurality of clusters consists of a unique different subset of the second plurality of probe spots 126.
- this clustering loads less than the entirety of the discrete attribute value dataset 120 into the non-persistent memory 111 at any given time during the clustering. For instance, in embodiments where the discrete attribute value dataset 120 has been compressed using bgzf, only a subset of the blocks of the discrete attribute value dataset 120 are loaded into non-persistent memory during the clustering of the discrete attribute value dataset 120.
- the subset of blocks of the discrete attribute value dataset 120 has been loaded from persistent memory 112 into non-persistent memory 111 and processed in accordance with the clustering algorithm (e.g., k-means clustering)
- the subset of blocks of data is discarded from non-persistent memory 111 and a different subset of blocks of the discrete attribute value dataset 120 is loaded from persistent memory 112 into non-persistent memory 111 and processed in accordance with the clustering algorithm of the clustering module 152.
- k-means clustering is used to assign probe spots 126 to clusters 158.
- the k-means clustering uses as input the principal component values 164 for each probe spot 126 as the basis for clustering the probe spots into clusters.
- the k-means algorithm computes like clusters of probe spots from the higher dimensional data (the set of principal component values) and then after some resolution, the k-means clustering tries to minimize error.
- the k-means clustering provides cluster assignments 158, which are recorded in the discrete attribute value dataset 120.
- the user decides in advance how many clusters 158 there will be.
- this feature of k-means clustering is exploited by running a series of k-means clustering runs, with each different run having a different number of clusters (a different value for K).
- a separability score (e.g., a quality score)
- the clustering that is displayed by default in such embodiments is the k-means clustering (1, ... N) that has the highest separability score.
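Running k-means for several values of K and keeping the run with the highest score can be sketched as below. The disclosure does not define the separability score, so this example assumes a Calinski-Harabasz-style ratio of between-cluster to within-cluster dispersion purely for illustration; the deterministic farthest-point initialization and all function names are likewise hypothetical choices made so the toy example is reproducible.

```python
import numpy as np

def farthest_point_init(X, K):
    """Deterministic seeding: start at X[0], then repeatedly take the farthest point."""
    centers = [X[0]]
    d2 = ((X - centers[0]) ** 2).sum(axis=1)
    for _ in range(1, K):
        centers.append(X[d2.argmax()])
        d2 = np.minimum(d2, ((X - centers[-1]) ** 2).sum(axis=1))
    return np.array(centers)

def kmeans_labels(X, K, n_iter=50):
    centers = farthest_point_init(X, K)
    for _ in range(n_iter):
        labels = ((X[:, None] - centers[None]) ** 2).sum(axis=2).argmin(axis=1)
        for k in range(K):
            if np.any(labels == k):
                centers[k] = X[labels == k].mean(axis=0)
    return ((X[:, None] - centers[None]) ** 2).sum(axis=2).argmin(axis=1)

def separability(X, labels):
    """Assumed score: between-cluster over within-cluster dispersion, each
    normalized by its degrees of freedom (Calinski-Harabasz style)."""
    ks = np.unique(labels)
    overall = X.mean(axis=0)
    within = between = 0.0
    for k in ks:
        members = X[labels == k]
        center = members.mean(axis=0)
        within += ((members - center) ** 2).sum()
        between += len(members) * ((center - overall) ** 2).sum()
    return (between / (len(ks) - 1)) / (within / (len(X) - len(ks)))

# Toy data: three well-separated groups of 30 vectors each.
rng = np.random.default_rng(0)
offsets = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
X = np.vstack([rng.normal(c, 0.5, (30, 2)) for c in offsets])
scores = {K: separability(X, kmeans_labels(X, K)) for K in range(2, 7)}
best_K = max(scores, key=scores.get)
```

Any other cluster-quality measure could be substituted for `separability` without changing the selection loop.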
- each cluster 158 is displayed in a different color. In other embodiments, each cluster 158 is displayed with a different dot pattern or hash pattern.
- the k-means clustering algorithm described herein elucidates like clusters 158 within the data. There is no guarantee that all the clusters 158 represent physiologically significant events. In other words, a priori, it is not known what the clusters 158 mean in some instances. What is known is that the algorithm has determined that there are differences between the probe spots 126 that are being represented by different colors or different hash patterns or symbols.
- the systems and methods of the present disclosure provide tools for determining whether there is any meaning behind the differences between the clusters such as the heat map of panel 404.
- a Louvain modularity algorithm is used. See, Blondel et al., July 25, 2008, “Fast unfolding of communities in large networks,” arXiv:0803.0476v2 [physics.soc-ph], which is hereby incorporated by reference.
- the user can choose a clustering algorithm.
- the user can choose between at least k-means clustering and a Louvain modularity algorithm.
- clustering the dataset comprises application of a Louvain modularity algorithm to a map, the map comprising a plurality of nodes and a plurality of edges.
- each node in the plurality of nodes represents a probe spot in the plurality of probe spots.
- the coordinates in N-dimensional space of a respective node in the plurality of nodes are a set of principal components of the corresponding probe spot in the plurality of probe spots.
- the set of principal components is derived from the corresponding discrete attribute values of the plurality of loci for the corresponding probe spot, where N is the number of principal components in each set of principal components.
- An edge exists in the plurality of edges between a first node and a second node in the plurality of nodes when the first node is among the k nearest neighboring nodes of the second node in the plurality of nodes, where the k nearest neighboring nodes to the second node are determined by computing a distance in the N-dimensional space between each node in the plurality of nodes, other than the second node, and the second node.
- the distance is a Euclidean distance.
- other distance metrics are used (e.g., Chebyshev distance, Mahalanobis distance, Manhattan distance, etc.).
- the nodes and the edges are not weighted for the Louvain modularity algorithm.
- each node and each edge receives the same weight in such embodiments
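The map that serves as input to the Louvain modularity algorithm can be sketched as an unweighted k-nearest-neighbor graph. The example below only builds the edge set using Euclidean distance; it does not implement the Louvain algorithm itself. The function name `knn_edges` and the toy coordinates are hypothetical.

```python
import numpy as np

def knn_edges(coords, k):
    """Unweighted, undirected k-nearest-neighbor edges.

    coords: (n_nodes, N) array of principal component coordinates.
    An edge {first, second} is added when `first` is among the k nearest
    neighbors of `second` (Euclidean distance, node itself excluded).
    """
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)            # a node is not its own neighbor
    edges = set()
    for second in range(len(coords)):
        for first in np.argsort(dist[second])[:k]:
            edges.add(frozenset((int(first), second)))
    return edges

# Toy data: four nodes on a line.
coords = np.array([[0.0, 0.0], [1.0, 0.0], [2.5, 0.0], [10.0, 0.0]])
edges = knn_edges(coords, k=1)
```

Storing each edge as a `frozenset` keeps the graph undirected, so a pair that are mutual nearest neighbors contributes a single edge.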
- Block 218 - Display, in a first window, pixel values of all or a portion of an image in the one or more images of a first projection.
- the methods continue by displaying, in a first window on the display, pixel values of all or a portion of a first two-dimensional image in the one or more two-dimensional images of the first projection.
- in Figure 17A, the image is illustrated, and those pixels corresponding to where the sample was overlaid on a substrate, that is, region 1750, are shown colored against a grey background. The grey background reflects the regions where no sample was overlaid on the substrate.
- Each round spot in region 1750 of the image represents a probe spot.
- Block 220 - Overlay cluster indicia. Referring to block 220 of Figure 2B, the method continues by displaying on the first two-dimensional image and co-aligned with the first two-dimensional image (i) first indicia for each probe spot in the plurality of probe spots that have been assigned to a first cluster in the plurality of clusters and (ii) second indicia for each probe spot in the plurality of probe spots that have been assigned to a second cluster in the plurality of clusters, thereby identifying the morphological pattern.
- each such cluster is assigned a different graphic or color code.
- a morphological pattern can provide valuable insight into the underlying biological sample.
- the morphological patterns can be used to determine a disease state of the biological sample.
- the morphological pattern can be used to recommend a therapeutic treatment for the donor of the biological sample.
- lymphocytes may have different expression profiles than the tumor cells.
- the lymphocytes may cluster (e.g., through the clustering described above in conjunction with block 206) into the first cluster and thus each probe spot corresponding to portions of a tissue sample in which lymphocytes are present may have the first indicia for the first cluster.
- the tumor cells may cluster into the second cluster and thus each probe spot in which lymphocytes are not present may have the second indicia for the second cluster.
- the morphological pattern of lymphocyte infiltration into the tumor could be documented by probe spots bearing first indicia (representing the lymphocytes) amongst the probe spots bearing second indicia (representing the tumor cells).
- the morphological pattern exhibited by the lymphocyte infiltration into the tumor would be associated with a favorable diagnosis whereas the inability of lymphocytes to infiltrate the tumor would be associated with an unfavorable diagnosis.
- the spatial relationship (morphological pattern) of cell types in heterogeneous tissue can be used to analyze tissue samples.
- cancerous cells associated with the tumor will have different expression profiles than normal cells.
- the cancerous cells may cluster (e.g., through the clustering described above in conjunction with block 206) into a first cluster using the disclosed methods and thus each probe spot corresponding to portions of a tissue sample in which the cancerous cells are present will have the first indicia for the first cluster.
- the normal cells may cluster into the second cluster and thus each probe spot corresponding to portions of the tissue sample in which cancerous cells are not present will have the second indicia for the second cluster. If this is the case, the morphological pattern of cancer cell metastasis, or the morphology of a tumor (e.g., shape and extent within a normal healthy tissue sample) can be documented by probe spots bearing first indicia (representing cancerous cells) amongst the probe spots bearing second indicia (representing normal cells).
- the lower panel 502 is arranged by rows and columns. Each row corresponds to a different locus. Each column corresponds to a different cluster. Each cell, then, illustrates the fold change (e.g., log2 fold change) of the average discrete attribute value 124 for the locus 122 represented by the row the cell is in across the probe spots 126 of the cluster represented by the column the cell is in compared to the average discrete attribute value 124 of the respective locus 122 in the probe spots in the remainder of the clusters represented by the discrete attribute value dataset 120.
- fold change (e.g., log2 fold change)
- the lower panel 502 has two settings.
- the first is a hierarchical clustering view of significant loci 122 per cluster.
- log2 fold change in expression refers to the log2 fold value of (i) the average number of transcripts (discrete attribute value) measured in each of the probe spots of the subject cluster that map to a particular gene (locus 122) and (ii) the average number of transcripts measured in each of the probe spots of all clusters other than the subject cluster that map to the particular gene.
- selection of a particular locus (row) in the lower panel 502 of Figure 5 causes the locus (feature) associated with that row to be an active feature that is posted to the active feature list 506.
- the locus “CCDC80” from lower panel 502 has been selected and so the locus “CCDC80” is in the active feature list 506.
- the active feature list 506 is a list of all features that a user has either selected (e.g., “CCDC80”) or uploaded.
- the expression patterns of those features are displayed in panel 504 of Figure 5. If more than one feature is in the active feature list 506, then the expression pattern that is displayed in panel 504 corresponds to a combination (measure of central tendency) of all the features.
- each respective probe spot in the discrete attribute value dataset 120, regardless of which cluster the probe spot is in, is illuminated with an intensity, color, or other form of display attribute that is commensurate with a number of transcripts (e.g., log2 of expression) of the single active feature CCDC80 that is present in the respective probe spot 126 in the upper panel 504.
- the scale & attribute parameters 510 control how the expression patterns are rendered in the upper panel 504.
- toggle 512 sets which scale value to display (e.g., Log2, linear, log-normalized).
- the top right menu sets how to combine values when there are multiple features in the Active Feature List. For instance, in the case where two features (e.g., loci) have been selected for the active feature list 506, toggle 514 can be used to display, in each probe spot, the feature minimum, feature maximum, feature sum, or feature average.
- each respective probe spot is assigned a color on the color scale that is commensurate with a minimum expression value, that is, the expression of A or the expression of B, whichever is lower.
- each respective probe spot is independently evaluated for the expression of A and B at the respective probe spot, and the probe spot is colored by the lowest expression value of A and B.
- toggle 514 can be used to select the maximum feature value from among the features in the active feature list 506 for each probe spot, or to sum the feature values across the features in the active feature list 506 for each probe spot or to provide a measure of central tendency, such as average, across the features in the active feature list 506 for each probe spot.
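The per-probe-spot combination of multiple active features can be illustrated with a small sketch. The array layout (one row per active feature, one column per probe spot), the `COMBINERS` table, and the function name are assumptions made for this example.

```python
import numpy as np

# Hypothetical mapping from a combine mode to a NumPy reduction.
COMBINERS = {"min": np.min, "max": np.max, "sum": np.sum, "average": np.mean}

def combine_features(expr, mode):
    """expr: (n_features, n_spots) expression values; returns one value per spot."""
    return COMBINERS[mode](expr, axis=0)

expr = np.array([[1.0, 4.0, 0.0],    # feature A across three probe spots
                 [3.0, 2.0, 0.0]])   # feature B across the same probe spots
combined_min = combine_features(expr, "min")
combined_avg = combine_features(expr, "average")
```

With the minimum setting, each probe spot is evaluated independently, so the first spot takes the value of A and the second takes the value of B.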
- the select by count menu options 516 control how to filter the expression values displayed.
- the color palette 510 controls the color scale and range of values.
- the user can also choose to manually set the minimum and maximum of the color scale by unchecking an Auto-scale checkbox (not shown), typing in a value, and clicking an Update Min/Max button (not shown).
- spots with values outside the range (less than the minimum or greater than the maximum) are colored gray. This is particularly useful if there is a lot of noise or ambient expression of a locus or a combination of loci in the active feature list 506.
- Increasing the minimum value of the scale filters that noise. It is also useful to configure the scale to optimally highlight the expression of genes of interest.
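The effect of a manually set minimum and maximum can be sketched as a mapping from expression values to positions on the color scale, with out-of-range spots flagged for gray rendering. Using NaN as the gray marker and the function name `scale_positions` are assumptions of this example, not details from the disclosure.

```python
import numpy as np

def scale_positions(values, vmin, vmax):
    """Map values to [0, 1] positions on the color scale.

    Values below vmin or above vmax become NaN, which a renderer could
    draw in gray (an assumed convention for this sketch).
    """
    values = np.asarray(values, dtype=float)
    position = (values - vmin) / (vmax - vmin)
    position[(values < vmin) | (values > vmax)] = np.nan
    return position

positions = scale_positions([0.2, 1.0, 3.0, 5.0, 6.5], vmin=1.0, vmax=5.0)
```

Raising `vmin` moves low, noisy values into the gray category, which is the filtering effect described above.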
- color scale 508 shows the Log2 expression of CCDC80 ranging from 0.0 to 5.0.
- toggle 510 can be used to illustrate the relative expression of features in the active feature list 506 on a linear basis or a log-normalized basis.
- palette 510 can be used to change the color scale 508 to other colors, as well as to set the minimum and maximum values that are displayed.
- this p-value is annotated with a star system, in which four stars means there is a significant difference between the selected cluster (k-means cluster 158-1 in Figure 5) and the rest of the clusters for a given locus, whereas fewer stars means that there is a less significant difference in the discrete attribute value 124 (e.g., difference in expression) between the locus 122 in the selected cluster relative to all the other clusters.
- the ranking of the entire table is inverted so that the locus 122 associated with the least significant discrete attribute value 124 (e.g., least expressed) is at the top of the table.
- a respective user selection results in zooming the spatial analysis view into a region of the tissue (see e.g., Figure 8, which illustrates a zoomed-in region of Figure 7).
- the user selection comprises adjusting the zoom slider 802 (e.g., see the difference in the sizes of the plurality of probe spots between panels 704 and 804) and loading the appropriate tile corresponding to the desired location on the image.
- image tiles are retrieved based on the zoom level (of zoom slider 802) and position of the viewer with tiles retrieved for each active image (channel) concurrently.
- the discrete attribute value dataset 120 (e.g., .cloupe file) stores feature data (read counts) and feature metadata separately, and the feature metadata is stored in a serialized data structure, as shown in Figure 15.
- Figure 15 illustrates a discrete attribute dataset for a gene expression pipeline, capable of counting gene and non-gene features per probe spot, that includes a feature data module storing UMI counts per gene per probe spot and UMI counts of non-gene features (e.g., bound antibodies) per probe spot.
- a feature metadata module, also referred to herein as a “label class,” identifies a type of each row of the matrix.
- it can be a compressed JSON string and information stored in the table of contents can be in the form of CSC and CSR indices and data.
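For readers unfamiliar with compressed sparse row (CSR) storage, the indices-and-data layout mentioned above can be illustrated on a tiny count matrix. This is a generic sketch of the CSR format, not the on-disk layout of the dataset; the function names and the toy matrix are hypothetical.

```python
def to_csr(dense):
    """Encode a dense row-major matrix as CSR (data, indices, indptr)."""
    data, indices, indptr = [], [], [0]
    for row in dense:
        for col, value in enumerate(row):
            if value != 0:
                data.append(value)      # nonzero value
                indices.append(col)     # its column index
        indptr.append(len(data))        # where the next row starts in `data`
    return data, indices, indptr

def csr_row(data, indices, indptr, row, n_cols):
    """Rebuild one dense row from the CSR arrays."""
    out = [0] * n_cols
    for pos in range(indptr[row], indptr[row + 1]):
        out[indices[pos]] = data[pos]
    return out

dense = [[0, 3, 0],
         [0, 0, 0],
         [5, 0, 2]]
data, indices, indptr = to_csr(dense)
```

CSC is the same idea with the roles of rows and columns swapped, which favors per-column rather than per-row access.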
- Such approaches increase computational efficiency. For instance, representation of labels for two different types of loci in this manner improves the computational efficiency of visualization system 100, in part, by reducing the amount of data that needs to be processed in order to visualize the data for the loci of the first type and the loci of the second type.
- a new category, “Cell Receptor,” that was not in the loaded discrete attribute value dataset 120 was user defined by selecting a first class of probe spots 172-1-1 (“Wild Type”) using Lasso 552 and selecting displayed probe spots in the upper panel 420. A total of 452 probe spots 126 were selected from the Wild Type class. Further, a second class of probe spots 172-1-2 (“Variant”) was user defined using Lasso 552 to select the probe spots as illustrated in Figure 6. Next, the loci whose discrete attribute values 124 discriminate between the identified user defined classes “Wild Type” and “Variant” were computed.
- the locally distinguishing option 452 described above in conjunction with Figure 4 was used to identify the loci whose discrete attribute values discriminate between class 172-1-1 (Wild Type) and class 172-1-2 (Variant).
- the Wild Type class consisted of whole transcriptome mRNA transcript counts for 452 probe spots.
- the Variant class consisted of whole transcriptome mRNA transcript counts for 236 probe spots.
- Each column in the heat map shows the average expression of a corresponding gene across the probe spots of the corresponding class 172.
- the heat map includes more than 1000 different columns, each for a different human gene.
- the heat map shows which loci discriminate between the two classes.
- An absolute definition for what constitutes discrimination between the two classes is not provided because such definitions depend upon the technical problem to be solved.
- those of skill in the art will appreciate that many such metrics can be used to define such discrimination and any such definition is within the scope of the present disclosure.
- the computation and display of the heat map 402 took less than two seconds on the example system using the disclosed clustering module 152.
- for the second class 172-1-2, computing (i) a first measure of central tendency of the discrete attribute value for the respective locus measured in each of the probe spots in the plurality of probe spots of the second class 172-1-2 and (ii) a second measure of central tendency of the discrete attribute value for the respective locus measured in each of the probe spots in the first class 172-1-1 and the third class 172-1-3 collectively, and
- for the third class 172-1-3, computing (i) a first measure of central tendency of the discrete attribute value for the respective locus measured in each of the probe spots in the plurality of probe spots of the third class 172-1-3 and (ii) a second measure of central tendency of the discrete attribute value for the respective locus measured in each of the probe spots in the first class 172-1-1 and the second class 172-1-2 collectively.
- TNBC Triple negative breast cancer
- the assay in this example incorporates ~5000 molecularly barcoded, spatially encoded capture probes in probe spots 122 over which a tissue is placed, imaged, and permeabilized, capturing native mRNA in an unbiased fashion. Imaging and next-generation sequencing data were processed together resulting in gene expression mapped to image position. By capturing and sequencing polyadenylated RNA transcripts from 10 µm thick sections of tissue combined with histological visualization of the tissue, the Visium platform generated an unbiased map of gene expression of cells within the native tissue morphology.
- each representation of sequence reads in each subset represents a number of unique UMI, on a capture spot by capture spot basis, in the subsets of sequence reads on a color scale basis as outlined by respective scales 2010, 2012, and 2014.
- the techniques of this Example 5 are run on any of the discrete attribute value datasets of the present disclosure.
- each probe spot 126 has been assigned to a respective cluster 158
- the systems and methods of the present disclosure are able to compute, for each respective locus 122 in the plurality of loci for each respective cluster 158 in the plurality of clusters, a difference in the discrete attribute value 124 for the respective locus 122 across the respective subset of probe spots 126 in the respective cluster 158 relative to the discrete attribute value 124 for the respective locus 122 across the plurality of clusters 158 other than the respective cluster, thereby deriving a differential value 162 for each respective locus 122 in the plurality of loci for each cluster 158 in the plurality of clusters.
- differential expression is computed as the log2 fold change in (i) the average number of transcripts (discrete attribute value 124 for locus 122) measured in each of the probe spots 126 of the subject cluster 158 that map to a particular gene (locus 122) and (ii) the average number of transcripts measured in each of the probe spots of all clusters other than the subject cluster that map to the particular gene.
- the subject cluster contains 50 probe spots and on average each of the 50 probe spots contain 100 transcripts for gene A.
- the remaining clusters collectively contain 250 probe spots and, on average, each of the 250 probe spots contains 50 transcripts for gene A.
- the log2 fold change is computed in this manner for each gene in the human genome.
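The worked numbers above (an average of 100 transcripts per probe spot in the subject cluster versus 50 in the remaining clusters) can be checked with a short sketch. The function name is hypothetical, and handling of zero averages (for example with a pseudocount) is deliberately left out because the text does not specify it.

```python
import math

def log2_fold_change(mean_in_cluster, mean_in_rest):
    """log2 of the ratio of the two per-probe-spot averages."""
    return math.log2(mean_in_cluster / mean_in_rest)

# Gene A: 100 transcripts per spot in the subject cluster, 50 elsewhere.
lfc = log2_fold_change(100.0, 50.0)
```

Here the ratio is 2, so the log2 fold change is 1; a gene with the averages reversed would give -1.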
- the differential value 162 for each respective locus 122 in the plurality of loci for each respective cluster 158 in the plurality of clusters is a fold change in (i) a first measure of central tendency of the discrete attribute value 124 for the locus measured in each of the probe spots 126 in the plurality of probe spots in the respective cluster 158 and (ii) a second measure of central tendency of the discrete attribute value 124 for the respective locus 122 measured in each of the probe spots 126 of all clusters 158 other than the respective cluster.
- the first measure of central tendency is an arithmetic mean, weighted mean, midrange, midhinge, trimean, Winsorized mean, median, or mode of all the discrete attribute value 124 for the locus measured in each of the probe spots 126 in the plurality of probe spots in the respective cluster 158.
- the second measure of central tendency is an arithmetic mean, weighted mean, midrange, midhinge, trimean, Winsorized mean, median, or mode of all the discrete attribute value 124 for the locus 122 measured in each of the probe spots 126 in the plurality of probe spots 126 in all clusters other than the respective cluster.
- the fold change is a log2 fold change.
- the fold change is a log10 fold change.
- each discrete attribute value 124 is normalized prior to computing the differential value 162 for each respective locus 122 in the plurality of loci for each respective cluster 158 in the plurality of clusters.
- the normalizing comprises modeling the discrete attribute value 124 of each locus associated with each probe spot in the plurality of probe spots with a negative binomial distribution having a consensus estimate of dispersion without loading the entire dataset into non-persistent memory 111.
- Such embodiments are useful, for example, for RNA-seq experiments that produce discrete attribute values 124 for loci 122 (e.g., digital counts of mRNA reads that are affected by both biological and technical variation).
- the negative binomial distribution for a discrete attribute value 124 for a given locus 122 includes a dispersion parameter for the discrete attribute value 124, which tracks the extent to which the variance in the discrete attribute value 124 exceeds an expected value. See Yu,
- RNA-seq whole transcriptome sequencing
- sSeq is applied to the discrete attribute value 124 of each locus 122.
- sSeq is disclosed in Yu, 2013, “Shrinkage estimation of dispersion in Negative Binomial models for RNA-seq experiments with small sample size,” Bioinformatics 29, pp. 1275-1282, which is hereby incorporated by reference. sSeq scales very well with the number of genes that are being compared.
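To make the role of the dispersion parameter concrete: under the common negative binomial parameterization, the variance is $\mu + \phi\mu^2$, so $\phi$ measures how far the variance exceeds the mean. The sketch below computes a simple method-of-moments estimate of $\phi$ for one locus and then applies a generic linear shrinkage toward a shared target. It is an illustration only and is not the sSeq implementation; the `target` and `weight` inputs are hypothetical values chosen for the example.

```python
import numpy as np

def mm_dispersion(counts):
    """Method-of-moments dispersion: (variance - mean) / mean^2, floored at 0."""
    counts = np.asarray(counts, dtype=float)
    mean = counts.mean()
    var = counts.var(ddof=1)
    if mean == 0:
        return 0.0
    return max(0.0, (var - mean) / mean ** 2)

def shrink(raw, target, weight):
    """Generic linear shrinkage of a per-locus estimate toward a shared target."""
    return weight * target + (1.0 - weight) * raw

phi_over = mm_dispersion([4, 16, 10, 10])    # mean 10, sample variance 24
phi_flat = mm_dispersion([10, 10, 10, 10])   # no overdispersion
phi_shrunk = shrink(phi_over, target=0.04, weight=0.5)
```

A shared target pulls noisy per-locus estimates toward a consensus value, which is the general idea behind using a consensus estimate of dispersion when few replicates are available.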
- each cluster 158 may include hundreds, thousands, tens of thousands, hundreds of thousands, or more probe spots 126, and each respective probe spot 126 may contain mRNA expression data for hundreds, or thousands of different genes.
- sSeq is particularly advantageous when testing for differential expression in such large discrete attribute value datasets 120.
- RNA-seq methods sSeq is advantageously faster.
- Other single-probe spot differential expression methods exist and can be used in some embodiments, but they are designed for smaller-scale experiments.
- sSeq and, more generally, techniques that normalize discrete attribute values by modeling the discrete attribute value 124 of each locus 122 associated with each probe spot 126 in the plurality of probe spots with a negative binomial distribution having a consensus estimate of dispersion without loading the entire discrete attribute value dataset 120 into non-persistent memory 111, are practiced in some embodiments of the present disclosure.
- the discrete attribute values for each of the loci are examined in order to get a dispersion value for all the loci.
- the discrete attribute values are not all read from persistent memory 112 at the same time.
- discrete attribute values are obtained by traversing through blocks of compressed data, a few blocks at a time. That is, a set of blocks (e.g., consisting of a few compressed blocks) in the dataset is loaded into non-persistent memory from persistent memory and analyzed to determine which loci the set of blocks represents. An array of discrete attribute values across the plurality of probe spots, for each of the loci encoded in the set of blocks, is determined and used to calculate the variance, or other needed parameters, for these loci across the plurality of probe spots.
- This process is repeated: a new set of blocks is loaded into non-persistent memory from persistent memory, analyzed to determine which loci are encoded in the new set of blocks, and then used to compute the variance, or other needed parameters, across the plurality of probe spots for each of the loci encoded in the new set of blocks, before the set of blocks is discarded from non-persistent memory.
- a limited amount of the discrete attribute value dataset 120 is stored in non-persistent memory 111 at any given time (e.g., the data for a particular block that contain the discrete attribute values for a particular locus).
- systems and methods of the present disclosure are able to compute the variance in discrete attribute values for a given locus because, in some embodiments, the discrete attribute values for that particular locus across one or more images and/or one or more spatial projections 121 of the discrete attribute value dataset 120 are stored in a single bgzf block.
- the accessed set of bgzf blocks (which is a subset of the total number of bgzf blocks in the dataset), which had been loaded into non-persistent memory 111 to perform the computation, is dropped from non-persistent memory and another set of bgzf blocks for which such computations are to be performed is loaded into the non-persistent memory 111 from the persistent memory 112.
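The load, compute, and discard cycle described above can be sketched as follows. This is only an illustration under stated assumptions: real bgzf blocks are replaced by zlib-compressed JSON, each block is assumed to hold the values of exactly one locus across all probe spots, and the population variance is used as the per-locus statistic. The helper names `make_blocks` and `stream_locus_variance` and the parameter `blocks_per_load` are hypothetical.

```python
import json
import zlib
from statistics import pvariance

def make_blocks(dataset):
    """Stand-in for persistent storage: one compressed block per locus (an assumption)."""
    return [zlib.compress(json.dumps({locus: values}).encode("utf-8"))
            for locus, values in dataset.items()]

def stream_locus_variance(blocks, blocks_per_load=2):
    """Decompress a few blocks at a time, compute per-locus variance, then drop them."""
    variances = {}
    for start in range(0, len(blocks), blocks_per_load):
        # Only this small set of blocks is held in decompressed form.
        loaded = [json.loads(zlib.decompress(raw))
                  for raw in blocks[start:start + blocks_per_load]]
        for block in loaded:
            for locus, values in block.items():
                variances[locus] = pvariance(values)
        del loaded  # keep only the summary statistics, not the raw values
    return variances

# Toy dataset: three loci measured in four probe spots (illustrative values only).
dataset = {"g1": [1, 1, 1, 1], "g2": [0, 2, 0, 2], "g3": [0, 0, 0, 4]}
variances = stream_locus_variance(make_blocks(dataset))
```

Because each block carries every value for its locus, the statistic for that locus can be finished before the block is released, so peak memory depends on `blocks_per_load` rather than on the size of the dataset.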
- such processes run in parallel (e.g., one process for each locus) when there are multiple processing cores 102.
- each processing core concurrently analyzes a different respective set of blocks in the dataset and computes loci statistics for those loci represented in the respective set of blocks.
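A minimal sketch of this concurrent arrangement follows, assuming each set of blocks has already been decoded into dictionaries that map a locus to its values across probe spots. The names `locus_stats` and `parallel_locus_stats` are hypothetical. A thread pool keeps the example short and portable; for CPU-bound work in CPython, a process pool would be the more natural way to occupy several cores.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import mean, pvariance

def locus_stats(block_set):
    """Compute (mean, variance) for every locus encoded in one set of blocks."""
    return {locus: (mean(values), pvariance(values))
            for block in block_set
            for locus, values in block.items()}

def parallel_locus_stats(block_sets, workers=2):
    """Give each worker a different set of blocks and merge the per-locus results."""
    merged = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for partial in pool.map(locus_stats, block_sets):
            merged.update(partial)
    return merged

# Two sets of already-decoded blocks (illustrative values only).
block_sets = [
    [{"g1": [1, 3]}, {"g2": [2, 2]}],
    [{"g3": [0, 4]}],
]
stats = parallel_locus_stats(block_sets)
```

Since every locus lives in exactly one set of blocks in this sketch, the partial results never collide and can be merged with a plain dictionary update.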
- an average (or some other measure of central tendency) discrete attribute value 124 (e.g., count) of the locus 122 is calculated for each cluster 158 of probe spots 126.
- the average (or some other measure of central tendency) discrete attribute value 124 of locus A across all the probe spots 126 of the first cluster 158, and the average (or some other measure of central tendency) discrete attribute value 124 of locus A across all the probe spots 126 of the second cluster 158 are calculated and, from this, the differential value 162 for the locus with respect to the first cluster is calculated. This is repeated for each of the loci 122 in a given cluster. It is further repeated for each cluster 158 in the plurality of clusters. In some embodiments, there are other factors that are considered, like adjusting the initial estimate of the variance in the discrete attribute value 124 when the data proves to be noisy.
- the average (or some other measure of central tendency) discrete attribute value 124 of locus A across all the probe spots 126 of the first cluster 158 and the average (or some other measure of central tendency) discrete attribute value 124 of locus A across all the probe spots 126 of the remaining clusters 158 are calculated and used to compute the differential value 162.
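As a rough illustration of the one-cluster-versus-the-rest comparison, the sketch below computes a log2 fold change for a single locus from the arithmetic mean inside a cluster and the arithmetic mean across all other clusters. The function name `log2_fold_change` is hypothetical, and the `pseudocount` that guards against division by zero is an assumption of this example rather than something the text specifies.

```python
import math
from statistics import mean

def log2_fold_change(values, labels, cluster, pseudocount=1.0):
    """log2 of (mean inside the cluster) / (mean in all other clusters) for one locus."""
    inside = [v for v, c in zip(values, labels) if c == cluster]
    outside = [v for v, c in zip(values, labels) if c != cluster]
    return math.log2((mean(inside) + pseudocount) / (mean(outside) + pseudocount))

# Counts of one locus in six probe spots and their cluster labels (illustrative only).
counts = [7, 7, 1, 1, 1, 1]
labels = [0, 0, 1, 1, 2, 2]

# One differential value per cluster for this locus.
differential = {c: log2_fold_change(counts, labels, c) for c in sorted(set(labels))}
```

Repeating the call for every locus and every cluster yields the loci-by-clusters table of differential values; swapping `mean` for a median or another measure of central tendency changes only the two aggregation calls.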
- Example 6 - Display a heat map.
- the techniques of this Example 6 are run on any of the discrete attribute value datasets of the present disclosure.
- a heat map 402 of these differential values is displayed in a first panel 404 of an interface 400.
- the heat map 402 comprises a representation of the differential value 162 for each respective locus 122 in the plurality of loci for each cluster 158 in the plurality of clusters.
- Example 7 - Two-dimensional plot of the probe spots in the dataset.
- the techniques of this Example 7 are run on any of the discrete attribute value datasets of the present disclosure.
- a two-dimensional visualization of the discrete attribute value dataset 120 is also provided in a second panel 420.
- the two-dimensional visualization in the second panel 420 is computed by a back end pipeline that is remote from visualization system 100 and is stored as two-dimensional data points 166 in the discrete attribute value dataset 120 as illustrated in Figure 1B.
- the two-dimensional visualization 420 is computed by the visualization system.
- the two-dimensional visualization is prepared by computing a corresponding plurality of principal component values 164 for each respective probe spot 126 in the plurality of probe spots based upon respective values of the discrete attribute value 124 for each locus 122 in the respective probe spot 126.
- the plurality of principal component values is ten.
- the plurality of principal component values is between 5 and 100.
- the plurality of principal component values is between 5 and 50.
- the plurality of principal component values is between 8 and 35.
- a dimension reduction technique is then applied to the plurality of principal components values for each respective probe spot 126 in the plurality of probe spots, thereby determining a two-dimensional data point 166 for each probe spot 126 in the plurality of probe spots.
- Each respective probe spot 126 in the plurality of probe spots is then plotted in the second panel based upon the two-dimensional data point for the respective probe spot.
- one embodiment of the present disclosure provides a back end pipeline that is performed on a computer system other than the visualization system 100.
- the back end pipeline comprises a two stage data reduction.
- the discrete attribute values 124 e.g., mRNA expression data
- the data point is, in some embodiments, a one-dimensional vector that includes a dimension for each of the 19,000 - 20,000 genes in the human genome, with each dimension populated with the measured mRNA expression level for the corresponding gene.
- a one-dimensional vector includes a dimension for each discrete attribute value 124 of the plurality of loci, with each dimension populated with the discrete attribute value 124 for the corresponding locus 122.
- This data is considered somewhat sparse and so principal component analysis is suitable for reducing the dimensionality of the data down to ten dimensions in this example.
- application of principal component analysis can drastically reduce (reduce by at least 5-fold, at least 10-fold, at least 20-fold, or at least 40-fold) the dimensionality of the data (e.g., from approximately 20,000 to ten dimensions).
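The first stage of the reduction can be sketched with NumPy. This is a minimal illustration, not the pipeline's implementation: it assumes a dense probe-spot-by-locus matrix, centers each locus, and projects onto the leading right singular vectors. The function name `pca_reduce`, the toy matrix shape, and the Poisson-distributed test data are assumptions made for the example.

```python
import numpy as np

def pca_reduce(matrix, n_components=10):
    """Project rows (probe spots) onto the top principal components of the columns (loci)."""
    centered = matrix - matrix.mean(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

# Toy data: 50 probe spots by 200 loci of simulated counts (illustrative only).
rng = np.random.default_rng(0)
counts = rng.poisson(2.0, size=(50, 200)).astype(float)

pcs = pca_reduce(counts)  # one row of ten principal component values per probe spot
```

For a dataset with roughly 20,000 loci, a dense decomposition like this one would be replaced in practice by a truncated or randomized solver, but the shape of the result, ten values per probe spot, is the same.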
- t-SNE (t-distributed stochastic neighbor embedding)
- the nonlinear dimensionality reduction technique t-SNE is particularly well-suited for embedding high-dimensional data (here, the ten principal component values 164 computed by principal component analysis for each measured probe spot from the measured discrete attribute value (e.g., expression level) of each locus 122 (e.g., expressed mRNA) in the respective probe spot) into a space of two dimensions, which can then be visualized as a two-dimensional visualization (e.g., the scatter plot of second panel 420).
- t-SNE is used to model each high-dimensional object (the 10 principal components of each measured probe spot) as a two-dimensional point in such a way that similarly expressing probe spots are modeled as nearby two-dimensional data points 166 and dissimilarly expressing probe spots are modeled as distant two-dimensional data points 166 in the two-dimensional plot.
- the t-SNE algorithm comprises two main stages.
- t-SNE constructs a probability distribution over pairs of high-dimensional probe spot vectors in such a way that similar probe spot vectors (probe spots that have similar values for their ten principal components and thus presumably have similar discrete attribute values 124 across the plurality of loci 122) have a high probability of being picked, while dissimilar probe spot vectors (probe spots that have dissimilar values for their ten principal components and thus presumably have dissimilar discrete attribute values 124 across the plurality of loci 122) have a small probability of being picked.
- t-SNE defines a similar probability distribution over the plurality of probe spots 126 in the low-dimensional map, and it minimizes the Kullback-Leibler divergence between the two distributions with respect to the locations of the points in the map.
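The two stages can be written out explicitly, following the formulation in the van der Maaten and Hinton paper cited in this description. The symbols are introduced here for illustration only: x_i is the vector of principal component values for probe spot i, y_i is its two-dimensional data point, N is the number of probe spots, and sigma_i is a per-point bandwidth that the algorithm sets from a user-chosen perplexity.

```latex
% Stage 1: pairwise similarities in the high-dimensional space
p_{j \mid i} = \frac{\exp\!\left(-\lVert x_i - x_j \rVert^2 / 2\sigma_i^2\right)}
                    {\sum_{k \neq i} \exp\!\left(-\lVert x_i - x_k \rVert^2 / 2\sigma_i^2\right)},
\qquad
p_{ij} = \frac{p_{j \mid i} + p_{i \mid j}}{2N}

% Stage 2: similarities in the two-dimensional map (Student t kernel, one degree of freedom)
q_{ij} = \frac{\left(1 + \lVert y_i - y_j \rVert^2\right)^{-1}}
              {\sum_{k \neq l} \left(1 + \lVert y_k - y_l \rVert^2\right)^{-1}}

% Objective minimized with respect to the map locations y_1, \dots, y_N
\mathrm{KL}(P \,\Vert\, Q) = \sum_{i \neq j} p_{ij} \log \frac{p_{ij}}{q_{ij}}
```

The heavy tails of the Student t kernel in the map let moderately dissimilar probe spots sit far apart, which is why distinct expression patterns tend to appear as separated groups in the scatter plot.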
- the t-SNE algorithm uses the Euclidean distance between objects as the base of its similarity metric. In other embodiments, other distance metrics are used (e.g., Chebyshev distance, Mahalanobis distance, Manhattan distance, etc.).
- the dimension reduction technique used to reduce the principal component values 164 to a two-dimensional data point 166 is Sammon mapping, curvilinear components analysis, stochastic neighbor embedding, Isomap, maximum variance unfolding, locally linear embedding, or Laplacian Eigenmaps. These techniques are described in van der Maaten and Hinton, 2008, “Visualizing High-Dimensional Data Using t-SNE,” Journal of Machine Learning Research 9, 2579-2605, which is hereby incorporated by reference.
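For concreteness, three of the named distance metrics can be written as plain functions over two vectors of principal component values. This is a small sketch with hypothetical function names; the Mahalanobis distance is left out because it additionally requires the inverse covariance matrix of the data.

```python
import math

def euclidean(a, b):
    """Square root of the summed squared coordinate differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    """Sum of the absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))

def chebyshev(a, b):
    """Largest absolute coordinate difference."""
    return max(abs(x - y) for x, y in zip(a, b))

# Two toy vectors standing in for the principal component values of two probe spots.
spot_a = (0.0, 0.0)
spot_b = (3.0, 4.0)

distances = {
    "euclidean": euclidean(spot_a, spot_b),
    "manhattan": manhattan(spot_a, spot_b),
    "chebyshev": chebyshev(spot_a, spot_b),
}
```

The three metrics rank pairs of probe spots differently when the differences are concentrated in a single component, so the choice can change which probe spots count as near neighbors.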
- the user has the option to select the dimension reduction technique.
- the user has the option to select the dimension reduction technique from a group comprising all or a subset of the group consisting of t-SNE, Sammon mapping, curvilinear components analysis, stochastic neighbor embedding, Isomap, maximum variance unfolding, locally linear embedding, and Laplacian Eigenmaps.
- Although the terms first, second, etc. may be used herein to describe various elements, these elements should not be limited by these terms. These terms are only used to distinguish one element from another. For example, a first subject could be termed a second subject, and, similarly, a second subject could be termed a first subject, without departing from the scope of the present disclosure. The first subject and the second subject are both subjects, but they are not the same subject.
- The phrase “if it is determined” or “if [a stated condition or event] is detected” may be construed to mean “upon determining” or “in response to determining” or “upon detecting (the stated condition or event)” or “in response to detecting (the stated condition or event),” depending on the context.
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- General Health & Medical Sciences (AREA)
- General Physics & Mathematics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Medical Informatics (AREA)
- Data Mining & Analysis (AREA)
- Chemical & Material Sciences (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Computational Biology (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Computation (AREA)
- Molecular Biology (AREA)
- Biomedical Technology (AREA)
- Multimedia (AREA)
- Biophysics (AREA)
- Quality & Reliability (AREA)
- Crystallography & Structural Chemistry (AREA)
- Biotechnology (AREA)
- Radiology & Medical Imaging (AREA)
- Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
- Pathology (AREA)
- Analytical Chemistry (AREA)
- Biochemistry (AREA)
- Immunology (AREA)
- Computing Systems (AREA)
- Databases & Information Systems (AREA)
- Software Systems (AREA)
- General Engineering & Computer Science (AREA)
- Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
- Apparatus Associated With Microorganisms And Enzymes (AREA)
- Investigating Or Analysing Biological Materials (AREA)
- Image Processing (AREA)
Priority Applications (4)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| EP24191094.2A EP4468252A3 (en) | 2019-10-01 | 2020-09-30 | Systems and methods for identifying morphological patterns in tissue samples |
| EP20792886.2A EP4038546B1 (en) | 2019-10-01 | 2020-09-30 | Systems and methods for identifying morphological patterns in tissue samples |
| CN202080083321.6A CN114761992B (en) | 2019-10-01 | 2020-09-30 | Systems and methods for identifying morphological patterns in tissue samples |
| CN202310861916.5A CN117036248A (en) | 2019-10-01 | 2020-09-30 | System and method for identifying morphological patterns in tissue samples |
Applications Claiming Priority (6)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US201962909071P | 2019-10-01 | 2019-10-01 | |
| US62/909,071 | 2019-10-01 | ||
| US202062980077P | 2020-02-21 | 2020-02-21 | |
| US62/980,077 | 2020-02-21 | ||
| US202063041823P | 2020-06-20 | 2020-06-20 | |
| US63/041,823 | 2020-06-20 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2021067514A1 (en) | 2021-04-08 |
Family
ID=72896167
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/US2020/053655 Ceased WO2021067514A1 (en) | 2019-10-01 | 2020-09-30 | Systems and methods for identifying morphological patterns in tissue samples |
Country Status (4)
| Country | Link |
|---|---|
| US (4) | US11514575B2 (en) |
| EP (2) | EP4038546B1 (en) |
| CN (2) | CN117036248A (en) |
| WO (1) | WO2021067514A1 (en) |
Cited By (59)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US11560593B2 (en) | 2019-12-23 | 2023-01-24 | 10X Genomics, Inc. | Methods for spatial analysis using RNA-templated ligation |
| US11592447B2 (en) | 2019-11-08 | 2023-02-28 | 10X Genomics, Inc. | Spatially-tagged analyte capture agents for analyte multiplexing |
| US11608520B2 (en) | 2020-05-22 | 2023-03-21 | 10X Genomics, Inc. | Spatial analysis to detect sequence variants |
| US11608498B2 (en) | 2020-06-02 | 2023-03-21 | 10X Genomics, Inc. | Nucleic acid library methods |
| US11618897B2 (en) | 2020-12-21 | 2023-04-04 | 10X Genomics, Inc. | Methods, compositions, and systems for capturing probes and/or barcodes |
| US11624063B2 (en) | 2020-06-08 | 2023-04-11 | 10X Genomics, Inc. | Methods of determining a surgical margin and methods of use thereof |
| US11624086B2 (en) | 2020-05-22 | 2023-04-11 | 10X Genomics, Inc. | Simultaneous spatio-temporal measurement of gene expression and cellular activity |
| WO2023076345A1 (en) | 2021-10-26 | 2023-05-04 | 10X Genomics, Inc. | Methods for spatial analysis using targeted rna capture |
| US11649485B2 (en) | 2019-01-06 | 2023-05-16 | 10X Genomics, Inc. | Generating capture probes for spatial analysis |
| US11661626B2 (en) | 2020-06-25 | 2023-05-30 | 10X Genomics, Inc. | Spatial analysis of DNA methylation |
| US11692218B2 (en) | 2020-06-02 | 2023-07-04 | 10X Genomics, Inc. | Spatial transcriptomics for antigen-receptors |
| US11702693B2 (en) | 2020-01-21 | 2023-07-18 | 10X Genomics, Inc. | Methods for printing cells and generating arrays of barcoded cells |
| US11702698B2 (en) | 2019-11-08 | 2023-07-18 | 10X Genomics, Inc. | Enhancing specificity of analyte binding |
| US11732300B2 (en) | 2020-02-05 | 2023-08-22 | 10X Genomics, Inc. | Increasing efficiency of spatial analysis in a biological sample |
| US11732299B2 (en) | 2020-01-21 | 2023-08-22 | 10X Genomics, Inc. | Spatial assays with perturbed cells |
| WO2023159028A1 (en) * | 2022-02-15 | 2023-08-24 | 10X Genomics, Inc. | Systems and methods for spatial analysis of analytes using fiducial alignment |
| US11739381B2 (en) | 2021-03-18 | 2023-08-29 | 10X Genomics, Inc. | Multiplex capture of gene and protein expression from a biological sample |
| US11753673B2 (en) | 2021-09-01 | 2023-09-12 | 10X Genomics, Inc. | Methods, compositions, and kits for blocking a capture probe on a spatial array |
| US11761038B1 (en) | 2020-07-06 | 2023-09-19 | 10X Genomics, Inc. | Methods for identifying a location of an RNA in a biological sample |
| US11768175B1 (en) | 2020-03-04 | 2023-09-26 | 10X Genomics, Inc. | Electrophoretic methods for spatial analysis |
| US11773433B2 (en) | 2020-04-22 | 2023-10-03 | 10X Genomics, Inc. | Methods for spatial analysis using targeted RNA depletion |
| US11821035B1 (en) | 2020-01-29 | 2023-11-21 | 10X Genomics, Inc. | Compositions and methods of making gene expression libraries |
| US11827935B1 (en) | 2020-11-19 | 2023-11-28 | 10X Genomics, Inc. | Methods for spatial analysis using rolling circle amplification and detection probes |
| US11835462B2 (en) | 2020-02-11 | 2023-12-05 | 10X Genomics, Inc. | Methods and compositions for partitioning a biological sample |
| WO2024015578A1 (en) | 2022-07-15 | 2024-01-18 | 10X Genomics, Inc. | Methods for determining a location of a target nucleic acid in a biological sample |
| US11898205B2 (en) | 2020-02-03 | 2024-02-13 | 10X Genomics, Inc. | Increasing capture efficiency of spatial assays |
| US11926822B1 (en) | 2020-09-23 | 2024-03-12 | 10X Genomics, Inc. | Three-dimensional spatial analysis |
| US11926867B2 (en) | 2019-01-06 | 2024-03-12 | 10X Genomics, Inc. | Generating capture probes for spatial analysis |
| US11926863B1 (en) | 2020-02-27 | 2024-03-12 | 10X Genomics, Inc. | Solid state single cell method for analyzing fixed biological cells |
| US11965213B2 (en) | 2019-05-30 | 2024-04-23 | 10X Genomics, Inc. | Methods of detecting spatial heterogeneity of a biological sample |
| US11981958B1 (en) | 2020-08-20 | 2024-05-14 | 10X Genomics, Inc. | Methods for spatial analysis using DNA capture |
| US11981960B1 (en) | 2020-07-06 | 2024-05-14 | 10X Genomics, Inc. | Spatial analysis utilizing degradable hydrogels |
| US12031177B1 (en) | 2020-06-04 | 2024-07-09 | 10X Genomics, Inc. | Methods of enhancing spatial resolution of transcripts |
| US12071655B2 (en) | 2021-06-03 | 2024-08-27 | 10X Genomics, Inc. | Methods, compositions, kits, and systems for enhancing analyte capture for spatial analysis |
| US12076701B2 (en) | 2020-01-31 | 2024-09-03 | 10X Genomics, Inc. | Capturing oligonucleotides in spatial transcriptomics |
| US12098985B2 (en) | 2021-02-19 | 2024-09-24 | 10X Genomics, Inc. | Modular assay support devices |
| US12110541B2 (en) | 2020-02-03 | 2024-10-08 | 10X Genomics, Inc. | Methods for preparing high-resolution spatial arrays |
| US12117439B2 (en) | 2019-12-23 | 2024-10-15 | 10X Genomics, Inc. | Compositions and methods for using fixed biological samples |
| US12129516B2 (en) | 2020-02-07 | 2024-10-29 | 10X Genomics, Inc. | Quantitative and automated permeabilization performance evaluation for spatial transcriptomics |
| US12128403B2 (en) | 2020-06-10 | 2024-10-29 | 10X Genomics, Inc. | Fluid delivery methods |
| US12195790B2 (en) | 2021-12-01 | 2025-01-14 | 10X Genomics, Inc. | Methods for improved in situ detection of nucleic acids and spatial analysis |
| US12203134B2 (en) | 2021-04-14 | 2025-01-21 | 10X Genomics, Inc. | Methods of measuring mislocalization of an analyte |
| US12209280B1 (en) | 2020-07-06 | 2025-01-28 | 10X Genomics, Inc. | Methods of identifying abundance and location of an analyte in a biological sample using second strand synthesis |
| US12223751B2 (en) | 2021-12-20 | 2025-02-11 | 10X Genomics, Inc. | Self-test for imaging device |
| US12249085B2 (en) | 2020-09-18 | 2025-03-11 | 10X Genomics, Inc. | Sample handling apparatus and image registration methods |
| US12265079B1 (en) | 2020-06-02 | 2025-04-01 | 10X Genomics, Inc. | Systems and methods for detecting analytes from captured single biological particles |
| US12275988B2 (en) | 2021-11-10 | 2025-04-15 | 10X Genomics, Inc. | Methods, compositions, and kits for determining the location of an analyte in a biological sample |
| US12281357B1 (en) | 2020-02-14 | 2025-04-22 | 10X Genomics, Inc. | In situ spatial barcoding |
| US12297486B2 (en) | 2020-01-24 | 2025-05-13 | 10X Genomics, Inc. | Methods for spatial analysis using proximity ligation |
| US12344892B2 (en) | 2018-08-28 | 2025-07-01 | 10X Genomics, Inc. | Method for transposase-mediated spatial tagging and analyzing genomic DNA in a biological sample |
| US12365942B2 (en) | 2020-01-13 | 2025-07-22 | 10X Genomics, Inc. | Methods of decreasing background on a spatial array |
| US12365935B2 (en) | 2021-05-06 | 2025-07-22 | 10X Genomics, Inc. | Methods for increasing resolution of spatial analysis |
| US12385083B2 (en) | 2018-12-10 | 2025-08-12 | 10X Genomics, Inc. | Methods of using master / copy arrays for spatial detection |
| US12398262B1 (en) | 2021-01-22 | 2025-08-26 | 10X Genomics, Inc. | Triblock copolymer-based cell stabilization and fixation system and methods of use thereof |
| US12399123B1 (en) | 2020-02-14 | 2025-08-26 | 10X Genomics, Inc. | Spatial targeting of analytes |
| US12405264B2 (en) | 2020-01-17 | 2025-09-02 | 10X Genomics, Inc. | Electrophoretic system and method for analyte capture |
| US12416603B2 (en) | 2020-05-19 | 2025-09-16 | 10X Genomics, Inc. | Electrophoresis cassettes and instrumentation |
| US12435363B1 (en) | 2020-06-10 | 2025-10-07 | 10X Genomics, Inc. | Materials and methods for spatial transcriptomics |
| AU2022257481B2 (en) * | 2021-04-15 | 2025-11-06 | Portrai Inc. | Apparatus and method for predicting cell type enrichment from tissue images using spatially resolved gene expression data |
Families Citing this family (14)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| EP3591574A3 (en) * | 2018-07-06 | 2020-01-15 | Universität Zürich | Method and computer program for clustering large multiplexed spatially resolved data of a biological sample |
| US11514575B2 (en) * | 2019-10-01 | 2022-11-29 | 10X Genomics, Inc. | Systems and methods for identifying morphological patterns in tissue samples |
| CN115023734B (en) | 2019-11-22 | 2023-10-20 | 10X基因组学有限公司 | System and method for spatially analyzing analytes using fiducial alignment |
| US12421558B2 (en) | 2020-02-13 | 2025-09-23 | 10X Genomics, Inc. | Systems and methods for joint interactive visualization of gene expression and DNA chromatin accessibility |
| EP4413578A1 (en) | 2021-10-06 | 2024-08-14 | 10X Genomics, Inc. | Systems and methods for evaluating biological samples |
| CN113869261A (en) * | 2021-10-11 | 2021-12-31 | 上海海洋大学 | A method for evaluating seaweed biomass based on RGB images |
| WO2023064547A1 (en) * | 2021-10-16 | 2023-04-20 | Leavitt Medical, Inc. | Systems and methods for aligning digital slide images |
| US20250272996A1 (en) | 2022-04-26 | 2025-08-28 | 10X Genomics, Inc. | Systems and methods for evaluating biological samples |
| CN115170555B (en) * | 2022-08-04 | 2023-06-06 | 格物致和生物科技(北京)有限公司 | Counting method and system based on images |
| US20240052404A1 (en) | 2022-08-05 | 2024-02-15 | 10X Genomics, Inc. | Systems and methods for immunofluorescence quantification |
| WO2024036191A1 (en) | 2022-08-10 | 2024-02-15 | 10X Genomics, Inc. | Systems and methods for colocalization |
| WO2024238625A1 (en) | 2023-05-15 | 2024-11-21 | 10X Genomics, Inc. | Spatial antibody data normalization |
| CN116998374A (en) * | 2023-08-07 | 2023-11-07 | 河南农业大学 | A method for screening corn varieties suitable for 4:2 and 6:4 strip compound planting of soybeans and corns |
| CN116821396B (en) * | 2023-08-25 | 2024-09-03 | 神州医疗科技股份有限公司 | Pathological labeling system based on OpenSeadragon frames |
Citations (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20200105373A1 (en) | 2018-09-28 | 2020-04-02 | 10X Genomics, Inc. | Systems and methods for cellular analysis using nucleic acid sequencing |
Family Cites Families (97)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US5231580A (en) * | 1991-04-01 | 1993-07-27 | The United States Of America As Represented By The Secretary Of The Department Of Health And Human Services | Automated method and apparatus for determining characteristics of nerve fibers |
| US5472881A (en) | 1992-11-12 | 1995-12-05 | University Of Utah Research Foundation | Thiol labeling of DNA for attachment to gold surfaces |
| US5610287A (en) | 1993-12-06 | 1997-03-11 | Molecular Tool, Inc. | Method for immobilizing nucleic acid molecules |
| US5552278A (en) | 1994-04-04 | 1996-09-03 | Spectragen, Inc. | DNA sequencing by stepwise ligation and cleavage |
| US5807522A (en) | 1994-06-17 | 1998-09-15 | The Board Of Trustees Of The Leland Stanford Junior University | Methods for fabricating microarrays of biological samples |
| US5846719A (en) | 1994-10-13 | 1998-12-08 | Lynx Therapeutics, Inc. | Oligonucleotide tags for sorting and identification |
| CA2214101A1 (en) * | 1995-03-03 | 1996-09-12 | Ulrich Bick | Method and system for the detection of lesions in medical images |
| US5750341A (en) | 1995-04-17 | 1998-05-12 | Lynx Therapeutics, Inc. | DNA sequencing by parallel oligonucleotide extensions |
| GB9620209D0 (en) | 1996-09-27 | 1996-11-13 | Cemu Bioteknik Ab | Method of sequencing DNA |
| GB9624927D0 (en) * | 1996-11-29 | 1997-01-15 | Oxford Glycosciences Uk Ltd | Gels and their use |
| GB9626815D0 (en) | 1996-12-23 | 1997-02-12 | Cemu Bioteknik Ab | Method of sequencing DNA |
| AU747242B2 (en) | 1997-01-08 | 2002-05-09 | Proligo Llc | Bioconjugation of macromolecules |
| US5837860A (en) | 1997-03-05 | 1998-11-17 | Molecular Tool, Inc. | Covalent attachment of nucleic acid molecules onto solid-phases via disulfide bonds |
| US6327410B1 (en) | 1997-03-14 | 2001-12-04 | The Trustees Of Tufts College | Target analyte sensors utilizing Microspheres |
| US6023540A (en) | 1997-03-14 | 2000-02-08 | Trustees Of Tufts College | Fiber optic sensor with encoded microspheres |
| JP2002503954A (en) | 1997-04-01 | 2002-02-05 | グラクソ、グループ、リミテッド | Nucleic acid amplification method |
| US6969488B2 (en) | 1998-05-22 | 2005-11-29 | Solexa, Inc. | System and apparatus for sequential processing of analytes |
| US7427678B2 (en) | 1998-01-08 | 2008-09-23 | Sigma-Aldrich Co. | Method for immobilizing oligonucleotides employing the cycloaddition bioconjugation method |
| WO1999067641A2 (en) | 1998-06-24 | 1999-12-29 | Illumina, Inc. | Decoding of array sensors with microspheres |
| US6391937B1 (en) | 1998-11-25 | 2002-05-21 | Motorola, Inc. | Polyacrylamide hydrogels and hydrogel arrays made from polyacrylamide reactive prepolymers |
| US6355431B1 (en) | 1999-04-20 | 2002-03-12 | Illumina, Inc. | Detection of nucleic acid amplification reactions using bead arrays |
| ATE413467T1 (en) | 1999-04-20 | 2008-11-15 | Illumina Inc | DETECTION OF NUCLEIC ACID REACTIONS ON BEAD ARRAYS |
| US6430430B1 (en) * | 1999-04-29 | 2002-08-06 | University Of South Florida | Method and system for knowledge guided hyperintensity detection and volumetric measurement |
| US6274320B1 (en) | 1999-09-16 | 2001-08-14 | Curagen Corporation | Method of sequencing a nucleic acid |
| US6770441B2 (en) | 2000-02-10 | 2004-08-03 | Illumina, Inc. | Array compositions and methods of making same |
| US7001792B2 (en) | 2000-04-24 | 2006-02-21 | Eagle Research & Development, Llc | Ultra-fast nucleic acid sequencing device and a method for making and using the same |
| US7057026B2 (en) | 2001-12-04 | 2006-06-06 | Solexa Limited | Labelled nucleotides |
| JP3670260B2 (en) * | 2002-02-14 | 2005-07-13 | 日本碍子株式会社 | Probe reactive tip |
| WO2003101972A1 (en) | 2002-05-30 | 2003-12-11 | The Scripps Research Institute | Copper-catalysed ligation of azides and acetylenes |
| EP3795577A1 (en) | 2002-08-23 | 2021-03-24 | Illumina Cambridge Limited | Modified nucleotides |
| GB0321306D0 (en) | 2003-09-11 | 2003-10-15 | Solexa Ltd | Modified polymerases for improved incorporation of nucleotide analogues |
| JP2007510199A (en) | 2003-10-08 | 2007-04-19 | ライフスパン バイオサイエンス,インク. | Automated microscope slide tissue sample mapping and image acquisition |
| US7259258B2 (en) | 2003-12-17 | 2007-08-21 | Illumina, Inc. | Methods of attaching biological compounds to solid supports using triazine |
| EP3175914A1 (en) | 2004-01-07 | 2017-06-07 | Illumina Cambridge Limited | Improvements in or relating to molecular arrays |
| US7877213B2 (en) * | 2004-09-23 | 2011-01-25 | Agilent Technologies, Inc. | System and methods for automated processing of multiple chemical arrays |
| GB0427236D0 (en) | 2004-12-13 | 2005-01-12 | Solexa Ltd | Improved method of nucleotide detection |
| EP1828412B2 (en) | 2004-12-13 | 2019-01-09 | Illumina Cambridge Limited | Improved method of nucleotide detection |
| US8623628B2 (en) | 2005-05-10 | 2014-01-07 | Illumina, Inc. | Polymerases |
| ES2434915T3 (en) | 2005-06-20 | 2013-12-18 | Advanced Cell Diagnostics, Inc. | Multiplex nucleic acid detection |
| GB0514936D0 (en) | 2005-07-20 | 2005-08-24 | Solexa Ltd | Preparation of templates for nucleic acid sequencing |
| US8262900B2 (en) | 2006-12-14 | 2012-09-11 | Life Technologies Corporation | Methods and apparatus for measuring analytes using large scale FET arrays |
| EP4134667B1 (en) | 2006-12-14 | 2025-11-12 | Life Technologies Corporation | Apparatus for measuring analytes using fet arrays |
| US8349167B2 (en) | 2006-12-14 | 2013-01-08 | Life Technologies Corporation | Methods and apparatus for detecting molecular interactions using FET arrays |
| US8200440B2 (en) * | 2007-05-18 | 2012-06-12 | Affymetrix, Inc. | System, method, and computer software product for genotype determination using probe array data |
| US20100137143A1 (en) | 2008-10-22 | 2010-06-03 | Ion Torrent Systems Incorporated | Methods and apparatus for measuring analytes |
| US8488863B2 (en) * | 2008-11-06 | 2013-07-16 | Los Alamos National Security, Llc | Combinational pixel-by-pixel and object-level classifying, segmenting, and agglomerating in performing quantitative image analysis that distinguishes between healthy non-cancerous and cancerous cell nuclei and delineates nuclear, cytoplasm, and stromal material objects from stained biological tissue materials |
| US8582858B2 (en) * | 2009-12-17 | 2013-11-12 | The Regents Of The University Of California | Method and apparatus for quantitative analysis of breast density morphology based on MRI |
| US20130171621A1 (en) | 2010-01-29 | 2013-07-04 | Advanced Cell Diagnostics Inc. | Methods of in situ detection of nucleic acids |
| CA2794522C (en) | 2010-04-05 | 2019-11-26 | Prognosys Biosciences, Inc. | Spatially encoded biological assays |
| US8774494B2 (en) * | 2010-04-30 | 2014-07-08 | Complete Genomics, Inc. | Method and system for accurate alignment and registration of array for DNA sequencing |
| WO2012018397A2 (en) * | 2010-08-02 | 2012-02-09 | The Regents Of The University Of California | Molecular subtyping of oral squamous cell carcinoma to distinguish a subtype that is unlikely to metastasize |
| US8951781B2 (en) | 2011-01-10 | 2015-02-10 | Illumina, Inc. | Systems, methods, and apparatuses to image a sample for biological or chemical analysis |
| GB201106254D0 (en) | 2011-04-13 | 2011-05-25 | Frisen Jonas | Method and product |
| JP6159391B2 (en) | 2012-04-03 | 2017-07-05 | Illumina, Inc. | Integrated read head and fluid cartridge useful for nucleic acid sequencing |
| US9012022B2 (en) | 2012-06-08 | 2015-04-21 | Illumina, Inc. | Polymer coatings |
| US20140378345A1 (en) | 2012-08-14 | 2014-12-25 | 10X Technologies, Inc. | Compositions and methods for sample processing |
| US9783841B2 (en) | 2012-10-04 | 2017-10-10 | The Board Of Trustees Of The Leland Stanford Junior University | Detection of target nucleic acids in a cellular sample |
| DK3511423T4 (en) | 2012-10-17 | 2024-07-29 | Spatial Transcriptomics Ab | METHODS AND PRODUCT FOR OPTIMIZING LOCALIZED OR SPATIAL DETECTION OF GENE EXPRESSION IN A TISSUE SAMPLE |
| US9512422B2 (en) | 2013-02-26 | 2016-12-06 | Illumina, Inc. | Gel patterned surfaces |
| US10138509B2 (en) | 2013-03-12 | 2018-11-27 | President And Fellows Of Harvard College | Method for generating a three-dimensional nucleic acid containing matrix |
| JP6416887B2 (en) * | 2013-05-15 | 2018-10-31 | The Administrators of the Tulane Educational Fund | Microscopic observation of tissue samples using structured illumination |
| US9868979B2 (en) | 2013-06-25 | 2018-01-16 | Prognosys Biosciences, Inc. | Spatially encoded biological assays using a microfluidic device |
| US20150000854A1 (en) | 2013-06-27 | 2015-01-01 | The Procter & Gamble Company | Sheet products bearing designs that vary among successive sheets, and apparatus and methods for producing the same |
| US10041949B2 (en) | 2013-09-13 | 2018-08-07 | The Board Of Trustees Of The Leland Stanford Junior University | Multiplexed imaging of tissues using mass tags and secondary ion mass spectrometry |
| WO2015161173A1 (en) | 2014-04-18 | 2015-10-22 | William Marsh Rice University | Competitive compositions of nucleic acid molecules for enrichment of rare-allele-bearing species |
| WO2015200893A2 (en) | 2014-06-26 | 2015-12-30 | 10X Genomics, Inc. | Methods of analyzing nucleic acids from individual cells or cell populations |
| US10179932B2 (en) | 2014-07-11 | 2019-01-15 | President And Fellows Of Harvard College | Methods for high-throughput labelling and detection of biological features in situ using microscopy |
| EP3262192B1 (en) | 2015-02-27 | 2020-09-16 | Becton, Dickinson and Company | Spatially addressable molecular barcoding |
| WO2016162309A1 (en) | 2015-04-10 | 2016-10-13 | Spatial Transcriptomics Ab | Spatially distinguished, multiplex nucleic acid analysis of biological specimens |
| US10059990B2 (en) | 2015-04-14 | 2018-08-28 | Massachusetts Institute Of Technology | In situ nucleic acid sequencing of expanded biological samples |
| EP3283641B1 (en) | 2015-04-14 | 2019-11-27 | Koninklijke Philips N.V. | Spatial mapping of molecular profiles of biological tissue samples |
| WO2017015099A1 (en) | 2015-07-17 | 2017-01-26 | Nanostring Technologies, Inc. | Simultaneous quantification of gene expression in a user-defined region of a cross-sectioned tissue |
| CA3242290A1 (en) | 2015-07-27 | 2017-02-02 | Illumina, Inc. | Spatial mapping of nucleic acid sequence information |
| CA2994957A1 (en) | 2015-08-07 | 2017-02-16 | Massachusetts Institute Of Technology | Protein retention expansion microscopy |
| US20170241911A1 (en) | 2016-02-22 | 2017-08-24 | Miltenyi Biotec Gmbh | Automated analysis tool for biological specimens |
| US20170253918A1 (en) | 2016-03-01 | 2017-09-07 | Expansion Technologies | Combining protein barcoding with expansion microscopy for in-situ, spatially-resolved proteomics |
| WO2017161251A1 (en) * | 2016-03-17 | 2017-09-21 | President And Fellows Of Harvard College | Methods for detecting and identifying genomic nucleic acids |
| US20180052081A1 (en) | 2016-05-11 | 2018-02-22 | Expansion Technologies | Combining modified antibodies with expansion microscopy for in-situ, spatially-resolved proteomics |
| EP3472359B1 (en) | 2016-06-21 | 2022-03-16 | 10X Genomics, Inc. | Nucleic acid sequencing |
| CN118853848A (en) | 2016-08-31 | 2024-10-29 | President And Fellows Of Harvard College | Methods for combining detection of biomolecules into a single assay using fluorescent in situ sequencing |
| CN118389650A (en) | 2016-08-31 | 2024-07-26 | President And Fellows Of Harvard College | Methods for generating nucleic acid sequence libraries for detection by fluorescent in situ sequencing |
| US11505819B2 (en) | 2016-09-22 | 2022-11-22 | William Marsh Rice University | Molecular hybridization probes for complex sequence capture and analysis |
| SG10202012440VA (en) | 2016-10-19 | 2021-01-28 | 10X Genomics Inc | Methods and systems for barcoding nucleic acid molecules from individual cells or cell populations |
| GB201619458D0 (en) | 2016-11-17 | 2017-01-04 | Spatial Transcriptomics Ab | Method for spatial tagging and analysing nucleic acids in a biological specimen |
| US10656144B2 (en) | 2016-12-02 | 2020-05-19 | The Charlotte Mecklenburg Hospital Authority | Immune profiling and minimal residue disease following stem cell transplantation in multiple myeloma |
| CN118345145A (en) | 2016-12-09 | 2024-07-16 | Ultivue, Inc. | Improved methods for multiplexed imaging using labeled nucleic acid imaging agents |
| WO2018136856A1 (en) | 2017-01-23 | 2018-07-26 | Massachusetts Institute Of Technology | Multiplexed signal amplified fish via splinted ligation amplification and sequencing |
| US10347365B2 (en) * | 2017-02-08 | 2019-07-09 | 10X Genomics, Inc. | Systems and methods for visualizing a pattern in a dataset |
| US10748643B2 (en) * | 2017-08-31 | 2020-08-18 | 10X Genomics, Inc. | Systems and methods for determining the integrity of test strings with respect to a ground truth string |
| CN120210336A (en) | 2017-10-06 | 2025-06-27 | 10X Genomics, Inc. | RNA templated ligation |
| WO2019075091A1 (en) | 2017-10-11 | 2019-04-18 | Expansion Technologies | Multiplexed in situ hybridization of tissue sections for spatially resolved transcriptomics with expansion microscopy |
| CN114807306A (en) | 2017-12-08 | 2022-07-29 | 10X Genomics, Inc. | Methods and compositions for labeling cells |
| CA3095056A1 (en) * | 2018-04-13 | 2019-10-17 | Freenome Holdings, Inc. | Machine learning implementation for multi-analyte assay of biological samples |
| CN113767177B (en) | 2018-12-10 | 2025-01-14 | 10X Genomics, Inc. | Generation of capture probes for spatial analysis |
| US20220119871A1 (en) * | 2019-01-28 | 2022-04-21 | The Broad Institute, Inc. | In-situ spatial transcriptomics |
| US11514575B2 (en) * | 2019-10-01 | 2022-11-29 | 10X Genomics, Inc. | Systems and methods for identifying morphological patterns in tissue samples |
| CA3158888A1 (en) * | 2019-11-21 | 2021-05-27 | Yifeng YIN | Spatial analysis of analytes |

2020
- 2020-09-30 US: application US17/039,935, publication US11514575B2, status Active
- 2020-09-30 WO: application PCT/US2020/053655, publication WO2021067514A1, status Ceased
- 2020-09-30 EP: application EP20792886.2A, publication EP4038546B1, status Active
- 2020-09-30 CN: application CN202310861916.5A, publication CN117036248A, status Pending
- 2020-09-30 CN: application CN202080083321.6A, publication CN114761992B, status Active
- 2020-09-30 EP: application EP24191094.2A, publication EP4468252A3, status Pending

2022
- 2022-10-18 US: application US18/047,620, publication US11756286B2, status Active

2023
- 2023-07-20 US: application US18/355,963, publication US12125260B2, status Active

2024
- 2024-07-08 US: application US18/766,382, publication US20250022253A1, status Pending

Patent Citations (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20200105373A1 (en) | 2018-09-28 | 2020-04-02 | 10X Genomics, Inc. | Systems and methods for cellular analysis using nucleic acid sequencing |
Non-Patent Citations (27)
| Title |
|---|
| "Fluorescence Spectroscopy and Microscopy: Methods and Protocols (Methods in Molecular Biology)", 2014, HUMANA PRESS |
| "Light Microscopy Method and Protocols", 2018, HUMANA PRESS, article "Methods in Molecular Biology" |
| ANONYMOUS: "Gene - Wikipedia", 29 August 2019 (2019-08-29), pages 1 - 30, XP055756501, Retrieved from the Internet <URL:https://en.wikipedia.org/w/index.php?title=Gene&oldid=912988894> [retrieved on 20201203] * |
| BACKER, COMPUTER-ASSISTED REASONING IN CLUSTER ANALYSIS, PRENTICE HALL, UPPER SADDLE RIVER, N.J., 1995 |
| BERK: "Comparing Subset Regression Procedures", TECHNOMETRICS, vol. 20, no. 1, 1978, pages 1 - 6 |
| BLONDEL ET AL., FAST UNFOLDING OF COMMUNITIES IN LARGE NETWORKS, 25 July 2008 (2008-07-25) |
| CAMERON, TRIVEDI: "Econometric Society Monograph 30", 1998, CAMBRIDGE UNIVERSITY PRESS, article "Regression Analysis of Count Data" |
| CHEN ET AL., SCIENCE, vol. 347, no. 6221, 2015, pages 543 - 548 |
| DAY, DAVIDSON: "The Fluorescent Protein Revolution (In Cellular and Clinical Imaging)", vol. 123, 2014, CRC PRESS, TAYLOR & FRANCIS GROUP, article "Quantitative Imaging in Cell Biology" |
| DEEP ZOOM FILE FORMAT OVERVIEW, 16 November 2011 (2011-11-16), Retrieved from the Internet <URL:https://docs.microsoft.com/en-us/previous-versions/windows/silverlight/dotnet-windows-silverlight/cc645077(v=vs.95)?redirectedfrom=MSDN> |
| DUDA, HART: "Pattern Classification and Scene Analysis", 1973, JOHN WILEY & SONS, INC., pages: 211 - 256 |
| EVERITT: "Cluster analysis", 1993, WILEY |
| FURNIVAL, WILSON: "Regression by Leaps and Bounds", TECHNOMETRICS, vol. 16, no. 4, 1974, pages 499 - 511 |
| GOODE ET AL., J PATHOL INFORM, vol. 4, no. 27, 2013 |
| HASTIE: "The Elements of Statistical Learning", 2001, SPRINGER, pages: 64 - 65,69-72,330-331 |
| JAMUR ET AL., METHODS MOL. BIOL., vol. 588, 2010, pages 63 - 66 |
| KAUFMAN, ROUSSEEUW: "Finding Groups in Data: An Introduction to Cluster Analysis", 1990, WILEY |
| LEE ET AL.: "Using fixed fiduciary markers for stage drift correction", OPT EXPRESS, vol. 20, no. 11, 2012, pages 12177 - 12183, XP055514083, DOI: 10.1364/OE.20.012177 |
| MANIATIS: "Spatiotemporal Dynamics of Molecular Pathology in Amyotrophic Lateral Sclerosis", SCIENCE, vol. 364, no. 6435, 2019, pages 89 - 93 |
| NIKHIL RAO ET AL: "Envision New Dimensions Introducing the Visium Spatial Gene Expression Solution", 19 September 2019 (2019-09-19), XP055756016, Retrieved from the Internet <URL:https://wp.10xgenomics.com/videos/seminars/introducing-the-visium-spatial-gene-expression-solution/> [retrieved on 20201202] * |
| REUBEN MONCADA ET AL: "Building a tumor atlas: integrating single-cell RNA-Seq data with spatial transcriptomics in pancreatic ductal adenocarcinoma", BIORXIV, 5 March 2018 (2018-03-05), XP055505886, Retrieved from the Internet <URL:https://www.biorxiv.org/content/early/2018/03/05/254375.full.pdf> [retrieved on 20180910], DOI: 10.1101/254375 * |
| SALMÉN FREDRIK ET AL: "Barcoded solid-phase RNA capture for Spatial Transcriptomics profiling in mammalian tissue sections", NATURE PROTOCOLS, NATURE PUBLISHING GROUP, GB, vol. 13, no. 11, 23 October 2018 (2018-10-23), pages 2501 - 2534, XP036624836, ISSN: 1754-2189, [retrieved on 20181023], DOI: 10.1038/S41596-018-0045-2 * |
| STÅHL PATRIK L ET AL: "Visualization and analysis of gene expression in tissue sections by spatial transcriptomics", SCIENCE, AMERICAN ASSOCIATION FOR THE ADVANCEMENT OF SCIENCE, US, vol. 353, no. 6294, July 2016 (2016-07-01), pages 78 - 82, XP002784680, ISSN: 1095-9203, DOI: 10.1126/SCIENCE.AAF2403 * |
| VAN DER MAATEN, HINTON: "Visualizing High-Dimensional Data Using t-SNE", JOURNAL OF MACHINE LEARNING RESEARCH, vol. 9, 2008, pages 2579 - 2605 |
| VICKOVIC SANJA ET AL: "High-definition spatial transcriptomics for in situ tissue profiling", NATURE METHODS, NATURE PUB. GROUP, NEW YORK, vol. 16, no. 10, 9 September 2019 (2019-09-09), pages 987 - 990, XP036887805, ISSN: 1548-7091, [retrieved on 20190909], DOI: 10.1038/S41592-019-0548-Y * |
| VIJAYAKUMAR, SCHAAL, LOCALLY WEIGHTED PROJECTION REGRESSION: AN O(N) ALGORITHM FOR INCREMENTAL REAL TIME LEARNING IN HIGH DIMENSIONAL SPACE, PROC. OF SEVENTEENTH INTERNATIONAL CONFERENCE ON MACHINE LEARNING (ICML2000), 2000, pages 1079 - 1086 |
| YU: "Shrinkage estimation of dispersion in Negative Binomial models for RNA-seq experiments with small sample size", BIOINFORMATICS, vol. 29, 2013, pages 1275 - 1282 |
Cited By (85)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US12378607B2 (en) | 2018-08-28 | 2025-08-05 | 10X Genomics, Inc. | Method for transposase-mediated spatial tagging and analyzing genomic DNA in a biological sample |
| US12344892B2 (en) | 2018-08-28 | 2025-07-01 | 10X Genomics, Inc. | Method for transposase-mediated spatial tagging and analyzing genomic DNA in a biological sample |
| US12385083B2 (en) | 2018-12-10 | 2025-08-12 | 10X Genomics, Inc. | Methods of using master / copy arrays for spatial detection |
| US11926867B2 (en) | 2019-01-06 | 2024-03-12 | 10X Genomics, Inc. | Generating capture probes for spatial analysis |
| US11649485B2 (en) | 2019-01-06 | 2023-05-16 | 10X Genomics, Inc. | Generating capture probes for spatial analysis |
| US11753675B2 (en) | 2019-01-06 | 2023-09-12 | 10X Genomics, Inc. | Generating capture probes for spatial analysis |
| US12442045B2 (en) | 2019-05-30 | 2025-10-14 | 10X Genomics, Inc. | Methods of detecting spatial heterogeneity of a biological sample |
| US11965213B2 (en) | 2019-05-30 | 2024-04-23 | 10X Genomics, Inc. | Methods of detecting spatial heterogeneity of a biological sample |
| US11702698B2 (en) | 2019-11-08 | 2023-07-18 | 10X Genomics, Inc. | Enhancing specificity of analyte binding |
| US11592447B2 (en) | 2019-11-08 | 2023-02-28 | 10X Genomics, Inc. | Spatially-tagged analyte capture agents for analyte multiplexing |
| US11808769B2 (en) | 2019-11-08 | 2023-11-07 | 10X Genomics, Inc. | Spatially-tagged analyte capture agents for analyte multiplexing |
| US11795507B2 (en) | 2019-12-23 | 2023-10-24 | 10X Genomics, Inc. | Methods for spatial analysis using RNA-templated ligation |
| US11560593B2 (en) | 2019-12-23 | 2023-01-24 | 10X Genomics, Inc. | Methods for spatial analysis using RNA-templated ligation |
| US11981965B2 (en) | 2019-12-23 | 2024-05-14 | 10X Genomics, Inc. | Methods for spatial analysis using RNA-templated ligation |
| US12117439B2 (en) | 2019-12-23 | 2024-10-15 | 10X Genomics, Inc. | Compositions and methods for using fixed biological samples |
| US12241890B2 (en) | 2019-12-23 | 2025-03-04 | 10X Genomics, Inc. | Methods for generating barcoded nucleic acid molecules using fixed cells |
| US12365942B2 (en) | 2020-01-13 | 2025-07-22 | 10X Genomics, Inc. | Methods of decreasing background on a spatial array |
| US12405264B2 (en) | 2020-01-17 | 2025-09-02 | 10X Genomics, Inc. | Electrophoretic system and method for analyte capture |
| US11732299B2 (en) | 2020-01-21 | 2023-08-22 | 10X Genomics, Inc. | Spatial assays with perturbed cells |
| US11702693B2 (en) | 2020-01-21 | 2023-07-18 | 10X Genomics, Inc. | Methods for printing cells and generating arrays of barcoded cells |
| US12297486B2 (en) | 2020-01-24 | 2025-05-13 | 10X Genomics, Inc. | Methods for spatial analysis using proximity ligation |
| US11821035B1 (en) | 2020-01-29 | 2023-11-21 | 10X Genomics, Inc. | Compositions and methods of making gene expression libraries |
| US12076701B2 (en) | 2020-01-31 | 2024-09-03 | 10X Genomics, Inc. | Capturing oligonucleotides in spatial transcriptomics |
| US12110541B2 (en) | 2020-02-03 | 2024-10-08 | 10X Genomics, Inc. | Methods for preparing high-resolution spatial arrays |
| US11898205B2 (en) | 2020-02-03 | 2024-02-13 | 10X Genomics, Inc. | Increasing capture efficiency of spatial assays |
| US12286673B2 (en) | 2020-02-05 | 2025-04-29 | 10X Genomics, Inc. | Increasing efficiency of spatial analysis in a biological sample |
| US11732300B2 (en) | 2020-02-05 | 2023-08-22 | 10X Genomics, Inc. | Increasing efficiency of spatial analysis in a biological sample |
| US12129516B2 (en) | 2020-02-07 | 2024-10-29 | 10X Genomics, Inc. | Quantitative and automated permeabilization performance evaluation for spatial transcriptomics |
| US11835462B2 (en) | 2020-02-11 | 2023-12-05 | 10X Genomics, Inc. | Methods and compositions for partitioning a biological sample |
| US12281357B1 (en) | 2020-02-14 | 2025-04-22 | 10X Genomics, Inc. | In situ spatial barcoding |
| US12399123B1 (en) | 2020-02-14 | 2025-08-26 | 10X Genomics, Inc. | Spatial targeting of analytes |
| US11926863B1 (en) | 2020-02-27 | 2024-03-12 | 10X Genomics, Inc. | Solid state single cell method for analyzing fixed biological cells |
| US11768175B1 (en) | 2020-03-04 | 2023-09-26 | 10X Genomics, Inc. | Electrophoretic methods for spatial analysis |
| US12228544B2 (en) | 2020-03-04 | 2025-02-18 | 10X Genomics, Inc. | Electrophoretic methods for spatial analysis |
| US11773433B2 (en) | 2020-04-22 | 2023-10-03 | 10X Genomics, Inc. | Methods for spatial analysis using targeted RNA depletion |
| US12416603B2 (en) | 2020-05-19 | 2025-09-16 | 10X Genomics, Inc. | Electrophoresis cassettes and instrumentation |
| US11959130B2 (en) | 2020-05-22 | 2024-04-16 | 10X Genomics, Inc. | Spatial analysis to detect sequence variants |
| US11608520B2 (en) | 2020-05-22 | 2023-03-21 | 10X Genomics, Inc. | Spatial analysis to detect sequence variants |
| US11624086B2 (en) | 2020-05-22 | 2023-04-11 | 10X Genomics, Inc. | Simultaneous spatio-temporal measurement of gene expression and cellular activity |
| US11866767B2 (en) | 2020-05-22 | 2024-01-09 | 10X Genomics, Inc. | Simultaneous spatio-temporal measurement of gene expression and cellular activity |
| US11840687B2 (en) | 2020-06-02 | 2023-12-12 | 10X Genomics, Inc. | Nucleic acid library methods |
| US11608498B2 (en) | 2020-06-02 | 2023-03-21 | 10X Genomics, Inc. | Nucleic acid library methods |
| US12265079B1 (en) | 2020-06-02 | 2025-04-01 | 10X Genomics, Inc. | Systems and methods for detecting analytes from captured single biological particles |
| US11845979B2 (en) | 2020-06-02 | 2023-12-19 | 10X Genomics, Inc. | Spatial transcriptomics for antigen-receptors |
| US11859178B2 (en) | 2020-06-02 | 2024-01-02 | 10X Genomics, Inc. | Nucleic acid library methods |
| US11692218B2 (en) | 2020-06-02 | 2023-07-04 | 10X Genomics, Inc. | Spatial transcriptomics for antigen-receptors |
| US12098417B2 (en) | 2020-06-02 | 2024-09-24 | 10X Genomics, Inc. | Spatial transcriptomics for antigen-receptors |
| US12031177B1 (en) | 2020-06-04 | 2024-07-09 | 10X Genomics, Inc. | Methods of enhancing spatial resolution of transcripts |
| US11781130B2 (en) | 2020-06-08 | 2023-10-10 | 10X Genomics, Inc. | Methods of determining a surgical margin and methods of use thereof |
| US11624063B2 (en) | 2020-06-08 | 2023-04-11 | 10X Genomics, Inc. | Methods of determining a surgical margin and methods of use thereof |
| US12435363B1 (en) | 2020-06-10 | 2025-10-07 | 10X Genomics, Inc. | Materials and methods for spatial transcriptomics |
| US12128403B2 (en) | 2020-06-10 | 2024-10-29 | 10X Genomics, Inc. | Fluid delivery methods |
| US11661626B2 (en) | 2020-06-25 | 2023-05-30 | 10X Genomics, Inc. | Spatial analysis of DNA methylation |
| US12060604B2 (en) | 2020-06-25 | 2024-08-13 | 10X Genomics, Inc. | Spatial analysis of epigenetic modifications |
| US11981960B1 (en) | 2020-07-06 | 2024-05-14 | 10X Genomics, Inc. | Spatial analysis utilizing degradable hydrogels |
| US11761038B1 (en) | 2020-07-06 | 2023-09-19 | 10X Genomics, Inc. | Methods for identifying a location of an RNA in a biological sample |
| US12209280B1 (en) | 2020-07-06 | 2025-01-28 | 10X Genomics, Inc. | Methods of identifying abundance and location of an analyte in a biological sample using second strand synthesis |
| US11952627B2 (en) | 2020-07-06 | 2024-04-09 | 10X Genomics, Inc. | Methods for identifying a location of an RNA in a biological sample |
| US11981958B1 (en) | 2020-08-20 | 2024-05-14 | 10X Genomics, Inc. | Methods for spatial analysis using DNA capture |
| US12249085B2 (en) | 2020-09-18 | 2025-03-11 | 10X Genomics, Inc. | Sample handling apparatus and image registration methods |
| US11926822B1 (en) | 2020-09-23 | 2024-03-12 | 10X Genomics, Inc. | Three-dimensional spatial analysis |
| US11827935B1 (en) | 2020-11-19 | 2023-11-28 | 10X Genomics, Inc. | Methods for spatial analysis using rolling circle amplification and detection probes |
| US11680260B2 (en) | 2020-12-21 | 2023-06-20 | 10X Genomics, Inc. | Methods, compositions, and systems for spatial analysis of analytes in a biological sample |
| US12371688B2 (en) | 2020-12-21 | 2025-07-29 | 10X Genomics, Inc. | Methods, compositions, and systems for spatial analysis of analytes in a biological sample |
| US11618897B2 (en) | 2020-12-21 | 2023-04-04 | 10X Genomics, Inc. | Methods, compositions, and systems for capturing probes and/or barcodes |
| US12241060B2 (en) | 2020-12-21 | 2025-03-04 | 10X Genomics, Inc. | Methods, compositions, and systems for capturing probes and/or barcodes |
| US11959076B2 (en) | 2020-12-21 | 2024-04-16 | 10X Genomics, Inc. | Methods, compositions, and systems for capturing probes and/or barcodes |
| US11873482B2 (en) | 2020-12-21 | 2024-01-16 | 10X Genomics, Inc. | Methods, compositions, and systems for spatial analysis of analytes in a biological sample |
| US12398262B1 (en) | 2021-01-22 | 2025-08-26 | 10X Genomics, Inc. | Triblock copolymer-based cell stabilization and fixation system and methods of use thereof |
| US12287264B2 (en) | 2021-02-19 | 2025-04-29 | 10X Genomics, Inc. | Modular assay support devices |
| US12098985B2 (en) | 2021-02-19 | 2024-09-24 | 10X Genomics, Inc. | Modular assay support devices |
| US11970739B2 (en) | 2021-03-18 | 2024-04-30 | 10X Genomics, Inc. | Multiplex capture of gene and protein expression from a biological sample |
| US11739381B2 (en) | 2021-03-18 | 2023-08-29 | 10X Genomics, Inc. | Multiplex capture of gene and protein expression from a biological sample |
| US12203134B2 (en) | 2021-04-14 | 2025-01-21 | 10X Genomics, Inc. | Methods of measuring mislocalization of an analyte |
| AU2022257481B2 (en) * | 2021-04-15 | 2025-11-06 | Portrai Inc. | Apparatus and method for predicting cell type enrichment from tissue images using spatially resolved gene expression data |
| US12365935B2 (en) | 2021-05-06 | 2025-07-22 | 10X Genomics, Inc. | Methods for increasing resolution of spatial analysis |
| US12071655B2 (en) | 2021-06-03 | 2024-08-27 | 10X Genomics, Inc. | Methods, compositions, kits, and systems for enhancing analyte capture for spatial analysis |
| US11753673B2 (en) | 2021-09-01 | 2023-09-12 | 10X Genomics, Inc. | Methods, compositions, and kits for blocking a capture probe on a spatial array |
| US11840724B2 (en) | 2021-09-01 | 2023-12-12 | 10X Genomics, Inc. | Methods, compositions, and kits for blocking a capture probe on a spatial array |
| WO2023076345A1 (en) | 2021-10-26 | 2023-05-04 | 10X Genomics, Inc. | Methods for spatial analysis using targeted rna capture |
| US12275988B2 (en) | 2021-11-10 | 2025-04-15 | 10X Genomics, Inc. | Methods, compositions, and kits for determining the location of an analyte in a biological sample |
| US12195790B2 (en) | 2021-12-01 | 2025-01-14 | 10X Genomics, Inc. | Methods for improved in situ detection of nucleic acids and spatial analysis |
| US12223751B2 (en) | 2021-12-20 | 2025-02-11 | 10X Genomics, Inc. | Self-test for imaging device |
| WO2023159028A1 (en) * | 2022-02-15 | 2023-08-24 | 10X Genomics, Inc. | Systems and methods for spatial analysis of analytes using fiducial alignment |
| WO2024015578A1 (en) | 2022-07-15 | 2024-01-18 | 10X Genomics, Inc. | Methods for determining a location of a target nucleic acid in a biological sample |
Also Published As
| Publication number | Publication date |
|---|---|
| CN114761992B (en) | 2023-08-08 |
| EP4468252A2 (en) | 2024-11-27 |
| EP4038546C0 (en) | 2024-08-21 |
| US20230394790A1 (en) | 2023-12-07 |
| US20210097684A1 (en) | 2021-04-01 |
| US12125260B2 (en) | 2024-10-22 |
| EP4468252A3 (en) | 2025-02-19 |
| EP4038546A1 (en) | 2022-08-10 |
| US20250022253A1 (en) | 2025-01-16 |
| CN114761992A (en) | 2022-07-15 |
| US11756286B2 (en) | 2023-09-12 |
| CN117036248A (en) | 2023-11-10 |
| EP4038546B1 (en) | 2024-08-21 |
| US20230081613A1 (en) | 2023-03-16 |
| US11514575B2 (en) | 2022-11-29 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US12125260B2 (en) | | Systems and methods for identifying morphological patterns in tissue samples |
| US20240378734A1 (en) | | Systems and methods for image registration or alignment |
| Moses et al. | | Museum of spatial transcriptomics |
| Waylen et al. | | From whole-mount to single-cell spatial assessment of gene expression in 3D |
| Park et al. | | Spatial transcriptomics: technical aspects of recent developments and their applications in neuroscience and cancer research |
| US12421558B2 (en) | | Systems and methods for joint interactive visualization of gene expression and DNA chromatin accessibility |
| US20230238078A1 (en) | | Systems and methods for machine learning biological samples to optimize permeabilization |
| US20230081232A1 (en) | | Systems and methods for machine learning features in biological samples |
| Chen et al. | | Benchmarking algorithms for spatially variable gene identification in spatial transcriptomics |
| US20230140008A1 (en) | | Systems and methods for evaluating biological samples |
| US20250272996A1 (en) | | Systems and methods for evaluating biological samples |
| Wang et al. | | ELLA: modeling subcellular spatial variation of gene expression within cells in high-resolution spatial transcriptomics |
| CN104182656B (en) | | A method for locating and displaying biological gene expression information and environmentally sensitive regions on chromosomes |
| US12406364B2 (en) | | Systems and methods for spatial analysis of analytes using fiducial alignment |
| US20250029681A1 (en) | | Systems and methods for cell-type identification |
| US20240052404A1 (en) | | Systems and methods for immunofluorescence quantification |
| Zhou et al. | | Spatial transcriptomics in transpathology |
| WO2024036191A1 (en) | | Systems and methods for colocalization |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 20792886; Country of ref document: EP; Kind code of ref document: A1 |
| | DPE1 | Request for preliminary examination filed after expiration of 19th month from priority date (pct application filed from 20040101) | |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | ENP | Entry into the national phase | Ref document number: 2020357847; Country of ref document: AU; Date of ref document: 20200930; Kind code of ref document: A |
| | ENP | Entry into the national phase | Ref document number: 2020792886; Country of ref document: EP; Effective date: 20220502 |
| | WWG | Wipo information: grant in national office | Ref document number: 11202203147T; Country of ref document: SG |
| | WWP | Wipo information: published in national office | Ref document number: 11202203147T; Country of ref document: SG |