WO2025166108A1 - Methods of generating digitally stained images from unstained biological samples - Google Patents
Methods of generating digitally stained images from unstained biological samples
- Publication number
- WO2025166108A1 (PCT/US2025/013949)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- training
- multispectral
- test
- transmission image
- image
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T11/00—2D [Two Dimensional] image generation
- G06T11/001—Texturing; Colouring; Generation of texture or colour
Definitions
- the present disclosure relates to microscopy methods and systems that utilize deep neural network learning for virtually morphologically staining one or more images derived from an unstained histological or cytological specimen.
- Deep learning neural networks are utilized to virtually morphologically stain one or more input images derived from an unstained histological or cytological specimen into one or more output images that are equivalent, such as diagnostically equivalent, to brightfield images of the same samples that are morphologically stained.
- Microscopic imaging of tissue samples is a fundamental tool used for the diagnosis of various diseases. Histopathologists use chemical staining techniques on tissue samples to highlight microscopic structure and composition, looking for abnormalities that indicate the presence, nature, and extent of disease. Preparing a tissue sample for microscopic imaging is, however, a time-consuming and expensive process. For instance, the process of preparing a stained tissue sample includes fixing the tissue sample with formalin, embedding the formalin-fixed tissue specimen to provide a formalin-fixed paraffin-embedded (FFPE) tissue sample, sectioning the FFPE tissue sample into thin slices, staining the tissue slices, and mounting the stained slices onto a glass slide, which is then followed by microscopic imaging.
- FFPE: formalin-fixed paraffin-embedded
- Tissue processing and staining are also destructive. Indeed, the aforementioned steps of preparing a stained tissue sample use multiple reagents and introduce irreversible effects onto the tissue sample. Additionally, different assays often require multiple tissue sections, which can quickly deplete valuable biopsy samples and increase the likelihood of needing a repeat biopsy from patients, further increasing cost, patient pain, and valuable time. Further, each assay requires highly trained and scarce histotechnicians, produces chemical waste, and requires years of costly physical storage for the resulting glass slides.
- Systems and methods are desired that generate virtually stained images of unstained biological specimens, including histological and cytological specimens. It would be desirable to have systems and methods that decrease the amount of time required to produce useful stained images of biological specimens. It would also be desirable to have systems and methods that facilitate the generation of multiple virtually stained images derived from a single acquired image of an unstained biological specimen, thereby mitigating the need for multiple tissue sections. To meet these needs, Applicant has developed systems and methods which facilitate the generation of one or more virtually stained images derived from an image of an unstained biological specimen, where the virtually stained image is equivalent, such as diagnostically equivalent, to a corresponding brightfield image of the same biological specimen that has been chemically stained. These and other embodiments are described herein.
- a first aspect of the present disclosure is a system for generating a virtually stained image of a test unstained biological specimen disposed on a substrate, the system comprising: (i) one or more processors, and (ii) one or more memories coupled to the one or more processors, the one or more memories to store computer-executable instructions that, when executed by the one or more processors, cause the system to perform operations comprising: (a) obtaining a test multispectral transmission image of the test unstained biological specimen, wherein the test multispectral transmission image is derived from one or more test multispectral transmission image channel images; (b) supplying the obtained test multispectral transmission image of the test unstained biological specimen to a virtual staining engine, wherein the virtual staining engine is trained to generate an image of an unstained biological specimen stained with a stain, and wherein the virtual staining engine is trained from a plurality of pairs of coregistered training images derived from one or more training biological specimens, (i) wherein a first training image in each pair of the plurality of coregistered training images is derived from one or more training multispectral transmission image channel images; and (ii) wherein a second training image in each pair of the plurality of coregistered training images comprises training brightfield image data of the same training biological specimen stained with the stain.
- the test multispectral transmission image of the test unstained biological specimen is derived from two or more test multispectral transmission image channel images, and wherein no compression technique or dimensionality reduction technique is applied to the two or more test multispectral transmission image channel images to derive the test multispectral transmission image of the test unstained biological specimen.
- the test multispectral transmission image of the test unstained biological specimen is derived from two or more test multispectral transmission image channel images, and wherein a compression technique or a dimensionality reduction technique (e.g., principal component analysis) is applied to the two or more test multispectral transmission image channel images to derive the test multispectral transmission image of the test unstained biological specimen.
- the test multispectral transmission image of the test unstained biological specimen is derived from three or more test multispectral transmission image channel images, and wherein no compression technique or dimensionality reduction technique is applied to the three or more test multispectral transmission image channel images to derive the test multispectral transmission image of the test unstained biological specimen.
- the test multispectral transmission image of the test unstained biological specimen is derived from three or more test multispectral transmission image channel images, and wherein a compression technique or a dimensionality reduction technique (e.g., principal component analysis) is applied to the three or more test multispectral transmission image channel images to derive the test multispectral transmission image of the test unstained biological specimen.
- the test multispectral transmission image of the test unstained biological specimen is derived from four or more test multispectral transmission image channel images, and wherein no compression technique or dimensionality reduction technique is applied to the four or more test multispectral transmission image channel images to derive the test multispectral transmission image of the test unstained biological specimen.
- the test multispectral transmission image of the test unstained biological specimen is derived from four or more test multispectral transmission image channel images, and wherein a compression technique or a dimensionality reduction technique (e.g., principal component analysis) is applied to the four or more test multispectral transmission image channel images to derive the test multispectral transmission image of the test unstained biological specimen.
- the test multispectral transmission image of the test unstained biological specimen is derived from twelve or more test multispectral transmission image channel images, and wherein no compression technique or dimensionality reduction technique is applied to the twelve or more test multispectral transmission image channel images to derive the test multispectral transmission image of the test unstained biological specimen.
- the test multispectral transmission image of the test unstained biological specimen is derived from twelve or more test multispectral transmission image channel images, and wherein a compression technique or a dimensionality reduction technique (e.g., principal component analysis) is applied to the twelve or more test multispectral transmission image channel images to derive the test multispectral transmission image of the test unstained biological specimen.
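Where a dimensionality reduction technique such as principal component analysis is applied to the channel images, the per-pixel projection can be sketched as below. This is an illustrative sketch only; the function name, the component count, and the use of a plain eigen-decomposition are assumptions, not part of the disclosure.

```python
import numpy as np

def reduce_channels_pca(channel_stack: np.ndarray, n_components: int = 3) -> np.ndarray:
    """Project an (H, W, C) multispectral channel stack onto its top
    principal components, yielding an (H, W, n_components) image.

    Hypothetical helper: the disclosure only states that a compression or
    dimensionality reduction technique (e.g., PCA) may be applied.
    """
    h, w, c = channel_stack.shape
    pixels = channel_stack.reshape(-1, c).astype(np.float64)
    pixels -= pixels.mean(axis=0)                  # center each channel
    # Eigen-decomposition of the C x C channel covariance matrix
    cov = pixels.T @ pixels / (pixels.shape[0] - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)         # ascending eigenvalues
    top = eigvecs[:, ::-1][:, :n_components]       # keep the top components
    return (pixels @ top).reshape(h, w, n_components)
```

For example, a stack of twelve channel images would be reduced to a three-channel image suitable as input to a 3-channel model, while retaining the directions of greatest spectral variance.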
- the first training image is derived from two or more training multispectral transmission image channel images, and wherein no compression technique or dimensionality reduction technique is applied to the two or more training multispectral transmission image channel images to derive the first training image.
- the first training image is derived from two or more training multispectral transmission image channel images, and wherein a compression technique or a dimensionality reduction technique is applied to the two or more training multispectral transmission image channel images to derive the first training image.
- the first training image is derived from three or more training multispectral transmission image channel images, and wherein no compression technique or dimensionality reduction technique is applied to the three or more training multispectral transmission image channel images to derive the first training image.
- the first training image is derived from three or more training multispectral transmission image channel images, and wherein a compression technique or a dimensionality reduction technique is applied to the three or more training multispectral transmission image channel images to derive the first training image.
- the first training image is derived from four or more training multispectral transmission image channel images, and wherein no compression technique or dimensionality reduction technique is applied to the four or more training multispectral transmission image channel images to derive the first training image.
- the first training image is derived from four or more training multispectral transmission image channel images, and wherein a compression technique or a dimensionality reduction technique is applied to the four or more training multispectral transmission image channel images to derive the first training image.
- the first training image is derived from twelve or more training multispectral transmission image channel images, and wherein no compression technique or dimensionality reduction technique is applied to the twelve or more training multispectral transmission image channel images to derive the first training image.
- the first training image is derived from twelve or more training multispectral transmission image channel images, and wherein a compression technique or a dimensionality reduction technique is applied to the twelve or more training multispectral transmission image channel images to derive the first training image.
- (i) at least three test multispectral transmission image channel images are acquired; and (ii) at least three training multispectral transmission image channel images are acquired; and wherein each of the at least three test multispectral transmission image channel images is acquired at about the same wavelengths as each of the at least three training multispectral transmission image channel images.
- (i) at least four test multispectral transmission image channel images are acquired; and (ii) at least four training multispectral transmission image channel images are acquired; and wherein each of the at least four test multispectral transmission image channel images is acquired at about the same wavelengths as each of the at least four training multispectral transmission image channel images.
- the test multispectral transmission image is generated by performing a dimensionality reduction (e.g., principal component analysis) on the at least four test multispectral transmission image channel images.
- the first training image in each pair of coregistered training images is generated by performing a dimensionality reduction on the at least four training multispectral transmission image channel images.
- (i) at least six test multispectral transmission image channel images are acquired; and (ii) at least six training multispectral transmission image channel images are acquired; and wherein each of the at least six test multispectral transmission image channel images is acquired at about the same wavelengths as each of the at least six training multispectral transmission image channel images.
- the test multispectral transmission image is generated by performing a dimensionality reduction on the at least six test multispectral transmission image channel images.
- the first training image in each pair of coregistered training images is generated by performing a dimensionality reduction on the at least six training multispectral transmission image channel images.
- (i) at least twelve test multispectral transmission image channel images are acquired; and (ii) at least twelve training multispectral transmission image channel images are acquired; and wherein each of the at least twelve test multispectral transmission image channel images is acquired at about the same wavelengths as each of the at least twelve training multispectral transmission image channel images.
- the test multispectral transmission image is generated by performing a dimensionality reduction on the at least twelve test multispectral transmission image channel images.
- the first training image in each pair of coregistered training images is generated by performing a dimensionality reduction on the at least twelve training multispectral transmission image channel images.
- the wavelengths are selected from about 365 +/- 20 nm, about 400 +/- 20 nm, about 435 +/- 20 nm, about 470 +/- 20 nm, about 500 +/- 20 nm, about 550 +/- 20 nm, about 580 +/- 20 nm, about 635 +/- 20 nm, about 660 +/- 20 nm, about 690 +/- 20 nm, about 780 +/- 20 nm, and/or about 850 +/- 20 nm.
- the wavelengths are selected from about 365 +/- 10 nm, about 400 +/- 10 nm, about 435 +/- 10 nm, about 470 +/- 10 nm, about 500 +/- 10 nm, about 550 +/- 10 nm, about 580 +/- 10 nm, about 635 +/- 10 nm, about 660 +/- 10 nm, about 690 +/- 10 nm, about 780 +/- 10 nm, and/or about 850 +/- 10 nm.
- the wavelengths are selected from about 365 nm, about 400 nm, about 435 nm, about 470 nm, about 500 nm, about 550 nm, about 580 nm, about 635 nm, about 660 nm, about 690 nm, about 780 nm, and/or about 850 nm.
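The requirement that test and training channels be acquired at "about the same wavelengths" can be verified with a simple tolerance comparison. The nominal wavelength list below is taken from the claims; the helper name, the per-index pairing, and the default tolerance are illustrative assumptions.

```python
# Nominal channel wavelengths (nm) listed in the disclosure.
NOMINAL_NM = [365, 400, 435, 470, 500, 550, 580, 635, 660, 690, 780, 850]

def channels_match(test_nm, train_nm, tol_nm=20):
    """Hypothetical check: True when the i-th test channel wavelength is
    within tol_nm of the i-th training channel wavelength for every channel.
    """
    if len(test_nm) != len(train_nm):
        return False
    return all(abs(a - b) <= tol_nm for a, b in zip(test_nm, train_nm))
```

A +/- 20 nm tolerance corresponds to the broadest "about the same" band recited above; tightening `tol_nm` to 10 reproduces the narrower embodiment.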
- the obtained virtual staining engine comprises a generative adversarial network (GAN) (e.g., a 3-channel GAN, a 12-channel GAN, etc.).
- GAN: generative adversarial network
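The disclosure identifies the virtual staining engine as a GAN but does not specify its architecture or objective. As one hedged illustration, the standard non-saturating GAN losses can be computed from discriminator scores as follows; the function and variable names are assumptions, not the disclosed method.

```python
import numpy as np

def gan_losses(d_real: np.ndarray, d_fake: np.ndarray, eps: float = 1e-7):
    """Non-saturating GAN losses from discriminator outputs in (0, 1).

    d_real: discriminator scores on real chemically stained brightfield patches.
    d_fake: discriminator scores on the generator's virtually stained patches.
    Returns (discriminator_loss, generator_loss). Sketch of the adversarial
    objective only; the networks themselves are not specified here.
    """
    d_real = np.clip(d_real, eps, 1 - eps)   # avoid log(0)
    d_fake = np.clip(d_fake, eps, 1 - eps)
    d_loss = -np.mean(np.log(d_real) + np.log(1 - d_fake))
    g_loss = -np.mean(np.log(d_fake))        # generator wants d_fake near 1
    return d_loss, g_loss
```

A "3-channel GAN" or "12-channel GAN" in this context would differ only in the number of input channels its generator accepts, not in the adversarial objective sketched here.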
- the stain is a primary stain. In some embodiments, the stain is a special stain. In some embodiments, the stain comprises hematoxylin. In some embodiments, the stain comprises hematoxylin and eosin.
- a second aspect of the present disclosure is a method of generating a virtually stained image of a test unstained biological specimen disposed on a substrate, the method comprising: (a) obtaining a test multispectral transmission image of the test unstained biological specimen, wherein the test multispectral transmission image is derived from one or more test multispectral transmission image channel images; (b) supplying the obtained test multispectral transmission image of the test unstained biological specimen to a virtual staining engine, wherein the virtual staining engine is trained to generate an image of an unstained biological specimen stained with a stain, and wherein the virtual staining engine is trained from a plurality of pairs of coregistered training images derived from one or more training biological specimens, (i) wherein a first training image in each pair of the plurality of coregistered training images is derived from one or more training multispectral transmission image channel images; (ii) wherein a second training image in each pair of the plurality of coregistered training images comprises training brightfield image data of the same training biological specimen stained with the stain.
- the test multispectral transmission image of the test unstained biological specimen is derived from two or more test multispectral transmission image channel images, and wherein no compression technique or dimensionality reduction technique is applied to the two or more test multispectral transmission image channel images to derive the test multispectral transmission image of the test unstained biological specimen.
- the test multispectral transmission image of the test unstained biological specimen is derived from two or more test multispectral transmission image channel images, and wherein a compression technique or a dimensionality reduction technique (e.g., principal component analysis) is applied to the two or more test multispectral transmission image channel images to derive the test multispectral transmission image of the test unstained biological specimen.
- the test multispectral transmission image of the test unstained biological specimen is derived from three or more test multispectral transmission image channel images, and wherein no compression technique or dimensionality reduction technique is applied to the three or more test multispectral transmission image channel images to derive the test multispectral transmission image of the test unstained biological specimen.
- the test multispectral transmission image of the test unstained biological specimen is derived from three or more test multispectral transmission image channel images, and wherein a compression technique or a dimensionality reduction technique (e.g., principal component analysis) is applied to the three or more test multispectral transmission image channel images to derive the test multispectral transmission image of the test unstained biological specimen.
- the test multispectral transmission image of the test unstained biological specimen is derived from four or more test multispectral transmission image channel images, and wherein no compression technique or dimensionality reduction technique is applied to the four or more test multispectral transmission image channel images to derive the test multispectral transmission image of the test unstained biological specimen.
- the test multispectral transmission image of the test unstained biological specimen is derived from four or more test multispectral transmission image channel images, and wherein a compression technique or a dimensionality reduction technique (e.g., principal component analysis) is applied to the four or more test multispectral transmission image channel images to derive the test multispectral transmission image of the test unstained biological specimen.
- the test multispectral transmission image of the test unstained biological specimen is derived from twelve or more test multispectral transmission image channel images, and wherein no compression technique or dimensionality reduction technique is applied to the twelve or more test multispectral transmission image channel images to derive the test multispectral transmission image of the test unstained biological specimen.
- the test multispectral transmission image of the test unstained biological specimen is derived from twelve or more test multispectral transmission image channel images, and wherein a compression technique or a dimensionality reduction technique (e.g., principal component analysis) is applied to the twelve or more test multispectral transmission image channel images to derive the test multispectral transmission image of the test unstained biological specimen.
- the first training image is derived from two or more training multispectral transmission image channel images, and wherein no compression technique or dimensionality reduction technique is applied to the two or more training multispectral transmission image channel images to derive the first training image.
- the first training image is derived from two or more training multispectral transmission image channel images, and wherein a compression technique or a dimensionality reduction technique is applied to the two or more training multispectral transmission image channel images to derive the first training image.
- the first training image is derived from three or more training multispectral transmission image channel images, and wherein no compression technique or dimensionality reduction technique is applied to the three or more training multispectral transmission image channel images to derive the first training image.
- the first training image is derived from three or more training multispectral transmission image channel images, and wherein a compression technique or a dimensionality reduction technique is applied to the three or more training multispectral transmission image channel images to derive the first training image.
- the first training image is derived from four or more training multispectral transmission image channel images, and wherein no compression technique or dimensionality reduction technique is applied to the four or more training multispectral transmission image channel images to derive the first training image.
- the first training image is derived from four or more training multispectral transmission image channel images, and wherein a compression technique or a dimensionality reduction technique is applied to the four or more training multispectral transmission image channel images to derive the first training image.
- the first training image is derived from twelve or more training multispectral transmission image channel images, and wherein no compression technique or dimensionality reduction technique is applied to the twelve or more training multispectral transmission image channel images to derive the first training image.
- the first training image is derived from twelve or more training multispectral transmission image channel images, and wherein a compression technique or a dimensionality reduction technique is applied to the twelve or more training multispectral transmission image channel images to derive the first training image.
- (i) at least three test multispectral transmission image channel images are acquired; and (ii) at least three training multispectral transmission image channel images are acquired; wherein each of the at least three test multispectral transmission image channel images is acquired at about the same wavelengths as each of the at least three training multispectral transmission image channel images.
- (i) at least four test multispectral transmission image channel images are acquired; and (ii) at least four training multispectral transmission image channel images are acquired; and wherein each of the at least four test multispectral transmission image channel images is acquired at about the same wavelengths as each of the at least four training multispectral transmission image channel images.
- the test multispectral transmission image is generated by performing a dimensionality reduction on the at least four test multispectral transmission image channel images.
- the first training image in each pair of coregistered training images is generated by performing a dimensionality reduction on the at least four training multispectral transmission image channel images.
- (i) at least six test multispectral transmission image channel images are acquired; and (ii) at least six training multispectral transmission image channel images are acquired; and wherein each of the at least six test multispectral transmission image channel images is acquired at about the same wavelengths as each of the at least six training multispectral transmission image channel images.
- the test multispectral transmission image is generated by performing a dimensionality reduction on the at least six test multispectral transmission image channel images.
- the first training image in each pair of coregistered training images is generated by performing a dimensionality reduction on the at least six training multispectral transmission image channel images.
- (i) at least twelve test multispectral transmission image channel images are acquired; and (ii) at least twelve training multispectral transmission image channel images are acquired; and wherein each of the at least twelve test multispectral transmission image channel images is acquired at about the same wavelengths as each of the at least twelve training multispectral transmission image channel images.
- the test multispectral transmission image is generated by performing a dimensionality reduction on the at least twelve test multispectral transmission image channel images.
- the first training image in each pair of coregistered training images is generated by performing a dimensionality reduction on the at least twelve training multispectral transmission image channel images.
- the wavelengths are selected from about 365 +/- 20 nm, about 400 +/- 20 nm, about 435 +/- 20 nm, about 470 +/- 20 nm, about 500 +/- 20 nm, about 550 +/- 20 nm, about 580 +/- 20 nm, about 635 +/- 20 nm, about 660 +/- 20 nm, about 690 +/- 20 nm, about 780 +/- 20 nm, and/or about 850 +/- 20 nm.
- the wavelengths are selected from about 365 +/- 10 nm, about 400 +/- 10 nm, about 435 +/- 10 nm, about 470 +/- 10 nm, about 500 +/- 10 nm, about 550 +/- 10 nm, about 580 +/- 10 nm, about 635 +/- 10 nm, about 660 +/- 10 nm, about 690 +/- 10 nm, about 780 +/- 10 nm, and/or about 850 +/- 10 nm.
- the wavelengths are selected from about 365 nm, about 400 nm, about 435 nm, about 470 nm, about 500 nm, about 550 nm, about 580 nm, about 635 nm, about 660 nm, about 690 nm, about 780 nm, and/or about 850 nm.
- the obtained virtual staining engine comprises a generative adversarial network.
- the morphological stain is a primary stain. In some embodiments, the morphological stain is a special stain. In some embodiments, the morphological stain comprises hematoxylin. In some embodiments, the morphological stain comprises hematoxylin and eosin.
- a third aspect of the present disclosure is a system for generating a virtually stained image of a test unstained biological specimen disposed on a substrate, the system comprising: (i) one or more processors, and (ii) one or more memories coupled to the one or more processors, the one or more memories to store computer-executable instructions that, when executed by the one or more processors, cause the system to perform operations comprising: (a) obtaining a test multispectral transmission image of the test unstained biological specimen, wherein the test multispectral transmission image is derived from one or more test multispectral transmission image channel images; (b) supplying the obtained test multispectral transmission image of the test unstained biological specimen to a virtual staining engine, wherein the virtual staining engine is trained to generate an image of an unstained biological specimen stained with a morphological stain; and (c) with the trained virtual staining engine, generating the virtually stained image of the test unstained biological specimen stained with the morphological stain.
- the test multispectral transmission image of the test unstained biological specimen is derived from two or more test multispectral transmission image channel images, and wherein no compression technique or dimensionality reduction technique is applied to the two or more test multispectral transmission image channel images to derive the test multispectral transmission image of the test unstained biological specimen.
- the test multispectral transmission image of the test unstained biological specimen is derived from two or more test multispectral transmission image channel images, and wherein a compression technique or a dimensionality reduction technique (e.g., principal component analysis) is applied to the two or more test multispectral transmission image channel images to derive the test multispectral transmission image of the test unstained biological specimen.
- the test multispectral transmission image of the test unstained biological specimen is derived from three or more test multispectral transmission image channel images, and wherein no compression technique or dimensionality reduction technique is applied to the three or more test multispectral transmission image channel images to derive the test multispectral transmission image of the test unstained biological specimen.
- the test multispectral transmission image of the test unstained biological specimen is derived from three or more test multispectral transmission image channel images, and wherein a compression technique or a dimensionality reduction technique (e.g., principal component analysis) is applied to the three or more test multispectral transmission image channel images to derive the test multispectral transmission image of the test unstained biological specimen.
- the test multispectral transmission image of the test unstained biological specimen is derived from four or more test multispectral transmission image channel images, and wherein no compression technique or dimensionality reduction technique is applied to the four or more test multispectral transmission image channel images to derive the test multispectral transmission image of the test unstained biological specimen.
- the test multispectral transmission image of the test unstained biological specimen is derived from four or more test multispectral transmission image channel images, and wherein a compression technique or a dimensionality reduction technique (e.g., principal component analysis) is applied to the four or more test multispectral transmission image channel images to derive the test multispectral transmission image of the test unstained biological specimen.
- the test multispectral transmission image of the test unstained biological specimen is derived from twelve or more test multispectral transmission image channel images, and wherein no compression technique or dimensionality reduction technique is applied to the twelve or more test multispectral transmission image channel images to derive the test multispectral transmission image of the test unstained biological specimen.
- the test multispectral transmission image of the test unstained biological specimen is derived from twelve or more test multispectral transmission image channel images, and wherein a compression technique or a dimensionality reduction technique (e.g., principal component analysis) is applied to the twelve or more test multispectral transmission image channel images to derive the test multispectral transmission image of the test unstained biological specimen.
- the virtual staining engine is trained using (a) one or more training multispectral transmission image channel images of an unstained training biological specimen; and (b) brightfield training image data of the same unstained training biological specimen stained with a morphological stain.
- each of the one or more test multispectral transmission image channel images is acquired at about the same wavelengths as each of the one or more training multispectral transmission image channel images.
- the training brightfield image data is acquired using a brightfield image acquisition device.
- the training brightfield image data is acquired using a multispectral image acquisition device.
- the virtual staining engine is trained from a plurality of pairs of coregistered training images derived from one or more training biological specimens.
- a first training image in each pair of the plurality of coregistered training images is derived from one or more training multispectral transmission image channel images; and wherein a second training image in each pair of the plurality of coregistered training images comprises training brightfield image data of the same biological specimen stained with the morphological stain.
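Coregistering each training pair requires aligning the multispectral image with the stained brightfield image of the same specimen. The sketch below estimates a whole-pixel translation via phase correlation; it is an illustrative stand-in only, as real pipelines typically require affine or elastic registration, and the function name and approach are assumptions rather than the disclosed method.

```python
import numpy as np

def estimate_shift(ref: np.ndarray, moving: np.ndarray):
    """Estimate the integer (dy, dx) translation that best aligns `moving`
    to `ref` using phase correlation on single-channel images.
    """
    # Normalized cross-power spectrum; its inverse FFT peaks at the shift.
    f = np.fft.fft2(ref) * np.conj(np.fft.fft2(moving))
    corr = np.fft.ifft2(f / (np.abs(f) + 1e-9)).real
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    h, w = ref.shape
    # Wrap shifts larger than half the image into negative offsets.
    if dy > h // 2:
        dy -= h
    if dx > w // 2:
        dx -= w
    return int(dy), int(dx)
```

Applying the returned translation to the moving image (e.g., with `np.roll`) brings the pair into pixel correspondence before it is used as a coregistered training pair.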
- each of the one or more test multispectral transmission image channel images is acquired at about the same wavelengths as each of the one or more training multispectral transmission image channel images.
- the wavelengths are selected from about 365 +/- 20 nm, about 400 +/- 20 nm, about 435 +/- 20 nm, about 470 +/- 20 nm, about 500 +/- 20 nm, about 550 +/- 20 nm, about 580 +/- 20 nm, about 635 +/- 20 nm, about 660 +/- 20 nm, about 690 +/- 20 nm, about 780 +/- 20 nm, and/or about 850 +/- 20 nm.
- the wavelengths are selected from about 365 +/- 10 nm, about 400 +/- 10 nm, about 435 +/- 10 nm, about 470 +/- 10 nm, about 500 +/- 10 nm, about 550 +/- 10 nm, about 580 +/- 10 nm, about 635 +/- 10 nm, about 660 +/- 10 nm, about 690 +/- 10 nm, about 780 +/- 10 nm, and/or about 850 +/- 10 nm.
- the wavelengths are selected from about 365 nm, about 400 nm, about 435 nm, about 470 nm, about 500 nm, about 550 nm, about 580 nm, about 635 nm, about 660 nm, about 690 nm, about 780 nm, and/or about 850 nm.
- a fourth aspect of the present disclosure is a method of generating a virtually stained image of a test unstained biological specimen disposed on a substrate, the method comprising (a) obtaining a test multispectral transmission image of the test unstained biological specimen, wherein the test multispectral transmission image is derived from one or more test multispectral transmission image channel images; (b) supplying the obtained test multispectral transmission image of the test unstained biological specimen to a virtual staining engine, wherein the virtual staining engine is trained to generate an image of an unstained biological specimen stained with a morphological stain; and (c) with the trained virtual staining engine, generating the virtually stained image of the test unstained biological specimen stained with the morphological stain.
- the virtual staining engine is trained using (a) one or more training multispectral transmission image channel images of an unstained training biological specimen; and (b) brightfield training image data of the same unstained training biological specimen stained with a morphological stain.
- each of the one or more test multispectral transmission image channel images is acquired at about the same wavelengths as each of the one or more training multispectral transmission image channel images.
- the training brightfield image data is acquired using a brightfield image acquisition device.
- the training brightfield image data is acquired using a multispectral image acquisition device.
- a fifth aspect of the present disclosure is a system for generating two or more virtually stained images of a test unstained biological specimen disposed on a substrate, the system comprising: (i) one or more processors, and (ii) one or more memories coupled to the one or more processors, the one or more memories to store computer-executable instructions that, when executed by the one or more processors, cause the system to perform operations comprising: (a) obtaining a test multispectral transmission image of the test unstained biological specimen, wherein the test multispectral transmission image is derived from one or more test multispectral transmission image channel images; (b) supplying the obtained test multispectral transmission image of the test unstained biological specimen to two or more different virtual staining engines, wherein each virtual staining engine of the two or more different virtual staining engines is trained to generate an image of an unstained biological specimen stained with a different morphological stain; and (c) with the trained virtual staining engines, generating two or more virtually stained images of the test unstained biological specimen, wherein each of the generated two or more virtually stained images is stained with a different morphological stain.
- the two or more virtual staining engines are each independently trained using different sets of training data, such as where each training data set includes images stained with a specific morphological stain.
- the two or more virtual staining engines are each independently trained using (a) one or more training multispectral transmission image channel images of an unstained training biological specimen; and (b) brightfield training image data of the same unstained training biological specimen stained with a morphological stain.
- each of the one or more test multispectral transmission image channel images is acquired at about the same wavelengths as each of the one or more training multispectral transmission image channel images used to train each of the two or more virtual staining engines.
- the training brightfield image data is acquired using a brightfield image acquisition device.
- the training brightfield image data is acquired using a multispectral image acquisition device.
- the wavelengths are selected from about 365 +/- 20 nm, about 400 +/- 20 nm, about 435 +/- 20 nm, about 470 +/- 20 nm, about 500 +/- 20 nm, about 550 +/- 20 nm, about 580 +/- 20 nm, about 635 +/- 20 nm, about 660 +/- 20 nm, about 690 +/- 20 nm, about 780 +/- 20 nm, and/or about 850 +/- 20 nm.
- the wavelengths are selected from about 365 +/- 10 nm, about 400 +/- 10 nm, about 435 +/- 10 nm, about 470 +/- 10 nm, about 500 +/- 10 nm, about 550 +/- 10 nm, about 580 +/- 10 nm, about 635 +/- 10 nm, about 660 +/- 10 nm, about 690 +/- 10 nm, about 780 +/- 10 nm, and/or about 850 +/- 10 nm.
- the wavelengths are selected from about 365 nm, about 400 nm, about 435 nm, about 470 nm, about 500 nm, about 550 nm, about 580 nm, about 635 nm, about 660 nm, about 690 nm, about 780 nm, and/or about 850 nm.
- each of the two or more virtual staining engines is trained from a plurality of pairs of coregistered training images derived from one or more training biological specimens.
- a first training image in each pair of the plurality of coregistered training images is derived from one or more training multispectral transmission image channel images; and wherein a second training image in each pair of the plurality of coregistered training images comprises training brightfield image data of the same biological specimen stained with the morphological stain.
- the training brightfield image data is acquired using a brightfield image acquisition device.
- the training brightfield image data is acquired using a multispectral image acquisition device.
- a sixth aspect of the present disclosure is a method of generating two or more virtually stained images of a test unstained biological specimen disposed on a substrate, the method comprising (a) obtaining a test multispectral transmission image of the test unstained biological specimen, wherein the test multispectral transmission image is derived from one or more test multispectral transmission image channel images; (b) supplying the obtained test multispectral transmission image of the test unstained biological specimen to two or more different virtual staining engines, wherein each virtual staining engine of the two or more different virtual staining engines is trained to generate an image of an unstained biological specimen stained with a different morphological stain; and (c) with the trained virtual staining engines, generating two or more virtually stained images of the test unstained biological specimen, wherein each of the generated two or more virtually stained images is stained with a different morphological stain.
- each of the two or more virtual staining engines is trained from a plurality of pairs of coregistered training images derived from one or more training biological specimens.
- a first training image in each pair of the plurality of coregistered training images is derived from one or more training multispectral transmission image channel images; and wherein a second training image in each pair of the plurality of coregistered training images comprises training brightfield image data of the same biological specimen stained with the morphological stain.
- the training brightfield image data is acquired using a brightfield image acquisition device.
- the training brightfield image data is acquired using a multispectral image acquisition device.
- a seventh aspect of the present disclosure is a system for generating a virtually stained image of a test unstained biological specimen disposed on a substrate, the system comprising: (i) one or more processors, and (ii) one or more memories coupled to the one or more processors, the one or more memories to store computer-executable instructions that, when executed by the one or more processors, cause the system to perform operations comprising: (a) obtaining a trained virtual staining engine trained to generate an image of an unstained biological specimen stained with a morphological stain, wherein the trained virtual staining engine is trained from a plurality of pairs of coregistered training images, (i) wherein a first training image in each pair of the plurality of coregistered training images is derived from training multispectral transmission image data of an unstained training biological specimen acquired at three or more different wavelengths, and (ii) wherein a second training image in each pair of the plurality of coregistered training images comprises training brightfield image data of the same training biological specimen stained with a morphological stain.
- the training brightfield image data is acquired using a brightfield image acquisition device. In some embodiments, the training brightfield image data is acquired using a multispectral image acquisition device.
- the morphological stain is a primary stain.
- the morphological stain is a special stain (e.g., Masson's Trichrome or any of the other exemplary special stains described herein).
- the morphological stain comprises hematoxylin.
- the morphological stain comprises hematoxylin and eosin.
- the training multispectral transmission image data of the unstained training biological specimen is acquired at four or more different wavelengths.
- the first training image is generated without reducing a dimensionality of the training multispectral transmission image data at the four or more different wavelengths.
- the first training image is generated by reducing a dimensionality of the training multispectral transmission image data at the four or more different wavelengths.
- the dimensionality is reduced using principal component analysis.
- the training multispectral transmission image data of the unstained training biological specimen is acquired at six or more different wavelengths.
- the first training image is generated without reducing a dimensionality of the training multispectral transmission image data acquired at the six or more different wavelengths.
- the first training image is generated by reducing a dimensionality of the training multispectral transmission image data acquired at the six or more different wavelengths.
- the dimensionality is reduced using principal component analysis.
- the training multispectral transmission image data of the unstained training biological specimen is acquired at twelve or more different wavelengths.
- the first training image is generated without reducing a dimensionality of the training multispectral transmission image data acquired at the twelve or more different wavelengths.
- the first training image is generated by reducing a dimensionality of the training multispectral transmission image data acquired at the twelve or more different wavelengths.
- the dimensionality is reduced using principal component analysis.
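The principal component analysis reduction described above can be sketched as follows; this is an illustrative sketch only, with NumPy assumed and the 12-channel stack, its size, and the choice of three output components all hypothetical:

```python
import numpy as np

# Hypothetical 12-channel multispectral stack: (height, width, channels).
rng = np.random.default_rng(0)
stack = rng.random((64, 64, 12))

# Each pixel becomes a 12-dimensional spectral sample; PCA via SVD then
# projects every pixel onto the top three principal components, reducing
# the 12-channel image to a 3-channel image.
pixels = stack.reshape(-1, 12)
pixels = pixels - pixels.mean(axis=0)
_, _, vt = np.linalg.svd(pixels, full_matrices=False)
reduced_image = (pixels @ vt[:3].T).reshape(64, 64, 3)
print(reduced_image.shape)  # (64, 64, 3)
```

In practice the number of retained components would be chosen to match the input channel count expected by the virtual staining engine.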
- the at least two wavelengths are selected from about 365 +/- 20 nm, about 400 +/- 20 nm, about 435 +/- 20 nm, about 470 +/- 20 nm, about 500 +/- 20 nm, about 550 +/- 20 nm, about 580 +/- 20 nm, about 635 +/- 20 nm, about 660 +/- 20 nm, about 690 +/- 20 nm, about 780 +/- 20 nm, and/or about 850 +/- 20 nm.
- the at least two wavelengths are selected from about 365 +/- 10 nm, about 400 +/- 10 nm, about 435 +/- 10 nm, about 470 +/- 10 nm, about 500 +/- 10 nm, about 550 +/- 10 nm, about 580 +/- 10 nm, about 635 +/- 10 nm, about 660 +/- 10 nm, about 690 +/- 10 nm, about 780 +/- 10 nm, and/or about 850 +/- 10 nm.
- the at least two wavelengths are selected from about 365 nm, about 400 nm, about 435 nm, about 470 nm, about 500 nm, about 550 nm, about 580 nm, about 635 nm, about 660 nm, about 690 nm, about 780 nm, and/or about 850 nm.
- the training multispectral transmission image data and the test multispectral image data are acquired using a multispectral image acquisition device.
- the training brightfield image data is acquired using a brightfield image acquisition device.
- the training brightfield image data is acquired using a multispectral image acquisition device, wherein the multispectral image acquisition device is configured to capture image data at about 700 nm +/- 10 nm, about 550 nm +/- 10 nm, and about 470 nm +/- 10 nm.
- the obtained trained virtual staining engine comprises a generative adversarial network.
- An eighth aspect of the present disclosure is a method of generating a virtually stained image of a test unstained biological specimen disposed on a substrate, comprising: (a) obtaining a trained virtual staining engine trained to generate an image of an unstained biological specimen stained with a morphological stain, wherein the trained virtual staining engine is trained from a plurality of pairs of coregistered training images, (i) wherein a first training image in each pair of the plurality of coregistered training images is derived from training multispectral transmission image data of an unstained training biological specimen illuminated with at least two different illumination sources, and (ii) wherein a second training image in each pair of the plurality of coregistered training images comprises training brightfield image data of the same training biological specimen stained with a morphological stain; (b) obtaining a test multispectral transmission image of the test unstained biological specimen, wherein the obtained test multispectral transmission image is derived from test multispectral transmission image data of the test unstained biological specimen illuminated with the at least two different illumination sources; and (c) with the trained virtual staining engine, generating the virtually stained image of the test unstained biological specimen.
- the morphological stain is a primary stain. In some embodiments, the morphological stain is a special stain. In some embodiments, the morphological stain comprises hematoxylin. In some embodiments, the morphological stain comprises hematoxylin and eosin.
- the unstained training biological specimen and the test unstained biological specimen are illuminated with at least four different illumination sources. In some embodiments, the unstained training biological specimen and the test unstained biological specimen are illuminated with at least six different illumination sources. In some embodiments, the unstained training biological specimen and the test unstained biological specimen are illuminated with at least twelve different illumination sources.
- the at least two illumination sources are selected from about 365 +/- 20 nm, about 400 +/- 20 nm, about 435 +/- 20 nm, about 470 +/- 20 nm, about 500 +/- 20 nm, about 550 +/- 20 nm, about 580 +/- 20 nm, about 635 +/- 20 nm, about 660 +/- 20 nm, about 690 +/- 20 nm, about 780 +/- 20 nm, and/or about 850 +/- 20 nm.
- the at least two illumination sources are selected from about 365 +/- 10 nm, about 400 +/- 10 nm, about 435 +/- 10 nm, about 470 +/- 10 nm, about 500 +/- 10 nm, about 550 +/- 10 nm, about 580 +/- 10 nm, about 635 +/- 10 nm, about 660 +/- 10 nm, about 690 +/- 10 nm, about 780 +/- 10 nm, and/or about 850 +/- 10 nm.
- the at least two illumination sources are selected from about 365 nm, about 400 nm, about 435 nm, about 470 nm, about 500 nm, about 550 nm, about 580 nm, about 635 nm, about 660 nm, about 690 nm, about 780 nm, and/or about 850 nm.
- the training multispectral transmission image data and the test multispectral image data are acquired using a multispectral image acquisition device.
- the training brightfield image data is acquired using a brightfield image acquisition device.
- the training brightfield image data is acquired using a multispectral image acquisition device, wherein the multispectral image acquisition device is configured to capture image data at about 700 nm +/- 10 nm, about 550 nm +/- 10 nm, and about 470 nm +/- 10 nm.
- the obtained trained virtual staining engine comprises a generative adversarial network.
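As a non-authoritative sketch of the kind of objective a pix2pix-style generative adversarial network used as a virtual staining engine might minimize (the loss weighting, array shapes, and discriminator outputs below are assumptions, not the disclosed implementation):

```python
import numpy as np

# Generator objective of a pix2pix-style conditional GAN: an adversarial
# term (fool the discriminator) plus a weighted L1 term tying the
# generated stained image to the coregistered brightfield ground truth.

def pix2pix_generator_loss(disc_fake, fake_img, real_img, lam=100.0):
    adv = -np.mean(np.log(disc_fake + 1e-8))   # non-saturating adversarial loss
    l1 = np.mean(np.abs(fake_img - real_img))  # reconstruction fidelity
    return adv + lam * l1

disc_fake = np.full((4, 1), 0.5)     # discriminator scores in (0, 1)
fake = np.zeros((4, 64, 64, 3))      # generated virtually stained patches
real = np.zeros((4, 64, 64, 3))      # coregistered ground-truth patches
print(round(pix2pix_generator_loss(disc_fake, fake, real), 4))  # 0.6931
```

The L1 term is what exploits the coregistration of each training pair: without pixel-aligned ground truth, only the adversarial term could be applied.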
- each pair of the plurality of coregistered training images is derived from a different training biological specimen.
- a ninth aspect of the present disclosure is a non-transitory computer-readable medium storing instructions for generating a virtually stained image of a test unstained biological specimen disposed on a substrate, the instructions comprising: (a) obtaining a test multispectral transmission image of the test unstained biological specimen, wherein the test multispectral transmission image is derived from one or more test multispectral transmission image channel images; (b) supplying the obtained test multispectral transmission image of the test unstained biological specimen to a virtual staining engine, wherein the virtual staining engine is trained to generate an image of an unstained biological specimen stained with a morphological stain, and wherein the virtual staining engine is trained from a plurality of pairs of coregistered training images derived from one or more training biological specimens, (i) wherein a first training image in each pair of the plurality of coregistered training images is derived from one or more training multispectral transmission image channel images, and (ii) wherein a second training image in each pair of the plurality of coregistered training images comprises training brightfield image data of the same training biological specimen stained with the morphological stain; and (c) with the trained virtual staining engine, generating the virtually stained image of the test unstained biological specimen.
- the test multispectral transmission image of the test unstained biological specimen is derived from two or more test multispectral transmission image channel images, and wherein no compression technique or dimensionality reduction technique is applied to the two or more test multispectral transmission image channel images to derive the test multispectral transmission image of the test unstained biological specimen.
- the test multispectral transmission image of the test unstained biological specimen is derived from two or more test multispectral transmission image channel images, and wherein a compression technique or a dimensionality reduction technique (e.g., principal component analysis) is applied to the two or more test multispectral transmission image channel images to derive the test multispectral transmission image of the test unstained biological specimen.
- the test multispectral transmission image of the test unstained biological specimen is derived from three or more test multispectral transmission image channel images, and wherein no compression technique or dimensionality reduction technique is applied to the three or more test multispectral transmission image channel images to derive the test multispectral transmission image of the test unstained biological specimen.
- the test multispectral transmission image of the test unstained biological specimen is derived from three or more test multispectral transmission image channel images, and wherein a compression technique or a dimensionality reduction technique (e.g., principal component analysis) is applied to the three or more test multispectral transmission image channel images to derive the test multispectral transmission image of the test unstained biological specimen.
- the test multispectral transmission image of the test unstained biological specimen is derived from four or more test multispectral transmission image channel images, and wherein no compression technique or dimensionality reduction technique is applied to the four or more test multispectral transmission image channel images to derive the test multispectral transmission image of the test unstained biological specimen.
- the test multispectral transmission image of the test unstained biological specimen is derived from four or more test multispectral transmission image channel images, and wherein a compression technique or a dimensionality reduction technique (e.g., principal component analysis) is applied to the four or more test multispectral transmission image channel images to derive the test multispectral transmission image of the test unstained biological specimen.
- the test multispectral transmission image of the test unstained biological specimen is derived from twelve or more test multispectral transmission image channel images, and wherein no compression technique or dimensionality reduction technique is applied to the twelve or more test multispectral transmission image channel images to derive the test multispectral transmission image of the test unstained biological specimen.
- the test multispectral transmission image of the test unstained biological specimen is derived from twelve or more test multispectral transmission image channel images, and wherein a compression technique or a dimensionality reduction technique (e.g., principal component analysis) is applied to the twelve or more test multispectral transmission image channel images to derive the test multispectral transmission image of the test unstained biological specimen.
- the first training image is derived from two or more training multispectral transmission image channel images, and wherein no compression technique or dimensionality reduction technique is applied to the two or more training multispectral transmission image channel images to derive the first training image.
- the first training image is derived from two or more training multispectral transmission image channel images, and wherein a compression technique or a dimensionality reduction technique is applied to the two or more training multispectral transmission image channel images to derive the first training image.
- the first training image is derived from three or more training multispectral transmission image channel images, and wherein no compression technique or dimensionality reduction technique is applied to the three or more training multispectral transmission image channel images to derive the first training image.
- the first training image is derived from three or more training multispectral transmission image channel images, and wherein a compression technique or a dimensionality reduction technique is applied to the three or more training multispectral transmission image channel images to derive the first training image.
- the first training image is derived from four or more training multispectral transmission image channel images, and wherein no compression technique or dimensionality reduction technique is applied to the four or more training multispectral transmission image channel images to derive the first training image.
- the first training image is derived from four or more training multispectral transmission image channel images, and wherein a compression technique or a dimensionality reduction technique is applied to the four or more training multispectral transmission image channel images to derive the first training image.
- the first training image is derived from twelve or more training multispectral transmission image channel images, and wherein no compression technique or dimensionality reduction technique is applied to the twelve or more training multispectral transmission image channel images to derive the first training image.
- the first training image is derived from twelve or more training multispectral transmission image channel images, and wherein a compression technique or a dimensionality reduction technique is applied to the twelve or more training multispectral transmission image channel images to derive the first training image.
- a tenth aspect of the present disclosure is a method of virtually staining an image derived from an unstained test biological specimen, comprising obtaining test multispectral image data from the unstained test biological specimen, wherein the obtained test multispectral image data comprises at least four multispectral transmission image channel images acquired at different wavelengths; reducing the dimensionality of the obtained test multispectral image data thereby generating a multi-channel multispectral test transmission image; and generating the virtually stained image from the multi-channel multispectral test transmission image using a virtual staining engine trained to generate an image of an unstained biological specimen stained with a particular morphological stain.
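The tenth-aspect flow above can be sketched end to end as follows; the stand-in staining engine, channel count, image size, and random channel data are illustrative assumptions only:

```python
import numpy as np

# Sketch of the tenth-aspect flow: four single-wavelength channel images
# are stacked, reduced to a three-channel image by PCA, and passed to a
# stand-in virtual staining engine (a placeholder, not a trained model).

def reduce_to_three_channels(channels):
    stack = np.stack(channels, axis=-1)               # (H, W, 4)
    pixels = stack.reshape(-1, stack.shape[-1])
    pixels = pixels - pixels.mean(axis=0)
    _, _, vt = np.linalg.svd(pixels, full_matrices=False)
    return (pixels @ vt[:3].T).reshape(*stack.shape[:2], 3)

def staining_engine_stub(image):
    # Placeholder for a trained virtual staining engine.
    return np.clip(image, 0.0, 1.0)

rng = np.random.default_rng(1)
channels = [rng.random((32, 32)) for _ in range(4)]   # hypothetical wavelengths
virtual = staining_engine_stub(reduce_to_three_channels(channels))
print(virtual.shape)  # (32, 32, 3)
```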
- FIG. 1 compares virtually stained images of tissue samples to the same tissue samples which were chemically stained, such as with a morphological stain.
- FIG. 2A provides an overview of a method of training a machine-learning algorithm to generate a morphologically stained image from a test multispectral image derived from an unstained biological specimen in accordance with one embodiment of the present disclosure.
- FIG. 2B provides an overview of a method of generating a virtually stained image of a test unstained biological specimen in accordance with one embodiment of the present disclosure.
- FIGS. 3A - 3C illustrate systems for acquiring image data and generating a virtual stain of a test biological specimen or for training a virtual staining engine.
- FIGS. 4A - 4B illustrate systems for acquiring image data and generating a virtual stain of a test biological specimen or for training a virtual staining engine.
- FIG. 5 provides a block diagram of a multispectral image acquisition device in accordance with one embodiment of the present disclosure.
- FIGS. 6A, 6B, and 6C illustrate methods of training a virtual staining engine in accordance with some embodiments of the present disclosure.
- FIG. 7 illustrates a method of training a virtual staining engine with different obtained training samples.
- FIGS. 8A and 8B illustrate methods of generating training image data for use in training a virtual staining engine.
- FIG. 9 illustrates a method of training a virtual staining engine with different obtained serial tissue sections.
- FIG. 10 illustrates a method of generating a multispectral training image in accordance with one embodiment of the present disclosure.
- FIG. 11 illustrates a method of generating a multispectral training image in accordance with one embodiment of the present disclosure.
- FIG. 12A provides an example of a multi-channel multispectral training transmission image.
- FIG. 12B provides an example of a training brightfield transmission image.
- FIG. 13A illustrates a method of coregistering a multispectral training transmission image and a training brightfield transmission image to provide a pair of coregistered training images in accordance with one embodiment of the present disclosure.
- FIG. 13B sets forth a method of coregistering a generated multi-channel multispectral training transmission image and a training brightfield transmission image in accordance with one embodiment of the present disclosure.
- FIGS. 14A and 14B illustrate the placement of landmarks in both an obtained generated multi-channel multispectral training transmission image and an obtained training brightfield transmission image.
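Landmark-based coregistration of the kind illustrated in FIGS. 14A and 14B can be sketched as a least-squares affine fit between matched landmark coordinates; the landmark values and the synthetic affine transform below are illustrative assumptions:

```python
import numpy as np

# Matched landmark coordinates: points placed in the multispectral image
# (ms_pts) and, synthetically here, their counterparts in the brightfield
# image (bf_pts) generated by a known affine transform.
ms_pts = np.array([[10.0, 12.0], [40.0, 15.0], [22.0, 48.0], [55.0, 50.0]])
M_true = np.array([[1.02, 0.01], [-0.01, 0.98]])   # slight scale/shear
t_true = np.array([2.0, 3.0])                      # translation
bf_pts = ms_pts @ M_true + t_true

# Augment with a column of ones so translation is part of the solve,
# then recover the 3x2 affine matrix by least squares.
A = np.hstack([ms_pts, np.ones((len(ms_pts), 1))])
M, *_ = np.linalg.lstsq(A, bf_pts, rcond=None)

mapped = A @ M
print(np.allclose(mapped, bf_pts))  # True
```

With real landmarks the residual of the fit gives a direct check on registration quality before the pair is accepted for training.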
- FIG. 15 provides a method of virtually staining an image of a test unstained biological specimen with a trained virtual staining engine.
- FIG. 16 illustrates a method of generating a multispectral test image in accordance with one embodiment of the present disclosure.
- FIG. 17 illustrates three images of unstained tonsil tissue acquired at three different wavelengths using a multispectral imaging apparatus.
- FIG. 17 further illustrates the coding of the images of unstained tonsil tissue into an RGB image.
- FIG. 17 compares a virtually stained image generated using a trained virtual staining engine with an image of the same tissue specimen which was chemically stained.
- FIG. 18A provides a pseudo-colored image obtained from a chemically stained H&E breast section scanned in a FLASH multispectral scanner. Channels illuminated with 470, 550, and 635 nm wavelengths were extracted, transformed to enhance the coloring and white balancing, and then coded into an RGB image to be used as ground truth. FIG. 18B provides a virtually stained image obtained by transforming the spectral image using the pix2pix algorithm with the image of FIG. 18A as ground truth.
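The channel-to-RGB coding described for FIGS. 17 and 18A can be sketched as follows; the channel data are synthetic, and the channel-to-color assignment and per-channel white balancing shown are assumptions about one simple approach:

```python
import numpy as np

# Three single-wavelength transmission channels coded into an RGB image:
# 635 nm drives red, 550 nm green, 470 nm blue. Each channel is white-
# balanced so its brightest (background) pixel maps to 1.0.
rng = np.random.default_rng(2)
ch_635, ch_550, ch_470 = (rng.random((32, 32)) for _ in range(3))

def white_balance(channel):
    return channel / channel.max()

rgb = np.stack([white_balance(ch_635),
                white_balance(ch_550),
                white_balance(ch_470)], axis=-1)
print(rgb.shape, float(rgb.max()))  # (32, 32, 3) 1.0
```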
- FIG. 19 provides an image of a virtually H&E-stained whole slide image of breast tissue in accordance with the methods of the present disclosure.
- FIG. 20 provides an image of a virtually H&E-stained breast tissue sample in accordance with the methods of the present disclosure.
- FIG. 21 provides an image of a colorectal tissue section virtually stained with a Masson's trichrome stain in accordance with the methods of the present disclosure.
- FIG. 22 provides a method of scanning an unstained slide in accordance with the methods of the present disclosure.
- FIG. 23 provides a method of scanning a stained slide, such as for acquiring images of samples stained with a morphological stain or a special stain for training a machine-learning algorithm, in accordance with the methods of the present disclosure.
- FIG. 24 illustrates a workflow for coregistering one or more images in accordance with the methods of the present disclosure.
- a method involving steps a, b, and c means that the method includes at least steps a, b, and c.
- although steps and processes may be outlined herein in a particular order, the skilled artisan will recognize that the ordering of the steps and processes may vary.
- the phrase "at least one,” in reference to a list of one or more elements, should be understood to mean at least one element selected from any one or more of the elements in the list of elements, but not necessarily including at least one of each and every element specifically listed within the list of elements and not excluding any combinations of elements in the list of elements.
- This definition also allows that elements may optionally be present other than the elements specifically identified within the list of elements to which the phrase "at least one" refers, whether related or unrelated to those elements specifically identified.
- “at least one of A and B" can refer, in one embodiment, to at least one, optionally including more than one, A, with no B present (and optionally including elements other than B); in another embodiment, to at least one, optionally including more than one, B, with no A present (and optionally including elements other than A); in yet another embodiment, to at least one, optionally including more than one, A, and at least one, optionally including more than one, B (and optionally including other elements); etc.
- the term “about” means +/- 5%. In some embodiments, “about” means +/- 10%. In some embodiments, “substantially” means within about 15%. In some embodiments, “about” means +/- 20%.
- a biomolecule, such as a protein, a peptide, a nucleic acid, a lipid, a carbohydrate, or a combination thereof
- samples may be obtained from mammals (such as humans; veterinary animals like cats, dogs, horses, cattle, and swine; and laboratory animals like mice, rats, and primates), insects, annelids, arachnids, marsupials, reptiles, amphibians, bacteria, and fungi.
- Biological specimens include tissue samples (such as tissue sections and needle biopsies of tissue), cell samples (such as cytological smears such as Pap smears or blood smears or samples of cells obtained by microdissection), or cell fractions, fragments, or organelles (such as obtained by lysing cells and separating their components by centrifugation or otherwise).
- biological specimens include blood, serum, urine, semen, fecal matter, cerebrospinal fluid, interstitial fluid, mucous, tears, sweat, pus, biopsied tissue (for example, obtained by a surgical biopsy or a needle biopsy), nipple aspirates, cerumen, milk, vaginal fluid, saliva, swabs (such as buccal swabs), or any material containing biomolecules that is derived from a first biological specimen.
- the term "biological specimen” as used herein refers to a sample (such as a homogenized or liquefied sample) prepared from a tumor or a portion thereof obtained from a subject.
- biomarker refers to a measurable indicator of some biological state or condition.
- a biomarker may be a protein or peptide, e.g., a surface protein, which can be specifically stained, and which is indicative of a biological feature of the cell, e.g., the cell type or the physiological state of the cell.
- An immune cell marker is a biomarker that is selectively indicative of a feature that relates to an immune response of a mammal.
- a biomarker may be used to determine how well the body responds to a treatment for a disease or condition or if the subject is predisposed to a disease or condition.
- a biomarker refers to a biological substance that is indicative of the presence of cancer in the body.
- a biomarker may be a molecule secreted by a tumor or a specific response of the body to the presence of cancer.
- Genetic, epigenetic, proteomic, glycomic, and imaging biomarkers can be used for cancer diagnosis, prognosis, and epidemiology. Such biomarkers can be assayed in non-invasively collected biofluids like blood or serum.
- Biomarkers may be useful as diagnostics (to identify early-stage cancers) and/or prognostics (to forecast how aggressive a cancer is and/or predict how a subject will respond to a particular treatment and/or how likely a cancer is to recur).
- a "brightfield” refers to data, e.g., image data, obtained via a microscope based on a biological sample illuminated from below such that the light waves pass through transparent portions of the biological sample. The varying brightness levels are then captured, such as in the form of an image.
- the term "cell,” refers to a prokaryotic cell or a eukaryotic cell.
- the cell may be an adherent or a non-adherent cell, such as an adherent prokaryotic cell, adherent eukaryotic cell, non-adherent prokaryotic cell, or non-adherent eukaryotic cell.
- a cell may be a yeast cell, a bacterial cell, an algae cell, a fungal cell, or any combination thereof.
- a cell may be a mammalian cell.
- a cell may be a primary cell obtained from a subject.
- a cell may be a cell line or an immortalized cell.
- a cell may be obtained from a mammal, such as a human or a rodent.
- a cell may be a cancer or tumor cell.
- a cell may be an epithelial cell.
- a cell may be a red blood cell or a white blood cell.
- a cell may be an immune cell such as a T cell, a B cell, a natural killer (NK) cell, a macrophage, a dendritic cell, or others.
- a cell may be a neuronal cell, a glial cell, an astrocyte, a neuronal support cell, a Schwann cell, or others.
- a cell may be an endothelial cell.
- a cell may be a fibroblast or a keratinocyte.
- a cell may be a pericyte, hepatocyte, a stem cell, a progenitor cell, or others.
- a cell may be a circulating cancer or tumor cell or a metastatic cell.
- a cell may be a marker specific cell such as a CD8+ T cell or a CD4+ T cell.
- a cell may be a neuron.
- a neuron may be a central neuron, a peripheral neuron, a sensory neuron, an interneuron, an intraneuronal, a motor neuron, a multipolar neuron, a bipolar neuron, or a pseudo-unipolar neuron.
- a cell may be a neuron supporting cell, such as a Schwann cell.
- a cell may be one of the cells of a blood-brain barrier system.
- a cell may be a cell line, such as a neuronal cell line.
- a cell may be a primary cell, such as cells obtained from a brain of a subject.
- a cell may be a population of cells that may be isolated from a subject, such as a tissue biopsy, a cytology specimen, a blood sample, a fine needle aspirate (FNA) sample, or any combination thereof.
- FNA fine needle aspirate
- a cell may be obtained from a bodily fluid such as urine, milk, sweat, lymph, blood, sputum, amniotic fluid, aqueous humor, vitreous humor, bile, cerebrospinal fluid, chyle, chyme, exudates, endolymph, perilymph, gastric acid, mucus, pericardial fluid, peritoneal fluid, pleural fluid, pus, rheum, saliva, sebum, serous fluid, smegma, sputum, tears, vomit, or other bodily fluid.
- a cell may comprise cancerous cells, non-cancerous cells, tumor cells, non-tumor cells, healthy cells, or any combination thereof.
- cytological sample refers to a cellular sample in which the cells of the sample have been partially or completely disaggregated, such that the sample no longer reflects the spatial relationship of the cells as they existed in the subject from which the cellular sample was obtained.
- tissue scrapings such as a cervical scraping
- fine needle aspirates, samples obtained by lavage of a subject, et cetera.
- fixation refers to a process by which molecular and/or morphological details of a cellular sample are preserved.
- There are generally three kinds of fixation processes: (1) heat fixation; (2) perfusion; and (3) immersion.
- in heat fixation, samples are exposed to a heat source for a sufficient period of time to heat kill and adhere the sample to the slide.
- Perfusion involves use of the vascular system to distribute a chemical fixative throughout a whole organ or a whole organism.
- Immersion involves immersing a sample in a volume of a chemical fixative and allowing the fixative to diffuse throughout the sample.
- Chemical fixation involves diffusion or perfusion of a chemical throughout the cellular samples, where the fixative reagent causes a reaction that preserves structures (both chemically and structurally) as close to that of living cellular sample as possible.
- Chemical fixatives can be classified into two broad classes based on mode of action: cross-linking fixatives and non-cross-linking fixatives.
- Crosslinking fixatives (typically aldehydes) create covalent chemical bonds between endogenous biological molecules, such as proteins and nucleic acids, present in the tissue sample.
- Formaldehyde is the most commonly used cross-linking fixative in histology.
- Formaldehyde may be used in various concentrations for fixation, but is primarily used as 10% neutral buffered formalin (NBF), which is about 3.7% formaldehyde in an aqueous phosphate buffered saline solution.
- NBF neutral buffered formalin
- Paraformaldehyde is a polymerized form of formaldehyde, which depolymerizes to provide formalin when heated.
- Glutaraldehyde operates in a similar manner as formaldehyde but is a larger molecule having a slower rate of diffusion across membranes.
- Glutaraldehyde fixation provides a more rigid or tightly linked fixed product, causes rapid and irreversible changes, fixes quickly and well at 4 °C, provides good overall cytoplasmic and nuclear detail, but is not ideal for immunohistochemistry staining.
- Some fixation protocols use a combination of formaldehyde and glutaraldehyde. Glyoxal and acrolein are less commonly used aldehydes.
- the term "immunohistochemistry” refers to a method of determining the presence or distribution of an antigen in a sample by detecting interaction of the antigen with a specific binding agent, such as an antibody.
- a sample is contacted with an antibody under conditions permitting antibody-antigen binding.
- Antibody-antigen binding can be detected by means of a detectable label conjugated to the antibody (direct detection) or by means of a detectable label conjugated to a secondary antibody, which binds specifically to the primary antibody (indirect detection).
- indirect detection can include tertiary or higher antibodies that serve to further enhance the detectability of the antigen.
- detectable labels include enzymes, fluorophores, and haptens, which, in the case of enzymes, can be employed along with chromogenic or fluorogenic substrates.
- machine learning refers to a type of learning in which the machine (e.g., computer program) can learn on its own without being programmed.
- the term "slide” refers to any substrate (e.g., substrates made, in whole or in part, of glass, quartz, plastic, silicon, etc.) of any suitable dimensions on which a biological specimen is placed for analysis, and more particularly to a "microscope slide” such as a standard 3 inch by 1 inch microscope slide or a standard 75 mm by 25 mm microscope slide.
- a cytological smear, such as one placed on a standard 3 inch by 1 inch microscope slide or a standard 75 mm by 25 mm microscope slide.
- a thin tissue section such as from a biopsy
- an array of biological specimens, for example a tissue array, a cellular array, a DNA array, an RNA array, a protein array, or any combination thereof.
- tissue sections, DNA samples, RNA samples, and/or proteins are placed on a slide at particular locations.
- the term slide may refer to SELDI and MALDI chips, and silicon wafers.
- the term “substantially” means the qualitative condition of exhibiting total or near-total extent or degree of a characteristic or property of interest. In some embodiments, “substantially” means within about 5%. In some embodiments, “substantially” means within about 10%. In some embodiments, “substantially” means within about 15%. In some embodiments, “substantially” means within about 20%.
- the term "virtual stained image” refers to an image of an unstained biological sample that simulates a chemically stained biological sample. In some embodiments, there is no discernable difference in the diagnostic quality between the virtually stained images of unstained biological specimens and the corresponding images of chemically stained biological specimens, at least not to the extent that any differences will substantially alter a diagnostic outcome.
- the present disclosure provides systems and methods for the generation of a virtually stained image of an unstained biological specimen based on an acquired image of the unstained biological specimen, where the virtually stained image manifests the appearance of the unstained biological specimen as if it were chemically stained, such as with a morphological stain (e.g., a primary stain or a special stain).
- a morphological stain, e.g., a primary stain or a special stain
- a virtual staining engine trained in accordance with the methods described herein will output a virtually generated stained image in response to a provided input image of an unstained biological specimen, where the virtually stained image appears to a skilled observer (e.g., a trained histopathologist) to be substantially equivalent to a corresponding brightfield image of the same biological specimen that has been chemically stained, such as with a primary stain or with a special stain.
- a skilled observer e.g., a trained histopathologist
- a single input image of an unstained biological specimen may be virtually stained using one or more differently trained virtual staining engines to provide one or more different virtually stained output images.
- a single input image of an unstained biological specimen may be virtually stained with (i) a virtual staining engine trained for the H&E morphological stain to provide a first virtually stained output image based on the unstained input image, where the first virtually stained output image is substantially equivalent to a corresponding brightfield image of the same biological specimen stained with H&E (at least for diagnostic purposes); and (ii) a virtual staining engine trained for the resorcin fuchsine morphological stain to provide a second virtually stained output image based on the unstained input image, where the second virtually stained output image is substantially equivalent to a corresponding brightfield image of the same biological specimen stained with resorcin fuchsine (at least for diagnostic purposes).
- FIGS. 1 and 18 compare virtually stained images of unstained biological specimens to those same biological specimens that have been chemically stained, where the virtually stained images of the unstained biological specimens and the corresponding images of the chemically stained biological specimens may both be equally utilized for diagnostic purposes. In some embodiments, there is no discernable difference in the diagnostic quality between the virtually stained images of the unstained biological specimens and the corresponding images of the chemically stained biological specimens, at least not to the extent that any differences will substantially alter a diagnostic outcome.
- a machine-learning algorithm, such as a deep learning neural network, is trained to generate a virtual stain for a specific morphological stain, such as with a primary stain (e.g., H&E) or a special stain (e.g., Masson's Trichrome).
- a primary stain e.g., H&E
- a special stain Masson's Trichrome
- the present disclosure also provides for systems and methods of training virtual staining engines.
- the present disclosure provides for a system which includes a plurality of different trained virtual staining engines which may be utilized in virtually staining one or more unstained biological specimens.
- FIG. 2A provides an overview of a method of training a machine-learning algorithm to generate a morphologically stained image from a test multispectral image derived from an unstained biological specimen.
- training transmission image data including training multispectral image data and training brightfield image data, is acquired from one or more training biological specimens.
- training multispectral transmission image data is acquired from an unstained training biological specimen, (step 100), such as with a multispectral image acquisition device 12A.
- training brightfield transmission image data is acquired from a stained training biological specimen, such as with a brightfield imaging acquisition device 12B or with a multispectral imaging device 12A using RGB colors (e.g., acquiring image data from a stained training biological specimen with a multispectral image acquisition device 12A at about wavelengths 700nm, 550nm, and 470nm) (step 100).
- the acquired training multispectral transmission image data and the acquired training brightfield transmission image data is then supplied to a machine-learning algorithm (e.g., a GAN algorithm) to train a model to predict or generate a virtually stained image (step 102).
- a machine-learning algorithm e.g., a GAN algorithm
- test multispectral transmission image data is then obtained (step 111) and supplied to the obtained trained virtual staining engine (step 112).
- test multispectral transmission image data is acquired at about the same wavelengths that were used when training the virtual staining engine.
- test multispectral transmission images should be acquired at about 350nm, at about 450nm, and at about 550nm.
- the trained virtual staining engine will then generate a virtually stained image of the test unstained biological specimen stained with the morphological stain, where the virtually stained image manifests the appearance of the test unstained biological specimen as if it were stained with the particular morphological stain that the trained virtual staining engine was trained to generate (step 113).
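The train-then-infer flow of steps 100-113 can be sketched in miniature as follows. This is purely illustrative: a nearest-neighbour lookup stands in for the deep neural network (e.g., GAN) the disclosure actually trains, and the function names and toy spectra/RGB values are invented for this sketch.

```python
# Stand-in for the FIG. 2A flow: pair training spectra (unstained multispectral
# pixels) with their chemically stained RGB values, then "virtually stain" a
# test spectrum by looking up the closest training spectrum.

def train(multispectral_pixels, stained_rgb_pixels):
    """Pair each training spectrum with its chemically stained RGB value."""
    return list(zip(multispectral_pixels, stained_rgb_pixels))

def virtually_stain(model, test_pixels):
    """For each test spectrum, emit the RGB of the closest training spectrum."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    out = []
    for spec in test_pixels:
        _, rgb = min(model, key=lambda pair: dist(pair[0], spec))
        out.append(rgb)
    return out

# Toy data: 3-channel "spectra" and target RGB values (background vs. nucleus-like).
train_spectra = [(0.9, 0.8, 0.7), (0.2, 0.1, 0.3)]
train_rgb = [(240, 230, 245), (120, 40, 160)]
model = train(train_spectra, train_rgb)
print(virtually_stain(model, [(0.88, 0.79, 0.71)]))  # [(240, 230, 245)]
```

The lookup makes the data flow concrete (acquire paired data, fit a mapping, apply it to new spectra) while the actual mapping in the disclosure is learned by a deep network on coregistered image pairs.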
- the present disclosure also discloses systems adapted to acquire image data, process the acquired image data, train a machine-learning algorithm, and generate virtually stained slides using the trained machine-learning algorithm.
- FIGS. 3A - 3C, 4A, and 4B A system 200 for acquiring image data and generating a virtual stain of a test biological specimen or for training a virtual staining engine is illustrated in FIGS. 3A - 3C, 4A, and 4B.
- the system may include an image acquisition device 12 and a computer 14, whereby the image acquisition device 12 and computer may be communicatively coupled together (e.g., directly, or indirectly over a network 20).
- the image acquisition device 12 is a multispectral image acquisition device 12A (see, e.g., FIGS. 3B and 4A).
- the image acquisition device is a brightfield image acquisition device.
- the system 200 includes both a multispectral image acquisition device 12A and a brightfield image acquisition device 12B (see, e.g., FIGS. 3C and 4B).
- images captured from the image acquisition device 12 may be stored in binary form, such as locally or on a server.
- the captured digital images can also be divided into a matrix of pixels.
- the pixels can include a digital value of one or more bits, defined by the bit depth.
- the computer system 14 can include a desktop computer, a laptop computer, a tablet, or the like, digital electronic circuitry, firmware, hardware, memory 201, a computer storage medium 240, a computer program or set of instructions (e.g., where the program is stored within the memory or storage medium), one or more processors 209 (including a programmed processor), and any other hardware, software, or firmware modules or combinations thereof (such as described further herein).
- the computer system 14 illustrated in FIGS. 1A - 1C may include a computer with a display device 16 and an enclosure 18.
- the computer system can store acquired image data locally, such as in a memory, on a server, or another network connected device.
- Multispectral Image Acquisition Device / Acquisition of Multispectral Image Data
- the image acquisition device 12 is a multispectral image acquisition device 12A for acquiring transmission image data of a biological specimen at one or more wavelengths.
- the multispectral image acquisition device 12A is adapted to acquire multispectral transmission image data, such as multispectral transmission image channel images, of a biological specimen disposed on a substrate, such as a microscope slide.
- the multispectral image acquisition device 12A is adapted to illuminate the biological specimen with an illumination source at a particular wavelength and acquire transmission image data of the biological specimen illuminated with the particular wavelength (referred to herein as "acquiring image data at a particular wavelength").
- multispectral image acquisition device 12A includes a CMOS sensor or a CCD sensor, such as a CMOS sensor or a CCD sensor which is sensitive to all wavelengths of light sources utilized.
- the sensor is focused to a middle wavelength used, or is independently focused to each wavelength, or dynamically focused for each wavelength by either moving the optics of the camera or moving the camera in a direction perpendicular to the sample.
- a traditional white source and filter system may be used in the multispectral image acquisition device 12A.
- an illuminator can include a white light source and a filter to produce a set of color monochrome images. The color of the monochrome images can be redefined and combined to produce an enhanced digital image.
- an LED light source may be used in the detection step to generate narrower illumination light.
- the multispectral image acquisition device 12A is configured to illuminate a biological specimen with at least two different illumination sources, and acquire multispectral transmission image data (e.g., multispectral transmission image channel images) of the biological specimen illuminated with each of at least two different wavelengths.
- the two or more wavelengths could be broadband wavelengths, such as wavelengths up to about 300 nanometers each, such as to capture large regions of spectral transmission.
- two channels or more may be combined, for example by averaging, to generate a single transmission channel that represents an extended spectral region.
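The channel-combination step above can be sketched as a pixelwise average of two monochrome channel images. This is a minimal illustration with made-up pixel values; a real pipeline might weight channels by exposure time or spectral response.

```python
# Combine two acquired channel images into one broadband channel by averaging
# corresponding pixels (both images assumed to be the same size).

def combine_channels(ch_a, ch_b):
    """Average two same-sized monochrome channel images pixel by pixel."""
    return [[(a + b) / 2 for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(ch_a, ch_b)]

ch_500nm = [[100, 120], [140, 160]]   # illustrative pixel intensities
ch_550nm = [[110, 130], [150, 170]]
print(combine_channels(ch_500nm, ch_550nm))  # [[105.0, 125.0], [145.0, 165.0]]
```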
- the multispectral image acquisition device 12A is configured to illuminate a biological specimen with at least four different illumination sources, and acquire multispectral transmission image data (e.g., multispectral transmission image channel images) of the biological specimen illuminated with each of at least four different wavelengths. In some embodiments, the multispectral image acquisition device 12A is configured to illuminate a biological specimen with at least six different illumination sources, and acquire multispectral transmission image data (e.g., multispectral transmission image channel images) of the biological specimen illuminated with each of at least six different wavelengths.
- the multispectral image acquisition device 12A is configured to illuminate a biological specimen with at least eight different illumination sources, and acquire multispectral transmission image data (e.g., multispectral transmission image channel images) of the biological specimen illuminated with each of at least eight different wavelengths.
- multispectral transmission image data e.g., multispectral transmission image channel images
- the multispectral image acquisition device 12A is configured to illuminate a biological specimen with at least ten different illumination sources, and acquire multispectral transmission image data (e.g., multispectral transmission image channel images) of a biological specimen at each of at least ten different wavelengths.
- the multispectral image acquisition device 12A is configured to illuminate a biological specimen with at least twelve different illumination sources, and acquire multispectral transmission image data (e.g., multispectral transmission image channel images) of the biological specimen illuminated with each of at least twelve different wavelengths.
- the multispectral image acquisition device 12A is configured to illuminate a biological specimen with at least sixteen different illumination sources, and acquire multispectral transmission image data (e.g., multispectral transmission image channel images) of the biological specimen illuminated with each of at least sixteen different wavelengths. In some embodiments, the multispectral image acquisition device 12A is configured to illuminate a biological specimen with at least twenty different illumination sources, and acquire multispectral transmission image data (e.g., multispectral transmission image channel images) of the biological specimen illuminated with each of at least twenty different wavelengths.
- the multispectral image acquisition device 12A is configured to illuminate a biological specimen with at least twenty-four different illumination sources, and acquire multispectral transmission image data (e.g., multispectral transmission image channel images) of the biological specimen illuminated with each of at least twenty-four different wavelengths.
- multispectral transmission image data e.g., multispectral transmission image channel images
- the multispectral image acquisition device 12A includes one or more image capture devices; and one or more energy emitters, such as light sources, infrared sources, ultraviolet sources, or the like to illuminate the biological sample at a particular wavelength.
- the energy emitter can include, without limitation, one or more LEDs (e.g., edge emitting LEDs, surface emitting LEDs, super luminescent LEDs, or the like), laser diodes, electroluminescent light sources, incandescent light sources, cold cathode fluorescent light sources, organic polymer light sources, lamps, inorganic light sources, or other suitable light-emitting sources.
- light sources can be light-emitting diodes (LEDs), which may be pulsed on and off to correspond with imaging frames such that successive frames are recorded with a different LED illumination.
- the energy emitter of the multispectral image acquisition device 12A is configured to produce energy emissions with mean wavelengths that are different from one another.
- the total number of different energy emissions capable of being produced by the multispectral image acquisition device 12A ranges, for example, from 3 to about 50, such as from 3 to 40, such as from 3 to 30, such as from 3 to 20, such as from 3 to 16, such as from 3 to 12.
- the energy emitter can include, without limitation, two or more light sources of different mean wavelengths, three or more light sources of different mean wavelengths, four or more light sources of different mean wavelengths, five or more light sources of different mean wavelengths, six or more light sources of different mean wavelengths, seven or more light sources of different mean wavelengths, eight or more light sources of different mean wavelengths, nine or more light sources of different mean wavelengths, ten or more light sources of different mean wavelengths, eleven or more light sources of different mean wavelengths, twelve or more light sources of different mean wavelengths, fifteen or more light sources of different mean wavelengths, twenty or more light sources of different mean wavelengths, etc.
- the multispectral image acquisition device 12A may include two or more different LEDs, each emitting a different mean wavelength.
- the multispectral image acquisition device 12A or the energy emitter included therein may be configured to produce energy emissions with mean wavelengths of about 365 nm, about 400 nm, about 435 nm, about 470 nm, about 500 nm, about 550 nm, about 580 nm, about 635 nm, about 660 nm, about 690 nm, about 780 nm, and/or about 850 nm.
- the energy emitter of the multispectral image acquisition device 12A can be a blue light LED having a maximum intensity at a wavelength in the blue region of the spectrum.
- a blue light LED can have a peak wavelength and/or mean wavelength in a range of about 430 nanometers to about 490 nanometers (nm).
- the energy emitter can be a green light LED having a maximum intensity at a wavelength in the green region of the spectrum.
- the green light LED can have a peak wavelength and/or mean wavelength in a range of about 490 - 560 nm.
- the energy emitter can be an amber light LED having a maximum intensity at a wavelength in the amber region of the spectrum.
- the amber light can have a peak wavelength and/or mean wavelength in a range of about 570 - 610 nm.
- the energy emitter can be a red-light LED having a maximum intensity at a wavelength in the red region of the spectrum.
- the red light can have a peak wavelength and/or mean wavelength in a range of about 620 - 800 nm.
- two or more of the LED light sources can be combined, thereby producing processing flexibility. Different arrangements of light sources can be selected to achieve the desired illumination field. For instance, multiple LEDs of specific wavelengths may be combined such that the acquired multispectral image may resemble that of an RGB brightfield image (allowing such a configured multispectral image acquisition device to be used in place of a brightfield image acquisition device).
- LED light sources can be part of or form a light emitting panel. In some embodiments, the number, colors, and positions of the LEDs can be selected to achieve desired illumination.
- the multispectral image acquisition device 12A may include one or more lasers, halogen light sources, incandescent sources, and/or other devices capable of emitting light.
- each source can include a light emitter (e.g., a halogen lamp, an incandescent light source, etc.) that outputs white light and a filter that transmits certain wavelength(s) or waveband(s) of the white light.
- An image sensor, for example a CCD sensor, can capture a digital image of the specimen.
- the digitized tissue data may be generated, for example, by an image scanning system, such as a Ventana DP 200® slide scanner by Ventana Medical Systems, Inc. (Tucson, Arizona) or other suitable imaging equipment.
- a scanner may be used to acquire test and/or training images.
- the scanner is used to acquire images of unstained samples (see, e.g., FIG. 22).
- Additional imaging devices and systems are described further herein.
- the digital color image acquired by the brightfield image acquisition device may be conventionally composed of elementary color pixels.
- Each colored pixel can be coded over three digital components, each comprising the same number of bits, each component corresponding to a primary color, generally red, green, or blue, also denoted by the term "RGB" components.
- the brightfield image acquisition device 12B uses a single white light source, and the sensor has a filter (i.e., a Bayer filter) to produce a three channel (i.e., RGB) color image; while the multispectral image acquisition device 12A uses multiple colors turned on in sequence and the images are acquired with a monochromatic camera.
- a filter, i.e., a Bayer filter
- the multispectral image acquisition device 12A can also be used to acquire "brightfield transmission image data," as that term is used herein, of a biological specimen by using RGB colors (e.g., by acquiring image data at about wavelengths 700nm, 550nm, and 470nm).
- a "brightfield transmission image” or a “training brightfield transmission image” can include transmission image data acquired from a multispectral image acquisition device 12A or a brightfield image acquisition device 12B.
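Synthesizing a brightfield-style image from the multispectral device can be sketched as using the channel images acquired near 700, 550, and 470 nm as the R, G, and B planes of one color image. The function name and pixel values below are illustrative, not from the disclosure.

```python
# Zip three monochrome channel images (acquired at ~700/550/470 nm) into one
# RGB image, approximating the output of a brightfield image acquisition device.

def channels_to_rgb(ch_700, ch_550, ch_470):
    """Use the 700/550/470 nm channel images as the R, G, and B planes."""
    return [[(r, g, b) for r, g, b in zip(row_r, row_g, row_b)]
            for row_r, row_g, row_b in zip(ch_700, ch_550, ch_470)]

rgb = channels_to_rgb([[200]], [[150]], [[100]])  # 1x1 toy images
print(rgb)  # [[(200, 150, 100)]]
```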
- FIGS. 4A and 4B provide an overview of the systems 200 of the present disclosure and the various modules utilized within the system.
- the system 200 employs a computer device or computer-implemented method having one or more processors 209 and one or more memories 201, the one or more memories 201 storing non-transitory computer-readable instructions for execution by the one or more processors to cause the one or more processors to execute certain instructions as described herein.
- the image acquisition module 202 commands a multispectral imaging device 12A and/or a brightfield imaging device 12B to acquire multispectral and/or brightfield transmission image data, respectively, of a biological specimen (or a portion thereof) disposed on a substrate.
- the image acquisition module 202 acquires training image data from one or more training biological specimens for training a virtual staining engine 210.
- the image acquisition module 202 acquires test image data from one or more test biological specimens such that a virtual stain may be generated using a trained virtual staining engine 210.
- the acquired test and/or training transmission image data of the test and/or training biological specimen, respectively, may be stored in one or more memories 201 or one or more storage modules 240 communicatively coupled to the system 200 for downstream processing.
- the image acquisition module 202 may command the multispectral image acquisition device 12A to acquire one or more transmission image channel images of the biological specimen, where each transmission image channel image is acquired at a different wavelength (i.e., the multispectral image acquisition device 12A illuminates the biological specimen with an illumination source having a particular wavelength, and then transmission image data is acquired of the biological specimen at that particular wavelength).
- the image acquisition module 202 may command multispectral image acquisition device 12A to acquire transmission image data from a biological specimen at 12 different wavelengths, thereby generating 12 multispectral transmission image channel images (collectively referred to as acquired multispectral transmission image data).
- the plurality of acquired multispectral transmission image channel images, i.e., the acquired multispectral transmission image data, are then stored in one or more memories 201 or one or more storage modules 240 for downstream processing.
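The twelve-wavelength acquisition loop can be sketched as collecting one monochrome channel image per wavelength into a single multispectral stack. The `acquire_stack` helper and the dummy acquisition callback are invented stand-ins for commands issued to the device 12A.

```python
# Acquire one channel image per wavelength and key the resulting stack by
# wavelength; the lambda below fakes acquisition with a dummy 1x1 image.

def acquire_stack(wavelengths, acquire_fn):
    """Acquire one channel image per wavelength and return the stack."""
    return {wl: acquire_fn(wl) for wl in wavelengths}

wavelengths = [365, 400, 435, 470, 500, 550, 580, 635, 660, 690, 780, 850]
stack = acquire_stack(wavelengths, lambda wl: [[wl % 256]])  # dummy acquisition
print(len(stack))  # 12
```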
- the image acquisition module 202 may command an image acquisition device 12 to acquire transmission image data for an entire biological specimen.
- transmission image data may be acquired for the entirety of a biological specimen disposed on a substrate, e.g., a microscope slide.
- the image acquisition module 202 may command an image acquisition device 12 to acquire transmission image data from a portion of a biological specimen. This can be useful where only specific regions of interest of the biological specimen are relevant for analysis. For instance, certain regions of interest may include a specific type of tissue or a comparatively higher population of a specific type of cell as compared with another region of interest.
- a region of interest may be selected in a biological specimen that includes tissue of interest but excludes tissue not of interest (e.g., tumor tissue versus non-tumor tissue).
- the image acquisition module 202 may be programmed to acquire transmission image data from one or more predefined portions of the sample; or may acquire one or more transmission images through random sampling or by sampling at regular intervals across a grid covering the entire sample.
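The sampling strategies above (regular grid across the whole sample, or random sampling of non-neighboring positions) can be sketched as follows; the function names, micrometer units, and parameters are illustrative assumptions rather than the disclosure's actual interface:

```python
import random

def grid_positions(width_um, height_um, fov_um, overlap_um=0):
    """Stage positions covering the whole sample on a regular grid of FOVs."""
    step = fov_um - overlap_um
    return [(x, y)
            for y in range(0, height_um - fov_um + 1, step)
            for x in range(0, width_um - fov_um + 1, step)]

def random_positions(width_um, height_um, fov_um, n, seed=0):
    """n randomly sampled (possibly non-neighboring) stage positions."""
    rng = random.Random(seed)
    return [(rng.randrange(0, width_um - fov_um + 1),
             rng.randrange(0, height_um - fov_um + 1)) for _ in range(n)]

grid = grid_positions(1000, 800, 200)          # 5 columns x 4 rows = 20 positions
rand = random_positions(1000, 800, 200, n=8)   # 8 random positions
```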
- the system 200 further includes an image processing module 212 adapted to process acquired image data.
- the image processing module is capable of converting or otherwise transforming acquired transmission image data, including multispectral transmission image data and/or brightfield transmission image data.
- the image processing module 212 may combine two or more image channel images into a multi-channel image (such as without using a dimensionality reduction technique or a compression technique).
- the image processing module 212 may reduce multispectral image data acquired at four or more different wavelengths into a multichannel RGB image, such as by using a dimensionality reduction method.
- the image processing module 212 may coregister one image with another image.
- the image processing module 212 may register an image derived from acquired multispectral transmission image data with an acquired transmission brightfield image, to provide a pair of coregistered images for training a virtual staining engine 210.
- the image processing module 212 may convert multispectral image data acquired from a stained biological specimen using a multispectral image acquisition device to an RGB image, and which may be used as a bright field image, as described herein.
- the image processing module 212 may be further configured to pre-process image data, to identify regions of the image that correspond to the substrate (e.g., a microscope slide) on which the sample is disposed, to identify regions of different tissue types (e.g., connective tissue), or to interpret one or more annotations.
- the image processing module 212 may yet further include one or more submodules, such as tissue classification modules, glass recognition modules, etc.
- the one or more submodules may implement support vector machines and/or neural networks. Examples of overlay generation modules, tissue classification modules, glass / slide recognition modules are described in U.S. Publication Nos.
- the system 200 further includes a training module 211 adapted to receive pairs of coregistered training images and to use the received pairs of coregistered training images to train a virtual staining engine 210.
- the pairs of coregistered training images are used to train one or more machine-learning algorithms, such as a deep neural network.
- the deep neural network is based on an implicit generative model, e.g., a generative adversarial network ("GAN").
- two models are used for training.
- a generative model is used that captures data distribution while a second model estimates the probability that a sample came from the training data rather than from the generative model.
- training of the GAN deep neural network 10 may be performed on the same or a different computing device.
- a personal computer may be used to train the GAN although such training may take a considerable amount of time.
- one or more dedicated GPUs may be used for training.
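As a minimal, hypothetical illustration of the two-model adversarial objective described above (an actual virtual staining GAN uses deep convolutional generator and discriminator networks trained by gradient descent), the opposing losses might be computed as:

```python
import numpy as np

def bce(p, target):
    """Binary cross-entropy of probabilities p against a scalar target (0 or 1)."""
    p = np.clip(p, 1e-7, 1 - 1e-7)
    return float(np.mean(-(target * np.log(p) + (1 - target) * np.log(1 - p))))

def gan_losses(d_real, d_fake):
    """d_real / d_fake: discriminator probabilities on real and generated images.

    The discriminator is trained toward outputting 1 on training data and 0 on
    generator samples; the generator is trained to make the discriminator
    output 1 on its samples, i.e., to fool it."""
    d_loss = bce(d_real, 1.0) + bce(d_fake, 0.0)
    g_loss = bce(d_fake, 1.0)
    return d_loss, g_loss

# Illustrative discriminator scores for two real and two generated images.
d_loss, g_loss = gan_losses(np.array([0.9, 0.8]), np.array([0.2, 0.1]))
```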
- the system 200 further includes a virtual staining engine 210.
- the virtual staining engine 210 is trained to generate, from an unstained image of a biological specimen, an image of the biological specimen stained with a morphological stain.
- the deep neural network may be used or executed on a different computing device which may include one with less computational resources used for the training process (although GPUs may also be integrated into execution of the trained deep neural network).
- the system 200 includes multiple virtual staining engines 210, where each virtual staining engine is trained to generate a different virtual stain, where each different virtual stain corresponds to a different morphological stain.
- the deep neural network 10 is trained using a GAN model.
- one or more additional modules may be incorporated into the workflow or into system 200.
- one or more automated algorithms may be run such that cells may be detected, classified, and/or scored (see, e.g., United States Patent Publication No. 2017/0372117, the disclosure of which is hereby incorporated by reference herein in its entirety).
- the present disclosure provides methods of training a virtual staining engine 210, such as training a virtual staining engine to generate a virtually morphologically stained image derived from an acquired image of an unstained biological specimen.
- multiple, different machine-learning algorithms may be trained to generate different virtual stains (e.g., to generate a H&E virtual stain, to generate a basic fuchsin virtual stain, to generate a Masson's Trichrome stain, etc.).
- the present disclosure also provides methods of using a trained virtual staining engine 210 to generate a virtually stained image of a test unstained biological specimen, from an image of the test unstained biological specimen, where the virtually stained image manifests the appearance of the test unstained biological specimen as if it were stained with a morphological stain.
- FIG. 6A provides an overview of training a virtual staining engine 210.
- one or more unstained training biological specimens are obtained (step 610).
- Training multispectral transmission image data is then acquired from each of the one or more unstained training biological specimens (step 611).
- at least two training multispectral transmission image channel images are acquired from each of the unstained training biological specimens, each at a different wavelength.
- a multi-channel multispectral training transmission image is then generated based on the acquired training multispectral transmission image data for each of the unstained training biological specimens (step 612).
- the multi-channel multispectral training transmission image may include three different wavelengths.
- the multi-channel multispectral training transmission image may include twelve different wavelengths.
- twelve different multispectral transmission image channel images may be reduced to a multi-channel multispectral training transmission image using a dimensionality reduction method, such as principal component analysis.
- the twelve different multispectral transmission image channel images are not compressed, such as not compressed using any dimensionality reduction technique (e.g., PCA).
- the unstained training biological specimen is then stained, such as morphologically stained with a primary stain (e.g., H&E) or with a special stain (e.g., a basic fuchsin stain, a Masson's Trichrome stain, etc.), to provide a stained training biological specimen (step 613).
- suitable special stains are described further herein.
- a brightfield training transmission image of the stained training biological specimen is then acquired (step 614).
- the brightfield training transmission image of the stained training biological specimen is acquired with a brightfield image acquisition device 12B.
- the brightfield training transmission image of the stained training biological specimen is acquired using a multispectral image acquisition device 12A using RGB channels as described further herein.
- the multispectral training image and the brightfield training image are then coregistered to provide at least one pair of coregistered training images (step 615).
- a first training image of a pair of coregistered training images is a multispectral transmission training image (which, as noted above, is derived from training multispectral transmission image data of an unstained training biological specimen).
- a second member of the pair of coregistered training images is a brightfield transmission training image (which, as noted above, is derived from training brightfield transmission image data of the same training biological specimen used to generate the multispectral transmission image data but stained with a morphological stain).
- the plurality of pairs of coregistered training images are utilized by the training module 211 to train a virtual staining engine 210.
- twelve different image channel images of a first training tissue specimen may be acquired at twelve different wavelengths and those training multispectral image channel images may be used to generate a multispectral training image of a pair of coregistered training images.
- the twelve different image channel images are acquired while the first training tissue specimen is unstained. That same first training tissue specimen is then morphologically stained according to methods known in the art. Following morphological staining, a brightfield transmission training image may be acquired from the stained first training tissue specimen.
- the multispectral training image and the brightfield training image are optionally segmented to provide a plurality of segmented coregistered training images (e.g., segmented into 64x64 pixel image patches; 128x128 pixel image patches; 256x256 pixel image patches; 512x512 pixel image patches, etc.).
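A possible sketch of this segmentation step, splitting a coregistered image pair into aligned square patches (the array shapes and the `tile_pair` helper are illustrative assumptions, not the disclosure's implementation):

```python
import numpy as np

def tile_pair(multispectral_img, brightfield_img, patch=256):
    """Split a coregistered image pair into aligned square patches.

    Both images must share the same height and width; trailing pixels that do
    not fill a whole patch are discarded."""
    h, w = multispectral_img.shape[:2]
    pairs = []
    for y in range(0, h - patch + 1, patch):
        for x in range(0, w - patch + 1, patch):
            pairs.append((multispectral_img[y:y + patch, x:x + patch],
                          brightfield_img[y:y + patch, x:x + patch]))
    return pairs

# Example: a 512 x 768 coregistered pair yields 2 x 3 = 6 patch pairs.
ms = np.zeros((512, 768, 3))
bf = np.zeros((512, 768, 3))
pairs = tile_pair(ms, bf, patch=256)
```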
- the at least one pair of coregistered training images are then provided to a machine-learning algorithm, such as a GAN algorithm, for training (step 616).
- a plurality of different virtual staining engines 210 may be trained, where each trained virtual staining engine may be trained to generate a different virtual morphological stain of an unstained biological specimen (e.g., different virtual staining engines may be trained to generate virtual H&E stains or virtual special stains, such as a virtual Masson's Trichrome stain).
- a virtual staining engine 210 may be trained with training biological specimens that have been stained with H&E.
- the virtual staining engine 210 is trained to generate a virtually stained image of a test unstained biological specimen where the virtually stained image manifests the appearance of the test unstained biological specimen as if it were stained with H&E.
- a virtual staining engine 210 may be trained with training biological specimens that have been stained with the special stain basic fuchsine. Following this second example, the virtual staining engine 210 is trained to generate a virtually stained image of a test unstained biological specimen where the virtually stained image manifests the appearance of the test unstained biological specimen as if it were stained with basic fuchsine.
- a virtual staining engine 210 may be trained with training biological specimens that have been stained with Masson's Trichrome.
- the virtual staining engine 210 is trained to generate a virtually stained image of a test unstained biological specimen where the virtually stained image manifests the appearance of the test unstained biological specimen as if it were stained with Masson's Trichrome (see, e.g., FIG. 21). Following these examples even further, each differently trained virtual staining engine may then be used to generate one or more virtually stained output images based on a single acquired multispectral transmission image of an unstained biological specimen.
- An exemplary method of training a virtual staining engine is set forth in FIG. 6B.
- FIG. 6B shows the general workflow of a non-limiting training process in accordance with the present disclosure.
- a tissue sample (e.g., a biopsy sample, a resection sample, an FFPE sample, etc.) is obtained, such as through one or more preanalytical steps, and is processed (step 650), and sections are cut (step 651), such as on a microtome at a thickness of about 3 µm to about 5 µm.
- the slides are then baked for 5 minutes, then dewaxed using heat to melt the wax, and rinsed with EZ Prep to remove the paraffin (e.g., using a paraffin-removal routine such as that of the HE600 instrument) (step 652).
- after the slides have been dewaxed, they are placed on a multispectral scanner stage, such as one of the multispectral scanners described herein.
- after selecting/detecting the area of interest, either by the user or by automatic means, the sample is illuminated with a first of multiple wavelengths (step 653), at a predefined illumination power (controlled by a pulse-width-modulation circuit that reduces or augments the duty cycle of the LED light source).
- the objective of the scanner is focused by moving the sample or the objective, and a first field of view (FOV) is digitized with the camera at a predefined exposure (shutter) time and gain.
- the sample is illuminated with a second wavelength (step 653).
- the objective is refocused for the second wavelength using a closed-loop focusing algorithm that iterates until the correct focus is found (e.g., wavelength-controlled autofocus).
- only the focus of a central wavelength is calibrated through the autofocus, and the Z distance for each other wavelength is calculated (such as by using predefined per-wavelength offsets).
- only one central wavelength is focused (with a feedback-controlled algorithm), and all the other wavelengths are digitized without moving the Z distance in the same FOV.
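The per-wavelength focus handling described above might be sketched as follows; the offset table and its values are hypothetical placeholders for a device-specific calibration, not values from the disclosure:

```python
# Hypothetical calibrated focus offsets (micrometers) relative to the central
# wavelength, one entry per illumination wavelength -- placeholder values only.
FOCUS_OFFSET_UM = {405: -1.2, 450: -0.6, 530: 0.0, 630: 0.5, 850: 1.4}
CENTRAL_WAVELENGTH_NM = 530

def z_for_wavelength(z_central_um, wavelength_nm):
    """Return the stage Z position for a wavelength: autofocus runs only at the
    central wavelength, and the others are derived from calibrated offsets."""
    return z_central_um + FOCUS_OFFSET_UM[wavelength_nm]

z_focus = 100.0                        # autofocus result at the central wavelength
z_ir = z_for_wavelength(z_focus, 850)  # Z position used for the 850 nm channel
```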
- the process is repeated until the FOV is digitized at all the different wavelengths (step 653).
- the stage is moved to a different location.
- the next location is in the vicinity of the previous location with some area overlapping.
- the next location is in a different location to collect non-neighboring areas, speeding up the digitization and augmenting the variety of the dataset.
- each FOV is stored in memory independently.
- the FOVs are stitched together to form a whole slide hypercube.
- the next step is to stain the sample with the desired stain assay (e.g., H&E, a special stain such as Masson's Trichrome, or any of the other stains described herein) following typical protocols; the slide is then coverslipped (step 654).
- once the slide is stained and coverslipped, it is introduced into the stage of the same multispectral scanner.
- the above-mentioned imaging process is repeated, but only in a subset of wavelengths, such as the 3 wavelengths that closely match the red (620-750 nm), green (495-570 nm), and blue (400-480 nm) wavelengths (step 655). Other wavelengths may be utilized in other embodiments.
- the positions of the FOVs of the stained images match the positions of the FOVs of the unstained sample through the use of fiducials located on the sample. In some embodiments, these fiducials are present in the slide (edges, tags, etc.). In other embodiments, these fiducials are engraved in the slide by mechanical or laser means (e.g., circles laser-engraved in the slide). In some embodiments, once the multispectral digital microscope finishes scanning all the AOIs, a whole slide image (WSI) of the raw data is stored on disk.
- this WSI of the stained sample is then processed to apply a previously calibrated color correction matrix to correct the colors and perform a white balancing, in a manner such that the image closely resembles what a user would see using a brightfield microscope. It is believed that an advantage of this method is that, because the images are scanned on the same scanner, the resolution of the digital files matches perfectly or substantially perfectly. It is also believed that any optical artifacts will be similar.
- the spectral images of the unstained sample are expected to be coregistered. Also, both unstained and stained FOVs are going to be very closely registered to each other, simplifying the next step of coregistration.
- a dimensionality reduction is performed.
- a Principal Component Analysis (PCA) reduction is performed on the spectral hypercube images to reduce the channels from n (e.g., 12 channels) to 3 (step 656) (see FIG. 6B).
- in other embodiments, no dimensionality reduction (e.g., PCA) is performed, and a machine-learning algorithm (such as a GAN, described herein) is trained directly on the multispectral data.
- the principal component values are normalized to values ranging from 0 to 255 and coded in the channels of an RGB image, where the first channel (i.e., red) is the first principal component, the second channel (i.e., green) is the second principal component, and the third channel (i.e., blue) is the third principal component (step 657). It is believed that the advantage of this method is that the image complexity is reduced.
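A minimal sketch of this coding step, assuming a NumPy hypercube and a plain SVD-based PCA (the `hypercube_to_pca_rgb` helper and the array shapes are illustrative, not the disclosure's implementation):

```python
import numpy as np

def hypercube_to_pca_rgb(cube):
    """Reduce an H x W x C hypercube to an 8-bit RGB image whose R, G, and B
    channels hold the first three principal components, each normalized to
    the 0..255 range."""
    h, w, c = cube.shape
    X = cube.reshape(-1, c).astype(np.float64)
    X -= X.mean(axis=0)                       # center each spectral channel
    # Principal axes via SVD of the centered data matrix.
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    scores = X @ Vt[:3].T                     # first three principal components
    lo, hi = scores.min(axis=0), scores.max(axis=0)
    scores = (scores - lo) / np.where(hi > lo, hi - lo, 1.0) * 255.0
    return scores.reshape(h, w, 3).astype(np.uint8)

# Example: a simulated 32 x 32 hypercube with 12 spectral channels.
rgb = hypercube_to_pca_rgb(np.random.rand(32, 32, 12))
```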
- data from a stained slide is coded into an RGB image by performing a white balancing color correction to adjust the balance of each channel to achieve a white background as is typically viewed with a light microscope or from a calibrated digital scanner (steps 658A and 658B).
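A simple white-balancing sketch in the same spirit, assuming the background corresponds to the brightest pixels of each channel (the percentile heuristic and helper name are assumptions; the disclosure's method uses a calibrated color correction):

```python
import numpy as np

def white_balance(img, background_percentile=99):
    """Scale each channel so that bright (background) pixels map to 255,
    approximating the white background of a brightfield view."""
    img = img.astype(np.float64)
    out = np.empty_like(img)
    for ch in range(img.shape[2]):
        white = np.percentile(img[..., ch], background_percentile)
        out[..., ch] = np.clip(img[..., ch] / max(white, 1e-6) * 255.0, 0, 255)
    return out.astype(np.uint8)

# Example: a simulated dim 16 x 16 RGB image brought to a white background.
balanced = white_balance(np.random.randint(0, 200, (16, 16, 3)))
```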
- the hypercube images are used raw with no further processing.
- the next step is to coregister the images to a pixel level (steps 659A and 659B).
- an automatic algorithm transforms both images (e.g., the PCA image and the brightfield or ground truth) by applying filters (e.g., Laplacian, Sobel, etc.) so that the general morphology of the tissue is enhanced in such a way that both filtered images reveal similar structures (so-called descriptors).
- a Fourier-based correlation algorithm is then applied, translating and rotating the filtered image of the ground truth to find the position and rotation that lead to the highest correlation.
- the required transformations are applied to the ground truth, producing 2 coarsely registered images (PCA and transformed brightfield).
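The translation part of this Fourier-based correlation can be sketched with classic phase correlation (the rotation search is omitted for brevity, and the helper is illustrative rather than the disclosure's algorithm):

```python
import numpy as np

def phase_correlation_shift(ref, mov):
    """Estimate the integer (dy, dx) shift that, applied to mov, aligns it to
    ref, using Fourier phase correlation."""
    F = np.fft.fft2(ref)
    G = np.fft.fft2(mov)
    cross = F * np.conj(G)
    cross = cross / np.maximum(np.abs(cross), 1e-12)  # keep phase only
    corr = np.fft.ifft2(cross).real
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    # Map peaks beyond half the image size to negative shifts.
    if dy > ref.shape[0] // 2:
        dy -= ref.shape[0]
    if dx > ref.shape[1] // 2:
        dx -= ref.shape[1]
    return int(dy), int(dx)

# Example: a filtered "descriptor" image and a translated copy of it.
ref = np.zeros((64, 64))
ref[20:30, 20:30] = 1.0
mov = np.roll(np.roll(ref, -3, axis=0), 5, axis=1)
shift = phase_correlation_shift(ref, mov)  # rolling mov by shift recovers ref
```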
- the next step is to find the FOVs of the stitched image when dealing with a WSI.
- a Laplacian filter is applied to "look for" discontinuities, revealing the stitch lines.
- both images are then divided into individual field of view (FOV) images, and the above-described process is repeated, filtering paired images to reveal descriptors and finding the best correlation by translating and rotating the filtered version of the FOV containing the ground truth.
- a final step further divides each registered FOV into tiles, such as tiles of 270 x 270 pixels.
- the process is repeated at the tile level creating pixel level registered tiles of 256 x 256 pixels images (steps 660A and 660B).
- the images are then saved into one or more memories communicatively coupled to the systems of the present disclosure, such as to a disk, to train a GAN algorithm (step 661).
- the tissue section previously stained in H&E is then processed to remove the coverslip and remove the staining (such as by chemically removing the staining) and then stained again with a different assay (i.e., Masson’s trichrome), and a coverslip is placed again.
- the re-stained slide is then scanned in the multispectral microscope again following the same procedure mentioned before (3 colors matching RGB) and coregistered to the unstained hypercubes to create a different dataset that can be input to train a GAN algorithm.
- the multispectral and brightfield image data is acquired from one or more training biological specimens.
- the obtained training biological specimens may be obtained from any source.
- the obtained training biological specimens may be obtained from a tumor, including, for example, tumor biopsy samples, resection samples, cell smears, fine needle aspirates (FNA), liquid-based cytology samples, and the like.
- the obtained training biological specimens are histological specimens.
- the obtained training biological specimens are cytological specimens.
- training multispectral transmission image data and training brightfield transmission image data are acquired from the same biological specimen, i.e., training multispectral transmission image data is acquired from the biological specimen when it is unstained, and training brightfield transmission image data is acquired from the same biological specimen after it is stained, such as stained with a primary stain (e.g., H&E), a special stain, or any of the other stains known in the art or as described herein.
- a plurality of training biological specimens is obtained and each training biological specimen of the plurality of obtained training biological specimens is used to train a different virtual staining engine (see, e.g., FIG. 7).
- training multispectral transmission image data may be acquired from a first training biological specimen in an unstained state; and then training brightfield transmission image data may be acquired from the first training biological specimen after staining with H&E.
- the training multispectral transmission image data and the training brightfield transmission image data may then be used to train an H&E virtual staining engine.
- training multispectral transmission image data may be acquired from a second training biological specimen in an unstained state; and then training brightfield transmission image data may be acquired from the second training biological specimen after staining with a first special stain.
- the training multispectral transmission image data and the training brightfield transmission image data may then be used to train a first special stain virtual staining engine.
- a plurality of training biological specimens is obtained and each training biological specimen of the plurality of obtained training biological specimens is used to train the same virtual staining engine.
- the obtained plurality of training biological specimens used to train the same virtual staining engine are derived from the same source (e.g., same patient but different tissue blocks; same tissue block but different serial sections having different thicknesses, fixation times, etc.) (see, e.g., FIG. 8A).
- the obtained plurality of training biological specimens used to train the same virtual staining engine are derived from different sources (e.g., different patients).
- training multispectral transmission image data may be acquired from a first training biological specimen in an unstained state; and then training brightfield transmission image data may be acquired from the first training biological specimen after staining with H&E.
- Training multispectral transmission image data may be acquired from a second training biological specimen in an unstained state; and then training brightfield transmission image data may be acquired from the second training biological specimen after staining with H&E.
- the training multispectral transmission image data and the training brightfield transmission image data derived from the first and second training biological specimens may then be used to train an H&E virtual staining engine. It is believed that training the same virtual staining engine from training biological specimens obtained from different sources allows for any lab-to-lab variations, tissue-to-tissue variations, fixation level / quality variations, etc. to be accounted for during training of the virtual staining engine.
- a plurality of training biological specimens is obtained where each of the obtained plurality of training biological specimens is of the same tissue type (e.g., tonsil tissue) and where the obtained plurality of training biological specimens of the same tissue type is used to train the same virtual staining engine.
- a plurality of training biological specimens is obtained where each of the obtained plurality of training biological specimens is of a different tissue type and where the obtained plurality of training biological specimens of the different tissue types is used to train the same virtual staining engine.
- a plurality of training biological specimens of the same type is obtained and each training biological specimen of the plurality of obtained training biological specimens of the same type is used to train a different virtual staining engine.
- the plurality of training biological specimens of the same type is obtained from different sources.
- the plurality of training biological specimens of the same type are different serial sections derived from the same tissue block, and where each different training serial section could be used to train a different virtual staining engine (see, e.g., FIG. 9).
- a plurality of training biological specimens of the same type is obtained, but where each of the obtained plurality of training biological specimens of the same type has a different thickness, a different type of glass substrate, and/or a different fixation state, etc.; and where each training biological specimen of the plurality of obtained training biological specimens of the same type is used to train the same virtual staining engine.
- Training multispectral transmission image data is acquired from each of a plurality of training unstained biological specimens.
- two or more training multispectral transmission image channel images are acquired for each training unstained biological specimen of the plurality of training unstained biological specimens, where each of the two or more training multispectral transmission image channel images are acquired at a specific wavelength.
- the two or more training multispectral transmission image channel images may be processed and/or combined in one or more downstream operations.
- At least two training multispectral transmission image channel images are acquired for each unstained training biological specimen, where each of the at least two training multispectral transmission image channel images are acquired using a multispectral image acquisition device 12A configured to illuminate each unstained training biological specimen with at least two different illumination sources; and further configured to acquire transmission image data (e.g., at least two multispectral image channel images) of the biological specimen illuminated with the at least two different illumination sources.
- at least two training multispectral transmission image channel images are acquired for each unstained training biological specimen, where each of the at least two training multispectral transmission image channel images are acquired at different wavelengths.
- four training multispectral transmission image channel images are obtained from at least four different illumination sources.
- five training multispectral transmission image channel images are obtained from at least five different illumination sources.
- training multispectral transmission image channel images are obtained from at least six different illumination sources.
- training multispectral transmission image channel images are obtained from at least seven different illumination sources.
- training multispectral transmission image channel images are obtained from at least eight different illumination sources.
- training multispectral transmission image channel images are obtained from at least nine different illumination sources.
- training multispectral transmission image channel images are obtained from at least ten different illumination sources.
- training multispectral transmission image channel images are obtained from at least eleven different illumination sources.
- training multispectral transmission image channel images are obtained from at least twelve different illumination sources. In some embodiments, training multispectral transmission image channel images are obtained from twelve or more illumination sources. In some embodiments, training multispectral transmission image channel images are obtained from sixteen or more different illumination sources. In some embodiments, training multispectral transmission image channel images are obtained from twenty or more different illumination sources. In some embodiments, training multispectral transmission image channel images are obtained from twenty-four or more different illumination sources.
- training multispectral transmission image channel images are acquired from a multispectral image acquisition device 12A configured to illuminate a training biological specimen with two or more different illumination sources (see, e.g., FIG. 5), where the at least two different illumination sources are from within the ultraviolet (UV) spectrum, from within the visible spectrum, and/or from within the infrared (IR) spectrum.
- the at least two different illumination sources comprise (i) at least one illumination source within the ultraviolet (UV) spectrum; (ii) at least one illumination source within the visible spectrum; and (iii) at least one illumination source within the infrared spectrum.
- training multispectral transmission image channel images are acquired from a multispectral image acquisition device 12A configured to illuminate a training biological specimen with six or more different illumination sources, where the at least six different illumination sources are from within the ultraviolet (UV) spectrum, from within the visible spectrum, and/or from within the infrared (IR) spectrum.
- the at least six different illumination sources comprise (i) at least one illumination source within the ultraviolet (UV) spectrum; (ii) at least one illumination source within the visible spectrum; and (iii) at least one illumination source within the infrared spectrum.
- training multispectral transmission image channel images are acquired from a multispectral image acquisition device 12A configured to illuminate a training biological specimen with twelve or more different illumination sources, where the at least twelve different illumination sources are from within the ultraviolet (UV) spectrum, from within the visible spectrum, and/or from within the infrared (IR) spectrum.
- the at least twelve different illumination sources comprise (i) at least one illumination source within the ultraviolet (UV) spectrum; (ii) at least one illumination source within the visible spectrum; and (iii) at least one illumination source within the infrared spectrum.
- training multispectral transmission image channel images are acquired from a multispectral image acquisition device 12A configured to illuminate a training biological specimen with two or more different illumination sources, where at least two of the illumination sources are from at least two different wavelengths within the UV spectrum.
- training multispectral transmission image channel images are acquired from a multispectral image acquisition device 12A configured to illuminate a training biological specimen with two or more different illumination sources, where at least two of the illumination sources are from at least two different wavelengths within the visible spectrum.
- training multispectral transmission image channel images are acquired from a multispectral image acquisition device 12A configured to illuminate a training biological specimen with two or more different illumination sources, where at least two of the illumination sources are from at least two different wavelengths within the IR spectrum.
- training multispectral transmission image channel images are acquired from a multispectral image acquisition device 12A configured to illuminate a training biological specimen with four or more different illumination sources (such as narrow illumination sources), wherein training multispectral transmission image channel images are acquired after illuminating the biological specimen with (i) at least two different illumination sources within the UV spectrum; (ii) at least two different illumination sources within the visible spectrum; and (iii) at least two different illumination sources within the IR spectrum.
- training multispectral transmission image channel images are acquired from a multispectral image acquisition device 12A configured to illuminate a training biological specimen with six or more different illumination sources, wherein training multispectral transmission image channel images are acquired after illuminating the biological specimen with (i) at least two different illumination sources within the UV spectrum; (ii) at least two different illumination sources within the visible spectrum; and (iii) at least two different illumination sources within the IR spectrum.
- training multispectral transmission image channel images are acquired from a multispectral image acquisition device 12A configured to illuminate a training biological specimen with eight or more different illumination sources, wherein training multispectral transmission image channel images are acquired after illuminating the biological specimen with (i) at least two different illumination sources within the UV spectrum; (ii) at least two different illumination sources within the visible spectrum; and (iii) at least two different illumination sources within the IR spectrum.
- training multispectral transmission image channel images are acquired from a multispectral image acquisition device 12A configured to illuminate a training biological specimen with ten or more different illumination sources, wherein training multispectral transmission image channel images are acquired after illuminating the biological specimen with (i) at least two different illumination sources within the UV spectrum; (ii) at least two different illumination sources within the visible spectrum; and (iii) at least two different illumination sources within the IR spectrum.
- training multispectral transmission image channel images are acquired from a multispectral image acquisition device 12A configured to illuminate a training biological specimen with twelve or more different illumination sources, wherein training multispectral transmission image channel images are acquired after illuminating the biological specimen with (i) at least two different illumination sources within the UV spectrum; (ii) at least two different illumination sources within the visible spectrum; and (iii) at least two different illumination sources within the IR spectrum.
- training multispectral transmission image channel images are acquired from a multispectral image acquisition device 12A configured to illuminate a training biological specimen with sixteen or more different illumination sources, wherein training multispectral transmission image channel images are acquired after illuminating the biological specimen with (i) at least two different illumination sources within the UV spectrum; (ii) at least two different illumination sources within the visible spectrum; and (iii) at least two different illumination sources within the IR spectrum.
- training multispectral transmission image channel images are acquired from a multispectral image acquisition device 12A configured to illuminate a training biological specimen with twenty or more different illumination sources, wherein training multispectral transmission image channel images are acquired after illuminating the biological specimen with (i) at least two different illumination sources within the UV spectrum; (ii) at least two different illumination sources within the visible spectrum; and (iii) at least two different illumination sources within the IR spectrum.
- training multispectral transmission image channel images are acquired from a multispectral image acquisition device 12A configured to illuminate a training biological specimen with nine or more different illumination sources, wherein training multispectral transmission image channel images are acquired after illuminating the biological specimen with (i) at least two different illumination sources within the UV spectrum; (ii) at least two different illumination sources within the visible spectrum; and (iii) at least three different illumination sources within the IR spectrum.
- training multispectral transmission image channel images are acquired from a multispectral image acquisition device 12A configured to illuminate a training biological specimen with twelve or more different illumination sources, wherein training multispectral transmission image channel images are acquired after illuminating the biological specimen with (i) at least two different illumination sources within the UV spectrum; (ii) at least two different illumination sources within the visible spectrum; and (iii) at least three different illumination sources within the IR spectrum.
- training multispectral transmission image channel images are acquired from a multispectral image acquisition device 12A configured to illuminate a training biological specimen with two or more different illumination sources, where the two or more different illumination sources differ by at least 120nm. In some embodiments, training multispectral transmission image channel images are acquired from a multispectral image acquisition device 12A configured to illuminate a training biological specimen with two or more different illumination sources, where the two or more different illumination sources differ by at least 100nm. In some embodiments, training multispectral transmission image channel images are acquired from a multispectral image acquisition device 12A configured to illuminate a training biological specimen with two or more different illumination sources, where the two or more different illumination sources differ by at least 80nm.
- training multispectral transmission image channel images are acquired from a multispectral image acquisition device 12A configured to illuminate a training biological specimen with two or more different illumination sources, where the two or more different illumination sources differ by at least 60nm. In some embodiments, training multispectral transmission image channel images are acquired from a multispectral image acquisition device 12A configured to illuminate a training biological specimen with two or more different illumination sources, where the two or more different illumination sources differ by at least 40nm.
- training multispectral transmission image channel images are acquired from a multispectral image acquisition device 12A configured to illuminate a training biological specimen with two or more different illumination sources, where the at least two different illumination sources comprise (i) at least one illumination source within the ultraviolet (UV) spectrum; (ii) at least one illumination source within the visible spectrum; and (iii) at least one illumination source within the infrared spectrum; and wherein the at least three different illumination sources have mean wavelengths of about 365 +/- 300 nm, about 400 +/- 300 nm, about 435 +/- 300 nm, about 470 +/- 300 nm, about 500 +/- 300 nm, about 550 +/- 300 nm, about 580 +/- 300 nm, about 635 +/- 300 nm, about 660 +/- 300 nm, about 690 +/- 300 nm, about 780 +/- 300 nm, and/or about 850 +/- 300 nm.
- training multispectral transmission image channel images are acquired from a multispectral image acquisition device 12A configured to illuminate a training biological specimen with two or more different illumination sources, where the at least two different illumination sources comprise (i) at least one illumination source within the ultraviolet (UV) spectrum; (ii) at least one illumination source within the visible spectrum; and (iii) at least one illumination source within the infrared spectrum; and wherein the at least three different illumination sources have mean wavelengths of about 365 +/- 200 nm, about 400 +/- 200 nm, about 435 +/- 200 nm, about 470 +/- 200 nm, about 500 +/- 200 nm, about 550 +/- 200 nm, about 580 +/- 200 nm, about 635 +/- 200 nm, about 660 +/- 200 nm, about 690 +/- 200 nm, about 780 +/- 200 nm, and/or about 850 +/- 200 nm.
- training multispectral transmission image channel images are acquired from a multispectral image acquisition device 12A configured to illuminate a training biological specimen with two or more different illumination sources, where the at least two different illumination sources comprise (i) at least one illumination source within the ultraviolet (UV) spectrum; (ii) at least one illumination source within the visible spectrum; and (iii) at least one illumination source within the infrared spectrum; and wherein the at least three different illumination sources have mean wavelengths of about 365 +/- 100 nm, about 400 +/- 100 nm, about 435 +/- 100 nm, about 470 +/- 100 nm, about 500 +/- 100 nm, about 550 +/- 100 nm, about 580 +/- 100 nm, about 635 +/- 100 nm, about 660 +/- 100 nm, about 690 +/- 100 nm, about 780 +/- 100 nm, and/or about 850 +/- 100 nm.
- training multispectral transmission image channel images are acquired from a multispectral image acquisition device 12A configured to illuminate a training biological specimen with two or more different illumination sources, where the at least two different illumination sources comprise (i) at least one illumination source within the ultraviolet (UV) spectrum; (ii) at least one illumination source within the visible spectrum; and (iii) at least one illumination source within the infrared spectrum; and wherein the at least three different illumination sources have mean wavelengths of about 365 +/- 50 nm, about 400 +/- 50 nm, about 435 +/- 50 nm, about 470 +/- 50 nm, about 500 +/- 50 nm, about 550 +/- 50 nm, about 580 +/- 50 nm, about 635 +/- 50 nm, about 660 +/- 50 nm, about 690 +/- 50 nm, about 780 +/- 50 nm, and/or about 850 +/- 50 nm.
- training multispectral transmission image channel images are acquired from a multispectral image acquisition device 12A configured to illuminate a training biological specimen with two or more different illumination sources, where the at least two different illumination sources comprise (i) at least one illumination source within the ultraviolet (UV) spectrum; (ii) at least one illumination source within the visible spectrum; and (iii) at least one illumination source within the infrared spectrum; and wherein the at least three different illumination sources have mean wavelengths of about 365 +/- 30 nm, about 400 +/- 30 nm, about 435 +/- 30 nm, about 470 +/- 30 nm, about 500 +/- 30 nm, about 550 +/- 30 nm, about 580 +/- 30 nm, about 635 +/- 30 nm, about 660 +/- 30 nm, about 690 +/- 30 nm, about 780 +/- 30 nm, and/or about 850 +/- 30 nm.
- training multispectral transmission image channel images are acquired from a multispectral image acquisition device 12A configured to illuminate a training biological specimen with two or more different illumination sources, where the at least two different illumination sources comprise (i) at least one illumination source within the ultraviolet (UV) spectrum; (ii) at least one illumination source within the visible spectrum; and (iii) at least one illumination source within the infrared spectrum; and wherein the at least three different illumination sources have mean wavelengths of about 365 +/- 20 nm, about 400 +/- 20 nm, about 435 +/- 20 nm, about 470 +/- 20 nm, about 500 +/- 20 nm, about 550 +/- 20 nm, about 580 +/- 20 nm, about 635 +/- 20 nm, about 660 +/- 20 nm, about 690 +/- 20 nm, about 780 +/- 20 nm, and/or about 850 +/- 20 nm.
- training multispectral transmission image channel images are acquired from a multispectral image acquisition device 12A configured to illuminate a training biological specimen with two or more different illumination sources, where the at least two different illumination sources comprise (i) at least one illumination source within the ultraviolet (UV) spectrum; (ii) at least one illumination source within the visible spectrum; and (iii) at least one illumination source within the infrared spectrum; and wherein the at least three different illumination sources have mean wavelengths of about 365 nm, about 400 nm, about 435 nm, about 470 nm, about 500 nm, about 550 nm, about 580 nm, about 635 nm, about 660 nm, about 690 nm, about 780 nm, and/or about 850 nm.
- training multispectral transmission image channel images are acquired from a multispectral image acquisition device 12A configured to illuminate a training biological specimen with six or more different illumination sources, where the at least six different illumination sources comprise (i) at least one illumination source within the ultraviolet (UV) spectrum; (ii) at least two illumination sources within the visible spectrum; and (iii) at least one illumination source within the infrared spectrum; and wherein the at least six different illumination sources have mean wavelengths of about 365 +/- 20 nm, about 400 +/- 20 nm, about 435 +/- 20 nm, about 470 +/- 20 nm, about 500 +/- 20 nm, about 550 +/- 20 nm, about 580 +/- 20 nm, about 635 +/- 20 nm, about 660 +/- 20 nm, about 690 +/- 20 nm, about 780 +/- 20 nm, and/or about 850 +/- 20 nm.
- training multispectral transmission image channel images are acquired from a multispectral image acquisition device 12A configured to illuminate a training biological specimen with six or more different illumination sources, where the at least six different illumination sources comprise (i) at least one illumination source within the ultraviolet (UV) spectrum; (ii) at least two illumination sources within the visible spectrum; and (iii) at least one illumination source within the infrared spectrum; and wherein the at least six different illumination sources have mean wavelengths of about 365 +/- 10 nm, about 400 +/- 10 nm, about 435 +/- 10 nm, about 470 +/- 10 nm, about 500 +/- 10 nm, about 550 +/- 10 nm, about 580 +/- 10 nm, about 635 +/- 10 nm, about 660 +/- 10 nm, about 690 +/- 10 nm, about 780 +/- 10 nm, and/or about 850 +/- 10 nm.
- training multispectral transmission image channel images are acquired from a multispectral image acquisition device 12A configured to illuminate a training biological specimen with six or more different illumination sources, where the at least six different illumination sources comprise (i) at least one illumination source within the ultraviolet (UV) spectrum; (ii) at least two illumination sources within the visible spectrum; and (iii) at least one illumination source within the infrared spectrum; and wherein the at least six different illumination sources have mean wavelengths of about 365 nm, about 400 nm, about 435 nm, about 470 nm, about 500 nm, about 550 nm, about 580 nm, about 635 nm, about 660 nm, about 690 nm, about 780 nm, and/or about 850 nm.
- training multispectral transmission image channel images are acquired from a multispectral image acquisition device 12A configured to illuminate a training biological specimen with twelve or more different illumination sources, where the at least twelve different illumination sources comprise (i) at least one illumination source within the ultraviolet (UV) spectrum; (ii) at least two illumination sources within the visible spectrum; and (iii) at least one illumination source within the infrared spectrum; and wherein the at least twelve different illumination sources have mean wavelengths of about 365 +/- 20 nm, about 400 +/- 20 nm, about 435 +/- 20 nm, about 470 +/- 20 nm, about 500 +/- 20 nm, about 550 +/- 20 nm, about 580 +/- 20 nm, about 635 +/- 20 nm, about 660 +/- 20 nm, about 690 +/- 20 nm, about 780 +/- 20 nm, and/or about 850 +/- 20 nm.
- training multispectral transmission image channel images are acquired from a multispectral image acquisition device 12A configured to illuminate a training biological specimen with twelve or more different illumination sources, where the at least twelve different illumination sources comprise (i) at least one illumination source within the ultraviolet (UV) spectrum; (ii) at least two illumination sources within the visible spectrum; and (iii) at least one illumination source within the infrared spectrum; and wherein the at least twelve different illumination sources have mean wavelengths of about 365 +/- 10 nm, about 400 +/- 10 nm, about 435 +/- 10 nm, about 470 +/- 10 nm, about 500 +/- 10 nm, about 550 +/- 10 nm, about 580 +/- 10 nm, about 635 +/- 10 nm, about 660 +/- 10 nm, about 690 +/- 10 nm, about 780 +/- 10 nm, and/or about 850 +/- 10 nm.
- training multispectral transmission image channel images are acquired from a multispectral image acquisition device 12A configured to illuminate a training biological specimen with twelve or more different illumination sources, where the at least twelve different illumination sources comprise (i) at least one illumination source within the ultraviolet (UV) spectrum; (ii) at least two illumination sources within the visible spectrum; and (iii) at least one illumination source within the infrared spectrum; and wherein the at least twelve different illumination sources have mean wavelengths of about 365 nm, about 400 nm, about 435 nm, about 470 nm, about 500 nm, about 550 nm, about 580 nm, about 635 nm, about 660 nm, about 690 nm, about 780 nm, and/or about 850 nm.
- the selection and/or number of different illumination sources in any of the UV, visible, or IR spectra is based on a specific tissue type, or on cellular organelles or cellular structures within the biological specimen that are relevant for downstream virtual stain generation and/or analysis.
- transmission image data is acquired from training biological specimens using one or more illumination sources matching one or more mean absorbances or peak absorbances of one or more cellular structures and/or one or more cellular organelles of interest.
- because nucleic acids (e.g., DNA) absorb energy within the UV spectrum (e.g., below about 300nm) and within the visible spectrum (e.g., between about 460nm and about 490nm), training multispectral transmission image data from different illumination sources emitting energy at about the peak absorbance wavelengths of nucleic acid molecules (and/or their associated structures, e.g., histones) may facilitate the training of a virtual staining engine 210, wherein the trained virtual staining engine 210 may be used to generate a virtual stain mimicking a traditional nuclear stain (e.g., hematoxylin).
- the selection of and/or number of different illumination sources in any of the UV, visible, or IR spectra are selected based on wavelengths characteristic of a particular type of morphological stain. For instance, H&E staining is used to provide contrast based on negatively or positively charged components of tissue, providing a morphological map of said tissue sample that can be used to look for the presence, absence, arrangement, and appearance of tissue structures to diagnose and prognosticate pathology. It is possible to map the same structures using the endogenous contrast of tissue at different wavelengths, for example using 250 nm to highlight the nuclei and 420 nm to highlight endogenous cytochromes, corresponding to cytoplasm.
- one or more, such as two or more, training multispectral transmission image channel images are obtained (step 120).
- the one or more, such as two or more training multispectral transmission image channel images are simply combined into a multi-channel image (i.e., no compression or dimensionality reduction is applied).
- the obtained two or more training multispectral transmission images are reduced or compressed into a multi-channel multispectral training transmission image (step 121).
- the obtained two or more training multispectral transmission image channels are compressed to generate a multispectral training image using a dimensionality reduction method.
- suitable dimensionality reduction methods include principal component analysis (PCA) (such as principal component analysis plus discriminant analysis), projection onto latent structure regression, t-Distributed Stochastic Neighbor Embedding (t-SNE), and Uniform Manifold Approximation and Projection (UMAP).
- 4 or more training multispectral transmission image channel images are compressed into a multi-channel multispectral training transmission image using a dimensionality reduction technique.
- 5 or more training multispectral transmission image channel images are compressed into a multi-channel multispectral training transmission image using a dimensionality reduction technique.
- 6 or more training multispectral transmission image channel images are compressed into a multi-channel multispectral training transmission image using a dimensionality reduction technique.
- 7 or more training multispectral transmission image channel images are compressed into a multi-channel multispectral training transmission image using a dimensionality reduction technique.
- 8 or more training multispectral transmission image channel images are compressed into a multi-channel multispectral training transmission image using a dimensionality reduction technique. In some embodiments, 9 or more training multispectral transmission image channel images are compressed into a multi-channel multispectral training transmission image using a dimensionality reduction technique. In some embodiments, 10 or more training multispectral transmission image channel images are compressed into a multi-channel multispectral training transmission image using a dimensionality reduction technique. In some embodiments, 11 or more training multispectral transmission image channel images are compressed into a multi-channel multispectral training transmission image using a dimensionality reduction technique.
- 12 or more training multispectral transmission image channel images are compressed into a multi-channel multispectral training transmission image using a dimensionality reduction technique.
- 16 or more training multispectral transmission image channel images are compressed into a multi-channel multispectral training transmission image using a dimensionality reduction technique.
- 20 or more training multispectral transmission image channel images are compressed into a multi-channel multispectral training transmission image using a dimensionality reduction technique.
- 24 or more training multispectral transmission image channel images are compressed into a multi-channel multispectral training transmission image using a dimensionality reduction technique.
- the 2 or more training multispectral transmission image channel images may be supplied to a machine learning algorithm without performing any compression step, such as any dimensionality reduction step (e.g., Principal Component Analysis (PCA)).
- PCA is used to reduce the dimensionality of the data set comprising the four or more multispectral transmission training image channel images.
- PCA is used to reduce the dimensionality of a data set consisting of many variables correlated with each other while retaining, to the maximum extent, the variation present in the dataset. This is done by transforming the variables into a new set of variables, known as the principal components (or simply, the PCs), which are orthogonal and ordered such that the variation retained from the original variables decreases down the order. In this way, the first principal component retains the maximum variation that was present in the original variables.
- the principal components are the eigenvectors of a covariance matrix, and hence they are orthogonal.
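The compression described above can be illustrated with a minimal NumPy sketch; the 12-channel stack, the image size, and the choice of 3 retained components below are illustrative assumptions rather than values prescribed by this disclosure. The principal axes are obtained via SVD of the mean-centered pixel-by-channel matrix, which is equivalent to the eigendecomposition of the channel covariance matrix:

```python
import numpy as np

def pca_compress(stack, n_components=3):
    """Compress an (H, W, C) multispectral stack to n_components channels.

    Pixels are treated as observations and the C spectral channels as
    variables; the principal components are the eigenvectors of the channel
    covariance matrix, obtained here via SVD of the centered data.
    """
    h, w, c = stack.shape
    X = stack.reshape(-1, c).astype(np.float64)
    X -= X.mean(axis=0)                      # center each channel
    # Rows of Vt are the principal axes, ordered by decreasing retained variance.
    _, s, Vt = np.linalg.svd(X, full_matrices=False)
    scores = X @ Vt[:n_components].T         # project onto the top components
    explained = (s ** 2) / (s ** 2).sum()    # fraction of variance per component
    return scores.reshape(h, w, n_components), explained

# Hypothetical 12-channel acquisition reduced to a 3-channel training image.
rng = np.random.default_rng(0)
stack = rng.normal(size=(64, 64, 12))
compressed, explained = pca_compress(stack)
```

The first entry of `explained` corresponds to the first principal component and is the largest, matching the variance-ordering property described above.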
- the t-SNE algorithm is a non-linear dimensionality reduction technique well-suited for embedding high-dimensional data for visualization in a low-dimensional space of two or three dimensions. Specifically, it models each high-dimensional object by a two- or three-dimensional point in such a way that similar objects are modeled by nearby points and dissimilar objects are modeled by distant points with high probability.
- the t-SNE algorithm comprises two main stages. First, t-SNE constructs a probability distribution over pairs of high-dimensional objects in such a way that similar objects have a high probability of being picked while dissimilar points have an extremely small probability of being picked.
- Second, t-SNE defines a similar probability distribution over the points in the low-dimensional map, and it minimizes the Kullback-Leibler divergence between the two distributions with respect to the locations of the points in the map.
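The two stages can be illustrated numerically with a simplified NumPy sketch. A full t-SNE implementation additionally tunes per-point Gaussian bandwidths to a target perplexity and minimizes the objective by gradient descent; the point counts, dimensions, and bandwidth below are arbitrary:

```python
import numpy as np

def squared_distances(X):
    # Pairwise squared Euclidean distances between rows of X.
    return ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)

def high_dim_affinities(X, sigma=1.0):
    # Stage 1: Gaussian similarities over pairs of high-dimensional points,
    # normalized into a probability distribution P.
    P = np.exp(-squared_distances(X) / (2.0 * sigma ** 2))
    np.fill_diagonal(P, 0.0)
    return P / P.sum()

def low_dim_affinities(Y):
    # Stage 2: heavy-tailed (Student-t) similarities over the map points,
    # normalized into a distribution Q.
    Q = 1.0 / (1.0 + squared_distances(Y))
    np.fill_diagonal(Q, 0.0)
    return Q / Q.sum()

def kl_divergence(P, Q, eps=1e-12):
    # The objective t-SNE minimizes with respect to the map point locations.
    mask = P > 0
    return float((P[mask] * np.log(P[mask] / (Q[mask] + eps))).sum())

rng = np.random.default_rng(1)
X = rng.normal(size=(20, 12))   # hypothetical 12-channel pixel spectra
Y = rng.normal(size=(20, 2))    # a candidate 2-D embedding
loss = kl_divergence(high_dim_affinities(X), low_dim_affinities(Y))
```

Moving the points in `Y` so that `loss` decreases is exactly the optimization t-SNE performs.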
- the t-SNE algorithm is further described in United States Patent Publication Nos. 2018/0046755, 2014/0336942, and 2018/0166077, the disclosures of which are hereby incorporated by reference herein in their entireties.
- FIG. 11 illustrates one method of compressing four or more training multispectral transmission image channel images into a single multispectral training image.
- training multispectral transmission image channel images are obtained (step 130).
- Each of the obtained four or more training multispectral transmission image channel images is then converted into a data matrix (step 131). For instance, for 12 acquired training multispectral image channel images, the stacked images are converted into a 3-dimensional matrix with 2 spatial dimensions and 12 channels.
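The stacking described above can be sketched as follows (the array sizes are illustrative):

```python
import numpy as np

# Twelve hypothetical single-channel acquisitions of the same field of view.
rng = np.random.default_rng(2)
channel_images = [rng.random((128, 128)) for _ in range(12)]

# Stack the per-wavelength images along a third axis, giving a matrix with
# two spatial dimensions and one channel dimension.
data_matrix = np.stack(channel_images, axis=-1)   # shape (128, 128, 12)
```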
- after acquiring the training multispectral transmission image data / training multispectral transmission image channel images, the training biological specimens are stained with a morphological stain.
- one or more training brightfield transmission images may be acquired from morphologically stained training biological specimens, such as using brightfield image acquisition device 12B.
- a training brightfield transmission image may be obtained with multispectral image acquisition device 12A, such as by acquiring image data at three or more predetermined wavelengths, such as at about 700nm +/- 10nm, about 550nm +/- 10nm, and about 470nm +/- 10nm.
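Assuming the three acquisition bands are assigned to the red, green, and blue display channels in order of decreasing wavelength, composing such a brightfield-like image could be sketched as follows (the array names, sizes, and channel assignment are illustrative, not a mapping specified by this disclosure):

```python
import numpy as np

rng = np.random.default_rng(3)
# Hypothetical transmission images acquired at ~700 nm, ~550 nm, and ~470 nm,
# with intensities normalized to the [0, 1] range.
t_700, t_550, t_470 = (rng.random((256, 256)) for _ in range(3))

# Map the long/middle/short wavelength bands to the R/G/B channels and
# quantize to 8-bit values for display or storage.
brightfield_rgb = np.stack([t_700, t_550, t_470], axis=-1)
brightfield_u8 = (np.clip(brightfield_rgb, 0.0, 1.0) * 255).astype(np.uint8)
```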
- An example of a training brightfield transmission image is shown in FIG. 12B.
- the acquired training brightfield transmission images serve as ground-truth when training a virtual staining engine, such as described herein.
- the morphological stain is hematoxylin, which stains the nuclei blue. In other embodiments, the morphological stain is eosin, which stains the cytoplasm pink. In yet other embodiments, the obtained biological specimen is stained with both hematoxylin and eosin (H&E). In some embodiments, an H&E staining protocol may be performed, including applying hematoxylin stain mixed with a metallic salt, or mordant, to the tissue section. The tissue section can then be rinsed in a weak acid solution to remove excess staining (differentiation), followed by bluing in mildly alkaline water. In some embodiments, after the application of hematoxylin, the tissue can be counterstained with eosin. It will be appreciated that other H&E staining techniques can be implemented.
- the morphological stain is a "special stain.”
- a "special stain” refers to any chemically based stain useful for histological analysis that is not an immunohistochemical stain, an in-situ hybridization stain, or H&E.
- the special stain includes one or more reagents selected from Acid fuchsin (C.I. 42685; absorbance maximum 546 nm), Alcian blue 8 GX (C.I. 74240; absorbance maximum 615 nm), Alizarin red S (C.I. 58005; absorbance maximum 556 and 596 nm), Auramine O (C.I.
- "C.I." refers to Color Index™.
- the Color Index™ describes a commercial product by its recognized usage class, its hue, and a serial number (which simply reflects the chronological order in which related colorant types have been registered with the Color Index). This definition enables a particular product to be classified along with other products whose essential colorant is of the same chemical constitution and in which that essential colorant results from a single chemical reaction or a series of reactions.
- specialty stains include, but are not limited to, PAS STAINING KIT, SPECIAL STAINS GMS II STAIN KIT PACK, Reticulum II Staining Kit, IRON STAINING KIT, GIEMSA STAINING KIT, TRICHROME STAINING KIT, DIASTASE KIT, BenchMark Special Stain AFB Staining Kit, ALCIAN BLUE FOR PAS, LIGHT GREEN FOR PAS, STEINER II, STAINING KIT, Congo Red Staining Kit, Special Stains Van Gieson CS, Elastic Stain Core Kit, Jones Staining Kit, ALC BLUE STAINING KIT PH2.5,MUCICARMINE STAINING KIT, GRAM STAINING KIT, GREEN FOR TRICHROME, Jones Light Green Staining Kit, each available from Roche Diagnostics.
- the special stain is for a Grocott methenamine silver assay. In other embodiments, the special stain is for TRICHROME.
- samples are morphologically stained according to the processes described in PCT Application Nos. PCT/EP2021/073738 or PCT/EP2021/073733, the disclosures of which are hereby incorporated by reference herein in their entireties.
- the acquired training multispectral transmission image data / training multispectral transmission image channel images and the acquired brightfield transmission image are each associated with one or more identifiers.
- the identifiers may include a sample number, sample type, stain type, fixation properties, etc.
- the acquired training multispectral transmission image channel images and the acquired brightfield transmission image are each associated with unique identifiers such that the training multispectral transmission image channel images (or any multispectral training image derived therefrom) and the training brightfield transmission image may be associated with each other and retrieved from the one or more memories 201 and/or one or more storage units 240 to facilitate the preparation of coregistered pairs of training images for use in training a virtual staining engine 210.
- the image processing module 212 is utilized to generate a pair of coregistered training images from the generated multi-channel multispectral training transmission image and the training brightfield transmission image such that features in each image are aligned and/or coregistered (see steps 140 and 141 of FIG. 13A).
- the coregistered training images are then used to train a machine learning algorithm, namely, to train a virtual staining engine.
- One method of coregistering images is set forth in FIG. 24.
- Other methods of coregistering images with respect to each other are known and described in the literature, see for example D. Mueller et al., Real-time deformable registration of multi-modal whole slides for digital pathology, Computerized Medical Imaging and Graphics vol. 35 p. 542-556 (2011); F. El-Gamal et al., Current trends in medical image registration and fusion, Egyptian Informatics Journal vol. 17 p. 99-124 (2016); J. Singla et al., A systematic way of affine transformation using image registration, International Journal of Information Technology and Knowledge Management July-December 2012, Vol. 5, No. 2, pp. 239-243; Z.
- a generated multi-channel multispectral training transmission image and a training brightfield transmission image may be aligned or coregistered with RANSAC (random sample consensus, a robust model-fitting algorithm commonly used for image alignment).
- both a generated multi-channel multispectral training transmission image and a training brightfield transmission image are obtained (step 150), such as according to the methods described herein.
- the obtained images are segmented, such as into 64-pixel x 64-pixel segments, 128-pixel x 128-pixel segments, 256-pixel x 256-pixel segments, 512-pixel x 512-pixel segments, etc.
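The segmentation step above can be sketched as a simple tiling operation; the function name, the default tile size, and the zero-padding of ragged edges are illustrative assumptions rather than the implementation prescribed by this disclosure.

```python
import numpy as np

def tile_image(image: np.ndarray, tile: int = 256) -> list:
    """Split a 2-D (optionally multi-channel) image into tile x tile segments,
    zero-padding the bottom/right edges so every segment is full size."""
    h, w = image.shape[:2]
    pad_h = (-h) % tile  # rows needed to reach a multiple of `tile`
    pad_w = (-w) % tile  # columns needed to reach a multiple of `tile`
    pad_spec = [(0, pad_h), (0, pad_w)] + [(0, 0)] * (image.ndim - 2)
    padded = np.pad(image, pad_spec)
    segments = []
    for y in range(0, padded.shape[0], tile):
        for x in range(0, padded.shape[1], tile):
            segments.append(padded[y:y + tile, x:x + tile])
    return segments
```

For example, a 1000 x 1000 pixel image tiled at 256 pixels is padded to 1024 x 1024 and yields a 4 x 4 grid of 16 segments.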
- one or more regions of interest (ROIs) are then identified within the image segments (step 151).
- features are then identified in each of the identified one or more ROIs (step 152).
- features include epithelial cell nuclei, fat cells, inflammatory cells, etc.
- lumens, glands, and/or fatty cells are used as first landmarks to locate zones. Then, after "zooming in" (i.e., increasing magnification), epithelial cells, red blood cells, and inflammatory cells may be located.
- Landmarks are then placed in each of the generated multi-channel multispectral training transmission image and the training brightfield transmission image (step 153). Landmark placement in the obtained generated multi-channel multispectral training transmission image and in the obtained training brightfield transmission image is shown in FIGS. 14A and 14B, respectively.
- a transform is then applied to translate, rotate, shear, shrink, and/or stretch the obtained training brightfield transmission image to match landmarks placed in the generated multichannel multispectral training transmission image (step 154).
- the transform is an affine transform, such as one which permits shearing and scaling in addition to rotation and translation.
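An affine transform of the kind described above can be recovered from matched landmark pairs by least squares; in practice, RANSAC (per the embodiments above) would first select inlier landmark correspondences before this fit. The function names and the direct least-squares formulation here are illustrative assumptions, not the disclosure's specific implementation.

```python
import numpy as np

def estimate_affine(src_pts: np.ndarray, dst_pts: np.ndarray) -> np.ndarray:
    """Least-squares 2x3 affine matrix mapping src landmarks to dst landmarks.

    src_pts, dst_pts: (N, 2) arrays of matched landmark coordinates, N >= 3.
    The recovered matrix encodes translation, rotation, shear, and scaling.
    """
    n = src_pts.shape[0]
    # Homogeneous source coordinates: [x, y, 1]
    A = np.hstack([src_pts, np.ones((n, 1))])
    # Solve A @ M_cols ~= dst_pts; M_cols is 3x2, so the affine matrix is its transpose.
    M_cols, *_ = np.linalg.lstsq(A, dst_pts, rcond=None)
    return M_cols.T

def apply_affine(M: np.ndarray, pts: np.ndarray) -> np.ndarray:
    """Apply a 2x3 affine matrix to an (N, 2) array of points."""
    A = np.hstack([pts, np.ones((pts.shape[0], 1))])
    return A @ M.T
```

Once estimated, the same matrix is applied to warp the brightfield image so that its landmarks coincide with those of the multispectral image.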
- the pairs of coregistered images are provided to a machine learning algorithm to train the machine learning algorithm (virtual staining engine) to predict a virtually stained image.
- Pixel-to-pixel, cell-to-cell, and/or patch-to-patch mapping is performed using the pairs of coregistered training images.
- Machine learning algorithms include: 1) Self-supervised learning neural network.
- neural network refers to one or more computer-implemented networks capable of being trained to achieve a goal.
- references herein to a neural network include one neural network or multiple interrelated neural networks that are trained together.
- Examples of neural networks include, without limitation, convolutional neural networks (CNNs), recurrent neural networks (RNNs), fully connected neural networks, encoder neural networks (e.g., "encoders"), decoder neural networks (e.g., "decoders"), dense-connection neural networks, and other types of neural networks.
- a neural network can be implemented using special hardware (e.g., GPU, tensor processing units (TPUs), systolic arrays, single instruction multiple data (SIMD) processor, etc.), using software code and a general-purpose processor, or a combination of special hardware and software code.
- the neural network is configured as a deep learning network.
- deep learning is a branch of machine learning based on a set of algorithms that attempt to model high level abstractions in data. Deep learning is part of a broader family of machine learning methods based on learning representations of data. An observation can be represented in many ways such as a vector of intensity values per pixel, or in a more abstract way as a set of edges, regions of particular shape, etc. Some representations are better than others at simplifying the learning task.
- One of the promises of deep learning is replacing handcrafted features with efficient algorithms for unsupervised or semi-supervised feature learning and hierarchical feature extraction.
- the neural network may be a deep neural network with a set of weights that model the world according to the data used to train it.
- Neural networks typically consist of multiple layers, and the signal path traverses from front to back between the layers. Any neural network may be implemented for this purpose. Suitable neural networks include LeNet, AlexNet, ZFnet, GoogLeNet, VGGNet, VGG16, DenseNet (also known as a Dense Convolutional Network or DenseNet-121), MiniNet, and the ResNet.
- a fully convolutional neural network is utilized, such as described by Long et al., "Fully Convolutional Networks for Semantic Segmentation," Computer Vision and Pattern Recognition (CVPR), 2015 IEEE Conference, June 2015 (INSPEC Accession Number: 15524435), the disclosure of which is hereby incorporated by reference.
- Suitable neural networks include a Convolutional Neural Network (CNN), a Recurrent Neural Network, a Long Short-Term Memory Neural Network (LSTM), a Compound Scaled Efficient Neural Network (EfficientNet), a Normalizer Free Neural Network (NFNet), a Densely Connected Convolutional Neural Network (DenseNet), an Aggregated Residual Transformation Neural Network (ResNeXT), a Channel Boosted Convolutional Neural Network (CB-CNN), a Wide Residual Network (WRN), or a Residual Neural Network (RNN).
- the neural network is one that operates on a cross-entropy loss, e.g., one that may compute a per-pixel cross-entropy.
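A per-pixel cross-entropy of the kind referenced above can be sketched as follows; the (H, W, C) array layout and one-hot target encoding are illustrative assumptions, not requirements of this disclosure.

```python
import numpy as np

def per_pixel_cross_entropy(pred: np.ndarray, target: np.ndarray,
                            eps: float = 1e-12) -> np.ndarray:
    """Per-pixel cross-entropy between a predicted class-probability map
    `pred` of shape (H, W, C) and a one-hot `target` of the same shape.
    Returns an (H, W) map of losses, one value per pixel."""
    pred = np.clip(pred, eps, 1.0)  # guard against log(0)
    return -np.sum(target * np.log(pred), axis=-1)
```

A pixel predicted with a uniform distribution over two classes incurs a loss of ln 2 ≈ 0.693, while a confidently correct pixel incurs a loss near zero.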
- Implicit generative models, such as Generative Adversarial Networks (GANs).
- the neural network is a generative network. Further details regarding GAN may be found in Goodfellow et al., Generative Adversarial Nets., Advances in Neural Information Processing Systems, 27, pp. 2672-2680 (2014), which is incorporated by reference herein in its entirety.
- a "generative” network can be generally defined as a model that is probabilistic in nature. In other words, a "generative” network is not one that performs forward simulation or rulebased approaches. Instead, the generative network can be learned (in that its parameters can be learned) based on a suitable set of training data (e.g., a plurality of training image data sets).
- the neural network is configured as a deep generative network.
- the network may be configured to have a deep learning architecture in that the network may include multiple layers, which perform a number of algorithms or transformations.
- the term "layer” or “network layer” refers to an analysis stage in a neural network. Layers perform different types of analysis related to the type of the neural network.
- layers in an encoder may perform different types of analysis on an input image to encode the input image.
- a particular layer provides features based on the particular analysis performed by that layer.
- a particular layer down-samples a received image.
- An additional layer performs additional down-sampling.
- each round of down-sampling reduces the visual quality of the output image, but provides features based on the related analysis performed by that layer.
- Bengio "Generative Adversarial Nets,” in Advances in neural information processing systems, pp. 2672-2680, 2014, to model the training image data distribution which is then used to generate new image samples from the same distribution.
- the discriminator D tries to classify images generated by G whether they came from real training data (true distribution) or fake.
- These two networks are trained at the same time and updated as if they are playing a game. That is, generator G tries to fool discriminator D and in turn discriminator D adjust its parameters to make better estimates to detect fake images generated by G.
- the GAN is a cyclic GAN.
- An example for a suitable cyclic GAN network architecture is described by Jun-Yan Zhu, Taesung Park, Phillip Isola, and Alexei A. Efros in “Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks,” (24 Nov. 2017).
- the GAN is a three-channel GAN which utilizes data from three channels, such as three channels in compressed (e.g., dimensionality reduced) input images.
- the GAN is a multi-channel GAN which utilizes data from a plurality of channels, such as two or more channels, such as three or more channels, such as four or more channels, such as five or more channels, such as six or more channels, such as seven or more channels, such as eight or more channels, such as nine or more channels, such as ten or more channels, such as eleven or more channels, such as twelve or more channels, such as sixteen or more channels, such as twenty or more channels, such as twenty-four or more channels, etc.
- the GAN training procedure involves training of two different networks, namely (i) a generator network, which in this case aims to learn the statistical transformation between the unstained multi-channel multispectral training transmission image and the corresponding brightfield image after the training biological specimen is stained with a morphological stain; and (ii) a discriminator network that learns how to discriminate between a true brightfield image of a stained training specimen and the generator network's output image.
- the desired result of this training process is a trained deep neural network, which transforms an unstained training specimen input image into a virtually stained image of an unstained biological sample which will be diagnostically equivalent to a chemically stained brightfield image of the same biological sample.
- there is no discernible difference in diagnostic quality between the resulting virtually stained images of the unstained biological specimens and corresponding images of chemically stained biological specimens, at least not to the extent that any differences would substantially alter a diagnostic outcome.
- a GAN is used to identify discrepancies between pairs of coregistered training images, where the brightfield training transmission image in each pair of coregistered training images serves as ground truth.
- the process of identifying the discrepancies within a pair of coregistered training images may be performed in a "discriminator network."
- the discrepancies between the brightfield training transmission image and the multi-channel multispectral training transmission image (in the pair of coregistered training images) may then be formulated as a loss function.
- the loss function may be communicated to the discriminator network and to the generator network for use in backpropagation.
- the input from the loss function enables the discriminator network to learn how to distinguish between an actual or true image of a stained training biological specimen and the generator network's output virtual histological image.
- the generator network computes a virtually stained image, and the discriminator network assesses its similarity to an actual histological image, namely the brightfield training transmission image included within the pair of coregistered training images.
- the generator network may utilize the loss function to generate a corrected virtual image which is then communicated to the discriminator network. This process may be performed iteratively until the discriminator network cannot distinguish between a generated virtually stained image and the brightfield training transmission image (the image which includes an actual morphological stain).
- the discriminator network's classification helps the generator network to update its weights and thereby fine-tune the virtual histological images being produced.
- the generator network begins to output higher-quality virtually stained histological images, and the discriminator network becomes better at distinguishing the virtually stained histological images from the actual histological images.
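The adversarial objectives described above, in which the discriminator learns to separate real stained images from generated ones while the generator learns to fool it, can be sketched with the standard binary cross-entropy GAN losses. This is a minimal illustration of the loss formulation under assumed function names, not the disclosure's specific network or training code.

```python
import numpy as np

def discriminator_loss(d_real: np.ndarray, d_fake: np.ndarray,
                       eps: float = 1e-12) -> float:
    """Binary cross-entropy loss for the discriminator: push D(real) -> 1
    and D(fake) -> 0, where d_real/d_fake are D's probability outputs."""
    d_real = np.clip(d_real, eps, 1 - eps)
    d_fake = np.clip(d_fake, eps, 1 - eps)
    return float(-np.mean(np.log(d_real) + np.log(1.0 - d_fake)))

def generator_loss(d_fake: np.ndarray, eps: float = 1e-12) -> float:
    """Non-saturating generator loss: push D(fake) -> 1 so the generator's
    virtually stained images are classified as real."""
    d_fake = np.clip(d_fake, eps, 1 - eps)
    return float(-np.mean(np.log(d_fake)))
```

During training, each loss is backpropagated through its respective network; the generator's loss falls as its virtually stained outputs increasingly fool the discriminator.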
- the network can produce virtually stained images representative of morphological stains, e.g., H&E, a basic fuchsin stain, a Masson's Trichrome stain, etc.
- the present disclosure provides methods for the generation of a virtually stained image of a test unstained biological specimen, based on an acquired input image of the test unstained biological specimen, where the virtually stained image manifests the appearance of the test unstained biological specimen as if it were stained with a morphological stain.
- given an unstained input image (e.g., a multispectral test transmission image), the trained virtual staining engine generates a predicted or virtually stained output image that corresponds to that unstained input image.
- where the virtual staining engine selected is trained to predict H&E staining from an unstained input image, the trained virtual staining engine generates a predicted or virtually stained H&E output image for the unstained input image.
- This predicted or virtually stained output image may then be stored in one or more memories or in one or more storage systems for later retrieval and downstream processing (e.g., analysis by a pathologist or other automated image analysis workflow).
- a method of generating a virtually stained slide comprises (i) acquiring test multispectral transmission image data from an unstained test biological specimen (step 161); (ii) supplying the test multispectral transmission image data to a trained virtual staining engine 210 trained to generate an image of an unstained biological specimen stained with a morphological stain (step 162); and (iii) with the trained virtual staining engine, generating a virtually stained image of the test unstained biological specimen stained with the morphological stain (step 163).
- the test multispectral image data is acquired from one or more test unstained biological specimens.
- the obtained test biological specimens may be obtained from any source.
- the obtained test biological specimens may be obtained from a tumor, including, for example, tumor biopsy samples, resection samples, cell smears, fine needle aspirates (FNA), liquid-based cytology samples, and the like.
- the obtained test biological specimens are derived from specimens that have been previously stained, where the previous stain has been substantially removed.
- one or more, such as two or more test multispectral transmission image channel images are acquired for each test unstained biological specimen, where each of the one or more, such as two or more test multispectral transmission image channel images are acquired at a specific wavelength.
- At least one, such as at least two test multispectral transmission image channel images are acquired for each unstained test biological specimen, where each of the at least one, such as at least two test multispectral transmission image channel images are acquired using a multispectral image acquisition device 12A configured to illuminate each unstained test biological specimen with at least one, such as at least two different illumination sources; and further configured to acquire transmission image data (e.g., at least two multispectral image channel images) of the biological specimen illuminated with the at least one, such as at least two different illumination sources.
- At least one, such as at least two test multispectral transmission image channel images are acquired for each unstained test biological specimen, where each of the at least one, such as at least two test multispectral transmission image channel images are acquired at different wavelengths.
- four test multispectral transmission image channel images are obtained from at least four different illumination sources.
- five test multispectral transmission image channel images are obtained from at least five different illumination sources.
- test multispectral transmission image channel images are obtained from at least six different illumination sources.
- test multispectral transmission image channel images are obtained from at least seven different illumination sources.
- test multispectral transmission image channel images are obtained from at least eight different illumination sources.
- test multispectral transmission image channel images are obtained from at least nine different illumination sources. In some embodiments, test multispectral transmission image channel images are obtained from at least ten different illumination sources. In some embodiments, test multispectral transmission image channel images are obtained from at least eleven different illumination sources. In some embodiments, test multispectral transmission image channel images are obtained from at least twelve different illumination sources. In some embodiments, test multispectral transmission image channel images are obtained from twelve or more illumination sources. In some embodiments, test multispectral transmission image channel images are obtained from sixteen or more different illumination sources. In some embodiments, test multispectral transmission image channel images are obtained from twenty or more different illumination sources. In some embodiments, test multispectral transmission image channel images are obtained from twenty-four or more different illumination sources.
- the test multispectral image data is acquired based on the trained virtual staining engine selected for use in generating a virtually stained slide and, in particular, based on the number of image channels used during training and the particular wavelengths at which the training image data was acquired.
- the number of different test multispectral image channel images to acquire and the wavelengths of the different illumination sources that the test unstained biological specimen should be illuminated with are dependent on (i) the number of different training multispectral image channel images used during the training of a specific virtual staining engine; and (ii) the wavelengths of the different illumination sources that were used during the training of the specific virtual staining engine (such as within +/- 15%, +/- 12%, +/- 10%, +/- 9%, +/- 8%, +/- 7%, +/- 6%, +/- 5%, +/- 4%, +/- 3%, +/- 2%, or +/- 1% of the wavelengths of the different illumination sources that were used during the training of the specific virtual staining engine).
- test multispectral image data is acquired using about the same five different narrow-band illumination sources, such as within +/- 15%, +/- 12%, +/- 10%, +/- 9%, +/- 8%, +/- 7%, +/- 6%, +/- 5%, +/- 4%, +/- 3%, +/- 2%, or +/- 1% of the wavelengths of the same five different illumination sources.
- test multispectral image channel images should be acquired at 350 nm +/- 10%, 375 nm +/- 10%, 400 nm +/- 10%, 425 nm +/- 10%, 450 nm +/- 10%, 475 nm +/- 10%, 500 nm +/- 10%, 525 nm +/- 10%, 550 nm +/- 10%, 575 nm +/- 10%, 600 nm +/- 10%, and 625 nm +/- 10%.
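The channel-count and wavelength-tolerance matching described above can be sketched as a simple compatibility check; the function name, the sorting-based pairing of channels, and the fractional tolerance parameter are illustrative assumptions rather than the disclosure's prescribed procedure.

```python
def wavelengths_match(test_nm, train_nm, tol_frac=0.10):
    """Return True when the test acquisition uses the same number of
    illumination channels as the training acquisition and each test
    wavelength lies within +/- tol_frac (e.g., 10%) of its corresponding
    training wavelength."""
    if len(test_nm) != len(train_nm):
        return False  # channel counts must agree
    return all(abs(t - r) <= tol_frac * r
               for t, r in zip(sorted(test_nm), sorted(train_nm)))
```

Such a check could gate whether a given set of acquired test channel images is supplied to a particular trained virtual staining engine.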
- two or more test multispectral transmission image channel images are obtained (step 170) from any test unstained biological specimen. Each of the two or more obtained test multispectral transmission image channel images are then combined to form a multi-channel multispectral test transmission image (step 171).
- the two or more test multispectral transmission image channel images are combined into a multi-channel multispectral test transmission image without compressing the images, such as without performing any dimensionality reduction technique (e.g., PCA).
- the two or more test multispectral transmission image channel images are combined into a multi-channel multispectral test transmission image by compressing the images using a dimensionality reduction technique (e.g., PCA).
- the obtained four or more test multispectral transmission images are reduced or compressed into a multi-channel multispectral test transmission image.
- the obtained four or more test multispectral transmission image channels are compressed to generate a multispectral test image using a dimensionality reduction method.
- suitable dimensionality reduction methods include principal component analysis (PCA) (such as principal component analysis plus discriminant analysis), projection onto latent structure regression, t-Distributed Stochastic Neighbor Embedding (t-SNE), and Uniform Manifold Approximation and Projection (UMAP).
- the steps outlined in FIG. 11 herein, which describe a method of compressing four or more training multispectral transmission image channel images into a single multispectral training image, may be applied equally in the generation of a 3-channel multispectral test transmission image.
- the obtained four or more test multispectral transmission images are combined into a multi-channel multispectral test transmission image without performing any compression technique or any dimensionality reduction technique (e.g., PCA).
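The PCA-based compression option described above, reducing four or more acquired channels to a smaller number (e.g., three), can be sketched via a singular value decomposition of the per-pixel spectra. This is an illustrative sketch under assumed array shapes, not the disclosure's prescribed dimensionality reduction.

```python
import numpy as np

def compress_channels_pca(stack: np.ndarray, n_out: int = 3) -> np.ndarray:
    """Compress an (H, W, C) multispectral channel stack to (H, W, n_out)
    via PCA on the per-pixel spectra."""
    h, w, c = stack.shape
    X = stack.reshape(-1, c).astype(float)
    X -= X.mean(axis=0)  # center each channel
    # Principal axes come from the SVD of the centered pixel-by-channel matrix.
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    # Project each pixel's spectrum onto the top n_out principal components.
    return (X @ Vt[:n_out].T).reshape(h, w, n_out)
```

The resulting channels are ordered by decreasing explained variance, so the first output channel captures the dominant spectral variation across the specimen.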
- the test unstained multispectral image data (e.g., a multi-channel Multispectral Test Transmission Image) is supplied to a selected trained virtual staining engine to provide a virtually stained image based on the input image of the test unstained biological specimen. Examples of virtually stained images of unstained biological specimens are set forth within FIG. 1.
- the same multi-channel Multispectral Test Transmission Image is supplied to multiple different trained virtual staining engines, to provide multiple different virtually stained images of the same unstained biological specimen.
- the system 200 of the present disclosure may be tied to a specimen processing apparatus that can perform one or more preparation processes on the tissue specimen.
- the preparation process can include, without limitation, deparaffinizing a specimen, conditioning a specimen (e.g., cell conditioning), staining a specimen, performing antigen retrieval, performing immunohistochemistry staining (including labeling) or other reactions, and/or performing in situ hybridization (e.g., SISH, FISH, etc.) staining (including labeling) or other reactions, as well as other processes for preparing specimens for microscopy, microanalyses, mass spectrometric methods, or other analytical methods.
- the processing apparatus can apply fixatives to the specimen.
- Fixatives can include cross-linking agents (such as aldehydes, e.g., formaldehyde, paraformaldehyde, and glutaraldehyde, as well as non-aldehyde cross-linking agents), oxidizing agents (e.g., metallic ions and complexes, such as osmium tetroxide and chromic acid), protein-denaturing agents (e.g., acetic acid, methanol, and ethanol), fixatives of unknown mechanism (e.g., mercuric chloride, acetone, and picric acid), combination reagents (e.g., Carnoy's fixative, methacarn, Bouin's fluid, B5 fixative, Rossman's fluid, and Gendre's fluid), microwaves, and miscellaneous fixatives (e.g., excluded volume fixation and vapor fixation).
- the sample can be deparaffinized using appropriate deparaffinizing fluid(s).
- any number of substances can be successively applied to the specimen.
- the substances can be for pretreatment (e.g., to reverse protein-crosslinking, expose nucleic acids, etc.), denaturation, hybridization, washing (e.g., stringency wash), detection (e.g., link a visual or marker molecule to a probe), amplifying (e.g., amplifying proteins, genes, etc.), counterstaining, coverslipping, or the like.
- the specimen processing apparatus can apply a wide range of substances to the specimen.
- the substances include, without limitation, stains, probes, reagents, rinses, and/or conditioners.
- the substances can be fluids (e.g., gases, liquids, or gas/liquid mixtures), or the like.
- the fluids can be solvents (e.g., polar solvents, non-polar solvents, etc.), solutions (e.g., aqueous solutions or other types of solutions), or the like.
- Reagents can include, without limitation, stains, wetting agents, antibodies (e.g., monoclonal antibodies, polyclonal antibodies, etc.), antigen recovering fluids (e.g., aqueous- or non-aqueous-based antigen retrieval solutions, antigen recovering buffers, etc.), or the like.
- Probes can be an isolated nucleic acid or an isolated synthetic oligonucleotide, attached to a detectable label or reporter molecule. Labels can include radioactive isotopes, enzyme substrates, co-factors, ligands, chemiluminescent or fluorescent agents, nanoparticles, haptens, and enzymes.
- fluid refers to any liquid or liquid composition, including water, solvents, buffers, solutions (e.g., polar solvents, non-polar solvents), and/or mixtures.
- the fluid may be aqueous or non-aqueous.
- Non-limiting examples of fluids include washing solutions, rinsing solutions, acidic solutions, alkaline solutions, transfer solutions, and hydrocarbons (e.g., alkanes, isoalkanes and aromatic compounds such as xylene).
- washing solutions include a surfactant to facilitate spreading of the washing liquids over the specimen-bearing surfaces of the slides.
- acid solutions include deionized water, an acid (e.g., acetic acid), and a solvent.
- alkaline solutions include deionized water, a base, and a solvent.
- transfer solutions include one or more glycol ethers, such as one or more propylene-based glycol ethers (e.g., propylene glycol ethers, di(propylene glycol) ethers, and tri(propylene glycol) ethers), ethylene-based glycol ethers (e.g., ethylene glycol ethers, di(ethylene glycol) ethers, and tri(ethylene glycol) ethers), and functional analogs thereof.
- Non-limiting examples of buffers include citric acid, potassium dihydrogen phosphate, boric acid, diethyl barbituric acid, piperazine-N,N'-bis(2-ethanesulfonic acid), dimethylarsinic acid, 2-(N-morpholino)ethanesulfonic acid, tris(hydroxymethyl)methylamine (TRIS), N-tris(hydroxymethyl)methyl-3-aminopropanesulfonic acid (TAPS), N,N-bis(2-hydroxyethyl)glycine (Bicine), N-tris(hydroxymethyl)methylglycine (Tricine), and 4-(2-hydroxyethyl)-1-piperazineethanesulfonic acid (HEPES).
- the buffer may be comprised of tris(hydroxymethyl)methylamine (TRIS), N-tris(hydroxymethyl)methyl-3-aminopropanesulfonic acid (TAPS), N,N-bis(2-hydroxyethyl)glycine (Bicine), N-tris(hydroxymethyl)methylglycine (Tricine), or 4-(2-hydroxyethyl)-1-piperazineethanesulfonic acid (HEPES).
- Staining may be performed with a histochemical staining module or separate platform, such as an automated IHC/ISH slide stainer.
- Automated IHC/ISH slide stainers typically include at least: reservoirs of the various reagents used in the staining protocols, a reagent dispense unit in fluid communication with the reservoirs for dispensing reagent onto a slide, a waste removal system for removing used reagents and other waste from the slide, and a control system that coordinates the actions of the reagent dispense unit and waste removal system.
- many automated slide stainers can also perform steps ancillary to staining (or are compatible with separate systems that perform such ancillary steps), including slide baking (for adhering the sample to the slide), dewaxing (also referred to as deparaffinization), antigen retrieval, counterstaining, dehydration and clearing, and coverslipping.
- Prichard, Overview of Automated Immunohistochemistry, Arch. Pathol. Lab. Med., vol. 138, pp. 1578-1582 (2014), incorporated herein by reference in its entirety, describes several specific examples of automated IHC/ISH slide stainers and their various features, including the intelliPATH (Biocare Medical), WAVE (Celerus Diagnostics), DAKO OMNIS and DAKO AUTOSTAINER LINK 48 (Agilent Technologies), BENCHMARK (Ventana Medical Systems, Inc.), Leica BOND, and Lab Vision Autostainer (Thermo Scientific) automated slide stainers. Additionally, Ventana Medical Systems, Inc. is the assignee of a number of United States patents disclosing systems and methods for performing automated analyses, including U.S. Pat. Nos.
- reagent refers to solutions or suspensions including one or more agents capable of covalently or non-covalently reacting with, coupling with, interacting with, or hybridizing to another entity.
- Non-limiting examples of such agents include specific-binding entities, antibodies (primary antibodies, secondary antibodies, or antibody conjugates), nucleic acid probes, oligonucleotide sequences, detection probes, chemical moieties bearing a reactive functional group or a protected functional group, enzymes, solutions or suspensions of dye or stain molecules.
- staining units typically operate on one of the following principles: (1) open individual slide staining, in which slides are positioned horizontally and reagents are dispensed as a puddle on the surface of the slide containing a tissue sample (such as implemented on the DAKO AUTOSTAINER Link 48 (Agilent Technologies) and intelliPATH (Biocare Medical) stainers); (2) liquid overlay technology, in which reagents are either covered with or dispensed through an inert fluid layer deposited over the sample (such as implemented on Ventana BenchMark and DISCOVERY stainers); (3) capillary gap staining, in which the slide surface is placed in proximity to another surface (which may be another slide or a coverplate) to create a narrow gap, through which capillary forces draw up and keep liquid reagents in contact with the samples (such as the staining principles used by DAKO TECHMATE, Leica BOND, and DAKO OMNIS stainers).
- Some capillary gap staining platforms do not mix the fluids in the gap (such as the DAKO TECHMATE and the Leica BOND).
- In dynamic gap staining, capillary forces are used to apply sample to the slide, and then the parallel surfaces are translated relative to one another to agitate the reagents during incubation to effect reagent mixing (such as the staining principles implemented on DAKO OMNIS slide stainers (Agilent)).
- a translatable head is positioned over the slide. A lower surface of the head is spaced apart from the slide by a first gap sufficiently small to allow a meniscus of liquid to form from liquid on the slide during translation of the slide.
- a mixing extension having a lateral dimension less than the width of a slide extends from the lower surface of the translatable head to define a second gap smaller than the first gap between the mixing extension and the slide.
- the lateral dimension of the mixing extension is sufficient to generate lateral movement in the liquid on the slide in a direction generally extending from the second gap to the first gap.
- an automated H&E staining platform may be used.
- Automated systems for performing staining typically operate on one of two staining principles: batch staining (also referred to as "dip ‘n dunk") or individual slide staining.
- Batch stainers generally use vats or baths of reagents in which many slides are immersed at the same time.
- Individual slide stainers apply reagent directly to each slide, and no two slides share the same aliquot of reagent.
- Examples of commercially available H&E stainers include the VENTANA SYMPHONY (individual slide stainer) and VENTANA HE 600 (individual slide stainer) series H&E stainers from Roche; the Dako CoverStainer (batch stainer) from Agilent Technologies; and the Leica ST4020 Small Linear Stainer (batch stainer), Leica ST5020 Multistainer (batch stainer), and Leica ST5010 Autostainer XL series (batch stainer) H&E stainers from Leica Biosystems Nussloch GmbH.
- the stained samples can be manually analyzed on a microscope, and/or digital images of the stained samples can be acquired for archiving and/or digital analysis (e.g., with image acquisition apparatus 12B).
- Digital images can be captured via a scanning platform such as a slide scanner that can scan the stained slides at 20x, 40x, or other magnifications to produce high resolution whole-slide digital images.
- the typical slide scanner includes at least: (1) a microscope with lens objectives, (2) a light source (such as halogen, light emitting diode, white light, and/or multispectral light sources, depending on the dye), (3) robotics to move glass slides around or to move the optics around the slide or both, (4) one or more digital cameras for image capture, (5) a computer and associated software to control the robotics and to manipulate, manage, and view digital slides.
- Digital data at a number of different X-Y locations (and in some cases, at multiple Z planes) on the slide are captured by the camera’s charge-coupled device (CCD), and the images are joined together to form a composite image of the entire scanned surface.
- tile-based scanning, in which the slide stage or the optics are moved in small increments to capture square image frames that slightly overlap adjacent squares. The captured squares are then automatically matched to one another to build the composite image.
- Examples of commercially available slide scanners include: 3DHistech PANNORAMIC SCAN II; DigiPath PATHSCOPE; Hamamatsu NANOZOOMER RS, HT, and XR; Huron TISSUESCOPE 4000, 4000XT, and HS; Leica SCANSCOPE AT, AT2, CS, FL, and SCN400; Mikroscan D2; Olympus VS120-SL; Omnyx VL4 and VL120; PerkinElmer LAMINA; Philips ULTRA-FAST SCANNER; Sakura Finetek VISIONTEK; Unic PRECICE 500 and PRECICE 600x; and Zeiss AXIO SCAN.Z1.
- the scanning device is a digital pathology device as disclosed in any of United States Patent No. 9,575,301 and/or U.S. Patent Application Publication Nos. 2014/0178169, 2021/0092308, and 2021/0088769, the content of each of which is incorporated by reference in its entirety.
- Exemplary commercially available image analysis software packages include the VENTANA VIRTUOSO software suite (Ventana Medical Systems, Inc.); TISSUE STUDIO, DEVELOPER XD, and IMAGE MINER software suites (Definiens); BIOTOPIX, ONCOTOPIX, and STEREOTOPIX software suites (Visiopharm); and the HALO platform (Indica Labs, Inc.).
- any imaging may be accomplished using any of the systems disclosed in U.S. Patent Nos. 10,317,666 and 10,313,606, the disclosures of which are hereby incorporated by reference herein in their entireties.
- the imaging apparatus may be a brightfield imager such as the iScan CoreoTM brightfield scanner or the DP200 scanner sold by Ventana Medical Systems, Inc.
- Image analysis system may include one or more computing devices such as desktop computers, laptop computers, tablets, smartphones, servers, application-specific computing devices, or any other type(s) of electronic device(s) capable of performing the techniques and operations described herein.
- image analysis system may be implemented as a single device.
- image analysis system may be implemented as a combination of two or more devices together achieving the various functionalities discussed herein.
- image analysis system may include one or more server computers and one or more client computers communicatively coupled to each other via one or more local-area networks and/or wide-area networks such as the Internet.
- the image analysis system typically includes at least a memory, a processor, and a display.
- Memory may include any combination of any type of volatile or nonvolatile memories, such as random-access memories (RAMs), read-only memories such as an Electrically Erasable Programmable Read-Only Memory (EEPROM), flash memories, hard drives, solid state drives, optical discs, and the like. It is appreciated that memory can be included in a single device and can also be distributed across two or more devices.
- Processor may include one or more processors of any type, such as central processing units (CPUs), graphics processing units (GPUs), special-purpose signal or image processors, field-programmable gate arrays (FPGAs), tensor processing units (TPUs), and so forth. It is appreciated that processor can be included in a single device and can also be distributed across two or more devices.
- Display may be implemented using any suitable technology, such as LCD, LED, OLED, TFT, Plasma, etc.
- display may be a touch-sensitive display (a touchscreen).
- Image analysis system also typically includes a software system stored on the memory comprising a set of instructions implementable on the processor, the instructions comprising various image analysis tasks, such as object identification, stain intensity quantification, and the like.
- Exemplary commercially available software packages useful in implementing modules as disclosed herein include VENTANA VIRTUOSO; Definiens TISSUE STUDIO, DEVELOPER XD, and IMAGE MINER; and Visiopharm BIOTOPIX, ONCOTOPIX, and STEREOTOPIX software packages.
- the imaging apparatus is a brightfield imager slide scanner.
- One brightfield imager is the iScan Coreo brightfield scanner sold by Ventana Medical Systems, Inc.
- the imaging apparatus is a digital pathology device as disclosed in International Patent Application No.: PCT/US2010/002772 (Patent Publication No.: WO/2011/049608) entitled IMAGING SYSTEM AND TECHNIQUES or disclosed in U.S. Patent Application No. 61/533,114, filed on Sep. 9, 2011, entitled IMAGING SYSTEMS, CASSETTES, AND METHODS OF USING THE SAME.
- International Patent Application No. PCT/US2010/002772 and U.S. Patent Application No. 61/533,114 are incorporated by reference in their entireties.
- Embodiments of the subject matter and the operations described in this specification can be implemented in digital electronic circuitry, or in computer software, firmware, or hardware, including the structures disclosed in this specification and their structural equivalents, or in combinations of one or more of them.
- Embodiments of the subject matter described in this specification can be implemented as one or more computer programs, for example, one or more modules of computer program instructions, encoded on computer storage medium for execution by, or to control the operation of, data processing apparatus. Any of the modules described herein may include logic that is executed by the processor(s).
- Logic refers to any information having the form of instruction signals and/or data that may be applied to affect the operation of a processor. Software is an example of logic.
- a computer storage medium can be, or can be included in, a computer-readable storage device, a computer-readable storage substrate, a random or serial access memory array or device, or a combination of one or more of them.
- although a computer storage medium is not a propagated signal, a computer storage medium can be a source or destination of computer program instructions encoded in an artificially generated propagated signal.
- the computer storage medium can also be, or can be included in, one or more separate physical components or media (e.g., multiple CDs, disks, or other storage devices).
- the operations described in this specification can be implemented as operations performed by a data processing apparatus on data stored on one or more computer-readable storage devices or received from other sources.
- the term "programmed processor" encompasses all kinds of apparatus, devices, and machines for processing data, including by way of example a programmable microprocessor, a computer, a system on a chip, or multiple ones, or combinations, of the foregoing.
- the apparatus can include special purpose logic circuitry, e.g., an FPGA (field programmable gate array) or an ASIC (application-specific integrated circuit).
- the apparatus also can include, in addition to hardware, code that creates an execution environment for the computer program in question, e.g., code that constitutes processor firmware, a protocol stack, a database management system, an operating system, a cross-platform runtime environment, a virtual machine, or a combination of one or more of them.
- the apparatus and execution environment can realize various different computing model infrastructures, such as web services, distributed computing, and grid computing infrastructures.
- a computer program (also known as a program, software, software application, script, or code) can be written in any form of programming language, including compiled or interpreted languages, declarative or procedural languages, and it can be deployed in any form, including as a stand-alone program or as a module, component, subroutine, object, or other unit suitable for use in a computing environment.
- a computer program may, but need not, correspond to a file in a file system.
- a program can be stored in a portion of a file that holds other programs or data (e.g., one or more scripts stored in a markup language document), in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, subprograms, or portions of code).
- a computer program can be deployed to be executed on one computer or on multiple computers that are located at one site or distributed across multiple sites and interconnected by a communication network.
- the processes and logic flows described in this specification can be performed by one or more programmable processors executing one or more computer programs to perform actions by operating on input data and generating output.
- the processes and logic flows can also be performed by, and apparatus can also be implemented as, special purpose logic circuitry, e.g., an FPGA (field programmable gate array) or an ASIC (application-specific integrated circuit).
- processors suitable for the execution of a computer program include, by way of example, both general and special purpose microprocessors, and any one or more processors of any kind of digital computer.
- a processor will receive instructions and data from a read-only memory or a random-access memory or both.
- the essential elements of a computer are a processor for performing actions in accordance with instructions and one or more memory devices for storing instructions and data.
- a computer will also include or be operatively coupled to receive data from or transfer data to, or both, one or more mass storage devices for storing data, e.g., magnetic, magneto-optical disks, or optical disks.
- a computer need not have such devices.
- a computer can be embedded in another device, e.g., a mobile telephone, a personal digital assistant (PDA), a mobile audio or video player, a game console, a Global Positioning System (GPS) receiver, or a portable storage device (e.g., a universal serial bus (USB) flash drive), to name just a few.
- Devices suitable for storing computer program instructions and data include all forms of non-volatile memory, media, and memory devices, including by way of example semiconductor memory devices, e.g., EPROM, EEPROM, and flash memory devices; magnetic disks, e.g., internal hard disks or removable disks; magneto-optical disks; and CD-ROM and DVD-ROM disks.
- the processor and the memory can be supplemented by, or incorporated in, special purpose logic circuitry.
- a computer having a display device, e.g., an LCD (liquid crystal display), LED (light emitting diode) display, or OLED (organic light emitting diode) display, for displaying information to the user and a keyboard and a pointing device, e.g., a mouse or a trackball, by which the user can provide input to the computer.
- a touch screen can be used to display information and receive input from a user.
- a computer can interact with a user by sending documents to and receiving documents from a device that is used by the user; for example, by sending web pages to a web browser on a user's client device in response to requests received from the web browser.
- Embodiments of the subject matter described in this specification can be implemented in a computing system that includes a back-end component, e.g., as a data server, or that includes a middleware component, e.g., an application server, or that includes a front-end component, e.g., a client computer having a graphical user interface or a Web browser through which a user can interact with an implementation of the subject matter described in this specification, or any combination of one or more such back-end, middleware, or front-end components.
- the components of the system can be interconnected by any form or medium of digital data communication, e.g., a communication network. Examples of communication networks include a local area network ("LAN") and a wide area network ("WAN"), an internetwork (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks).
- the computing system can include any number of clients and servers.
- a client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
- a server transmits data (e.g., an HTML page) to a client device (e.g., for purposes of displaying data to and receiving user input from a user interacting with the client device).
- Data generated at the client device (e.g., a result of the user interaction) can be received from the client device at the server.
- Example 1 Digital Contrast on Unstained Tissue Slides Using Multispectral Microscopy and Deep Learning
- Histopathologists use chemical staining techniques on tissue samples to highlight microscopic structure and composition, looking for abnormalities that indicate the presence, nature, and extent of pathology. Tissue processing steps are time-consuming and expensive. Each assay requires highly trained and scarce histotechnicians and produces chemical waste. Furthermore, tissue processing and staining are destructive; cutting through multiple tissue sections can deplete valuable biopsy samples.
- Virtual staining utilizes one or more computerized algorithms to create an artificial effect of staining without physically tampering with the slide. It uses optical means to scan tissue sections and deep learning techniques to convert the digitized signal into rendered images that pathologists can use as a diagnostic tool. Some advantages include reduced variability, process robustness, increased speed, marker multiplexing, reduced processing expertise, and potential biomarkers with novel medical value.
- This project utilized a multispectral scanner 12A (see also FIG. 5 which illustrates non-limiting components which may be included within any multispectral scanner).
- An unstained multi-tissue array (MTA) section was scanned after performing a dewax protocol, obtaining endogenous contrast at 12 different wavelengths (365, 400, 435, 470, 500, 550, 580, 635, 660, 690, 780 and 850 nm).
- the multispectral images were processed and transformed using deep learning techniques. It was possible to obtain a digital H&E render that was evaluated as having diagnostic quality by 2 certified pathologists.
- the experiment was designed as a feasibility test using a multispectral microscopy scanner originally designed for multiplexing assays.
- the objective was to test the feasibility of transforming the endogenous contrast of unstained tissue scanned at multiple wavelengths into an H&E-like assay that could be used for primary diagnostics.
- Multi-tissue arrays containing liver, kidney, skin, colon, and tonsil were cut and mounted on glass slides. The slides were dewaxed using a modified HE600 protocol to remove paraffin from the sample.
- Step 2 Multispectral Scanning
- Step 3 Staining and Digitizing
- the ground truth (chemically stained slide) was also scanned in the multispectral scanner, coding the wavelengths closest to red, green, and blue into an RGB image, and then transforming it into an image that resembles brightfield H&E, as shown below.
- the feasibility test was conducted using a single MTA, which was scanned with 12 wavelengths (FLASH) before chemically staining it and scanning it under a brightfield microscope (DP200). The experiment used an overfitted model (the same dataset was used for training and testing). Testing on a larger dataset in which independent data is used for training and testing is required to verify the results.
- the method starts with the preparation of the sample, which is a FFPE section mounted on a standard glass microscope slide as commonly done in normal histopathology workflow.
- the slide was then dewaxed using a standard protocol, which can be performed by chemical means (baths of xylene and other chemicals) or by applying heat to melt the paraffin away.
- the next step was to image the slides (with no coverslip) on a device that had the ability to map the endogenous absorption spectra of the tissue sample at a series of discrete wavelengths, to obtain coregistered images at a plurality of wavelengths ranging possibly from the UV to the IR.
- the device had to have an array sensor (e.g., CCD, CMOS) that can obtain images at the different wavelengths, the means to focus the images at each wavelength, and the means of illuminating the sample at specific spectral wavelengths and/or combinations of wavelengths.
- the images obtained were then compressed into 3 channels (using dimensionality-reduction methods like PCA, t-SNE, or UMAP) and coded into an RGB image.
- the weights of each channel were controlled to enhance features of interest on each channel.
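The compression and weighting steps above can be sketched as follows. This is a minimal PCA illustration using NumPy's SVD; the actual pipeline may use library implementations of PCA, t-SNE, or UMAP, and the per-channel weights here are placeholder assumptions.

```python
import numpy as np

def multispectral_to_rgb(cube, weights=(1.0, 1.0, 1.0)):
    """Reduce a (H, W, C) multispectral cube to a 3-channel RGB image.

    A minimal PCA sketch via SVD; `weights` stand in for the per-channel
    feature-enhancement weighting described in the text.
    """
    h, w, c = cube.shape
    flat = cube.reshape(-1, c).astype(np.float64)
    flat -= flat.mean(axis=0)                    # center each spectral channel
    # principal axes are the right singular vectors of the centered data
    _, _, vt = np.linalg.svd(flat, full_matrices=False)
    components = flat @ vt[:3].T                 # project onto top 3 PCs
    components *= np.asarray(weights)            # per-channel weighting
    # rescale each component to [0, 255] for RGB coding
    mins = components.min(axis=0)
    spans = np.ptp(components, axis=0)
    spans[spans == 0] = 1.0
    rgb = (components - mins) / spans * 255.0
    return rgb.reshape(h, w, 3).astype(np.uint8)
```

The returned image can then be tiled and fed to the trained network like any RGB input.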
- the images were then segmented into 256 x 256-pixel tiles and transformed by a previously trained GAN algorithm into H&E-like digital stains. After the recoloring, the images were stitched back together and were optionally smoothed to mitigate tiling artifacts.
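A minimal sketch of the segment-and-stitch step, assuming non-overlapping 256 x 256 tiles with zero-padded edges (the GAN recoloring itself is omitted):

```python
import numpy as np

TILE = 256  # tile edge used in this workflow

def tile_image(img, tile=TILE):
    """Split an (H, W, 3) image into non-overlapping tile-sized squares.

    Edges that do not divide evenly are zero-padded; a production pipeline
    might instead use overlapping tiles (see the blending step later).
    """
    h, w = img.shape[:2]
    ph = (-h) % tile
    pw = (-w) % tile
    padded = np.pad(img, ((0, ph), (0, pw), (0, 0)))
    tiles = {}
    for y in range(0, padded.shape[0], tile):
        for x in range(0, padded.shape[1], tile):
            tiles[(y, x)] = padded[y:y + tile, x:x + tile]
    return tiles, padded.shape[:2]

def stitch_tiles(tiles, padded_shape, out_shape):
    """Reassemble tiles into the original image footprint."""
    out = np.zeros(padded_shape + (3,), dtype=tiles[(0, 0)].dtype)
    for (y, x), t in tiles.items():
        out[y:y + t.shape[0], x:x + t.shape[1]] = t
    return out[:out_shape[0], :out_shape[1]]
```

In the real pipeline each tile would pass through the trained GAN between `tile_image` and `stitch_tiles`.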
- the same slide used for scanning the multispectral images was stained using the staining protocol of choice (i.e., H&E in the HE600).
- the slide was coverslipped and then scanned with a brightfield scanner.
- the digital images obtained from the scanning were then co-registered with the multispectral ones using the TrakEM2 ImageJ plugin.
- landmarks were manually designated between a target image and an image to be registered.
- An affine transform was then used to co-register the images based on the landmarks.
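The landmark-based affine registration can be sketched as a least-squares fit. The actual workflow used the TrakEM2 ImageJ plugin; the NumPy version below is an illustrative stand-in.

```python
import numpy as np

def fit_affine(src, dst):
    """Least-squares affine transform mapping landmark points src -> dst.

    src, dst: (N, 2) arrays of manually designated landmark pairs (N >= 3).
    Returns a 2x3 matrix A such that dst ~= [x, y, 1] @ A.T.
    """
    src = np.asarray(src, dtype=np.float64)
    ones = np.ones((src.shape[0], 1))
    X = np.hstack([src, ones])                 # (N, 3) homogeneous coordinates
    A, *_ = np.linalg.lstsq(X, np.asarray(dst, dtype=np.float64), rcond=None)
    return A.T                                 # (2, 3) affine matrix

def apply_affine(A, pts):
    """Apply the 2x3 affine matrix to (N, 2) points."""
    pts = np.asarray(pts, dtype=np.float64)
    X = np.hstack([pts, np.ones((pts.shape[0], 1))])
    return X @ A.T
```

The fitted matrix would then be applied to the full image grid (e.g., via an image-warping routine) to co-register the pair.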
- both images were segmented into 256 x 256-pixel images and paired (multispectral and brightfield scans of the H&E-stained tissue sample) and used to train a GAN algorithm in a supervised manner. This process was repeated for a series of slides until the algorithm learned the majority of variation in tissue samples. The trained algorithm can then be used to transform future tissue slides without the need for chemical staining.
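The tile-pairing step above can be sketched as follows. The dictionaries keyed by tile position are an assumed data layout, and the GAN architecture itself (e.g., a pix2pix-style conditional GAN) is not shown.

```python
import numpy as np

def build_training_pairs(multispectral_tiles, brightfield_tiles):
    """Pair coregistered 256 x 256 tiles for supervised GAN training.

    Both dicts map a (row, col) tile position to an image array; only
    positions present in both modalities yield an (input, target) pair.
    """
    keys = sorted(set(multispectral_tiles) & set(brightfield_tiles))
    return [(multispectral_tiles[k], brightfield_tiles[k]) for k in keys]
```

Each pair then serves as one (input, ground-truth) example for supervised training.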
- a tissue sample was selected to demonstrate the diagnostic ability of the described method to generate an H&E virtual stain that can be used in diagnostics of cancerous lesions.
- FFPE blocks containing samples of cancerous breast resections were selected, sectioned with a microtome (thickness 3 - 5 um), and mounted on plus-charged glass slides. The samples were taken into an HE600 to bake (5 minutes in an oven) and were deparaffinized using heat to melt the wax and EZprep for rinsing, following the standard method on the HE600.
- the samples were placed on the multispectral scanner stage. After the area of interest was selected/detected by the user, the sample was illuminated with the first (of multiple) wavelengths, with a predefined illumination power (controlled by a pulse-width modulation circuit to reduce/augment the duty cycle on the LED light source).
- the objective of the scanner was focused by means of moving the sample or the objective, and a first field of view (FOV) was digitized with the camera with a predefined exposure time and gain. Then the sample was illuminated with a second wavelength.
- the following wavelengths were used in this experiment: 365 nm, 400 nm, 435 nm, 470 nm, 500 nm, 550 nm, 580 nm, 635 nm, 660 nm, 690 nm, 780 nm, and 850 nm. Nevertheless, 2 wavelengths (470 and 780 nm) were discarded as they contained optical artifacts due to poor scanning, leaving only 10 wavelengths.
- This WSI of the stained sample was then processed to create an RGB image by placing the image illuminated with 660 nm in the red channel, the one illuminated with 550 nm in the green channel and the one illuminated with 435 nm in the blue channel.
- This WSI of the stained sample was then processed to apply a previously calibrated color correction matrix to correct the colors and perform a white balancing, in such a way that the image resembles closely to what a user would look at by using a brightfield microscope.
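The channel mapping (660 nm to red, 550 nm to green, 435 nm to blue), color correction, and white balancing can be sketched as below. The 3x3 color correction matrix and white reference are illustrative placeholders; the calibrated values used in practice are device-specific.

```python
import numpy as np

def compose_corrected_rgb(ch660, ch550, ch435, ccm, white):
    """Build a brightfield-like RGB image from three wavelength channels.

    ch660/ch550/ch435: (H, W) transmission images in [0, 1].
    ccm: illustrative 3x3 color correction matrix.
    white: per-channel white reference used for white balancing.
    """
    rgb = np.stack([ch660, ch550, ch435], axis=-1).astype(np.float64)
    rgb /= np.asarray(white, dtype=np.float64)       # white balance
    h, w, _ = rgb.shape
    rgb = rgb.reshape(-1, 3) @ np.asarray(ccm).T     # color correction
    return np.clip(rgb.reshape(h, w, 3), 0.0, 1.0)
```

With an identity matrix and a correct white reference, the output simply normalizes the raw channels; a calibrated matrix additionally mixes channels so the render matches what a brightfield microscope would show.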
- the next step was to find the FOVs of the stitched image when dealing with a WSI. To do so, a Laplacian filter was applied to look for discontinuities, revealing the stitch lines. Both images were then divided into individual FOV images, and the above-described process was repeated: paired images were filtered to reveal descriptors, and the best correlation was found by translating and rotating the filtered version of the FOV containing the ground truth. Once the best position was found, the transformation was applied to the FOV with no filtering, and the images were cut to discard black areas created by rotating or moving the ground truth, yielding two coregistered images (PCA and ground truth). Finally, a final step was performed to further divide each registered FOV into tiles of 270 x 270 pixels.
- the process was repeated at the tile level, creating pixel-level registered tiles of 256 x 256-pixel images. All the images were then saved to disk to train a GAN algorithm.
- the coregistration method also applied a strict quality control to discard the tiles that were not above the Fourier correlation threshold, as well as the background tiles (white tiles that contain no tissue information).
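The tile quality control can be sketched as below. The exact Fourier correlation metric and thresholds used in the experiment are not specified, so this is an illustrative phase-correlation check plus a simple white-tile filter.

```python
import numpy as np

def phase_correlation_peak(a, b):
    """Peak of the normalized cross-power spectrum of two grayscale tiles.

    Values near 1 indicate well-registered identical content (the metric is
    invariant to pure translation); tiles whose peak falls below a chosen
    threshold would be discarded.
    """
    fa, fb = np.fft.fft2(a), np.fft.fft2(b)
    cross = fa * np.conj(fb)
    denom = np.abs(cross)
    denom[denom == 0] = 1.0                      # guard zero coefficients
    corr = np.fft.ifft2(cross / denom)
    return float(np.abs(corr).max())

def is_background(tile, white_level=0.95, fraction=0.99):
    """Flag nearly-white tiles that contain no tissue information."""
    return float((tile >= white_level).mean()) >= fraction
```

A pipeline would keep a tile pair only when `phase_correlation_peak` clears the threshold and neither tile is background.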
- the hyperspectral cube of the WSI of the same section was divided into 256 x 256-pixel tiles, with an overlap of the tiles of 100 pixels in both x and y directions and reduced with PCA using the same method described above to create the RGB images from the PCA components.
- 40% of the tiles were used for training, and 60% of the tiles were from areas not used in the training dataset.
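The 40/60 tile split can be sketched as a simple random partition of tile positions. The seed and random partitioning are assumptions; the experiment separated tiles by area, which a spatially blocked split would approximate better.

```python
import numpy as np

def split_tiles(tile_keys, train_fraction=0.4, seed=0):
    """Randomly partition tile positions into training and held-out sets."""
    rng = np.random.default_rng(seed)
    keys = list(tile_keys)
    rng.shuffle(keys)
    n_train = int(len(keys) * train_fraction)
    return keys[:n_train], keys[n_train:]
```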
- the trained algorithm was used to transform the PCA-reduced spectral images into a virtual H&E resembling the brightfield scan of the stained sample.
- the tiles were then stitched by overlapping and blending them with a linear blending algorithm to reduce stitching artifacts and create the effect of a WSI of a brightfield image.
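The linear blending of overlapping tiles can be sketched for a single horizontal seam; a full WSI stitcher would apply the same ramp along both axes. This is an illustrative sketch, not the exact blending used in the experiment.

```python
import numpy as np

def blend_pair_horizontal(left, right, overlap):
    """Linearly blend two horizontally adjacent tiles sharing `overlap` columns.

    In the overlap region the left tile's weight ramps 1 -> 0 while the
    right tile's ramps 0 -> 1, hiding the seam.
    """
    h, wl, _ = left.shape
    ramp = np.linspace(1.0, 0.0, overlap)[None, :, None]   # (1, overlap, 1)
    seam = left[:, wl - overlap:] * ramp + right[:, :overlap] * (1.0 - ramp)
    return np.concatenate(
        [left[:, :wl - overlap], seam, right[:, overlap:]], axis=1)
```

Because the weights sum to 1 everywhere, flat regions keep their intensity while the transition across the seam becomes gradual.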
- FIG. 19 shows the resulting virtually stained whole slide image.
- the tissue sample was selected to prove the diagnostic ability of the described method to generate a Masson's Trichrome virtual stain.
- FFPE blocks containing samples of colorectal (CRC) resections were selected, sectioned with a microtome (thickness 3 - 5 um), and mounted on TOMO glass slides.
- the samples were taken to a BenchMark Special Stainer and deparaffinized using the deparaffinization steps selected in a standard Trichrome protocol. This involves utilizing BenchMark Special Stains Deparaffinization Solution (Cat. No. 860-036 / 06523102001) and BenchMark Special Stains Wash II (Cat. No. 860-041 / 08309817001).
- the sample was illuminated with the first (of multiple) wavelengths, with a predefined illumination power (controlled by a pulse-width modulation circuit to reduce/augment the duty cycle on the LED light source).
- the objective of the scanner was focused by means of moving the sample or the objective, and a first field of view (FOV) was digitized with the camera with a predefined exposure time and gain. Then the sample was illuminated with a second wavelength.
- the following wavelengths were used in this experiment: 365 nm, 400 nm, 435 nm, 470 nm, 500 nm, 550 nm, 580 nm, 635 nm, 660 nm, 690 nm, 780 nm, and 850 nm.
- Only the central wavelength (635 nm) was focused (with a feedback-controlled autofocus algorithm), and all the other wavelengths were digitized without moving the Z distance in the same FOV. The process was repeated until every FOV was digitized at all the different wavelengths.
- the stage was moved to a different location in the vicinity of the previous location with some area overlapping.
- the FOV were stitched together to form a whole slide hypercube.
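The per-FOV acquisition loop described above (focus once at 635 nm, then capture every wavelength at the same Z) can be sketched as follows. `Led`, `Stage`, and `Camera` are hypothetical stand-ins for the scanner's hardware interfaces, stubbed here so the sketch runs; they are not the actual device API.

```python
# Wavelengths used in the experiment; focusing happens only at 635 nm.
WAVELENGTHS_NM = [365, 400, 435, 470, 500, 550, 580, 635, 660, 690, 780, 850]
FOCUS_WAVELENGTH_NM = 635

class Led:
    def set_wavelength(self, nm, duty_cycle=0.5):
        # Illumination power controlled via PWM duty cycle on the LED source.
        self.nm, self.duty_cycle = nm, duty_cycle

class Stage:
    def autofocus(self):
        # Stand-in for the feedback-controlled autofocus; returns the Z found.
        self.z = 0.0
        return self.z

class Camera:
    def capture(self, exposure_ms=10.0, gain=1.0):
        # Placeholder frame; a real camera returns an image array.
        return ("frame", exposure_ms, gain)

def scan_fov(stage, led, camera):
    """Digitize one field of view at all wavelengths without refocusing."""
    led.set_wavelength(FOCUS_WAVELENGTH_NM)
    stage.autofocus()                       # focus only at the 635 nm channel
    frames = {}
    for nm in WAVELENGTHS_NM:
        led.set_wavelength(nm)
        frames[nm] = camera.capture()       # same Z for every channel
    return frames
```

The outer loop would move the stage to overlapping neighboring locations and call `scan_fov` until the whole slide hypercube is acquired.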
- after the slide was stained and coverslipped, it was introduced into the stage of the same multispectral scanner.
- the above-mentioned process was repeated but only in the 3 wavelengths: 660 nm for red, 550 nm for green and 435 nm for blue.
- the position of each FOV of the stained images closely matched the position of the corresponding FOV of the unstained samples through the use of fiducials located on the sample (e.g., edges, tags, etc.), and a focusing map that was the same as the one used for the unstained sample was stored to disk.
- This WSI of the stained sample was then processed to create an RGB image by placing the image illuminated with 660 nm in the red channel, the one illuminated with 550 nm in the green channel, and the one illuminated with 435 nm in the blue channel.
- No PCA reduction was used in this example. Instead, the RGB images were registered to the 12 wavelengths with no further processing.
- the hyperspectral cube of the WSI of a different section not used to train the algorithm was divided into 256 x 256-pixel tiles and input into the GAN algorithm to transform the hyperspectral cube into an RGB cube. The tiles were then stitched together to form the image (see FIG. 21).
- a system for generating a virtually stained image of a test unstained biological specimen disposed on a substrate comprising: (i) one or more processors, and (ii) one or more memories coupled to the one or more processors, the one or more memories to store computer-executable instructions that, when executed by the one or more processors, cause the system to perform operations comprising: a. obtaining a test multispectral transmission image of the test unstained biological specimen, wherein the test multispectral transmission image is derived from one or more test multispectral transmission image channel images; b.
- supplying the obtained test multispectral transmission image of the test unstained biological specimen to a virtual staining engine, wherein the virtual staining engine is trained to generate an image of an unstained biological specimen stained with a morphological stain, and wherein the virtual staining engine is trained from a plurality of pairs of coregistered training images derived from one or more training biological specimens, i. wherein a first training image in each pair of the plurality of coregistered training images is derived from one or more training multispectral transmission image channel images; ii. wherein a second training image in each pair of the plurality of coregistered training images comprises training brightfield image data of the same biological specimen stained with the morphological stain; and c. with the trained virtual staining engine, generating the virtually stained image of the test unstained biological specimen stained with the morphological stain; wherein each of the one or more test multispectral transmission image channel images is acquired at about the same wavelengths as each of the one or more training multispectral transmission image channel images.
- Additional Embodiment 2 The system of additional embodiment 1, wherein (i) at least three test multispectral transmission image channel images are acquired; and (ii) at least three training multispectral transmission image channel images are acquired; wherein each of the at least three test multispectral transmission image channel images are acquired at about the same wavelengths as each of the at least three training multispectral transmission image channel images.
- Additional Embodiment 3 The system of additional embodiment 1, wherein (i) at least four test multispectral transmission image channel images are acquired; and (ii) at least four training multispectral transmission image channel images are acquired; wherein each of the at least four test multispectral transmission image channel images are acquired at about the same wavelengths as each of the at least four training multispectral transmission image channel images.
- test multispectral transmission image is generated by performing a dimensionality reduction on the at least four test multispectral transmission image channel images.
- Additional Embodiment 5 The system of additional embodiment 3, wherein the first training image in each pair of coregistered training images is generated by performing a dimensionality reduction on the at least four training multispectral transmission image channel images.
- Additional Embodiment 6 The system of additional embodiment 1, wherein (i) at least six test multispectral transmission image channel images are acquired; and (ii) at least six training multispectral transmission image channel images are acquired; wherein each of the at least six test multispectral transmission image channel images are acquired at about the same wavelengths as each of the at least six training multispectral transmission image channel images.
- Additional Embodiment 7 The system of additional embodiment 6, wherein the test multispectral transmission image is generated by performing a dimensionality reduction on the at least six test multispectral transmission image channel images.
- Additional Embodiment 8 The system of additional embodiment 6, wherein the first training image in each pair of coregistered training images is generated by performing a dimensionality reduction on the at least six training multispectral transmission image channel images.
- Additional Embodiment 9 The system of additional embodiment 1, wherein (i) at least twelve test multispectral transmission image channel images are acquired; and (ii) at least twelve training multispectral transmission image channel images are acquired; wherein each of the at least twelve test multispectral transmission image channel images are acquired at about the same wavelengths as each of the at least twelve training multispectral transmission image channel images.
- test multispectral transmission image is generated by performing a dimensionality reduction on the at least twelve test multispectral transmission image channel images.
- Additional Embodiment 11 The system of additional embodiment 9, wherein the first training image in each pair of coregistered training images is generated by performing a dimensionality reduction on the at least twelve training multispectral transmission image channel images.
- Additional Embodiment 12 The system of any one of additional embodiments 1 - 11, wherein the wavelengths are selected from about 365 +/- 20 nm, about 400 +/- 20 nm, about 435 +/- 20 nm, about 470 +/- 20 nm, about 500 +/- 20 nm, about 550 +/- 20 nm, about 580 +/- 20 nm, about 635 +/- 20 nm, about 660 +/- 20 nm, about 690 +/- 20 nm, about 780 +/- 20 nm, and/or about 850 +/- 20 nm.
- Additional Embodiment 13 The system of any one of additional embodiments 1 - 11, wherein the wavelengths are selected from about 365 +/- 10 nm, about 400 +/- 10 nm, about 435 +/- 10 nm, about 470 +/- 10 nm, about 500 +/- 10 nm, about 550 +/- 10 nm, about 580 +/- 10 nm, about 635 +/- 10 nm, about 660 +/- 10 nm, about 690 +/- 10 nm, about 780 +/- 10 nm, and/or about 850 +/- 10 nm.
- Additional Embodiment 14 The system of any one of additional embodiments 1 - 11, wherein the wavelengths are selected from about 365 nm, about 400 nm, about 435 nm, about 470 nm, about 500 nm, about 550 nm, about 580 nm, about 635 nm, about 660 nm, about 690 nm, about 780 nm, and/or about 850 nm.
- Additional Embodiment 15 The system of any one of additional embodiments 1 - 11, wherein the obtained virtual staining engine comprises a generative adversarial network.
- Additional Embodiment 16 The system of any one of additional embodiments 1 - 11, wherein the morphological stain is a primary stain.
- Additional Embodiment 17 The system of any one of additional embodiments 1 - 11, wherein the morphological stain is a special stain.
- Additional Embodiment 18 The system of any one of additional embodiments 1 - 11, wherein the morphological stain comprises hematoxylin.
- Additional Embodiment 19 The system of any one of additional embodiments 1 - 11, wherein the morphological stain comprises hematoxylin and eosin.
- Additional Embodiment 20 The system of any one of additional embodiments 1 - 19, wherein the training brightfield data is acquired using a multispectral image acquisition device.
- Additional Embodiment 21 A method for generating a virtually stained image of a test unstained biological specimen disposed on a substrate comprising: a. obtaining a test multispectral transmission image of the test unstained biological specimen, wherein the test multispectral transmission image is derived from one or more test multispectral transmission image channel images; b. supplying the obtained test multispectral transmission image of the test unstained biological specimen to a virtual staining engine, wherein the virtual staining engine is trained to generate an image of an unstained biological specimen stained with a morphological stain, and wherein the virtual staining engine is trained from a plurality of pairs of coregistered training images derived from one or more training biological specimens, i. wherein a first training image in each pair of the plurality of coregistered training images is derived from one or more training multispectral transmission image channel images; ii. wherein a second training image in each pair of the plurality of coregistered training images comprises training brightfield image data of the same biological specimen stained with the morphological stain; and c. with the trained virtual staining engine, generating the virtually stained image of the test unstained biological specimen stained with the morphological stain; wherein each of the one or more test multispectral transmission image channel images are acquired at about the same wavelengths as each of the one or more training multispectral transmission image channel images.
- Additional Embodiment 22 The method of additional embodiment 21, wherein (i) at least three test multispectral transmission image channel images are acquired; and (ii) at least three training multispectral transmission image channel images are acquired; wherein each of the at least three test multispectral transmission image channel images are acquired at about the same wavelengths as each of the at least three training multispectral transmission image channel images.
- Additional Embodiment 23 The method of additional embodiment 21, wherein (i) at least four test multispectral transmission image channel images are acquired; and (ii) at least four training multispectral transmission image channel images are acquired; wherein each of the at least four test multispectral transmission image channel images are acquired at about the same wavelengths as each of the at least four training multispectral transmission image channel images.
- Additional Embodiment 24 The method of additional embodiment 23, wherein the test multispectral transmission image is generated by performing a dimensionality reduction on the at least four test multispectral transmission image channel images.
- Additional Embodiment 25 The method of additional embodiment 24, wherein the first training image in each pair of coregistered training images is generated by performing a dimensionality reduction on the at least four training multispectral transmission image channel images.
- Additional Embodiment 26 The method of additional embodiment 21, wherein (i) at least six test multispectral transmission image channel images are acquired; and (ii) at least six training multispectral transmission image channel images are acquired; wherein each of the at least six test multispectral transmission image channel images are acquired at about the same wavelengths as each of the at least six training multispectral transmission image channel images.
- Additional Embodiment 27 The method of additional embodiment 26, wherein the test multispectral transmission image is generated by performing a dimensionality reduction on the at least six test multispectral transmission image channel images.
- Additional Embodiment 28 The method of additional embodiment 26, wherein the first training image in each pair of coregistered training images is generated by performing a dimensionality reduction on the at least six training multispectral transmission image channel images.
- Additional Embodiment 29 The method of additional embodiment 21, wherein (i) at least twelve test multispectral transmission image channel images are acquired; and (ii) at least twelve training multispectral transmission image channel images are acquired; wherein each of the at least twelve test multispectral transmission image channel images are acquired at about the same wavelengths as each of the at least twelve training multispectral transmission image channel images.
- Additional Embodiment 30 The method of additional embodiment 29, wherein the test multispectral transmission image is generated by performing a dimensionality reduction on the at least twelve test multispectral transmission image channel images.
- Additional Embodiment 31 The method of additional embodiment 29, wherein the first training image in each pair of coregistered training images is generated by performing a dimensionality reduction on the at least twelve training multispectral transmission image channel images.
- Additional Embodiment 32 The method of any one of additional embodiments 21 - 31, wherein the wavelengths are selected from about 365 +/- 20 nm, about 400 +/- 20 nm, about 435 +/- 20 nm, about 470 +/- 20 nm, about 500 +/- 20 nm, about 550 +/- 20 nm, about 580 +/- 20 nm, about 635 +/- 20 nm, about 660 +/- 20 nm, about 690 +/- 20 nm, about 780 +/- 20 nm, and/or about 850 +/- 20 nm.
- Additional Embodiment 33 The method of any one of additional embodiments 21 - 31, wherein the wavelengths are selected from about 365 +/- 10 nm, about 400 +/- 10 nm, about 435 +/- 10 nm, about 470 +/- 10 nm, about 500 +/- 10 nm, about 550 +/- 10 nm, about 580 +/- 10 nm, about 635 +/- 10 nm, about 660 +/- 10 nm, about 690 +/- 10 nm, about 780 +/- 10 nm, and/or about 850 +/- 10 nm.
- Additional Embodiment 34 The method of any one of additional embodiments 21 - 31, wherein the wavelengths are selected from about 365 nm, about 400 nm, about 435 nm, about 470 nm, about 500 nm, about 550 nm, about 580 nm, about 635 nm, about 660 nm, about 690 nm, about 780 nm, and/or about 850 nm.
- Additional Embodiment 35 The method of any one of additional embodiments 21 - 31, wherein the obtained virtual staining engine comprises a generative adversarial network.
- Additional Embodiment 36 The method of any one of additional embodiments 21 - 31, wherein the morphological stain is a primary stain.
- Additional Embodiment 37 The method of any one of additional embodiments 21 - 31, wherein the morphological stain is a special stain.
- Additional Embodiment 38 The method of any one of additional embodiments 21 - 31, wherein the morphological stain comprises hematoxylin.
- Additional Embodiment 39 The method of any one of additional embodiments 21 - 31, wherein the morphological stain comprises hematoxylin and eosin.
- Additional Embodiment 40 The method of any one of additional embodiments 21 - 39, wherein the training brightfield data is acquired using a multispectral image acquisition device.
- Additional Embodiment 41 The method of additional embodiment 40, wherein multispectral image data is acquired at about wavelengths 700nm, 550nm, and 470nm; and wherein the multispectral image data acquired is converted to an RGB image.
- Additional Embodiment 42 A system for generating a virtually stained image of a test unstained biological specimen disposed on a substrate comprising: (i) one or more processors, and (ii) one or more memories coupled to the one or more processors, the one or more memories to store computer-executable instructions that, when executed by the one or more processors, cause the system to perform operations comprising: a. obtaining a test multispectral transmission image of the test unstained biological specimen, wherein the test multispectral transmission image is derived from one or more test multispectral transmission image channel images; b. supplying the obtained test multispectral transmission image of the test unstained biological specimen to a virtual staining engine trained to generate an image of an unstained biological specimen stained with a morphological stain; and c. with the virtual staining engine, generating the virtually stained image of the test unstained biological specimen stained with the morphological stain.
- Additional Embodiment 43 The system of additional embodiment 42, wherein the virtual staining engine is trained using (a) one or more training multispectral transmission image channel images of an unstained training biological specimen; and (b) training brightfield image data of the same unstained training biological specimen stained with a morphological stain.
- Additional Embodiment 44 The system of additional embodiment 43, wherein the training brightfield data is acquired using a multispectral image acquisition device.
- Additional Embodiment 45 The system of additional embodiment 42, wherein each of the one or more test multispectral transmission image channel images are acquired at about the same wavelengths as each of the one or more training multispectral transmission image channel images.
- Additional Embodiment 46 The system of any one of additional embodiments 42 - 43, wherein the wavelengths are selected from about 365 +/- 20 nm, about 400 +/- 20 nm, about 435 +/- 20 nm, about 470 +/- 20 nm, about 500 +/- 20 nm, about 550 +/- 20 nm, about 580 +/- 20 nm, about 635 +/- 20 nm, about 660 +/- 20 nm, about 690 +/- 20 nm, about 780 +/- 20 nm, and/or about 850 +/- 20 nm.
- Additional Embodiment 47 The system of any one of additional embodiments 42 - 43, wherein the wavelengths are selected from about 365 +/- 10 nm, about 400 +/- 10 nm, about 435 +/- 10 nm, about 470 +/- 10 nm, about 500 +/- 10 nm, about 550 +/- 10 nm, about 580 +/- 10 nm, about 635 +/- 10 nm, about 660 +/- 10 nm, about 690 +/- 10 nm, about 780 +/- 10 nm, and/or about 850 +/- 10 nm.
- Additional Embodiment 48 The system of any one of additional embodiments 42 - 43, wherein the wavelengths are selected from about 365 nm, about 400 nm, about 435 nm, about 470 nm, about 500 nm, about 550 nm, about 580 nm, about 635 nm, about 660 nm, about 690 nm, about 780 nm, and/or about 850 nm.
- Additional Embodiment 49 The system of additional embodiment 42, wherein the virtual staining engine is trained from a plurality of pairs of coregistered training images derived from one or more training biological specimens.
- Additional Embodiment 50 The system of additional embodiment 49, wherein a first training image in each pair of the plurality of coregistered training images is derived from one or more training multispectral transmission image channel images; and wherein a second training image in each pair of the plurality of coregistered training images comprises training brightfield image data of the same biological specimen stained with the morphological stain.
- Additional Embodiment 51 The system of additional embodiment 50, wherein each of the one or more test multispectral transmission image channel images are acquired at about the same wavelengths as each of the one or more training multispectral transmission image channel images.
- Additional Embodiment 52 A system for generating two or more virtually stained images of a test unstained biological specimen disposed on a substrate comprising: (i) one or more processors, and (ii) one or more memories coupled to the one or more processors, the one or more memories to store computer-executable instructions that, when executed by the one or more processors, cause the system to perform operations comprising: a. obtaining a test multispectral transmission image of the test unstained biological specimen, wherein the test multispectral transmission image is derived from one or more test multispectral transmission image channel images; b. supplying the obtained test multispectral transmission image of the test unstained biological specimen to two or more different virtual staining engines, wherein each virtual staining engine of the two or more different virtual staining engines is trained to generate an image of an unstained biological specimen stained with a different morphological stain; and c. with the trained virtual staining engines, generating two or more virtually stained images of the test unstained biological specimen, wherein each of the generated two or more virtually stained images are stained with a different morphological stain.
- Additional Embodiment 53 The system of additional embodiment 52, wherein the two or more virtual staining engines are each independently trained using (a) one or more training multispectral transmission image channel images of an unstained training biological specimen; and (b) brightfield training image data of the same unstained training biological specimen stained with a morphological stain.
- Additional Embodiment 54 The system of additional embodiment 52, wherein each of the one or more test multispectral transmission image channel images are acquired at about the same wavelengths as each of the one or more training multispectral transmission image channel images used to train each of the two or more virtual staining engines.
- Additional Embodiment 55 The system of any one of additional embodiments 52 - 54, wherein the wavelengths are selected from about 365 +/- 20 nm, about 400 +/- 20 nm, about 435 +/- 20 nm, about 470 +/- 20 nm, about 500 +/- 20 nm, about 550 +/- 20 nm, about 580 +/- 20 nm, about 635 +/- 20 nm, about 660 +/- 20 nm, about 690 +/- 20 nm, about 780 +/- 20 nm, and/or about 850 +/- 20 nm.
- Additional Embodiment 56 The system of any one of additional embodiments 52 - 54, wherein the wavelengths are selected from about 365 +/- 10 nm, about 400 +/- 10 nm, about 435 +/- 10 nm, about 470 +/- 10 nm, about 500 +/- 10 nm, about 550 +/- 10 nm, about 580 +/- 10 nm, about 635 +/- 10 nm, about 660 +/- 10 nm, about 690 +/- 10 nm, about 780 +/- 10 nm, and/or about 850 +/- 10 nm.
- Additional Embodiment 57 The system of any one of additional embodiments 52 - 54, wherein the wavelengths are selected from about 365 nm, about 400 nm, about 435 nm, about 470 nm, about 500 nm, about 550 nm, about 580 nm, about 635 nm, about 660 nm, about 690 nm, about 780 nm, and/or about 850 nm.
- Additional Embodiment 58 The system of additional embodiment 52, wherein each of the two or more virtual staining engines are trained from a plurality of pairs of coregistered training images derived from one or more training biological specimens.
- Additional Embodiment 59 The system of additional embodiment 58, wherein a first training image in each pair of the plurality of coregistered training images is derived from one or more training multispectral transmission image channel images; and wherein a second training image in each pair of the plurality of coregistered training images comprises training brightfield image data of the same biological specimen stained with the morphological stain.
- Additional Embodiment 60 A system for generating a virtually stained image of a test unstained biological specimen disposed on a substrate comprising: (i) one or more processors, and (ii) one or more memories coupled to the one or more processors, the one or more memories to store computer-executable instructions that, when executed by the one or more processors, cause the system to perform operations comprising: a. obtaining a trained virtual staining engine trained to generate an image of an unstained biological specimen stained with a morphological stain, wherein the trained virtual staining engine is trained from a plurality of pairs of coregistered training images, i. wherein a first training image in each pair of the plurality of coregistered training images is derived from training multispectral transmission image data of an unstained training biological specimen acquired at three or more different wavelengths, ii. wherein a second training image in each pair of the plurality of coregistered training images comprises training brightfield image data of the same training biological specimen stained with a morphological stain; b. obtaining a test multispectral transmission image of the test unstained biological specimen, wherein the obtained test multispectral transmission image is derived from test multispectral transmission image data of the test unstained biological specimen illuminated with at least three different illumination sources; and c. with the obtained trained virtual staining engine, generating the virtually stained image of the test unstained biological specimen stained with the morphological stain based on the obtained test multispectral transmission image.
- Additional Embodiment 61 The system of additional embodiment 60, wherein the morphological stain is a primary stain.
- Additional Embodiment 62 The system of additional embodiment 60, wherein the morphological stain is a special stain.
- Additional Embodiment 63 The system of additional embodiment 60, wherein the morphological stain comprises hematoxylin.
- Additional Embodiment 64 The system of additional embodiment 60, wherein the morphological stain comprises hematoxylin and eosin.
- Additional Embodiment 65 The system of any one of additional embodiments 60 - 64, wherein the training multispectral transmission image data of the unstained training biological specimen is acquired at four or more different wavelengths.
- Additional Embodiment 66 The system of additional embodiment 65, wherein the first training image is generated by reducing a dimensionality of the training multispectral transmission image data at the four or more different wavelengths.
- Additional Embodiment 67 The system of additional embodiment 66, wherein the dimensionality is reduced using principal component analysis.
- Additional Embodiment 68 The system of any one of additional embodiments 60 - 64, wherein the training multispectral transmission image data of the unstained training biological specimen is acquired at six or more different wavelengths.
- Additional Embodiment 69 The system of additional embodiment 68, wherein the first training image is generated by reducing a dimensionality of the training multispectral transmission image data at the six or more different wavelengths.
- Additional Embodiment 70 The system of additional embodiment 69, wherein the dimensionality is reduced using principal component analysis.
- Additional Embodiment 71 The system of any one of additional embodiments 60 - 64, wherein the training multispectral transmission image data of the unstained training biological specimen is acquired at twelve or more different wavelengths.
- Additional Embodiment 72 The system of additional embodiment 71, wherein the first training image is generated by reducing a dimensionality of the training multispectral transmission image data at the twelve or more different wavelengths.
- Additional Embodiment 73 The system of additional embodiment 72, wherein the dimensionality is reduced using principal component analysis.
- Additional Embodiment 74 The system of any one of additional embodiments 60 - 73, wherein the at least three wavelengths are selected from about 365 +/- 20 nm, about 400 +/- 20 nm, about 435 +/- 20 nm, about 470 +/- 20 nm, about 500 +/- 20 nm, about 550 +/- 20 nm, about 580 +/- 20 nm, about 635 +/- 20 nm, about 660 +/- 20 nm, about 690 +/- 20 nm, about 780 +/- 20 nm, and/or about 850 +/- 20 nm.
- Additional Embodiment 75 The system of any one of additional embodiments 60 - 73, wherein the at least three wavelengths are selected from about 365 +/- 10 nm, about 400 +/- 10 nm, about 435 +/- 10 nm, about 470 +/- 10 nm, about 500 +/- 10 nm, about 550 +/- 10 nm, about 580 +/- 10 nm, about 635 +/- 10 nm, about 660 +/- 10 nm, about 690 +/- 10 nm, about 780 +/- 10 nm, and/or about 850 +/- 10 nm.
- Additional Embodiment 76 The system of any one of additional embodiments 60 - 73, wherein the at least three wavelengths are selected from about 365 nm, about 400 nm, about 435 nm, about 470 nm, about 500 nm, about 550 nm, about 580 nm, about 635 nm, about 660 nm, about 690 nm, about 780 nm, and/or about 850 nm.
- Additional Embodiment 77 The system of any one of additional embodiments 60 - 76, wherein the training multispectral transmission image data and the test multispectral image data are acquired using a multispectral image acquisition device.
- Additional Embodiment 78 The system of any one of additional embodiments 60 - 76, wherein the training brightfield image data is acquired using a brightfield image acquisition device.
- Additional Embodiment 79 The system of any one of additional embodiments 60 - 76, wherein the training brightfield image data is acquired using a multispectral image acquisition device, wherein the training brightfield image data is an RGB image.
- Additional Embodiment 80 The system of any one of additional embodiments 60 - 76, wherein the training brightfield image data is acquired using a multispectral image acquisition device, wherein the multispectral image acquisition device is configured to capture image data at about 700 nm +/- 10 nm, about 550 nm +/- 10 nm, and about 470 nm +/- 10 nm.
- Additional Embodiment 81 The system of any one of additional embodiments 60 - 80, wherein the obtained trained virtual staining engine comprises a generative adversarial network.
- Additional Embodiment 82 A method for generating a virtually stained image of a test unstained biological specimen disposed on a substrate comprising: a. obtaining a trained virtual staining engine trained to generate an image of an unstained biological specimen stained with a morphological stain, wherein the trained virtual staining engine is trained from a plurality of pairs of coregistered training images; i. wherein a first training image in each pair of the plurality of coregistered training images is derived from training multispectral transmission image data of an unstained training biological specimen illuminated with at least three different illumination sources; ii. wherein a second training image in each pair of the plurality of coregistered training images comprises training brightfield image data of the same training biological specimen stained with a morphological stain; b. obtaining a test multispectral transmission image of the test unstained biological specimen, wherein the obtained test multispectral transmission image is derived from test multispectral transmission image data of the test unstained biological specimen illuminated with the at least three different illumination sources; and c. with the obtained trained virtual staining engine, generating the virtually stained image of the test unstained biological specimen stained with the morphological stain based on the obtained test multispectral transmission image.
- Additional Embodiment 83 The method of additional embodiment 82, wherein the morphological stain is a primary stain.
- Additional Embodiment 84 The method of additional embodiment 82, wherein the morphological stain is a special stain.
- Additional Embodiment 85 The method of additional embodiment 82, wherein the morphological stain comprises hematoxylin.
- Additional Embodiment 86 The method of additional embodiment 82, wherein the morphological stain comprises hematoxylin and eosin.
- Additional Embodiment 87 The method of additional embodiment 82, wherein the unstained training biological specimen and the test unstained biological specimen are illuminated with at least four different illumination sources.
- Additional Embodiment 88 The method of additional embodiment 82, wherein the unstained training biological specimen and the test unstained biological specimen are illuminated with at least six different illumination sources.
- Additional Embodiment 89 The method of additional embodiment 82, wherein the unstained training biological specimen and the test unstained biological specimen are illuminated with at least twelve different illumination sources.
- Additional Embodiment 90 The method of any one of additional embodiments 82 - 89, wherein the at least three illumination sources are selected from about 365 +/- 20 nm, about 400 +/- 20 nm, about 435 +/- 20 nm, about 470 +/- 20 nm, about 500 +/- 20 nm, about 550 +/- 20 nm, about 580 +/- 20 nm, about 635 +/- 20 nm, about 660 +/- 20 nm, about 690 +/- 20 nm, about 780 +/- 20 nm, and/or about 850 +/- 20 nm.
- Additional Embodiment 91 The method of any one of additional embodiments 82 - 89, wherein the at least three illumination sources are selected from about 365 +/- 10 nm, about 400 +/- 10 nm, about 435 +/- 10 nm, about 470 +/- 10 nm, about 500 +/- 10 nm, about 550 +/- 10 nm, about 580 +/- 10 nm, about 635 +/- 10 nm, about 660 +/- 10 nm, about 690 +/- 10 nm, about 780 +/- 10 nm, and/or about 850 +/- 10 nm.
- Additional Embodiment 92 The method of any one of additional embodiments 82 - 89, wherein the at least three illumination sources are selected from about 365 nm, about 400 nm, about 435 nm, about 470 nm, about 500 nm, about 550 nm, about 580 nm, about 635 nm, about 660 nm, about 690 nm, about 780 nm, and/or about 850 nm.
- Additional Embodiment 93 The method of any one of additional embodiments 82 - 92, wherein the training multispectral transmission image data and the test multispectral image data are acquired using a multispectral image acquisition device.
- Additional Embodiment 94 The method of any one of additional embodiments 82 - 92, wherein the training brightfield image data is acquired using a brightfield image acquisition device.
- Additional Embodiment 95 The method of any one of additional embodiments 82 - 92, wherein the training brightfield image data is acquired using a multispectral image acquisition device.
- Additional Embodiment 96 The method of any one of additional embodiments 82 - 92, wherein the training brightfield image data is acquired using a multispectral image acquisition device, wherein the multispectral image acquisition device is configured to capture image data at about 700 nm +/- 10 nm, about 550 nm +/- 10 nm, and about 470 nm +/- 10 nm.
- Additional Embodiment 97 The method of any one of additional embodiments 82 - 96, wherein the obtained trained virtual staining engine comprises a generative adversarial network.
- Additional Embodiment 98 The method of additional embodiment 82, wherein each pair of the plurality of coregistered training images is derived from a different training biological specimen.
- Additional Embodiment 99 A non-transitory computer-readable medium storing instructions for generating a virtually stained image of a test unstained biological specimen disposed on a substrate, the instructions comprising computer-executable instructions that, when executed by one or more processors of a system, cause the system to perform operations comprising: a. obtaining a test multispectral transmission image of the test unstained biological specimen, wherein the test multispectral transmission image is derived from one or more test multispectral transmission image channel images; b. supplying the obtained test multispectral transmission image of the test unstained biological specimen to a virtual staining engine, wherein the virtual staining engine is trained to generate an image of an unstained biological specimen stained with a morphological stain, and wherein the virtual staining engine is trained from a plurality of pairs of coregistered training images derived from one or more training biological specimens, i. wherein a first training image in each pair of the plurality of coregistered training images is derived from one or more training multispectral transmission image channel images; ii. wherein a second training image in each pair of the plurality of coregistered training images comprises training brightfield image data of the same biological specimen stained with the morphological stain; and c. with the trained virtual staining engine, generating the virtually stained image of the test unstained biological specimen stained with the morphological stain; wherein each of the one or more test multispectral transmission image channel images are acquired at about the same wavelengths as each of the one or more training multispectral transmission image channel images.
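The channel-image derivation and dimensionality reduction recited throughout the embodiments above can be made concrete with a short sketch. This is an illustrative example only, not the claimed implementation: the function name `derive_test_image` is an assumption, and PCA via SVD is just one of the dimensionality-reduction options the embodiments name (principal component analysis).

```python
import numpy as np

def derive_test_image(channel_images, reduce_to=None):
    """Derive a multispectral transmission image from per-wavelength channel images.

    channel_images: list of (H, W) arrays, one per acquisition wavelength.
    reduce_to: None stacks the channels unchanged (no compression or
    dimensionality reduction applied); an integer keeps that many
    principal components (PCA computed via SVD on the pixel-by-channel
    matrix).
    """
    stack = np.stack(channel_images, axis=-1).astype(np.float64)
    if reduce_to is None:
        return stack
    h, w, c = stack.shape
    x = stack.reshape(-1, c)
    x = x - x.mean(axis=0)                    # center each spectral channel
    # Right singular vectors are the principal spectral axes, ordered by
    # decreasing explained variance.
    _, _, vt = np.linalg.svd(x, full_matrices=False)
    return (x @ vt[:reduce_to].T).reshape(h, w, reduce_to)

# Twelve synthetic 8x8 channel images standing in for acquisitions at
# twelve wavelengths (e.g., about 365-850 nm as recited above).
rng = np.random.default_rng(0)
channels = [rng.uniform(size=(8, 8)) for _ in range(12)]
full = derive_test_image(channels)                 # (8, 8, 12), channels untouched
reduced = derive_test_image(channels, reduce_to=3) # (8, 8, 3) PCA image
```

Either output can then serve as the single multichannel input supplied to the virtual staining engine; the `reduce_to=None` path corresponds to the embodiments where no compression or dimensionality reduction is applied.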
Landscapes
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Investigating Or Analysing Biological Materials (AREA)
Abstract
The present disclosure provides systems and methods for the generation of a virtually stained image of an unstained biological specimen based on an acquired image of the unstained biological specimen, where the virtually stained image manifests the appearance of the unstained biological specimen as if it were chemically stained with a morphological stain.
Description
METHODS OF GENERATING DIGITALLY STAINED IMAGES FROM UNSTAINED BIOLOGICAL SAMPLES
FIELD OF THE DISCLOSURE
[0001] The present disclosure relates to microscopy methods and systems that utilize deep neural network learning for virtually morphologically staining one or more images derived from an unstained histological or cytological specimen. Deep learning neural networks are utilized to virtually morphologically stain one or more input images derived from an unstained histological or cytological specimen into one or more output images that are equivalent, such as diagnostically equivalent, to brightfield images of the same samples that are morphologically stained.
BACKGROUND OF THE DISCLOSURE
[0002] Microscopic imaging of tissue samples is a fundamental tool used for the diagnosis of various diseases. Histopathologists use chemical staining techniques on tissue samples to highlight microscopic structure and composition looking for abnormalities that indicate the presence, nature, and extent of disease. Microscopic imaging of a tissue sample is, however, a time-consuming and expensive process. For instance, the process of preparing a stained tissue sample includes fixing the tissue sample with formalin, embedding the formalin-fixed tissue specimen to provide a formalin-fixed paraffin-embedded (FFPE) tissue sample, sectioning the FFPE tissue sample into thin slices, staining the tissue slices, and mounting the stained slices onto a glass slide, which is then followed by its microscopic imaging.
[0003] Moreover, tissue processing and staining is destructive. Indeed, the aforementioned steps of preparing a stained tissue sample uses multiple reagents and introduces irreversible effects onto the tissue sample. Additionally, different assays often require multiple tissue sections, which can quickly deplete valuable biopsy samples and increase the likelihood of needing a repeat biopsy from patients, further increasing cost, patient pain, and valuable time. Further, each assay requires highly trained and scarce histotechnicians, produces chemical waste, and requires years of costly physical storage for the resulting glass slides.
BRIEF SUMMARY OF THE DISCLOSURE
[0004] Systems and methods are desired that generate virtually stained images of unstained biological specimens, including histological and cytological specimens. It would be desirable to
have systems and methods that decrease the amount of time required to produce useful stained images of biological specimens. It would also be desirable to have systems and methods that facilitate the generation of multiple virtually stained images derived from a single acquired image of an unstained biological specimen, thereby mitigating the need for multiple tissue sections. To meet these needs, Applicant has developed systems and methods which facilitate the generation of one or more virtually stained images derived from an image of an unstained biological specimen, where the virtually stained image is equivalent, such as diagnostically equivalent, to a corresponding brightfield image of the same biological specimen that has been chemically stained. These and other embodiments are described herein.
[0005] A first aspect of the present disclosure is a system for generating a virtually stained image of a test unstained biological specimen disposed on a substrate, the system comprising: (i) one or more processors, and (ii) one or more memories coupled to the one or more processors, the one or more memories to store computer-executable instructions that, when executed by the one or more processors, cause the system to perform operations comprising: (a) obtaining a test multispectral transmission image of the test unstained biological specimen, wherein the test multispectral transmission image is derived from one or more test multispectral transmission image channel images; (b) supplying the obtained multispectral test transmission image of the test unstained biological specimen to a virtual staining engine, wherein the virtual staining engine is trained to generate an image of an unstained biological specimen stained with a stain, and wherein the virtual staining engine is trained from a plurality of pairs of coregistered training images derived from one or more training biological specimens, (i) wherein a first training image in each pair of the plurality of coregistered training images is derived from one or more training multispectral transmission image channel images; (ii) wherein a second training image in each pair of the plurality of coregistered training images comprises training brightfield image data of the same biological specimen stained with the stain; and (c) with the trained virtual staining engine, generating the virtually stained image of the test unstained biological specimen stained with the stain; wherein each of the one or more test multispectral transmission image channel images are acquired at about the same wavelengths as each of the one or more training multispectral transmission image channel images. In some embodiments, the training brightfield image data is acquired using a brightfield image acquisition device. 
In some embodiments, the training
brightfield image data is acquired using a multispectral image acquisition device. In some embodiments, the virtual staining engine is not trained using fluorescent images.
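The training-pair arrangement described above couples an unstained multispectral acquisition with a coregistered brightfield image of the same, subsequently stained, field. A minimal sketch of such a pair as a data structure follows; the class name, array shapes, and channel count are illustrative assumptions, not part of the disclosure:

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class TrainingPair:
    """One coregistered training pair (names and shapes are illustrative)."""
    channels: np.ndarray     # (H, W, C) multispectral stack of the unstained specimen
    brightfield: np.ndarray  # (H, W, 3) brightfield RGB of the same field after staining

    def __post_init__(self) -> None:
        # Coregistration implies the two images share spatial dimensions.
        if self.channels.shape[:2] != self.brightfield.shape[:2]:
            raise ValueError("training pair is not coregistered")


# A 12-channel acquisition paired with its stained brightfield counterpart.
pair = TrainingPair(
    channels=np.zeros((64, 64, 12)),
    brightfield=np.zeros((64, 64, 3)),
)
```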
[0006] In some embodiments, the test multispectral transmission image of the test unstained biological specimen is derived from two or more test multispectral transmission image channel images, and wherein no compression technique or dimensionality reduction technique is applied to the two or more test multispectral transmission image channel images to derive the test multispectral transmission image of the test unstained biological specimen. In some embodiments, the test multispectral transmission image of the test unstained biological specimen is derived from two or more test multispectral transmission image channel images, and wherein a compression technique or a dimensionality reduction technique (e.g., principal component analysis) is applied to the two or more test multispectral transmission image channel images to derive the test multispectral transmission image of the test unstained biological specimen.
[0007] In some embodiments, the test multispectral transmission image of the test unstained biological specimen is derived from three or more test multispectral transmission image channel images, and wherein no compression technique or dimensionality reduction technique is applied to the three or more test multispectral transmission image channel images to derive the test multispectral transmission image of the test unstained biological specimen. In some embodiments, the test multispectral transmission image of the test unstained biological specimen is derived from three or more test multispectral transmission image channel images, and wherein a compression technique or a dimensionality reduction technique (e.g., principal component analysis) is applied to the three or more test multispectral transmission image channel images to derive the test multispectral transmission image of the test unstained biological specimen.
[0008] In some embodiments, the test multispectral transmission image of the test unstained biological specimen is derived from four or more test multispectral transmission image channel images, and wherein no compression technique or dimensionality reduction technique is applied to the four or more test multispectral transmission image channel images to derive the test multispectral transmission image of the test unstained biological specimen. In some embodiments, the test multispectral transmission image of the test unstained biological specimen is derived from four or more test multispectral transmission image channel images, and wherein a compression technique or a dimensionality reduction technique (e.g., principal component analysis) is applied
to the four or more test multispectral transmission image channel images to derive the test multispectral transmission image of the test unstained biological specimen.
[0009] In some embodiments, the test multispectral transmission image of the test unstained biological specimen is derived from twelve or more test multispectral transmission image channel images, and wherein no compression technique or dimensionality reduction technique is applied to the twelve or more test multispectral transmission image channel images to derive the test multispectral transmission image of the test unstained biological specimen. In some embodiments, the test multispectral transmission image of the test unstained biological specimen is derived from twelve or more test multispectral transmission image channel images, and wherein a compression technique or a dimensionality reduction technique (e.g., principal component analysis) is applied to the twelve or more test multispectral transmission image channel images to derive the test multispectral transmission image of the test unstained biological specimen.
[0010] In some embodiments, the first training image is derived from two or more training multispectral transmission image channel images, and wherein no compression technique or dimensionality reduction technique is applied to the two or more training multispectral transmission image channel images to derive the first training image. In some embodiments, the first training image is derived from two or more training multispectral transmission image channel images, and wherein a compression technique or a dimensionality reduction technique is applied to the two or more training multispectral transmission image channel images to derive the first training image.
[0011] In some embodiments, the first training image is derived from three or more training multispectral transmission image channel images, and wherein no compression technique or dimensionality reduction technique is applied to the three or more training multispectral transmission image channel images to derive the first training image. In some embodiments, the first training image is derived from three or more training multispectral transmission image channel images, and wherein a compression technique or a dimensionality reduction technique is applied to the three or more training multispectral transmission image channel images to derive the first training image.
[0012] In some embodiments, the first training image is derived from four or more training multispectral transmission image channel images, and wherein no compression technique or
dimensionality reduction technique is applied to the four or more training multispectral transmission image channel images to derive the first training image. In some embodiments, the first training image is derived from four or more training multispectral transmission image channel images, and wherein a compression technique or a dimensionality reduction technique is applied to the four or more training multispectral transmission image channel images to derive the first training image.
[0013] In some embodiments, the first training image is derived from twelve or more training multispectral transmission image channel images, and wherein no compression technique or dimensionality reduction technique is applied to the twelve or more training multispectral transmission image channel images to derive the first training image. In some embodiments, the first training image is derived from twelve or more training multispectral transmission image channel images, and wherein a compression technique or a dimensionality reduction technique is applied to the twelve or more training multispectral transmission image channel images to derive the first training image.
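The dimensionality reduction mentioned in the preceding embodiments (e.g., principal component analysis) can be sketched generically: a C-channel stack is projected onto its top principal components to yield a reduced pseudo-channel image. The array shapes and channel counts below are illustrative, and the disclosure does not prescribe this particular implementation:

```python
import numpy as np


def reduce_channels_pca(stack: np.ndarray, n_components: int = 3) -> np.ndarray:
    """Reduce an (H, W, C) multispectral channel stack to n_components
    pseudo-channels via principal component analysis over the spectra.

    Generic PCA sketch only; the disclosure does not specify this exact
    reduction.
    """
    h, w, c = stack.shape
    pixels = stack.reshape(-1, c).astype(np.float64)
    pixels -= pixels.mean(axis=0)                 # centre each channel
    # Eigen-decompose the C x C channel covariance matrix.
    cov = pixels.T @ pixels / (pixels.shape[0] - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)        # ascending eigenvalues
    top = np.argsort(eigvals)[::-1][:n_components]
    projected = pixels @ eigvecs[:, top]          # project onto top components
    return projected.reshape(h, w, n_components)


# A synthetic 12-channel acquisition reduced to a 3-channel input image.
rng = np.random.default_rng(0)
stack = rng.random((8, 8, 12))
reduced = reduce_channels_pca(stack, n_components=3)
```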
[0014] In some embodiments, (i) at least three test multispectral transmission image channel images are acquired; and (ii) at least three training multispectral transmission image channel images are acquired; and wherein each of the at least three test multispectral transmission image channel images are acquired at about the same wavelengths as each of the at least three training multispectral transmission image channel images.
[0015] In some embodiments, (i) at least four test multispectral transmission image channel images are acquired; and (ii) at least four training multispectral transmission image channel images are acquired; and wherein each of the at least four test multispectral transmission image channel images are acquired at about the same wavelengths as each of the at least four training multispectral transmission image channel images. In some embodiments, the test multispectral transmission image is generated by performing a dimensionality reduction (e.g., principal component analysis) on the at least four test multispectral transmission image channel images. In some embodiments, the first training image in each pair of coregistered training images is generated by performing a dimensionality reduction on the at least four training multispectral transmission image channel images.
[0016] In some embodiments, (i) at least six test multispectral transmission image channel images are acquired; and (ii) at least six training multispectral transmission image channel images
are acquired; and wherein each of the at least six test multispectral transmission image channel images are acquired at about the same wavelengths as each of the at least six training multispectral transmission image channel images. In some embodiments, the test multispectral transmission image is generated by performing a dimensionality reduction on the at least six test multispectral transmission image channel images. In some embodiments, the first training image in each pair of coregistered training images is generated by performing a dimensionality reduction on the at least six training multispectral transmission image channel images.
[0017] In some embodiments, (i) at least twelve test multispectral transmission image channel images are acquired; (ii) at least twelve training multispectral transmission image channel images are acquired; and wherein each of the at least twelve test multispectral transmission image channel images are acquired at about the same wavelengths as each of the at least twelve training multispectral transmission image channel images. In some embodiments, the test multispectral transmission image is generated by performing a dimensionality reduction on the at least twelve test multispectral transmission image channel images. In some embodiments, the first training image in each pair of coregistered training images is generated by performing a dimensionality reduction on the at least twelve training multispectral transmission image channel images.
[0018] In some embodiments, the wavelengths are selected from about 365 +/- 20 nm, about 400 +/- 20 nm, about 435 +/- 20 nm, about 470 +/- 20 nm, about 500 +/- 20 nm, about 550 +/- 20 nm, about 580 +/- 20 nm, about 635 +/- 20 nm, about 660 +/- 20 nm, about 690 +/- 20 nm, about 780 +/- 20 nm, and/or about 850 +/- 20 nm. In some embodiments, the wavelengths are selected from about 365 +/- 10 nm, about 400 +/- 10 nm, about 435 +/- 10 nm, about 470 +/- 10 nm, about 500 +/- 10 nm, about 550 +/- 10 nm, about 580 +/- 10 nm, about 635 +/- 10 nm, about 660 +/- 10 nm, about 690 +/- 10 nm, about 780 +/- 10 nm, and/or about 850 +/- 10 nm. In some embodiments, the wavelengths are selected from about 365 nm, about 400 nm, about 435 nm, about 470 nm, about 500 nm, about 550 nm, about 580 nm, about 635 nm, about 660 nm, about 690 nm, about 780 nm, and/or about 850 nm.
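The requirement that test channels be acquired at about the same wavelengths as the training channels can be illustrated with a hypothetical tolerance check; the helper name, wavelength list, and +/- 20 nm tolerance below are illustrative values drawn from the ranges above, not a prescribed procedure:

```python
# Hypothetical helper: verify that each test channel's centre wavelength is
# within a tolerance of some training channel, per the "about the same
# wavelengths" requirement. Values are illustrative.
TRAINING_WAVELENGTHS_NM = [365, 400, 435, 470, 500, 550, 580, 635, 660, 690, 780, 850]


def channels_match(test_wavelengths_nm, training_wavelengths_nm, tol_nm=20):
    """True if every test channel lies within tol_nm of a training channel."""
    return all(
        any(abs(t - ref) <= tol_nm for ref in training_wavelengths_nm)
        for t in test_wavelengths_nm
    )


ok = channels_match([368, 402, 433], TRAINING_WAVELENGTHS_NM)   # all within 20 nm
bad = channels_match([368, 402, 720], TRAINING_WAVELENGTHS_NM)  # 720 nm is unmatched
```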
[0019] In some embodiments, the obtained virtual staining engine comprises a generative adversarial network (GAN) (e.g., a 3-channel GAN, a 12-channel GAN, etc.).
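As background for the generative adversarial network mentioned above, a conditional GAN for image-to-image translation (of the pix2pix type) pits a generator G, mapping the multispectral input x to a stained rendering, against a discriminator D conditioned on x. The following standard objective, often combined with an L1 pixel term, is illustrative background only and is not recited by the disclosure:

```latex
\mathcal{L}_{\mathrm{cGAN}}(G, D) =
  \mathbb{E}_{x,y}\!\left[\log D(x, y)\right]
  + \mathbb{E}_{x}\!\left[\log\left(1 - D\big(x, G(x)\big)\right)\right]

G^{*} = \arg\min_{G}\max_{D}\;
  \mathcal{L}_{\mathrm{cGAN}}(G, D)
  + \lambda\,\mathbb{E}_{x,y}\!\left[\lVert y - G(x) \rVert_{1}\right]
```

Here x is the unstained multispectral transmission image, y is the coregistered brightfield image of the chemically stained specimen, and the noise input of the original formulation is omitted for brevity.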
[0020] In some embodiments, the stain is a primary stain. In some embodiments, the stain is a special stain. In some embodiments, the stain comprises hematoxylin. In some embodiments, the stain comprises hematoxylin and eosin.
[0021] A second aspect of the present disclosure is a method of generating a virtually stained image of a test unstained biological specimen disposed on a substrate, the method comprising: (a) obtaining a test multispectral transmission image of the test unstained biological specimen, wherein the test multispectral transmission image is derived from one or more test multispectral transmission image channel images; (b) supplying the obtained test multispectral transmission image of the test unstained biological specimen to a virtual staining engine, wherein the virtual staining engine is trained to generate an image of an unstained biological specimen stained with a morphological stain, and wherein the virtual staining engine is trained from a plurality of pairs of coregistered training images derived from one or more training biological specimens, (i) wherein a first training image in each pair of the plurality of coregistered training images is derived from one or more training multispectral transmission image channel images; (ii) wherein a second training image in each pair of the plurality of coregistered training images comprises training brightfield image data of the same biological specimen stained with the morphological stain; and (c) with the trained virtual staining engine, generating the virtually stained image of the test unstained biological specimen stained with the morphological stain; wherein each of the one or more test multispectral transmission image channel images are acquired at about the same wavelengths as each of the one or more training multispectral transmission image channel images. In some embodiments, the training brightfield image data is acquired using a brightfield image acquisition device. In some embodiments, the training brightfield image data is acquired using a multispectral image acquisition device.
[0022] In some embodiments, the test multispectral transmission image of the test unstained biological specimen is derived from two or more test multispectral transmission image channel images, and wherein no compression technique or dimensionality reduction technique is applied to the two or more test multispectral transmission image channel images to derive the test multispectral transmission image of the test unstained biological specimen. In some embodiments, the test multispectral transmission image of the test unstained biological specimen is derived from two or more test multispectral transmission image channel images, and wherein a compression technique or a dimensionality reduction technique (e.g., principal component analysis) is applied to the two or more test multispectral transmission image channel images to derive the test multispectral transmission image of the test unstained biological specimen.
[0023] In some embodiments, the test multispectral transmission image of the test unstained biological specimen is derived from three or more test multispectral transmission image channel images, and wherein no compression technique or dimensionality reduction technique is applied to the three or more test multispectral transmission image channel images to derive the test multispectral transmission image of the test unstained biological specimen. In some embodiments, the test multispectral transmission image of the test unstained biological specimen is derived from three or more test multispectral transmission image channel images, and wherein a compression technique or a dimensionality reduction technique (e.g., principal component analysis) is applied to the three or more test multispectral transmission image channel images to derive the test multispectral transmission image of the test unstained biological specimen.
[0024] In some embodiments, the test multispectral transmission image of the test unstained biological specimen is derived from four or more test multispectral transmission image channel images, and wherein no compression technique or dimensionality reduction technique is applied to the four or more test multispectral transmission image channel images to derive the test multispectral transmission image of the test unstained biological specimen. In some embodiments, the test multispectral transmission image of the test unstained biological specimen is derived from four or more test multispectral transmission image channel images, and wherein a compression technique or a dimensionality reduction technique (e.g., principal component analysis) is applied to the four or more test multispectral transmission image channel images to derive the test multispectral transmission image of the test unstained biological specimen.
[0025] In some embodiments, the test multispectral transmission image of the test unstained biological specimen is derived from twelve or more test multispectral transmission image channel images, and wherein no compression technique or dimensionality reduction technique is applied to the twelve or more test multispectral transmission image channel images to derive the test multispectral transmission image of the test unstained biological specimen. In some embodiments, the test multispectral transmission image of the test unstained biological specimen is derived from twelve or more test multispectral transmission image channel images, and wherein a compression technique or a dimensionality reduction technique (e.g., principal component analysis) is applied to the twelve or more test multispectral transmission image channel images to derive the test multispectral transmission image of the test unstained biological specimen.
[0026] In some embodiments, the first training image is derived from two or more training multispectral transmission image channel images, and wherein no compression technique or dimensionality reduction technique is applied to the two or more training multispectral transmission image channel images to derive the first training image. In some embodiments, the first training image is derived from two or more training multispectral transmission image channel images, and wherein a compression technique or a dimensionality reduction technique is applied to the two or more training multispectral transmission image channel images to derive the first training image.
[0027] In some embodiments, the first training image is derived from three or more training multispectral transmission image channel images, and wherein no compression technique or dimensionality reduction technique is applied to the three or more training multispectral transmission image channel images to derive the first training image. In some embodiments, the first training image is derived from three or more training multispectral transmission image channel images, and wherein a compression technique or a dimensionality reduction technique is applied to the three or more training multispectral transmission image channel images to derive the first training image.
[0028] In some embodiments, the first training image is derived from four or more training multispectral transmission image channel images, and wherein no compression technique or dimensionality reduction technique is applied to the four or more training multispectral transmission image channel images to derive the first training image. In some embodiments, the first training image is derived from four or more training multispectral transmission image channel images, and wherein a compression technique or a dimensionality reduction technique is applied to the four or more training multispectral transmission image channel images to derive the first training image.
[0029] In some embodiments, the first training image is derived from twelve or more training multispectral transmission image channel images, and wherein no compression technique or dimensionality reduction technique is applied to the twelve or more training multispectral transmission image channel images to derive the first training image. In some embodiments, the first training image is derived from twelve or more training multispectral transmission image channel images, and wherein a compression technique or a dimensionality reduction technique is
applied to the twelve or more training multispectral transmission image channel images to derive the first training image.
[0030] In some embodiments, (i) at least three test multispectral transmission image channel images are acquired; and (ii) at least three training multispectral transmission image channel images are acquired; wherein each of the at least three test multispectral transmission image channel images are acquired at about the same wavelengths as each of the at least three training multispectral transmission image channel images.
[0031] In some embodiments, (i) at least four test multispectral transmission image channel images are acquired; (ii) at least four training multispectral transmission image channel images are acquired; and wherein each of the at least four test multispectral transmission image channel images are acquired at about the same wavelengths as each of the at least four training multispectral transmission image channel images. In some embodiments, the test multispectral transmission image is generated by performing a dimensionality reduction on the at least four test multispectral transmission image channel images. In some embodiments, the first training image in each pair of coregistered training images is generated by performing a dimensionality reduction on the at least four training multispectral transmission image channel images.
[0032] In some embodiments, (i) at least six test multispectral transmission image channel images are acquired; (ii) at least six training multispectral transmission image channel images are acquired; and wherein each of the at least six test multispectral transmission image channel images are acquired at about the same wavelengths as each of the at least six training multispectral transmission image channel images. In some embodiments, the test multispectral transmission image is generated by performing a dimensionality reduction on the at least six test multispectral transmission image channel images. In some embodiments, the first training image in each pair of coregistered training images is generated by performing a dimensionality reduction on the at least six training multispectral transmission image channel images.
[0033] In some embodiments, (i) at least twelve test multispectral transmission image channel images are acquired; (ii) at least twelve training multispectral transmission image channel images are acquired; and wherein each of the at least twelve test multispectral transmission image channel images are acquired at about the same wavelengths as each of the at least twelve training multispectral transmission image channel images. In some embodiments, the test multispectral transmission image is generated by performing a dimensionality reduction on the at least twelve
test multispectral transmission image channel images. In some embodiments, the first training image in each pair of coregistered training images is generated by performing a dimensionality reduction on the at least twelve training multispectral transmission image channel images.
[0034] In some embodiments, the wavelengths are selected from about 365 +/- 20 nm, about 400 +/- 20 nm, about 435 +/- 20 nm, about 470 +/- 20 nm, about 500 +/- 20 nm, about 550 +/- 20 nm, about 580 +/- 20 nm, about 635 +/- 20 nm, about 660 +/- 20 nm, about 690 +/- 20 nm, about 780 +/- 20 nm, and/or about 850 +/- 20 nm. In some embodiments, the wavelengths are selected from about 365 +/- 10 nm, about 400 +/- 10 nm, about 435 +/- 10 nm, about 470 +/- 10 nm, about 500 +/- 10 nm, about 550 +/- 10 nm, about 580 +/- 10 nm, about 635 +/- 10 nm, about 660 +/- 10 nm, about 690 +/- 10 nm, about 780 +/- 10 nm, and/or about 850 +/- 10 nm. In some embodiments, the wavelengths are selected from about 365 nm, about 400 nm, about 435 nm, about 470 nm, about 500 nm, about 550 nm, about 580 nm, about 635 nm, about 660 nm, about 690 nm, about 780 nm, and/or about 850 nm.
[0035] In some embodiments, the obtained virtual staining engine comprises a generative adversarial network.
[0036] In some embodiments, the morphological stain is a primary stain. In some embodiments, the morphological stain is a special stain. In some embodiments, the morphological stain comprises hematoxylin. In some embodiments, the morphological stain comprises hematoxylin and eosin.
[0037] A third aspect of the present disclosure is a system for generating a virtually stained image of a test unstained biological specimen disposed on a substrate, the system comprising: (i) one or more processors, and (ii) one or more memories coupled to the one or more processors, the one or more memories to store computer-executable instructions that, when executed by the one or more processors, cause the system to perform operations comprising: (a) obtaining a test multispectral transmission image of the test unstained biological specimen, wherein the test multispectral transmission image is derived from one or more test multispectral transmission image channel images; (b) supplying the obtained test multispectral transmission image of the test unstained biological specimen to a virtual staining engine, wherein the virtual staining engine is trained to generate an image of an unstained biological specimen stained with a morphological stain; and (c) with the trained virtual staining engine, generating the virtually stained image of the test unstained biological specimen stained with the morphological stain.
[0038] In some embodiments, the test multispectral transmission image of the test unstained biological specimen is derived from two or more test multispectral transmission image channel images, and wherein no compression technique or dimensionality reduction technique is applied to the two or more test multispectral transmission image channel images to derive the test multispectral transmission image of the test unstained biological specimen. In some embodiments, the test multispectral transmission image of the test unstained biological specimen is derived from two or more test multispectral transmission image channel images, and wherein a compression technique or a dimensionality reduction technique (e.g., principal component analysis) is applied to the two or more test multispectral transmission image channel images to derive the test multispectral transmission image of the test unstained biological specimen.
[0039] In some embodiments, the test multispectral transmission image of the test unstained biological specimen is derived from three or more test multispectral transmission image channel images, and wherein no compression technique or dimensionality reduction technique is applied to the three or more test multispectral transmission image channel images to derive the test multispectral transmission image of the test unstained biological specimen. In some embodiments, the test multispectral transmission image of the test unstained biological specimen is derived from three or more test multispectral transmission image channel images, and wherein a compression technique or a dimensionality reduction technique (e.g., principal component analysis) is applied to the three or more test multispectral transmission image channel images to derive the test multispectral transmission image of the test unstained biological specimen.
[0040] In some embodiments, the test multispectral transmission image of the test unstained biological specimen is derived from four or more test multispectral transmission image channel images, and wherein no compression technique or dimensionality reduction technique is applied to the four or more test multispectral transmission image channel images to derive the test multispectral transmission image of the test unstained biological specimen. In some embodiments, the test multispectral transmission image of the test unstained biological specimen is derived from four or more test multispectral transmission image channel images, and wherein a compression technique or a dimensionality reduction technique (e.g., principal component analysis) is applied to the four or more test multispectral transmission image channel images to derive the test multispectral transmission image of the test unstained biological specimen.
[0041] In some embodiments, the test multispectral transmission image of the test unstained biological specimen is derived from twelve or more test multispectral transmission image channel images, and wherein no compression technique or dimensionality reduction technique is applied to the twelve or more test multispectral transmission image channel images to derive the test multispectral transmission image of the test unstained biological specimen. In some embodiments, the test multispectral transmission image of the test unstained biological specimen is derived from twelve or more test multispectral transmission image channel images, and wherein a compression technique or a dimensionality reduction technique (e.g., principal component analysis) is applied to the twelve or more test multispectral transmission image channel images to derive the test multispectral transmission image of the test unstained biological specimen.
[0042] In some embodiments, the virtual staining engine is trained using (a) one or more training multispectral transmission image channel images of an unstained training biological specimen; and (b) brightfield training image data of the same unstained training biological specimen stained with a morphological stain. In some embodiments, each of the one or more test multispectral transmission image channel images are acquired at about the same wavelengths as each of the one or more training multispectral transmission image channel images. In some embodiments, the training brightfield image data is acquired using a brightfield image acquisition device. In some embodiments, the training brightfield image data is acquired using a multispectral image acquisition device.
[0043] In some embodiments, the virtual staining engine is trained from a plurality of pairs of coregistered training images derived from one or more training biological specimens. In some embodiments, a first training image in each pair of the plurality of coregistered training images is derived from one or more training multispectral transmission image channel images; and wherein a second training image in each pair of the plurality of coregistered training images comprises training brightfield image data of the same biological specimen stained with the morphological stain. In some embodiments, each of the one or more test multispectral transmission image channel images are acquired at about the same wavelengths as each of the one or more training multispectral transmission image channel images.
[0044] In some embodiments, the wavelengths are selected from about 365 +/- 20 nm, about 400 +/- 20 nm, about 435 +/- 20 nm, about 470 +/- 20 nm, about 500 +/- 20 nm, about 550 +/- 20 nm, about 580 +/- 20 nm, about 635 +/- 20 nm, about 660 +/- 20 nm, about 690 +/- 20 nm, about 780 +/- 20 nm, and/or about 850 +/- 20 nm. In some embodiments, the wavelengths are selected from about 365 +/- 10 nm, about 400 +/- 10 nm, about 435 +/- 10 nm, about 470 +/- 10 nm, about 500 +/- 10 nm, about 550 +/- 10 nm, about 580 +/- 10 nm, about 635 +/- 10 nm, about 660 +/- 10 nm, about 690 +/- 10 nm, about 780 +/- 10 nm, and/or about 850 +/- 10 nm. In some embodiments, the wavelengths are selected from about 365 nm, about 400 nm, about 435 nm, about 470 nm, about 500 nm, about 550 nm, about 580 nm, about 635 nm, about 660 nm, about 690 nm, about 780 nm, and/or about 850 nm.
[0045] A fourth aspect of the present disclosure is a method of generating a virtually stained image of a test unstained biological specimen disposed on a substrate, the method comprising: (a) obtaining a test multispectral transmission image of the test unstained biological specimen, wherein the test multispectral transmission image is derived from one or more test multispectral transmission image channel images; (b) supplying the obtained test multispectral transmission image of the test unstained biological specimen to a virtual staining engine, wherein the virtual staining engine is trained to generate an image of an unstained biological specimen stained with a morphological stain; and (c) with the trained virtual staining engine, generating the virtually stained image of the test unstained biological specimen stained with the morphological stain.
[0046] In some embodiments, the virtual staining engine is trained using (a) one or more training multispectral transmission image channel images of an unstained training biological specimen; and (b) brightfield training image data of the same unstained training biological specimen stained with a morphological stain. In some embodiments, each of the one or more test multispectral transmission image channel images are acquired at about the same wavelengths as each of the one or more training multispectral transmission image channel images. In some embodiments, the training brightfield image data is acquired using a brightfield image acquisition device. In some embodiments, the training brightfield image data is acquired using a multispectral image acquisition device.
[0047] A fifth aspect of the present disclosure is a system for generating two or more virtually stained images of a test unstained biological specimen disposed on a substrate, the system comprising: (i) one or more processors, and (ii) one or more memories coupled to the one or more processors, the one or more memories to store computer-executable instructions that, when
executed by the one or more processors, cause the system to perform operations comprising: (a) obtaining a test multispectral transmission image of the test unstained biological specimen, wherein the test multispectral transmission image is derived from one or more test multispectral transmission image channel images; (b) supplying the obtained test multispectral transmission image of the test unstained biological specimen to two or more different virtual staining engines, wherein each virtual staining engine of the two or more different virtual staining engines is trained to generate an image of an unstained biological specimen stained with a different morphological stain; and (c) with the two or more trained virtual staining engines, generating two or more virtually stained images of the test unstained biological specimen, wherein each of the generated two or more virtually stained images is stained with a different stain. In some embodiments, one of the virtually stained images is virtually stained with hematoxylin and/or eosin, while another one of the virtually stained images is virtually stained with a special stain (e.g., Masson's Trichrome).
[0048] In some embodiments, the two or more virtual staining engines are each independently trained using different sets of training data, such as where each training data set includes images stained with a specific morphological stain. In some embodiments, the two or more virtual staining engines are each independently trained using (a) one or more training multispectral transmission image channel images of an unstained training biological specimen; and (b) brightfield training image data of the same unstained training biological specimen stained with a morphological stain. In some embodiments, each of the one or more test multispectral transmission image channel images are acquired at about the same wavelengths as each of the one or more training multispectral transmission image channel images used to train each of the two or more virtual staining engines. In some embodiments, the training brightfield image data is acquired using a brightfield image acquisition device. In some embodiments, the training brightfield image data is acquired using a multispectral image acquisition device.
[0049] In some embodiments, the wavelengths are selected from 365 +/- 20 nm, about 400 +/- 20 nm, about 435 +/- 20 nm, about 470 +/- 20 nm, about 500 +/- 20 nm, about 550+/- 20 nm, about 580 +/- 20 nm, about 635 +/- 20 nm, about 660 +/- 20 nm, about 690 +/- 20 nm, about 780 +/- 20 nm, and/or about 850 +/- 20 nm. In some embodiments, the wavelengths are selected from 365 +/- 10 nm, about 400 +/- 10 nm, about 435 +/- 10 nm, about 470 +/- 10 nm, about 500 +/- 10 nm, about 550+/- 10 nm, about 580 +/- 10 nm, about 635 +/- 10 nm, about 660 +/- 10 nm, about 690 +/- 10 nm, about 780 +/- 10 nm, and/or about 850 +/- 10 nm. In some embodiments, the
wavelengths are selected from 365 nm, about 400 nm, about 435 nm, about 470 nm, about 500 nm, about 550 nm, about 580 nm, about 635 nm, about 660 nm, about 690 nm, about 780 nm, and/or about 850 nm.
[0050] In some embodiments, each of the two or more virtual staining engines are trained from a plurality of pairs of coregistered training images derived from one or more training biological specimens. In some embodiments, a first training image in each pair of the plurality of coregistered training images is derived from one or more training multispectral transmission image channel images; and wherein a second training image in each pair of the plurality of coregistered training images comprises training brightfield image data of the same biological specimen stained with the morphological stain. In some embodiments, the training brightfield image data is acquired using a brightfield image acquisition device. In some embodiments, the training brightfield image data is acquired using a multispectral image acquisition device.
[0051] A sixth aspect of the present disclosure is a method of generating two or more virtually stained images of a test unstained biological specimen disposed on a substrate, the method comprising (a) obtaining a test multispectral transmission image of the test unstained biological specimen, wherein the test multispectral transmission image is derived from one or more test multispectral transmission image channel images; (b) supplying the obtained test multispectral transmission image of the test unstained biological specimen to two or more different virtual staining engines, wherein each virtual staining engine of the two or more different virtual staining engines is trained to generate an image of an unstained biological specimen stained with a different morphological stain; and (c) with the two or more trained virtual staining engines, generating two or more virtually stained images of the test unstained biological specimen, wherein each of the generated two or more virtually stained images is stained with a different morphological stain.
[0052] In some embodiments, each of the two or more virtual staining engines are trained from a plurality of pairs of coregistered training images derived from one or more training biological specimens. In some embodiments, a first training image in each pair of the plurality of coregistered training images is derived from one or more training multispectral transmission image channel images; and wherein a second training image in each pair of the plurality of coregistered training images comprises training brightfield image data of the same biological specimen stained with the morphological stain. In some embodiments, the training brightfield image data is acquired
using a brightfield image acquisition device. In some embodiments, the training brightfield image data is acquired using a multispectral image acquisition device.
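Although the disclosure does not limit how the training images are coregistered, the pairing of a multispectral training image with its brightfield counterpart can be illustrated by a least-squares affine fit to matched landmark pairs (landmark placement is illustrated later in FIGS. 14A and 14B). The sketch below is purely illustrative; the function names and the synthetic landmark coordinates are hypothetical and do not represent the registration procedure of any particular embodiment.

```python
import numpy as np

def fit_affine(src, dst):
    """Least-squares affine transform mapping src landmarks onto dst.

    src, dst: (N, 2) arrays of matched (x, y) landmark coordinates.
    Returns a 2x3 matrix A such that dst is approximately [x, y, 1] @ A.T.
    """
    n = src.shape[0]
    homog = np.hstack([src, np.ones((n, 1))])   # (N, 3) homogeneous coords
    # Solve homog @ A.T = dst in the least-squares sense
    a_t, *_ = np.linalg.lstsq(homog, dst, rcond=None)
    return a_t.T                                 # (2, 3)

def apply_affine(a, pts):
    homog = np.hstack([pts, np.ones((pts.shape[0], 1))])
    return homog @ a.T

# Hypothetical matched landmarks picked in the multispectral (src) and
# brightfield (dst) images; the offset/scale here simulate a misalignment.
ms_pts = np.array([[10.0, 12.0], [200.0, 15.0], [15.0, 180.0], [190.0, 210.0]])
bf_pts = ms_pts * 1.02 + np.array([3.0, -2.0])

A = fit_affine(ms_pts, bf_pts)
warped = apply_affine(A, ms_pts)
print(np.max(np.abs(warped - bf_pts)))  # residual near zero for an exact affine map
```

In practice the fitted transform would be used to resample one image onto the pixel grid of the other before the pair is used for training.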
[0053] A seventh aspect of the present disclosure is a system for generating a virtually stained image of a test unstained biological specimen disposed on a substrate, the system comprising: (i) one or more processors, and (ii) one or more memories coupled to the one or more processors, the one or more memories to store computer-executable instructions that, when executed by the one or more processors, cause the system to perform operations comprising: (a) obtaining a trained virtual staining engine trained to generate an image of an unstained biological specimen stained with a morphological stain, wherein the trained virtual staining engine is trained from a plurality of pairs of coregistered training images, (i) wherein a first training image in each pair of the plurality of coregistered training images is derived from training multispectral transmission image data of an unstained training biological specimen acquired at three or more different wavelengths, (ii) wherein a second training image in each pair of the plurality of coregistered training images comprises training brightfield image data of the same training biological specimen stained with a morphological stain; (b) obtaining a test multispectral transmission image of the test unstained biological specimen, wherein the obtained test multispectral transmission image is derived from test multispectral transmission image data of the test unstained biological specimen acquired at the three or more different wavelengths; and (c) with the obtained trained virtual staining engine, generating the virtually stained image of the test unstained biological specimen stained with the morphological stain based on the obtained test multispectral transmission image.
[0054] In some embodiments, the training brightfield image data is acquired using a brightfield image acquisition device. In some embodiments, the training brightfield image data is acquired using a multispectral image acquisition device.
[0055] In some embodiments, the morphological stain is a primary stain. In some embodiments, the morphological stain is a special stain (e.g., Masson's Trichrome or any of the other exemplary special stains described herein). In some embodiments, the morphological stain comprises hematoxylin. In some embodiments, the morphological stain comprises hematoxylin and eosin.
[0056] In some embodiments, the training multispectral transmission image data of the unstained training biological specimen is acquired at four or more different wavelengths. In some
embodiments, the first training image is generated without reducing a dimensionality of the training multispectral transmission image data at the four or more different wavelengths. In some embodiments, the first training image is generated by reducing a dimensionality of the training multispectral transmission image data at the four or more different wavelengths. In some embodiments, the dimensionality is reduced using principal component analysis.
[0057] In some embodiments, the training multispectral transmission image data of the unstained training biological specimen is acquired at six or more different wavelengths. In some embodiments, the first training image is generated without reducing a dimensionality of the training multispectral transmission image data acquired at the six or more different wavelengths. In some embodiments, the first training image is generated by reducing a dimensionality of the training multispectral transmission image data acquired at the six or more different wavelengths. In some embodiments, the dimensionality is reduced using principal component analysis.
[0058] In some embodiments, the training multispectral transmission image data of the unstained training biological specimen is acquired at twelve or more different wavelengths. In some embodiments, the first training image is generated without reducing a dimensionality of the training multispectral transmission image data acquired at the twelve or more different wavelengths. In some embodiments, the first training image is generated by reducing a dimensionality of the training multispectral transmission image data acquired at the twelve or more different wavelengths. In some embodiments, the dimensionality is reduced using principal component analysis.
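The principal component analysis referred to in the preceding paragraphs can be sketched as follows for a twelve-channel multispectral stack reduced to three channels. The channel count, component count, and SVD-based implementation below are illustrative assumptions, not limitations of the disclosure:

```python
import numpy as np

def pca_reduce(stack, n_components=3):
    """Reduce an (H, W, C) multispectral stack to n_components channels.

    Each pixel is treated as a C-dimensional sample; the principal
    components are the directions of maximal variance across pixels.
    """
    h, w, c = stack.shape
    pixels = stack.reshape(-1, c).astype(np.float64)
    centered = pixels - pixels.mean(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing variance
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    scores = centered @ vt[:n_components].T
    return scores.reshape(h, w, n_components)

rng = np.random.default_rng(0)
# Synthetic 12-channel image, e.g. one channel per illumination wavelength
stack = rng.random((64, 64, 12))
reduced = pca_reduce(stack, n_components=3)
print(reduced.shape)  # (64, 64, 3)
```

The reduced image retains the highest-variance structure of the original channels, which is what makes it a plausible compact input for a virtual staining engine.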
[0059] In some embodiments, the at least two wavelengths are selected from 365 +/- 20 nm, about 400 +/- 20 nm, about 435 +/- 20 nm, about 470 +/- 20 nm, about 500 +/- 20 nm, about 550+/- 20 nm, about 580 +/- 20 nm, about 635 +/- 20 nm, about 660 +/- 20 nm, about 690 +/- 20 nm, about 780 +/- 20 nm, and/or about 850 +/- 20 nm. In some embodiments, the at least two wavelengths are selected from 365 +/- 10 nm, about 400 +/- 10 nm, about 435 +/- 10 nm, about 470 +/- 10 nm, about 500 +/- 10 nm, about 550+/- 10 nm, about 580 +/- 10 nm, about 635 +/- 10 nm, about 660 +/- 10 nm, about 690 +/- 10 nm, about 780 +/- 10 nm, and/or about 850 +/- 10 nm. In some embodiments, the at least two wavelengths are selected from 365 nm, about 400 nm, about 435 nm, about 470 nm, about 500 nm, about 550 nm, about 580 nm, about 635 nm, about 660 nm, about 690 nm, about 780 nm, and/or about 850 nm.
[0060] In some embodiments, the training multispectral transmission image data and the test multispectral image data are acquired using a multispectral image acquisition device. In some embodiments, the training brightfield image data is acquired using a brightfield image acquisition device. In some embodiments, the training brightfield image data is acquired using a multispectral image acquisition device, wherein the multispectral image acquisition device is configured to capture image data at 700 nm +/- 10 nm, about 550 nm +/- 10 nm, and about 470 nm +/- 10 nm.
[0061] In some embodiments, the obtained trained virtual staining engine comprises a generative adversarial network.
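Paragraph [0061] refers to a generative adversarial network. For context, the conditional GAN objective commonly used for image-to-image translation (e.g., by the pix2pix algorithm referenced in connection with FIG. 18B) can be written as below, where G maps a multispectral input x to a virtually stained output, D discriminates real from generated (input, output) pairs, and y is the chemically stained ground truth; the L1 term and weight λ are standard additions in pix2pix-style training rather than requirements of the present disclosure:

```latex
\mathcal{L}_{\mathrm{cGAN}}(G, D)
  = \mathbb{E}_{x,y}\!\left[\log D(x, y)\right]
  + \mathbb{E}_{x}\!\left[\log\!\left(1 - D\!\left(x, G(x)\right)\right)\right]

G^{*} = \arg\min_{G}\max_{D}\;
  \mathcal{L}_{\mathrm{cGAN}}(G, D)
  + \lambda\,\mathbb{E}_{x,y}\!\left[\lVert y - G(x)\rVert_{1}\right]
```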
[0062] An eighth aspect of the present disclosure is a method of generating a virtually stained image of a test unstained biological specimen disposed on a substrate, comprising: (a) obtaining a trained virtual staining engine trained to generate an image of an unstained biological specimen stained with a morphological stain, wherein the trained virtual staining engine is trained from a plurality of pairs of coregistered training images, (i) wherein a first training image in each pair of the plurality of coregistered training images is derived from training multispectral transmission image data of an unstained training biological specimen illuminated with at least two different illumination sources, (ii) wherein a second training image in each pair of the plurality of coregistered training images comprises training brightfield image data of the same training biological specimen stained with a morphological stain; (b) obtaining a test multispectral transmission image of the test unstained biological specimen, wherein the obtained test multispectral transmission image is derived from test multispectral transmission image data of the test unstained biological specimen illuminated with the at least two different illumination sources; and (c) with the obtained trained virtual staining engine, generating the virtually stained image of the test unstained biological specimen stained with the morphological stain based on the obtained test multispectral transmission image.
[0063] In some embodiments, the morphological stain is a primary stain. In some embodiments, the morphological stain is a special stain. In some embodiments, the morphological stain comprises hematoxylin. In some embodiments, the morphological stain comprises hematoxylin and eosin.
[0064] In some embodiments, the unstained training biological specimen and the test unstained biological specimen are illuminated with at least four different illumination sources. In some embodiments, the unstained training biological specimen and the test unstained biological specimen are illuminated with at least six different illumination sources. In some embodiments, the unstained training biological specimen and the test unstained biological specimen are illuminated with at least twelve different illumination sources.
[0065] In some embodiments, the at least two illumination sources are selected from 365 +/- 20 nm, about 400 +/- 20 nm, about 435 +/- 20 nm, about 470 +/- 20 nm, about 500 +/- 20 nm, about 550+/- 20 nm, about 580 +/- 20 nm, about 635 +/- 20 nm, about 660 +/- 20 nm, about 690 +/- 20 nm, about 780 +/- 20 nm, and/or about 850 +/- 20 nm. In some embodiments, the at least two illumination sources are selected from 365 +/- 10 nm, about 400 +/- 10 nm, about 435 +/- 10 nm, about 470 +/- 10 nm, about 500 +/- 10 nm, about 550+/- 10 nm, about 580 +/- 10 nm, about 635 +/- 10 nm, about 660 +/- 10 nm, about 690 +/- 10 nm, about 780 +/- 10 nm, and/or about 850 +/- 10 nm. In some embodiments, the at least two illumination sources are selected from 365 nm, about 400 nm, about 435 nm, about 470 nm, about 500 nm, about 550 nm, about 580 nm, about 635 nm, about 660 nm, about 690 nm, about 780 nm, and/or about 850 nm.
[0066] In some embodiments, the training multispectral transmission image data and the test multispectral image data are acquired using a multispectral image acquisition device. In some embodiments, the training brightfield image data is acquired using a brightfield image acquisition device. In some embodiments, the training brightfield image data is acquired using a multispectral image acquisition device, wherein the multispectral image acquisition device is configured to capture image data at 700 nm +/- 10 nm, about 550 nm +/- 10 nm, and about 470 nm +/- 10 nm. In some embodiments, the obtained trained virtual staining engine comprises a generative adversarial network. In some embodiments, each pair of the plurality of coregistered training images is derived from a different training biological specimen.
[0067] A ninth aspect of the present disclosure is a non-transitory computer-readable medium storing instructions for generating a virtually stained image of a test unstained biological specimen disposed on a substrate, the instructions comprising: (a) obtaining a test multispectral transmission image of the test unstained biological specimen, wherein the test multispectral transmission image is derived from one or more test multispectral transmission image channel images; (b) supplying the obtained test multispectral transmission image of the test unstained biological specimen to a virtual staining engine, wherein the virtual staining engine is trained to generate an image of an unstained biological specimen stained with a morphological stain, and wherein the virtual staining engine is trained from a plurality of pairs of coregistered training
images derived from one or more training biological specimens, (i) wherein a first training image in each pair of the plurality of coregistered training images is derived from one or more training multispectral transmission image channel images; (ii) wherein a second training image in each pair of the plurality of coregistered training images comprises training brightfield image data of the same biological specimen stained with the morphological stain; and (c) with the trained virtual staining engine, generating the virtually stained image of the test unstained biological specimen stained with the morphological stain; wherein each of the one or more test multispectral transmission image channel images are acquired at about the same wavelengths as each of the one or more training multispectral transmission image channel images.
[0068] In some embodiments, the test multispectral transmission image of the test unstained biological specimen is derived from two or more test multispectral transmission image channel images, and wherein no compression technique or dimensionality reduction technique is applied to the two or more test multispectral transmission image channel images to derive the test multispectral transmission image of the test unstained biological specimen. In some embodiments, the test multispectral transmission image of the test unstained biological specimen is derived from two or more test multispectral transmission image channel images, and wherein a compression technique or a dimensionality reduction technique (e.g., principal component analysis) is applied to the two or more test multispectral transmission image channel images to derive the test multispectral transmission image of the test unstained biological specimen.
[0069] In some embodiments, the test multispectral transmission image of the test unstained biological specimen is derived from three or more test multispectral transmission image channel images, and wherein no compression technique or dimensionality reduction technique is applied to the three or more test multispectral transmission image channel images to derive the test multispectral transmission image of the test unstained biological specimen. In some embodiments, the test multispectral transmission image of the test unstained biological specimen is derived from three or more test multispectral transmission image channel images, and wherein a compression technique or a dimensionality reduction technique (e.g., principal component analysis) is applied to the three or more test multispectral transmission image channel images to derive the test multispectral transmission image of the test unstained biological specimen.
[0070] In some embodiments, the test multispectral transmission image of the test unstained biological specimen is derived from four or more test multispectral transmission image
channel images, and wherein no compression technique or dimensionality reduction technique is applied to the four or more test multispectral transmission image channel images to derive the test multispectral transmission image of the test unstained biological specimen. In some embodiments, the test multispectral transmission image of the test unstained biological specimen is derived from four or more test multispectral transmission image channel images, and wherein a compression technique or a dimensionality reduction technique (e.g., principal component analysis) is applied to the four or more test multispectral transmission image channel images to derive the test multispectral transmission image of the test unstained biological specimen.
[0071] In some embodiments, the test multispectral transmission image of the test unstained biological specimen is derived from twelve or more test multispectral transmission image channel images, and wherein no compression technique or dimensionality reduction technique is applied to the twelve or more test multispectral transmission image channel images to derive the test multispectral transmission image of the test unstained biological specimen. In some embodiments, the test multispectral transmission image of the test unstained biological specimen is derived from twelve or more test multispectral transmission image channel images, and wherein a compression technique or a dimensionality reduction technique (e.g., principal component analysis) is applied to the twelve or more test multispectral transmission image channel images to derive the test multispectral transmission image of the test unstained biological specimen.
[0072] In some embodiments, the first training image is derived from two or more training multispectral transmission image channel images, and wherein no compression technique or dimensionality reduction technique is applied to the two or more training multispectral transmission image channel images to derive the first training image. In some embodiments, the first training image is derived from two or more training multispectral transmission image channel images, and wherein a compression technique or a dimensionality reduction technique is applied to the two or more training multispectral transmission image channel images to derive the first training image.
[0073] In some embodiments, the first training image is derived from three or more training multispectral transmission image channel images, and wherein no compression technique or dimensionality reduction technique is applied to the three or more training multispectral transmission image channel images to derive the first training image. In some embodiments, the
first training image is derived from three or more training multispectral transmission image channel images, and wherein a compression technique or a dimensionality reduction technique is applied to the three or more training multispectral transmission image channel images to derive the first training image.
[0074] In some embodiments, the first training image is derived from four or more training multispectral transmission image channel images, and wherein no compression technique or dimensionality reduction technique is applied to the four or more training multispectral transmission image channel images to derive the first training image. In some embodiments, the first training image is derived from four or more training multispectral transmission image channel images, and wherein a compression technique or a dimensionality reduction technique is applied to the four or more training multispectral transmission image channel images to derive the first training image.
[0075] In some embodiments, the first training image is derived from twelve or more training multispectral transmission image channel images, and wherein no compression technique or dimensionality reduction technique is applied to the twelve or more training multispectral transmission image channel images to derive the first training image. In some embodiments, the first training image is derived from twelve or more training multispectral transmission image channel images, and wherein a compression technique or a dimensionality reduction technique is applied to the twelve or more training multispectral transmission image channel images to derive the first training image.
[0076] A tenth aspect of the present disclosure is a method of virtually staining an image derived from an unstained test biological specimen, comprising obtaining test multispectral image data from the unstained test biological specimen, wherein the obtained test multispectral image data comprises at least four multispectral transmission image channel images acquired at different wavelengths; reducing the dimensionality of the obtained test multispectral image data, thereby generating a multi-channel multispectral test transmission image; and generating the virtually stained image from the multi-channel multispectral test transmission image using a virtual staining engine trained to generate an image of an unstained biological specimen stained with a particular morphological stain.
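The three steps of the tenth aspect (acquire at least four channel images, reduce their dimensionality, apply a trained virtual staining engine) can be sketched end to end as below. The `StainingEngine` class is a hypothetical stand-in (a fixed linear projection to a 3-channel output), not the trained network of the disclosure, and the channel and component counts are illustrative assumptions:

```python
import numpy as np

class StainingEngine:
    """Hypothetical stand-in for a trained virtual staining engine.

    Maps an (H, W, K) reduced multispectral image to an (H, W, 3)
    RGB-like "stained" image with a fixed linear projection.
    """
    def __init__(self, in_channels, seed=0):
        rng = np.random.default_rng(seed)
        self.weights = rng.random((in_channels, 3))

    def stain(self, image):
        out = image @ self.weights
        lo, hi = out.min(), out.max()
        return (out - lo) / (hi - lo + 1e-12)  # normalize to [0, 1]

def reduce_channels(stack, n_components):
    """PCA-style reduction of an (H, W, C) stack to n_components channels."""
    h, w, c = stack.shape
    pixels = stack.reshape(-1, c).astype(np.float64)
    centered = pixels - pixels.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return (centered @ vt[:n_components].T).reshape(h, w, n_components)

# Step 1: four channel images acquired at different illumination wavelengths
rng = np.random.default_rng(1)
channels = rng.random((128, 128, 4))
# Step 2: dimensionality reduction to a multi-channel test transmission image
reduced = reduce_channels(channels, n_components=3)
# Step 3: virtual staining of the reduced image
virtual_stain = StainingEngine(in_channels=3).stain(reduced)
print(virtual_stain.shape)  # (128, 128, 3)
```

A real engine would replace the linear projection with the trained generative model described elsewhere in this disclosure; the data flow, however, follows the same three stages.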
BRIEF DESCRIPTION OF THE FIGURES
[0077] For a general understanding of the features of the disclosure, reference is made to the drawings. In the drawings, like reference numerals have been used throughout to identify identical elements.
[0078] FIG. 1 compares virtually stained images of tissue samples to the same tissue samples which were chemically stained, such as with a morphological stain.
[0079] FIG. 2A provides an overview of a method of training a machine-learning algorithm to generate a morphologically stained image from a test multispectral image derived from an unstained biological specimen in accordance with one embodiment of the present disclosure.
[0080] FIG. 2B provides an overview of a method of generating a virtually stained image of a test unstained biological specimen in accordance with one embodiment of the present disclosure.
[0081] FIGS. 3A - 3C illustrate systems for acquiring image data and generating a virtual stain of a test biological specimen or for training a virtual staining engine.
[0082] FIGS. 4A - 4B illustrate systems for acquiring image data and generating a virtual stain of a test biological specimen or for training a virtual staining engine.
[0083] FIG. 5 provides a block diagram of a multispectral image acquisition device in accordance with one embodiment of the present disclosure.
[0084] FIGS. 6A, 6B, and 6C illustrate methods of training a virtual staining engine in accordance with some embodiments of the present disclosure.
[0085] FIG. 7 illustrates a method of training a virtual staining engine with different obtained training samples.
[0086] FIGS. 8A and 8B illustrate methods of generating training image data for use in training a virtual staining engine.
[0087] FIG. 9 illustrates a method of training a virtual staining engine with different obtained serial tissue sections.
[0088] FIG. 10 illustrates a method of generating a multispectral training image in accordance with one embodiment of the present disclosure.
[0089] FIG. 11 illustrates a method of generating a multispectral training image in accordance with one embodiment of the present disclosure.
[0090] FIG. 12A provides an example of a multi-channel multispectral training transmission image.
[0091] FIG. 12B provides an example of a training brightfield transmission image.
[0092] FIG. 13A illustrates a method of coregistering a multispectral training transmission image and a training brightfield transmission image to provide a pair of coregistered training images in accordance with one embodiment of the present disclosure.
[0093] FIG. 13B sets forth a method of coregistering a generated multi-channel multispectral training transmission image and a training brightfield transmission image in accordance with one embodiment of the present disclosure.
[0094] FIGS. 14A and 14B illustrate the placement of landmarks in both an obtained generated multi-channel multispectral training transmission image and an obtained training brightfield transmission image.
[0095] FIG. 15 provides a method of virtually staining an image of a test unstained biological specimen with a trained virtual staining engine.
[0096] FIG. 16 illustrates a method of generating a multispectral test image in accordance with one embodiment of the present disclosure.
[0097] FIG. 17 illustrates images (three) of unstained tonsil tissue acquired at three different wavelengths using a multispectral imaging apparatus. FIG. 17 further illustrates the coding of the images of unstained tonsil tissue into an RGB image. Finally, FIG. 17 compares a virtually stained image generated using a trained virtual staining engine with an image of the sample tissue specimen which was chemically stained.
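The coding of three single-wavelength channel images into one RGB image, as illustrated in FIG. 17, amounts to assigning each channel to a color plane. A minimal sketch follows; the wavelength-to-plane assignment and per-channel normalization shown here are illustrative assumptions:

```python
import numpy as np

def code_channels_to_rgb(ch_r, ch_g, ch_b):
    """Stack three single-channel transmission images into one RGB image.

    Each input is an (H, W) array; each is normalized to [0, 1]
    independently before being assigned to a color plane.
    """
    def norm(ch):
        ch = ch.astype(np.float64)
        lo, hi = ch.min(), ch.max()
        return (ch - lo) / (hi - lo + 1e-12)
    return np.stack([norm(ch_r), norm(ch_g), norm(ch_b)], axis=-1)

rng = np.random.default_rng(2)
# e.g. channels acquired at ~635 nm, ~550 nm, and ~470 nm illumination
rgb = code_channels_to_rgb(rng.random((32, 32)),
                           rng.random((32, 32)),
                           rng.random((32, 32)))
print(rgb.shape)  # (32, 32, 3)
```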
[0098] FIGS. 18A and 18B provide a pseudo-colored image obtained from a chemically stained H&E breast section scanned in a FLASH multispectral scanner (FIG. 18A). Channels illuminated with 470, 550, and 635 nm wavelengths were extracted, transformed to enhance the coloring and white balancing, and then coded into an RGB image to be used as ground truth. FIG. 18B provides the virtually stained image obtained when transforming the spectral image using the pix2pix algorithm with the image of FIG. 18A as ground truth.
[0099] FIG. 19 provides an image of a virtually H&E-stained whole slide image of breast tissue in accordance with the methods of the present disclosure.
[0100] FIG. 20 provides an image of a virtually H&E-stained breast tissue sample in accordance with the methods of the present disclosure.
[0101] FIG. 21 provides an image of a colorectal tissue section virtually stained with a Masson's trichrome stain in accordance with the methods of the present disclosure.
[0102] FIG. 22 provides a method of scanning an unstained slide in accordance with the methods of the present disclosure.
[0103] FIG. 23 provides a method of scanning a stained slide, such as for acquiring images of samples stained with a morphological stain or a special stain for training a machine-learning algorithm, in accordance with the methods of the present disclosure.
[0104] FIG. 24 illustrates a workflow for coregistering one or more images in accordance with the methods of the present disclosure.
DETAILED DESCRIPTION
[0105] It should also be understood that, unless clearly indicated to the contrary, in any methods claimed herein that include more than one step or act, the order of the steps or acts of the method is not necessarily limited to the order in which the steps or acts of the method are recited.
[0106] References in the specification to "one embodiment," "an embodiment," "an illustrative embodiment," etc., indicate that the embodiment described may include a particular feature, structure, or characteristic, but every embodiment may or may not necessarily include that particular feature, structure, or characteristic. Moreover, such phrases are not necessarily referring to the same embodiment. Further, when a particular feature, structure, or characteristic is described in connection with an embodiment, it is submitted that it is within the knowledge of one skilled in the art to effect such feature, structure, or characteristic in connection with other embodiments whether or not explicitly described.
[0107] As used herein, the singular terms "a," "an," and "the" include plural referents unless context clearly indicates otherwise. Similarly, the word "or" is intended to include "and" unless the context clearly indicates otherwise. The term "includes" is defined inclusively, such that "includes A or B" means including A, B, or A and B.
[0108] As used herein in the specification and in the claims, "or" should be understood to have the same meaning as "and/or" as defined above. For example, when separating items in a list, "or" or "and/or" shall be interpreted as being inclusive, for example, the inclusion of at least one, but also including more than one, of a number or list of elements, and, optionally, additional unlisted items. Only terms clearly indicated to the contrary, such as "only one of" or "exactly one of," or, when used in the claims, "consisting of," will refer to the inclusion of exactly one element of a number or list of elements. In general, the term "or" as used herein shall only be interpreted as indicating exclusive alternatives (for example "one or the other but not both") when preceded by terms of exclusivity, such as "either," "one of," "only one of," or "exactly one of." "Consisting essentially of," when used in the claims, shall have its ordinary meaning as used in the field of patent law.
[0109] The terms "comprising," "including," "having," and the like are used interchangeably and have the same meaning. Similarly, "comprises," "includes," "has," and the like are used interchangeably and have the same meaning. Specifically, each of the terms is defined consistent with the common United States patent law definition of "comprising" and is therefore interpreted to be an open term meaning "at least the following," and is also interpreted not to exclude additional features, limitations, aspects, etc. Thus, for example, "a device having components a, b, and c" means that the device includes at least components a, b, and c. Similarly, the phrase "a method involving steps a, b, and c" means that the method includes at least steps a, b, and c. Moreover, while the steps and processes may be outlined herein in a particular order, the skilled artisan will recognize that the ordering of steps and processes may vary.
[0110] As used herein in the specification and in the claims, the phrase "at least one," in reference to a list of one or more elements, should be understood to mean at least one element selected from any one or more of the elements in the list of elements, but not necessarily including at least one of each and every element specifically listed within the list of elements and not excluding any combinations of elements in the list of elements. This definition also allows that elements may optionally be present other than the elements specifically identified within the list of elements to which the phrase "at least one" refers, whether related or unrelated to those elements specifically identified. Thus, as a non-limiting example, "at least one of A and B" (or, equivalently, "at least one of A or B," or, equivalently, "at least one of A and/or B") can refer, in one embodiment, to at least one, optionally including more than one, A, with no B present (and optionally including elements other than B); in another embodiment, to at least one, optionally including more than one, B, with no A present (and optionally including elements other than A); in yet another embodiment, to at least one, optionally including more than one, A, and at least one, optionally including more than one, B (and optionally including other elements); etc.
[0111] As used herein, the term "about" means +/- 5%. In some embodiments, "about" means +/- 10%. In some embodiments, "about" means +/- 15%. In some embodiments, "about" means +/- 20%.
[0112] As used herein, the term "biological specimen," "sample," or "tissue sample" refers to any sample including a biomolecule (such as a protein, a peptide, a nucleic acid, a lipid, a carbohydrate, or a combination thereof) that is obtained from any organism including viruses. Other examples of organisms include mammals (such as humans; veterinary animals like cats, dogs, horses, cattle, and swine; and laboratory animals like mice, rats, and primates), insects, annelids, arachnids, marsupials, reptiles, amphibians, bacteria, and fungi. Biological specimens include tissue samples (such as tissue sections and needle biopsies of tissue), cell samples (such as cytological smears such as Pap smears or blood smears or samples of cells obtained by microdissection), or cell fractions, fragments, or organelles (such as obtained by lysing cells and separating their components by centrifugation or otherwise). Other examples of biological specimens include blood, serum, urine, semen, fecal matter, cerebrospinal fluid, interstitial fluid, mucus, tears, sweat, pus, biopsied tissue (for example, obtained by a surgical biopsy or a needle biopsy), nipple aspirates, cerumen, milk, vaginal fluid, saliva, swabs (such as buccal swabs), or any material containing biomolecules that is derived from a first biological specimen. In certain embodiments, the term "biological specimen" as used herein refers to a sample (such as a homogenized or liquefied sample) prepared from a tumor or a portion thereof obtained from a subject.
[0113] As used herein, the terms "biomarker" or "marker" refer to a measurable indicator of some biological state or condition. A biomarker may be a protein or peptide, e.g., a surface protein, which can be specifically stained, and which is indicative of a biological feature of the cell, e.g., the cell type or the physiological state of the cell. An immune cell marker is a biomarker that is selectively indicative of a feature that relates to an immune response of a mammal. A biomarker may be used to determine how well the body responds to a treatment for a disease or condition or if the subject is predisposed to a disease or condition. In the context of cancer, a biomarker refers to a biological substance that is indicative of the presence of cancer in the body. A biomarker may be a molecule secreted by a tumor or a specific response of the body to the presence of cancer. Genetic, epigenetic, proteomic, glycomic, and imaging biomarkers can be used for cancer diagnosis, prognosis, and epidemiology. Such biomarkers can be assayed in non-
invasively collected biofluids like blood or serum. Several gene and protein based biomarkers have already been used in patient care including but, not limited to, AFP (Liver Cancer), BCR- ABL (Chronic Myeloid Leukemia), BRCA1 / BRCA2 (Breast/Ovarian Cancer), BRAF V600E (Melanoma/Colorectal Cancer), CA-125 (Ovarian Cancer), CA19.9 (Pancreatic Cancer), CEA (Colorectal Cancer), EGFR (Non-small-cell lung carcinoma), HER-2 (Breast Cancer), KIT (Gastrointestinal stromal tumor), PSA (Prostate Specific Antigen), SI 00 (Melanoma), and many others. Biomarkers may be useful as diagnostics (to identify early-stage cancers) and/or prognostics (to forecast how aggressive a cancer is and/or predict how a subject will respond to a particular treatment and/or how likely a cancer is to recur).
[0114] As used herein, a "brightfield" refers to data, e.g., image data, obtained via a microscope based on a biological sample illuminated from below such that the light waves pass through transparent portions of the biological sample. The varying brightness levels are then captured, such as in the form of an image.
[0115] As used herein, the term "cell" refers to a prokaryotic cell or a eukaryotic cell. The cell may be an adherent or a non-adherent cell, such as an adherent prokaryotic cell, adherent eukaryotic cell, non-adherent prokaryotic cell, or non-adherent eukaryotic cell. A cell may be a yeast cell, a bacterial cell, an algae cell, a fungal cell, or any combination thereof. A cell may be a mammalian cell. A cell may be a primary cell obtained from a subject. A cell may be a cell line or an immortalized cell. A cell may be obtained from a mammal, such as a human or a rodent. A cell may be a cancer or tumor cell. A cell may be an epithelial cell. A cell may be a red blood cell or a white blood cell. A cell may be an immune cell such as a T cell, a B cell, a natural killer (NK) cell, a macrophage, a dendritic cell, or others. A cell may be a neuronal cell, a glial cell, an astrocyte, a neuronal support cell, a Schwann cell, or others. A cell may be an endothelial cell. A cell may be a fibroblast or a keratinocyte. A cell may be a pericyte, a hepatocyte, a stem cell, a progenitor cell, or others. A cell may be a circulating cancer or tumor cell or a metastatic cell. A cell may be a marker-specific cell such as a CD8+ T cell or a CD4+ T cell. A cell may be a neuron. A neuron may be a central neuron, a peripheral neuron, a sensory neuron, an interneuron, a motor neuron, a multipolar neuron, a bipolar neuron, or a pseudo-unipolar neuron. A cell may be a neuron-supporting cell, such as a Schwann cell. A cell may be one of the cells of a blood-brain barrier system. A cell may be a cell line, such as a neuronal cell line. A cell may be a primary cell, such as cells obtained from a brain of a subject. A cell may be a population of cells
that may be isolated from a subject, such as a tissue biopsy, a cytology specimen, a blood sample, a fine needle aspirate (FNA) sample, or any combination thereof. A cell may be obtained from a bodily fluid such as urine, milk, sweat, lymph, blood, sputum, amniotic fluid, aqueous humor, vitreous humor, bile, cerebrospinal fluid, chyle, chyme, exudates, endolymph, perilymph, gastric acid, mucus, pericardial fluid, peritoneal fluid, pleural fluid, pus, rheum, saliva, sebum, serous fluid, smegma, sputum, tears, vomit, or other bodily fluid. A cell may comprise cancerous cells, non-cancerous cells, tumor cells, non-tumor cells, healthy cells, or any combination thereof.
[0116] As used herein, the term "cytological sample" refers to a cellular sample in which the cells of the sample have been partially or completely disaggregated, such that the sample no longer reflects the spatial relationship of the cells as they existed in the subject from which the cellular sample was obtained. Examples of cytological samples include tissue scrapings (such as a cervical scraping), fine needle aspirates, samples obtained by lavage of a subject, et cetera.
[0117] As used herein, the term "fixation" refers to a process by which molecular and/or morphological details of a cellular sample are preserved. There are generally three kinds of fixation processes: (1) heat fixation, (2) perfusion, and (3) immersion. With heat fixation, samples are exposed to a heat source for a sufficient period of time to heat-kill and adhere the sample to the slide. Perfusion involves use of the vascular system to distribute a chemical fixative throughout a whole organ or a whole organism. Immersion involves immersing a sample in a volume of a chemical fixative and allowing the fixative to diffuse throughout the sample. Chemical fixation involves diffusion or perfusion of a chemical throughout the cellular sample, where the fixative reagent causes a reaction that preserves structures (both chemically and structurally) as close to those of the living cellular sample as possible. Chemical fixatives can be classified into two broad classes based on mode of action: cross-linking fixatives and non-cross-linking fixatives. Cross-linking fixatives - typically aldehydes - create covalent chemical bonds between endogenous biological molecules, such as proteins and nucleic acids, present in the tissue sample. Formaldehyde is the most commonly used cross-linking fixative in histology. Formaldehyde may be used in various concentrations for fixation, but it primarily is used as 10% neutral buffered formalin (NBF), which is about 3.7% formaldehyde in an aqueous phosphate buffered saline solution. Paraformaldehyde is a polymerized form of formaldehyde, which depolymerizes to provide formalin when heated. Glutaraldehyde operates in a similar manner as formaldehyde but is a larger molecule having a slower rate of diffusion across membranes. Glutaraldehyde fixation
provides a more rigid or tightly linked fixed product, causes rapid and irreversible changes, fixes quickly and well at 4 °C, provides good overall cytoplasmic and nuclear detail, but is not ideal for immunohistochemistry staining. Some fixation protocols use a combination of formaldehyde and glutaraldehyde. Glyoxal and acrolein are less commonly used aldehydes. Denaturation fixatives - typically alcohols or acetone - act by displacing water in the cellular sample, which destabilizes hydrophobic and hydrogen bonding within proteins. This causes otherwise water-soluble proteins to become water insoluble and precipitate, which is largely irreversible.
[0118] As used herein, the term "immunohistochemistry" refers to a method of determining the presence or distribution of an antigen in a sample by detecting interaction of the antigen with a specific binding agent, such as an antibody. A sample is contacted with an antibody under conditions permitting antibody-antigen binding. Antibody-antigen binding can be detected by means of a detectable label conjugated to the antibody (direct detection) or by means of a detectable label conjugated to a secondary antibody, which binds specifically to the primary antibody (indirect detection). In some instances, indirect detection can include tertiary or higher antibodies that serve to further enhance the detectability of the antigen. Examples of detectable labels include enzymes, fluorophores, and haptens, which, in the case of enzymes, can be employed along with chromogenic or fluorogenic substrates.
[0119] As used herein, the term "machine learning" refers to a type of learning in which the machine (e.g., a computer program) can learn on its own without being explicitly programmed.
[0120] As used herein, the term "slide" refers to any substrate (e.g., substrates made, in whole or in part, of glass, quartz, plastic, silicon, etc.) of any suitable dimensions on which a biological specimen is placed for analysis, and more particularly to a "microscope slide" such as a standard 3 inch by 1 inch microscope slide or a standard 75 mm by 25 mm microscope slide. Examples of biological specimens that can be placed on a slide include, without limitation, a cytological smear, a thin tissue section (such as from a biopsy), and an array of biological specimens, for example a tissue array, a cellular array, a DNA array, an RNA array, a protein array, or any combination thereof. Thus, in one embodiment, tissue sections, DNA samples, RNA samples, and/or proteins are placed on a slide at particular locations. In some embodiments, the term slide may refer to SELDI and MALDI chips, and silicon wafers.
[0121] As used herein, the term "substantially" means the qualitative condition of exhibiting total or near-total extent or degree of a characteristic or property of interest. In some
embodiments, "substantially" means within about 5%. In some embodiments, "substantially" means within about 10%. In some embodiments, "substantially" means within about 15%. In some embodiments, "substantially" means within about 20%.
[0122] As used herein, the term "virtual stained image" refers to an image of an unstained biological sample that simulates a chemically stained biological sample. In some embodiments, there is no discernable difference in the diagnostic quality between the virtually stained images of unstained biological specimens and the corresponding images of chemically stained biological specimens, at least not to the extent that any differences will substantially alter a diagnostic outcome.
[0123] OVERVIEW
[0124] The present disclosure provides systems and methods for the generation of a virtually stained image of an unstained biological specimen based on an acquired image of the unstained biological specimen, where the virtually stained image manifests the appearance of the unstained biological specimen as if it were chemically stained, such as with a morphological stain (e.g., a primary stain or a special stain). In some embodiments, a virtual staining engine trained in accordance with the methods described herein will output a virtually generated stained image in response to a provided input image of an unstained biological specimen, where the virtually stained image appears to a skilled observer (e.g., a trained histopathologist) to be substantially equivalent to a corresponding brightfield image of the same biological specimen that has been chemically stained, such as with a primary stain or with a special stain.
[0125] The skilled artisan will appreciate that a single input image of an unstained biological specimen may be virtually stained using one or more differently trained virtual staining engines to provide one or more different virtually stained output images. For instance, a single input image of an unstained biological specimen may be virtually stained with (i) a virtual staining engine trained for the H&E morphological stain to provide a first virtually stained output image based on the unstained input image, where the first virtually stained output image is substantially equivalent to a corresponding brightfield image of the same biological specimen stained with H&E (at least for diagnostic purposes); and (ii) a virtual staining engine trained for the resorcin fuchsine morphological stain to provide a second virtually stained output image based on the unstained input image, where the second virtually stained output image is substantially equivalent to a
corresponding brightfield image of the same biological specimen stained with resorcin fuchsine (at least for diagnostic purposes).
[0126] As described further herein, Applicant has demonstrated that trained pathologists were able to recognize histopathologic features in both virtually generated images of unstained biological specimens and images of chemically stained biological specimens with a high degree of agreement (see Example 1, herein). By way of example, FIGS. 1 and 18 compare virtually stained images of unstained biological specimens to those same biological specimens that have been chemically stained, where the virtually stained images of the unstained biological specimens and the corresponding images of the chemically stained biological specimens may both be equally utilized for diagnostic purposes. In some embodiments, there is no discernable difference in the diagnostic quality between the virtually stained images of the unstained biological specimens and the corresponding images of the chemically stained biological specimens, at least not to the extent that any differences will substantially alter a diagnostic outcome.
[0127] In view of the foregoing, described herein are systems and methods for generating a virtually stained image (e.g., virtually morphologically stained) from an image (e.g., a multispectral transmission image) of an unstained biological specimen. In some embodiments, a machine-learning algorithm, such as a deep learning neural network, is trained to generate a virtual stain for a specific morphological stain, such as a primary stain (e.g., H&E) or a special stain (e.g., Masson's Trichrome). In this regard, the present disclosure also provides for systems and methods of training virtual staining engines. In some embodiments, the present disclosure provides for a system which includes a plurality of different trained virtual staining engines which may be utilized in virtually staining one or more unstained biological specimens.
[0128] FIG. 2A provides an overview of a method of training a machine-learning algorithm to generate a morphologically stained image from a test multispectral image derived from an unstained biological specimen. As an initial step, training transmission image data, including training multispectral image data and training brightfield image data, is acquired from one or more training biological specimens. In some embodiments, training multispectral transmission image data is acquired from an unstained training biological specimen (step 100), such as with a multispectral image acquisition device 12A. In other embodiments, training brightfield transmission image data is acquired from a stained training biological specimen, such as with a brightfield image acquisition device 12B or with a multispectral image acquisition device 12A using RGB colors (e.g., acquiring image data from a stained training biological specimen with a multispectral image acquisition device 12A at wavelengths of about 700 nm, 550 nm, and 470 nm) (step 100). The acquired training multispectral transmission image data and the acquired training brightfield transmission image data are then supplied to a machine-learning algorithm (e.g., a GAN algorithm) to train a model to predict or generate a virtually stained image (step 102).
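The data-pairing step described above (assembling an unstained multispectral input and a stained brightfield target before they are supplied to the model in step 102) can be sketched as follows. This is a minimal NumPy illustration under stated assumptions: the function name `make_training_pair`, the fixed bit depth, and the simple intensity scaling are illustrative only, and the GAN training loop itself is omitted.

```python
import numpy as np

def make_training_pair(ms_channels, brightfield, bit_depth=8):
    """Assemble one co-registered training pair for a virtual-staining model.

    ms_channels : list of HxW arrays, one per illumination wavelength,
                  acquired from the *unstained* training specimen.
    brightfield : HxWx3 array acquired from the *stained* training specimen.
    Returns (input_stack, target), both scaled to [0, 1] floats.
    """
    max_level = 2 ** bit_depth - 1
    stack = np.stack(ms_channels, axis=-1).astype(np.float64) / max_level
    target = np.asarray(brightfield, dtype=np.float64) / max_level
    # The pair is only usable if the two images are co-registered.
    if stack.shape[:2] != target.shape[:2]:
        raise ValueError("multispectral and brightfield images must be co-registered")
    return stack, target

# Example: four wavelength channels paired with an RGB brightfield target.
channels = [np.full((64, 64), 128, dtype=np.uint8) for _ in range(4)]
rgb = np.zeros((64, 64, 3), dtype=np.uint8)
x, y = make_training_pair(channels, rgb)
```

In practice many such (x, y) pairs, tiled from whole-slide images, would be batched and fed to the chosen machine-learning algorithm.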
[0129] Also described are methods of using the trained machine-learning algorithm to generate virtually stained images (see, e.g., FIG. 2B), where the virtually stained images are virtually stained with the same stain that the trained machine-learning algorithm was trained to generate. In general, a trained virtual staining engine is obtained that is specific for a particular morphological stain, where the obtained trained virtual staining engine is trained to generate an image of an unstained biological specimen stained with the particular morphological stain for which it was trained (step 110). Test multispectral transmission image data is then obtained (step 111) and supplied to the obtained trained virtual staining engine (step 112). In general, test multispectral transmission image data is acquired at about the same wavelengths that were used when training the virtual staining engine. For instance, if training multispectral transmission images at wavelengths of 350 nm, 450 nm, and 500 nm were used to train a virtual staining engine specific for a particular morphological stain, then test multispectral transmission images should be acquired at about 350 nm, at about 450 nm, and at about 500 nm. The trained virtual staining engine will then generate a virtually stained image of the test unstained biological specimen stained with the morphological stain, where the virtually stained image manifests the appearance of the test unstained biological specimen as if it were stained with the particular morphological stain that the trained virtual staining engine was trained to generate (step 113).
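The requirement that test wavelengths match the training wavelengths to within "about" can be captured in a small validation check. This is a hypothetical helper, not the disclosed method; the 5% relative tolerance mirrors one definition of "about" given earlier and is otherwise an assumption.

```python
def wavelengths_match(train_nm, test_nm, tolerance=0.05):
    """Check that each test acquisition wavelength is within a relative
    tolerance (default 5%, one reading of "about") of the corresponding
    training wavelength. Both lists are compared after sorting."""
    if len(train_nm) != len(test_nm):
        return False
    return all(
        abs(test - train) <= tolerance * train
        for train, test in zip(sorted(train_nm), sorted(test_nm))
    )

# A 500 nm training channel accepts a test channel anywhere in 475-525 nm.
ok = wavelengths_match([350, 450, 500], [352, 448, 503])
bad = wavelengths_match([350, 450, 500], [350, 450, 580])
```

Such a check could run before step 112 so that mismatched acquisitions are rejected rather than silently producing a degraded virtual stain.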
[0130] The present disclosure also discloses systems adapted to acquire image data, process the acquired image data, train a machine-learning algorithm, and generate virtually stained slides using the trained machine-learning algorithm.
[0131] SYSTEMS
[0132] A system 200 for acquiring image data and generating a virtual stain of a test biological specimen or for training a virtual staining engine is illustrated in FIGS. 3A - 3C, 4A, and 4B. The system may include an image acquisition device 12 and a computer 14, where the image acquisition device 12 and the computer may be communicatively coupled together (e.g., directly, or indirectly over a network 20).
[0133] In some embodiments, the image acquisition device 12 is a multispectral image acquisition device 12A (see, e.g., FIGS. 3B and 4A). In other embodiments, the image acquisition device is a brightfield image acquisition device. In yet other embodiments, the system 200 includes both a multispectral image acquisition device 12A and a brightfield image acquisition device 12B (see, e.g., FIGS. 3C and 4B). In some embodiments, images captured from the image acquisition device 12 may be stored in binary form, such as locally or on a server. The captured digital images can also be divided into a matrix of pixels. The pixels can include a digital value of one or more bits, defined by the bit depth.
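The relationship between bit depth and the digital value each pixel in the matrix can hold, noted at the end of the paragraph above, works out as follows. These are hypothetical helpers for illustration; the names are not drawn from the disclosure.

```python
def pixel_levels(bit_depth):
    """Number of discrete values a pixel can take at a given bit depth."""
    return 2 ** bit_depth

def quantize(fraction, bit_depth):
    """Map a normalized intensity in [0, 1] to its stored integer value."""
    max_level = pixel_levels(bit_depth) - 1
    return round(fraction * max_level)

# An 8-bit pixel spans 256 levels (0-255); a 12-bit pixel spans 4096.
```

A higher bit depth stores finer intensity gradations per channel at the cost of larger image files.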
[0134] The computer system 14 can include a desktop computer, a laptop computer, a tablet, or the like, digital electronic circuitry, firmware, hardware, memory 201, a computer storage medium (240), a computer program or set of instructions (e.g., where the program is stored within the memory or storage medium), one or more processors (209) (including a programmed processor), and any other hardware, software, or firmware modules or combinations thereof (such as described further herein). For example, the system 14 illustrated in FIGS. 1A - 1C may include a computer with a display device 16 and an enclosure 18. The computer system can store acquired image data locally, such as in a memory, on a server, or another network-connected device.
[0135] The skilled artisan will appreciate that other computer devices or systems may be utilized and that the computer systems described herein may be communicatively coupled to additional components, e.g., microscopes (other than the multispectral image acquisition device 12A or the brightfield image acquisition device 12B described herein), automated slide preparation equipment, specimen milling / dissection devices, etc.
[0136] Multispectral Image Acquisition Device / Acquisition of Multispectral Image Data
[0137] In some embodiments, the image acquisition device 12 is a multispectral image acquisition device 12A for acquiring transmission image data of a biological specimen at one or more wavelengths. In general, the multispectral image acquisition device 12A is adapted to acquire multispectral transmission image data, such as multispectral transmission image channel images, of a biological specimen disposed on a substrate, such as a microscope slide. As described herein, the multispectral image acquisition device 12A is adapted to illuminate the biological specimen with an illumination source at a particular wavelength and acquire transmission image
data of the biological specimen illuminated with the particular wavelength (referred to herein as "acquiring image data at a particular wavelength").
[0138] One non-limiting example of a multispectral image acquisition device is disclosed in United States Patent No. 11,070,750, the disclosure of which is hereby incorporated by reference herein in its entirety. A block diagram of one suitable multispectral image acquisition device 12A is shown in FIG. 5. In some embodiments, multispectral image acquisition device 12A includes a CMOS sensor or a CCD sensor, such as a CMOS sensor or a CCD sensor which is sensitive to all wavelengths of the light sources utilized. In some embodiments, the sensor is focused to the middle wavelength used, or is independently focused to each wavelength, or dynamically focused for each wavelength by either moving the optics of the camera or moving the camera in a direction perpendicular to the sample. In some embodiments, a traditional white source and filter system may be used in the multispectral image acquisition device 12A. For example, an illuminator can include a white light source and a filter to produce a set of color monochrome images. The color of the monochrome images can be redefined and combined to produce an enhanced digital image. In some embodiments, an LED light source may be used in the detection step to generate narrower illumination light.
[0139] In some embodiments, the multispectral image acquisition device 12A is configured to illuminate a biological specimen with at least two different illumination sources, and acquire multispectral transmission image data (e.g., multispectral transmission image channel images) of the biological specimen illuminated with each of at least two different wavelengths. In some embodiments, the two or more wavelengths could be broadband wavelengths, such as wavelengths up to about 300 nanometers wide each, such as to capture large regions of spectral transmission. In some embodiments, two channels or more may be combined, for example by averaging, to generate a single transmission channel that represents an extended spectral region. In some embodiments, the multispectral image acquisition device 12A is configured to illuminate a biological specimen with at least four different illumination sources, and acquire multispectral transmission image data (e.g., multispectral transmission image channel images) of the biological specimen illuminated with each of at least four different wavelengths. In some embodiments, the multispectral image acquisition device 12A is configured to illuminate a biological specimen with at least six different illumination sources, and acquire multispectral transmission image data (e.g., multispectral transmission image channel images) of the biological specimen
illuminated with each of at least six different wavelengths. In some embodiments, the multispectral image acquisition device 12A is configured to illuminate a biological specimen with at least eight different illumination sources, and acquire multispectral transmission image data (e.g., multispectral transmission image channel images) of the biological specimen illuminated with each of at least eight different wavelengths.
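The channel-combining option mentioned in paragraph [0139] (averaging two or more channels into a single channel representing an extended spectral region) might be sketched like this. The function name and the use of a plain arithmetic mean are assumptions for illustration.

```python
import numpy as np

def combine_channels(channels):
    """Average two or more co-registered transmission channel images into
    a single channel representing an extended spectral region."""
    stack = np.stack([np.asarray(c, dtype=np.float64) for c in channels])
    return stack.mean(axis=0)

# Averaging, e.g., a 470 nm and a 550 nm channel yields one broadband channel.
a = np.full((8, 8), 100.0)
b = np.full((8, 8), 200.0)
broadband = combine_channels([a, b])
```

Weighted averaging (e.g., by source intensity or exposure time) would be a natural refinement, but simple averaging matches the example given in the text.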
[0140] In some embodiments, the multispectral image acquisition device 12A is configured to illuminate a biological specimen with at least ten different illumination sources, and acquire multispectral transmission image data (e.g., multispectral transmission image channel images) of a biological specimen at each of at least ten different wavelengths. In some embodiments, the multispectral image acquisition device 12A is configured to illuminate a biological specimen with at least twelve different illumination sources, and acquire multispectral transmission image data (e.g., multispectral transmission image channel images) of the biological specimen illuminated with each of at least twelve different wavelengths. In some embodiments, the multispectral image acquisition device 12A is configured to illuminate a biological specimen with at least sixteen different illumination sources, and acquire multispectral transmission image data (e.g., multispectral transmission image channel images) of the biological specimen illuminated with each of at least sixteen different wavelengths. In some embodiments, the multispectral image acquisition device 12A is configured to illuminate a biological specimen with at least twenty different illumination sources, and acquire multispectral transmission image data (e.g., multispectral transmission image channel images) of the biological specimen illuminated with each of at least twenty different wavelengths. In some embodiments, the multispectral image acquisition device 12A is configured to illuminate a biological specimen with at least twenty-four different illumination sources, and acquire multispectral transmission image data (e.g., multispectral transmission image channel images) of the biological specimen illuminated with each of at least twenty-four different wavelengths.
[0141] In some embodiments, the multispectral image acquisition device 12A includes one or more image capture devices; and one or more energy emitters, such as light sources, infrared sources, ultraviolet sources, or the like to illuminate the biological sample at a particular wavelength. For instance, the energy emitter can include, without limitation, one or more LEDs (e.g., edge emitting LEDs, surface emitting LEDs, super luminescent LEDs, or the like), laser diodes, electroluminescent light sources, incandescent light sources, cold cathode fluorescent light
sources, organic polymer light sources, lamps, inorganic light sources, or other suitable light-emitting sources. In some embodiments, light sources can be light-emitting diodes (LEDs), which may be pulsed on and off to correspond with imaging frames such that successive frames are recorded with a different LED illumination.
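The pulsed-LED scheme just described (one LED illumination per successive imaging frame) can be illustrated with the following loop. The `capture_frame` callable is a stand-in for the camera exposure, not a disclosed API, and the wavelength list is only an example.

```python
def acquire_multispectral_stack(led_wavelengths_nm, capture_frame):
    """Pulse one LED per frame and key each captured frame by the
    wavelength that illuminated it. `capture_frame` is any callable
    taking the active wavelength and returning a frame."""
    stack = {}
    for wavelength in led_wavelengths_nm:
        # In hardware, the LED at `wavelength` is switched on, the frame
        # is exposed, and the LED is switched off before the next pass.
        stack[wavelength] = capture_frame(wavelength)
    return stack

# A stand-in "camera" that just records which wavelength it saw.
frames = acquire_multispectral_stack([365, 470, 550, 660],
                                     lambda nm: f"frame@{nm}nm")
```

The resulting dictionary preserves acquisition order (insertion order in Python), so frames can later be restacked into a wavelength-ordered image cube.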
[0142] In some embodiments, the energy emitter of the multispectral image acquisition device 12A is configured to produce energy emissions with mean wavelengths that are different from one another. In some embodiments, the total number of different energy emissions capable of being produced by the multispectral image acquisition device 12A (i.e., energy emissions with different mean wavelengths) ranges, for example, from 3 to about 50, such as from 3 to 40, such as from 3 to 30, such as from 3 to 20, such as from 3 to 16, such as from 3 to 12. In some embodiments, the energy emitter can include, without limitation, two or more light sources of different mean wavelengths, three or more light sources of different mean wavelengths, four or more light sources of different mean wavelengths, five or more light sources of different mean wavelengths, six or more light sources of different mean wavelengths, seven or more light sources of different mean wavelengths, eight or more light sources of different mean wavelengths, nine or more light sources of different mean wavelengths, ten or more light sources of different mean wavelengths, eleven or more light sources of different mean wavelengths, twelve or more light sources of different mean wavelengths, fifteen or more light sources of different mean wavelengths, twenty or more light sources of different mean wavelengths, etc.
[0143] For instance, the multispectral image acquisition device 12A may include two or more different LEDs, each emitting a different mean wavelength. By way of example, the multispectral image acquisition device 12A or the energy emitter included therein may be configured to produce energy emissions with mean wavelengths of about 365 nm, about 400 nm, about 435 nm, about 470 nm, about 500 nm, about 550 nm, about 580 nm, about 635 nm, about 660 nm, about 690 nm, about 780 nm, and/or about 850 nm.
[0144] In some embodiments, the energy emitter of the multispectral image acquisition device 12A can be a blue light LED having a maximum intensity at a wavelength in the blue region of the spectrum. For example, a blue light LED can have a peak wavelength and/or mean wavelength in a range of about 430 nanometers to about 490 nanometers (nm). In some embodiments, the energy emitter can be a green light LED having a maximum intensity at a wavelength in the green region of the spectrum. For example, the green light LED can have a
peak wavelength and/or mean wavelength in a range of about 490 - 560 nm. In some embodiments, the energy emitter can be an amber light LED having a maximum intensity at a wavelength in the amber region of the spectrum. For example, the amber light can have a peak wavelength and/or mean wavelength in a range of about 570 - 610 nm. In some embodiments, the energy emitter can be a red-light LED having a maximum intensity at a wavelength in the red region of the spectrum. For example, the red light can have a peak wavelength and/or mean wavelength in a range of about 620 - 800 nm.
[0145] In some embodiments, two or more of the LED light sources can be combined, thereby providing processing flexibility. Different arrangements of light sources can be selected to achieve the desired illumination field. For instance, multiple LEDs of specific wavelengths may be combined such that the acquired multispectral image may resemble an RGB brightfield image (allowing a multispectral image acquisition device so configured to be used in place of a brightfield image acquisition device). In some embodiments, the LED light sources can be part of or form a light-emitting panel. In some embodiments, the number, colors, and positions of the LEDs can be selected to achieve the desired illumination.
[0146] In other embodiments, the multispectral image acquisition device 12A may include one or more lasers, halogen light sources, incandescent sources, and/or other devices capable of emitting light. In some embodiments, each source can include a light emitter (e.g., a halogen lamp, an incandescent light source, etc.) that outputs white light and a filter that transmits certain wavelength(s) or waveband(s) of the white light.
[0147] Brightfield Image Acquisition Device / Acquisition of Brightfield Image Data [0148] In some embodiments, the image acquisition device 12 is a brightfield image acquisition device 12B for acquiring brightfield transmission image data of a biological specimen. Brightfield image acquisition devices 12B can include, without limitation, a camera (e.g., an analog camera, a digital camera, etc.), optics (e.g., one or more lenses, sensor focus lens groups, microscope objectives, etc.), imaging sensors (e.g., a charge-coupled device (CCD), a complementary metal-oxide semiconductor (CMOS) image sensor, or the like), photographic film, or the like. In some embodiments, the brightfield image acquisition devices include a plurality of lenses that cooperate to provide on-the-fly focusing. An image sensor, for example a CCD sensor, can capture a digital image of the specimen. In some embodiments, the digitized tissue data may be generated, for example, by an image scanning system, such as a Ventana DP 200® slide scanner
by Ventana Medical Systems, Inc. (Tucson, Arizona) or other suitable imaging equipment. Such a scanner may be used to acquire test and/or training images. In some embodiments, the scanner is used to acquire images of unstained samples (see, e.g., FIG. 22). In other embodiments, the scanner is used to acquire images of stained samples. Additional imaging devices and systems are described further herein.
[0149] The skilled artisan will appreciate that the digital color image acquired by the brightfield image acquisition device may be conventionally composed of elementary color pixels. Each colored pixel can be coded over three digital components, each comprising the same number of bits, each component corresponding to a primary color, generally red, green, or blue, also denoted by the term "RGB" components. The skilled artisan will appreciate that the main difference between the brightfield image acquisition device 12B and the multispectral image acquisition device 12A is that the brightfield image acquisition device 12B uses a single white light source and a sensor with a color filter array (i.e., a Bayer filter) to produce a three-channel (i.e., RGB) color image; while the multispectral image acquisition device 12A uses multiple light sources of different colors turned on in sequence, with the images acquired by a monochrome camera.
[0150] In some embodiments, the multispectral image acquisition device 12A can also be used to acquire "brightfield transmission image data," as that term is used herein, of a biological specimen using RGB colors (e.g., by acquiring image data at wavelengths of about 700 nm, 550 nm, and 470 nm). Thus, a "brightfield transmission image" or a "training brightfield transmission image" can include transmission image data acquired from a multispectral image acquisition device 12A or a brightfield image acquisition device 12B.
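By way of illustration only, selecting the multispectral channels nearest the red, green, and blue wavelengths noted above may be sketched as follows. The array shapes, wavelength ordering, and function name are hypothetical and are not part of the disclosed system; the hypercube is simulated with random data.

```python
import numpy as np

# Hypothetical sketch: a 12-channel multispectral "hypercube" whose channel
# order follows the example wavelength list in paragraph [0143].
wavelengths_nm = [365, 400, 435, 470, 500, 550, 580, 635, 660, 690, 780, 850]
rng = np.random.default_rng(0)
hypercube = rng.random((64, 64, len(wavelengths_nm)))  # H x W x channels

def to_brightfield_rgb(cube, wavelengths, targets=(700, 550, 470)):
    """Stack the channels nearest the target R, G, B wavelengths."""
    idx = [min(range(len(wavelengths)), key=lambda i: abs(wavelengths[i] - t))
           for t in targets]
    return cube[..., idx]  # H x W x 3, ordered R, G, B

rgb = to_brightfield_rgb(hypercube, wavelengths_nm)
```

With the wavelength list above, the nearest channels to 700, 550, and 470 nm are the 690 nm, 550 nm, and 470 nm channels, respectively.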
[0151] Modules
[0152] FIGS. 4A and 4B provide an overview of the systems 200 of the present disclosure and the various modules utilized within the system. In some embodiments, the system 200 employs a computer device or computer-implemented method having one or more processors 209 and one or more memories 201, the one or more memories 201 storing non-transitory computer-readable instructions for execution by the one or more processors to cause the one or more processors to execute certain instructions as described herein.
[0153] Image Acquisition Module
[0154] In some embodiments, the image acquisition module 202 commands a multispectral imaging device 12A and/or a brightfield imaging device 12B to acquire multispectral
and/or brightfield transmission image data, respectively, of a biological specimen (or a portion thereof) disposed on a substrate. In some embodiments, the image acquisition module 202 acquires training image data from one or more training biological specimens for training a virtual staining engine 210. In other embodiments, the image acquisition module 202 acquires test image data from one or more test biological specimens such that a virtual stain may be generated using a trained virtual staining engine 210. The acquired test and/or training transmission image data of the test and/or training biological specimen, respectively, may be stored in one or more memories 201 or one or more storage modules 240 communicatively coupled to the system 200 for downstream processing.
[0155] In the case of acquiring image data from a biological specimen with multispectral image acquisition device 12A, the image acquisition module 202 may command the multispectral image acquisition device 12A to acquire one or more transmission image channel images of the biological specimen, where each transmission image channel image is acquired at a different wavelength (i.e., the multispectral image acquisition device 12A illuminates the biological specimen with an illumination source having a particular wavelength, and then transmission image data is acquired of the biological specimen at that particular wavelength). By way of example, the image acquisition module 202 may command multispectral image acquisition device 12A to acquire transmission image data from a biological specimen at 12 different wavelengths, thereby generating 12 multispectral transmission image channel images (collectively referred to as acquired multispectral transmission image data). Following this example, the plurality of acquired multispectral transmission image channel images, i.e., the acquired multispectral transmission image data, are then stored in one or more memories 201 or one or more storage modules 240 for downstream processing.
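The per-wavelength acquisition loop just described may be sketched as follows, purely for illustration. The function names, array shapes, and the simulated camera are hypothetical stand-ins and not the disclosed system's API; the point shown is only that one channel image is recorded per illumination wavelength and the channels are then stacked into multispectral transmission image data.

```python
import numpy as np

# Example wavelength set from paragraph [0143] (illustrative ordering).
WAVELENGTHS_NM = [365, 400, 435, 470, 500, 550, 580, 635, 660, 690, 780, 850]

def acquire_channel(wavelength_nm, shape=(32, 32)):
    """Stand-in for one camera exposure under a single illumination wavelength."""
    rng = np.random.default_rng(wavelength_nm)  # simulated pixel data
    return rng.random(shape)

def acquire_multispectral(wavelengths=WAVELENGTHS_NM):
    """Illuminate at each wavelength in turn and stack the channel images."""
    channels = [acquire_channel(w) for w in wavelengths]
    return np.stack(channels, axis=-1)  # H x W x 12 multispectral data

data = acquire_multispectral()
```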
[0156] In some embodiments, the image acquisition module 202 may command an image acquisition device 12 to acquire transmission image data for an entire biological specimen. For instance, transmission image data may be acquired for the entirety of a biological specimen disposed on a substrate, e.g., a microscope slide. In other embodiments, the image acquisition module 202 may command an image acquisition device 12 to acquire transmission image data from a portion of a biological specimen. This can be useful where only specific regions of interest of the biological specimen are relevant for analysis. For instance, certain regions of interest may include a specific type of tissue or a comparatively higher population of a specific type of cell as
compared with another region of interest. By way of example, a region of interest may be selected in a biological specimen that includes tissue of interest but excludes tissue not of interest (e.g., tumor tissue versus non-tumor tissue). In these embodiments, the image acquisition module 202 may be programmed to acquire transmission image data from one or more predefined portions of the sample; or may acquire one or more transmission images through random sampling or by sampling at regular intervals across a grid covering the entire sample.
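The sampling strategies mentioned above (predefined regions, random sampling, or sampling at regular intervals across a grid) may be sketched as follows. All sizes, units, and function names are illustrative assumptions only.

```python
import random

def grid_positions(sample_w, sample_h, fov_w, fov_h):
    """FOV origins at regular intervals across a grid covering the sample."""
    return [(x, y)
            for y in range(0, sample_h - fov_h + 1, fov_h)
            for x in range(0, sample_w - fov_w + 1, fov_w)]

def random_positions(sample_w, sample_h, fov_w, fov_h, n, seed=0):
    """n randomly sampled FOV origins within the sample bounds."""
    rng = random.Random(seed)
    return [(rng.randrange(sample_w - fov_w + 1),
             rng.randrange(sample_h - fov_h + 1))
            for _ in range(n)]

# 1000 x 800 sample area tiled by 250 x 200 FOVs -> 4 columns x 4 rows
grid = grid_positions(1000, 800, 250, 200)
sampled = random_positions(1000, 800, 250, 200, n=5)
```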
[0157] Image Processing Module
[0158] In some embodiments, the system 200 further includes an image processing module 212 adapted to process acquired image data. In some embodiments, the image processing module is capable of converting or otherwise transforming acquired transmission image data, including multispectral transmission image data and/or brightfield transmission image data. By way of example, the image processing module 212 may combine two or more image channel images into a multi-channel image (such as without using a dimensionality reduction technique or a compression technique). By way of another example, the image processing module 212 may reduce multispectral image data acquired at four or more different wavelengths into a multichannel RGB image, such as by using a dimensionality reduction method. In other embodiments, the image processing module 212 may coregister one image with another image. For instance, the image processing module 212 may register an image derived from acquired multispectral transmission image data with an acquired transmission brightfield image, to provide a pair of coregistered images for training a virtual staining engine 210. In some embodiments, the image processing module 212 may convert multispectral image data acquired from a stained biological specimen using a multispectral image acquisition device to an RGB image, which may be used as a brightfield image, as described herein.
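The dimensionality reduction mentioned above (reducing, e.g., 12 channels to a 3-channel image) may be sketched with a PCA computed via a singular value decomposition. This is a minimal illustration under assumed shapes, not the module's actual implementation; the simulated cube stands in for acquired multispectral data.

```python
import numpy as np

# Simulated 12-channel multispectral image (shapes are illustrative only).
rng = np.random.default_rng(1)
cube = rng.random((40, 40, 12))

def pca_reduce(cube, n_components=3):
    """Project each pixel's 12-channel spectrum onto its top principal components."""
    h, w, c = cube.shape
    flat = cube.reshape(-1, c)
    flat = flat - flat.mean(axis=0)          # center each channel
    # SVD of the (pixels x channels) matrix yields the principal directions.
    _, _, vt = np.linalg.svd(flat, full_matrices=False)
    scores = flat @ vt[:n_components].T      # per-pixel component scores
    return scores.reshape(h, w, n_components)

reduced = pca_reduce(cube)                   # 40 x 40 x 3 image
```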
[0159] In some embodiments, the image processing module 212 may be further configured to pre-process image data, to identify regions of the image that correspond to the substrate (e.g., a microscope slide) on which the sample is disposed, to identify regions of different tissue types (e.g., connective tissue), or to interpret one or more annotations. In some embodiments, the image processing module 212 may yet further include one or more submodules, such as tissue classification modules, glass recognition modules, etc. The one or more submodules may implement support vector machines and/or neural networks. Examples of overlay generation modules, tissue classification modules, glass / slide recognition modules are described in U.S.
Publication Nos. 2020/0105413, 2021/0027462, 2021/0216746, 2021/0285056 and in U.S. Patent Nos. 11,010,892 and 10,628,658, the disclosures of which are each hereby incorporated by reference herein in their entireties.
[0160] Training Module / Virtual Staining Engine
[0161] In some embodiments, the system 200 further includes a training module 211 adapted to receive pairs of coregistered training images and to use the received pairs of coregistered training images to train a virtual staining engine 210. In some embodiments, the pairs of coregistered training images are used to train one or more machine-learning algorithms, such as a deep neural network. In some embodiments, the deep neural network is based on an implicit generative model, e.g., a generative adversarial network ("GAN"). In a GAN-trained deep neural network 10, two models are used for training. A generative model captures the data distribution, while a second model estimates the probability that a sample came from the training data rather than from the generative model. Details regarding GANs may be found in Goodfellow et al., Generative Adversarial Nets., Advances in Neural Information Processing Systems, 27, pp. 2672-2680 (2014), which is incorporated by reference herein. Network training of the deep neural network 10 (e.g., a GAN) may be performed on the same or a different computing device. For example, in one embodiment, a personal computer may be used to train the GAN, although such training may take a considerable amount of time. To accelerate this training process, one or more dedicated GPUs may be used for training.
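Purely as a toy illustration of the two-model structure referenced above (a generator capturing the data distribution, a discriminator estimating the probability that a sample came from the training data), the following numpy sketch adversarially trains a one-dimensional linear generator against a logistic discriminator on Gaussian "training data." It is not the disclosed image-to-image network; all parameter values, learning rates, and shapes are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
real_data = rng.normal(loc=3.0, scale=0.5, size=(256, 1))  # "training data"

g_w, g_b = np.array([[1.0]]), np.array([0.0])   # generator parameters
d_w, d_b = np.array([[0.1]]), np.array([0.0])   # discriminator parameters
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

lr = 0.05
for step in range(200):
    z = rng.normal(size=(256, 1))
    fake = z @ g_w + g_b                        # generator forward pass
    # Discriminator: gradient ascent on log D(real) + log(1 - D(fake)).
    d_real = sigmoid(real_data @ d_w + d_b)
    d_fake = sigmoid(fake @ d_w + d_b)
    grad_w = (real_data.T @ (1 - d_real) - fake.T @ d_fake) / 256
    grad_b = np.mean(1 - d_real) - np.mean(d_fake)
    d_w, d_b = d_w + lr * grad_w, d_b + lr * grad_b
    # Generator: gradient ascent on log D(fake) (non-saturating loss).
    d_fake = sigmoid(fake @ d_w + d_b)
    g_w = g_w + lr * (z.T @ ((1 - d_fake) * d_w.T)) / 256
    g_b = g_b + lr * np.mean((1 - d_fake) * d_w)
```

Over the training steps, the generator's output distribution drifts toward the real data distribution as the discriminator learns to separate the two.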
[0162] In some embodiments, the system 200 further includes a virtual staining engine 210. In general, the virtual staining engine 210 is trained to generate, from an unstained image of a biological specimen, an image of the biological specimen stained with a morphological stain. In some embodiments, once the deep neural network has been trained, the deep neural network may be used or executed on a different computing device, including one with fewer computational resources than were used for the training process (although GPUs may also be integrated into execution of the trained deep neural network).
[0163] Methods of training a virtual staining engine and using such a trained virtual staining engine to generate a virtually stained slide are disclosed herein. In some embodiments, the system 200 includes multiple virtual staining engines 210, where each virtual staining engine is trained to generate a different virtual stain, where each different virtual stain corresponds to a different morphological stain.
[0164] For example, the deep neural network 10 is trained using a GAN model.
[0165] The skilled artisan will also appreciate that one or more additional modules may be incorporated into the workflow or into system 200. In some embodiments, one or more automated algorithms may be run such that cells may be detected, classified, and/or scored (see, e.g., United States Patent Publication No. 2017/0372117, the disclosure of which is hereby incorporated by reference herein in its entirety).
[0166] METHODS
[0167] The present disclosure provides methods of training a virtual staining engine 210, such as training a virtual staining engine to generate a virtually morphologically stained image derived from an acquired image of an unstained biological specimen. The skilled artisan will appreciate that multiple, different machine-learning algorithms may be trained to generate different virtual stains (e.g., to generate an H&E virtual stain, to generate a basic fuchsin virtual stain, to generate a Masson's Trichrome stain, etc.). The present disclosure also provides methods of using a trained virtual staining engine 210 to generate a virtually stained image of a test unstained biological specimen, from an image of the test unstained biological specimen, where the virtually stained image manifests the appearance of the test unstained biological specimen as if it were stained with a morphological stain. Each of these aspects will be described herein.
[0168] TRAINING A VIRTUAL STAINING ENGINE
[0169] FIG. 6A provides an overview of training a virtual staining engine 210. As a first step, one or more unstained training biological specimens are obtained (step 610). Training multispectral transmission image data is then acquired from each of the one or more unstained training biological specimens (step 611). In some embodiments, at least two training multispectral transmission image channel images are acquired from each of the unstained training biological specimens, each at a different wavelength. In some embodiments, a multi-channel multispectral training transmission image is then generated based on the acquired training multispectral transmission image data for each of the unstained training biological specimens (step 612). For instance, the multi-channel multispectral training transmission image may include image data acquired at three different wavelengths. By way of another example, the multi-channel multispectral training transmission image may include image data acquired at twelve different wavelengths. By way of example, and in some embodiments, twelve different multispectral transmission image channel images (each acquired at a different wavelength) may be reduced to a
multi-channel multispectral training transmission image using a dimensionality reduction method, such as principal component analysis. By way of a further example, and in other embodiments, the twelve different multispectral transmission image channel images (each acquired at a different wavelength) are not compressed, such as not compressed using any dimensionality reduction technique (e.g., PCA).
[0170] The unstained training biological specimen is then stained, such as morphologically stained, such as with a primary stain (e.g., H&E) or with a special stain (e.g., a basic fuchsin stain, a Masson's Trichrome stain, etc.), to provide a stained training biological specimen (step 613). Non-limiting examples of suitable special stains are described further herein. A brightfield training transmission image of the stained training biological specimen is then acquired (step 614). In some embodiments, the brightfield training transmission image of the stained training biological specimen is acquired with a brightfield image acquisition device 12B. In other embodiments, the brightfield training transmission image of the stained training biological specimen is acquired using a multispectral image acquisition device 12A using RGB channels as described further herein.
[0171] The multispectral training image and the brightfield training image are then coregistered to provide at least one pair of coregistered training images (step 615). A first training image of a pair of coregistered training images is a multispectral transmission training image (which, as noted above, is derived from training multispectral transmission image data of an unstained training biological specimen). A second member of the pair of coregistered training images is a brightfield transmission training image (which, as noted above, is derived from training brightfield transmission image data of the same training biological specimen used to generate the multispectral transmission image data but stained with a morphological stain). The plurality of pairs of coregistered training images are utilized by the training module 211 to train a virtual staining engine 210.
[0172] By way of example, twelve different image channel images of a first training tissue specimen may be acquired at twelve different wavelengths and those training multispectral image channel images may be used to generate a multispectral training image of a pair of coregistered training images. The twelve different image channel images are acquired while the first training tissue specimen is unstained. That same first training tissue specimen is then morphologically
stained according to methods known in the art. Following morphological staining, a brightfield transmission training image may be acquired from the stained first training tissue specimen.
[0173] In some embodiments, the multispectral training image and the brightfield training image are optionally segmented to provide a plurality of segmented coregistered training images (e.g., segmented into 64x64 pixel image patches; 128x128 pixel image patches; 256x256 pixel image patches; 512x512 pixel image patches, etc.). The at least one pair of coregistered training images is then provided to a machine-learning algorithm, such as a GAN algorithm, for training (step 616).
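The optional segmentation into fixed-size patches may be sketched as follows. Image sizes and the patch size are illustrative assumptions; the zero arrays merely stand in for a coregistered multispectral/brightfield image pair.

```python
import numpy as np

def to_patches(img, patch=256):
    """Tile an image into non-overlapping patch x patch blocks."""
    h, w = img.shape[:2]
    return [img[y:y + patch, x:x + patch]
            for y in range(0, h - patch + 1, patch)
            for x in range(0, w - patch + 1, patch)]

# Stand-ins for one coregistered training pair (512 x 768 -> 2 x 3 = 6 patches).
multispectral = np.zeros((512, 768, 3))
brightfield = np.zeros((512, 768, 3))
pairs = list(zip(to_patches(multispectral), to_patches(brightfield)))
```

Because the two images are coregistered, tiling both with the same grid yields pixel-aligned patch pairs suitable for training.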
[0174] A plurality of different virtual staining engines 210 may be trained, where each trained virtual staining engine may be trained to generate a different virtual morphological stain of an unstained biological specimen (e.g., different virtual staining engines may be trained to generate virtual H&E stains or virtual special stains, such as a virtual Masson's Trichrome stain). For instance, a virtual staining engine 210 may be trained with training biological specimens that have been stained with H&E. In this specific example, the virtual staining engine 210 is trained to generate a virtually stained image of a test unstained biological specimen where the virtually stained image manifests the appearance of the test unstained biological specimen as if it were stained with H&E. By way of another example, a virtual staining engine 210 may be trained with training biological specimens that have been stained with the special stain basic fuchsine. Following this second example, the virtual staining engine 210 is trained to generate a virtually stained image of a test unstained biological specimen where the virtually stained image manifests the appearance of the test unstained biological specimen as if it were stained with basic fuchsine. By way of yet another example, a virtual staining engine 210 may be trained with training biological specimens that have been stained with Masson's Trichrome. Following this third example, the virtual staining engine 210 is trained to generate a virtually stained image of a test unstained biological specimen where the virtually stained image manifests the appearance of the test unstained biological specimen as if it were stained with Masson's Trichrome (see, e.g., FIG. 21). Following these examples even further, each differently trained virtual staining engine may then be used to generate one or more virtually stained output images based on a single acquired multispectral transmission image of an unstained biological specimen.
[0175] An exemplary method of training a virtual staining engine is set forth in FIG. 6B. In particular, FIG. 6B shows the general workflow of a non-limiting training process in accordance
with the present disclosure. In some embodiments, a tissue sample (e.g., a biopsy sample, a resection sample, an FFPE sample, etc., obtained through one or more preanalytical steps) is processed (step 650), and sections are cut (step 651), such as on a microtome, such as at a thickness of about 3 µm to about 5 µm. In some embodiments, the slides are then baked for 5 minutes, dewaxed using heat to melt the wax, and rinsed with EZ Prep to remove paraffin (step 652). In some embodiments, after the slides have been dewaxed, they are placed on a multispectral scanner stage, such as one of the multispectral scanners described herein. After the area of interest is selected/detected, either by the user or by automatic means, the sample is illuminated with a first (of multiple) wavelength (step 653), with a predefined illumination power (controlled by a pulse width modulation circuit to reduce/augment the duty cycle of the LED light source). In some embodiments, the objective of the scanner is focused by moving the sample or the objective, and a first field of view (FOV) is digitized with the camera using a predefined exposure time and gain. In some embodiments, the sample is then illuminated with a second wavelength (step 653). In some embodiments, the objective is refocused for the second wavelength using a closed-loop focusing algorithm that iterates until the right focus is found (e.g., wavelength-controlled autofocus). In other embodiments, only the focus of a central wavelength is calibrated through the autofocus, and the Z distance for the other wavelengths is calculated (such as by using predefined offsets per wavelength). In yet other embodiments, only one central wavelength is focused (with a feedback-controlled algorithm), and all the other wavelengths are digitized without moving the Z distance in the same FOV.
In some embodiments, this process is repeated until the FOV has been digitized at all the different wavelengths (step 653). Next, the stage is moved to a different location. In some embodiments, the next location is in the vicinity of the previous location, with some area overlapping. In other embodiments, the next location is a non-adjacent location, to collect non-neighboring areas, speeding up the digitization and augmenting the variety of the dataset. In some embodiments, each FOV is stored in memory independently. In other embodiments, the FOVs are stitched together to form a whole slide hypercube.
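The per-FOV scanning loop with predefined per-wavelength focus offsets described above may be sketched as follows. The wavelength set, offset values, and function names are invented for illustration; the `autofocus` stand-in represents the feedback-controlled focus search at the central wavelength.

```python
# Focus once at a central wavelength, then apply predefined Z offsets for the
# other wavelengths instead of refocusing each time (offsets are hypothetical).
WAVELENGTHS_NM = [470, 550, 635]
Z_OFFSET_UM = {470: -0.4, 550: 0.0, 635: 0.5}

def autofocus(central_wavelength_nm):
    """Stand-in for the feedback-controlled focus search; returns best Z (µm)."""
    return 100.0  # simulated focus position

def scan_fov(fov_id):
    z_central = autofocus(550)
    frames = {}
    for w in WAVELENGTHS_NM:
        z = z_central + Z_OFFSET_UM[w]   # move to the offset focus position
        frames[w] = (fov_id, w, z)       # placeholder for one captured frame
    return frames

# Digitize 4 FOVs, each at all 3 wavelengths.
hypercube_fovs = [scan_fov(i) for i in range(4)]
```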
[0176] In some embodiments, once the multispectral scanning of all the AOIs is finished (step 653), the next step is to stain the sample with the desired stain assay (e.g., H&E, a special stain such as Masson's Trichrome, or any of the other stains described herein) following typical protocols; the slide is then coverslipped (step 654). In some embodiments, once the
slide is stained and coverslipped, it is introduced into the stage of the same multispectral scanner. In some embodiments, the above-mentioned imaging process is repeated but only for a subset of wavelengths, such as the 3 wavelengths that closely match the red (620 - 750 nm), green (495 - 570 nm), and blue (400 - 480 nm) wavelengths (step 655). Other wavelengths may be utilized in other embodiments. In some embodiments, the positions of the FOVs of the stained images are matched to the positions of the FOVs of the unstained sample by the use of fiducials located in the sample. In some embodiments, these fiducials are present in the slide (edges, tags, etc.). In other embodiments, these fiducials are engraved in the slide by mechanical or laser means (e.g., circles laser-engraved in the slide). In some embodiments, once the multispectral digital microscope finishes scanning all the AOIs, a whole slide image (WSI) of the raw data is stored to disk.
[0177] In some embodiments, this WSI of the stained sample is then processed to apply a previously calibrated color correction matrix to correct the colors and perform a white balancing, in a manner such that the image closely resembles what a user would see using a brightfield microscope. It is believed that an advantage of this method is that, as the images are scanned on the same scanner, the resolution of the digital files matches perfectly or substantially perfectly. It is also believed that any optical artifacts will be similar. In some embodiments, the spectral images of the unstained sample are expected to be coregistered. Also, both unstained and stained FOVs are going to be very closely registered to each other, simplifying the next step of coregistration.
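Applying a calibrated color correction matrix followed by white balancing may be sketched as follows. The 3 x 3 matrix values and white-reference patch are invented for illustration (a real matrix would come from scanner calibration); the sketch only shows the order of operations: matrix correction, then per-channel gains chosen so the white reference maps to full white.

```python
import numpy as np

# Hypothetical calibrated color correction matrix (rows sum to 1.0 here so the
# example is easy to follow; real calibration values would differ).
ccm = np.array([[ 1.10, -0.05, -0.05],
                [-0.04,  1.08, -0.04],
                [-0.02, -0.03,  1.05]])

def correct(raw_rgb, ccm, white_patch):
    """Apply the CCM, then scale each channel so the white reference maps to 1.0."""
    corrected = raw_rgb @ ccm.T
    gains = 1.0 / (white_patch @ ccm.T)       # per-channel white-balance gains
    return np.clip(corrected * gains, 0.0, 1.0)

raw = np.full((8, 8, 3), 0.9)                 # simulated raw background pixels
out = correct(raw, ccm, white_patch=np.array([0.9, 0.9, 0.9]))
```

Here the background pixels equal the white reference, so after correction they map exactly to white (1.0 in every channel).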
[0178] In some embodiments, a dimensionality reduction is performed. In some embodiments, a Principal Component Analysis (PCA) reduction is performed on the spectral hypercube images to reduce the channels from n (e.g., 12 channels) to 3 (step 656) (see FIG. 6B). In other embodiments, no dimensionality reduction (e.g., PCA) is performed, and a machine learning algorithm (such as a GAN, described herein) is utilized that can accept data (e.g., multispectral data) from more than three channels (see FIG. 6C).
[0179] In some embodiments, the principal component values are normalized to values ranging from 0 to 255 and coded into the channels of an RGB image, where the first channel (i.e., red) is the first principal component, the second channel (i.e., green) is the second principal component, and the third channel (i.e., blue) is the third principal component (step 657). It is believed that an advantage of this method is that the image complexity is reduced. In some embodiments, data from a stained slide is coded into an RGB image by performing a white
balancing color correction to adjust the balance of each channel to achieve a white background as is typically viewed with a light microscope or from a calibrated digital scanner (steps 658A and 658B). In some embodiments, the hypercube images are used raw with no further processing.
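The normalization of the three principal-component planes to a 0-255 RGB coding (step 657) may be sketched as follows, with simulated component scores standing in for the PCA output; the per-channel min/max normalization shown is one straightforward choice, assumed here for illustration.

```python
import numpy as np

# Simulated PCA score planes for one image (3 components, illustrative shape).
rng = np.random.default_rng(2)
components = rng.normal(size=(40, 40, 3))

def components_to_rgb(comp):
    """Normalize each component plane to 0-255 and code as 8-bit R, G, B."""
    out = np.empty_like(comp)
    for c in range(comp.shape[-1]):
        plane = comp[..., c]
        lo, hi = plane.min(), plane.max()
        out[..., c] = (plane - lo) / (hi - lo) * 255.0
    return out.astype(np.uint8)

rgb8 = components_to_rgb(components)  # first component -> red, etc.
```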
[0180] In some embodiments, once the dataset is complete (unstained and stained), the next step is to coregister the images at a pixel level (steps 659A and 659B). To do so, in some embodiments an automatic algorithm transforms both images (e.g., the PCA image and the brightfield or ground truth) by applying filters (e.g., Laplacian, Sobel, etc.) so that the general morphology of the tissue is enhanced in such a way that both filtered images will reveal similar structures (so-called descriptors). In some embodiments, a Fourier-based correlation algorithm is then applied, translating and rotating the filtered image of the ground truth to find the position and rotation that lead to the highest correlation. In some embodiments, the required transformations (translation and rotation) are applied to the ground truth, producing two coarsely registered images (PCA and transformed brightfield).
[0181] In some embodiments, the next step is to find the FOVs of the stitched image if dealing with a WSI. To do so, in some embodiments, a Laplacian filter is applied to "look for" discontinuities, revealing the stitch lines. In some embodiments, both images are then divided into individual field of view (FOV) images, and the above-described process is repeated, filtering paired images to reveal descriptors and finding the best correlation by translating and rotating the filtered version of the FOV containing the ground truth. Once the best position is found, the transformation is applied to the FOV with no filtering, and the images are cut to discard black areas created by rotating or moving the ground truth, the result being two coregistered images (PCA and ground truth). Finally, each registered FOV is further divided into tiles, such as tiles of 270 x 270 pixels. In some embodiments, the process is repeated at the tile level, creating pixel-level registered tiles of 256 x 256 pixel images (steps 660A and 660B). In some embodiments, the images are then saved into one or more memories communicatively coupled to the systems of the present disclosure, such as to a disk, to train a GAN algorithm (step 661).
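The translation part of the Fourier-based correlation described above may be sketched as follows: the cross-correlation between the two images is computed with FFTs, and the location of its peak gives the integer shift. Rotation search and the descriptor-enhancing filters are omitted for brevity; the random test image and function name are illustrative assumptions only.

```python
import numpy as np

def translation_offset(fixed, moving):
    """Recover the integer (dy, dx) such that fixed ~= moving shifted by (dy, dx)."""
    # Circular cross-correlation via the Fourier correlation theorem.
    corr = np.fft.ifft2(np.fft.fft2(fixed) * np.conj(np.fft.fft2(moving))).real
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    h, w = corr.shape
    if dy > h // 2:          # map peak location to signed shifts
        dy -= h
    if dx > w // 2:
        dx -= w
    return int(dy), int(dx)

rng = np.random.default_rng(3)
base = rng.random((64, 64))
shifted = np.roll(np.roll(base, 5, axis=0), -3, axis=1)  # known shift (5, -3)
offset = translation_offset(shifted, base)               # recovers (5, -3)
```

In practice this correlation would be computed on the filtered (descriptor-enhanced) images, and the search would be repeated over candidate rotations of the ground truth.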
[0182] In some embodiments, the tissue section previously stained with H&E is then processed to remove the coverslip and remove the staining (such as by chemically removing the staining), is stained again with a different assay (e.g., Masson's trichrome), and a coverslip is placed again. The re-stained slide is then scanned in the multispectral microscope again, following the same procedure mentioned before (3 colors matching RGB), and coregistered to the unstained hypercubes to create a different dataset that can be used to train a GAN algorithm.
[0183] Training Biological Specimens
[0184] In some embodiments, the multispectral and brightfield image data are acquired from one or more training biological specimens. The training biological specimens may be obtained from any source. For instance, the training biological specimens may be obtained from a tumor, including, for example, tumor biopsy samples, resection samples, cell smears, fine needle aspirates (FNA), liquid-based cytology samples, and the like. In some embodiments, the obtained training biological specimens are histological specimens. In other embodiments, the obtained training biological specimens are cytological specimens.
[0185] As noted herein, training multispectral transmission image data and training brightfield transmission image data is acquired from the same biological specimen, i.e., training multispectral transmission image data is acquired from the biological specimen when it is unstained, and training brightfield transmission image data is acquired from the same biological specimen after it is stained, such as stained with a primary stain (e.g., H&E), a special stain, or any of the other stains known in the art or as described herein.
[0186] In some embodiments, a plurality of training biological specimens is obtained and each training biological specimen of the plurality of obtained training biological specimens is used to train a different virtual staining engine (see, e.g., FIG. 7). For instance, training multispectral transmission image data may be acquired from a first training biological specimen in an unstained state; and then training brightfield transmission image data may be acquired from the first training biological specimen after staining with H&E. The training multispectral transmission image data and the training brightfield transmission image data may then be used to train an H&E virtual staining engine. By way of another example, training multispectral transmission image data may be acquired from a second training biological specimen in an unstained state; and then training brightfield transmission image data may be acquired from the second training biological specimen after staining with a first special stain. The training multispectral transmission image data and the training brightfield transmission image data may then be used to train a first special stain virtual staining engine.
[0187] In other embodiments, a plurality of training biological specimens is obtained and each training biological specimen of the plurality of obtained training biological specimens is used
to train the same virtual staining engine. In some embodiments, the obtained plurality of training biological specimens used to train the same virtual staining engine are derived from the same source (e.g., same patient but different tissue blocks; same tissue block but different serial sections having different thicknesses, fixation times, etc.) (see, e.g., FIG. 8A). In other embodiments, the obtained plurality of training biological specimens used to train the same virtual staining engine are derived from different sources (e.g., different patients). For instance, training multispectral transmission image data may be acquired from a first training biological specimen in an unstained state; and then training brightfield transmission image data may be acquired from the first training biological specimen after staining with H&E. Training multispectral transmission image data may be acquired from a second training biological specimen in an unstained state; and then training brightfield transmission image data may be acquired from the second training biological specimen after staining with H&E. The training multispectral transmission image data and the training brightfield transmission image data derived from the first and second training biological specimens may then be used to train an H&E virtual staining engine. It is believed that training the same virtual staining engine from training biological specimens obtained from different sources allows for any lab-to-lab variations, tissue-to-tissue variations, fixation level/quality variations, etc. to be accounted for during training of the virtual staining engine.
[0188] In some embodiments, a plurality of training biological specimens is obtained where each of the obtained plurality of training biological specimens is of the same tissue type (e.g., tonsil tissue) and where the obtained plurality of training biological specimens of the same tissue type is used to train the same virtual staining engine. In other embodiments, a plurality of training biological specimens is obtained where each of the obtained plurality of training biological specimens is of a different tissue type and where the obtained plurality of training biological specimens of the different tissue types is used to train the same virtual staining engine.
[0189] In some embodiments, a plurality of training biological specimens of the same type is obtained and each training biological specimen of the plurality of obtained training biological specimens of the same type is used to train a different virtual staining engine. In some embodiments, the plurality of training biological specimens of the same type is obtained from different sources. In other embodiments, the plurality of training biological specimens of the same type are different serial sections derived from the same tissue block, and where each different training serial section could be used to train a different virtual staining engine (see, e.g., FIG. 9).
[0190] In other embodiments, a plurality of training biological specimens of the same type is obtained, but where each of the obtained plurality of training biological specimens of the same type has a different thickness, a different type of glass substrate, and/or a different fixation state, etc.; and where each training biological specimen of the plurality of obtained training biological specimens of the same type is used to train the same virtual staining engine.
[0191] Training Multispectral Transmission Image Data / Training Multispectral Transmission Image Channel Images Derived from Training Unstained Biological Specimens
[0192] Training multispectral transmission image data is acquired from each of a plurality of training unstained biological specimens. In some embodiments, two or more training multispectral transmission image channel images are acquired for each training unstained biological specimen of the plurality of training unstained biological specimens, where each of the two or more training multispectral transmission image channel images is acquired at a specific wavelength. In some embodiments, the two or more training multispectral transmission image channel images may be processed and/or combined in one or more downstream operations.
[0193] In some embodiments, at least two training multispectral transmission image channel images are acquired for each unstained training biological specimen, where each of the at least two training multispectral transmission image channel images is acquired using a multispectral image acquisition device 12A configured to illuminate each unstained training biological specimen with at least two different illumination sources; and further configured to acquire transmission image data (e.g., at least two multispectral image channel images) of the biological specimen illuminated with the at least two different illumination sources. Said another way, at least two training multispectral transmission image channel images are acquired for each unstained training biological specimen, where each of the at least two training multispectral transmission image channel images is acquired at a different wavelength. In some embodiments, four training multispectral transmission image channel images are obtained from at least four different illumination sources. In some embodiments, five training multispectral transmission image channel images are obtained from at least five different illumination sources. In some embodiments, six training multispectral transmission image channel images are obtained from at least six different illumination sources. In some embodiments, seven training multispectral transmission image channel images are obtained from at least seven different illumination sources. In some embodiments, eight training multispectral transmission image channel images are obtained from at least eight different illumination sources. In some embodiments, nine training multispectral transmission image channel images are obtained from at least nine different illumination sources. In some embodiments, ten training multispectral transmission image channel images are obtained from at least ten different illumination sources. In some embodiments, eleven training multispectral transmission image channel images are obtained from at least eleven different illumination sources. In some embodiments, twelve training multispectral transmission image channel images are obtained from at least twelve different illumination sources. In some embodiments, training multispectral transmission image channel images are obtained from twelve or more illumination sources. In some embodiments, training multispectral transmission image channel images are obtained from sixteen or more different illumination sources. In some embodiments, training multispectral transmission image channel images are obtained from twenty or more different illumination sources. In some embodiments, training multispectral transmission image channel images are obtained from twenty-four or more different illumination sources.
[0194] In some embodiments, training multispectral transmission image channel images are acquired from a multispectral image acquisition device 12A configured to illuminate a training biological specimen with two or more different illumination sources (see, e.g., FIG. 5), where the at least two different illumination sources are from within the ultraviolet (UV) spectrum, from within the visible spectrum, and/or from within the infrared (IR) spectrum. In some embodiments, the at least two different illumination sources comprise (i) at least one illumination source within the ultraviolet (UV) spectrum; (ii) at least one illumination source within the visible spectrum; and (iii) at least one illumination source within the infrared spectrum.
[0195] In some embodiments, training multispectral transmission image channel images are acquired from a multispectral image acquisition device 12A configured to illuminate a training biological specimen with six or more different illumination sources, where the at least six different illumination sources are from within the ultraviolet (UV) spectrum, from within the visible spectrum, and/or from within the infrared (IR) spectrum. In some embodiments, the at least six different illumination sources comprise (i) at least one illumination source within the ultraviolet (UV) spectrum; (ii) at least one illumination source within the visible spectrum; and (iii) at least one illumination source within the infrared spectrum.
[0196] In some embodiments, training multispectral transmission image channel images are acquired from a multispectral image acquisition device 12A configured to illuminate a training biological specimen with twelve or more different illumination sources, where the at least twelve different illumination sources are from within the ultraviolet (UV) spectrum, from within the visible spectrum, and/or from within the infrared (IR) spectrum. In some embodiments, the at least twelve different illumination sources comprise (i) at least one illumination source within the ultraviolet (UV) spectrum; (ii) at least one illumination source within the visible spectrum; and (iii) at least one illumination source within the infrared spectrum.
[0197] In some embodiments, training multispectral transmission image channel images are acquired from a multispectral image acquisition device 12A configured to illuminate a training biological specimen with two or more different illumination sources, where at least two of the illumination sources are from at least two different wavelengths within the UV spectrum. In some embodiments, training multispectral transmission image channel images are acquired from a multispectral image acquisition device 12A configured to illuminate a training biological specimen with two or more different illumination sources, where at least two of the illumination sources are from at least two different wavelengths within the visible spectrum. In some embodiments, training multispectral transmission image channel images are acquired from a multispectral image acquisition device 12A configured to illuminate a training biological specimen with two or more different illumination sources, where at least two of the illumination sources are from at least two different wavelengths within the IR spectrum.
[0198] In some embodiments, training multispectral transmission image channel images are acquired from a multispectral image acquisition device 12A configured to illuminate a training biological specimen with four or more different illumination sources (such as narrow illumination sources), wherein training multispectral transmission image channel images are acquired after illuminating the biological specimen with (i) at least two different illumination sources within the UV spectrum; (ii) at least two different illumination sources within the visible spectrum; and (iii) at least two different illumination sources within the IR spectrum. In some embodiments, training multispectral transmission image channel images are acquired from a multispectral image acquisition device 12A configured to illuminate a training biological specimen with six or more different illumination sources, wherein training multispectral transmission image channel images are acquired after illuminating the biological specimen with (i) at least two different illumination sources within the UV spectrum; (ii) at least two different illumination sources within the visible spectrum; and (iii) at least two different illumination sources within the IR spectrum. In some embodiments, training multispectral transmission image channel images are acquired from a multispectral image acquisition device 12A configured to illuminate a training biological specimen with eight or more different illumination sources, wherein training multispectral transmission image channel images are acquired after illuminating the biological specimen with (i) at least two different illumination sources within the UV spectrum; (ii) at least two different illumination sources within the visible spectrum; and (iii) at least two different illumination sources within the IR spectrum. In some embodiments, training multispectral transmission image channel images are acquired from a multispectral image acquisition device 12A configured to illuminate a training biological specimen with ten or more different illumination sources, wherein training multispectral transmission image channel images are acquired after illuminating the biological specimen with (i) at least two different illumination sources within the UV spectrum; (ii) at least two different illumination sources within the visible spectrum; and (iii) at least two different illumination sources within the IR spectrum. In some embodiments, training multispectral transmission image channel images are acquired from a multispectral image acquisition device 12A configured to illuminate a training biological specimen with twelve or more different illumination sources, wherein training multispectral transmission image channel images are acquired after illuminating the biological specimen with (i) at least two different illumination sources within the UV spectrum; (ii) at least two different illumination sources within the visible spectrum; and (iii) at least two different illumination sources within the IR spectrum. In some embodiments, training multispectral transmission image channel images are acquired from a multispectral image acquisition device 12A configured to illuminate a training biological specimen with sixteen or more different illumination sources, wherein training multispectral transmission image channel images are acquired after illuminating the biological specimen with (i) at least two different illumination sources within the UV spectrum; (ii) at least two different illumination sources within the visible spectrum; and (iii) at least two different illumination sources within the IR spectrum. In some embodiments, training multispectral transmission image channel images are acquired from a multispectral image acquisition device 12A configured to illuminate a training biological specimen with twenty or more different illumination sources, wherein training multispectral transmission image channel images are acquired after illuminating the biological specimen with (i) at least two different illumination sources within the UV spectrum; (ii) at least two different illumination sources within the visible spectrum; and (iii) at least two different illumination sources within the IR spectrum.
[0199] In some embodiments, training multispectral transmission image channel images are acquired from a multispectral image acquisition device 12A configured to illuminate a training biological specimen with nine or more different illumination sources, wherein training multispectral transmission image channel images are acquired after illuminating the biological specimen with (i) at least two different illumination sources within the UV spectrum; (ii) at least two different illumination sources within the visible spectrum; and (iii) at least three different illumination sources within the IR spectrum. In some embodiments, training multispectral transmission image channel images are acquired from a multispectral image acquisition device 12A configured to illuminate a training biological specimen with twelve or more different illumination sources, wherein training multispectral transmission image channel images are acquired after illuminating the biological specimen with (i) at least two different illumination sources within the UV spectrum; (ii) at least two different illumination sources within the visible spectrum; and (iii) at least three different illumination sources within the IR spectrum.
[0200] In some embodiments, training multispectral transmission image channel images are acquired from a multispectral image acquisition device 12A configured to illuminate a training biological specimen with two or more different illumination sources, where the two or more different illumination sources differ by at least 120 nm. In some embodiments, training multispectral transmission image channel images are acquired from a multispectral image acquisition device 12A configured to illuminate a training biological specimen with two or more different illumination sources, where the two or more different illumination sources differ by at least 100 nm. In some embodiments, training multispectral transmission image channel images are acquired from a multispectral image acquisition device 12A configured to illuminate a training biological specimen with two or more different illumination sources, where the two or more different illumination sources differ by at least 80 nm. In some embodiments, training multispectral transmission image channel images are acquired from a multispectral image acquisition device 12A configured to illuminate a training biological specimen with two or more different illumination sources, where the two or more different illumination sources differ by at least 60 nm. In some embodiments, training multispectral transmission image channel images are acquired from a multispectral image acquisition device 12A configured to illuminate a training biological specimen with two or more different illumination sources, where the two or more different illumination sources differ by at least 40 nm.
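The spacing constraint above is straightforward to check in software. As a brief illustration (the function name `min_separation_nm` and the example wavelengths are illustrative, not taken from the disclosure):

```python
def min_separation_nm(wavelengths_nm):
    """Smallest pairwise difference (in nm) between illumination center wavelengths."""
    ws = sorted(wavelengths_nm)
    return min(b - a for a, b in zip(ws, ws[1:]))

# A hypothetical 4-source configuration spanning UV, visible, and IR;
# every pair of sources differs by at least 100 nm (smallest gap: 470 - 365 = 105).
sources_nm = [365, 470, 580, 850]
assert min_separation_nm(sources_nm) >= 100
```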
[0201] In some embodiments, training multispectral transmission image channel images are acquired from a multispectral image acquisition device 12A configured to illuminate a training biological specimen with two or more different illumination sources, where the at least two different illumination sources comprise (i) at least one illumination source within the ultraviolet (UV) spectrum; (ii) at least one illumination source within the visible spectrum; and (iii) at least one illumination source within the infrared spectrum; and wherein the at least three different illumination sources have mean wavelengths of about 365 +/- 300 nm, about 400 +/- 300 nm, about 435 +/- 300 nm, about 470 +/- 300 nm, about 500 +/- 300 nm, about 550 +/- 300 nm, about 580 +/- 300 nm, about 635 +/- 300 nm, about 660 +/- 300 nm, about 690 +/- 300 nm, about 780 +/- 300 nm, and/or about 850 +/- 300 nm.
[0202] In some embodiments, training multispectral transmission image channel images are acquired from a multispectral image acquisition device 12A configured to illuminate a training biological specimen with two or more different illumination sources, where the at least two different illumination sources comprise (i) at least one illumination source within the ultraviolet (UV) spectrum; (ii) at least one illumination source within the visible spectrum; and (iii) at least one illumination source within the infrared spectrum; and wherein the at least three different illumination sources have mean wavelengths of about 365 +/- 200 nm, about 400 +/- 200 nm, about 435 +/- 200 nm, about 470 +/- 200 nm, about 500 +/- 200 nm, about 550 +/- 200 nm, about 580 +/- 200 nm, about 635 +/- 200 nm, about 660 +/- 200 nm, about 690 +/- 200 nm, about 780 +/- 200 nm, and/or about 850 +/- 200 nm.
[0203] In some embodiments, training multispectral transmission image channel images are acquired from a multispectral image acquisition device 12A configured to illuminate a training biological specimen with two or more different illumination sources, where the at least two different illumination sources comprise (i) at least one illumination source within the ultraviolet (UV) spectrum; (ii) at least one illumination source within the visible spectrum; and (iii) at least one illumination source within the infrared spectrum; and wherein the at least three different illumination sources have mean wavelengths of about 365 +/- 100 nm, about 400 +/- 100 nm, about 435 +/- 100 nm, about 470 +/- 100 nm, about 500 +/- 100 nm, about 550 +/- 100 nm, about 580 +/- 100 nm, about 635 +/- 100 nm, about 660 +/- 100 nm, about 690 +/- 100 nm, about 780 +/- 100 nm, and/or about 850 +/- 100 nm.
[0204] In some embodiments, training multispectral transmission image channel images are acquired from a multispectral image acquisition device 12A configured to illuminate a training biological specimen with two or more different illumination sources, where the at least two different illumination sources comprise (i) at least one illumination source within the ultraviolet (UV) spectrum; (ii) at least one illumination source within the visible spectrum; and (iii) at least one illumination source within the infrared spectrum; and wherein the at least three different illumination sources have mean wavelengths of about 365 +/- 50 nm, about 400 +/- 50 nm, about 435 +/- 50 nm, about 470 +/- 50 nm, about 500 +/- 50 nm, about 550 +/- 50 nm, about 580 +/- 50 nm, about 635 +/- 50 nm, about 660 +/- 50 nm, about 690 +/- 50 nm, about 780 +/- 50 nm, and/or about 850 +/- 50 nm.
[0205] In some embodiments, training multispectral transmission image channel images are acquired from a multispectral image acquisition device 12A configured to illuminate a training biological specimen with two or more different illumination sources, where the at least two different illumination sources comprise (i) at least one illumination source within the ultraviolet (UV) spectrum; (ii) at least one illumination source within the visible spectrum; and (iii) at least one illumination source within the infrared spectrum; and wherein the at least three different illumination sources have mean wavelengths of about 365 +/- 30 nm, about 400 +/- 30 nm, about 435 +/- 30 nm, about 470 +/- 30 nm, about 500 +/- 30 nm, about 550 +/- 30 nm, about 580 +/- 30 nm, about 635 +/- 30 nm, about 660 +/- 30 nm, about 690 +/- 30 nm, about 780 +/- 30 nm, and/or about 850 +/- 30 nm.
[0206] In some embodiments, training multispectral transmission image channel images are acquired from a multispectral image acquisition device 12A configured to illuminate a training biological specimen with two or more different illumination sources, where the at least two different illumination sources comprise (i) at least one illumination source within the ultraviolet (UV) spectrum; (ii) at least one illumination source within the visible spectrum; and (iii) at least one illumination source within the infrared spectrum; and wherein the at least three different illumination sources have mean wavelengths of about 365 +/- 20 nm, about 400 +/- 20 nm, about 435 +/- 20 nm, about 470 +/- 20 nm, about 500 +/- 20 nm, about 550 +/- 20 nm, about 580 +/- 20 nm, about 635 +/- 20 nm, about 660 +/- 20 nm, about 690 +/- 20 nm, about 780 +/- 20 nm, and/or about 850 +/- 20 nm.
[0207] In some embodiments, training multispectral transmission image channel images are acquired from a multispectral image acquisition device 12A configured to illuminate a training biological specimen with two or more different illumination sources, where the at least two different illumination sources comprise (i) at least one illumination source within the ultraviolet (UV) spectrum; (ii) at least one illumination source within the visible spectrum; and (iii) at least one illumination source within the infrared spectrum; and wherein the at least three different illumination sources have mean wavelengths of about 365 nm, about 400 nm, about 435 nm, about 470 nm, about 500 nm, about 550 nm, about 580 nm, about 635 nm, about 660 nm, about 690 nm, about 780 nm, and/or about 850 nm.
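Conditions (i)-(iii) above require the source set to span the UV, visible, and IR bands. That coverage check can be sketched in a few lines of Python; the band edges used here (UV below 400 nm, visible 400-700 nm, IR above 700 nm) are conventional approximations rather than values taken from the disclosure, and the function names are illustrative:

```python
def band(wavelength_nm):
    # Conventional approximate band edges: UV < 400 nm, visible 400-700 nm, IR > 700 nm.
    if wavelength_nm < 400:
        return "UV"
    if wavelength_nm <= 700:
        return "visible"
    return "IR"

def covers_uv_visible_ir(wavelengths_nm):
    """True if a source set includes at least one UV, one visible,
    and one IR source, mirroring conditions (i)-(iii)."""
    return {"UV", "visible", "IR"} <= {band(w) for w in wavelengths_nm}

# The twelve mean wavelengths recited above span all three bands.
centers_nm = [365, 400, 435, 470, 500, 550, 580, 635, 660, 690, 780, 850]
assert covers_uv_visible_ir(centers_nm)
```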
[0208] In some embodiments, training multispectral transmission image channel images are acquired from a multispectral image acquisition device 12A configured to illuminate a training biological specimen with six or more different illumination sources, where the at least six different illumination sources comprise (i) at least one illumination source within the ultraviolet (UV) spectrum; (ii) at least two illumination sources within the visible spectrum; and (iii) at least one illumination source within the infrared spectrum; and wherein the at least six different illumination sources have mean wavelengths of about 365 +/- 20 nm, about 400 +/- 20 nm, about 435 +/- 20 nm, about 470 +/- 20 nm, about 500 +/- 20 nm, about 550 +/- 20 nm, about 580 +/- 20 nm, about 635 +/- 20 nm, about 660 +/- 20 nm, about 690 +/- 20 nm, about 780 +/- 20 nm, and/or about 850 +/- 20 nm. In some embodiments, training multispectral transmission image channel images are acquired from a multispectral image acquisition device 12A configured to illuminate a training biological specimen with six or more different illumination sources, where the at least six different illumination sources comprise (i) at least one illumination source within the ultraviolet (UV) spectrum; (ii) at least two illumination sources within the visible spectrum; and (iii) at least one illumination source within the infrared spectrum; and wherein the at least six different illumination sources have mean wavelengths of about 365 +/- 10 nm, about 400 +/- 10 nm, about 435 +/- 10 nm, about 470 +/- 10 nm, about 500 +/- 10 nm, about 550 +/- 10 nm, about 580 +/- 10 nm, about 635 +/- 10 nm, about 660 +/- 10 nm, about 690 +/- 10 nm, about 780 +/- 10 nm, and/or about 850 +/- 10 nm. In some embodiments, training multispectral transmission image channel images are acquired from a multispectral image acquisition device 12A configured to illuminate a training biological specimen with six or more different illumination sources, where the at least six different illumination sources comprise (i) at least one illumination source within the ultraviolet (UV) spectrum; (ii) at least two illumination sources within the visible spectrum; and (iii) at least one illumination source within the infrared spectrum; and wherein the at least six different illumination sources have mean wavelengths of about 365 nm, about 400 nm, about 435 nm, about 470 nm, about 500 nm, about 550 nm, about 580 nm, about 635 nm, about 660 nm, about 690 nm, about 780 nm, and/or about 850 nm.
[0209] In some embodiments, training multispectral transmission image channel images are acquired from a multispectral image acquisition device 12A configured to illuminate a training biological specimen with twelve or more different illumination sources, where the at least twelve different illumination sources comprise (i) at least one illumination source within the ultraviolet (UV) spectrum; (ii) at least two illumination sources within the visible spectrum; and (iii) at least one illumination source within the infrared spectrum; and wherein the at least twelve different illumination sources have mean wavelengths of about 365 +/- 20 nm, about 400 +/- 20 nm, about 435 +/- 20 nm, about 470 +/- 20 nm, about 500 +/- 20 nm, about 550 +/- 20 nm, about 580 +/- 20 nm, about 635 +/- 20 nm, about 660 +/- 20 nm, about 690 +/- 20 nm, about 780 +/- 20 nm, and/or about 850 +/- 20 nm. In some embodiments, training multispectral transmission image channel images are acquired from a multispectral image acquisition device 12A configured to illuminate a training biological specimen with twelve or more different illumination sources, where the at least twelve different illumination sources comprise (i) at least one illumination source within the ultraviolet (UV) spectrum; (ii) at least two illumination sources within the visible spectrum; and (iii) at least one illumination source within the infrared spectrum; and wherein the at least twelve different illumination sources have mean wavelengths of about 365 +/- 10 nm, about 400 +/- 10 nm, about 435 +/- 10 nm, about 470 +/- 10 nm, about 500 +/- 10 nm, about 550 +/- 10 nm, about 580 +/- 10 nm, about 635 +/- 10 nm, about 660 +/- 10 nm, about 690 +/- 10 nm, about 780 +/- 10 nm, and/or about 850 +/- 10 nm. In some embodiments, training multispectral transmission image channel images are acquired from a multispectral image acquisition device 12A configured to illuminate a training biological specimen with twelve or more different illumination sources, where the at least twelve different illumination sources comprise (i) at least one illumination source within the ultraviolet (UV) spectrum; (ii) at least two illumination sources within the visible spectrum; and (iii) at least one illumination source within the infrared spectrum; and wherein the at least twelve different illumination sources have mean wavelengths of about 365 nm, about 400 nm, about 435 nm, about 470 nm, about 500 nm, about 550 nm, about 580 nm, about 635 nm, about 660 nm, about 690 nm, about 780 nm, and/or about 850 nm.
[0210] In some embodiments, the selection and/or number of different illumination sources in any of the UV, visible, or IR spectra is based on a specific tissue type, or on cellular organelles or cellular structures within the biological specimen that are relevant for downstream virtual stain generation and/or analysis. In these embodiments, transmission image data is acquired from training biological specimens using one or more illumination sources matching one or more mean absorbances or peak absorbances of one or more cellular structures and/or one or more cellular organelles of interest.
[0211] For instance, nucleic acids (e.g., DNA) present within a cell or within a cell nucleus may be relevant for virtual stain generation and/or downstream analysis. As it is known that nucleic acids absorb energy within the UV spectrum (e.g., below about 300 nm) and within the visible spectrum (e.g., between about 460 nm and about 490 nm), training multispectral transmission image data from different illumination sources emitting energy at about the peak absorbance wavelengths of nucleic acid molecules (and/or their associated structures, e.g., histones) may facilitate the training of a virtual staining engine 210, wherein the trained virtual staining engine 210 may be used to generate a virtual stain mimicking a traditional nuclear stain (e.g., hematoxylin).
[0212] In some embodiments, the selection of and/or number of different illumination sources in any of the UV, visible, or IR spectra are selected based on wavelengths characteristic of a particular type of morphological stain. For instance, H&E staining is used to provide contrast based on negatively or positively charged components of tissue, providing a morphological map of said tissue sample that can be used to look for the presence, absence, arrangement, and appearance of tissue structures to diagnose and prognosticate pathology. It is possible to map the same structures using the endogenous contrast of tissue at different wavelengths, for example using 250 nm to highlight the nuclei and 420 nm to highlight endogenous cytochromes, corresponding to cytoplasm.
[0213] Generation of a Multi-Channel Multispectral Training Transmission Image
[0214] In general, and with reference to FIG. 10, one or more, such as two or more training multispectral transmission image channel images are obtained (step 120). In some embodiments, the one or more, such as two or more training multispectral transmission image channel images
are simply combined into a multi-channel image (i.e., no compression or dimensionality reduction is applied). In other embodiments, the obtained two or more training multispectral transmission images are reduced or compressed into a multi-channel multispectral training transmission image (step 121).
[0215] In some embodiments, the obtained two or more training multispectral transmission image channels are compressed to generate a multispectral training image using a dimensionality reduction method. Examples of suitable dimensionality reduction methods include principal component analysis (PCA) (such as principal component analysis plus discriminant analysis), projection onto latent structures (PLS) regression, t-Distributed Stochastic Neighbor Embedding (t-SNE), and Uniform Manifold Approximation and Projection (UMAP). See Omucheni, Dickson L., et al. "Application of principal component analysis to multispectral-multimodal optical image analysis for malaria diagnostics." Malaria Journal 13 (2014): 1-11; and Farrugia, Jessica, et al. "Principal component analysis of hyperspectral data for early detection of mould in cheeselets." Current Research in Food Science 4 (2021): 18-27, the disclosures of which are hereby incorporated by reference herein in their entireties.
[0216] In some embodiments, 4 or more training multispectral transmission image channel images are compressed into a multi-channel multispectral training transmission image using a dimensionality reduction technique. In some embodiments, 5 or more training multispectral transmission image channel images are compressed into a multi-channel multispectral training transmission image using a dimensionality reduction technique. In some embodiments, 6 or more training multispectral transmission image channel images are compressed into a multi-channel multispectral training transmission image using a dimensionality reduction technique. In some embodiments, 7 or more training multispectral transmission image channel images are compressed into a multi-channel multispectral training transmission image using a dimensionality reduction technique. In some embodiments, 8 or more training multispectral transmission image channel images are compressed into a multi-channel multispectral training transmission image using a dimensionality reduction technique. In some embodiments, 9 or more training multispectral transmission image channel images are compressed into a multi-channel multispectral training transmission image using a dimensionality reduction technique. In some embodiments, 10 or more training multispectral transmission image channel images are compressed into a multi-channel multispectral training transmission image using a dimensionality reduction technique. In some embodiments, 11 or more training multispectral transmission image channel images are compressed into a multi-channel multispectral training transmission image using a dimensionality reduction technique. In some embodiments, 12 or more training multispectral transmission image channel images are compressed into a multi-channel multispectral training transmission image using a dimensionality reduction technique. In some embodiments, 16 or more training multispectral transmission image channel images are compressed into a multi-channel multispectral training transmission image using a dimensionality reduction technique. In some embodiments, 20 or more training multispectral transmission image channel images are compressed into a multi-channel multispectral training transmission image using a dimensionality reduction technique. In some embodiments, 24 or more training multispectral transmission image channel images are compressed into a multi-channel multispectral training transmission image using a dimensionality reduction technique. Again, the 2 or more training multispectral transmission image channel images may be supplied to a machine learning algorithm without performing any compression step, such as any dimensionality reduction step (e.g., Principal Component Analysis (PCA)).
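The uncompressed path mentioned above amounts to channel stacking. A minimal sketch, assuming N coregistered single-channel transmission images of the same height and width:

```python
import numpy as np

# Minimal sketch of the "no compression" path: N single-channel transmission
# images (each H x W) are simply stacked into one multi-channel image, with
# no dimensionality reduction applied.
def stack_channels(channel_images):
    """Stack a list of (H, W) channel images into an (H, W, N) array."""
    return np.stack(channel_images, axis=-1)

channels = [np.random.rand(64, 64) for _ in range(12)]  # 12 acquired channels
multi = stack_channels(channels)                        # shape (64, 64, 12)
```

The resulting (H, W, 12) array can be supplied to the machine learning algorithm directly.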
[0217] In some embodiments, PCA is used to reduce the dimensionality of the data set comprising the four or more training multispectral transmission image channel images. In general, PCA reduces the dimensionality of a data set consisting of many mutually correlated variables while retaining, to the maximum extent possible, the variation present in the data set. This is done by transforming the variables to a new set of variables, known as the principal components (or simply, the PCs), which are orthogonal and ordered such that the variation retained from the original variables decreases moving down the order. In this way, the first principal component retains the maximum variation that was present in the original components. The principal components are the eigenvectors of a covariance matrix, and hence they are orthogonal. Principal component analysis and methods of employing the same are described in U.S. Patent Publication No. 2005/0123202 and in U.S. Patent Nos. 6,894,639 and 8,565,488, the disclosures of which are hereby incorporated by reference herein in their entireties. PCA and Linear Discriminant Analysis are further described by Khan et al., "Principal Component Analysis-Linear Discriminant Analysis Feature Extractor for Pattern Recognition," IJCSI International Journal of Computer Science Issues, Vol. 8, Issue 6, No. 2, Nov. 2011, the disclosure of which is hereby incorporated by reference herein in its entirety.
[0218] The t-SNE algorithm is a non-linear dimensionality reduction technique well-suited for embedding high-dimensional data for visualization in a low-dimensional space of two or three dimensions. Specifically, it models each high-dimensional object by a two- or three-dimensional point in such a way that similar objects are modeled by nearby points and dissimilar objects are modeled by distant points with high probability. The t-SNE algorithm comprises two main stages. First, t-SNE constructs a probability distribution over pairs of high-dimensional objects in such a way that similar objects have a high probability of being picked while dissimilar points have an extremely small probability of being picked. Second, t-SNE defines a similar probability distribution over the points in the low-dimensional map, and it minimizes the Kullback-Leibler divergence between the two distributions with respect to the locations of the points in the map. The t-SNE algorithm is further described in United States Patent Publication Nos. 2018/0046755, 2014/0336942, and 2018/0166077, the disclosures of which are hereby incorporated by reference herein in their entireties.
[0219] In those embodiments that utilize image compression or dimensionality reduction, FIG. 11 illustrates one method of compressing four or more training multispectral transmission image channel images into a single multispectral training image. First, four or more (e.g., 5 or more, 6 or more, 7 or more, 8 or more, 9 or more, 10 or more, 11 or more, 12 or more, 16 or more, 20 or more, 24 or more, etc.) training multispectral transmission image channel images are obtained (step 130). Each of the obtained four or more training multispectral transmission image channel images is then converted into a data matrix (step 131). For instance, for 12 acquired training multispectral image channel images, the stacked images are converted into a 3-dimensional matrix with 2 spatial dimensions and 12 channels. Next, the data matrix is transformed into a linear vector of pixels, where each pixel includes each of the four or more channels (step 132). A principal component analysis transformation is then applied to the linear vector to reduce the four or more channels to three (step 133). For example, for 12 acquired training multispectral image channel images, PCA reduction is applied to this matrix, reducing the 12 channels to 3. The principal component channels are then normalized from 0 to 255 (step 134). Subsequently, the linear vector is returned to a tridimensional matrix (step 135). The tridimensional matrix is then converted to an image, using a first of the three channels as a red channel, a second of the three channels as a green channel, and a third of the three channels as a blue channel (step 136). The resulting RGB multispectral training image is then stored in the one or more memories 201 or in the one or more storage units 240 (step 137).
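The pipeline of steps 130-137 can be sketched in NumPy, assuming an already-stacked (H, W, C) multispectral array. PCA is implemented directly via the eigendecomposition of the channel covariance, consistent with the eigenvector description in paragraph [0217]; this is a minimal illustration, not the disclosed implementation:

```python
import numpy as np

def compress_to_rgb(stack):
    """Reduce an (H, W, C) multispectral stack (C >= 4) to an (H, W, 3)
    uint8 RGB image via PCA, following steps 130-137 of FIG. 11."""
    h, w, c = stack.shape
    pixels = stack.reshape(-1, c).astype(np.float64)   # step 132: linear vector
    pixels = pixels - pixels.mean(axis=0)              # center channels
    cov = np.cov(pixels, rowvar=False)                 # (C, C) covariance
    eigvals, eigvecs = np.linalg.eigh(cov)             # ascending eigenvalues
    components = eigvecs[:, ::-1][:, :3]               # step 133: top-3 PCs
    scores = pixels @ components                       # project to 3 channels
    lo, hi = scores.min(axis=0), scores.max(axis=0)
    span = np.where(hi - lo == 0, 1.0, hi - lo)
    scores = (scores - lo) / span * 255.0              # step 134: 0-255
    return scores.reshape(h, w, 3).astype(np.uint8)    # steps 135-136: RGB
```

The returned array can be stored as the RGB multispectral training image of step 137.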
[0220] Training Brightfield Transmission Images
[0221] Following the acquisition of training multispectral transmission image data / training multispectral transmission image channel images, the training biological specimens are stained with a morphological stain. Subsequently, one or more training brightfield transmission images may be acquired from the morphologically stained training biological specimens, such as using brightfield image acquisition device 12B. Alternatively, a training brightfield transmission image may be obtained with multispectral image acquisition device 12A, such as by acquiring image data at three or more predetermined wavelengths, such as at about 700nm +/- 10nm, about 550nm +/- 10nm, and about 470nm +/- 10nm. An example of a training brightfield transmission image is shown in FIG. 12B. The skilled artisan would appreciate that the acquired training brightfield transmission images serve as ground truth when training a virtual staining engine, such as described herein.
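One hedged sketch of the alternative described here, composing a pseudo-brightfield RGB image by selecting the acquired multispectral channels nearest the three predetermined wavelengths (700, 550, and 470 nm). The nearest-channel selection rule is an illustrative assumption:

```python
import numpy as np

def pseudo_brightfield(stack, channel_wavelengths_nm):
    """Approximate a brightfield RGB image from an (H, W, C) multispectral
    stack by picking the channels nearest 700, 550, and 470 nm (R, G, B)."""
    targets = (700.0, 550.0, 470.0)  # R, G, B target wavelengths
    wl = np.asarray(channel_wavelengths_nm, dtype=float)
    idx = [int(np.argmin(np.abs(wl - t))) for t in targets]
    return np.stack([stack[..., i] for i in idx], axis=-1)

# Example: 5 channels acquired at the wavelengths below; channel i holds the
# constant value i so the selection is easy to inspect.
wl_nm = [420, 470, 550, 620, 700]
stack = np.stack([np.full((4, 4), float(i)) for i in range(5)], axis=-1)
rgb = pseudo_brightfield(stack, wl_nm)  # selects channels 4, 2, 1
```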
[0222] In some embodiments, the morphological stain is hematoxylin, which stains the nuclei blue. In other embodiments, the morphological stain is eosin, which stains the cytoplasm pink. In yet other embodiments, the obtained biological specimen is stained with both hematoxylin and eosin (H&E). In some embodiments, an H&E staining protocol may be performed, including applying to the tissue section a hematoxylin stain mixed with a metallic salt, or mordant. The tissue section can then be rinsed in a weak acid solution to remove excess staining (differentiation), followed by bluing in mildly alkaline water. In some embodiments, after the application of hematoxylin, the tissue can be counterstained with eosin. It will be appreciated that other H&E staining techniques can be implemented.
[0223] In other embodiments, the morphological stain is a "special stain." A "special stain" refers to any chemically based stain useful for histological analysis that is not an immunohistochemical stain, an in-situ hybridization stain, or H&E. In some embodiments, the special stain includes one or more reagents selected from Acid fuchsin (C.I. 42685; absorbance maximum 546 nm), Alcian blue 8 GX (C.I. 74240; absorbance maximum 615 nm), Alizarin red S (C.I. 58005; absorbance maximum 556 and 596 nm), Auramine O (C.I. 41000; absorbance maximum 370 and 432 nm), Azocarmine B (C.I. 50090; absorbance maximum 516 nm),
Azocarmine G (C.I. 50085; absorbance maximum 511 nm), Azure A (C.I. 52005; similar absorbance to Azure B), Azure B (C.I. 52010; absorbance maximum 639 nm), Basic fuchsine (C.I. 42510; absorbance maximum 547-552 nm), Bismarck brown Y (C.I. 21000; absorbance maximum 643 nm), Brilliant cresyl blue (C.I. 51010; absorbance maximum 622 nm), Carmine (C.I. 75470; absorbance maximum protonated 490-495 nm, increasing in base and when combined with metal salts), Chlorazol black E (C.I. 30235; absorbance maximum 500-504 nm and 574-602 nm), Congo red (C.I. 22120; absorbance maximum 497 nm), Cresyl violet (absorbance maximum 596-601 nm), Crystal violet (C.I. 42555; absorbance maximum 590 nm), Darrow red (absorbance maximum 502 nm), Ethyl green (C.I. 42590; absorbance maximum 635 and 420 nm), Fast green FCF (C.I. 42053; absorbance maximum 624 nm, pH dependent), Giemsa stain (mixture of impure azure B, methylene blue, and eosin Y), Indigo carmine (C.I. 73015; absorbance maximum 608 nm), Janus green B (C.I. 11050; absorbance maximum 630 nm), Jenner stain 1899, Light green SF (C.I. 42095; absorbance maximum 422 and 630 nm), Malachite green (C.I. 42000; absorbance maximum 614 and 425 nm), Martius yellow (C.I. 10315; absorbance maximum 420-432 nm), Methyl orange (C.I. 13025; absorbance maximum 507 nm), Methyl violet 2B (C.I. 42535; absorbance maximum 583-587 nm), Methylene blue (C.I. 52015; absorbance maximum 656-661 nm), Methylene violet (Bernthsen) (C.I. 52041; absorbance maximum 580-601 nm), Neutral red (C.I. 50040; absorbance maximum 454, 529, or 541 nm depending upon pH and solvent), Nigrosin (C.I. 50420; absorbance maximum 570-580 nm), Nile blue A (C.I. 51180; absorbance maximum 633-660 nm), Nuclear fast red (C.I. 60760; absorbance maximum 535 and 505 nm), Oil Red O (C.I. 26125; absorbance maximum 518 and 359 nm), Orange G (C.I. 16230; absorbance maximum 475 nm), Orange II (C.I. 15510; absorbance maximum 483 nm), Orcein (absorbance maximum 575-590 nm, pH dependent), Pararosaniline (C.I. 42500; absorbance maximum 545 nm), Phloxin B (C.I. 45410; absorbance maximum 548 and 510 nm), Pyronine B (C.I. 45010; closely related to Pyronine Y), Pyronine Y (C.I. 45005; absorbance maximum 546-549 nm), Resazurin (absorbance maximum 598 nm in water, 478 nm in methanol), Rose Bengal (C.I. 45435; absorbance maximum 546 nm), Safranine O (C.I. 50240; absorbance maximum 530 nm), Sudan black B (C.I. 26150; absorbance maximum 598 and 415 nm), Sudan III (C.I. 26100; absorbance maximum 503-507 and 503 nm), Sudan IV (C.I. 26105; absorbance maximum 520 nm), Tetrachrome stain (MacNeal), Thionine (C.I. 52000; absorbance maximum 598-602 nm), Toluidine blue (C.I. 52040; absorbance maximum 626-630 nm), Weigert's resorcin fuchsine (absorbance maximum 508 nm), Wright
stain, and any combination thereof. In each of these examples, "C.I." refers to the Color Index™. The Color Index™ describes a commercial product by its recognized usage class, its hue, and a serial number (which simply reflects the chronological order in which related colorant types have been registered with the Color Index). This definition enables a particular product to be classified along with other products whose essential colorant is of the same chemical constitution and in which that essential colorant results from a single chemical reaction or a series of reactions. Yet other special stains include, but are not limited to, PAS STAINING KIT, SPECIAL STAINS GMS II STAIN KIT PACK, Reticulum II Staining Kit, IRON STAINING KIT, GIEMSA STAINING KIT, TRICHROME STAINING KIT, DIASTASE KIT, BenchMark Special Stain AFB Staining Kit, ALCIAN BLUE FOR PAS, LIGHT GREEN FOR PAS, STEINER II STAINING KIT, Congo Red Staining Kit, Special Stains Van Gieson CS, Elastic Stain Core Kit, Jones Staining Kit, ALC BLUE STAINING KIT PH2.5, MUCICARMINE STAINING KIT, GRAM STAINING KIT, GREEN FOR TRICHROME, Jones Light Green Staining Kit, each available from Roche Diagnostics. In some embodiments, the special stain is for a Grocott methenamine silver assay. In other embodiments, the special stain is for a Masson's trichrome assay.
[0224] In some embodiments, samples are morphologically stained according to the processes described in PCT Application Nos. PCT/EP2021/073738 or PCT/EP2021/073733, the disclosures of which are hereby incorporated by reference herein in their entireties.
[0225] In some embodiments, the acquired training multispectral transmission image data / training multispectral transmission image channel images and the acquired brightfield transmission image are each associated with one or more identifiers. For instance, the identifiers may include a sample number, sample type, stain type, fixation properties, etc. In some embodiments, the acquired training multispectral transmission image channel images and the acquired brightfield transmission image are each associated with unique identifiers such that the training multispectral transmission image channel images (or any multispectral training image derived therefrom) and the training brightfield transmission image may be associated with each other and retrieved from the one or more memories 201 and/or one or more storage units 240 to facilitate the preparation of coregistered pairs of training images for use in training a virtual staining engine 210.
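The identifier association described above might be sketched as a simple keyed record store, so that a multispectral image and its brightfield counterpart can later be retrieved together for coregistration. All field names here are illustrative assumptions, not from the disclosure:

```python
# Hypothetical record structure associating acquired images with identifiers.
def make_record(sample_id, sample_type, stain_type,
                multispectral_path, brightfield_path):
    return {
        "sample_id": sample_id,
        "sample_type": sample_type,
        "stain_type": stain_type,
        "multispectral": multispectral_path,
        "brightfield": brightfield_path,
    }

def lookup(records, sample_id):
    """Retrieve the image pair for one training specimen by its identifier."""
    rec = records[sample_id]
    return rec["multispectral"], rec["brightfield"]

records = {}
rec = make_record("S-001", "breast", "H&E", "s001_ms.tiff", "s001_bf.tiff")
records[rec["sample_id"]] = rec
```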
[0226] Preparation of Pairs of Coregistered Training Images
[0227] The skilled artisan will appreciate that the chemical treatment of unstained tissue with a morphological stain may cause the tissue to stretch, shrink, and/or move. As a result, a training brightfield transmission image (derived from a stained training biological specimen) may have tissue or cellular structures that are stretched, shrunken, and/or moved relative to a generated multi-channel multispectral training transmission image (derived from an unstained training biological specimen). These differences in tissue locations, etc. are accounted for by aligning or coregistering features present in each of the generated multi-channel multispectral training transmission image and the training brightfield transmission image. In view of the foregoing, following the generation of a multi-channel multispectral training image (which may be derived from one or more image channel images that have not been compressed or from one or more image channel images that have been compressed) and the acquisition of a training brightfield transmission image from the same training biological specimen, the image processing module 212 is utilized to generate a pair of coregistered training images from the generated multi-channel multispectral training transmission image and the training brightfield transmission image such that features in each image are aligned and/or coregistered (see steps 140 and 141 of FIG. 13A). The coregistered training images are then used to train a machine learning algorithm, namely, to train a virtual staining engine.
[0228] One method of coregistering images is set forth in FIG. 24. Other methods of coregistering images with respect to each other are known and described in the literature; see, for example, D. Mueller et al., Real-time deformable registration of multi-modal whole slides for digital pathology, Computerized Medical Imaging and Graphics vol. 35, p. 542-556 (2011); F. El-Gamal et al., Current trends in medical image registration and fusion, Egyptian Informatics Journal vol. 17, p. 99-124 (2016); J. Singla et al., A systematic way of affine transformation using image registration, International Journal of Information Technology and Knowledge Management, July-December 2012, Vol. 5, No. 2, pp. 239-243; Z. Hossein-Nejad et al., An adaptive image registration method based on SIFT features and RANSAC transform, Computers and Electrical Engineering Vol. 62, p. 524-537 (August 2017); and U.S. Pat. Nos. 8,605,972 and 9,785,818, the disclosures of which are incorporated by reference herein in their entireties. In some embodiments, a generated multi-channel multispectral training transmission image and a training brightfield transmission image may be aligned or coregistered with RANSAC (random sample consensus, a known robust estimation algorithm used for image alignment).
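As one hedged illustration of the RANSAC alignment mentioned above, the sketch below robustly estimates a 2x3 affine transform from landmark correspondences containing outliers, using only NumPy. The iteration count, inlier tolerance, and minimal sample size (3 points for an affine) are illustrative choices, not parameters from the disclosure:

```python
import numpy as np

def fit_affine(src, dst):
    """Least-squares 2x3 affine mapping src -> dst; src, dst are (N, 2)."""
    A = np.hstack([src, np.ones((src.shape[0], 1))])   # (N, 3) design matrix
    M, *_ = np.linalg.lstsq(A, dst, rcond=None)        # (3, 2) solution
    return M.T                                          # (2, 3) affine

def apply_affine(M, pts):
    return pts @ M[:, :2].T + M[:, 2]

def ransac_affine(src, dst, n_iters=200, tol=1.0, seed=0):
    """RANSAC sketch: repeatedly fit an affine to 3 random correspondences,
    keep the model with the most inliers, then refit on all inliers."""
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(src), dtype=bool)
    for _ in range(n_iters):
        idx = rng.choice(len(src), size=3, replace=False)
        M = fit_affine(src[idx], dst[idx])
        err = np.linalg.norm(apply_affine(M, src) - dst, axis=1)
        inliers = err < tol
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    return fit_affine(src[best_inliers], dst[best_inliers]), best_inliers
```

With a sufficient inlier fraction, the consensus set excludes mismatched landmarks and the refit recovers the underlying transform.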
[0229] FIG. 13B sets forth a method of coregistering a generated multi-channel multispectral training transmission image and a training brightfield transmission image. As an initial step, both a generated multi-channel multispectral training transmission image and a training brightfield transmission image are obtained (step 150), such as according to the methods described herein. In some embodiments, the obtained images are segmented, such as into 64-pixel x 64-pixel segments, 128-pixel x 128-pixel segments, 256-pixel x 256-pixel segments, 512-pixel x 512-pixel segments, etc. Next, one or more regions of interest (ROIs) are identified in each of the obtained generated multi-channel multispectral training transmission image and training brightfield transmission image (step 151). Features are then identified in each of the identified one or more ROIs (step 152). Non-limiting examples of features include epithelial cell nuclei, fat cells, inflammatory cells, etc. In some embodiments, lumens, glands, and/or fatty cells are used as first landmarks to locate zones. Then, after "zooming in" / increasing magnification, epithelial cells, red blood cells, and inflammatory cells may be located.
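The tile segmentation mentioned here (e.g., 256-pixel x 256-pixel segments) can be sketched as follows; discarding partial tiles at the image border is an illustrative simplification, not a requirement of the disclosure:

```python
import numpy as np

def tile_image(img, tile=256):
    """Split an image into non-overlapping (tile x tile) segments,
    discarding partial tiles at the right and bottom borders."""
    h, w = img.shape[:2]
    tiles = []
    for y in range(0, h - tile + 1, tile):
        for x in range(0, w - tile + 1, tile):
            tiles.append(img[y:y + tile, x:x + tile])
    return tiles

# e.g., a 512 x 768 image yields a 2 x 3 grid of six 256 x 256 tiles
tiles = tile_image(np.zeros((512, 768)))
```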
[0230] Landmarks are then placed in each of the generated multi-channel multispectral training transmission image and the training brightfield transmission image (step 153). Landmark placement in both the obtained generated multi-channel multispectral training transmission image and the obtained training brightfield transmission image are shown in FIGS. 14A and 14B, respectively. A transform is then applied to translate, rotate, shear, shrink, and/or stretch the obtained training brightfield transmission image to match the landmarks placed in the generated multi-channel multispectral training transmission image (step 154). In some embodiments, the transform is an affine transform, such as one which permits shearing and scaling in addition to rotation and translation.
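Step 154 can be sketched as follows: a least-squares affine is estimated from the matched landmarks, and the brightfield image is warped onto the multispectral image's coordinate frame by inverse nearest-neighbor sampling. This is a minimal illustration under those assumptions; production code would typically use a library warp with proper interpolation:

```python
import numpy as np

def estimate_affine(landmarks_src, landmarks_dst):
    """Solve for the 2x3 affine (rotation, translation, shear, scale) that
    maps landmarks in one image onto the corresponding landmarks in the
    other (step 154). Requires >= 3 non-collinear landmark pairs."""
    src = np.asarray(landmarks_src, dtype=float)
    dst = np.asarray(landmarks_dst, dtype=float)
    A = np.hstack([src, np.ones((len(src), 1))])
    M, *_ = np.linalg.lstsq(A, dst, rcond=None)
    return M.T  # (2, 3)

def warp_nearest(img, M, out_shape):
    """Warp img under affine M into out_shape using inverse
    nearest-neighbor sampling (source pixels clipped to the image)."""
    h, w = out_shape
    ys, xs = np.mgrid[0:h, 0:w]
    pts = np.stack([xs.ravel(), ys.ravel()], axis=1).astype(float)
    # invert the affine to find, for each output pixel, its source location
    Minv = np.linalg.inv(np.vstack([M, [0.0, 0.0, 1.0]]))[:2]
    src = pts @ Minv[:, :2].T + Minv[:, 2]
    sx = np.clip(np.round(src[:, 0]).astype(int), 0, img.shape[1] - 1)
    sy = np.clip(np.round(src[:, 1]).astype(int), 0, img.shape[0] - 1)
    return img[sy, sx].reshape(h, w)
```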
[0231] Training a Machine Learning Algorithm Using Pairs of Coregistered Training Images
[0232] Following the coregistration of each of the generated multi-channel multispectral training transmission images and the training brightfield transmission images, the pairs of coregistered images (or segments of those paired images) are provided to a machine learning algorithm to train the machine learning algorithm (i.e., the virtual staining engine) to predict a virtually stained image. Pixel-to-pixel, cell-to-cell, and/or patch-to-patch mapping is performed using the pairs of coregistered training images.
[0233] Examples of machine learning algorithms include:
[0234] 1) Self-supervised learning neural network.
[0235] 2) Convolutional neural network. As used herein, the term "neural network" refers to one or more computer-implemented networks capable of being trained to achieve a goal. Unless otherwise indicated, references herein to a neural network include one neural network or multiple interrelated neural networks that are trained together. Examples of neural networks include, without limitation, convolutional neural networks (CNNs), recurrent neural networks (RNNs), fully connected neural networks, encoder neural networks (e.g., "encoders"), decoder neural networks (e.g., "decoders"), densely connected neural networks, and other types of neural networks. In some embodiments, a neural network can be implemented using special hardware (e.g., GPUs, tensor processing units (TPUs), systolic arrays, single instruction multiple data (SIMD) processors, etc.), using software code and a general-purpose processor, or using a combination of special hardware and software code.
[0236] In some embodiments, the neural network is configured as a deep learning network. In general, "deep learning" is a branch of machine learning based on a set of algorithms that attempt to model high level abstractions in data. Deep learning is part of a broader family of machine learning methods based on learning representations of data. An observation can be represented in many ways such as a vector of intensity values per pixel, or in a more abstract way as a set of edges, regions of particular shape, etc. Some representations are better than others at simplifying the learning task. One of the promises of deep learning is replacing handcrafted features with efficient algorithms for unsupervised or semi-supervised feature learning and hierarchical feature extraction.
[0237] In some embodiments, the neural network may be a deep neural network with a set of weights that model the world according to the data it has been fed during training. Neural networks typically consist of multiple layers, and the signal path traverses from front to back between the layers. Any neural network may be implemented for this purpose. Suitable neural networks include LeNet, AlexNet, ZFNet, GoogLeNet, VGGNet, VGG16, DenseNet (also known as a Dense Convolutional Network or DenseNet-121), MiniNet, and ResNet. In some embodiments, a fully convolutional neural network is utilized, such as described by Long et al., "Fully Convolutional Networks for Semantic Segmentation," Computer Vision and Pattern Recognition (CVPR), 2015 IEEE Conference, June 2015 (INSPEC Accession Number: 15524435), the disclosure of which is hereby incorporated by reference. Yet other suitable neural
networks include a Convolutional Neural Network (CNN), a Recurrent Neural Network, a Long Short-Term Memory Neural Network (LSTM), a Compound Scaled Efficient Neural Network (EfficientNet), a Normalizer Free Neural Network (NFNet), a Densely Connected Convolutional Neural Network (DenseNet), an Aggregated Residual Transformation Neural Network (ResNeXt), a Channel Boosted Convolutional Neural Network (CB-CNN), a Wide Residual Network (WRN), or a Residual Neural Network. In some embodiments, the neural network is one that operates on cross-entropy, e.g., one that may recognize per-pixel cross-entropy.
[0238] 3) Implicit generative models, such as Generative Adversarial Networks (GANs). This neural network approach is described in K. Bousmalis et al., "Unsupervised Pixel-Level Domain Adaptation with Generative Adversarial Networks," https://arxiv.org/pdf/1612.05424.pdf (August 2017), the disclosure of which is hereby incorporated by reference herein in its entirety. In some embodiments, the neural network is a generative network. Further details regarding GANs may be found in Goodfellow et al., Generative Adversarial Nets, Advances in Neural Information Processing Systems 27, pp. 2672-2680 (2014), which is incorporated by reference herein in its entirety.
[0239] A "generative" network can be generally defined as a model that is probabilistic in nature. In other words, a "generative" network is not one that performs forward simulation or rule-based approaches. Instead, the generative network can be learned (in that its parameters can be learned) based on a suitable set of training data (e.g., a plurality of training image data sets). In some embodiments, the neural network is configured as a deep generative network. For example, the network may be configured to have a deep learning architecture in that the network may include multiple layers, which perform a number of algorithms or transformations. As used herein, the term "layer" or "network layer" refers to an analysis stage in a neural network. Layers perform different types of analysis related to the type of the neural network. For example, layers in an encoder may perform different types of analysis on an input image to encode the input image. In some cases, a particular layer provides features based on the particular analysis performed by that layer. In some cases, a particular layer down-samples a received image, and an additional layer performs additional down-sampling. In some cases, each round of down-sampling reduces the visual quality of the output image, but provides features based on the related analysis performed by that layer.
[0240] GANs were introduced in 2014 by I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, "Generative Adversarial Nets," in Advances in Neural Information Processing Systems, pp. 2672-2680, 2014, to model the training image data distribution, which is then used to generate new image samples from the same distribution. They consist of two networks: a generator (G) and a discriminator (D). The generative model G learns a mapping from some prior distribution (a random noise vector) to new samples that imitate the real data distribution. The discriminator D, in turn, tries to classify whether images came from the real training data (the true distribution) or were generated by G (are fake). These two networks are trained at the same time and updated as if they are playing a game: generator G tries to fool discriminator D, and in turn discriminator D adjusts its parameters to make better estimates to detect fake images generated by G.
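The adversarial objective described here can be made concrete with the standard binary cross-entropy losses. This is a generic GAN loss sketch, not the disclosure's exact formulation; `d_real` and `d_fake` stand for discriminator scores in (0, 1) on real brightfield images and on generated images, respectively:

```python
import numpy as np

def bce(pred, target, eps=1e-7):
    """Binary cross-entropy over discriminator scores in (0, 1)."""
    pred = np.clip(pred, eps, 1 - eps)
    return float(-np.mean(target * np.log(pred)
                          + (1 - target) * np.log(1 - pred)))

def discriminator_loss(d_real, d_fake):
    # D should output 1 on real brightfield images and 0 on generated ones
    return bce(d_real, np.ones_like(d_real)) + bce(d_fake, np.zeros_like(d_fake))

def generator_loss(d_fake):
    # G is rewarded when D mistakes its output for a real stained image
    return bce(d_fake, np.ones_like(d_fake))
```

During training, the two losses pull in opposite directions on `d_fake`, which is the game described above.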
[0241] In some embodiments, the GAN is a cyclic GAN. An example for a suitable cyclic GAN network architecture is described by Jun-Yan Zhu, Taesung Park, Phillip Isola, and Alexei A. Efros in “Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks,” (24 Nov. 2017).
[0242] In some embodiments, the GAN is a three-channel GAN which utilizes data from three channels, such as three channels in compressed (e.g., dimensionality reduced) input images. In other embodiments, the GAN is a multi-channel GAN which utilizes data from a plurality of channels, such as two or more channels, such as three or more channels, such as four or more channels, such as five or more channels, such as six or more channels, such as seven or more channels, such as eight or more channels, such as nine or more channels, such as ten or more channels, such as eleven or more channels, such as twelve or more channels, such as sixteen or more channels, such as twenty or more channels, such as twenty-four or more channels, etc.
[0243] In some embodiments, the GAN training procedure involves training two different networks, namely (i) a generator network, which in this case aims to learn the statistical transformation between the unstained multi-channel multispectral training transmission image and the corresponding brightfield image acquired after the training biological specimen is stained with a morphological stain; and (ii) a discriminator network that learns how to discriminate between a true brightfield image of a stained training specimen and the generator network's output image. Ultimately, the desired result of this training process is a trained deep neural network that transforms an input image of an unstained biological sample into a virtually stained image that is diagnostically equivalent to a chemically stained brightfield image of the same biological sample. In some embodiments, there is no discernible difference in diagnostic quality between the resulting virtually stained images of the unstained biological specimens and corresponding images of chemically stained biological specimens, at least not to the extent that any differences would substantially alter a diagnostic outcome.
[0244] In some embodiments, a GAN is used to identify discrepancies between the images in pairs of coregistered training images, where the brightfield training transmission image in each pair of coregistered training images serves as ground truth. In some embodiments, the process of identifying the discrepancies in a pair of coregistered training images may be performed in a "discriminator network." In some embodiments, the discrepancies between the brightfield training transmission image and the multi-channel multispectral training transmission image (in the pair of coregistered training images) may then be formulated as a loss function. The loss function may be communicated to the discriminator network and to the generator network for use in backpropagation. In some embodiments, the input from the loss function enables the discriminator network to learn how to distinguish between an actual or true image of a stained training biological specimen and the generator network's output virtual histological image.
[0245] In some embodiments, the generator network computes a virtually stained image, and the discriminator network assesses the similarity of that image to an actual histological image, namely the brightfield training transmission image included within a pair of coregistered training images. The generator network, in turn, may utilize the loss function to generate a corrected virtual image, which is then communicated to the discriminator network. This process may be performed iteratively until the discriminator network cannot distinguish between a generated virtually stained image and the brightfield training transmission image (the image which includes an actual morphological stain).
[0246] In some embodiments, and by way of backpropagation, the discriminator network's classification helps the generator network to update its weights and thereby fine-tune the virtual histological images being produced. Ultimately, after several iterations, the generator network begins to output higher-quality virtually stained histological images, and the discriminator network becomes better at distinguishing the virtually stained histological images from the actual histological images. Once the network is trained, it can produce virtually stained images
representative of morphological stains, e.g., H&E, a basic fuchsin stain, a Masson's Trichrome stain, etc.
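The adversarial training described in paragraphs [0244]-[0246] can be illustrated with a deliberately tiny sketch. Everything below is hypothetical: both networks are reduced to single linear units acting on scalar intensities (standing in for image pixels) so that the loss formulation and the backpropagation updates can be written out explicitly; this is not the disclosed architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins (all names hypothetical): `real` plays the role of the
# brightfield ground-truth intensities, and the generator maps an
# "unstained" input z to a virtual-stain intensity.
def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

real = rng.normal(3.0, 0.5, size=256)   # "actual" stained intensities
z = rng.normal(0.0, 1.0, size=256)      # unstained input
w_g, b_g = 0.5, 0.0                     # generator parameters
w_d, b_d = 0.5, -1.0                    # discriminator parameters
lr = 0.05

def d_objective(wd, bd, fake):
    """Discriminator objective: E[log d(real)] + E[log(1 - d(fake))]."""
    return (np.mean(np.log(sigmoid(wd * real + bd)))
            + np.mean(np.log(1.0 - sigmoid(wd * fake + bd))))

fake = w_g * z + b_g

# One backpropagation step for the discriminator (gradient ascent on its
# objective, so it gets better at telling real from virtual):
p_real, p_fake = sigmoid(w_d * real + b_d), sigmoid(w_d * fake + b_d)
w_d2 = w_d + lr * (np.mean((1 - p_real) * real) - np.mean(p_fake * fake))
b_d2 = b_d + lr * (np.mean(1 - p_real) - np.mean(p_fake))

# One step for the generator (non-saturating loss: maximize E[log d(fake)],
# i.e. make its output look "real" to the updated discriminator):
p_fake = sigmoid(w_d2 * fake + b_d2)
w_g2 = w_g + lr * np.mean((1 - p_fake) * w_d2 * z)
b_g2 = b_g + lr * np.mean((1 - p_fake) * w_d2)
new_fake = w_g2 * z + b_g2

# Each small step improves the respective objective on this batch:
print(d_objective(w_d2, b_d2, fake) > d_objective(w_d, b_d, fake))   # True
print(np.mean(np.log(sigmoid(w_d2 * new_fake + b_d2)))
      > np.mean(np.log(sigmoid(w_d2 * fake + b_d2))))                # True
```

Alternating these two updates is the iterative loop described above; in the disclosure both players are deep networks and the images are multi-channel, but the adversarial mechanics are the same.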
[0247] GENERATING A VIRTUALLY STAINED SLIDE USING A TRAINED VIRTUAL STAINING ENGINE
[0248] As noted herein, the present disclosure provides methods for the generation of a virtually stained image of a test unstained biological specimen, based on an acquired input image of the test unstained biological specimen, where the virtually stained image manifests the appearance of the test unstained biological specimen as if it were stained with a morphological stain. Once a virtual staining engine is trained, an unstained input image (e.g., a multispectral test transmission image) is supplied to the trained virtual staining engine and the trained virtual staining engine generates a predicted or virtual stained output image that corresponds to that unstained input image. For instance, if the virtual staining engine selected is trained to predict H&E staining from an unstained input image, then the trained virtual staining engine generates a predicted or virtually stained H&E output image for the unstained input image. This predicted or virtually stained output image may then be stored in one or more memories or in one or more storage systems for later retrieval and downstream processing (e.g., analysis by a pathologist or other automated image analysis workflow).
[0249] With reference to FIG. 15, in some embodiments, a method of generating a virtually stained slide comprises (i) acquiring test multispectral transmission image data from an unstained test biological specimen (step 161); (ii) supplying the test multispectral transmission image data to a trained virtual staining engine 210 trained to generate an image of an unstained biological specimen stained with a morphological stain (step 162); and (iii) with the trained virtual staining engine, generating a virtually stained image of the test unstained biological specimen stained with the morphological stain (step 163).
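The three steps of FIG. 15 can be sketched as a short pipeline. The `VirtualStainingEngine` class, its fixed per-channel mixing weights, and the simulated image data below are all illustrative placeholders, not the trained network of the disclosure:

```python
import numpy as np

# Hypothetical stand-in for the trained engine: a fixed per-channel
# linear map so the example runs end to end.
class VirtualStainingEngine:
    def __init__(self, mixing_matrix):
        self.m = mixing_matrix          # (3, n_channels): spectral -> RGB

    def stain(self, multispectral):     # multispectral: (H, W, n_channels)
        rgb = multispectral @ self.m.T  # per-pixel linear combination
        return np.clip(rgb, 0.0, 1.0)

# Step 161: acquire test multispectral transmission image data (simulated)
test_image = np.random.default_rng(1).random((4, 4, 5))   # 5 channels

# Step 162: supply the data to a trained engine (weights are placeholders)
engine = VirtualStainingEngine(np.full((3, 5), 0.2))

# Step 163: generate the virtually stained image
virtual_he = engine.stain(test_image)
print(virtual_he.shape)   # (4, 4, 3) -- an RGB "virtually stained" image
```

The same input could equally be routed to several engines (one per morphological stain), which is the scenario of paragraph [0260].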
[0250] In some embodiments, the test multispectral image data is acquired from one or more test unstained biological specimens. The test biological specimens may be obtained from any source. For instance, they may be obtained from a tumor, including, for example, tumor biopsy samples, resection samples, cell smears, fine needle aspirates (FNA), liquid-based cytology samples, and the like. In some embodiments, the test biological specimens are derived from specimens that have been previously stained and from which the previous stain has been substantially removed.
[0251] Similar to the acquisition of the training multispectral transmission image channel images described herein, one or more, such as two or more, test multispectral transmission image channel images are acquired for each test unstained biological specimen, where each of the one or more, such as two or more, test multispectral transmission image channel images is acquired at a specific wavelength.
[0252] In some embodiments, at least one, such as at least two, test multispectral transmission image channel images are acquired for each unstained test biological specimen, where each of the at least one, such as at least two, test multispectral transmission image channel images is acquired using a multispectral image acquisition device 12A configured to illuminate each unstained test biological specimen with at least one, such as at least two, different illumination sources; and further configured to acquire transmission image data (e.g., at least two multispectral image channel images) of the biological specimen illuminated with the at least one, such as at least two, different illumination sources. Said another way, at least one, such as at least two, test multispectral transmission image channel images are acquired for each unstained test biological specimen, where each of the at least one, such as at least two, test multispectral transmission image channel images is acquired at a different wavelength. In some embodiments, four test multispectral transmission image channel images are obtained from at least four different illumination sources. In some embodiments, five test multispectral transmission image channel images are obtained from at least five different illumination sources. In some embodiments, test multispectral transmission image channel images are obtained from at least six different illumination sources. In some embodiments, test multispectral transmission image channel images are obtained from at least seven different illumination sources. In some embodiments, test multispectral transmission image channel images are obtained from at least eight different illumination sources. In some embodiments, test multispectral transmission image channel images are obtained from at least nine different illumination sources. In some embodiments, test multispectral transmission image channel images are obtained from at least ten different illumination sources. In some embodiments, test multispectral transmission image channel images are obtained from at least eleven different illumination sources. In some embodiments, test multispectral transmission image channel images are obtained from at least twelve different illumination sources. In some embodiments, test multispectral transmission image channel images are obtained from twelve or more different illumination sources. In some embodiments, test multispectral transmission image channel images are obtained from sixteen or more different illumination sources. In some embodiments, test multispectral transmission image channel images are obtained from twenty or more different illumination sources. In some embodiments, test multispectral transmission image channel images are obtained from twenty-four or more different illumination sources.
[0253] In some embodiments, the test multispectral image data is acquired based on the trained virtual staining engine selected for use in generating a virtually stained slide and, in particular, the number of image channels used during training and the particular wavelengths at which the image data was acquired. As such, the number of different test multispectral image channel images to acquire and the wavelengths of the different illumination sources that the test unstained biological specimen should be illuminated with are dependent on (i) the number of different training multispectral image channel images used during the training of a specific virtual staining engine; and (ii) the wavelengths of the different illumination sources that were used during the training of the specific virtual staining engine (such as within +/- 15%, +/- 12%, +/- 10%, +/- 9%, +/- 8%, +/- 7%, +/- 6%, +/- 5%, +/- 4%, +/- 3%, +/- 2%, or +/- 1% of the wavelengths of the different illumination sources that were used during the training of the specific virtual staining engine).
[0254] For instance, if a selected trained virtual staining engine for a particular morphological stain is trained using training multispectral image data from five different illumination sources, then test multispectral image data is acquired at about the same five different narrow-band illumination sources, such as within +/- 15%, +/- 12%, +/- 10%, +/- 9%, +/- 8%, +/- 7%, +/- 6%, +/- 5%, +/- 4%, +/- 3%, +/- 2%, or +/- 1% of the same five different illumination sources.
[0255] By way of another example, if a trained H&E virtual staining engine is selected that is trained using 12 training multispectral image channel images at 12 different illumination sources of 350 nm, 375 nm, 400 nm, 425 nm, 450 nm, 475 nm, 500 nm, 525 nm, 550 nm, 575 nm, 600 nm, and 625 nm, then test multispectral image channel images should be acquired at 350 nm +/- 10%, 375 nm +/- 10%, 400 nm +/- 10%, 425 nm +/- 10%, 450 nm +/- 10%, 475 nm +/- 10%, 500 nm +/- 10%, 525 nm +/- 10%, 550 nm +/- 10%, 575 nm +/- 10%, 600 nm +/- 10%, and 625 nm +/- 10%.
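A hypothetical helper illustrating this wavelength-matching constraint (the function name and the 10% default tolerance are assumptions for the sketch, not part of the disclosure):

```python
def wavelengths_compatible(train_nm, test_nm, tol_frac=0.10):
    """Check that each test illumination wavelength falls within
    +/- tol_frac (e.g., 10%) of the corresponding training wavelength.
    Illustrative helper, not part of the disclosure."""
    if len(train_nm) != len(test_nm):
        return False   # channel counts must match the trained engine
    return all(abs(te - tr) <= tol_frac * tr
               for tr, te in zip(sorted(train_nm), sorted(test_nm)))

train = [350, 375, 400, 425, 450, 475, 500, 525, 550, 575, 600, 625]
print(wavelengths_compatible(train, train))                      # True
print(wavelengths_compatible(train, [w + 100 for w in train]))   # False
```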
[0256] With reference to FIG. 16, two or more test multispectral transmission image channel images are obtained (step 170) from any test unstained biological specimen. Each of the two or more obtained test multispectral transmission image channel images is then combined to form a multi-channel multispectral test transmission image (step 171). In some embodiments, the two or more test multispectral transmission image channel images are combined into a multi-channel multispectral test transmission image without compressing the images, such as without performing any dimensionality reduction technique (e.g., PCA). In other embodiments, the two or more test multispectral transmission image channel images are combined into a multi-channel multispectral test transmission image by compressing the images using a dimensionality reduction technique (e.g., PCA).
[0257] In embodiments where four or more test multispectral transmission image channel images are obtained, in some embodiments the obtained four or more test multispectral transmission images are reduced or compressed into a multi-channel multispectral test transmission image. In some embodiments, the obtained four or more test multispectral transmission image channels are compressed to generate a multispectral test transmission image using a dimensionality reduction method. Examples of suitable dimensionality reduction methods include principal component analysis (PCA) (such as principal component analysis plus discriminant analysis), projection onto latent structure regression, t-Distributed Stochastic Neighbor Embedding (t-SNE), and Uniform Manifold Approximation and Projection (UMAP).
[0258] In some embodiments, the steps outlined in FIG. 11 herein, which describes a method of compressing four or more training multispectral transmission image channel images into a single multispectral training image, may be applied equally in the generation of a 3-channel multispectral test transmission image. In other embodiments, the obtained four or more test multispectral transmission images are combined into a multi-channel multispectral test transmission image without performing any compression technique or any dimensionality reduction technique (e.g., PCA).
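As one example of the compression step, PCA over the per-pixel spectra can reduce, say, a twelve-channel stack to a 3-channel test transmission image. This NumPy sketch (the function name and shapes are illustrative) is one possible implementation, not the specific method of FIG. 11:

```python
import numpy as np

def compress_channels_pca(stack, n_out=3):
    """Reduce an (H, W, C) multispectral stack to n_out channels by PCA
    over the per-pixel spectra.  One possible dimensionality reduction;
    the disclosure also contemplates t-SNE, UMAP, etc."""
    h, w, c = stack.shape
    pixels = stack.reshape(-1, c).astype(float)
    pixels -= pixels.mean(axis=0)                 # center each channel
    # Principal axes = right singular vectors of the centered data matrix
    _, _, vt = np.linalg.svd(pixels, full_matrices=False)
    reduced = pixels @ vt[:n_out].T               # project onto top axes
    return reduced.reshape(h, w, n_out)

stack = np.random.default_rng(7).random((8, 8, 12))  # 12 channel images
print(compress_channels_pca(stack).shape)            # (8, 8, 3)
```

The projected channels come out ordered by explained variance, so the first output channel retains the most spectral information.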
[0259] The test unstained multispectral image data, e.g., multi-channel Multispectral Test Transmission Image, is then supplied to a selected trained virtual staining engine to provide a virtually stained image based on the input image of the test unstained biological specimen. Examples of virtually stained images of unstained biological specimens are set forth within FIG. 1.
[0260] In some embodiments, the same multi-channel Multispectral Test Transmission Image is supplied to multiple different trained virtual staining engines, to provide multiple different virtually stained images of the same unstained biological specimen.
[0261] Other System Components
[0262] The system 200 of the present disclosure may be tied to a specimen processing apparatus that can perform one or more preparation processes on the tissue specimen. The preparation process can include, without limitation, deparaffinizing a specimen, conditioning a specimen (e.g., cell conditioning), staining a specimen, performing antigen retrieval, performing immunohistochemistry staining (including labeling) or other reactions, and/or performing in situ hybridization (e.g., SISH, FISH, etc.) staining (including labeling) or other reactions, as well as other processes for preparing specimens for microscopy, microanalyses, mass spectrometric methods, or other analytical methods.
[0263] The processing apparatus can apply fixatives to the specimen. Fixatives can include cross-linking agents (such as aldehydes, e.g., formaldehyde, paraformaldehyde, and glutaraldehyde, as well as non-aldehyde cross-linking agents), oxidizing agents (e.g., metallic ions and complexes, such as osmium tetroxide and chromic acid), protein-denaturing agents (e.g., acetic acid, methanol, and ethanol), fixatives of unknown mechanism (e.g., mercuric chloride, acetone, and picric acid), combination reagents (e.g., Carnoy's fixative, methacarn, Bouin's fluid, B5 fixative, Rossman's fluid, and Gendre's fluid), microwaves, and miscellaneous fixatives (e.g., excluded volume fixation and vapor fixation).
[0264] If the specimen is a sample embedded in paraffin, the sample can be deparaffinized using appropriate deparaffinizing fluid(s). After the paraffin is removed, any number of substances can be successively applied to the specimen. The substances can be for pretreatment (e.g., to reverse protein-crosslinking, expose nucleic acids, etc.), denaturation, hybridization, washing (e.g., stringency wash), detection (e.g., linking a visual or marker molecule to a probe), amplifying (e.g., amplifying proteins, genes, etc.), counterstaining, coverslipping, or the like.
[0265] The specimen processing apparatus can apply a wide range of substances to the specimen. The substances include, without limitation, stains, probes, reagents, rinses, and/or conditioners. The substances can be fluids (e.g., gases, liquids, or gas/liquid mixtures), or the like. The fluids can be solvents (e.g., polar solvents, non-polar solvents, etc.), solutions (e.g., aqueous solutions or other types of solutions), or the like. Reagents can include, without limitation, stains,
wetting agents, antibodies (e.g., monoclonal antibodies, polyclonal antibodies, etc.), antigen recovering fluids (e.g., aqueous- or non-aqueous-based antigen retrieval solutions, antigen recovering buffers, etc.), or the like. Probes can be an isolated nucleic acid or an isolated synthetic oligonucleotide, attached to a detectable label or reporter molecule. Labels can include radioactive isotopes, enzyme substrates, co-factors, ligands, chemiluminescent or fluorescent agents, nanoparticles, haptens, and enzymes. As used herein, the term "fluid" refers to any liquid or liquid composition, including water, solvents, buffers, solutions (e.g., polar solvents, non-polar solvents), and/or mixtures. The fluid may be aqueous or non-aqueous. Non-limiting examples of fluids include washing solutions, rinsing solutions, acidic solutions, alkaline solutions, transfer solutions, and hydrocarbons (e.g., alkanes, isoalkanes, and aromatic compounds such as xylene). In some embodiments, washing solutions include a surfactant to facilitate spreading of the washing liquids over the specimen-bearing surfaces of the slides. In some embodiments, acid solutions include deionized water, an acid (e.g., acetic acid), and a solvent. In some embodiments, alkaline solutions include deionized water, a base, and a solvent. In some embodiments, transfer solutions include one or more glycol ethers, such as one or more propylene-based glycol ethers (e.g., propylene glycol ethers, di(propylene glycol) ethers, and tri(propylene glycol) ethers), ethylene-based glycol ethers (e.g., ethylene glycol ethers, di(ethylene glycol) ethers, and tri(ethylene glycol) ethers), and functional analogs thereof.
Non-limiting examples of buffers include citric acid, potassium dihydrogen phosphate, boric acid, diethyl barbituric acid, piperazine-N,N'-bis(2-ethanesulfonic acid), dimethylarsinic acid, 2-(N-morpholino)ethanesulfonic acid, tris(hydroxymethyl)methylamine (TRIS), N-tris(hydroxymethyl)methyl-3-aminopropanesulfonic acid (TAPS), N,N-bis(2-hydroxyethyl)glycine (Bicine), N-tris(hydroxymethyl)methylglycine (Tricine), 4-(2-hydroxyethyl)-1-piperazineethanesulfonic acid (HEPES), 2-{[tris(hydroxymethyl)methyl]amino}ethanesulfonic acid (TES), and combinations thereof. In some embodiments, the unmasking agent is water. In other embodiments, the buffer may be comprised of tris(hydroxymethyl)methylamine (TRIS), N-tris(hydroxymethyl)methyl-3-aminopropanesulfonic acid (TAPS), N,N-bis(2-hydroxyethyl)glycine (Bicine), N-tris(hydroxymethyl)methylglycine (Tricine), 4-(2-hydroxyethyl)-1-piperazineethanesulfonic acid (HEPES), 2-{[tris(hydroxymethyl)methyl]amino}ethanesulfonic acid (TES), or a combination thereof. Additional wash solutions, transfer solutions, acid solutions, and alkaline solutions are described
in United States Patent Application Publication No. 2016/0282374, the disclosure of which is hereby incorporated by reference herein in its entirety.
[0266] Staining may be performed with a histochemical staining module or separate platform, such as an automated IHC/ISH slide stainer. Automated IHC/ISH slide stainers typically include at least: reservoirs of the various reagents used in the staining protocols, a reagent dispense unit in fluid communication with the reservoirs for dispensing reagent onto a slide, a waste removal system for removing used reagents and other waste from the slide, and a control system that coordinates the actions of the reagent dispense unit and waste removal system. In addition to performing staining steps, many automated slide stainers can also perform steps ancillary to staining (or are compatible with separate systems that perform such ancillary steps), including slide baking (for adhering the sample to the slide), dewaxing (also referred to as deparaffinization), antigen retrieval, counterstaining, dehydration and clearing, and coverslipping. Prichard, Overview of Automated Immunohistochemistry, Arch Pathol Lab Med., Vol. 138, pp. 1578-1582 (2014), incorporated herein by reference in its entirety, describes several specific examples of automated IHC/ISH slide stainers and their various features, including the intelliPATH (Biocare Medical), WAVE (Celerus Diagnostics), DAKO OMNIS and DAKO AUTOSTAINER LINK 48 (Agilent Technologies), BENCHMARK (Ventana Medical Systems, Inc.), Leica BOND, and Lab Vision Autostainer (Thermo Scientific) automated slide stainers. Additionally, Ventana Medical Systems, Inc. is the assignee of a number of United States patents disclosing systems and methods for performing automated analyses, including U.S. Pat. Nos. 5,650,327, 5,654,200, 6,296,809, 6,352,861, 6,827,901 and 6,943,029, and U.S. Published Patent Application Nos. 20030211630 and 20040052685, each of which is incorporated herein by reference in its entirety.
As used herein, the term "reagent" refers to solutions or suspensions including one or more agents capable of covalently or non-covalently reacting with, coupling with, interacting with, or hybridizing to another entity. Non-limiting examples of such agents include specific-binding entities, antibodies (primary antibodies, secondary antibodies, or antibody conjugates), nucleic acid probes, oligonucleotide sequences, detection probes, chemical moieties bearing a reactive functional group or a protected functional group, enzymes, solutions or suspensions of dye or stain molecules.
[0267] Commercially-available staining units typically operate on one of the following principles: (1) open individual slide staining, in which slides are positioned horizontally and reagents are dispensed as a puddle on the surface of the slide containing a tissue sample (such as implemented on the DAKO AUTOSTAINER Link 48 (Agilent Technologies) and intelliPATH (Biocare Medical) stainers); (2) liquid overlay technology, in which reagents are either covered with or dispensed through an inert fluid layer deposited over the sample (such as implemented on the Ventana BenchMark and DISCOVERY stainers); (3) capillary gap staining, in which the slide surface is placed in proximity to another surface (which may be another slide or a coverplate) to create a narrow gap, through which capillary forces draw up and keep liquid reagents in contact with the samples (such as the staining principles used by the DAKO TECHMATE, Leica BOND, and DAKO OMNIS stainers). Some iterations of capillary gap staining do not mix the fluids in the gap (such as on the DAKO TECHMATE and the Leica BOND). In variations of capillary gap staining termed dynamic gap staining, capillary forces are used to apply sample to the slide, and then the parallel surfaces are translated relative to one another to agitate the reagents during incubation to effect reagent mixing (such as the staining principles implemented on DAKO OMNIS slide stainers (Agilent)). In translating gap staining, a translatable head is positioned over the slide. A lower surface of the head is spaced apart from the slide by a first gap sufficiently small to allow a meniscus of liquid to form from liquid on the slide during translation of the slide. A mixing extension having a lateral dimension less than the width of a slide extends from the lower surface of the translatable head to define a second gap smaller than the first gap between the mixing extension and the slide. During translation of the head, the lateral dimension of the mixing extension is sufficient to generate lateral movement in the liquid on the slide in a direction generally extending from the second gap to the first gap. See WO 2011/139978 A1.
It has recently been proposed to use inkjet technology to deposit reagents on slides. See WO 2016/170008 A1. This list of staining technologies is not intended to be comprehensive, and any fully or semi-automated system for performing biomarker staining may be incorporated into the histochemical staining platform.
[0268] Where a morphologically stained sample is also desired, an automated H&E staining platform may be used. Automated systems for performing staining typically operate on one of two staining principles: batch staining (also referred to as "dip 'n dunk") or individual slide staining. Batch stainers generally use vats or baths of reagents in which many slides are immersed at the same time. Individual slide stainers, on the other hand, apply reagent directly to each slide, and no two slides share the same aliquot of reagent. Examples of commercially available stainers include the VENTANA SYMPHONY (individual slide stainer) and VENTANA HE 600 (individual slide stainer) series H&E stainers from Roche; the Dako CoverStainer (batch stainer) from Agilent Technologies; the Leica ST4020 Small Linear Stainer (batch stainer), Leica ST5020 Multistainer (batch stainer), and the Leica ST5010 Autostainer XL series (batch stainer) H&E stainers from Leica Biosystems Nussloch GmbH.
[0269] After the specimens are stained, the stained samples can be manually analyzed on a microscope, and/or digital images of the stained samples can be acquired for archiving and/or digital analysis (e.g., with image acquisition apparatus 12B). Digital images can be captured via a scanning platform such as a slide scanner that can scan the stained slides at 20x, 40x, or other magnifications to produce high resolution whole-slide digital images. At a basic level, the typical slide scanner includes at least: (1) a microscope with lens objectives, (2) a light source (such as halogen, light emitting diode, white light, and/or multispectral light sources, depending on the dye), (3) robotics to move glass slides around or to move the optics around the slide or both, (4) one or more digital cameras for image capture, (5) a computer and associated software to control the robotics and to manipulate, manage, and view digital slides. Digital data at a number of different X-Y locations (and in some cases, at multiple Z planes) on the slide are captured by the camera’s charge-coupled device (CCD), and the images are joined together to form a composite image of the entire scanned surface. Common methods to accomplish this include:
[0270] (1) Tile-based scanning, in which the slide stage or the optics are moved in small increments to capture square image frames, which overlap adjacent squares to a slight degree. The captured squares are then automatically matched to one another to build the composite image; and
[0271] (2) Line-based scanning, in which the slide stage moves in a single axis during acquisition to capture a number of composite image "strips." The image strips can then be matched with one another to form the larger composite image.
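The geometry of tile-based composition can be sketched as follows. Real scanners register adjacent tiles by matching image content in the overlap; this illustrative helper simply trims a fixed overlap margin and butt-joins the tiles:

```python
import numpy as np

def stitch_tiles(tiles, overlap):
    """Naive tile-based composition: join a grid of equally sized tiles,
    discarding the overlapping margin of each non-first tile.  Real
    scanners align tiles by matching the overlap content; this sketch
    only illustrates the geometry."""
    rows = []
    for tile_row in tiles:
        trimmed = [tile_row[0]] + [t[:, overlap:] for t in tile_row[1:]]
        rows.append(np.concatenate(trimmed, axis=1))
    trimmed_rows = [rows[0]] + [r[overlap:, :] for r in rows[1:]]
    return np.concatenate(trimmed_rows, axis=0)

tile = np.ones((100, 100))
mosaic = stitch_tiles([[tile, tile], [tile, tile]], overlap=10)
print(mosaic.shape)   # (190, 190)
```

Line-based scanning differs only in that the pieces being joined are full-height strips rather than squares.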
[0272] A detailed overview of various scanners (both fluorescent and brightfield) can be found at Farahani et al., Whole slide imaging in pathology: advantages, limitations, and emerging perspectives, Pathology and Laboratory Medicine Int'l, Vol. 7, pp. 23-33 (June 2015), the disclosure of which is incorporated by reference in its entirety. Examples of commercially available slide scanners include: 3DHistech PANNORAMIC SCAN II; DigiPath PATHSCOPE; Hamamatsu NANOZOOMER RS, HT, and XR; Huron TISSUESCOPE 4000, 4000XT, and HS; Leica SCANSCOPE AT, AT2, CS, FL, and SCN400; Mikroscan D2; Olympus VS120-SL; Omnyx VL4 and VL120; PerkinElmer LAMINA; Philips ULTRA-FAST SCANNER; Sakura Finetek VISIONTEK; Unic PRECICE 500 and PRECICE 600x; and Zeiss AXIO SCAN.Z1. In some embodiments, the scanning device is a digital pathology device as disclosed in any of United States Patent No. 9,575,301; U.S. Patent Application Publication No. 2014/0178169; U.S. Patent Application Publication No. 2021/0092308; and/or U.S. Patent Application Publication No. 2021/0088769, the content of each of which is incorporated by reference in its entirety.
[0273] Exemplary commercially available image analysis software packages include the VENTANA VIRTUOSO software suite (Ventana Medical Systems, Inc.); TISSUE STUDIO, DEVELOPER XD, and IMAGE MINER software suites (Definiens); BIOTOPIX, ONCOTOPIX, and STEREOTOPIX software suites (Visiopharm); and the HALO platform (Indica Labs, Inc.).
[0274] In some embodiments, any imaging may be accomplished using any of the systems disclosed in U.S. Patent Nos. 10,317,666 and 10,313,606, the disclosures of which are hereby incorporated by reference herein in their entireties. The imaging apparatus may be a brightfield imager such as the iScan Coreo™ brightfield scanner or the DP200 scanner sold by Ventana Medical Systems, Inc.
[0275] In some cases, the images may be analyzed on an image analysis system. Image analysis system may include one or more computing devices such as desktop computers, laptop computers, tablets, smartphones, servers, application-specific computing devices, or any other type(s) of electronic device(s) capable of performing the techniques and operations described herein. In some embodiments, image analysis system may be implemented as a single device. In other embodiments, image analysis system may be implemented as a combination of two or more devices together achieving the various functionalities discussed herein. For example, image analysis system may include one or more server computers and one or more client computers communicatively coupled to each other via one or more local-area networks and/or wide-area networks such as the Internet. The image analysis system typically includes at least a memory, a processor, and a display. Memory may include any combination of any type of volatile or non-volatile memories, such as random-access memories (RAMs), read-only memories such as an Electrically Erasable Programmable Read-Only Memory (EEPROM), flash memories, hard drives, solid state drives, optical discs, and the like. It is appreciated that memory can be included
in a single device and can also be distributed across two or more devices. Processor may include one or more processors of any type, such as central processing units (CPUs), graphics processing units (GPUs), special-purpose signal or image processors, field-programmable gate arrays (FPGAs), tensor processing units (TPUs), and so forth. It is appreciated that processor can be included in a single device and can also be distributed across two or more devices. Display may be implemented using any suitable technology, such as LCD, LED, OLED, TFT, Plasma, etc. In some implementations, display may be a touch-sensitive display (a touchscreen). Image analysis system also typically includes a software system stored on the memory comprising a set of instructions implementable on the processor, the instructions comprising various image analysis tasks, such as object identification, stain intensity quantification, and the like. Exemplary commercially available software packages useful in implementing modules as disclosed herein include VENTANA VIRTUOSO; Definiens TISSUE STUDIO, DEVELOPER XD, and IMAGE MINER; and Visiopharm BIOTOPIX, ONCOTOPIX, and STEREOTOPIX software packages.
[0276] After the specimens are processed, a user can transport specimen-bearing slides to the imaging apparatus. In some embodiments, the imaging apparatus is a brightfield imager slide scanner. One brightfield imager is the iScan Coreo brightfield scanner sold by Ventana Medical Systems, Inc. In automated embodiments, the imaging apparatus is a digital pathology device as disclosed in International Patent Application No. PCT/US2010/002772 (Patent Publication No. WO/2011/049608) entitled IMAGING SYSTEM AND TECHNIQUES or disclosed in U.S. Patent Application No. 61/533,114, filed on Sep. 9, 2011, entitled IMAGING SYSTEMS, CASSETTES, AND METHODS OF USING THE SAME. International Patent Application No. PCT/US2010/002772 and U.S. Patent Application No. 61/533,114 are incorporated by reference in their entireties.
[0277] Embodiments of the subject matter and the operations described in this specification can be implemented in digital electronic circuitry, or in computer software, firmware, or hardware, including the structures disclosed in this specification and their structural equivalents, or in combinations of one or more of them. Embodiments of the subject matter described in this specification can be implemented as one or more computer programs, for example, one or more modules of computer program instructions, encoded on computer storage medium for execution by, or to control the operation of, data processing apparatus. Any of the modules described herein may include logic that is executed by the processor(s). "Logic," as used herein, refers to any
information having the form of instruction signals and/or data that may be applied to affect the operation of a processor. Software is an example of logic.
[0278] A computer storage medium can be, or can be included in, a computer-readable storage device, a computer-readable storage substrate, a random or serial access memory array or device, or a combination of one or more of them. Moreover, while a computer storage medium is not a propagated signal, a computer storage medium can be a source or destination of computer program instructions encoded in an artificially generated propagated signal. The computer storage medium can also be, or can be included in, one or more separate physical components or media (e.g., multiple CDs, disks, or other storage devices). The operations described in this specification can be implemented as operations performed by a data processing apparatus on data stored on one or more computer-readable storage devices or received from other sources.
[0279] The term "programmed processor" encompasses all kinds of apparatus, devices, and machines for processing data, including by way of example a programmable microprocessor, a computer, a system on a chip, or multiple ones, or combinations, of the foregoing. The apparatus can include special purpose logic circuitry, e.g., an FPGA (field programmable gate array) or an ASIC (application-specific integrated circuit). The apparatus also can include, in addition to hardware, code that creates an execution environment for the computer program in question, e.g., code that constitutes processor firmware, a protocol stack, a database management system, an operating system, a cross-platform runtime environment, a virtual machine, or a combination of one or more of them. The apparatus and execution environment can realize various different computing model infrastructures, such as web services, distributed computing, and grid computing infrastructures.
[0280] A computer program (also known as a program, software, software application, script, or code) can be written in any form of programming language, including compiled or interpreted languages, declarative or procedural languages, and it can be deployed in any form, including as a stand-alone program or as a module, component, subroutine, object, or other unit suitable for use in a computing environment. A computer program may, but need not, correspond to a file in a file system. A program can be stored in a portion of a file that holds other programs or data (e.g., one or more scripts stored in a markup language document), in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, subprograms, or portions of code). A computer program can be deployed to be executed
on one computer or on multiple computers that are located at one site or distributed across multiple sites and interconnected by a communication network.
[0281] The processes and logic flows described in this specification can be performed by one or more programmable processors executing one or more computer programs to perform actions by operating on input data and generating output. The processes and logic flows can also be performed by, and apparatus can also be implemented as, special purpose logic circuitry, e.g., an FPGA (field programmable gate array) or an ASIC (application-specific integrated circuit).
[0282] Processors suitable for the execution of a computer program include, by way of example, both general and special purpose microprocessors, and any one or more processors of any kind of digital computer. Generally, a processor will receive instructions and data from a read-only memory or a random-access memory or both. The essential elements of a computer are a processor for performing actions in accordance with instructions and one or more memory devices for storing instructions and data. Generally, a computer will also include or be operatively coupled to receive data from or transfer data to, or both, one or more mass storage devices for storing data, e.g., magnetic, magneto-optical disks, or optical disks. However, a computer need not have such devices. Moreover, a computer can be embedded in another device, e.g., a mobile telephone, a personal digital assistant (PDA), a mobile audio or video player, a game console, a Global Positioning System (GPS) receiver, or a portable storage device (e.g., a universal serial bus (USB) flash drive), to name just a few. Devices suitable for storing computer program instructions and data include all forms of non-volatile memory, media, and memory devices, including by way of example semiconductor memory devices, e.g., EPROM, EEPROM, and flash memory devices; magnetic disks, e.g., internal hard disks or removable disks; magneto-optical disks; and CD-ROM and DVD-ROM disks. The processor and the memory can be supplemented by, or incorporated in, special purpose logic circuitry.
[0283] To provide for interaction with a user, embodiments of the subject matter described in this specification can be implemented on a computer having a display device, e.g., an LCD (liquid crystal display), LED (light emitting diode) display, or OLED (organic light emitting diode) display, for displaying information to the user and a keyboard and a pointing device, e.g., a mouse or a trackball, by which the user can provide input to the computer. In some implementations, a touch screen can be used to display information and receive input from a user. Other kinds of devices can be used to provide for interaction with a user as well; for example, feedback provided
to the user can be in any form of sensory feedback, e.g., visual feedback, auditory feedback, or tactile feedback; and input from the user can be received in any form, including acoustic, speech, or tactile input. In addition, a computer can interact with a user by sending documents to and receiving documents from a device that is used by the user; for example, by sending web pages to a web browser on a user's client device in response to requests received from the web browser.
[0284] Embodiments of the subject matter described in this specification can be implemented in a computing system that includes a back-end component, e.g., as a data server, or that includes a middleware component, e.g., an application server, or that includes a front-end component, e.g., a client computer having a graphical user interface or a Web browser through which a user can interact with an implementation of the subject matter described in this specification, or any combination of one or more such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication, e.g., a communication network. Examples of communication networks include a local area network ("LAN") and a wide area network ("WAN"), an internetwork (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks).
[0285] The computing system can include any number of clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. In some embodiments, a server transmits data (e.g., an HTML page) to a client device (e.g., for purposes of displaying data to and receiving user input from a user interacting with the client device). Data generated at the client device (e.g., a result of the user interaction) can be received from the client device at the server.
[0286] EXAMPLES
[0287] Example 1 - Digital Contrast on Unstained Tissue Slides Using Multispectral Microscopy and Deep Learning
[0288] Introduction
[0289] Histopathologists use chemical staining techniques on tissue samples to highlight microscopic structure and composition looking for abnormalities that indicate the presence, nature, and extent of pathology. Tissue processing steps are time-consuming and expensive. Each assay requires highly trained and scarce histotechnicians and produces chemical waste. Furthermore,
tissue processing and staining are destructive; cutting through multiple tissue sections can deplete valuable biopsy samples.
[0290] Emerging technologies for label-free visualization of tissue samples hold significant promise to improve histology practice in the near future. Virtual staining utilizes one or more computerized algorithms to create an artificial effect of staining without physically tampering with the slide. It uses optical means to scan tissue sections and deep learning techniques to convert the digitized signal into rendered images that pathologists can use as a diagnostic tool. Some advantages include reduced variability, process robustness, increased speed, marker multiplexing, reduced processing expertise, and potential biomarkers with novel medical value.
[0291] This project utilized a multispectral scanner 12A (see also FIG. 5 which illustrates non-limiting components which may be included within any multispectral scanner). An unstained multi-tissue array (MTA) section was scanned after performing a dewax protocol, obtaining endogenous contrast at 12 different wavelengths (365, 400, 435, 470, 500, 550, 580, 635, 660, 690, 780 and 850 nm). The multispectral images were processed and transformed using deep learning techniques. It was possible to obtain a digital H&E render that was evaluated to have diagnostic quality by two certified pathologists.
[0292] Methods
[0293] The experiment was designed as a feasibility test using a multispectral microscopy scanner originally designed for multiplexing assays. The objective was to test the possibility of transforming the endogenous contrast of unstained tissue, scanned at multiple wavelengths, into an H&E-like assay that could be used for primary diagnostics.
[0294] Step 1 - Sample preparation
[0295] Multi-tissue arrays (MTAs) containing liver, kidney, skin, colon, and tonsil were cut and mounted on glass slides. The slides were dewaxed using a modified HE600 protocol to remove paraffin from the sample.
[0296] Step 2 - Multispectral Scanning
[0297] The unstained and dewaxed samples were scanned using the FLASH multispectral microscope 12A and digitized using 12 different wavelengths. The sections were scanned without a coverslip to reveal endogenous contrast at the different wavelengths.
[0298] Step 3 - Staining and Digitizing
[0299] After the multispectral scanning, the section was stained in an HE600 and digitized with a DP200 (both available from Ventana Medical Systems, Tucson, AZ). The resulting digital image was used as ground truth in the algorithm training.
[0300] Step 4 - Algorithm Training
[0301] A dimensionality reduction algorithm, Principal Component Analysis (PCA), was used to compress the 12 channels into 3, which were then coded into an RGB image. The PCA-reduced and DP200 images were coregistered, segmented, and paired. A pix2pix algorithm was trained using Roche's High Performance Computing cluster at Santa Clara, CA.
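As an illustrative sketch only (not code from this disclosure), the 12-to-3 PCA compression and RGB coding described above could look as follows; `pca_to_rgb` is a hypothetical helper name, and the normalization-to-255 convention is an assumption:

```python
import numpy as np

def pca_to_rgb(hypercube):
    """Reduce a (H, W, C) multispectral stack to its first three
    principal components and code them as an 8-bit RGB image."""
    h, w, c = hypercube.shape
    flat = hypercube.reshape(-1, c).astype(np.float64)
    flat -= flat.mean(axis=0)                      # center each channel
    # Principal axes via SVD of the centered data matrix
    _, _, vt = np.linalg.svd(flat, full_matrices=False)
    scores = flat @ vt[:3].T                       # first 3 components
    # Normalize each component to 0..255 and stack as R, G, B
    lo, hi = scores.min(axis=0), scores.max(axis=0)
    rgb = (scores - lo) / np.where(hi > lo, hi - lo, 1.0) * 255.0
    return rgb.reshape(h, w, 3).astype(np.uint8)

# Example with a synthetic 12-channel image
cube = np.random.rand(64, 64, 12)
rgb = pca_to_rgb(cube)
```

Per-channel weights could subsequently be applied to the three components to emphasize features of interest, as the disclosure notes elsewhere.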
[0302] Step 5 - Model Testing
[0303] A model was tested for each tissue type and the virtual renders were stitched back together. The images were shown to two certified pathologists of the Clinical Development Laboratory. Individual features were evaluated and the feasibility of using same-quality images for diagnostic purposes was studied.
[0304] Results
[0305] Results of the experiment are shown in FIGS. 17 and 18.
[0306] Correct registration of the multispectral images and the ground truth was one of the most important and difficult steps in training an algorithm. Nevertheless, good results were obtained on an overfitted training and testing set. The results were better for tonsil, liver, and kidney, while intestine and skin had the worst quality in the array.
[0307] Discussion
[0308] Coregistering the RGB-coded images of the PCA-reduced multispectral digitization with the brightfield images was a critical step, especially when using the pix2pix algorithm. The algorithm worked better when segmenting the whole image into 256 x 256-pixel pieces, with the spectral images and the 'ground truth' coregistered as closely as possible. Nevertheless, chemically stained tissue samples will not match their unstained counterparts exactly, due to stretching and other transformations incurred during the staining process. Furthermore, the DP200 and FLASH have slightly different resolutions, which complicates coregistering both images. To reduce the coregistration difficulty, the ground truth (chemically stained slide) was also scanned in the multispectral scanner, coding the wavelengths closest to red, green, and blue into an RGB image, and then transforming them to an image that resembles brightfield H&E, as shown below.
[0309] Conclusions
[0310] The feasibility test was conducted using a single MTA, which was scanned with 12 wavelengths (FLASH) before chemically staining it and scanning it under a brightfield microscope (DP200). The experiment used an overfitted model (the same dataset was used for training and testing). Testing on a bigger dataset where independent data is used for training and testing is required to verify the results. Much better results were obtained with liver, tonsil, and kidney than with intestine and skin, probably because of the amount of fat and heterogeneous morphology contained in the latter two tissue types. A test done with three arbitrarily selected wavelengths instead of the PCA reductions rendered images that resembled H&E stains, but with digital hallucinations that were quickly spotted by the pathologist during the assessment.
[0311] Example 2
[0312] The method starts with the preparation of the sample, which is a FFPE section mounted on a standard glass microscope slide as commonly done in the normal histopathology workflow. The slide was then dewaxed using a standard protocol that can be performed by chemical means (baths of xylene and other chemicals), or by means of applying heat to melt the paraffin away. The next step was to image the slides (with no coverslip) on a device that had the ability to map the endogenous absorption spectra of the tissue sample at a series of discrete wavelengths, to obtain coregistered images at a plurality of wavelengths ranging possibly from the UV to the IR. The device had to have an array sensor (e.g., CCD, CMOS) that can obtain images at the different wavelengths, the means to focus the images at each wavelength, and the means of illuminating the sample at specific spectral wavelengths and/or combinations of wavelengths.
[0313] The images obtained were then compressed into 3 channels (using dimensionality-reducing methods like PCA, tSNE, or UMAP) and coded into an RGB image. The weights of each channel were controlled to enhance features of interest on each channel. The images were then segmented into 256 x 256 pixels and converted by a previously trained GAN algorithm into H&E-like digital stains. After the recoloring, the images were stitched back together and were optionally smoothed to mitigate tiling artifacts.
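The tile-transform-stitch loop described above can be sketched as follows; this is a minimal illustration only, with `virtually_stain` and `transform_tile` as hypothetical names and an identity function standing in for the trained GAN generator:

```python
import numpy as np

TILE = 256  # tile size used throughout the examples

def virtually_stain(rgb_image, transform_tile):
    """Tile an RGB image into 256 x 256 patches, run each patch
    through a tile transform (e.g., a GAN generator), and stitch
    the transformed patches back into a full image."""
    h, w, _ = rgb_image.shape
    out = np.zeros_like(rgb_image)
    # Iterate over full tiles only; edge handling is omitted here
    for y in range(0, h - h % TILE, TILE):
        for x in range(0, w - w % TILE, TILE):
            tile = rgb_image[y:y+TILE, x:x+TILE]
            out[y:y+TILE, x:x+TILE] = transform_tile(tile)
    return out

# Stand-in for the trained generator: identity mapping
stained = virtually_stain(np.zeros((512, 512, 3), np.uint8), lambda t: t)
```

In practice the per-tile transform would be the trained GAN inference call, and overlapping tiles with blending (as in Example 3) would replace the simple abutting layout shown here.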
[0314] For training purposes, the same slide used for scanning the multispectral images was stained using the staining protocol of choice (e.g., H&E in the HE600). The slide was coverslipped and then scanned with a brightfield scanner. The digital images obtained from the scanning were then co-registered with the multispectral ones using the TrakEM2 ImageJ plugin. Using the ImageJ plugin, landmarks were manually designated between a target image and an image
to be registered. An affine transform was then used to co-register the images based on the landmarks. Once they were co-registered, both images were segmented into 256 x 256-pixel images and paired (multispectral and brightfield scans of the H&E-stained tissue sample) and used to train a GAN algorithm in a supervised manner. This process was repeated for a series of slides until the algorithm learned the majority of variation in tissue samples. The trained algorithm can then be used to transform future tissue slides without the need of chemical stain.
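The landmark-based affine co-registration step can be illustrated with a least-squares fit; this is a sketch under stated assumptions (2-D point landmarks, a `fit_affine` helper name introduced here), not the TrakEM2 implementation:

```python
import numpy as np

def fit_affine(src, dst):
    """Least-squares affine transform mapping landmark points
    src (N, 2) onto dst (N, 2); returns a 2 x 3 matrix M such
    that dst ~= [x, y, 1] @ M.T for each source point."""
    n = src.shape[0]
    A = np.hstack([src, np.ones((n, 1))])     # homogeneous source coords
    # Solve A @ X ~= dst in the least-squares sense
    X, *_ = np.linalg.lstsq(A, dst, rcond=None)
    return X.T

# Three landmark pairs related by a pure translation of (+2, +3)
src = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
dst = src + np.array([2.0, 3.0])
M = fit_affine(src, dst)
```

With at least three non-collinear landmark pairs the affine parameters are fully determined; more pairs simply over-determine the fit and average out manual placement error.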
[0315] Example 3
[0316] The tissue sample was selected to prove the diagnostic ability of the described method to generate an H&E virtual stain that can be used in the diagnosis of cancerous lesions. FFPE blocks containing samples of cancerous breast resections were selected, sectioned with a microtome (thickness 3 - 5 um) and mounted on plus-charged glass slides. The samples were placed in an HE600 to bake (5 minutes in an oven) and were deparaffinized using heat to melt the wax and EZprep for rinsing, following the standard method on the HE600.
[0317] Next, the samples were placed on the multispectral scanner stage. After selection/detection of the area of interest by the user, the sample was illuminated with the first (of multiple) wavelengths, with a predefined illumination power (controlled by a pulse width modulation circuit to reduce/augment the duty cycle of the LED light source). The objective of the scanner was focused by means of moving the sample or the objective, and a first field of view (FOV) was digitized with the camera with a predefined exposure time and gain. Then the sample was illuminated with a second wavelength. The following wavelengths were used in this experiment: 365 nm, 400 nm, 435 nm, 470 nm, 500 nm, 550 nm, 580 nm, 635 nm, 660 nm, 690 nm, 780 nm and 850 nm. Nevertheless, two wavelengths (470 and 780 nm) were discarded as they contained optical artifacts due to poor scanning, ending up with only 10 wavelengths.
[0318] Only the central wavelength (550 nm) was focused (with a feedback-controlled autofocus algorithm), and all the other wavelengths were digitized without moving the Z distance in the same FOV. The process was repeated until the FOV was digitized at all the different wavelengths. Next, the stage was moved to a different location in the vicinity of the previous location with some area overlapping. The FOVs were stitched together to form a whole slide hypercube.
[0319] Once the multispectral scanning was finished with all the AOIs, the next step was to stain the sample in H&E following the standard HE600 protocol. Once the slide was stained and
coverslipped, it was introduced onto the stage of the same multispectral scanner. The above-mentioned process was repeated but with only three wavelengths: 660 nm for red, 550 nm for green, and 435 nm for blue. Once the multispectral digital microscope finished scanning all the AOIs, a WSI of the raw data was stored on the disk.
[0320] This WSI of the stained sample was then processed to create an RGB image by placing the image illuminated with 660 nm in the red channel, the one illuminated with 550 nm in the green channel and the one illuminated with 435 nm in the blue channel.
[0321] This WSI of the stained sample was then processed to apply a previously calibrated color correction matrix to correct the colors and perform a white balancing, in such a way that the image closely resembles what a user would see using a brightfield microscope.
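A minimal sketch of the color correction and white balancing step, assuming a per-channel white-balance model and a `color_correct` helper name introduced here for illustration (the disclosure does not specify the calibration form):

```python
import numpy as np

def color_correct(rgb, ccm, white):
    """Apply a precalibrated 3 x 3 color correction matrix, then
    white-balance against a measured white reference (per-channel
    gains), returning an 8-bit image."""
    img = rgb.astype(np.float64) / 255.0
    img = img @ ccm.T                          # mix channels with the CCM
    img = img / np.asarray(white)              # scale so 'white' maps to 1.0
    return (np.clip(img, 0.0, 1.0) * 255.0).astype(np.uint8)

# Identity CCM as a placeholder for the real calibration
ccm = np.eye(3)
out = color_correct(np.full((4, 4, 3), 128, np.uint8), ccm, [0.5, 0.5, 0.5])
```

In a real calibration, the matrix would be fitted from a color target so that the LED-illuminated channels reproduce brightfield colorimetry.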
[0322] Next, Principal Component Analysis reduction was performed on the spectral hypercube images to reduce the channels from 10 to 3. The principal component values were normalized to values from 0 to 255 and coded into the channels of an RGB image, where the first channel (i.e., red) was the first principal component, the second channel (i.e., green) was the second principal component, and the third channel (i.e., blue) was the third principal component.
[0323] Once the dataset was complete (unstained and stained), the next step was to coregister the images to a pixel level. To do so, an automatic algorithm was applied that transformed both images (unstained PCA and stained RGB, or ground truth) by applying filters (e.g., Laplacian, Sobel, etc.) so that the general morphology of the tissue was enhanced in such a way that both filtered images would reveal similar structures (so-called descriptors). Next, a Fourier-based correlation algorithm was applied, translating and rotating the filtered image of the ground truth to find the position and rotation that would lead to the highest correlation. Next, the required transformations (translation and rotation) were applied to the ground truth, producing two coarsely registered images (unstained PCA and ground truth).
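The translation part of the Fourier-based correlation search can be illustrated with phase correlation; this sketch handles translation only (the rotation search described above is omitted), and `phase_correlation_shift` is a name introduced here:

```python
import numpy as np

def phase_correlation_shift(fixed, moving):
    """Estimate the integer (dy, dx) translation aligning 'moving'
    to 'fixed' via normalized Fourier cross-correlation."""
    F = np.fft.fft2(fixed)
    G = np.fft.fft2(moving)
    cross = F * np.conj(G)
    cross /= np.abs(cross) + 1e-12            # keep phase only
    corr = np.fft.ifft2(cross).real
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    # Wrap shifts past the midpoint to negative offsets
    h, w = fixed.shape
    if dy > h // 2: dy -= h
    if dx > w // 2: dx -= w
    return dy, dx

# A square patch shifted by (+5, -3) is recovered exactly
img = np.zeros((64, 64)); img[20:30, 20:30] = 1.0
shifted = np.roll(img, (5, -3), axis=(0, 1))
dy, dx = phase_correlation_shift(shifted, img)
```

Rotation could be handled, as the example describes, by repeating this correlation over a set of candidate rotations of the filtered ground-truth image and keeping the best peak.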
[0324] The next step was to find the FOVs of the stitched image when dealing with a WSI. To do so, a Laplacian filter was applied to look for discontinuities, revealing the stitch lines. Both images were then divided into individual FOV images, and the above-described process was repeated, filtering paired images to reveal descriptors and finding the best correlation by translating and rotating the filtered version of the FOV containing the ground truth. Once the best position was found, the transformation was applied to the FOV with no filtering, and the images were cut to discard black areas created by rotating or moving the ground truth, and the result was two coregistered
images (PCA and ground truth). A final step further divided each registered FOV into tiles of 270 x 270 pixels. The process was repeated at the tile level, creating pixel-level registered tiles of 256 x 256 pixels. All the images were then saved to disk to train a GAN algorithm. The coregistration method also applied a strict quality control to discard the tiles that were not above the Fourier correlation threshold, as well as background tiles (white tiles that contain no tissue information).
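The tile-level quality control could be sketched as below; the function name `keep_tile` and the specific threshold values are illustrative assumptions, not values stated in the disclosure:

```python
import numpy as np

def keep_tile(unstained, stained, corr_threshold=0.5, bg_threshold=240):
    """Quality control for a registered tile pair: discard near-white
    background tiles and pairs whose normalized cross-correlation
    falls below a threshold."""
    if stained.mean() > bg_threshold:          # mostly white -> no tissue
        return False
    a = unstained.astype(np.float64).ravel(); a -= a.mean()
    b = stained.astype(np.float64).ravel();   b -= b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0:                             # flat tile, nothing to match
        return False
    return float(a @ b / denom) >= corr_threshold

tissue = (np.arange(256).reshape(16, 16) % 200).astype(np.uint8)
white = np.full((16, 16), 255, np.uint8)
```

A tile perfectly correlated with its pair passes, while an all-white tile is rejected regardless of correlation.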
[0325] To test the algorithm, the hyperspectral cube of the WSI of the same section was divided into 256 x 256-pixel tiles, with an overlap of 100 pixels in both the x and y directions, and reduced with PCA using the same method described above to create the RGB images from the PCA components. Of the testing dataset, 40% of the tiles were used to train and 60% of the tiles were from areas not used in the training dataset. The trained algorithm was used to transform the PCA-reduced spectral images to a virtual H&E resembling the brightfield image of the stained sample. The tiles were then stitched together, overlapping and blending them with a linear blending algorithm to reduce stitching artifacts and create the effect of a WSI of a brightfield image. FIG. 19 shows the resulting virtually stained whole slide image.
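One way the linear blending of overlapping tiles could work is a weighted accumulation with a ramp that peaks at each tile's center; `blend_tiles` is a hypothetical helper name and the exact ramp shape is an assumption:

```python
import numpy as np

def blend_tiles(canvas, weight, tile, y, x):
    """Accumulate one (grayscale) tile onto a canvas with a linear
    ramp so overlapping tile borders cross-fade instead of leaving
    visible seams; normalize by 'weight' after all tiles are added."""
    t = tile.shape[0]
    ramp = np.minimum(np.arange(1, t + 1), np.arange(t, 0, -1)).astype(np.float64)
    w2d = np.outer(ramp, ramp)                 # weight peaks at tile center
    canvas[y:y+t, x:x+t] += tile * w2d
    weight[y:y+t, x:x+t] += w2d

canvas = np.zeros((8, 12)); weight = np.zeros((8, 12))
tile = np.ones((8, 8))
blend_tiles(canvas, weight, tile, 0, 0)        # two tiles overlapping by 4 px
blend_tiles(canvas, weight, tile, 0, 4)
result = canvas / np.maximum(weight, 1e-12)    # normalize by total weight
```

Because the weights sum out in the normalization, identical overlapping content reconstructs exactly, and disagreeing content fades smoothly from one tile to the next.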
[0326] Example 4
[0327] The tissue sample was selected to prove the diagnostic ability of the described method to generate a Masson's Trichrome virtual stain. FFPE blocks containing samples of colorectal (CRC) resections were selected, sectioned with a microtome (thickness 3 - 5 um) and mounted on TOMO glass slides. The samples were taken to a BenchMark Special Stainer and deparaffinized using the deparaffinization steps selected in a standard Trichrome protocol. This involves utilizing BenchMark Special Stains Deparaffinization Solution (Cat. No. 860-036 / 06523102001) and BenchMark Special Stains Wash II (Cat. No. 860-041 / 08309817001). Next, the samples were placed on the multispectral scanner stage. After selection/detection of the area of interest by the user, the sample was illuminated with the first (of multiple) wavelengths, with a predefined illumination power (controlled by a pulse width modulation circuit to reduce/augment the duty cycle of the LED light source). The objective of the scanner was focused by means of moving the sample or the objective, and a first field of view (FOV) was digitized with the camera with a predefined exposure time and gain. Then the sample was illuminated with a second wavelength. The following wavelengths were used in this experiment: 365 nm, 400 nm, 435 nm, 470 nm, 500 nm, 550 nm, 580 nm, 635 nm, 660 nm, 690 nm, 780 nm and 850 nm.
[0328] Only the central wavelength (635 nm) was focused (with a feedback-controlled autofocus algorithm), and all the other wavelengths were digitized without moving the Z distance in the same FOV. The process was repeated until the FOV was digitized at all the different wavelengths. Next, the stage was moved to a different location in the vicinity of the previous location with some area overlapping. The FOVs were stitched together to form a whole slide hypercube.
[0329] Once the multispectral scanning was finished with all the AOIs, the next step was to stain the sample with Masson's Trichrome on the BenchMark Special Stainer, following this protocol: application of Special Stains LCS and 300uL of Trichrome Bouins A, incubated for 32min; slide washing with Special Stains Wash II; addition of Special Stains LCS and deposition of 150uL each of Hematoxylin A + B, incubated for 12min; slide washing with Special Stains Wash II; addition of Special Stains LCS and 300uL of Trichrome Red, incubated for 8min; slide rinse with Special Stains Wash II; addition of Special Stains LCS and deposition of 150uL of Trichrome Mordant, incubated for 12min; the slide was washed with Special Stains Wash II, another 100uL of Trichrome Mordant was applied and incubated for 4min, and Special Stains LCS was added; then 300uL of Trichrome Blue were added and incubated for 24min; the slide was washed with Special Stains Wash II; Special Stains LCS and 100uL of Trichrome Clarifier were added and incubated for 4min; and the slide was then rinsed one final time with Special Stains Wash II.
[0330] Once the slide was stained and coverslipped, it was introduced onto the stage of the same multispectral scanner. The above-mentioned process was repeated but with only three wavelengths: 660 nm for red, 550 nm for green, and 435 nm for blue. In some embodiments, the positions of the FOVs of the stained images closely match the positions of the FOVs of the unstained samples through the use of fiducials located on the sample (e.g., edges, tags, etc.), and the data was stored onto a disk using a focusing map that was the same as the one used for the unstained sample. Once the multispectral digital microscope finished scanning all the areas of interest (AOIs), a WSI of the raw data was stored on the disk.
[0331] This WSI of the stained sample was then processed to create an RGB image by placing the image illuminated with 660 nm in the red channel, the one illuminated with 550 nm in the green channel and the one illuminated with 435 nm in the blue channel.
[0332] No PCA reduction was used in this example. Instead, the RGB images were registered to the 12 wavelengths with no further processing.
[0333] Once the dataset was complete (unstained and stained), the next step was to coregister the images to a pixel level. To do so, an automatic algorithm transformed both images (the unstained hypercube and the stained RGB, or ground truth) by applying filters (e.g., Laplacian, Sobel, etc.) so that the general morphology of the tissue was enhanced in such a way that both filtered images would reveal similar structures (so-called descriptors). Next, a Fourier-based correlation algorithm was applied, translating and rotating the filtered image of the ground truth to find the position and rotation that would lead to the highest correlation. Next, the required transformations (translation and rotation) were applied to the ground truth, producing two coarsely registered images (unstained hypercube and stained RGB).
[0334] Next, both WSIs were divided into individual FOV images, and the above-described process was repeated, filtering paired images to reveal descriptors and finding the best correlation by translating and rotating the filtered version of the FOV containing the ground truth. Once the best position was found, the transformation was applied to the FOV with no filtering, and the images were cut to discard black areas created by rotating or moving the ground truth, and the result was two coregistered images (unstained hypercube and ground truth). A final step further divided each registered FOV into tiles of 270 x 270 pixels. The process was repeated at the tile level, creating pixel-level registered tiles of 256 x 256 pixels. All the images were then saved to disk to train a GAN algorithm.
[0335] To test the algorithm, the hyperspectral cube of the WSI of a different section not used to train the algorithm was divided into 256 x 256-pixel tiles and input into the GAN algorithm to transform the hyperspectral cube to an RGB cube. The tiles were then stitched together to form the image (see FIG. 21).
[0336] All the U.S. patents, U.S. patent application publications, U.S. patent applications, foreign patents, foreign patent applications and non-patent publications referred to in this specification and/or listed in the Application Data Sheet are incorporated herein by reference, in their entirety. Aspects of the embodiments can be modified, if necessary to employ concepts of the various patents, applications, and publications to provide yet further embodiments.
[0337] Although the present disclosure has been described with reference to a number of illustrative embodiments, it should be understood that numerous other modifications and
embodiments can be devised by those skilled in the art that will fall within the spirit and scope of the principles of this disclosure. More particularly, reasonable variations and modifications are possible in the component parts and/or arrangements of the subject combination arrangement within the scope of the foregoing disclosure, the drawings, and the appended claims without departing from the spirit of the disclosure. In addition to variations and modifications in the component parts and/or arrangements, alternative uses will also be apparent to those skilled in the art.
[0338] ADDITIONAL EMBODIMENTS
Additional Embodiment 1. A system for generating a virtually stained image of a test unstained biological specimen disposed on a substrate, the system comprising: (i) one or more processors, and (ii) one or more memories coupled to the one or more processors, the one or more memories to store computer-executable instructions that, when executed by the one or more processors, cause the system to perform operations comprising: a. obtaining a test multispectral transmission image of the test unstained biological specimen, wherein the test multispectral transmission image is derived from one or more test multispectral transmission image channel images; b. supplying the obtained test multispectral transmission image of the test unstained biological specimen to a virtual staining engine, wherein the virtual staining engine is trained to generate an image of an unstained biological specimen stained with a morphological stain, and wherein the virtual staining engine is trained from a plurality of pairs of coregistered training images derived from one or more training biological specimens, i. wherein a first training image in each pair of the plurality of coregistered training images is derived from one or more training multispectral transmission image channel images; ii. wherein a second training image in each pair of the plurality of coregistered training images comprises training brightfield image data of the same biological specimen stained with the morphological stain; and
c. with the trained virtual staining engine, generating the virtually stained image of the test unstained biological specimen stained with the morphological stain; wherein each of the one or more test multispectral transmission image channel images are acquired at about the same wavelengths as each of the one or more training multispectral transmission image channel images.
Additional Embodiment 2. The system of additional embodiment 1, wherein (i) at least three test multispectral transmission image channel images are acquired; and (ii) at least three training multispectral transmission image channel images are acquired; wherein each of the at least three test multispectral transmission image channel images are acquired at about the same wavelengths as each of the at least three training multispectral transmission image channel images.
Additional Embodiment 3. The system of additional embodiment 1, wherein (i) at least four test multispectral transmission image channel images are acquired; and (ii) at least four training multispectral transmission image channel images are acquired; wherein each of the at least four test multispectral transmission image channel images are acquired at about the same wavelengths as each of the at least four training multispectral transmission image channel images.
Additional Embodiment 4. The system of additional embodiment 3, wherein the test multispectral transmission image is generated by performing a dimensionality reduction on the at least four test multispectral transmission image channel images.
Additional Embodiment 5. The system of additional embodiment 3, wherein the first training image in each pair of coregistered training images is generated by performing a dimensionality reduction on the at least four training multispectral transmission image channel images.
Additional Embodiment 6. The system of additional embodiment 1, wherein (i) at least six test multispectral transmission image channel images are acquired; and (ii) at least six training multispectral transmission image channel images are acquired; wherein each of the at least six test multispectral transmission image channel images are acquired at about the same wavelengths as each of the at least six training multispectral transmission image channel images.
Additional Embodiment 7. The system of additional embodiment 6, wherein the test multispectral transmission image is generated by performing a dimensionality reduction on the at least six test multispectral transmission image channel images.
Additional Embodiment 8. The system of additional embodiment 6, wherein the first training image in each pair of coregistered training images is generated by performing a dimensionality reduction on the at least six training multispectral transmission image channel images.
Additional Embodiment 9. The system of additional embodiment 1, wherein (i) at least twelve test multispectral transmission image channel images are acquired; and (ii) at least twelve training multispectral transmission image channel images are acquired; wherein each of the at least twelve test multispectral transmission image channel images are acquired at about the same wavelengths as each of the at least twelve training multispectral transmission image channel images.
Additional Embodiment 10. The system of additional embodiment 9, wherein the test multispectral transmission image is generated by performing a dimensionality reduction on the at least twelve test multispectral transmission image channel images.
Additional Embodiment 11. The system of additional embodiment 9, wherein the first training image in each pair of coregistered training images is generated by performing a dimensionality reduction on the at least twelve training multispectral transmission image channel images.
Additional Embodiment 12. The system of any one of additional embodiments 1 - 11, wherein the wavelengths are selected from about 365 +/- 20 nm, about 400 +/- 20 nm, about 435 +/- 20 nm, about 470 +/- 20 nm, about 500 +/- 20 nm, about 550 +/- 20 nm, about 580 +/- 20 nm, about 635 +/- 20 nm, about 660 +/- 20 nm, about 690 +/- 20 nm, about 780 +/- 20 nm, and/or about 850 +/- 20 nm.
Additional Embodiment 13. The system of any one of additional embodiments 1 - 11, wherein the wavelengths are selected from about 365 +/- 10 nm, about 400 +/- 10 nm, about 435 +/- 10 nm, about 470 +/- 10 nm, about 500 +/- 10 nm, about 550 +/- 10 nm, about 580 +/- 10 nm, about 635 +/- 10 nm, about 660 +/- 10 nm, about 690 +/- 10 nm, about 780 +/- 10 nm, and/or about 850 +/- 10 nm.
Additional Embodiment 14. The system of any one of additional embodiments 1 - 11, wherein the wavelengths are selected from about 365 nm, about 400 nm, about 435 nm, about 470 nm, about 500 nm, about 550 nm, about 580 nm, about 635 nm, about 660 nm, about 690 nm, about 780 nm, and/or about 850 nm.
Additional Embodiment 15. The system of any one of additional embodiments 1 - 11, wherein the virtual staining engine comprises a generative adversarial network.
Additional Embodiment 16. The system of any one of additional embodiments 1 - 11, wherein the morphological stain is a primary stain.
Additional Embodiment 17. The system of any one of additional embodiments 1 - 11, wherein the morphological stain is a special stain.
Additional Embodiment 18. The system of any one of additional embodiments 1 - 11, wherein the morphological stain comprises hematoxylin.
Additional Embodiment 19. The system of any one of additional embodiments 1 - 11, wherein the morphological stain comprises hematoxylin and eosin.
Additional Embodiment 20. The system of any one of additional embodiments 1 - 19, wherein the training brightfield image data is acquired using a multispectral image acquisition device.
Additional Embodiment 21. A method for generating a virtually stained image of a test unstained biological specimen disposed on a substrate, comprising: a. obtaining a test multispectral transmission image of the test unstained biological specimen, wherein the test multispectral transmission image is derived from one or more test multispectral transmission image channel images; b. supplying the obtained test multispectral transmission image of the test unstained biological specimen to a virtual staining engine, wherein the virtual staining engine is trained to generate an image of an unstained biological specimen stained with a morphological stain, and wherein the virtual staining engine is trained from a plurality of pairs of coregistered training images derived from one or more training biological specimens,
i. wherein a first training image in each pair of the plurality of coregistered training images is derived from one or more training multispectral transmission image channel images; ii. wherein a second training image in each pair of the plurality of coregistered training images comprises training brightfield image data of the same biological specimen stained with the morphological stain; and c. with the trained virtual staining engine, generating the virtually stained image of the test unstained biological specimen stained with the morphological stain; wherein each of the one or more test multispectral transmission image channel images are acquired at about the same wavelengths as each of the one or more training multispectral transmission image channel images.
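By way of illustration only and not limitation, the condition that the test channel images be acquired at about the same wavelengths as the training channel images may be checked programmatically before inference. In the following sketch, the function name and the +/- 20 nm tolerance are illustrative assumptions and are not recited in any embodiment:

```python
def wavelengths_match(test_nm, train_nm, tol_nm=20):
    """Return True if the test and training acquisition wavelengths (in nm)
    pair off within tol_nm of one another ('about the same wavelengths')."""
    if len(test_nm) != len(train_nm):
        return False
    # Sort both lists so the closest wavelengths are compared pairwise.
    return all(abs(t - r) <= tol_nm
               for t, r in zip(sorted(test_nm), sorted(train_nm)))
```

For example, test channels at 470, 550, and 635 nm would match training channels at 472, 548, and 640 nm under a 20 nm tolerance, but not training channels at 470, 550, and 780 nm.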
Additional Embodiment 22. The method of additional embodiment 21, wherein (i) at least three test multispectral transmission image channel images are acquired; and (ii) at least three training multispectral transmission image channel images are acquired; wherein each of the at least three test multispectral transmission image channel images are acquired at about the same wavelengths as each of the at least three training multispectral transmission image channel images.
Additional Embodiment 23. The method of additional embodiment 21, wherein (i) at least four test multispectral transmission image channel images are acquired; and (ii) at least four training multispectral transmission image channel images are acquired; wherein each of the at least four test multispectral transmission image channel images are acquired at about the same wavelengths as each of the at least four training multispectral transmission image channel images.
Additional Embodiment 24. The method of additional embodiment 23, wherein the test multispectral transmission image is generated by performing a dimensionality reduction on the at least four test multispectral transmission image channel images.
Additional Embodiment 25. The method of additional embodiment 24, wherein the first training image in each pair of coregistered training images is generated by performing a
dimensionality reduction on the at least four training multispectral transmission image channel images.
Additional Embodiment 26. The method of additional embodiment 21, wherein (i) at least six test multispectral transmission image channel images are acquired; and (ii) at least six training multispectral transmission image channel images are acquired; wherein each of the at least six test multispectral transmission image channel images are acquired at about the same wavelengths as each of the at least six training multispectral transmission image channel images.
Additional Embodiment 27. The method of additional embodiment 26, wherein the test multispectral transmission image is generated by performing a dimensionality reduction on the at least six test multispectral transmission image channel images.
Additional Embodiment 28. The method of additional embodiment 26, wherein the first training image in each pair of coregistered training images is generated by performing a dimensionality reduction on the at least six training multispectral transmission image channel images.
Additional Embodiment 29. The method of additional embodiment 21, wherein (i) at least twelve test multispectral transmission image channel images are acquired; and (ii) at least twelve training multispectral transmission image channel images are acquired; wherein each of the at least twelve test multispectral transmission image channel images are acquired at about the same wavelengths as each of the at least twelve training multispectral transmission image channel images.
Additional Embodiment 30. The method of additional embodiment 29, wherein the test multispectral transmission image is generated by performing a dimensionality reduction on the at least twelve test multispectral transmission image channel images.
Additional Embodiment 31. The method of additional embodiment 29, wherein the first training image in each pair of coregistered training images is generated by performing a dimensionality reduction on the at least twelve training multispectral transmission image channel images.
Additional Embodiment 32. The method of any one of additional embodiments 21 - 31, wherein the wavelengths are selected from about 365 +/- 20 nm, about 400 +/- 20 nm, about 435 +/- 20 nm, about 470 +/- 20 nm, about 500 +/- 20 nm, about 550 +/- 20 nm, about 580 +/- 20 nm,
about 635 +/- 20 nm, about 660 +/- 20 nm, about 690 +/- 20 nm, about 780 +/- 20 nm, and/or about 850 +/- 20 nm.
Additional Embodiment 33. The method of any one of additional embodiments 21 - 31, wherein the wavelengths are selected from about 365 +/- 10 nm, about 400 +/- 10 nm, about 435 +/- 10 nm, about 470 +/- 10 nm, about 500 +/- 10 nm, about 550 +/- 10 nm, about 580 +/- 10 nm, about 635 +/- 10 nm, about 660 +/- 10 nm, about 690 +/- 10 nm, about 780 +/- 10 nm, and/or about 850 +/- 10 nm.
Additional Embodiment 34. The method of any one of additional embodiments 21 - 31, wherein the wavelengths are selected from about 365 nm, about 400 nm, about 435 nm, about 470 nm, about 500 nm, about 550 nm, about 580 nm, about 635 nm, about 660 nm, about 690 nm, about 780 nm, and/or about 850 nm.
Additional Embodiment 35. The method of any one of additional embodiments 21 - 31, wherein the virtual staining engine comprises a generative adversarial network.
Additional Embodiment 36. The method of any one of additional embodiments 21 - 31, wherein the morphological stain is a primary stain.
Additional Embodiment 37. The method of any one of additional embodiments 21 - 31, wherein the morphological stain is a special stain.
Additional Embodiment 38. The method of any one of additional embodiments 21 - 31, wherein the morphological stain comprises hematoxylin.
Additional Embodiment 39. The method of any one of additional embodiments 21 - 31, wherein the morphological stain comprises hematoxylin and eosin.
Additional Embodiment 40. The method of any one of additional embodiments 21 - 39, wherein the training brightfield image data is acquired using a multispectral image acquisition device.
Additional Embodiment 41. The method of additional embodiment 40, wherein multispectral image data is acquired at wavelengths of about 700 nm, 550 nm, and 470 nm; and wherein the acquired multispectral image data is converted to an RGB image.
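By way of illustration only, the conversion of multispectral image data acquired at about 700 nm, 550 nm, and 470 nm to an RGB image may proceed by mapping the three channel images to the red, green, and blue planes, respectively. In the following sketch, the function name and the independent per-channel normalization to 8-bit values are illustrative assumptions, not recitations of any embodiment:

```python
def channels_to_rgb(ch_700, ch_550, ch_470):
    """Map three multispectral channel images (2D lists of raw intensities)
    to an 8-bit RGB image: red <- ~700 nm, green <- ~550 nm, blue <- ~470 nm,
    with each channel normalized independently to the range 0-255."""
    def normalize(ch):
        flat = [v for row in ch for v in row]
        lo, hi = min(flat), max(flat)
        span = (hi - lo) or 1  # avoid division by zero for flat channels
        return [[round(255 * (v - lo) / span) for v in row] for row in ch]
    r, g, b = normalize(ch_700), normalize(ch_550), normalize(ch_470)
    # Interleave the three planes into per-pixel (R, G, B) tuples.
    return [[(r[y][x], g[y][x], b[y][x]) for x in range(len(r[0]))]
            for y in range(len(r))]
```

A channel with no dynamic range maps to zero, which keeps the sketch well-defined on degenerate inputs.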
Additional Embodiment 42. A system for generating a virtually stained image of a test unstained biological specimen disposed on a substrate, the system comprising: (i) one or more processors, and (ii) one or more memories coupled to the one or more processors, the one or more
memories to store computer-executable instructions that, when executed by the one or more processors, cause the system to perform operations comprising: a. obtaining a test multispectral transmission image of the test unstained biological specimen, wherein the test multispectral transmission image is derived from one or more test multispectral transmission image channel images; b. supplying the obtained test multispectral transmission image of the test unstained biological specimen to a virtual staining engine, wherein the virtual staining engine is trained to generate an image of an unstained biological specimen stained with a morphological stain; and c. with the trained virtual staining engine, generating the virtually stained image of the test unstained biological specimen stained with the morphological stain.
Additional Embodiment 43. The system of additional embodiment 42, wherein the virtual staining engine is trained using (a) one or more training multispectral transmission image channel images of an unstained training biological specimen; and (b) training brightfield image data of the same unstained training biological specimen stained with a morphological stain.
Additional Embodiment 44. The system of additional embodiment 43, wherein the training brightfield image data is acquired using a multispectral image acquisition device.
Additional Embodiment 45. The system of additional embodiment 42, wherein each of the one or more test multispectral transmission image channel images are acquired at about the same wavelengths as each of the one or more training multispectral transmission image channel images.
Additional Embodiment 46. The system of any one of additional embodiments 42 - 43, wherein the wavelengths are selected from about 365 +/- 20 nm, about 400 +/- 20 nm, about 435 +/- 20 nm, about 470 +/- 20 nm, about 500 +/- 20 nm, about 550 +/- 20 nm, about 580 +/- 20 nm, about 635 +/- 20 nm, about 660 +/- 20 nm, about 690 +/- 20 nm, about 780 +/- 20 nm, and/or about 850 +/- 20 nm.
Additional Embodiment 47. The system of any one of additional embodiments 42 - 43, wherein the wavelengths are selected from about 365 +/- 10 nm, about 400 +/- 10 nm, about 435 +/- 10 nm, about 470 +/- 10 nm, about 500 +/- 10 nm, about 550 +/- 10 nm, about 580 +/- 10 nm, about
635 +/- 10 nm, about 660 +/- 10 nm, about 690 +/- 10 nm, about 780 +/- 10 nm, and/or about 850 +/- 10 nm.
Additional Embodiment 48. The system of any one of additional embodiments 42 - 43, wherein the wavelengths are selected from about 365 nm, about 400 nm, about 435 nm, about 470 nm, about 500 nm, about 550 nm, about 580 nm, about 635 nm, about 660 nm, about 690 nm, about 780 nm, and/or about 850 nm.
Additional Embodiment 49. The system of additional embodiment 42, wherein the virtual staining engine is trained from a plurality of pairs of coregistered training images derived from one or more training biological specimens.
Additional Embodiment 50. The system of additional embodiment 49, wherein a first training image in each pair of the plurality of coregistered training images is derived from one or more training multispectral transmission image channel images; and wherein a second training image in each pair of the plurality of coregistered training images comprises training brightfield image data of the same biological specimen stained with the morphological stain.
Additional Embodiment 51. The system of additional embodiment 50, wherein each of the one or more test multispectral transmission image channel images are acquired at about the same wavelengths as each of the one or more training multispectral transmission image channel images.
Additional Embodiment 52. A system for generating two or more virtually stained images of a test unstained biological specimen disposed on a substrate, the system comprising: (i) one or more processors, and (ii) one or more memories coupled to the one or more processors, the one or more memories to store computer-executable instructions that, when executed by the one or more processors, cause the system to perform operations comprising: a. obtaining a test multispectral transmission image of the test unstained biological specimen, wherein the test multispectral transmission image is derived from one or more test multispectral transmission image channel images; b. supplying the obtained test multispectral transmission image of the test unstained biological specimen to two or more different virtual staining engines, wherein each virtual staining engine of the two or more different
virtual staining engines is trained to generate an image of an unstained biological specimen stained with a different morphological stain; c. with the trained virtual staining engines, generating two or more virtually stained images of the test unstained biological specimen, wherein each of the generated two or more virtually stained images are stained with a different morphological stain.
Additional Embodiment 53. The system of additional embodiment 52, wherein the two or more virtual staining engines are each independently trained using (a) one or more training multispectral transmission image channel images of an unstained training biological specimen; and (b) brightfield training image data of the same unstained training biological specimen stained with a morphological stain.
Additional Embodiment 54. The system of additional embodiment 52, wherein each of the one or more test multispectral transmission image channel images are acquired at about the same wavelengths as each of the one or more training multispectral transmission image channel images used to train each of the two or more virtual staining engines.
Additional Embodiment 55. The system of any one of additional embodiments 52 - 54, wherein the wavelengths are selected from about 365 +/- 20 nm, about 400 +/- 20 nm, about 435 +/- 20 nm, about 470 +/- 20 nm, about 500 +/- 20 nm, about 550 +/- 20 nm, about 580 +/- 20 nm, about 635 +/- 20 nm, about 660 +/- 20 nm, about 690 +/- 20 nm, about 780 +/- 20 nm, and/or about 850 +/- 20 nm.
Additional Embodiment 56. The system of any one of additional embodiments 52 - 54, wherein the wavelengths are selected from about 365 +/- 10 nm, about 400 +/- 10 nm, about 435 +/- 10 nm, about 470 +/- 10 nm, about 500 +/- 10 nm, about 550 +/- 10 nm, about 580 +/- 10 nm, about 635 +/- 10 nm, about 660 +/- 10 nm, about 690 +/- 10 nm, about 780 +/- 10 nm, and/or about 850 +/- 10 nm.
Additional Embodiment 57. The system of any one of additional embodiments 52 - 54, wherein the wavelengths are selected from about 365 nm, about 400 nm, about 435 nm, about 470 nm, about 500 nm, about 550 nm, about 580 nm, about 635 nm, about 660 nm, about 690 nm, about 780 nm, and/or about 850 nm.
Additional Embodiment 58. The system of additional embodiment 52, wherein each of the two or more virtual staining engines are trained from a plurality of pairs of coregistered training images derived from one or more training biological specimens.
Additional Embodiment 59. The system of additional embodiment 58, wherein a first training image in each pair of the plurality of coregistered training images is derived from one or more training multispectral transmission image channel images; and wherein a second training image in each pair of the plurality of coregistered training images comprises training brightfield image data of the same biological specimen stained with the morphological stain.
Additional Embodiment 60. A system for generating a virtually stained image of a test unstained biological specimen disposed on a substrate, the system comprising: (i) one or more processors, and (ii) one or more memories coupled to the one or more processors, the one or more memories to store computer-executable instructions that, when executed by the one or more processors, cause the system to perform operations comprising: a. obtaining a trained virtual staining engine trained to generate an image of an unstained biological specimen stained with a morphological stain, wherein the trained virtual staining engine is trained from a plurality of pairs of coregistered training images, i. wherein a first training image in each pair of the plurality of coregistered training images is derived from training multispectral transmission image data of an unstained training biological specimen acquired at three or more different wavelengths, ii. wherein a second training image in each pair of the plurality of coregistered training images comprises training brightfield image data of the same training biological specimen stained with a morphological stain; b. obtaining a test multispectral transmission image of the test unstained biological specimen, wherein the obtained test multispectral transmission image is derived from test multispectral transmission image data of the test unstained biological specimen acquired at the three or more different wavelengths; and
c. with the obtained trained virtual staining engine, generating the virtually stained image of the test unstained biological specimen stained with the morphological stain based on the obtained test multispectral transmission image.
Additional Embodiment 61. The system of additional embodiment 60, wherein the morphological stain is a primary stain.
Additional Embodiment 62. The system of additional embodiment 60, wherein the morphological stain is a special stain.
Additional Embodiment 63. The system of additional embodiment 60, wherein the morphological stain comprises hematoxylin.
Additional Embodiment 64. The system of additional embodiment 60, wherein the morphological stain comprises hematoxylin and eosin.
Additional Embodiment 65. The system of any one of additional embodiments 60 - 64, wherein the training multispectral transmission image data of the unstained training biological specimen is acquired at four or more different wavelengths.
Additional Embodiment 66. The system of additional embodiment 65, wherein the first training image is generated by reducing a dimensionality of the training multispectral transmission image data at the four or more different wavelengths.
Additional Embodiment 67. The system of additional embodiment 66, wherein the dimensionality is reduced using principal component analysis.
Additional Embodiment 68. The system of any one of additional embodiments 60 - 64, wherein the training multispectral transmission image data of the unstained training biological specimen is acquired at six or more different wavelengths.
Additional Embodiment 69. The system of additional embodiment 68, wherein the first training image is generated by reducing a dimensionality of the training multispectral transmission image data at the six or more different wavelengths.
Additional Embodiment 70. The system of additional embodiment 69, wherein the dimensionality is reduced using principal component analysis.
Additional Embodiment 71. The system of any one of additional embodiments 60 - 64, wherein the training multispectral transmission image data of the unstained training biological specimen is acquired at twelve or more different wavelengths.
Additional Embodiment 72. The system of additional embodiment 71, wherein the first training image is generated by reducing a dimensionality of the training multispectral transmission image data at the twelve or more different wavelengths.
Additional Embodiment 73. The system of additional embodiment 72, wherein the dimensionality is reduced using principal component analysis.
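By way of illustration only, the principal component analysis recited above may reduce, for example, twelve channel images to a smaller number of pseudo-channel images. In the following sketch (the function name, the power-iteration method, and all parameters are illustrative assumptions, not recitations of any embodiment), each channel image is treated as a flattened vector of pixel intensities, a channel-by-channel covariance matrix is formed, and the channels are projected onto the leading components:

```python
import random

def pca_reduce(channels, n_components=3, iters=200):
    """Reduce spectral channel images (each a flat list of pixel intensities)
    to n_components pseudo-channel images via principal component analysis,
    using power iteration with deflation."""
    n_ch, n_px = len(channels), len(channels[0])
    # Mean-center each channel image.
    means = [sum(ch) / n_px for ch in channels]
    centered = [[v - m for v in ch] for ch, m in zip(channels, means)]
    # Channel-by-channel covariance matrix (n_ch x n_ch).
    cov = [[sum(centered[i][p] * centered[j][p] for p in range(n_px)) / n_px
            for j in range(n_ch)] for i in range(n_ch)]
    comps = []
    for _ in range(n_components):
        v = [random.random() + 0.1 for _ in range(n_ch)]
        for _ in range(iters):
            w = [sum(cov[i][j] * v[j] for j in range(n_ch)) for i in range(n_ch)]
            # Deflation: project out components already extracted.
            for c in comps:
                d = sum(w[i] * c[i] for i in range(n_ch))
                w = [w[i] - d * c[i] for i in range(n_ch)]
            norm = sum(x * x for x in w) ** 0.5 or 1.0
            v = [x / norm for x in w]
        comps.append(v)
    # Project the centered channel stack onto each principal component.
    return [[sum(c[i] * centered[i][p] for i in range(n_ch)) for p in range(n_px)]
            for c in comps]
```

In practice a library implementation (for example, an SVD-based PCA) would typically replace this hand-rolled power iteration; the sketch only makes the recited dimensionality reduction concrete.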
Additional Embodiment 74. The system of any one of additional embodiments 60 - 73, wherein the at least three wavelengths are selected from about 365 +/- 20 nm, about 400 +/- 20 nm, about 435 +/- 20 nm, about 470 +/- 20 nm, about 500 +/- 20 nm, about 550 +/- 20 nm, about 580 +/- 20 nm, about 635 +/- 20 nm, about 660 +/- 20 nm, about 690 +/- 20 nm, about 780 +/- 20 nm, and/or about 850 +/- 20 nm.
Additional Embodiment 75. The system of any one of additional embodiments 60 - 73, wherein the at least three wavelengths are selected from about 365 +/- 10 nm, about 400 +/- 10 nm, about 435 +/- 10 nm, about 470 +/- 10 nm, about 500 +/- 10 nm, about 550 +/- 10 nm, about 580 +/- 10 nm, about 635 +/- 10 nm, about 660 +/- 10 nm, about 690 +/- 10 nm, about 780 +/- 10 nm, and/or about 850 +/- 10 nm.
Additional Embodiment 76. The system of any one of additional embodiments 60 - 73, wherein the at least three wavelengths are selected from about 365 nm, about 400 nm, about 435 nm, about 470 nm, about 500 nm, about 550 nm, about 580 nm, about 635 nm, about 660 nm, about 690 nm, about 780 nm, and/or about 850 nm.
Additional Embodiment 77. The system of any one of additional embodiments 60 - 76, wherein the training multispectral transmission image data and the test multispectral transmission image data are acquired using a multispectral image acquisition device.
Additional Embodiment 78. The system of any one of additional embodiments 60 - 76, wherein the training brightfield image data is acquired using a brightfield image acquisition device.
Additional Embodiment 79. The system of any one of additional embodiments 60 - 76, wherein the training brightfield image data is acquired using a multispectral image acquisition device, wherein the training brightfield image data is an RGB image.
Additional Embodiment 80. The system of any one of additional embodiments 60 - 76, wherein the training brightfield image data is acquired using a multispectral image acquisition
device, wherein the multispectral image acquisition device is configured to capture image data at about 700 nm +/- 10 nm, about 550 nm +/- 10 nm, and about 470 nm +/- 10 nm.
Additional Embodiment 81. The system of any one of additional embodiments 60 - 80, wherein the obtained trained virtual staining engine comprises a generative adversarial network.
Additional Embodiment 82. A method for generating a virtually stained image of a test unstained biological specimen disposed on a substrate, comprising: a. obtaining a trained virtual staining engine trained to generate an image of an unstained biological specimen stained with a morphological stain, wherein the trained virtual staining engine is trained from a plurality of pairs of coregistered training images; i. wherein a first training image in each pair of the plurality of coregistered training images is derived from training multispectral transmission image data of an unstained training biological specimen illuminated with at least three different illumination sources; ii. wherein a second training image in each pair of the plurality of coregistered training images comprises training brightfield image data of the same training biological specimen stained with a morphological stain; b. obtaining a test multispectral transmission image of the test unstained biological specimen, wherein the obtained test multispectral transmission image is derived from test multispectral transmission image data of the test unstained biological specimen illuminated with the at least three different illumination sources; and c. with the obtained trained virtual staining engine, generating the virtually stained image of the test unstained biological specimen stained with the morphological stain based on the obtained test multispectral transmission image.
Additional Embodiment 83. The method of additional embodiment 82, wherein the morphological stain is a primary stain.
Additional Embodiment 84. The method of additional embodiment 82, wherein the morphological stain is a special stain.
Additional Embodiment 85. The method of additional embodiment 82, wherein the morphological stain comprises hematoxylin.
Additional Embodiment 86. The method of additional embodiment 82, wherein the morphological stain comprises hematoxylin and eosin.
Additional Embodiment 87. The method of additional embodiment 82, wherein the unstained training biological specimen and the test unstained biological specimen are illuminated with at least four different illumination sources.
Additional Embodiment 88. The method of additional embodiment 82, wherein the unstained training biological specimen and the test unstained biological specimen are illuminated with at least six different illumination sources.
Additional Embodiment 89. The method of additional embodiment 82, wherein the unstained training biological specimen and the test unstained biological specimen are illuminated with at least twelve different illumination sources.
Additional Embodiment 90. The method of any one of additional embodiments 82 - 89, wherein the at least three illumination sources emit light at wavelengths selected from about 365 +/- 20 nm, about 400 +/- 20 nm, about 435 +/- 20 nm, about 470 +/- 20 nm, about 500 +/- 20 nm, about 550 +/- 20 nm, about 580 +/- 20 nm, about 635 +/- 20 nm, about 660 +/- 20 nm, about 690 +/- 20 nm, about 780 +/- 20 nm, and/or about 850 +/- 20 nm.
Additional Embodiment 91. The method of any one of additional embodiments 82 - 89, wherein the at least three illumination sources emit light at wavelengths selected from about 365 +/- 10 nm, about 400 +/- 10 nm, about 435 +/- 10 nm, about 470 +/- 10 nm, about 500 +/- 10 nm, about 550 +/- 10 nm, about 580 +/- 10 nm, about 635 +/- 10 nm, about 660 +/- 10 nm, about 690 +/- 10 nm, about 780 +/- 10 nm, and/or about 850 +/- 10 nm.
Additional Embodiment 92. The method of any one of additional embodiments 82 - 89, wherein the at least three illumination sources emit light at wavelengths selected from about 365 nm, about 400 nm, about 435 nm, about 470 nm, about 500 nm, about 550 nm, about 580 nm, about 635 nm, about 660 nm, about 690 nm, about 780 nm, and/or about 850 nm.
Additional Embodiment 93. The method of any one of additional embodiments 82 - 92, wherein the training multispectral transmission image data and the test multispectral transmission image data are acquired using a multispectral image acquisition device.
Additional Embodiment 94. The method of any one of additional embodiments 82 - 92, wherein the training brightfield image data is acquired using a brightfield image acquisition device.
Additional Embodiment 95. The method of any one of additional embodiments 82 - 92, wherein the training brightfield image data is acquired using a multispectral image acquisition device.
Additional Embodiment 96. The method of any one of additional embodiments 82 - 92, wherein the training brightfield image data is acquired using a multispectral image acquisition device, wherein the multispectral image acquisition device is configured to capture image data at about 700 nm +/- 10 nm, about 550 nm +/- 10 nm, and about 470 nm +/- 10 nm.
Additional Embodiment 97. The method of any one of additional embodiments 82 - 96, wherein the obtained trained virtual staining engine comprises a generative adversarial network.
Additional Embodiment 98. The method of additional embodiment 82, wherein each pair of the plurality of coregistered training images is derived from a different training biological specimen.
Additional Embodiment 99. A non-transitory computer-readable medium storing computer-executable instructions for generating a virtually stained image of a test unstained biological specimen disposed on a substrate, wherein the instructions, when executed by one or more processors, cause a system to perform operations comprising: a. obtaining a test multispectral transmission image of the test unstained biological specimen, wherein the test multispectral transmission image is derived from one or more test multispectral transmission image channel images; b. supplying the obtained test multispectral transmission image of the test unstained biological specimen to a virtual
staining engine, wherein the virtual staining engine is trained to generate an image of an unstained biological specimen stained with a morphological stain, and wherein the virtual staining engine is trained from a plurality of pairs of coregistered training images derived from one or more training biological specimens, i. wherein a first training image in each pair of the plurality of coregistered training images is derived from one or more training multispectral transmission image channel images; ii. wherein a second training image in each pair of the plurality of coregistered training images comprises training brightfield image data of the same biological specimen stained with the morphological stain; and c. with the trained virtual staining engine, generating the virtually stained image of the test unstained biological specimen stained with the morphological stain; wherein each of the one or more test multispectral transmission image channel images are acquired at about the same wavelengths as each of the one or more training multispectral transmission image channel images.
Claims
1. A system for generating a virtually stained image of a test unstained biological specimen disposed on a substrate, the system comprising: (i) one or more processors, and (ii) one or more memories coupled to the one or more processors, the one or more memories to store computer-executable instructions that, when executed by the one or more processors, cause the system to perform operations comprising: a. obtaining a test multispectral transmission image of the test unstained biological specimen, wherein the test multispectral transmission image is derived from one or more test multispectral transmission image channel images; b. supplying the obtained test multispectral transmission image of the test unstained biological specimen to a virtual staining engine, wherein the virtual staining engine is trained to generate an image of an unstained biological specimen stained with a morphological stain, and wherein the virtual staining engine is trained from a plurality of pairs of coregistered training images derived from one or more training biological specimens, i. wherein a first training image in each pair of the plurality of coregistered training images is derived from one or more training multispectral transmission image channel images; ii. wherein a second training image in each pair of the plurality of coregistered training images comprises training brightfield image data of the same biological specimen stained with the morphological stain; and c. with the trained virtual staining engine, generating the virtually stained image of the test unstained biological specimen stained with the morphological stain; wherein each of the one or more test multispectral transmission image channel images are acquired at about the same wavelengths as each of the one or more training multispectral transmission image channel images.
2. The system of claim 1, wherein at least two test multispectral transmission image channel images are acquired.
3. The system of claim 1, wherein three or more test multispectral transmission image channel images are acquired.
4. The system of claim 3, wherein the test multispectral transmission image is generated by performing a dimensionality reduction on the three or more test multispectral transmission image channel images.
5. The system of claim 3, wherein the test multispectral transmission image is generated without compressing any of the three or more test multispectral transmission image channel images or performing a dimensionality reduction on any of the three or more test multispectral transmission image channel images.
6. The system of claim 1, wherein (i) at least six test multispectral transmission image channel images are acquired; and (ii) at least six training multispectral transmission image channel images are acquired; wherein each of the at least six test multispectral transmission image channel images is acquired at about the same wavelengths as each of the at least six training multispectral transmission image channel images.
7. The system of claim 6, wherein the test multispectral transmission image is generated without performing a dimensionality reduction on the at least six test multispectral transmission image channel images.
8. The system of claim 6, wherein the first training image in each pair of coregistered training images is generated without performing a dimensionality reduction on the at least six training multispectral transmission image channel images.
9. The system of claim 1, wherein (i) at least twelve test multispectral transmission image channel images are acquired; and (ii) at least twelve training multispectral transmission image channel images are acquired; wherein each of the at least twelve test multispectral transmission image channel images is acquired at about the same wavelengths as each of the at least twelve training multispectral transmission image channel images.
10. The system of claim 9, wherein the test multispectral transmission image is generated without performing a dimensionality reduction on the at least twelve test multispectral transmission image channel images.
11. The system of claim 9, wherein the first training image in each pair of coregistered training images is generated without performing a dimensionality reduction on the at least twelve training multispectral transmission image channel images.
12. The system of any one of claims 1 - 11, wherein the wavelengths are selected from about 365 +/- 20 nm, about 400 +/- 20 nm, about 435 +/- 20 nm, about 470 +/- 20 nm, about 500 +/- 20 nm, about 550 +/- 20 nm, about 580 +/- 20 nm, about 635 +/- 20 nm, about 660 +/- 20 nm, about 690 +/- 20 nm, about 780 +/- 20 nm, and/or about 850 +/- 20 nm.
13. The system of any one of claims 1 - 11, wherein the wavelengths are selected from about 365 +/- 10 nm, about 400 +/- 10 nm, about 435 +/- 10 nm, about 470 +/- 10 nm, about 500 +/- 10 nm, about 550 +/- 10 nm, about 580 +/- 10 nm, about 635 +/- 10 nm, about 660 +/- 10 nm, about 690 +/- 10 nm, about 780 +/- 10 nm, and/or about 850 +/- 10 nm.
14. The system of any one of claims 1 - 11, wherein the wavelengths are selected from about 365 nm, about 400 nm, about 435 nm, about 470 nm, about 500 nm, about 550 nm, about 580 nm, about 635 nm, about 660 nm, about 690 nm, about 780 nm, and/or about 850 nm.
15. The system of any one of claims 1 - 11, wherein the virtual staining engine comprises a generative adversarial network.
16. The system of any one of claims 1 - 11, wherein the morphological stain is a primary stain.
17. The system of any one of claims 1 - 11, wherein the morphological stain is a special stain.
18. The system of any one of claims 1 - 11, wherein the morphological stain comprises hematoxylin.
19. The system of any one of claims 1 - 11, wherein the morphological stain comprises hematoxylin and eosin.
20. The system of any one of claims 1 - 19, wherein the training brightfield image data is acquired using a multispectral image acquisition device.
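Claims 4, 24, and 66 recite deriving the input image by performing a dimensionality reduction on the multispectral channel images. The claims do not name a specific algorithm; one common, purely illustrative choice is a per-pixel principal component analysis of the channel stack:

```python
import numpy as np

def reduce_channels(stack, n_components=3):
    """Project an (H, W, C) multispectral channel stack onto its top
    principal components, yielding an (H, W, n_components) image.

    The (H, W, C) layout and the 3-component default are illustrative
    assumptions, not taken from the claims.
    """
    h, w, c = stack.shape
    pixels = stack.reshape(-1, c).astype(np.float64)
    pixels -= pixels.mean(axis=0)              # center each channel
    # SVD of the centered pixel matrix gives the principal axes in vt
    _, _, vt = np.linalg.svd(pixels, full_matrices=False)
    projected = pixels @ vt[:n_components].T   # project onto top components
    return projected.reshape(h, w, n_components)
```

Claims 5, 25, and 67 cover the complementary case, in which all channel images are passed to the staining engine without any such reduction.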
21. A method for generating a virtually stained image of a test unstained biological specimen disposed on a substrate, comprising: a. obtaining a test multispectral transmission image of the test unstained biological specimen, wherein the test multispectral transmission image is derived from one or more test multispectral transmission image channel images; b. supplying the obtained test multispectral transmission image of the test unstained biological specimen to a virtual staining engine, wherein the virtual staining engine is trained to generate an image of an unstained biological specimen stained with a morphological stain, and wherein the virtual staining
engine is trained from a plurality of pairs of coregistered training images derived from one or more training biological specimens, i. wherein a first training image in each pair of the plurality of coregistered training images is derived from one or more training multispectral transmission image channel images; ii. wherein a second training image in each pair of the plurality of coregistered training images comprises training brightfield image data of the same biological specimen stained with the morphological stain; and c. with the trained virtual staining engine, generating the virtually stained image of the test unstained biological specimen stained with the morphological stain; wherein each of the one or more test multispectral transmission image channel images is acquired at about the same wavelengths as each of the one or more training multispectral transmission image channel images.
22. The method of claim 21, wherein at least two test multispectral transmission image channel images are acquired.
23. The method of claim 21, wherein at least three test multispectral transmission image channel images are acquired.
24. The method of claim 23, wherein the test multispectral transmission image is generated by performing a dimensionality reduction on the at least three test multispectral transmission image channel images.
25. The method of claim 23, wherein the test multispectral transmission image is generated without compressing any of the at least three test multispectral transmission image channel images or performing a dimensionality reduction on any of the at least three test multispectral transmission image channel images.
26. The method of claim 21, wherein (i) at least six test multispectral transmission image channel images are acquired; and (ii) at least six training multispectral transmission image channel images are acquired; wherein each of the at least six test multispectral transmission image channel images is acquired at about the same wavelengths as each of the at least six training multispectral transmission image channel images.
27. The method of claim 26, wherein the test multispectral transmission image is generated without performing a dimensionality reduction on the at least six test multispectral transmission image channel images.
28. The method of claim 26, wherein the first training image in each pair of coregistered training images is generated without performing a dimensionality reduction on the at least six training multispectral transmission image channel images.
29. The method of claim 21, wherein (i) at least twelve test multispectral transmission image channel images are acquired; and (ii) at least twelve training multispectral transmission image channel images are acquired; wherein each of the at least twelve test multispectral transmission image channel images is acquired at about the same wavelengths as each of the at least twelve training multispectral transmission image channel images.
30. The method of claim 29, wherein the test multispectral transmission image is generated without performing a dimensionality reduction on the at least twelve test multispectral transmission image channel images.
31. The method of claim 29, wherein the first training image in each pair of coregistered training images is generated without performing a dimensionality reduction on the at least twelve training multispectral transmission image channel images.
32. The method of any one of claims 21 - 31, wherein the wavelengths are selected from about 365 +/- 20 nm, about 400 +/- 20 nm, about 435 +/- 20 nm, about 470 +/- 20 nm, about 500 +/- 20 nm, about 550 +/- 20 nm, about 580 +/- 20 nm, about 635 +/- 20 nm, about 660 +/- 20 nm, about 690 +/- 20 nm, about 780 +/- 20 nm, and/or about 850 +/- 20 nm.
33. The method of any one of claims 21 - 31, wherein the wavelengths are selected from about 365 +/- 10 nm, about 400 +/- 10 nm, about 435 +/- 10 nm, about 470 +/- 10 nm, about 500 +/- 10 nm, about 550 +/- 10 nm, about 580 +/- 10 nm, about 635 +/- 10 nm, about 660 +/- 10 nm, about 690 +/- 10 nm, about 780 +/- 10 nm, and/or about 850 +/- 10 nm.
34. The method of any one of claims 21 - 31, wherein the wavelengths are selected from about 365 nm, about 400 nm, about 435 nm, about 470 nm, about 500 nm, about 550 nm, about 580 nm, about 635 nm, about 660 nm, about 690 nm, about 780 nm, and/or about 850 nm.
35. The method of any one of claims 21 - 31, wherein the virtual staining engine comprises a generative adversarial network.
36. The method of any one of claims 21 - 31, wherein the morphological stain is a primary stain.
37. The method of any one of claims 21 - 31, wherein the morphological stain is a special stain.
38. The method of any one of claims 21 - 31, wherein the morphological stain comprises hematoxylin.
39. The method of any one of claims 21 - 31, wherein the morphological stain comprises hematoxylin and eosin.
40. The method of any one of claims 21 - 39, wherein the training brightfield image data is acquired using a multispectral image acquisition device.
41. The method of claim 40, wherein multispectral image data is acquired at about 700 nm, about 550 nm, and about 470 nm; and wherein the acquired multispectral image data is converted to an RGB image.
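Claims 41, 79 - 80, and 96 recite converting multispectral transmission data acquired near 700 nm, 550 nm, and 470 nm into an RGB image. A minimal sketch of one such mapping, assuming simple per-channel min-max scaling to 8 bits (the claims do not specify the normalization):

```python
import numpy as np

def channels_to_rgb(ch_700, ch_550, ch_470):
    """Map transmission channel images acquired near 700 nm, 550 nm, and
    470 nm onto the R, G, and B planes of an 8-bit image.

    The per-channel min-max normalization is an illustrative choice,
    not taken from the claims.
    """
    def norm(ch):
        ch = ch.astype(np.float64)
        lo, hi = ch.min(), ch.max()
        return (ch - lo) / (hi - lo) if hi > lo else np.zeros_like(ch)

    # Longest wavelength drives red, shortest drives blue
    rgb = np.stack([norm(ch_700), norm(ch_550), norm(ch_470)], axis=-1)
    return (rgb * 255).round().astype(np.uint8)
```

In practice a color-managed conversion (e.g., via CIE color-matching functions) could replace the naive channel-to-plane assignment; the claims leave the mapping open.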
42. A system for generating a virtually stained image of a test unstained biological specimen disposed on a substrate, the system comprising: (i) one or more processors, and (ii) one or more memories coupled to the one or more processors, the one or more memories to store computer-executable instructions that, when executed by the one or more processors, cause the system to perform operations comprising: a. obtaining a test multispectral transmission image of the test unstained biological specimen, wherein the test multispectral transmission image is derived from one or more test multispectral transmission image channel images; b. supplying the obtained test multispectral transmission image of the test unstained biological specimen to a virtual staining engine, wherein the virtual staining engine is trained to generate an image of an unstained biological specimen stained with a morphological stain; and c. with the trained virtual staining engine, generating the virtually stained image of the test unstained biological specimen stained with the morphological stain.
43. The system of claim 42, wherein the virtual staining engine is trained using (a) one or more training multispectral transmission image channel images of an unstained training biological specimen; and (b) training brightfield image data of the same unstained training biological specimen stained with a morphological stain.
44. The system of claim 43, wherein the training brightfield image data is acquired using a multispectral image acquisition device.
45. The system of claim 42, wherein each of the one or more test multispectral transmission image channel images is acquired at about the same wavelengths as each of the one or more training multispectral transmission image channel images.
46. The system of any one of claims 42 - 43, wherein the wavelengths are selected from about 365 +/- 20 nm, about 400 +/- 20 nm, about 435 +/- 20 nm, about 470 +/- 20 nm, about 500 +/- 20 nm, about 550 +/- 20 nm, about 580 +/- 20 nm, about 635 +/- 20 nm, about 660 +/- 20 nm, about 690 +/- 20 nm, about 780 +/- 20 nm, and/or about 850 +/- 20 nm.
47. The system of any one of claims 42 - 43, wherein the wavelengths are selected from about 365 +/- 10 nm, about 400 +/- 10 nm, about 435 +/- 10 nm, about 470 +/- 10 nm, about 500 +/- 10 nm, about 550 +/- 10 nm, about 580 +/- 10 nm, about 635 +/- 10 nm, about 660 +/- 10 nm, about 690 +/- 10 nm, about 780 +/- 10 nm, and/or about 850 +/- 10 nm.
48. The system of any one of claims 42 - 43, wherein the wavelengths are selected from about 365 nm, about 400 nm, about 435 nm, about 470 nm, about 500 nm, about 550 nm, about 580 nm, about 635 nm, about 660 nm, about 690 nm, about 780 nm, and/or about 850 nm.
49. The system of claim 42, wherein the virtual staining engine is trained from a plurality of pairs of coregistered training images derived from one or more training biological specimens.
50. The system of claim 49, wherein a first training image in each pair of the plurality of coregistered training images is derived from one or more training multispectral transmission image channel images; and wherein a second training image in each pair of the plurality of coregistered training images comprises training brightfield image data of the same biological specimen stained with the morphological stain.
51. The system of claim 50, wherein each of the one or more test multispectral transmission image channel images is acquired at about the same wavelengths as each of the one or more training multispectral transmission image channel images.
52. A system for generating two or more virtually stained images of a test unstained biological specimen disposed on a substrate, the system comprising: (i) one or more processors, and (ii) one or more memories coupled to the one or more processors, the one or more memories to store computer-executable instructions that, when executed by the one or more processors, cause the system to perform operations comprising: a. obtaining a test multispectral transmission image of the test unstained biological specimen, wherein the test multispectral transmission image is
derived from one or more test multispectral transmission image channel images; b. supplying the obtained test multispectral transmission image of the test unstained biological specimen to two or more different virtual staining engines, wherein each virtual staining engine of the two or more different virtual staining engines is trained to generate an image of an unstained biological specimen stained with a different stain; and c. with the trained virtual staining engines, generating two or more virtually stained images of the test unstained biological specimen, wherein each of the generated two or more virtually stained images is stained with a different stain.
53. The system of claim 52, wherein the two or more virtual staining engines are each independently trained using (a) one or more training multispectral transmission image channel images of an unstained training biological specimen; and (b) brightfield training image data of the same unstained training biological specimen stained with a stain.
54. The system of claim 52, wherein each of the one or more test multispectral transmission image channel images is acquired at about the same wavelengths as each of the one or more training multispectral transmission image channel images used to train each of the two or more virtual staining engines.
55. The system of any one of claims 52 - 54, wherein the wavelengths are selected from about 365 +/- 20 nm, about 400 +/- 20 nm, about 435 +/- 20 nm, about 470 +/- 20 nm, about 500 +/- 20 nm, about 550 +/- 20 nm, about 580 +/- 20 nm, about 635 +/- 20 nm, about 660 +/- 20 nm, about 690 +/- 20 nm, about 780 +/- 20 nm, and/or about 850 +/- 20 nm.
56. The system of any one of claims 52 - 54, wherein the wavelengths are selected from about 365 +/- 10 nm, about 400 +/- 10 nm, about 435 +/- 10 nm, about 470 +/- 10 nm, about 500 +/- 10 nm, about 550 +/- 10 nm, about 580 +/- 10 nm, about 635 +/- 10 nm, about 660 +/- 10 nm, about 690 +/- 10 nm, about 780 +/- 10 nm, and/or about 850 +/- 10 nm.
57. The system of any one of claims 52 - 54, wherein the wavelengths are selected from about 365 nm, about 400 nm, about 435 nm, about 470 nm, about 500 nm, about 550 nm, about 580 nm, about 635 nm, about 660 nm, about 690 nm, about 780 nm, and/or about 850 nm.
58. The system of claim 52, wherein each of the two or more virtual staining engines is trained from a plurality of pairs of coregistered training images derived from one or more training biological specimens.
59. The system of claim 58, wherein a first training image in each pair of the plurality of coregistered training images is derived from one or more training multispectral transmission image channel images; and wherein a second training image in each pair of the plurality of coregistered training images comprises training brightfield image data of the same biological specimen stained with the stain.
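Several claims (e.g., 1, 45, 51, 54) require each test channel image to be acquired at "about the same wavelengths" as its training counterpart, and claims 12 - 13 suggest +/- 10 to 20 nm bands around the enumerated wavelengths. One illustrative reading of that requirement as a runnable check (the claims do not prescribe this exact test or pairing rule):

```python
def wavelengths_match(test_nm, train_nm, tol_nm=20):
    """Return True when the sorted test acquisition wavelengths pair off
    with the sorted training wavelengths within tol_nm of each other.

    Sorting before pairing is an assumed convention; tol_nm=20 mirrors
    the +/- 20 nm bands recited in claim 12.
    """
    if len(test_nm) != len(train_nm):
        return False  # each test channel needs a training counterpart
    return all(abs(t - r) <= tol_nm
               for t, r in zip(sorted(test_nm), sorted(train_nm)))
```

For example, test channels at 380, 470, and 550 nm would match training channels at 365, 470, and 550 nm under the +/- 20 nm reading, but not under +/- 10 nm.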
60. A system for generating a virtually stained image of a test unstained biological specimen disposed on a substrate, the system comprising: (i) one or more processors, and (ii) one or more memories coupled to the one or more processors, the one or more memories to store computer-executable instructions that, when executed by the one or more processors, cause the system to perform operations comprising: a. obtaining a trained virtual staining engine trained to generate an image of an unstained biological specimen stained with a morphological stain, wherein the trained virtual staining engine is trained from a plurality of pairs of coregistered training images, i. wherein a first training image in each pair of the plurality of coregistered training images is derived from training multispectral transmission image data of an unstained training biological specimen acquired at three or more different wavelengths, ii. wherein a second training image in each pair of the plurality of coregistered training images comprises training brightfield image data of the same training biological specimen stained with a morphological stain; b. obtaining a test multispectral transmission image of the test unstained biological specimen, wherein the obtained test multispectral transmission image is derived from test multispectral transmission image data of the test unstained biological specimen acquired at the three or more different wavelengths; and
c. with the obtained trained virtual staining engine, generating the virtually stained image of the test unstained biological specimen stained with the morphological stain based on the obtained test multispectral transmission image.
61. The system of claim 60, wherein the morphological stain is a primary stain.
62. The system of claim 60, wherein the morphological stain is a special stain.
63. The system of claim 60, wherein the morphological stain comprises hematoxylin.
64. The system of claim 60, wherein the morphological stain comprises hematoxylin and eosin.
65. The system of any one of claims 60 - 64, wherein the training multispectral transmission image data of the unstained training biological specimen is acquired at four or more different wavelengths.
66. The system of claim 60, wherein the first training image is generated by reducing a dimensionality of the training multispectral transmission image data at the three or more different wavelengths.
67. The system of claim 60, wherein the first training image is generated without compressing or reducing a dimensionality of the training multispectral transmission image data at the three or more different wavelengths.
68. The system of any one of claims 60 - 64, wherein the training multispectral transmission image data of the unstained training biological specimen is acquired at six or more different wavelengths.
69. The system of claim 68, wherein the first training image is generated by reducing a dimensionality of the training multispectral transmission image data at the six or more different wavelengths.
70. The system of claim 68, wherein the first training image is generated without compressing or reducing a dimensionality of the training multispectral transmission image data at the six or more different wavelengths.
71. The system of any one of claims 60 - 64, wherein the training multispectral transmission image data of the unstained training biological specimen is acquired at twelve or more different wavelengths.
72. The system of claim 71, wherein the first training image is generated by reducing a dimensionality of the training multispectral transmission image data at the twelve or more different wavelengths.
73. The system of claim 71, wherein the first training image is generated without compressing or reducing a dimensionality of the training multispectral transmission image data at the twelve or more different wavelengths.
74. The system of any one of claims 60 - 73, wherein the at least three wavelengths are selected from about 365 +/- 20 nm, about 400 +/- 20 nm, about 435 +/- 20 nm, about 470 +/- 20 nm, about 500 +/- 20 nm, about 550 +/- 20 nm, about 580 +/- 20 nm, about 635 +/- 20 nm, about 660 +/- 20 nm, about 690 +/- 20 nm, about 780 +/- 20 nm, and/or about 850 +/- 20 nm.
75. The system of any one of claims 60 - 73, wherein the at least three wavelengths are selected from about 365 +/- 10 nm, about 400 +/- 10 nm, about 435 +/- 10 nm, about 470 +/- 10 nm, about 500 +/- 10 nm, about 550 +/- 10 nm, about 580 +/- 10 nm, about 635 +/- 10 nm, about 660 +/- 10 nm, about 690 +/- 10 nm, about 780 +/- 10 nm, and/or about 850 +/- 10 nm.
76. The system of any one of claims 60 - 73, wherein the at least three wavelengths are selected from about 365 nm, about 400 nm, about 435 nm, about 470 nm, about 500 nm, about 550 nm, about 580 nm, about 635 nm, about 660 nm, about 690 nm, about 780 nm, and/or about 850 nm.
77. The system of any one of claims 60 - 76, wherein the training multispectral transmission image data and the test multispectral transmission image data are acquired using a multispectral image acquisition device.
78. The system of any one of claims 60 - 76, wherein the training brightfield image data is acquired using a brightfield image acquisition device.
79. The system of any one of claims 60 - 76, wherein the training brightfield image data is acquired using a multispectral image acquisition device, wherein the training brightfield image data is an RGB image.
80. The system of any one of claims 60 - 76, wherein the training brightfield image data is acquired using a multispectral image acquisition device, wherein the multispectral image acquisition device is configured to capture image data at about 700 nm +/- 10 nm, about 550 nm +/- 10 nm, and about 470 nm +/- 10 nm.
81. The system of any one of claims 60 - 80, wherein the obtained trained virtual staining engine comprises a generative adversarial network.
82. A method for generating a virtually stained image of a test unstained biological specimen disposed on a substrate, comprising: a. obtaining a trained virtual staining engine trained to generate an image of an unstained biological specimen stained with a morphological stain, wherein the trained virtual staining engine is trained from a plurality of pairs of coregistered training images; i. wherein a first training image in each pair of the plurality of coregistered training images is derived from training multispectral transmission image data of an unstained training biological specimen illuminated with at least three different illumination sources; ii. wherein a second training image in each pair of the plurality of coregistered training images comprises training brightfield image data of the same training biological specimen stained with a morphological stain; b. obtaining a test multispectral transmission image of the test unstained biological specimen, wherein the obtained test multispectral transmission image is derived from test multispectral transmission image data of the test unstained biological specimen illuminated with the at least three different illumination sources; and c. with the obtained trained virtual staining engine, generating the virtually stained image of the test unstained biological specimen stained with the morphological stain based on the obtained test multispectral transmission image.
83. The method of claim 82, wherein the morphological stain is a primary stain.
84. The method of claim 82, wherein the morphological stain is a special stain.
85. The method of claim 82, wherein the morphological stain comprises hematoxylin.
86. The method of claim 82, wherein the morphological stain comprises hematoxylin and eosin.
87. The method of claim 82, wherein the unstained training biological specimen and the test unstained biological specimen are illuminated with at least four different illumination sources.
88. The method of claim 82, wherein the unstained training biological specimen and the test unstained biological specimen are illuminated with at least six different illumination sources.
89. The method of claim 82, wherein the unstained training biological specimen and the test unstained biological specimen are illuminated with at least twelve different illumination sources.
90. The method of any one of claims 82 - 89, wherein the at least three illumination sources are selected from about 365 +/- 20 nm, about 400 +/- 20 nm, about 435 +/- 20 nm, about 470 +/- 20 nm, about 500 +/- 20 nm, about 550 +/- 20 nm, about 580 +/- 20 nm, about 635 +/- 20 nm, about 660 +/- 20 nm, about 690 +/- 20 nm, about 780 +/- 20 nm, and/or about 850 +/- 20 nm.
91. The method of any one of claims 82 - 89, wherein the at least three illumination sources are selected from about 365 +/- 10 nm, about 400 +/- 10 nm, about 435 +/- 10 nm, about 470 +/- 10 nm, about 500 +/- 10 nm, about 550 +/- 10 nm, about 580 +/- 10 nm, about 635 +/- 10 nm, about 660 +/- 10 nm, about 690 +/- 10 nm, about 780 +/- 10 nm, and/or about 850 +/- 10 nm.
92. The method of any one of claims 82 - 89, wherein the at least three illumination sources are selected from about 365 nm, about 400 nm, about 435 nm, about 470 nm, about 500 nm, about 550 nm, about 580 nm, about 635 nm, about 660 nm, about 690 nm, about 780 nm, and/or about 850 nm.
93. The method of any one of claims 82 - 92, wherein the training multispectral transmission image data and the test multispectral transmission image data are acquired using a multispectral image acquisition device.
94. The method of any one of claims 82 - 92, wherein the training brightfield image data is acquired using a brightfield image acquisition device.
95. The method of any one of claims 82 - 92, wherein the training brightfield image data is acquired using a multispectral image acquisition device.
96. The method of any one of claims 82 - 92, wherein the training brightfield image data is acquired using a multispectral image acquisition device, wherein the multispectral image acquisition device is configured to capture image data at about 700 nm +/- 10 nm, about 550 nm +/- 10 nm, and about 470 nm +/- 10 nm.
97. The method of any one of claims 82 - 96, wherein the obtained trained virtual staining engine comprises a generative adversarial network.
98. The method of claim 82, wherein each pair of the plurality of coregistered training images is derived from a different training biological specimen.
99. A non-transitory computer-readable medium storing instructions for generating a virtually stained image of a test unstained biological specimen disposed on a substrate, wherein the instructions, when executed by one or more processors of a system, cause the system to perform operations comprising: a. obtaining a test multispectral transmission image of the test unstained biological specimen, wherein the test multispectral transmission image is derived from one or more test multispectral transmission image channel images; b. supplying the obtained test multispectral transmission image of the test unstained biological specimen to a virtual staining engine, wherein the virtual staining engine is trained to generate an image of an unstained biological specimen stained with a morphological stain, and wherein the virtual staining engine is trained from a plurality of pairs of coregistered training images derived from one or more training biological specimens, i. wherein a first training image in each pair of the plurality of coregistered training images is derived from one or more training multispectral transmission image channel images; ii. wherein a second training image in each pair of the plurality of coregistered training images comprises training brightfield image data of the same biological specimen stained with the morphological stain; and c. with the trained virtual staining engine, generating the virtually stained image of the test unstained biological specimen stained with the morphological stain; wherein each of the one or more test multispectral transmission image channel images is acquired at about the same wavelengths as each of the one or more training multispectral transmission image channel images.
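Every independent claim trains from pairs of coregistered images of the same specimen imaged before and after staining. As an illustration of one simple coregistration primitive, a pure-translation estimate via FFT phase correlation is sketched below; the cited image-alignment art covers far more general transforms, and the claims do not mandate this method:

```python
import numpy as np

def estimate_shift(fixed, moving):
    """Estimate the integer (dy, dx) translation by which `moving` is
    offset from `fixed`, via normalized FFT phase correlation."""
    f = np.fft.fft2(fixed)
    m = np.fft.fft2(moving)
    cross = np.conj(f) * m
    cross /= np.abs(cross) + 1e-12        # normalized cross-power spectrum
    corr = np.fft.ifft2(cross).real       # impulse at the relative shift
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    h, w = fixed.shape
    if dy > h // 2:                       # wrap large offsets to negative
        dy -= h
    if dx > w // 2:
        dx -= w
    return int(dy), int(dx)
```

Undoing the estimated shift (e.g., with `np.roll(moving, (-dy, -dx), axis=(0, 1))` for periodic content) brings the stained brightfield image into register with the unstained multispectral image; whole-slide pipelines typically also correct rotation, scale, and local deformation.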
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US202463627968P | 2024-02-01 | 2024-02-01 | |
| US63/627,968 | 2024-02-01 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2025166108A1 true WO2025166108A1 (en) | 2025-08-07 |
Family
ID=94820837
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/US2025/013949 Pending WO2025166108A1 (en) | 2024-02-01 | 2025-01-31 | Methods of generating digitally stained images from unstained biological samples |
Country Status (1)
| Country | Link |
|---|---|
| WO (1) | WO2025166108A1 (en) |
Citations (26)
Application Events (2025)
- 2025-01-31: WO application PCT/US2025/013949 filed; published as WO2025166108A1 (en); status: Active, Pending
Patent Citations (35)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US6943029B2 (en) | 1990-03-02 | 2005-09-13 | Ventana Medical Systems, Inc. | Automated biological reaction apparatus |
| US5654200A (en) | 1990-03-02 | 1997-08-05 | Ventana Medical Systems, Inc. | Automated slide processing apparatus with fluid injector |
| US6352861B1 (en) | 1990-03-02 | 2002-03-05 | Ventana Medical Systems, Inc. | Automated biological reaction apparatus |
| US6827901B2 (en) | 1990-03-02 | 2004-12-07 | Ventana Medical Systems, Inc. | Automated biological reaction apparatus |
| US5650327A (en) | 1990-03-02 | 1997-07-22 | Ventana Medical Systems, Inc. | Method for mixing reagent and sample mounted on a slide |
| US6894639B1 (en) | 1991-12-18 | 2005-05-17 | Raytheon Company | Generalized hebbian learning for principal component analysis and automatic target recognition, systems and method |
| US6296809B1 (en) | 1998-02-27 | 2001-10-02 | Ventana Medical Systems, Inc. | Automated molecular pathology apparatus having independent slide heaters |
| US20030211630A1 (en) | 1998-02-27 | 2003-11-13 | Ventana Medical Systems, Inc. | Automated molecular pathology apparatus having independent slide heaters |
| US20040052685A1 (en) | 1998-02-27 | 2004-03-18 | Ventana Medical Systems, Inc. | Automated molecular pathology apparatus having independent slide heaters |
| US20050123202A1 (en) | 2003-12-04 | 2005-06-09 | Samsung Electronics Co., Ltd. | Face recognition apparatus and method using PCA learning per subgroup |
| WO2011049608A2 (en) | 2009-10-19 | 2011-04-28 | Bioimagene, Inc. | Imaging system and techniques |
| US9575301B2 (en) | 2009-10-19 | 2017-02-21 | Ventana Medical Systems, Inc. | Device for a microscope stage |
| WO2011139978A1 (en) | 2010-05-04 | 2011-11-10 | Ventana Medical Systems, Inc. | Moving meniscus rinsing and mixing in cell staining |
| US8565488B2 (en) | 2010-05-27 | 2013-10-22 | Panasonic Corporation | Operation analysis device and operation analysis method |
| US20140178169A1 (en) | 2011-09-09 | 2014-06-26 | Ventana Medical Systems, Inc. | Imaging systems, cassettes, and methods of using the same |
| US8605972B2 (en) | 2012-03-02 | 2013-12-10 | Sony Corporation | Automatic image alignment |
| US20140336942A1 (en) | 2012-12-10 | 2014-11-13 | The Trustees Of Columbia University In The City Of New York | Analyzing High Dimensional Single Cell Data Using the T-Distributed Stochastic Neighbor Embedding Algorithm |
| US20180046755A1 (en) | 2012-12-10 | 2018-02-15 | The Trustees Of Columbia University In The City Of New York | Analyzing high dimensional single cell data using the t-distributed stochastic neighbor embedding algorithm |
| US11070750B2 (en) | 2013-03-12 | 2021-07-20 | Ventana Medical Systems, Inc. | Digitally enhanced microscopy for multiplexed histology |
| US20160282374A1 (en) | 2013-12-13 | 2016-09-29 | Ventana Medical Systems, Inc. | Staining reagents and other liquids for histological processing of biological specimens and associated technology |
| US20210088769A1 (en) | 2014-05-23 | 2021-03-25 | Ventana Medical Systems, Inc. | Method and apparatus for imaging a sample using a microscope scanner |
| US20210092308A1 (en) | 2014-05-23 | 2021-03-25 | Ventana Medical Systems, Inc. | Method and apparatus for imaging a sample using a microscope scanner |
| US10317666B2 (en) | 2014-05-23 | 2019-06-11 | Ventana Medical Systems, Inc. | Method and apparatus for imaging a sample using a microscope scanner |
| US10313606B2 (en) | 2014-05-23 | 2019-06-04 | Ventana Medical Systems, Inc | Method and apparatus for imaging a sample using a microscope scanner |
| US9785818B2 (en) | 2014-08-11 | 2017-10-10 | Synaptics Incorporated | Systems and methods for image alignment |
| US10628658B2 (en) | 2014-11-10 | 2020-04-21 | Ventana Medical Systems, Inc. | Classifying nuclei in histology images |
| US20170372117A1 (en) | 2014-11-10 | 2017-12-28 | Ventana Medical Systems, Inc. | Classifying nuclei in histology images |
| WO2016170008A1 (en) | 2015-04-20 | 2016-10-27 | Ventana Medical Systems, Inc. | Inkjet deposition of reagents for histological samples |
| US11010892B2 (en) | 2016-10-07 | 2021-05-18 | Ventana Medical Systems, Inc. | Digital pathology system and associated workflow for providing visualized whole-slide image analysis |
| US20180166077A1 (en) | 2016-12-14 | 2018-06-14 | Toyota Jidosha Kabushiki Kaisha | Language storage method and language dialog system |
| US20210043331A1 (en) * | 2018-03-30 | 2021-02-11 | The Regents Of The University Of California | Method and system for digital staining of label-free fluorescence images using deep learning |
| US20210027462A1 (en) | 2018-04-13 | 2021-01-28 | Ventana Medical Systems, Inc. | Systems for cell shape estimation |
| US20210285056A1 (en) | 2018-07-27 | 2021-09-16 | Ventana Medical Systems, Inc. | Systems for automated in situ hybridization analysis |
| US20200105413A1 (en) | 2018-09-29 | 2020-04-02 | Roche Molecular Systems, Inc. | Multimodal machine learning based clinical predictor |
| US20210216746A1 (en) | 2018-10-15 | 2021-07-15 | Ventana Medical Systems, Inc. | Systems and methods for cell classification |
Non-Patent Citations (18)
| Title |
|---|
| ASAF MUHAMMAD ZEESHAN ET AL: "Dual contrastive learning based image-to-image translation of unstained skin tissue into virtually stained H&E images", SCIENTIFIC REPORTS, vol. 14, no. 1, 28 January 2024 (2024-01-28), US, XP093174303, ISSN: 2045-2322, Retrieved from the Internet <URL:https://www.nature.com/articles/s41598-024-52833-7.pdf> DOI: 10.1038/s41598-024-52833-7 * |
| BAYRAMOGLU NESLIHAN ET AL: "Towards Virtual H&E Staining of Hyperspectral Lung Histology Images Using Conditional Generative Adversarial Networks", 2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW), IEEE, 22 October 2017 (2017-10-22), pages 64 - 71, XP033303442, [retrieved on 20180119], DOI: 10.1109/ICCVW.2017.15 * |
| D. MUELLER ET AL.: "Real-time deformable registration of multi-modal whole slides for digital pathology", COMPUTERIZED MEDICAL IMAGING AND GRAPHICS, vol. 35, 2011, pages 542 - 556, XP028282335, DOI: 10.1016/j.compmedimag.2011.06.006 |
| F. EL-GAMAL ET AL.: "Current trends in medical image registration and fusion", EGYPTIAN INFORMATICS JOURNAL, vol. 17, 2016, pages 99 - 124, XP029464146, DOI: 10.1016/j.eij.2015.09.002 |
| FARAHANI ET AL.: "Whole slide imaging in pathology: advantages, limitations, and emerging perspectives", PATHOLOGY AND LABORATORY MEDICINE INT'L, vol. 7, June 2015 (2015-06-01), pages 23 - 33 |
| FARRUGIA, JESSICA ET AL.: "Principal component analysis of hyperspectral data for early detection of mould in cheeselets", CURRENT RESEARCH IN FOOD SCIENCE, vol. 4, 2021, pages 18 - 27 |
| GOODFELLOW ET AL.: "Generative Adversarial Nets", ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS, vol. 27, 2014, pages 2672 - 2680 |
| GOODFELLOW ET AL.: "Generative Adversarial Nets", ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS, vol. 27, 2014, pages 2672 - 2680 |
| GOODFELLOW, J. POUGET-ABADIE, M. MIRZA, B. XU, D. WARDE-FARLEY, S. OZAIR, A. COURVILLE, Y. BENGIO: "Generative Adversarial Nets", ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS, 2014, pages 2672 - 2680 |
| J. SINGLA ET AL.: "A systematic way of affine transformation using image registration", INTERNATIONAL JOURNAL OF INFORMATION TECHNOLOGY AND KNOWLEDGE MANAGEMENT, vol. 5, no. 2, 2012, pages 239 - 243 |
| JUN-YAN ZHU, TAESUNG PARK, PHILLIP ISOLA, ALEXEI A. EFROS: "Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks", 24 November 2017 (2017-11-24) |
| K. BOUSMALIS ET AL.: "Unsupervised Pixel-Level Domain Adaptation with Generative Adversarial Networks", August 2017 (2017-08-01), Retrieved from the Internet <URL:https://arxiv.org/pdf/1612.05424.pdf> |
| KHAN UMAIR ET AL: "The effect of neural network architecture on virtual H&E staining: Systematic assessment of histological feasibility", PATTERNS, vol. 4, no. 5, 12 May 2023 (2023-05-12), pages 1 - 32, XP093217568, ISSN: 2666-3899, Retrieved from the Internet <URL:https://www.cell.com/patterns/pdfExtended/S2666-3899(23)00065-X> DOI: 10.1016/j.patter.2023.100725 * |
| KHAN: "Principal Component Analysis-Linear Discriminant Analysis Feature Extractor for Pattern Recognition", IJCSI INTERNATIONAL JOURNAL OF COMPUTER SCIENCES ISSUES, vol. 8, no. 6, 2 November 2011 (2011-11-02) |
| LONG ET AL.: "Fully Convolutional Networks for Semantic Segmentation", IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2015 |
| OMUCHENI, DICKSON L. ET AL.: "Application of principal component analysis to multispectral-multimodal optical image analysis for malaria diagnostics", MALARIA JOURNAL, vol. 13, 2014, pages 1 - 11 |
| PRICHARD: "Overview of Automated Immunohistochemistry", ARCH PATHOL LAB MED, vol. 138, 2014, pages 1578 - 1582, XP055326004, DOI: 10.5858/arpa.2014-0083-RA |
| Z. HOSSEIN-NEJAD ET AL.: "An adaptive image registration method based on SIFT features and RANSAC transform", COMPUTERS AND ELECTRICAL ENGINEERING, vol. 62, August 2017 (2017-08-01), pages 524 - 537 |
Similar Documents
| Publication | Title |
|---|---|
| JP7534461B2 (en) | Systems and methods for cell classification |
| JP7757479B2 (en) | Machine learning models for cell localization and classification trained using repelcoding |
| CN112823376B (en) | Image enhancement for improved nucleus detection and segmentation |
| JP7047059B2 (en) | Automated Assay Evaluation and Normalization for Image Processing |
| JP7092503B2 (en) | Systems and methods for co-expression analysis |
| US20200320699A1 (en) | System and method for generating selective stain segmentation images for cell types of interest |
| CN112534439A (en) | System for automated in situ hybridization analysis |
| US20240320562A1 (en) | Adversarial robustness of deep learning models in digital pathology |
| US20230368504A1 (en) | Synthetic generation of immunohistochemical special stains |
| CN117940971A (en) | Machine learning techniques for predicting phenotypes in dual digital pathology images |
| CN111602172B (en) | Color unmixing using scatter correction |
| JP7011067B2 (en) | Systems and methods for classifying cells in tissue images based on membrane characteristics |
| WO2025166108A1 (en) | Methods of generating digitally stained images from unstained biological samples |
| EP4627549A1 (en) | Consensus labeling in digital pathology images |
| WO2025038339A1 (en) | Deep learning model to determine a fixation status of a morphologically stained biological specimen |
| WO2024025969A1 (en) | Architecture-aware image tiling for processing pathology slides |
| JP2025541735A (en) | Consensus Labeling in Digital Pathology Images |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 25708570; Country of ref document: EP; Kind code of ref document: A1 |