tcxxxx/WSI-analysis: Python scripts for automatic Whole-Slide Image preprocessing.

Whole-Slide Image Analysis


Background

Whole-slide images (WSIs) are gigapixel, high-resolution histopathology images. Traditional analysis procedures cannot work efficiently when applied directly to WSIs, so most successful solutions adopt a patch-based paradigm.


Overview 

This repo currently contains code for patch extraction (from WSIs) and will be updated regularly. :) (Deep-learning-based code for classification and segmentation will be added when it is ready.)

Patch extraction 

There are several tricky parts when extracting patches from WSIs:

  1. Memory limit.
    The RAM of our lab machine is 31 GB, which can hardly hold a whole level-0 WSI, so be careful when loading the full image.
    It is also helpful to use del and gc.collect() to free up memory.
    In order to process level-0/1/2 WSIs, we need to split the original image up and work on it patch by patch (see the first sketch after this list).
  2. Coordinate scaling between levels / reference frames.
    The read_region() method in OpenSlide takes its location argument in the level-0 reference frame (while the size argument is given in the target level's pixels), so coordinates must be scaled accordingly when we crop patches from downsampled levels (see the second sketch after this list).
  3. Shape difference between Pillow Image objects and NumPy arrays.
    numpy.asarray() / numpy.array() swaps the positions of WIDTH and HEIGHT, and Image.fromarray() swaps them back. A Pillow Image with size (WIDTH, HEIGHT) becomes an array of shape (HEIGHT, WIDTH, CHANNEL) after np.asarray() (see the third sketch after this list).
  4. Magnification level choice.
    Below is a patch-extraction example (performed on one sample from the Camelyon 2017 dataset). Red boxes are selected patches and green boxes are annotated tumor areas. As we can see, when we extract 500 x 500 patches from a WSI at the level-3 scale, the proportion of tumor area in each patch is too small, which means discriminative information could be significantly diluted if we used all of these selected patches to train a CNN.
    This urges us to use a smaller level index (a higher-resolution scale); the last sketch after this list shows how to inspect the available levels. (figure: slide09-1)
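For point 1, here is a minimal sketch (assuming openslide-python and a hypothetical slide path "slide.tif", with a made-up tile size) that reads the level-0 image tile by tile with read_region() instead of loading it whole, and frees each tile with del and gc.collect():

```python
import gc

import openslide  # openslide-python

TILE = 2048  # tile edge length at level 0; an assumption, tune to your RAM

slide = openslide.OpenSlide("slide.tif")  # hypothetical path
width, height = slide.dimensions          # level-0 size

for y in range(0, height, TILE):
    for x in range(0, width, TILE):
        w = min(TILE, width - x)
        h = min(TILE, height - y)
        tile = slide.read_region((x, y), 0, (w, h)).convert("RGB")
        # ... process / save the tile here ...
        del tile      # drop the reference
        gc.collect()  # force garbage collection to release RAM

slide.close()
```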
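For the coordinate scaling in point 2, a minimal sketch (again with a hypothetical slide path; the patch grid indices and patch size are example values) is:

```python
import openslide

slide = openslide.OpenSlide("slide.tif")   # hypothetical path
level = 3                                  # level the patch grid is defined on
downsample = slide.level_downsamples[level]

# (col, row) of a 500 x 500 patch in level-`level` coordinates
col_lv, row_lv = 120, 80
patch_size = 500

# read_region() wants the top-left corner in the *level-0* reference frame,
# while the size is given in the target level's pixels.
x0 = int(col_lv * patch_size * downsample)
y0 = int(row_lv * patch_size * downsample)
patch = slide.read_region((x0, y0), level, (patch_size, patch_size)).convert("RGB")
```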
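Point 3 can be checked with a small sketch showing the WIDTH/HEIGHT swap between a Pillow Image and its NumPy array:

```python
import numpy as np
from PIL import Image

img = Image.new("RGB", (500, 300))  # Pillow size is (WIDTH, HEIGHT)
arr = np.asarray(img)

print(img.size)   # (500, 300)   -> (WIDTH, HEIGHT)
print(arr.shape)  # (300, 500, 3) -> (HEIGHT, WIDTH, CHANNEL)

back = Image.fromarray(arr)
print(back.size)  # (500, 300)   -> swapped back to (WIDTH, HEIGHT)
```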
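For point 4, the pyramid levels of a slide can be inspected as below (hypothetical slide path) to judge which level keeps enough tumor pixels inside a 500 x 500 patch; lower level indices mean higher resolution:

```python
import openslide

slide = openslide.OpenSlide("slide.tif")  # hypothetical path

# Print (level index, dimensions, downsample factor) for every pyramid level.
for level in range(slide.level_count):
    print(level, slide.level_dimensions[level], slide.level_downsamples[level])
```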
