
WO2017210690A1 - Spatial aggregation of holistically-nested convolutional neural networks for automated organ localization and segmentation in 3d medical scans - Google Patents

Spatial aggregation of holistically-nested convolutional neural networks for automated organ localization and segmentation in 3d medical scans Download PDF

Info

Publication number
WO2017210690A1
Authority
WO
WIPO (PCT)
Prior art keywords
segmentation
pancreas
hnn
organ
maps
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/US2017/035974
Other languages
French (fr)
Inventor
Le LU
Holger Roth
Isabella-Emmanuella NOGUES
Ronald Summers
Xiaosong Wang
Adam P. HARRISON
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual filed Critical Individual
Publication of WO2017210690A1

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/0002 Inspection of images, e.g. flaw detection
    • G06T 7/0012 Biomedical image inspection
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F 18/2413 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G06F 18/24133 Distances to prototypes
    • G06F 18/24137 Distances to cluster centroïds
    • G06F 18/2414 Smoothing the distance, e.g. radial basis function networks [RBFN]
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G06N 3/09 Supervised learning
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G06N 3/096 Transfer learning
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 5/00 Computing arrangements using knowledge-based models
    • G06N 5/01 Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/10 Segmentation; Edge detection
    • G06T 7/11 Region-based segmentation
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/10 Segmentation; Edge detection
    • G06T 7/143 Segmentation; Edge detection involving probabilistic approaches, e.g. Markov random field [MRF] modelling
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G06N 3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 Image acquisition modality
    • G06T 2207/10072 Tomographic images
    • G06T 2207/10081 Computed x-ray tomography [CT]
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 Image acquisition modality
    • G06T 2207/10072 Tomographic images
    • G06T 2207/10088 Magnetic resonance imaging [MRI]
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20072 Graph-based image processing
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20084 Artificial neural networks [ANN]
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/30 Subject of image; Context of image processing
    • G06T 2207/30004 Biomedical image processing
    • G06T 2207/30056 Liver; Hepatic
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/30 Subject of image; Context of image processing
    • G06T 2207/30004 Biomedical image processing
    • G06T 2207/30092 Stomach; Gastric

Definitions

  • Given an image X, we obtain both interior (HNN-I) and boundary (HNN-B) predictions from the models' side-output layers and the weighted-fusion layer; HNN-I/B(X) denotes the interior/boundary prediction maps estimated by the corresponding network.
  • Segmentation performance can be enhanced if irrelevant regions of the CT volume are pruned out.
  • Conventional organ localization methods using random forest regression, which we explain in Sec. II-B1, may not guarantee that the regressed organ bounding box contains the targeted organ with extremely high sensitivity in pixel-level coverage.
  • In Sec. II-B2 we outline a superpixel-based approach, based on hand-crafted and CNN features, that is able to provide improved performance. While this is effective, the complexity involved motivates our development of a simpler and more accessible multi-view HNN fusion based procedure. This is explained in Sec. II-B3.
  • The output of the localization method will later feed into a more detailed and accurate segmentation method combining multiple mid-level cues from HNNs, as illustrated in Fig. 1.
  • Regression Forest: In object localization by regression, the general idea is to predict an offset vector v(x) for a given image patch I(x) centered about a voxel x. The predicted object position is then given as x + v(x). This is repeated for many examples of image patches, and the predictions are then aggregated to produce a final predicted position. Aggregation can be done with non-maximum suppression on prediction voting maps, mean aggregation, cluster medoid aggregation, or the use of local appearance with discriminative models to accept or reject predictions.
  • The pancreas can be localized by regression because its location in the body correlates with other anatomical structures. The objective is to predict a bounding box defined by the center of the pancreas and by the lower and upper corners of the pancreas bounding box. For a given image patch I(x), the pancreas regression forest predicts these quantities, producing pancreas bounding box candidates.
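  • As a rough illustration of this offset-regression scheme (not the patent's implementation), the sketch below trains a multi-output random forest on precomputed patch descriptors and aggregates the votes with a median, one of the aggregation options listed above; `patch_features`, `patch_centers`, and `organ_centers` are hypothetical inputs.

```python
# Hypothetical sketch of organ localization by offset regression; the patch
# feature extraction is assumed to happen elsewhere.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def train_localizer(patch_features, patch_centers, organ_centers):
    """Learn to predict the offset v(x) from a patch center x to the organ center."""
    offsets = organ_centers - patch_centers        # v(x), one (dx, dy, dz) row per patch
    rf = RandomForestRegressor(n_estimators=100)
    rf.fit(patch_features, offsets)                # multi-output regression
    return rf

def localize(rf, patch_features, patch_centers):
    """Cast one vote x + v(x) per test patch and aggregate with the median."""
    votes = patch_centers + rf.predict(patch_features)
    return np.median(votes, axis=0)                # robust aggregate of candidate centers
```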
  • Random Forest on Superpixels: As a form of initialization, we alternatively employ a method based on random forest (RF) classification, using both hand-crafted and deep CNN derived image features, to compute candidate bounding box regions. We only operate the RF labeling at a low probability threshold of >0.5, which is sufficient to reject the vast amount of non-pancreas tissue in the CT images. This initial candidate generation is sufficient to extract bounding box regions that nearly surround the pancreas completely in all patient cases, with ~97% recall.
  • The constant balancing weight β used during training of HNN-I is critical in this step: the vast majority of CT slices show no pancreas at all, and they are deliberately included for effective training of HNN-I models in order to suppress pancreas probability values from appearing in the background. Furthermore, we perform a largest connected-component analysis to remove outlier "blobs" of high probability. To get rid of small incorrect connections between high-probability blobs, we first perform an erosion step with a radius of 1 voxel, then select the largest connected component, and subsequently dilate the region again (Fig. 3). HNN-I models are trained in axial, coronal, and sagittal planes in order to make use of the multi-view representation of 3D image context.
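  • A minimal sketch of this candidate-region cleanup, assuming the fused per-voxel pancreas probabilities are available as a 3D NumPy array; the SciPy morphology calls and the helper name are illustrative choices rather than the patent's code.

```python
import numpy as np
from scipy import ndimage

def candidate_bounding_box(prob, threshold=0.5):
    """Threshold, clean up, and box the fused 3D pancreas probability map."""
    mask = prob > threshold                        # keep blobs of high probability
    mask = ndimage.binary_erosion(mask)            # 1-voxel erosion breaks thin bridges
    labels, n = ndimage.label(mask)                # 3D connected components
    if n == 0:
        return None
    sizes = np.bincount(labels.ravel())
    sizes[0] = 0                                   # ignore the background component
    largest = labels == np.argmax(sizes)           # largest connected component
    largest = ndimage.binary_dilation(largest)     # dilate the region again
    coords = np.argwhere(largest)
    return coords.min(axis=0), coords.max(axis=0)  # lower/upper corners of the box
```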
  • Multiscale combinatorial grouping (MCG) is one of the state-of-the-art methods for generating segmentation object proposals in computer vision. We utilize this approach to generate organ-specific superpixels based on the learned boundary prediction maps from HNN-B. Superpixels are extracted via a continuous oriented watershed transform at three different scales (Y_side^B2, Y_side^B3, and Y_fuse^B) supervisedly learned by HNN-B. This allows the computation of a hierarchy of superpixel partitions at each scale, and merges superpixels across scales, thereby efficiently exploring their combinatorial space. This, in turn, allows MCG to group the merged superpixels toward object proposals. We find that the first two levels of MCG object proposals are sufficient to achieve ~88% DSC (see Table IV and Fig. 5), with the optimally computed superpixel labels using their spatial overlapping ratios against the segmentation ground truth map. All merged superpixels S from the first two levels are used for the subsequent spatial aggregation step, sketched below. Note that HNN-B can only be trained using axial slices, where the manual annotation was performed; pancreas boundary maps in coronal and sagittal views can display strong artifacts.
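  • The sketch below illustrates one plausible form of this superpixel-level spatial aggregation: per-superpixel statistics pooled from the HNN-I and HNN-B maps feed a random forest classifier. The exact feature set, classifier settings, and input names (`sp`, `p_i`, `p_b`) are assumptions for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def superpixel_features(sp, p_i, p_b):
    """Pool interior (p_i) and boundary (p_b) map statistics over each superpixel."""
    feats = []
    for s in np.unique(sp):
        m = sp == s                                  # voxels of one superpixel
        feats.append([p_i[m].mean(), p_i[m].max(),   # interior cue statistics
                      p_b[m].mean(), p_b[m].max()])  # boundary cue statistics
    return np.asarray(feats)

def train_aggregator(feats, labels):
    """Labels derive from each superpixel's overlap with the ground-truth mask."""
    clf = RandomForestClassifier(n_estimators=200)
    clf.fit(feats, labels)
    return clf                                       # predicts pancreas vs. non-pancreas
```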
  • Table I shows the test performance of pancreas localization and bounding box prediction using regression forests in DSC and average Euclidean distance against the gold standard bounding boxes.
  • Regression forest based localization generates 16 out of 82 bounding boxes that lie below 60% in pixel-to-pixel recall against the ground-truth pancreas masks. Nevertheless, we obtain nearly 100% recall for all scans (except for two cases, both >94.54%) through the multi-view max-pooled HNN-Is.
  • An example of a detected pancreas can be seen in Fig. 6.
  • HNN Spatial Aggregation for Pancreas Segmentation: The interior HNN models trained on the axial (AX), coronal (CO), or sagittal (SA) CT images in Sec. II-B3 can be straightforwardly used to generate pancreas segmentation masks. We evaluate the following pooling variants: AX, CO, SA (any single-view HNN-I probability map used alone); mean(AX,CO), mean(AX,SA), mean(CO,SA), and mean(AX,CO,SA) (element-wise mean of two or three view HNN-I probability maps); max(AX,CO,SA) (element-wise maximum of the three view HNN-I probability maps); and finally meanmax(AX,CO,SA) (element-wise mean of the maximal two scores from the three view HNN-I probability maps).
  • meanmax(AX,CO,SA) produces the best performance in mean DSC; it may behave as a robust fusion function by rejecting the smallest probability value and averaging the remaining two HNN-I scores per pixel location. These pooling functions admit a direct element-wise implementation, sketched below.
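  • A short sketch of these pooling functions, assuming the axial, coronal, and sagittal HNN-I probability volumes (`ax`, `co`, `sa`) have already been resampled onto a common 3D grid; the function names are ours.

```python
import numpy as np

def mean_pool(*views):
    return np.mean(np.stack(views), axis=0)   # element-wise mean of 2 or 3 views

def max_pool(*views):
    return np.max(np.stack(views), axis=0)    # element-wise maximum of the views

def meanmax_pool(ax, co, sa):
    """Drop the smallest of the three per-voxel scores, average the other two."""
    stacked = np.sort(np.stack([ax, co, sa]), axis=0)  # ascending along view axis
    return stacked[1:].mean(axis=0)                    # mean of the two largest
```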
  • Table IV shows the improvement from the meanmax-pooled HNN-Is (i.e., HNNmeanmax) to the HNN-RF based spatial aggregation, using DSC and average minimum surface-to-surface distance (AVGDIST).
  • The average DSC is increased from 81.14% to 81.27%. However, this improvement is not statistically significant (p > 0.05, Wilcoxon signed-rank test).
  • the new candidate region bounding box generation method (Sec. II-B3) works comparably to the hybrid technique (Sec. II-B2) based on our empirical evaluation.
  • The proposed pancreas localization via multi-view max-pooled HNNs greatly simplifies our overall pancreas segmentation system, which may also help its generality and reproducibility.
  • The HNNmeanmax variant produces competitive segmentation accuracy but merely involves evaluating two sets of multi-view HNN-Is at two spatial scales: whole CT slices and truncated bounding boxes. There is no need to compute any handcrafted image features or train other external machine-learning classifiers.
  • The conventional organ localization framework using regression forest does not serve well the purpose of candidate region generation for segmentation, where extremely high pixel-to-pixel recall is required, since it is mainly designed for organ detection.
  • In Table V, the quantitative pancreas segmentation performance of the two method variants, HNNmeanmax and HNN-RF spatial aggregation, is evaluated using four metrics: DSC (%), Jaccard index (%), Hausdorff distance (HDRFDST [mm]), and AVGDIST [mm]. Note that there is no statistical significance when comparing the two variants on three of the measures (DSC, Jaccard, and AVGDIST); the exception is HDRFDST, with p < 0.001 under the Wilcoxon signed-rank test. Since the Hausdorff distance represents the maximum deviation between two point sets or surfaces, this observation indicates that HNN-RF may be more robust than HNNmeanmax in the worst-case scenario. The overlap metrics are sketched below.
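  • For reference, a small sketch of the evaluation metrics above on binary NumPy masks; this is a generic illustration of DSC, Jaccard, and the symmetric Hausdorff distance, not the patent's evaluation code.

```python
import numpy as np
from scipy.spatial.distance import directed_hausdorff

def dice(a, b):
    inter = np.logical_and(a, b).sum()
    return 2.0 * inter / (a.sum() + b.sum())   # Dice similarity coefficient

def jaccard(a, b):
    inter = np.logical_and(a, b).sum()
    return inter / np.logical_or(a, b).sum()   # Jaccard index (IoU)

def hausdorff(a, b):
    pa, pb = np.argwhere(a), np.argwhere(b)    # foreground voxel coordinates
    return max(directed_hausdorff(pa, pb)[0],
               directed_hausdorff(pb, pa)[0])  # symmetric Hausdorff distance
```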
  • Deep network representations with direct 3D input suffer from the curse of dimensionality and are more prone to overfitting; volumetric object detection might also require more training data and might suffer from scalability issues. Proper hyper-parameter tuning of the CNN architecture and enough training data (including data augmentation) might help eliminate these problems. In the meantime, spatial aggregation of interior holistically-nested networks (HNN-I) in multiple 2D views can be a very efficient alternative.
  • HNN-RF incorporates the organ boundary responses from the HNN-B model and significantly improves the worst case pancreas segmentation accuracy in Hausdorff distance (p ⁇ 0.001).
  • The highest reported DSC of 81.27±6.27% is achieved, at a computational cost of 2-3 minutes, not hours.
  • Our deep learning based organ segmentation approach could be generalizable to other segmentation problems with large variations and pathologies, e.g., pathological organs and tumors, especially given that the pancreas is one of the most difficult organs to segment.
  • Segmentation of lymph nodes (LNs) is also an important challenge in medical image analysis.
  • LN segmentation and volume measurement play a crucial role in important medical imaging based diagnosis tasks, such as quantitatively evaluating disease progression and the effectiveness of a given treatment or therapy.
  • Enlarged LNs greater than 10 mm on a CT slice signal the onset or progression of a malignant disease or an infection.
  • LN segmentation is highly complex, tedious, and time consuming. For example, weak intensity contrast renders the boundaries of distinct agglomerated LNs ambiguous, as shown in FIG. 10.
  • The methods disclosed herein provide a fully-automated method for thoracoabdominal (TA) lymph node cluster (LNC) segmentation. More importantly, the segmentation task is formulated as a flexible, bottom-up image binary classification problem that can be effectively solved using deep CNNs and graph-based structured optimization and inference. The disclosed methods can handle all variations in LNC size and spatial configuration. Furthermore, they are well-suited for measuring agglomerated LNs, whose ambiguous boundaries compromise the accuracy of diameter measurement.
  • the segmentation framework for lymph nodes can be similar to what is described elsewhere herein for the pancreas and other organs. More information regarding applications to lymph node segmentation can be found in U.S. Provisional Patent Application No. 62/345,606, filed June 3, 2016, which is incorporated by reference herein.
  • a computer or other processing system comprising a processor and memory, such as a personal computer, a workstation, a mobile computing device, or a networked computer, can be used to perform the methods disclosed herein, including any combination of CT or MR imaging acquisition, imaging processing, imaging data analysis, data storage, and output/display of results (e.g., segmentation maps, etc.).
  • the computer or processing system may include a hard disk, a removable storage medium such as a floppy disk or CD-ROM, and/or other memory such as random access memory (RAM).
  • Computer-executable instructions for causing a computing system to execute the disclosed methods can be provided on any form of tangible and/or non-transitory data storage media, and/or delivered to the computing system via a local area network, the Internet, or other network. Any associated computing process or method step can be performed with distributed processing. For example, extracting information from the imaging data and producing segmentation maps can be performed at different locations and/or using different computing systems.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computational Linguistics (AREA)
  • Medical Informatics (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Databases & Information Systems (AREA)
  • Multimedia (AREA)
  • Quality & Reliability (AREA)
  • Radiology & Medical Imaging (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Apparatus For Radiation Diagnosis (AREA)

Abstract

Disclosed are systems and methods for localization and segmentation of organs (especially abnormally shaped, deformable, and/or smaller organs, such as the pancreas and lymph nodes) based on data from 3D medical imaging (e.g., CT and MRI scans) using holistically-nested convolutional neural networks ("HNNs"). Using as an example CT scan data and the pancreas, the methods can include localizing an organ from an entire 3D CT scan, providing a reliable bounding box for the more refined segmentation step. The methods can further comprise introducing a fully deep-learning approach, based on an efficient application of HNNs on the three orthogonal views. The resulting HNN per-pixel probability maps can then be fused using pooling to reliably produce a 3D bounding box of the pancreas that maximizes the recall. An introduced localizer compares favorably to both a conventional non-deep-learning method and a hybrid approach based on spatial aggregation of superpixels using random forest classification. The segmentation phase can operate within the computed bounding box and can integrate semantic mid-level cues of deeply-learned organ interior and boundary maps, obtained by two additional and separate realizations of HNNs. By integrating these two mid-level cues, the disclosed methods are capable of generating boundary-preserving pixel-wise class label maps that result in exceptional final organ segmentations.

Description

SPATIAL AGGREGATION OF HOLISTICALLY-NESTED CONVOLUTIONAL NEURAL NETWORKS FOR AUTOMATED ORGAN LOCALIZATION AND SEGMENTATION IN 3D MEDICAL SCANS
CROSS-REFERENCE TO RELATED APPLICATIONS
This application claims the benefit of U.S. Provisional Patent Application No. 62/345,606, filed June 3, 2016, and U.S. Provisional Patent Application No. 62/450,681, filed January 26, 2017, both of which are incorporated by reference herein in their entireties.
FIELD
This application is related to methods and systems for organ localization and segmentation using medical imaging data.
BACKGROUND
Segmentation of the pancreas, lymph nodes, and other organs in computed tomography (CT) challenges current computer-aided diagnosis (CAD) systems. While automatic segmentation of numerous other organs in CT scans, such as the liver, heart, or kidneys, achieves good performance with Dice similarity coefficients (DSCs) of >90%, segmentation of other organs is more difficult; for example, the pancreas' variable shape, size, and location in the abdomen limit segmentation accuracy, with <73% DSC reported in the literature. Previous pancreas segmentation works have all been based on performing volumetric multiple atlas registration and executing robust label fusion methods to optimize the per-pixel organ labeling process. This type of organ segmentation strategy is widely used for many organ segmentation problems, such as the brain, heart, lung, and pancreas. These methods can be referred to as a top-down model fitting approach, or more specifically, MALF (Multi-Atlas Registration & Label Fusion). Another group of top-down frameworks leverages statistical model detection, e.g., generalized Hough transform or marginal space learning for organ localization, and deformable statistical shape models for object segmentation. However, due to the intrinsic 3D shape variability of the pancreas, lymph nodes, and other organs, statistical shape modeling has not been applied for segmentation of these organs.
SUMMARY
Accurate and automatic organ segmentation from 3D radiological scans is an important yet challenging problem for medical image analysis. Specifically, as a small, soft, and flexible abdominal organ, the pancreas demonstrates very high inter-patient anatomical variability in both its shape and volume. This inhibits traditional automated segmentation methods from achieving high accuracies, especially compared to the performance obtained for other organs, such as the liver, heart, or kidneys. To fill this gap, we present an automated segmentation system for 3D computed tomography (CT) volumes that is based on a two-stage cascaded approach comprising pancreas localization and pancreas segmentation. For the first step, we localize the pancreas from the entire 3D CT scan, providing a reliable bounding box for the more refined segmentation step. We introduce a fully deep-learning approach, based on an efficient application of holistically-nested convolutional networks (HNNs) on the three orthogonal axial, sagittal, and coronal views. The resulting HNN per-pixel probability maps are then fused using pooling to reliably produce a 3D bounding box of the pancreas that maximizes the recall. We show that our introduced localizer compares favorably to both a conventional non-deep-learning method and a recent hybrid approach based on spatial aggregation of superpixels using random forest classification. The second, segmentation, phase operates within the computed bounding box and integrates semantic mid-level cues of deeply-learned organ interior and boundary maps, obtained by two additional and separate realizations of HNNs. By integrating these two mid-level cues, our method is capable of generating boundary-preserving pixel-wise class label maps that result in exceptional final pancreas segmentations. Quantitative evaluation is performed on a publicly available dataset of 82 patient CT scans using 4-fold cross-validation (CV). We achieve a (mean ± std. dev.) Dice similarity coefficient (DSC) of 81.27±6.27% in validation, which significantly outperforms both a previous state-of-the-art method and a preliminary version of this work, which report DSCs of 71.80±10.70% and 78.01±8.20%, respectively, using the same dataset.
The foregoing and other objects, features, and advantages of the disclosed technology will become more apparent from the following detailed description, which proceeds with reference to the accompanying figures.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a flowchart of the proposed two-stage pancreas localization and segmentation framework. Sec. II-B2 and Sec. II-B3 are the alternative means of bottom-up organ localization. The remaining modules are for pancreas segmentation.
FIG. 2 illustrates the HNN-I/B network architecture for both interior (left images) and boundary (right images) detection pathways. We highlight the error back-propagation paths to illustrate the deep supervision performed at each side-output layer after the corresponding convolutional layer. As the side-outputs become smaller, the receptive field sizes get larger. This allows HNN to combine multi-scale and multi-level outputs in a learned weighted fusion layer. The ground truth images are inverted for aided visualization.
FIG. 3 illustrates a candidate bounding box region generation pipeline (left to right). Gold standard pancreas in red.
FIG. 4 illustrates a candidate bounding box region generation. Gold standard pancreas in red, blobs of >0.5 probability in green, the selected largest 3D connected component in purple, the resulting candidate bounding box in yellow.
FIG. 5 is a multiscale combinatorial grouping (MCG) on three different scales of learned boundary prediction maps from HNN-B (Y_side^B2, Y_side^B3, and Y_fuse^B), using the original CT image on the far left as input (with ground truth delineation of pancreas in red). MCG computes superpixels at each scale and produces a set of merged superpixel-based object proposals. We only visualize the boundary probabilities whose values are greater than 0.10.
FIG. 6 (a)-(f) illustrate an example for comparison of regression forest (RF, a-c) and HNN-I (d-f) for pancreas localization. Green and red boxes are ground truth and detected bounding boxes respectively. The green dot denotes the ground truth center. This case demonstrates a case in the 90th percentile in RF localization distance and serves as a representative of poorly performing localization. In contrast, HNN-I includes all of the pancreas with nearly 100% recall in this case.
FIG. 7 (a) and (b) are histogram plots (Y-axis) of regression forest based bounding boxes (a) and HNN-I's generated bounding boxes (b) in recalls (X-axis) covering the ground-truth pancreas masks in 3D. Note that regression forest produces 16 out of 82 bounding boxes that lie below 60% in pixel-to-pixel recall, while HNN-I produces 100% recalls, except for two cases >94.54%.
FIG. 8A-8C illustrate average DSC performance as a function of pancreas probability using HNNmeanmax (8A) and spatial aggregation via RF (8B) for comparison. Note that the DSC performance remains much more stable after RF aggregation with respect to the probability threshold. The percentage of total cases that lie above a certain DSC with RF are shown in 8C: 80% of the cases have a DSC of 78.05%, and 90% of the cases have a DSC of 74.22% and higher.
FIG. 9 (a)-(d) shows examples of our HNN-RF pancreas segmentation results (green) comparing with the ground-truth annotation (red). The best performing case (a), two cases with DSC scores close to the data set mean (b,c) and the worst case are shown (d).
FIG. 10 shows examples of thoracoabdominal lymph node clusters in CT images with ground truth (red) boundaries.
FIG. 11 illustrates disclosed frameworks integrating trained holistically-nested neural networks that capture the interior appearance and boundary cues of the organ to segment, via structured optimization (a) or superpixel based spatial aggregation (b). In (a), three different grid-structured representation and optimization methods are used and evaluated, namely dense CRF, graph cuts (GC), and boundary neural fields (BNF). For dense CRF, the pairwise energy terms are not learned but directly computed from the CT intensity contrast and pixel distance measurements. In both GC and BNF, the learned outputs of HNN-B are incorporated into the pairwise interactions between pixels within a large spatial neighborhood, e.g., 20x20. In (b), the spatial aggregation is performed on the enforced boundary-preserving superpixels computed using multiscale boundary maps from HNN-B.
FIG. 12 illustrates, at left, a cropped CT image with a lymph node located in the center, together with its associated HNN-A (middle) and HNN-B before non-maximum suppression (right) maps.
FIG. 13 shows examples of LN CT image segmentation. Top: original CT images with ground truth (red) and BNF segmented (green) boundaries. Bottom: HNN-A LN probability maps. Left and Middle depict successful segmentation results. Right represents an unsuccessful case.
FIG. 14 shows three graphs comparing ground truth and predicted LN volumes, for (a) HNN-A, (b) boundary neural fields, and (c) graph cuts.
FIG. 15 shows examples of lymph node thoracoabdominal CT image segmentation. Top: original CT images with ground truth (red) and BNF segmented (green) boundaries. Bottom: HNN-A lymph node probability maps (red: probability 1; blue: probability 0).
DETAILED DESCRIPTION
I. Introduction
Pancreas segmentation in computed tomography (CT) challenges current computer-aided diagnosis (CAD) systems. While automatic segmentation of numerous other organs in CT scans, such as the liver, heart, or kidneys, achieves good performance with Dice similarity coefficients (DSCs) of >90%, the pancreas' variable shape, size, and location in the abdomen limit segmentation accuracy, with <73% DSC reported in the literature. Previous pancreas segmentation works are all based on performing volumetric multiple atlas registration and executing robust label fusion methods to optimize the per-pixel organ labeling process. This type of organ segmentation strategy is widely used for many organ segmentation problems, such as the brain, heart, lung, and pancreas. These methods can be referred to as a top-down model fitting approach, or more specifically, MALF (Multi-Atlas Registration & Label Fusion). Another group of top-down frameworks leverages statistical model detection, e.g., generalized Hough transform or marginal space learning for organ localization, and deformable statistical shape models for object segmentation. However, due to the huge intrinsic 3D shape variability of the pancreas, statistical shape modeling has not been applied for pancreas segmentation.
Recently, a new bottom-up pancreas segmentation representation has been proposed, which uses dense binary image patch labeling confidence maps that are aggregated to classify image regions, or superpixels, into pancreas and non-pancreas label assignments. This method's motivation is to improve segmentation accuracy of highly deformable organs, such as the pancreas, by leveraging mid-level visual representations of image segments. This work was advanced further by Roth et al., who proposed a probabilistic bottom-up approach using a set of multi-scale and multi-level deep convolutional neural networks (CNNs) to capture the complexity of pancreas appearance in CT images. The resulting system improved the performance, with a reported DSC of 71.8±10.7% against 68.8±25.6%. Compared to the MALF based pancreas segmentation works that are evaluated using a "leave-one-patient-out" (LOO) protocol, the bottom-up approaches using superpixel representation have reported comparable or higher DSC accuracy measurements under more challenging 6-fold or 4-fold cross-validation. LOO can be considered an extreme case of M-fold cross-validation with M = N, when N patient datasets are available for experiments. When M decreases and becomes significantly smaller than N, M-fold CV becomes more challenging, since there is less data for training and there are more patient cases for testing. Comparing the two bottom-up approaches, the usage of deep CNN models has noticeably improved the performance stability, as evidenced by a significantly smaller standard deviation than all other top-down or bottom-up works.
Deep CNNs have successfully been applied to many high-level tasks in medical imaging, such as recognition and object detection. The main advantage of CNNs comes from the fact that end-to-end learning of salient feature representations for the task at hand is more effective than handcrafted features with heuristically tuned parameters. Similarly, CNNs demonstrate promising performance for pixel-level labeling problems, e.g., semantic segmentation in recent computer vision and medical imaging analysis work, e.g., fully convolutional neural networks (FCN), DeepLab, and U-Net. These approaches have all garnered significant improvements in performance over previous methods by applying state-of-the-art CNN-based image classifiers and representation to the semantic segmentation problem in both domains.
Semantic organ segmentation involves assigning a label to each pixel in the image. On one hand, features for classification of single pixels (or patches) play a major role, but on the other hand, factors such as edges, i.e., organ boundaries, appearance consistency, and spatial consistency, could greatly impact the overall system performance. Furthermore, there are indications of semantic vision tasks requiring hierarchical levels of visual perception and abstraction. As such, generating rich feature hierarchies for both the interior and the boundary of the organ could provide important "mid-level visual cues" for semantic segmentation. Subsequent spatial aggregation of these mid- level cues then has the prospect of improving semantic segmentation methods by enhancing the accuracy and consistency of pixel-level labeling.
We have also demonstrated that a two-stage bottom-up localization and segmentation approach can improve upon the state of the art. Here, the major extension is that we describe an improved pancreas localization method, replacing the initial superpixel-based one with a new, general deep learning based approach. This methodological component is designed to optimize or maximize the pancreas spatial recall criterion while reducing the non-pancreas volume as much as possible. Specifically, we generate the per-pixel pancreas class probability maps (or "heat maps") through an efficient combination of holistically-nested convolutional networks (HNNs) in the three orthogonal axial, sagittal, and coronal CT views. We fuse the three HNN outputs to produce a 3D bounding box covering the underlying, yet latent in testing, pancreas volume by nearly 100%. In addition, we show that exactly the same HNN model architecture can be effective for the subsequent pancreas segmentation stage by integrating both deeply learned boundary and appearance cues. This also results in a simpler overall pancreas localization and segmentation system using HNNs only, rather than the previous hybrid setup involving non-deep- and deep-learning method components. Lastly, our current method reports an overall improved DSC performance compared to other methods: e.g., a DSC of 81.14±7.3% versus 78.0±8.2% and 71.8±10.7%.
The disclosed two-stage process essentially performs 3D spatial aggregation and assembling on the HNN-produced per-pixel pancreas probability maps that run on 2D axial, coronal, and sagittal CT planes. This process operates exhaustively for pancreas localization and selectively for pancreas segmentation. Therefore, this work inherits a hierarchical and compositional visual representation of computing 3D object information aggregated from 2D image slices or parts.
Alternatively, there are recent studies on directly using 3D convolutional neural networks for liver and brain segmentation and for volumetric vascular boundary detection. Due to CNN memory restrictions, these 3D CNN approaches adopt padded sliding windows or volumes to process the original CT scans, such as 96x96x48 segments, 160x160x72 subvolumes, and 80x80x80 windows, which may cause segmentation discontinuities or inconsistencies at overlapped window boundaries. We argue that learning shareable lower-dimensional 2D CNN models may be more generalizable and handle the "curse-of-dimensionality" issue better than their fully 3D counterparts, especially when used to parse complex 3D anatomical structures, e.g., lymph node clusters and the pancreas. Analogous examples of comparing compositional multi-view 2D CNNs versus direct 3D deep models can be found in other computer vision problems: 1) video based action recognition, where a two-stream 2D CNN model, capturing the image intensity and motion cues, significantly improves upon the 3D CNN method; 2) the advantageous performance of multi-view CNNs over volumetric CNNs in 3D shape recognition.
II. Methods
In this work, we present a two-phased approach for automated pancreas localization and segmentation. The pancreas localization step aims to robustly compute a bounding box which, at the desirable setting, should cover the entire pancreas while pruning the large majority of the volumetric space from any input CT scan, without any manual pre-processing. The second stage of pancreas segmentation incorporates deeply learned organ interior and boundary mid-level cues with subsequent spatial aggregation, focusing only on the properly zoomed or cascaded pancreas location and spatial extents that are generated after the first phase. In Sec. II-A we introduce the HNN model that proves effective for both stages. Afterwards, we focus on localization in Sec. II-B, which discusses and contrasts a conventional approach to localization with newer CNN-based ones: a hybrid and a fully deep-learning approach. We show how the latter approach, which relies on HNNs, provides a simple, yet state-of-the-art, localization method. Importantly, it relies on the same HNN architecture as the later segmentation step. With localization discussed, we explain our segmentation approach in Sec. II-C, which relies on combining semantic mid-level cues produced from HNNs. Our approach to organ segmentation is based on simple, reproducible, yet effective, machine-learning principles. In particular, we demonstrate that the most effective configuration of our system is simply composed of cascading and aggregating outputs from six HNNs trained at three orthogonal views and two spatial scales. No multi-atlas registration or multi-label fusion techniques are employed. Fig. 1 provides a flowchart depicting the makeup of our system.
A. Learning Mid-level Cues via Holistically-Nested Networks for Localization and Segmentation
We use the HNN architecture to learn the pancreas' interior and boundary image-labeling maps, for both localization and segmentation. Object-level interior and boundary information is referred to as mid-level visual cues. Note that this type of CNN architecture was first proposed under the name "holistically-nested edge detection" (HED) as a deep learning based general image edge detection method, and it has been used successfully for extracting "edge-like" structures such as blood vessels in 2D retina images. We argue and validate, however, that it can serve as a suitable deep representation to learn general raw pixel-in, label-out mapping functions, i.e., to perform semantic segmentation. We use these principles to segment the interior of organs. HNN can address two important issues: (1) training and prediction on the whole image, end-to-end, i.e., holistically, using a per-pixel labeling cost; and (2) incorporating multi-scale and multi-level learning of deep image features via auxiliary cost functions at each convolutional layer. HNN computes image-to-image or pixel-to-pixel prediction maps from any input raw image to its annotated labeling map, building on fully convolutional neural networks (FCNs) and deeply-supervised nets. The per-pixel labeling cost function makes it feasible for HNN/FCN to be effectively trained using only several hundred annotated image pairs. This enables the automatic learning of rich hierarchical feature representations and contexts that are critical to resolving spatial ambiguity in the segmentation of organs. The network structure is initialized from an ImageNet pre-trained VGGNet model. Fine-tuning CNNs pre-trained on general image classification tasks is helpful for low-level tasks, e.g., edge detection. Furthermore, we can utilize pre-trained edge-detection networks (trained on BSDS500, for example) to segment organ-specific boundaries.
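For concreteness, the following sketch shows how an HNN/HED-style network with deeply-supervised side outputs can be assembled from an ImageNet pre-trained VGG16 backbone, here in PyTorch. The stage boundaries, single-channel 1x1 side-output convolutions, and the uniformly initialized weighted-fusion layer follow the published HED design; all class and variable names are illustrative rather than the exact disclosed implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import vgg16, VGG16_Weights

class HNN(nn.Module):
    """HED-style net: VGG16 stages + 1x1-conv side outputs + learned fusion."""

    def __init__(self):
        super().__init__()
        features = vgg16(weights=VGG16_Weights.IMAGENET1K_V1).features
        # Five convolutional stages of VGG16, split at the pooling layers,
        # giving output strides of 1, 2, 4, 8 and 16 respectively.
        self.stages = nn.ModuleList([
            features[:4], features[4:9], features[9:16],
            features[16:23], features[23:30],
        ])
        # One 1x1 convolution per stage producing a single-channel side output.
        self.side = nn.ModuleList([
            nn.Conv2d(c, 1, kernel_size=1) for c in (64, 128, 256, 512, 512)
        ])
        # Weighted-fusion layer over the M = 5 upsampled side outputs.
        self.fuse = nn.Conv2d(5, 1, kernel_size=1, bias=False)
        nn.init.constant_(self.fuse.weight, 1.0 / 5)

    def forward(self, x):
        h, w = x.shape[2:]
        side_outputs = []
        for stage, side in zip(self.stages, self.side):
            x = stage(x)
            # Upsample each side-output activation map back to input size.
            s = F.interpolate(side(x), size=(h, w),
                              mode="bilinear", align_corners=False)
            side_outputs.append(s)
        fused = self.fuse(torch.cat(side_outputs, dim=1))
        return fused, side_outputs  # logits; apply sigmoid for probabilities
```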
Network formulation: Our training data is

$$S = \{(X_n, Y_n^I, Y_n^B),\ n = 1, \ldots, N\},$$

where $X_n$ denotes the cropped axial CT images, rescaled to within $[0, \ldots, 255]$ using a soft-tissue window, and $Y_n^I$ and $Y_n^B$ denote the binary ground truths of the interior and boundary map of the pancreas, respectively, for any corresponding $X_n$. Each image is considered holistically and independently. The network is able to learn features from these images alone, from which interior and boundary prediction maps can be produced; we denote the corresponding models as HNN-I and HNN-B, respectively. HNN can efficiently generate multi-level image features due to its deep architecture. Furthermore, multiple stages with different convolutional strides can capture the inherent scales of organ edge/interior labeling maps. However, due to the difficulty of learning such deep neural networks with multiple stages from scratch, we use the pre-trained network, fine-tuned to our specific training data sets with a relatively small learning rate of $10^{-6}$. We use the HNN network architecture with 5 stages, with strides of 1, 2, 4, 8 and 16, respectively, and with different receptive field sizes. In addition to standard CNN layers, a HNN network has $M$ side-output layers as shown in Fig. 2. These side-output layers are also realized as classifiers, whose corresponding weights are $\mathbf{w} = (\mathbf{w}^{(1)}, \ldots, \mathbf{w}^{(M)})$. For simplicity, all standard network layer parameters are denoted as $\mathbf{W}$. Hence, the following objective function can be defined:

$$\mathcal{L}_{\mathrm{side}}(\mathbf{W}, \mathbf{w}) = \sum_{m=1}^{M} \alpha_m\, \ell_{\mathrm{side}}^{(m)}(\mathbf{W}, \mathbf{w}^{(m)}) \qquad (1)$$
Here, $\ell_{\mathrm{side}}$ denotes an image-level loss function for side-outputs, computed over all pixels in a training image pair $X$ and $Y$. Because of the heavy bias towards non-labeled pixels in the ground truth data, we apply a strategy to automatically balance the loss between positive and negative classes via a per-pixel class-balancing weight $\beta$. This offsets the imbalances between edge/interior ($y = 1$) and non-edge/exterior ($y = 0$) samples. Specifically, a class-balanced cross-entropy loss function can be used in Eq. (1) above, with $j$ iterating over the spatial dimensions of the image:

$$\ell_{\mathrm{side}}^{(m)}(\mathbf{W}, \mathbf{w}^{(m)}) = -\beta \sum_{j \in Y_+} \log \Pr(y_j = 1 \mid X; \mathbf{W}, \mathbf{w}^{(m)}) - (1 - \beta) \sum_{j \in Y_-} \log \Pr(y_j = 0 \mid X; \mathbf{W}, \mathbf{w}^{(m)}) \qquad (2)$$

Here, $\beta$ is simply $|Y_-|/|Y|$ and $1 - \beta = |Y_+|/|Y|$, where $Y_-$ and $Y_+$ denote the ground truth sets of negatives and positives, respectively. In contrast to examples where $\beta$ is computed for each training image independently, we use a constant balancing weight computed on the entire training set. This is because some training slices might have no positives at all and would otherwise be ignored in the loss function. The class probability $\Pr(y_j = 1 \mid X; \mathbf{W}, \mathbf{w}^{(m)}) = \sigma(a_j^{(m)}) \in [0, 1]$ is computed on the activation value at each pixel $j$ using the sigmoid function $\sigma(\cdot)$. Now, organ edge/interior map predictions $\hat{Y}_{\mathrm{side}}^{(m)} = \sigma(\hat{A}_{\mathrm{side}}^{(m)})$ can be obtained at each side-output layer, where $\hat{A}_{\mathrm{side}}^{(m)} \equiv \{a_j^{(m)},\ j = 1, \ldots, |Y|\}$ are the activations of the side-output of layer $m$. Finally, a "weighted-fusion" layer is added to the network that can be simultaneously learned during training. The loss function at the fusion layer $\mathcal{L}_{\mathrm{fuse}}$ is defined as

$$\mathcal{L}_{\mathrm{fuse}}(\mathbf{W}, \mathbf{w}, \mathbf{h}) = \mathrm{Dist}\left(Y, \hat{Y}_{\mathrm{fuse}}\right), \quad \hat{Y}_{\mathrm{fuse}} \equiv \sigma\!\left(\sum_{m=1}^{M} h_m \hat{A}_{\mathrm{side}}^{(m)}\right) \qquad (3)$$

with $\mathbf{h} = (h_1, \ldots, h_M)$ being the fusion weight. $\mathrm{Dist}(\cdot, \cdot)$ is a distance measure between the fused predictions and the ground truth label map. We use cross-entropy loss for this purpose. Hence, the following objective function can be minimized via standard stochastic gradient descent and back-propagation:

$$(\mathbf{W}, \mathbf{w}, \mathbf{h})^\star = \operatorname{argmin}\left(\mathcal{L}_{\mathrm{side}}(\mathbf{W}, \mathbf{w}) + \mathcal{L}_{\mathrm{fuse}}(\mathbf{W}, \mathbf{w}, \mathbf{h})\right) \qquad (4)$$
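A minimal sketch of Eqs. (2)-(4), with the constant balancing weight β computed once over the entire training set as described above (function names are illustrative; the α_m weights default to 1):

```python
import torch

def dataset_beta(ground_truth_masks):
    """Constant balancing weight: beta = |Y-| / |Y| over the whole training set."""
    total = sum(m.numel() for m in ground_truth_masks)
    positives = sum(int(m.sum()) for m in ground_truth_masks)
    return (total - positives) / total

def class_balanced_bce(logits, target, beta):
    """Eq. (2): -beta * sum_{Y+} log p  -  (1 - beta) * sum_{Y-} log(1 - p)."""
    target = target.float()
    p = torch.sigmoid(logits).clamp(1e-6, 1 - 1e-6)
    pos = -(beta * target * torch.log(p)).sum()
    neg = -((1 - beta) * (1 - target) * torch.log(1 - p)).sum()
    return pos + neg

def hnn_objective(fused, side_outputs, target, beta, alphas=None):
    """Eq. (4): side-output losses (Eq. (2)) plus fused-prediction loss (Eq. (3))."""
    alphas = alphas or [1.0] * len(side_outputs)
    loss = class_balanced_bce(fused, target, beta)
    for a, s in zip(alphas, side_outputs):
        loss = loss + a * class_balanced_bce(s, target, beta)
    return loss
```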
Testing phase: Given image $X$, we obtain both interior (HNN-I) and boundary (HNN-B) predictions from the models' side-output layers and the weighted-fusion layer:

$$\left(\hat{Y}_{\mathrm{fuse}}, \hat{Y}_{\mathrm{side}}^{(1)}, \ldots, \hat{Y}_{\mathrm{side}}^{(M)}\right) = \text{HNN-I/B}(X) \qquad (5)$$

where HNN-I/B$(\cdot)$ denotes the interior or boundary prediction maps estimated by the respective network.
B. Pancreas Localization
Segmentation performance can be enhanced if irrelevant regions of the CT volume are pruned out. Conventional organ localization methods using random forest regression, which we explain in Sec. II-B1, may not guarantee that the regressed organ bounding box contains the targeted organ with extremely high sensitivity at the pixel-level coverage. In Sec. II-B2 we outline a superpixel-based approach, built on hand-crafted and CNN features, that provides improved performance. While this is effective, its complexity motivates our development of a simpler and more accessible multi-view HNN fusion based procedure, explained in Sec. II-B3. The output of the localization method later feeds into a more detailed and accurate segmentation method combining multiple mid-level cues from HNNs, as illustrated in Fig. 1.
1) Regression Forest: In object localization by regression, the general idea is to predict an offset vector

$$\mathbf{v}(\mathbf{x}) = \hat{\mathbf{x}} - \mathbf{x}$$

for a given image patch $I(\mathbf{x})$ centered about a voxel $\mathbf{x}$. The predicted object position is then given as

$$\hat{\mathbf{x}} = \mathbf{x} + \mathbf{v}(\mathbf{x}).$$

This is repeated for many examples of image patches and then aggregated to produce a final predicted position. Aggregation can be done with non-maximum suppression on prediction voting maps, mean aggregation, cluster medoid aggregation, and the use of local appearance with discriminative models to accept or reject predictions. The pancreas can be localized by regression due to its location in the body in correlation with other anatomical structures. The objective is to predict bounding boxes

$$\mathbf{b} = (\mathbf{x}_c, \mathbf{x}_l, \mathbf{x}_u),$$

where $\mathbf{x}_c$ is the center of the pancreas and $\mathbf{x}_l$, $\mathbf{x}_u$ are the lower and upper corners of the pancreas bounding box, respectively. The addition of the extra three parameters follows from the observation that the center of the bounding box is not necessarily the center of the localized object. The pancreas Regression Forest predicts

$$\mathbf{v}(\mathbf{x}) = (\mathbf{x}_c - \mathbf{x},\ \mathbf{x}_l - \mathbf{x},\ \mathbf{x}_u - \mathbf{x})$$

for a given image patch $I(\mathbf{x})$, which produces pancreas bounding box candidates of the form

$$\hat{\mathbf{b}} = (\mathbf{x} + \mathbf{v}_c,\ \mathbf{x} + \mathbf{v}_l,\ \mathbf{x} + \mathbf{v}_u).$$

We additionally use a discriminative model to accept or reject predictions $\hat{\mathbf{b}}$.
Finally, accepted predictions are aggregated using non-maximum suppression over probability scores, and the bounding boxes are then ranked by the count of accepted predictions falling within each box. The box with the highest count of predictions is kept as the final prediction.

2) Random Forest on Superpixels: As a form of initialization, we alternatively employ a method based on random forest (RF) classification, using both hand-crafted and deep CNN-derived image features, to compute candidate bounding box regions. We only operate the RF labeling at a low probability threshold of >0.5, which is sufficient to reject the vast majority of non-pancreas tissue in the CT images. This initial candidate generation is sufficient to extract bounding box regions that nearly surround the pancreases completely in all patient cases, with ~97% recall. All candidate regions are computed during the testing phase of cross-validation (CV). As we will see next, candidate generation can be done even more efficiently by using the same HNN architectures, which are based on convolutional neural networks. Technical details of HNNs are described in Sec. II-A.
3) Multi-view Aggregated HNNs: As an alternative to the candidate region generation process described in Sec. II-B2, which uses hybrid deep and non-deep learning techniques, we employ HNN-I (interior; see Sec. II-A) as a building block for pancreas localization, motivated by the effectiveness of HNN in capturing the complex pancreas appearance in CT images. This enables us to drastically discard large negative volumes of the CT scan while operating HNN-I at a conservative probability threshold of >=0.5 that retains high sensitivity/recall (>99%). The constant balancing weight β used during HNN-I training is critical in this step, since the majority of CT slices contain no pancreas at all; these slices are nevertheless included in training so that the HNN-I models learn to suppress pancreas probability values in the background. Furthermore, we perform a largest connected-component analysis to remove outlier "blobs" of high probabilities. To remove small incorrect connections between high-probability blobs, we first perform an erosion step with a radius of 1 voxel, then select the largest connected component, and subsequently dilate the region again (Fig. 3). HNN-I models are trained on the axial, coronal, and sagittal planes in order to make use of the multi-view representation of 3D image context. Empirically, we found a max-pooling operation across the three models to give the highest sensitivity/recall while still being sufficient to reject the vast amount of non-pancreas tissue in the CT images (see Table II). One illustrative example is demonstrated in Fig. 4. This initial candidate generation is sufficient to extract bounding box regions that completely surround the pancreases with nearly 100% recall. All candidate regions are computed during the testing phase of cross-validation (CV) with the same split. Note that this candidate region proposal can be important for further processing: it removes "easy" non-pancreas tissue from further analysis and allows HNN-I and HNN-B to focus on the more difficult distinction of the pancreas versus its immediately surrounding tissue. The fact that we can use exactly the same HNN model architecture for both stages is noteworthy.
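The multi-view max-pooling and connected-component cleanup above amount to a few array operations once the three per-view probability volumes are resampled onto a common voxel grid. A minimal sketch under that assumption (the one-voxel structuring element mirrors the erosion/dilation described above):

```python
import numpy as np
from scipy import ndimage as ndi

def pancreas_candidate_mask(prob_ax, prob_co, prob_sa, threshold=0.5):
    """Multi-view max-pooling plus largest-component cleanup (Sec. II-B3).

    All inputs are per-voxel HNN-I probabilities already resampled onto the
    same (Z, Y, X) grid as the CT volume.
    """
    # Element-wise max across axial/coronal/sagittal models keeps recall high.
    fused = np.maximum(np.maximum(prob_ax, prob_co), prob_sa)
    mask = fused >= threshold

    # Erode by one voxel to break thin spurious connections between blobs.
    structure = ndi.generate_binary_structure(3, 1)
    eroded = ndi.binary_erosion(mask, structure=structure)

    labels, n = ndi.label(eroded)
    if n == 0:
        return eroded
    sizes = ndi.sum(eroded, labels, index=range(1, n + 1))
    largest = labels == 1 + int(np.argmax(sizes))

    # Dilate the surviving component back toward its original extent.
    return ndi.binary_dilation(largest, structure=structure)
```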
C. Pancreas Segmentation
With the pancreas localized, the next step is to produce a reliable segmentation. Our segmentation pipeline consists of three steps. We first use HNN probability maps to generate mid-level boundary and interior cues. These cues are then used to produce superpixels, which are in turn aggregated into a final segmentation using RF classification.
1) Combining Mid-level Cues via HNNs: We now show that organ segmentation can benefit from multiple mid-level cues, such as organ interior and boundary predictions. We investigate deep-learning based approaches to independently learn the pancreas' interior and boundary mid-level cues. Combining both cues via learned spatial aggregation can elevate the overall performance of this semantic segmentation system. Organ boundaries are a major mid-level cue for defining and delineating the anatomy of interest, and they can prove essential for accurate semantic segmentation of an organ.
2) Learning Organ-specific Segmentation Proposals: Multiscale combinatorial grouping (MCG) is one of the state-of-the-art methods for generating segmentation object proposals in computer vision. We utilize this approach to generate organ-specific superpixels based on the learned boundary prediction maps of HNN-B. Superpixels are extracted via a continuous oriented watershed transform at three different scales supervisedly learned by HNN-B. This allows the computation of a hierarchy of superpixel partitions at each scale, and merges superpixels across scales, thereby efficiently exploring their combinatorial space. This, in turn, allows MCG to group the merged superpixels toward object proposals. We find that the first two levels of MCG object proposals are sufficient to achieve ~88% DSC (see Table IV and Fig. 5), with the superpixel labels optimally computed using their spatial overlap ratios against the segmentation ground truth map. All merged superpixels S from the first two levels are used for the subsequent spatial aggregation step. Note that HNN-B can only be trained using axial slices, where the manual annotation was performed; pancreas boundary maps in coronal and sagittal views can display strong artifacts.
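Full MCG is a substantial pipeline; the following much-simplified sketch shows only the core idea of deriving boundary-preserving superpixel partitions from an HNN-B probability map via a watershed transform at several scales. The seed threshold and smoothing scales are illustrative assumptions, not values from this disclosure:

```python
import numpy as np
from scipy import ndimage as ndi
from skimage.segmentation import watershed

def boundary_superpixels(hnn_b, sigmas=(1.0, 2.0, 4.0), seed_thresh=0.1):
    """Superpixel partitions at several scales from a 2D boundary map.

    hnn_b: 2D array of HNN-B boundary probabilities in [0, 1].
    Returns one label image per scale (coarser for larger sigma).
    """
    partitions = []
    for sigma in sigmas:
        surface = ndi.gaussian_filter(hnn_b, sigma)
        # Seeds: connected regions of low boundary probability.
        markers, _ = ndi.label(surface < seed_thresh)
        # Flood the boundary "topography" outward from the seeds.
        partitions.append(watershed(surface, markers))
    return partitions
```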
3) Spatial Aggregation with Random Forest: We use the superpixel set $S$ generated previously to extract features for spatial aggregation via random forest classification. Within any superpixel $s \in S$, we compute simple statistics, including the 1st-4th order moments and 8 percentiles, on the CT intensities and on a per-pixel element-wise pooling of the multi-view HNN-I maps and the HNN-B map. Additionally, we compute the mean x, y, and z coordinates normalized by the range of the 3D candidate region (Sec. II-B3). This results in 39 features describing each superpixel, which are used to train an RF classifier on the positive and negative training superpixels at each round of 4-fold CV. Empirically, we find 50 trees to be sufficient to model our feature set. A final 3D pancreas segmentation is simply obtained by stacking each slice prediction back into the original CT volume space. No further post-processing is employed. This complete pancreas segmentation model is denoted as HNN-RF.
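A hedged sketch of the per-superpixel feature extraction and RF classification described above; the exact composition of the 39 features is approximated here (12 statistics for each of three channels plus 3 normalized coordinates), the percentile levels are assumed, and scikit-learn stands in for whatever RF implementation was used:

```python
import numpy as np
from scipy.stats import skew, kurtosis
from sklearn.ensemble import RandomForestClassifier

PERCENTILES = (10, 20, 30, 40, 60, 70, 80, 90)   # 8 percentiles (assumed)

def channel_stats(values):
    """1st-4th order moments plus 8 percentiles of one value channel."""
    return [values.mean(), values.var(), skew(values), kurtosis(values),
            *np.percentile(values, PERCENTILES)]

def superpixel_features(labels, ct, pooled_hnn_i, hnn_b, bbox_shape):
    """12 stats x 3 channels + 3 normalized mean coordinates = 39 features."""
    ids, feats = np.unique(labels), []
    for sp in ids:
        m = labels == sp
        coords = np.argwhere(m).mean(axis=0) / np.asarray(bbox_shape)
        feats.append(channel_stats(ct[m]) +
                     channel_stats(pooled_hnn_i[m]) +
                     channel_stats(hnn_b[m]) +
                     coords.tolist())
    return ids, np.asarray(feats)

# 50 trees suffice for this feature set per the text; training labels come
# from each superpixel's overlap with the ground-truth pancreas mask.
rf = RandomForestClassifier(n_estimators=50, random_state=0)
```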
III. Experimental Results
A. Data
Manual tracings of the pancreas for 82 contrast-enhanced abdominal CT volumes are provided by a publicly available dataset, for ease of comparison. Our experiments are conducted on random splits of ~60 patients for training and ~20 for unseen testing, in 4-fold cross-validation throughout this section, unless otherwise mentioned.
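For illustration, the 4-fold patient-level split can be reproduced with a standard cross-validation utility (assuming a simple random partition; any stratification in the original splits is not stated):

```python
import numpy as np
from sklearn.model_selection import KFold

patient_ids = np.arange(82)
kfold = KFold(n_splits=4, shuffle=True, random_state=0)

for fold, (train_idx, test_idx) in enumerate(kfold.split(patient_ids)):
    # Roughly 61-62 patients for training and 20-21 for unseen testing per fold.
    print(f"fold {fold}: train={len(train_idx)}, test={len(test_idx)}")
```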
B. Evaluation
We perform extensive quantitative evaluation on different configurations of our method and compare to the previous state-of-the-art work with in-depth analysis.
1) Localization: From our empirical study, the candidate region bounding box generation based on multi-view max-pooled HNN-Is (Sec. II-B3) and the previous hybrid method (Sec. II-B2) work comparably in terms of producing spatially-truncated 3D regions that maximally cover the pancreas at the pixel level while rejecting as much background space as possible. An average reduction in absolute volume of 90.36% (range [80.45%-96.26%]) between the CT scan and the candidate bounding box is achieved during this step, while keeping a mean recall of 99.93% (range [94.54%-100.00%]). Table I shows the test performance of pancreas localization and bounding box prediction using regression forests, in DSC and average Euclidean distance against the gold standard bounding boxes. As illustrated in Fig. 7, regression forest based localization generates 16 out of 82 bounding boxes that lie below 60% in pixel-to-pixel recall against the ground-truth pancreas masks. Nevertheless, we obtain nearly 100% recall for all scans (except for two cases, each >94.54%) through the multi-view max-pooled HNN-Is. An example of a detected pancreas can be seen in Fig. 6.
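Both localization metrics quoted above reduce to simple voxel counts over the ground-truth mask and the candidate box; a minimal sketch with illustrative names:

```python
import numpy as np

def localization_metrics(gt_mask, bbox):
    """Pixel-level recall of a candidate box and the volume reduction it buys.

    gt_mask: boolean (Z, Y, X) ground-truth pancreas mask.
    bbox:    tuple of three slices defining the candidate region.
    """
    box_mask = np.zeros(gt_mask.shape, dtype=bool)
    box_mask[bbox] = True
    recall = (gt_mask & box_mask).sum() / gt_mask.sum()
    volume_reduction = 1.0 - box_mask.sum() / box_mask.size
    return recall, volume_reduction
```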
TABLE I: Test performance of pancreas localization and bounding box prediction using regression forests in Dice and average Euclidean distance against the gold standard bounding boxes, in 4-fold cross validation.
TABLE II: Four-fold cross-validation: DSC [%] pancreas segmentation performance of various spatial aggregation functions on AX, CO, and SA viewed HNN-I probability maps in the candidate region generation stage (best results in bold).
TABLE III: Four-fold cross-validation: DSC [%] pancreas segmentation performance of various spatial aggregation functions on AX, CO, and SA viewed HNN-I probability maps in the second cascaded stage (the best results in bold).
2) HNN Spatial Aggregation for Pancreas Segmentation: The interior HNN models trained on the axial (AX), coronal (CO) or sagittal (SA) CT images in Sec. II-B3 can be straightforwardly used to generate pancreas segmentation masks. We exploit different spatial aggregation or pooling functions on the AX, CO, and SA viewed HNN-I probability maps, denoted as AX, CO, SA (any single-view HNN-I probability map used directly); mean(AX,CO), mean(AX,SA), mean(CO,SA) and mean(AX,CO,SA) (element-wise mean of two or three view HNN-I probability maps); max(AX,CO,SA) (element-wise maximum of the three view HNN-I probability maps); and finally meanmax(AX,CO,SA) (element-wise mean of the maximal two scores from the three view HNN-I probability maps). After optimal thresholding, calibrated using the training folds on these pooled HNN-I maps, the resulting binary segmentation masks are further refined by a 3D connected component process and simple morphological operations (as in Sec. II-B3). Table II demonstrates the DSC pancreas segmentation accuracy obtained by investigating different spatial aggregation functions. We observe that the element-wise multi-view (mean or max) pooling operations on HNN-I probability maps generally outperform their single-view counterparts. max(AX,CO,SA) performs slightly better than mean(AX,CO,SA). The meanmax(AX,CO,SA) configuration produces the best mean DSC; it may behave as a robust fusion function by rejecting the smallest probability value and averaging the remaining two HNN-I scores per pixel location. After the pancreas localization stage, we train a new set of multi-view HNN-Is at the spatially truncated scales and extents. This serves as a desirable "Zoom Better to See Clearer" effect for deep neural network segmentation models [46], where the cascaded HNN-Is focus only on discriminating or parsing the remaining organ candidate regions. Similarly, DSC [%] pancreas segmentation accuracy results of various spatial aggregation or pooling functions on AX, CO, and SA viewed HNN-I probability maps (trained in the second cascaded stage) are shown in Table III. We find consistent empirical observations when comparing multi-view HNN pooling operations. The meanmax(AX,CO,SA) operation again reports the best mean DSC performance, at 81.14%, which is increased considerably from 76.79% in Table II. We denote this system configuration as HNNmeanmax. This result validates our two-staged pancreas segmentation framework of candidate region generation for organ localization followed by "zoomed" deep HNN models to refine segmentation. Table IV shows the improvement from the meanmax-pooled HNN-Is (i.e., HNNmeanmax) to the HNN-RF based spatial aggregation, using DSC and average minimum surface-to-surface distance (AVGDIST). The average DSC is increased from 81.14% to 81.27%; however, this improvement is not statistically significant (p > 0.05, Wilcoxon signed rank test). In contrast, using dense CRF (DCRF) optimization (with HNN-I as the unary term and the pairwise term depending on the CT values) as a means of introducing spatial consistency does not improve upon HNN-I noticeably. Compared to the performance of other state-of-the-art methods, at mean DSC scores of 71.4% and 78.01% respectively, both variants HNNmeanmax and HNN-RF demonstrate superior quantitative segmentation accuracy in the DSC and AVGDIST metrics. We have the following two observations.
1) The main performance gain (similar to HNNAX in Table III) comes from the multi-view aggregated HNN pancreas segmentation probability maps (e.g., HNNmeanmax), which also serve as inputs to HNN-RF.
2) The new candidate region bounding box generation method (Sec. II-B3) works comparably to the hybrid technique (Sec. II-B2) in our empirical evaluation. However, the proposed pancreas localization via multi-view max-pooled HNNs greatly simplifies our overall pancreas segmentation system, which may also improve its generality and reproducibility. The HNNmeanmax variant produces competitive segmentation accuracy but merely involves evaluating two sets of multi-view HNN-Is at two spatial scales: whole CT slices and truncated bounding boxes. There is no need to compute any handcrafted image features or to train other external machine learning classifiers. As shown in Fig. 7, the conventional organ localization framework using regression forests does not serve well the purpose of candidate region generation for segmentation, where extremely high pixel-to-pixel recall is required, since it is mainly designed for organ detection. In Table V, the quantitative pancreas segmentation performance of the two method variants, HNNmeanmax and HNN-RF spatial aggregation, is evaluated using four metrics: DSC (%), Jaccard index (%), Hausdorff distance (HDRFDST [mm]), and AVGDIST [mm]. Note that there is no statistical significance between the two variants in three measures (DSC, Jaccard, and AVGDIST), but there is for HDRFDST, with p < 0.001 under the Wilcoxon signed rank test. Since the Hausdorff distance represents the maximum deviation between two point sets or surfaces, this observation indicates that HNN-RF may be more robust than HNNmeanmax in the worst-case scenario.
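The meanmax aggregation singled out in both observations is a one-line operation over the stacked per-view probabilities; a minimal sketch:

```python
import numpy as np

def meanmax(prob_ax, prob_co, prob_sa):
    """Element-wise mean of the two largest of the three per-voxel scores."""
    stacked = np.stack([prob_ax, prob_co, prob_sa], axis=0)
    # Sorting ascending along the view axis and dropping the minimum keeps
    # the two largest scores per voxel; averaging them rejects the outlier.
    return np.sort(stacked, axis=0)[1:].mean(axis=0)
```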
Pancreas segmentation results on illustrative patient cases are shown in Fig. 9. Furthermore, we applied our trained HNN-I model to a different CT dataset of 30 patients and achieved a mean DSC of 62.26% without any re-training on the new data cases; if we instead average the outputs of our 4 HNN-I models from cross-validation, we achieve 65.66% DSC. This demonstrates that HNN-I may be highly generalizable in cross-dataset evaluation. Performance on that dataset will likely improve with further fine-tuning. Last, we collected an additional dataset of 19 unseen CT scans using the same patient data protocol. Here, HNNmeanmax achieves a mean DSC of 81.2%.
IV. Discussion and Conclusion
To the best of our knowledge, our result comprises the highest reported average DSC in testing folds under the 4-fold CV evaluation metric. Strict comparison to most other methods is not directly possible due to the different datasets utilized. Our holistic segmentation approach with multi-view pooling and spatial aggregation advances the current state-of-the-art quantitative performance to an average DSC of 81.27% in testing. Previous notable results for CT images range from ~68% to ~78%, all under the "leave-one-patient-out" (LOO) cross-validation scheme. In particular, DSC drops from 68% (150 patients) to 58% (50 patients). Our methods also perform with better statistical stability, i.e., standard deviations of 7.30% or 6.27% in DSC versus 18.6% and 15.3%. The minimal DSC values are 44.69% with HNNmeanmax and 50.69% with HNN-RF, whereas others report patient cases with DSC <10%. Recent work that explores the direct application of 3D convolutional filters with fully convolutional architectures also shows promise. 2D or 3D implementations may each be more suited for certain tasks. Deep network representations with direct 3D input suffer from the curse-of-dimensionality and are more prone to overfitting. Volumetric object detection might require more training data and might suffer from scalability issues. However, proper hyper-parameter tuning of the CNN architecture and enough training data (including data augmentation) might help eliminate these problems. In the meantime, spatial aggregation in multiple 2D views (as disclosed herein) can be a very efficient (and computationally less expensive) way of diminishing the curse-of-dimensionality. Furthermore, using 2D views has the advantage that networks trained on much larger databases of natural images (e.g., ImageNet, BSDS500) can be used for fine-tuning to the medical domain. Transfer learning is a viable approach when the medical imaging data set size is limited. 3D CNN approaches can adopt padded spatially-local sliding volumes to parse any CT scan, e.g., 96x96x48, 160x160x72 or 80x80x80, which may cause segmentation discontinuities or inconsistencies at overlapped window boundaries. An ensemble of several neural networks trained with random configuration variations has been found advantageous compared to a single CNN model in object recognition. Our pancreas segmentation method can indeed be considered an ensemble of multiple correlated HNN models with good complementary information gain, since they are trained from orthogonal axial, coronal and sagittal CT views.
In conclusion, we present a holistic deep CNN approach for pancreas localization and segmentation in abdominal CT scans, exploiting multi-view spatial pooling and combining interior and boundary mid-level cues. The robust fusion of HNNmeanmax, aggregating interior holistically-nested networks (HNN-I) alone, already achieves good performance at a DSC of 81.14%±7.30% in 4-fold CV. The other method variant, HNN-RF, incorporates the organ boundary responses from the HNN-B model and significantly improves the worst-case pancreas segmentation accuracy in Hausdorff distance (p<0.001). The highest reported DSC of 81.27%±6.27% is achieved, at a computational cost of 2-3 minutes, not hours. Our deep learning based organ segmentation approach could be generalizable to other segmentation problems with large variations and pathologies, e.g., pathological organs and tumors.
TABLE IV: Four-fold cross-validation: The DSC [%] and average surface-to-surface minimum distance (AVGDIST [mm]) performance of HNNmeanmax, HNN-RF spatial aggregation, and the optimally achievable superpixel assignments (italic). Best performing method in bold.
TABLE V: Four-fold cross-validation: The quantitative pancreas segmentation performance of our two method variants, HNNmeanmax and HNN-RF spatial aggregation, in four metrics: DSC (%), Jaccard index (%), Hausdorff distance (HDRFDST [mm]), and AVGDIST [mm]. Best performing methods are shown in bold. Note that there is no statistical significance between the two variants in three measures (DSC, Jaccard, and AVGDIST), except for HDRFDST with p < 0.001 (Wilcoxon signed rank test). This indicates that HNN-RF may be more robust than HNNmeanmax in the worst-case scenario.

Exemplary benefits and advantages of the disclosed technology can include:
1) A unified deep convolutional neural network framework for fully-automated localization and segmentation of highly-deformable or variable organs given an input CT volume.
2) The use of simple, effective, and novel multi-view 2D holistically-nested neural networks to aggregate a reliable 3D organ segmentation confidence map.
3) Aggregation of random forest and image region segmentation information, and integration of their decisions.
4) Significantly better quantitative performance for the pancreas (one of the most difficult organs to segment) from CT imaging, compared to all known state-of-the-art approaches, with strong clinical indications and impacts.
V. Applications for Lymph Nodes and Other Organs
Lymph node segmentation is also an important challenge in medical image analysis. The presence of enlarged lymph nodes (LNs) signals the onset or progression of a malignant disease or infection. In the thoracoabdominal (TA) body region, neighboring enlarged LNs often spatially collapse into swollen lymph node clusters (LNCs) (up to 9 LNs in our data sets). Accurate segmentation of TA LNCs is complicated by the noticeably poor intensity and texture contrast among neighboring LNs and surrounding tissues, and has not been addressed adequately before.
LN segmentation and volume measurement play a crucial role in important medical imaging based diagnosis tasks, such as quantitatively evaluating disease progression and the effectiveness of a given treatment or therapy. Enlarged LNs of greater than 10 mm on a CT slice signal the onset or progression of a malignant disease or an infection. Often performed manually, LN segmentation is highly complex, tedious and time consuming. For example, weak intensity contrast renders the boundaries of distinct agglomerated LNs ambiguous, as shown in FIG. 10.
The methods disclosed herein provide a fully-automated approach for TA LNC segmentation. More importantly, the segmentation task is formulated as a flexible, bottom-up image binary classification problem that can be effectively solved using deep CNNs and graph-based structured optimization and inference. The disclosed methods can handle all variations in LNC size and spatial configuration. Furthermore, they are well-suited for measuring agglomerated LNs, whose ambiguous boundaries compromise the accuracy of diameter measurement.
The segmentation framework for lymph nodes can be similar to what is described elsewhere herein for the pancreas and other organs. More information regarding applications to lymph node segmentation can be found in U.S. Provisional Patent Application No. 62/345,606, filed June 3, 2016, which is incorporated by reference herein.
FIG. 11 illustrates the disclosed frameworks integrating trained holistically-nested neural networks that capture the interior appearance and boundary cues of the organ to be segmented, via structured optimization (a) or superpixel-based spatial aggregation (b). In (a), three different grid-structured representation and optimization methods are used and evaluated, namely dense CRF, graph cuts (GC), and boundary neural fields (BNF). For dense CRF, the pairwise energy terms are not learned but directly computed from the CT intensity contrast and pixel distance measurements. In both GC and BNF, the learned outputs of HNN-B are incorporated into the pairwise interactions between pixels within a large spatial neighborhood, e.g., 20x20. In (b), the spatial aggregation is performed on the enforced boundary-preserving superpixels computed using multiscale boundary maps from HNN-B.
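As a hedged illustration of how such structured optimization is typically posed (the specific potentials below are conventional choices and are not quoted from this disclosure), the binary label map $\mathbf{y}$ can be obtained by minimizing an energy of the form

$$E(\mathbf{y}) = \sum_{i} \theta_i(y_i) + \lambda \sum_{(i,j) \in \mathcal{N}} \theta_{ij}(y_i, y_j),$$

where the unary term can be taken from the HNN interior/appearance probability, e.g., $\theta_i(y_i) = -\log \Pr(y_i \mid X)$, and where GC and BNF can discourage label changes except across strong learned boundaries, e.g., $\theta_{ij}(y_i, y_j) = \exp(-\gamma\, b_{ij})\, \mathbb{1}[y_i \neq y_j]$, with $b_{ij}$ the HNN-B boundary response between pixels $i$ and $j$ in the (e.g., 20x20) neighborhood $\mathcal{N}$.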
FIG. 12 illustrates, at left, a cropped CT image with a lymph node located in the center, together with its associated HNN-A map (middle) and its HNN-B map before non-maximum suppression (right).
FIG. 13 shows examples of LN CT image segmentation. Top: original CT images with ground truth (red) and BNF-segmented (green) boundaries. Bottom: HNN-A LN probability maps. The left and middle examples depict successful segmentation results; the right example represents an unsuccessful case.
FIG. 14 shows three graphs comparing ground truth and predicted LN volumes, for (a) HNN-A, (b) boundary neural fields, and (c) graph cuts. FIG. 15 shows examples of lymph node thoracoabdominal CT image segmentation. Top: original CT images with ground truth (red) and BNF-segmented (green) boundaries. Bottom: HNN-A lymph node probability maps (red: probability 1; blue: probability 0).

A computer or other processing system comprising a processor and memory, such as a personal computer, a workstation, a mobile computing device, or a networked computer, can be used to perform the methods disclosed herein, including any combination of CT or MR imaging acquisition, image processing, imaging data analysis, data storage, and output/display of results (e.g., segmentation maps). The computer or processing system may include a hard disk, a removable storage medium such as a floppy disk or CD-ROM, and/or other memory such as random access memory (RAM). Computer-executable instructions for causing a computing system to execute the disclosed methods can be provided on any form of tangible and/or non-transitory data storage media, and/or delivered to the computing system via a local area network, the Internet, or other network. Any associated computing process or method step can be performed with distributed processing. For example, extracting information from the imaging data and producing segmentation maps can be performed at different locations and/or using different computing systems.
For purposes of this description, certain aspects, advantages, and novel features of the embodiments of this disclosure are described herein. The disclosed methods, apparatuses, and systems should not be construed as limiting in any way. Instead, the present disclosure is directed toward all novel and nonobvious features and aspects of the various disclosed embodiments, alone and in various combinations and sub-combinations with one another. The methods, apparatuses, and systems are not limited to any specific aspect or feature or combination thereof, nor do the disclosed embodiments require that any one or more specific advantages be present or problems be solved.
Characteristics and features described in conjunction with a particular aspect, embodiment, or example of the disclosed technology are to be understood to be applicable to any other aspect, embodiment or example described herein unless incompatible therewith. All of the features disclosed in this specification (including any accompanying claims, abstract and drawings), and/or all of the steps of any method or process so disclosed, may be combined in any combination, except combinations where at least some of such features and/or steps are mutually exclusive. The invention is not restricted to the details of any embodiments disclosed in this application. The invention extends to any novel one, or any novel combination, of the features disclosed in this application, or to any novel one, or any novel combination, of the steps of any method or process so disclosed.
Although the operations of some of the disclosed methods are described in a particular, sequential order for convenient presentation, it should be understood that this manner of description encompasses rearrangement, unless a particular ordering is required by specific language. For example, operations described sequentially may in some cases be rearranged or performed concurrently. Moreover, for the sake of simplicity, the figures of this application may not show the various ways in which the disclosed methods can be used in conjunction with other methods.
In view of the many possible embodiments to which the principles of the disclosed technology may be applied, it should be recognized that the illustrated embodiments are only examples and should not be taken as limiting the scope of the disclosure. Rather, the scope of the disclosure is at least as broad as the following claims. We therefore claim all that comes within the scope of the following claims.

CLAIMS:
1. A method for localization and segmentation of organs based on data from 3D medical imaging, the method comprising:
receiving 3D imaging data for a patient including a target organ, the 3D imaging data including three orthogonal axial, sagittal, and coronal views;
localizing the target organ from 3D imaging data;
applying holistically-nested convolutional neural networks ("HNNs") on the three orthogonal axial, sagittal, and coronal views to produce per-pixel probability maps;
fusing the probability maps using pooling to produce a 3D bounding box of the target organ;
integrating semantic mid-level cues of deeply-learned organ interior and boundary maps, obtained by two additional and separate realizations of HNNs; and
based on the integration of the mid-level cues, generating boundary-preserving pixel-wise class label maps that result in final segmentations of the target organ.
2. The method of claim 1, wherein the target organ is a pancreas.
3. The method of claim 1, wherein the target organ is a lymph node.
4. The method of claim 1, wherein the target organ comprises a cluster of lymph nodes.
5. The method of any one of claims 1-4, wherein the 3D imaging data is derived from one or more computerized tomography scans.
6. The method of any one of claims 1-5, wherein the method comprises utilizing multiscale combinatorial grouping (MCG) to generate target organ- specific superpixels based on learned boundary maps.
7. The method of any one of claims 1-6, wherein superpixels are extracted via continuous oriented watershed transform at three different scales supervisedly learned by HNN boundary.
8. The method of claim 7, further comprising computation of a hierarchy of superpixel partitions at each scale, and merger of superpixels across scales.
9. The method of claim 8, further comprising grouping merged superpixels into superpixel sets and using the superpixel sets for a subsequent spatial aggregation via random forest classification.
10. The method of claim 9, further comprising using superpixel sets to generate features describing each superpixel and using said features to train a random forest classifier on positive or negative training superpixels.
11. The method of claim 10, further comprising obtaining a final 3D organ segmentation by stacking slice predictions back into an original CT volume space.
12. A computing system comprising a processor and memory, the system operable to implement the method of any one of claims 1-11.
13. A system comprising:
a 3D imaging system operable to receive 3D imaging data for a patient including a target organ, the 3D imaging data including three orthogonal axial, sagittal, and coronal views; and
a computing system comprising a processor, memory, and software, the computing system operable to:
localize the target organ from 3D imaging data;
apply holistically-nested convolutional neural networks ("HNNs") on the three orthogonal axial, sagittal, and coronal views to produce per-pixel probability maps;
fuse the probability maps using pooling to produce a 3D bounding box of the target organ;
integrate semantic mid-level cues of deeply-learned organ interior and boundary maps, obtained by two additional and separate realizations of HNNs; and
based on the integration of the mid-level cues, generate boundary-preserving pixel-wise class label maps that result in final segmentations of the target organ.
14. The system of claim 13, wherein the target organ is a pancreas or a lymph node.
15. The system of claim 13 or claim 14, wherein the 3D imaging system comprises a computerized tomography system and the 3D imaging data is derived from one or more computerized tomography scans.
16. The system of any one of claims 13-15, wherein the computing system is operable to utilize multiscale combinatorial grouping to generate target organ-specific superpixels based on learned boundary maps.
17. The system of any one of claims 13-16, wherein the computing system is operable to extract superpixels via continuous oriented watershed transform at three different scales supervisedly learned by HNN boundary.
18. The system of claim 17, wherein the computing system is operable to compute a hierarchy of superpixel partitions at each scale, and merge superpixels across scales.
19. The system of claim 18, wherein the computing system is operable to group merged superpixels into superpixel sets and using the superpixel sets for a subsequent spatial aggregation via random forest classification.
20. The system of claim 19, wherein the computing system is operable to use superpixel sets to generate features describing each superpixel using said features to train a random forest classifier on training positive or negative superpixels.
21. The system of claim 20, wherein the computing system is operable to obtain a final 3D organ segmentation by stacking slice predictions back into an original CT volume space.
22. One or more non-transitory computer readable media storing computer-executable instructions, which when executed by a computer cause the computer to perform the method of any one of claims 1-11.
PCT/US2017/035974 2016-06-03 2017-06-05 Spatial aggregation of holistically-nested convolutional neural networks for automated organ localization and segmentation in 3d medical scans Ceased WO2017210690A1 (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US201662345606P 2016-06-03 2016-06-03
US62/345,606 2016-06-03
US201762450681P 2017-01-26 2017-01-26
US62/450,681 2017-01-26

Publications (1)

Publication Number Publication Date
WO2017210690A1 true WO2017210690A1 (en) 2017-12-07

Family

ID=60478011

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2017/035974 Ceased WO2017210690A1 (en) 2016-06-03 2017-06-05 Spatial aggregation of holistically-nested convolutional neural networks for automated organ localization and segmentation in 3d medical scans

Country Status (1)

Country Link
WO (1) WO2017210690A1 (en)

Cited By (65)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108399422A (en) * 2018-02-01 2018-08-14 华南理工大学 A kind of image channel fusion method based on WGAN models
CN108549871A (en) * 2018-04-17 2018-09-18 北京华捷艾米科技有限公司 A kind of hand Segmentation method based on region growing and machine learning
CN108805036A (en) * 2018-05-22 2018-11-13 电子科技大学 A kind of new non-supervisory video semanteme extracting method
CN109416743A (en) * 2018-01-15 2019-03-01 深圳鲲云信息科技有限公司 A kind of Three dimensional convolution device artificially acted for identification
CN109461161A (en) * 2018-10-22 2019-03-12 北京连心医疗科技有限公司 A method of human organ in medical image is split based on neural network
CN109559295A (en) * 2018-06-04 2019-04-02 新影智能科技(昆山)有限公司 Image analysis system, method, computer readable storage medium and electric terminal
CN109598727A (en) * 2018-11-28 2019-04-09 北京工业大学 A kind of CT image pulmonary parenchyma three-dimensional semantic segmentation method based on deep neural network
CN109636808A (en) * 2018-11-27 2019-04-16 杭州健培科技有限公司 A kind of lobe of the lung dividing method based on full convolutional neural networks
CN109741347A (en) * 2018-12-30 2019-05-10 北京工业大学 An Image Segmentation Method Based on Iterative Learning of Convolutional Neural Networks
CN109948707A (en) * 2019-03-20 2019-06-28 腾讯科技(深圳)有限公司 Model training method, device, terminal and storage medium
CN109961059A (en) * 2019-04-10 2019-07-02 杭州智团信息技术有限公司 Detect the method and system in kidney tissue of interest region
CN110009599A (en) * 2019-02-01 2019-07-12 腾讯科技(深圳)有限公司 Liver masses detection method, device, equipment and storage medium
WO2019136922A1 (en) * 2018-01-12 2019-07-18 平安科技(深圳)有限公司 Pulmonary nodule detection method, application server, and computer-readable storage medium
CN110096961A (en) * 2019-04-04 2019-08-06 北京工业大学 A kind of indoor scene semanteme marking method of super-pixel rank
CN110246566A (en) * 2019-04-24 2019-09-17 中南大学湘雅二医院 Method, system and storage medium are determined based on the conduct disorder of convolutional neural networks
CN110287777A (en) * 2019-05-16 2019-09-27 西北大学 A Body Segmentation Algorithm for Golden Monkey in Natural Scenes
WO2019200753A1 (en) * 2018-04-17 2019-10-24 平安科技(深圳)有限公司 Lesion detection method, device, computer apparatus and storage medium
US10475182B1 (en) 2018-11-14 2019-11-12 Qure.Ai Technologies Private Limited Application of deep learning for medical imaging evaluation
CN110599498A (en) * 2018-10-19 2019-12-20 北京连心医疗科技有限公司 Method for segmenting human body organ in medical image based on neural network
CN110638477A (en) * 2018-06-26 2020-01-03 佳能医疗系统株式会社 Medical image diagnosis device and alignment method
CN110766691A (en) * 2019-12-06 2020-02-07 北京安德医智科技有限公司 Method and device for cardiac magnetic resonance image analysis and cardiomyopathy prediction
WO2020038974A1 (en) * 2018-08-21 2020-02-27 Koninklijke Philips N.V. Salient visual explanations of feature assessments by machine learning models
CN110889853A (en) * 2018-09-07 2020-03-17 天津大学 A Residual-Attention Deep Neural Network Based Tumor Segmentation Method
CN110889852A (en) * 2018-09-07 2020-03-17 天津大学 Liver segmentation method based on residual-attention deep neural network
WO2020099940A1 (en) * 2018-11-14 2020-05-22 Qure.Ai Technologies Private Limited Application of deep learning for medical imaging evaluation
TWI697010B (en) * 2018-12-28 2020-06-21 國立成功大學 Method of obtaining medical sagittal image, method of training neural network and computing device
WO2020156303A1 (en) * 2019-01-30 2020-08-06 广州市百果园信息技术有限公司 Method and apparatus for training semantic segmentation network, image processing method and apparatus based on semantic segmentation network, and device and storage medium
CN111667459A (en) * 2020-04-30 2020-09-15 杭州深睿博联科技有限公司 Medical sign detection method, system, terminal and storage medium based on 3D variable convolution and time sequence feature fusion
CN111899273A (en) * 2020-06-10 2020-11-06 上海联影智能医疗科技有限公司 Image segmentation method, computer device and storage medium
CN112258499A (en) * 2020-11-10 2021-01-22 北京深睿博联科技有限责任公司 Lymph node partition method, apparatus, device and computer readable storage medium
CN112270644A (en) * 2020-10-20 2021-01-26 西安工程大学 Face super-resolution method based on spatial feature transformation and cross-scale feature integration
US10963757B2 (en) 2018-12-14 2021-03-30 Industrial Technology Research Institute Neural network model fusion method and electronic device using the same
CN112598634A (en) * 2020-12-18 2021-04-02 燕山大学 CT image organ positioning method based on 3D CNN and iterative search
CN112634211A (en) * 2020-12-14 2021-04-09 上海健康医学院 MRI (magnetic resonance imaging) image segmentation method, device and equipment based on multiple neural networks
CN112700451A (en) * 2019-10-23 2021-04-23 通用电气精准医疗有限责任公司 Method, system and computer readable medium for automatic segmentation of 3D medical images
CN112800915A (en) * 2021-01-20 2021-05-14 北京百度网讯科技有限公司 Building change detection method, building change detection device, electronic device, and storage medium
WO2021093435A1 (en) * 2019-11-12 2021-05-20 腾讯科技(深圳)有限公司 Semantic segmentation network structure generation method and apparatus, device, and storage medium
CN113012178A (en) * 2021-05-07 2021-06-22 西安智诊智能科技有限公司 Kidney tumor image segmentation method
CN113160229A (en) * 2021-03-15 2021-07-23 西北大学 Pancreas segmentation method and device based on hierarchical supervision cascade pyramid network
US11127137B2 (en) * 2017-04-12 2021-09-21 Kheiron Medical Technologies Ltd Malignancy assessment for tumors
EP3889888A1 (en) * 2019-10-23 2021-10-06 GE Precision Healthcare LLC Method, system and computer readable medium for automatic segmentation of a 3d medical image
WO2021198117A1 (en) * 2020-03-30 2021-10-07 Varian Medical Systems International Ag Automatically-planned radiation-based treatment
CN113506310A (en) * 2021-07-16 2021-10-15 首都医科大学附属北京天坛医院 Medical image processing method and device, electronic equipment and storage medium
CN113610739A (en) * 2021-08-10 2021-11-05 平安国际智慧城市科技股份有限公司 Image data enhancement method, device, equipment and storage medium
US20210365717A1 (en) * 2019-04-22 2021-11-25 Tencent Technology (Shenzhen) Company Limited Method and apparatus for segmenting a medical image, and storage medium
CN113706469A (en) * 2021-07-29 2021-11-26 天津中科智能识别产业技术研究院有限公司 Iris automatic segmentation method and system based on multi-model voting mechanism
US11308623B2 (en) 2019-07-09 2022-04-19 The Johns Hopkins University System and method for multi-scale coarse-to-fine segmentation of images to detect pancreatic ductal adenocarcinoma
CN114387282A (en) * 2021-12-08 2022-04-22 罗雄彪 Accurate automatic segmentation method and system for medical image organs
US11406844B2 (en) 2020-03-30 2022-08-09 Varian Medical Systems International Ag Method and apparatus to derive and utilize virtual volumetric structures for predicting potential collisions when administering therapeutic radiation
US11416772B2 (en) 2019-12-02 2022-08-16 International Business Machines Corporation Integrated bottom-up segmentation for semi-supervised image segmentation
US11430176B2 (en) 2020-05-20 2022-08-30 International Business Machines Corporation Generating volume predictions of three-dimensional volumes using slice features
US11461998B2 (en) 2019-09-25 2022-10-04 Samsung Electronics Co., Ltd. System and method for boundary aware semantic segmentation
US11478210B2 (en) 2020-03-30 2022-10-25 Varian Medical Systems International Ag Automatically-registered patient fixation device images
US11488306B2 (en) 2018-06-14 2022-11-01 Kheiron Medical Technologies Ltd Immediate workup
US20220414453A1 (en) * 2021-06-28 2022-12-29 X Development Llc Data augmentation using brain emulation neural networks
US11593943B2 (en) 2017-04-11 2023-02-28 Kheiron Medical Technologies Ltd RECIST assessment of tumour progression
CN116188474A (en) * 2023-05-05 2023-05-30 四川省肿瘤医院 Three-level Lymphatic Structure Recognition Method and System Based on Image Semantic Segmentation
CN116645336A (en) * 2023-05-10 2023-08-25 烟台大学 MRI brain image gland pituitary segmentation method
CN117078760A (en) * 2023-09-18 2023-11-17 北方民族大学 Valve body center positioning method based on image processing
CN118397283A (en) * 2024-07-01 2024-07-26 山东大学 Gastric atrophy area segmentation system, electronic equipment and readable storage medium
CN118537351A (en) * 2024-05-22 2024-08-23 山东大学 Steel structure corrosion identification method and system based on active migration learning
US12141694B2 (en) 2019-04-23 2024-11-12 The Johns Hopkins University Abdominal multi-organ segmentation with organ-attention networks
WO2025034588A1 (en) * 2023-08-10 2025-02-13 Intuitive Surgical Operations, Inc. Computer assisted stomach volume reduction procedures
US12327192B2 (en) 2019-11-11 2025-06-10 The Johns Hopkins University Early detection of pancreatic neoplasms using cascaded machine learning models
CN120411060A (en) * 2025-05-06 2025-08-01 徐州绪权印刷有限公司 A printing defect detection method and system based on computer vision

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120207359A1 (en) * 2011-02-11 2012-08-16 Microsoft Corporation Image Registration

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120207359A1 (en) * 2011-02-11 2012-08-16 Microsoft Corporation Image Registration

Cited By (99)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11593943B2 (en) 2017-04-11 2023-02-28 Kheiron Medical Technologies Ltd RECIST assessment of tumour progression
US11423541B2 (en) 2017-04-12 2022-08-23 Kheiron Medical Technologies Ltd Assessment of density in mammography
US11423540B2 (en) 2017-04-12 2022-08-23 Kheiron Medical Technologies Ltd Segmentation of anatomical regions and lesions
US11127137B2 (en) * 2017-04-12 2021-09-21 Kheiron Medical Technologies Ltd Malignancy assessment for tumors
WO2019136922A1 (en) * 2018-01-12 2019-07-18 平安科技(深圳)有限公司 Pulmonary nodule detection method, application server, and computer-readable storage medium
CN109416743A (en) * 2018-01-15 2019-03-01 深圳鲲云信息科技有限公司 A kind of Three dimensional convolution device artificially acted for identification
CN109416743B (en) * 2018-01-15 2022-05-24 深圳鲲云信息科技有限公司 Three-dimensional convolution device for identifying human actions
CN108399422A (en) * 2018-02-01 2018-08-14 华南理工大学 A kind of image channel fusion method based on WGAN models
CN108549871A (en) * 2018-04-17 2018-09-18 北京华捷艾米科技有限公司 A kind of hand Segmentation method based on region growing and machine learning
WO2019200753A1 (en) * 2018-04-17 2019-10-24 平安科技(深圳)有限公司 Lesion detection method, device, computer apparatus and storage medium
CN108805036A (en) * 2018-05-22 2018-11-13 电子科技大学 A kind of new non-supervisory video semanteme extracting method
CN109559295A (en) * 2018-06-04 2019-04-02 新影智能科技(昆山)有限公司 Image analysis system, method, computer readable storage medium and electric terminal
US11488306B2 (en) 2018-06-14 2022-11-01 Kheiron Medical Technologies Ltd Immediate workup
CN110638477A (en) * 2018-06-26 2020-01-03 佳能医疗系统株式会社 Medical image diagnosis device and alignment method
CN110638477B (en) * 2018-06-26 2023-08-11 佳能医疗系统株式会社 Medical image diagnostic device and alignment method
WO2020038974A1 (en) * 2018-08-21 2020-02-27 Koninklijke Philips N.V. Salient visual explanations of feature assessments by machine learning models
US12062429B2 (en) 2018-08-21 2024-08-13 Koninklijke Philips N.V. Salient visual explanations of feature assessments by machine learning models
CN110889852B (en) * 2018-09-07 2022-05-06 天津大学 Liver segmentation method based on residual error-attention deep neural network
CN110889853A (en) * 2018-09-07 2020-03-17 天津大学 A Residual-Attention Deep Neural Network Based Tumor Segmentation Method
CN110889852A (en) * 2018-09-07 2020-03-17 天津大学 Liver segmentation method based on residual-attention deep neural network
CN110599498A (en) * 2018-10-19 2019-12-20 北京连心医疗科技有限公司 Method for segmenting human body organ in medical image based on neural network
CN110599498B (en) * 2018-10-19 2023-05-05 北京连心医疗科技有限公司 Method for dividing human body organ in medical image based on neural network
CN109461161A (en) * 2018-10-22 2019-03-12 北京连心医疗科技有限公司 A method of human organ in medical image is split based on neural network
JP7222882B2 (en) 2018-11-14 2023-02-15 キュア.エーアイ テクノロジーズ プライベート リミテッド Application of deep learning for medical image evaluation
WO2020099940A1 (en) * 2018-11-14 2020-05-22 Qure.Ai Technologies Private Limited Application of deep learning for medical imaging evaluation
US10504227B1 (en) 2018-11-14 2019-12-10 Qure.Ai Technologies Private Limited Application of deep learning for medical imaging evaluation
JP2021509977A (en) * 2018-11-14 2021-04-08 キュア.エーアイ テクノロジーズ プライベート リミテッド Application of deep learning for medical image evaluation
US10475182B1 (en) 2018-11-14 2019-11-12 Qure.Ai Technologies Private Limited Application of deep learning for medical imaging evaluation
CN109636808B (en) * 2018-11-27 2022-08-12 杭州健培科技有限公司 Lung lobe segmentation method based on full convolution neural network
CN109636808A (en) * 2018-11-27 2019-04-16 杭州健培科技有限公司 A kind of lobe of the lung dividing method based on full convolutional neural networks
CN109598727A (en) * 2018-11-28 2019-04-09 北京工业大学 A kind of CT image pulmonary parenchyma three-dimensional semantic segmentation method based on deep neural network
CN109598727B (en) * 2018-11-28 2021-09-14 北京工业大学 CT image lung parenchyma three-dimensional semantic segmentation method based on deep neural network
US10963757B2 (en) 2018-12-14 2021-03-30 Industrial Technology Research Institute Neural network model fusion method and electronic device using the same
TWI697010B (en) * 2018-12-28 2020-06-21 國立成功大學 Method of obtaining medical sagittal image, method of training neural network and computing device
CN109741347A (en) * 2018-12-30 2019-05-10 北京工业大学 An Image Segmentation Method Based on Iterative Learning of Convolutional Neural Networks
CN109741347B (en) * 2018-12-30 2021-03-16 北京工业大学 An Image Segmentation Method Based on Iterative Learning of Convolutional Neural Networks
WO2020156303A1 (en) * 2019-01-30 2020-08-06 广州市百果园信息技术有限公司 Method and apparatus for training semantic segmentation network, image processing method and apparatus based on semantic segmentation network, and device and storage medium
CN111507343B (en) * 2019-01-30 2021-05-18 广州市百果园信息技术有限公司 Training of semantic segmentation network and image processing method and device thereof
CN111507343A (en) * 2019-01-30 2020-08-07 广州市百果园信息技术有限公司 Training of semantic segmentation network and image processing method and device thereof
CN110009599A (en) * 2019-02-01 2019-07-12 腾讯科技(深圳)有限公司 Liver masses detection method, device, equipment and storage medium
CN109948707B (en) * 2019-03-20 2023-04-18 腾讯科技(深圳)有限公司 Model training method, device, terminal and storage medium
CN109948707A (en) * 2019-03-20 2019-06-28 腾讯科技(深圳)有限公司 Model training method, device, terminal and storage medium
CN110096961A (en) * 2019-04-04 2019-08-06 北京工业大学 A kind of indoor scene semanteme marking method of super-pixel rank
CN110096961B (en) * 2019-04-04 2021-03-02 北京工业大学 A Superpixel-Level Semantic Annotation Method for Indoor Scenes
CN109961059A (en) * 2019-04-10 2019-07-02 杭州智团信息技术有限公司 Detect the method and system in kidney tissue of interest region
US20210365717A1 (en) * 2019-04-22 2021-11-25 Tencent Technology (Shenzhen) Company Limited Method and apparatus for segmenting a medical image, and storage medium
US11887311B2 (en) * 2019-04-22 2024-01-30 Tencent Technology (Shenzhen) Company Limited Method and apparatus for segmenting a medical image, and storage medium
US12141694B2 (en) 2019-04-23 2024-11-12 The Johns Hopkins University Abdominal multi-organ segmentation with organ-attention networks
CN110246566A (en) * 2019-04-24 2019-09-17 中南大学湘雅二医院 Method, system and storage medium are determined based on the conduct disorder of convolutional neural networks
CN110287777A (en) * 2019-05-16 2019-09-27 西北大学 A Body Segmentation Algorithm for Golden Monkey in Natural Scenes
US12125211B2 (en) 2019-07-09 2024-10-22 The Johns Hopkins University System and method for multi-scale coarse-to-fine segmentation of images to detect pancreatic ductal adenocarcinoma
US11308623B2 (en) 2019-07-09 2022-04-19 The Johns Hopkins University System and method for multi-scale coarse-to-fine segmentation of images to detect pancreatic ductal adenocarcinoma
US11461998B2 (en) 2019-09-25 2022-10-04 Samsung Electronics Co., Ltd. System and method for boundary aware semantic segmentation
EP3889888A1 (en) * 2019-10-23 2021-10-06 GE Precision Healthcare LLC Method, system and computer readable medium for automatic segmentation of a 3d medical image
CN112700451A (en) * 2019-10-23 2021-04-23 通用电气精准医疗有限责任公司 Method, system and computer readable medium for automatic segmentation of 3D medical images
EP3893200A1 (en) * 2019-10-23 2021-10-13 GE Precision Healthcare LLC Method, system and computer readable medium for automatic segmentation of a 3d medical image
US20210125707A1 (en) * 2019-10-23 2021-04-29 GE Precision Healthcare LLC Method, system and computer readable medium for automatic segmentation of a 3d medical image
US11581087B2 (en) 2019-10-23 2023-02-14 GE Precision Healthcare LLC Method, system and computer readable medium for automatic segmentation of a 3D medical image
US12327192B2 (en) 2019-11-11 2025-06-10 The Johns Hopkins University Early detection of pancreatic neoplasms using cascaded machine learning models
US12130887B2 (en) 2019-11-12 2024-10-29 Tencent Technology (Shenzhen) Company Limited Semantic segmentation network structure generation method and apparatus, device, and storage medium
WO2021093435A1 (en) * 2019-11-12 2021-05-20 腾讯科技(深圳)有限公司 Semantic segmentation network structure generation method and apparatus, device, and storage medium
US11416772B2 (en) 2019-12-02 2022-08-16 International Business Machines Corporation Integrated bottom-up segmentation for semi-supervised image segmentation
CN110766691A (en) * 2019-12-06 2020-02-07 北京安德医智科技有限公司 Method and device for cardiac magnetic resonance image analysis and cardiomyopathy prediction
US11406844B2 (en) 2020-03-30 2022-08-09 Varian Medical Systems International Ag Method and apparatus to derive and utilize virtual volumetric structures for predicting potential collisions when administering therapeutic radiation
WO2021198117A1 (en) * 2020-03-30 2021-10-07 Varian Medical Systems International Ag Automatically-planned radiation-based treatment
US11478660B2 (en) 2020-03-30 2022-10-25 Varian Medical Systems International Ag Automatically-planned radiation-based treatment
US11478210B2 (en) 2020-03-30 2022-10-25 Varian Medical Systems International Ag Automatically-registered patient fixation device images
US11786204B2 (en) 2023-10-17 Siemens Healthineers International AG Automatically-registered patient fixation device images
CN111667459A (en) * 2020-04-30 2020-09-15 杭州深睿博联科技有限公司 Medical sign detection method, system, terminal and storage medium based on 3D variable convolution and time sequence feature fusion
CN111667459B (en) * 2020-04-30 2023-08-29 杭州深睿博联科技有限公司 Medical sign detection method, system, terminal and storage medium based on 3D variable convolution and time sequence feature fusion
US11430176B2 (en) 2020-05-20 2022-08-30 International Business Machines Corporation Generating volume predictions of three-dimensional volumes using slice features
CN111899273A (en) * 2020-06-10 2020-11-06 上海联影智能医疗科技有限公司 Image segmentation method, computer device and storage medium
CN112270644B (en) * 2020-10-20 2024-05-28 饶金宝 Face super-resolution method based on spatial feature transformation and cross-scale feature integration
CN112270644A (en) * 2020-10-20 2021-01-26 西安工程大学 Face super-resolution method based on spatial feature transformation and cross-scale feature integration
CN112258499B (en) * 2020-11-10 2023-09-26 北京深睿博联科技有限责任公司 Lymph node partition method, apparatus, device, and computer-readable storage medium
CN112258499A (en) * 2020-11-10 2021-01-22 北京深睿博联科技有限责任公司 Lymph node partition method, apparatus, device and computer readable storage medium
WO2022127500A1 (en) * 2020-12-14 2022-06-23 上海健康医学院 Multiple neural networks-based mri image segmentation method and apparatus, and device
CN112634211A (en) * 2020-12-14 2021-04-09 上海健康医学院 MRI (magnetic resonance imaging) image segmentation method, device and equipment based on multiple neural networks
CN112598634A (en) * 2020-12-18 2021-04-02 燕山大学 CT image organ positioning method based on 3D CNN and iterative search
CN112598634B (en) * 2020-12-18 2022-11-25 燕山大学 CT image organ positioning method based on 3D CNN and iterative search
CN112800915B (en) * 2021-01-20 2023-06-27 北京百度网讯科技有限公司 Building change detection method, device, electronic equipment and storage medium
CN112800915A (en) * 2021-01-20 2021-05-14 北京百度网讯科技有限公司 Building change detection method, building change detection device, electronic device, and storage medium
CN113160229A (en) * 2021-03-15 2021-07-23 西北大学 Pancreas segmentation method and device based on hierarchical supervision cascade pyramid network
CN113012178A (en) * 2021-05-07 2021-06-22 西安智诊智能科技有限公司 Kidney tumor image segmentation method
US20220414453A1 (en) * 2021-06-28 2022-12-29 X Development Llc Data augmentation using brain emulation neural networks
CN113506310B (en) * 2021-07-16 2022-03-01 首都医科大学附属北京天坛医院 Medical image processing method, device, electronic device and storage medium
CN113506310A (en) * 2021-07-16 2021-10-15 首都医科大学附属北京天坛医院 Medical image processing method and device, electronic equipment and storage medium
CN113706469B (en) * 2021-07-29 2024-04-05 天津中科智能识别产业技术研究院有限公司 Automatic iris segmentation method and system based on a multi-model voting mechanism
CN113706469A (en) * 2021-07-29 2021-11-26 天津中科智能识别产业技术研究院有限公司 Automatic iris segmentation method and system based on a multi-model voting mechanism
CN113610739A (en) * 2021-08-10 2021-11-05 平安国际智慧城市科技股份有限公司 Image data augmentation method, device, equipment and storage medium
CN114387282A (en) * 2021-12-08 2022-04-22 罗雄彪 Accurate automatic segmentation method and system for medical image organs
CN116188474A (en) * 2023-05-05 2023-05-30 四川省肿瘤医院 Tertiary Lymphoid Structure Recognition Method and System Based on Image Semantic Segmentation
CN116645336B (en) * 2023-05-10 2024-05-07 烟台大学 MRI brain image pituitary gland segmentation method
CN116645336A (en) * 2023-05-10 2023-08-25 烟台大学 MRI brain image pituitary gland segmentation method
WO2025034588A1 (en) * 2023-08-10 2025-02-13 Intuitive Surgical Operations, Inc. Computer assisted stomach volume reduction procedures
CN117078760A (en) * 2023-09-18 2023-11-17 北方民族大学 Valve body center positioning method based on image processing
CN118537351A (en) * 2024-05-22 2024-08-23 山东大学 Steel structure corrosion identification method and system based on active transfer learning
CN118397283A (en) * 2024-07-01 2024-07-26 山东大学 Gastric atrophy area segmentation system, electronic equipment and readable storage medium
CN120411060A (en) * 2025-05-06 2025-08-01 徐州绪权印刷有限公司 A printing defect detection method and system based on computer vision

Similar Documents

Publication | Publication Date | Title
WO2017210690A1 (en) Spatial aggregation of holistically-nested convolutional neural networks for automated organ localization and segmentation in 3d medical scans
Wang et al. Medical image segmentation using deep learning: A survey
Tang et al. A two-stage approach for automatic liver segmentation with Faster R-CNN and DeepLab
Gecer et al. Detection and classification of cancer in whole slide breast histopathology images using deep convolutional networks
US12361543B2 (en) Automated detection of tumors based on image processing
William et al. A review of image analysis and machine learning techniques for automated cervical cancer screening from pap-smear images
Wan et al. Accurate segmentation of overlapping cells in cervical cytology with deep convolutional neural networks
Kar et al. A review on progress in semantic image segmentation and its application to medical images
US11972571B2 (en) Method for image segmentation, method for training image segmentation model
Xing et al. Robust nucleus/cell detection and segmentation in digital pathology and microscopy images: a comprehensive review
Peng et al. A review of heart chamber segmentation for structural and functional analysis using cardiac magnetic resonance imaging
Alzahrani et al. Biomedical image segmentation: a survey
CN106780518B (en) MR image three-dimensional interactive segmentation method based on an active contour model with random walk and graph cut
RU2654199C1 (en) Segmentation of human tissues in a computer image
CN106340021B (en) Blood vessel extraction method
WO2015130231A1 (en) Segmentation of cardiac magnetic resonance (cmr) images using a memory persistence approach
Dong et al. A left ventricular segmentation method on 3D echocardiography using deep learning and snake
Ammari et al. A review of approaches investigated for right ventricular segmentation using short‐axis cardiac MRI
Atehortúa et al. Automatic segmentation of right ventricle in cardiac cine MR images using a saliency analysis
Cerrolaza et al. Fetal skull segmentation in 3D ultrasound via structured geodesic random forest
Krasnobaev et al. An overview of techniques for cardiac left ventricle segmentation on short-axis MRI
Chatterjee et al. A survey on techniques used in medical imaging processing
Banerjee et al. A CADe system for gliomas in brain MRI using convolutional neural networks
Albukhnefis et al. Image Segmentation Techniques: An In-Depth Review and Analysis
Gao et al. Hybrid decision forests for prostate segmentation in multi-channel MR images

Legal Events

Code: 121
Description: EP: The EPO has been informed by WIPO that EP was designated in this application
Ref document number: 17807667
Country of ref document: EP
Kind code of ref document: A1

Code: NENP
Description: Non-entry into the national phase
Ref country code: DE

Code: 122
Description: EP: PCT application non-entry in the European phase
Ref document number: 17807667
Country of ref document: EP
Kind code of ref document: A1