CN117392105A - Detection method of Crohn's disease and intestinal tuberculosis based on full-field digital sectioning - Google Patents
- Publication number
- CN117392105A (application CN202311452206.3A)
- Authority
- CN
- China
- Prior art keywords
- full
- field digital
- slice
- digital slice
- image
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
- G06T7/0012—Biomedical image inspection
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
- G06V10/443—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering
- G06V10/449—Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters
- G06V10/451—Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters with interaction between the filter responses, e.g. cortical complex cells
- G06V10/454—Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/60—Type of objects
- G06V20/69—Microscopic objects, e.g. biological cells or cellular parts
- G06V20/698—Matching; Classification
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10056—Microscopic image
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20021—Dividing image into blocks, subimages or windows
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30004—Biomedical image processing
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/03—Recognition of patterns in medical or anatomical images
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A90/00—Technologies having an indirect contribution to adaptation to climate change
- Y02A90/10—Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Physics & Mathematics (AREA)
- Evolutionary Computation (AREA)
- Multimedia (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Artificial Intelligence (AREA)
- Medical Informatics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biomedical Technology (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Databases & Information Systems (AREA)
- Software Systems (AREA)
- Quality & Reliability (AREA)
- Radiology & Medical Imaging (AREA)
- Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
- Biodiversity & Conservation Biology (AREA)
- Image Analysis (AREA)
Abstract
This application discloses a method for detecting Crohn's disease and intestinal tuberculosis based on full-field digital slices. The method includes: acquiring several full-field digital slices and determining the image blocks corresponding to each full-field digital slice; inputting the image blocks corresponding to each full-field digital slice into a trained detection network model, which determines an initial prediction probability matrix for each full-field digital slice; and determining a prediction result for the target object based on the initial prediction probability matrices of the full-field digital slices. By integrating the initial probabilities of multiple full-field digital slices, the application determines the patient-level result from multi-level classification results, which improves patient-level prediction accuracy and thereby the accuracy of detecting Crohn's disease and intestinal tuberculosis.
Description
Technical field
The present application relates to the field of biological technology, and in particular to a method and related device for detecting Crohn's disease and intestinal tuberculosis based on full-field digital slices.
Background
Crohn's disease (CD) is a chronic, nonspecific granulomatous disease whose exact cause remains unclear. The incidence of Crohn's disease in China has been rising over time. Intestinal tuberculosis (ITB) is a specific infectious disease caused by Mycobacterium tuberculosis and is one of the main types of extrapulmonary tuberculosis. The two diseases share many similarities in symptoms, endoscopic findings, and pathology, so they are easily mistaken for each other.
Current AI-based approaches to distinguishing the two diseases generally use deep-learning-based pathological image diagnosis to improve the efficiency and accuracy of diagnosing Crohn's disease and intestinal tuberculosis. However, current pathological image diagnosis models focus mainly on the slice level and neglect patient-level classification, which is more closely tied to clinical practice; this limits the accuracy of differentiation.
The existing technology therefore still needs improvement.
Summary of the invention
The technical problem to be solved by this application is to provide, in view of the shortcomings of the existing technology, a method and related device for detecting Crohn's disease and intestinal tuberculosis based on full-field digital slices.
To solve the above technical problem, a first aspect of the embodiments of this application provides a method for detecting Crohn's disease and intestinal tuberculosis based on full-field digital slices. The method includes:
acquiring several full-field digital slices and determining the image blocks corresponding to each full-field digital slice, where the several full-field digital slices are full-field digital slices of the same target object;
for each full-field digital slice, inputting the image blocks corresponding to the full-field digital slice into a trained detection network model, and determining the initial prediction probability matrix corresponding to each full-field digital slice through the detection network model;
determining a prediction result for the target object based on the initial prediction probability matrices of the full-field digital slices, where the prediction result is the Crohn's disease category, the intestinal tuberculosis category, or the normal tissue category.
According to the above technical means, after acquiring a patient's full-field digital slices and determining the image blocks corresponding to each slice, the application processes the image blocks with a trained detection network model to determine an initial prediction probability for each slice, and then determines the prediction probability of the full-field digital slices based on the initial prediction probabilities. Integrating slice-level and patient-level classification results in this way yields a more precise multi-level classification, reduces misdiagnosis, and improves the precision of treatment.
In one implementation of the method, the detection network model includes a feature extraction module and a classification module, and inputting each of the full-field digital slices into the trained detection network model and determining the initial prediction probability matrix corresponding to each full-field digital slice through the detection network model specifically includes:
for each full-field digital slice, determining the image blocks corresponding to the full-field digital slice;
inputting the image blocks into the feature extraction module one by one, and determining the feature vector corresponding to each image block through the feature extraction module;
inputting the feature vectors of the image blocks into the classification module, and determining the initial prediction probability matrix corresponding to the full-field digital slice through the classification module.
According to the above technical means, every full-field digital slice is decomposed into image blocks and features are extracted from each image block, which makes the full-field digital slice analyzable. Determining a feature vector for each image block through the feature extraction module also strengthens the ability to recognize and distinguish pathological features and improves the precision of the information available for diagnosis.
In one implementation, the classification module includes an attention unit and a multi-layer perceptron, and inputting the feature vectors of the image blocks into the classification module and determining the initial prediction probability matrix corresponding to the full-field digital slice through the classification module specifically includes:
inputting the feature vectors of the image blocks into the attention unit, determining the attention coefficient of each image block through the attention unit, and forming an attention feature matrix based on the attention coefficients and the feature vectors of the image blocks;
inputting the attention feature matrix into the multi-layer perceptron, and determining the initial prediction probability matrix corresponding to the full-field digital slice through the multi-layer perceptron.
According to the above technical means, the attention unit strengthens the focus on the key image regions related to the disease, which helps detect and diagnose Crohn's disease and intestinal tuberculosis. In addition, attention-based feature weighting can improve diagnostic accuracy and reduce the misdiagnosis rate of clinicians. Finally, the multi-layer perceptron gives the prediction model the capacity to learn and to recognize patterns in the image data.
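The attention-then-classify step described above can be illustrated with a minimal pure-Python sketch. It is not the application's model: the dot-product scoring vector `w_att`, the single linear layer standing in for the multi-layer perceptron, and all numeric values are assumptions chosen only to show how attention coefficients weight the per-block feature vectors before classification.

```python
import math

def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(features, w_att):
    """Score each block's feature vector (assumed dot-product scoring),
    normalise the scores into attention coefficients, and return the
    coefficients together with the attention-weighted feature matrix."""
    scores = [sum(x * w for x, w in zip(f, w_att)) for f in features]
    coeffs = softmax(scores)
    weighted = [[a * x for x in f] for a, f in zip(coeffs, features)]
    return coeffs, weighted

def classify_slice(weighted, w_cls, b_cls):
    """Sum the weighted features into one slice-level vector, then apply
    one linear layer plus softmax (a one-layer stand-in for the MLP)."""
    dim = len(weighted[0])
    slice_vec = [sum(row[d] for row in weighted) for d in range(dim)]
    logits = [sum(w * x for w, x in zip(row, slice_vec)) + b
              for row, b in zip(w_cls, b_cls)]
    return softmax(logits)

# Three hypothetical image blocks with 2-dimensional features.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
coeffs, weighted = attention_pool(features, w_att=[0.5, -0.5])
probs = classify_slice(weighted,
                       w_cls=[[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]],
                       b_cls=[0.0, 0.0, 0.0])
```

Here `probs` holds three class probabilities for one slice, and `coeffs` shows which blocks dominated the decision; in a trained model both `w_att` and the classifier weights would be learned rather than fixed.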
In one implementation of the method, determining the image blocks corresponding to each full-field digital slice specifically includes:
determining a binary mask of the full-field digital slice, and filling the binary mask to obtain the contours of the foreground objects in the full-field digital slice;
segmenting the full-field digital slice based on the foreground object contours to obtain a foreground image;
dividing the foreground image into several image blocks to obtain the image blocks corresponding to the full-field digital slice.
According to the above technical means, image blocks are supplied as input to the subsequent detection network model while the completeness and accuracy of the image are preserved.
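As a rough illustration of turning a binary mask into image blocks, the sketch below keeps only the tiles that are mostly foreground. This is a simplification: the application segments along foreground contours, whereas the sketch uses a per-tile foreground fraction, and the function name `foreground_tiles` and the `min_fraction` cut-off are hypothetical.

```python
def foreground_tiles(mask, tile, min_fraction=0.5):
    """Return the (top, left) corners of tile x tile windows whose share
    of foreground pixels (mask value 1) is at least min_fraction."""
    height, width = len(mask), len(mask[0])
    kept = []
    for top in range(0, height - tile + 1, tile):
        for left in range(0, width - tile + 1, tile):
            count = sum(mask[r][c]
                        for r in range(top, top + tile)
                        for c in range(left, left + tile))
            if count / (tile * tile) >= min_fraction:
                kept.append((top, left))
    return kept

# A toy 4x4 mask: tissue occupies the right-hand side.
mask = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 1, 1, 1],
    [0, 0, 1, 0],
]
corners = foreground_tiles(mask, tile=2)  # [(0, 2), (2, 2)]
```

Background-only tiles are dropped before they reach the network, which reduces the number of blocks to process per slice.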
In one implementation of the method, determining the binary mask of the full-field digital slice specifically includes:
converting the full-field digital slice to Tagged Image File Format (TIFF), and converting the converted full-field digital slice from the RGB color space to the HSV (hexcone model) color space;
computing the binary mask of the full-field digital slice by thresholding the saturation channel of the color-space-converted full-field digital slice.
According to the above technical means, the color space conversion makes color variation and distribution in the image easier to analyze and process, and the binary mask gives a direct representation of the image regions that carry useful information. It also keeps the color distribution consistent across slices, so that staining does not cause large color differences between slices, which improves the accuracy of subsequent detection.
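Saturation thresholding can be sketched with Python's standard `colorsys` module. The threshold value of 0.1 and the toy pixel values are assumptions for illustration; the application does not state a threshold, and a real slide would be read with an imaging library rather than written as nested lists.

```python
import colorsys

def saturation_mask(rgb_image, threshold=0.1):
    """Return a binary mask: 1 where a pixel's HSV saturation exceeds
    threshold, else 0. Pixels are (R, G, B) tuples with values 0..255."""
    mask = []
    for row in rgb_image:
        mask_row = []
        for r, g, b in row:
            _, s, _ = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
            mask_row.append(1 if s > threshold else 0)
        mask.append(mask_row)
    return mask

# White or grey glass background has near-zero saturation;
# stained tissue (pink/purple) is clearly saturated.
image = [
    [(255, 255, 255), (200, 120, 180)],
    [(240, 240, 240), (150, 60, 140)],
]
mask = saturation_mask(image)  # [[0, 1], [0, 1]]
```

Saturation separates tissue from background here because unstained glass is close to grey regardless of its brightness.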
In one implementation of the method, determining the prediction result for the target object based on the initial prediction probability matrices of the full-field digital slices specifically includes:
weighting the initial prediction probability matrices of the full-field digital slices to obtain a target prediction probability matrix;
determining the prediction result for the target object based on the target prediction probability matrix.
According to the above technical means, weighting assigns each full-field digital slice's prediction probability a share in the final result, so that the result represents the condition of the target object as a whole. This reduces the misdiagnosis rate of clinicians, improves the precision of treatment, helps identify pathological features when detecting Crohn's disease and intestinal tuberculosis, and improves diagnostic efficiency.
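The slice-to-patient aggregation can be sketched as a weighted average followed by an arg-max. The application does not say how the weights are chosen, so the uniform default, the function name `patient_prediction`, and the example probabilities below are all assumptions.

```python
CLASSES = ["Crohn's disease", "intestinal tuberculosis", "normal tissue"]

def patient_prediction(slice_probs, weights=None):
    """Combine per-slice class probabilities into one patient-level
    probability vector by a normalised weighted sum, then pick the
    most probable class. Uniform weights are assumed by default."""
    if weights is None:
        weights = [1.0] * len(slice_probs)
    total = sum(weights)
    target = [sum(w * p[k] for w, p in zip(weights, slice_probs)) / total
              for k in range(len(CLASSES))]
    best = max(range(len(CLASSES)), key=lambda k: target[k])
    return target, CLASSES[best]

# Hypothetical initial prediction probabilities for three slices
# from the same patient.
slice_probs = [
    [0.7, 0.2, 0.1],
    [0.4, 0.5, 0.1],
    [0.6, 0.3, 0.1],
]
target, label = patient_prediction(slice_probs)
```

With uniform weights the second slice's vote for intestinal tuberculosis is outweighed by the other two slices, so the patient-level label is Crohn's disease; a different weighting could change that outcome.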
In one implementation, the method further includes:
for each full-field digital slice, obtaining the degree of similarity of each image block in the full-field digital slice relative to the other image blocks, to obtain an attention score for each image block;
forming an attention distribution map based on the attention scores, where different attention scores correspond to different colors in the attention distribution map;
superimposing the attention distribution map on the full-field digital slice to obtain an attention heat map of the full-field digital slice.
According to the above technical means, the method highlights regions of the slice that are important or that differ from other image blocks, helping clinicians locate and identify pathological features. The attention distribution map built from the attention scores directs the clinician's attention to specific regions, and the attention heat map obtained by superimposing it on the full-field digital slice gives clinicians a tool for locating and analyzing regions more precisely when detecting Crohn's disease and intestinal tuberculosis.
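The colour mapping and overlay can be sketched as follows, taking the per-block attention scores as given. The blue-to-red ramp, the blending factor `alpha=0.4`, and the helper names are assumptions; the application only requires that different scores map to different colours and that the map is superimposed on the slice.

```python
def normalise(scores):
    """Rescale scores linearly to the range 0..1."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]

def score_to_rgb(t):
    """Map a normalised score to a colour: blue (low) to red (high)."""
    return (round(255 * t), 0, round(255 * (1 - t)))

def blend(pixel, colour, alpha=0.4):
    """Alpha-blend a heat-map colour over an original slice pixel."""
    return tuple(round((1 - alpha) * p + alpha * c)
                 for p, c in zip(pixel, colour))

# Hypothetical attention scores for three image blocks, each drawn
# here as a single grey pixel for brevity.
scores = [1.0, 3.0, 2.0]
colours = [score_to_rgb(t) for t in normalise(scores)]
overlay = [blend((200, 200, 200), c) for c in colours]
```

The lowest-scoring block is tinted blue and the highest red, while the underlying tissue remains visible through the partial transparency.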
A second aspect of the embodiments of this application provides a device for detecting Crohn's disease and intestinal tuberculosis based on full-field digital slices. The device includes:
an acquisition module, configured to acquire several full-field digital slices and determine the image blocks corresponding to each full-field digital slice, where the several full-field digital slices are full-field digital slices of the same target object;
a control module, configured to control the trained detection network model to determine the initial prediction probability matrix corresponding to each full-field digital slice;
a determination module, configured to determine the prediction result for the target object based on the initial prediction probability matrices of the full-field digital slices, where the prediction result is the Crohn's disease category, the intestinal tuberculosis category, or the normal tissue category.
A third aspect of the embodiments of this application provides a computer-readable storage medium storing one or more programs, which can be executed by one or more processors to implement the steps of any of the methods for detecting Crohn's disease and intestinal tuberculosis based on full-field digital slices described above.
A fourth aspect of the embodiments of this application provides a terminal device, which includes a processor and a memory;
the memory stores a computer-readable program executable by the processor;
when executing the computer-readable program, the processor implements the steps of any of the methods for detecting Crohn's disease and intestinal tuberculosis based on full-field digital slices described above.
Beneficial effects: compared with the existing technology, this application provides a method for detecting Crohn's disease and intestinal tuberculosis based on full-field digital slices. The method includes: acquiring several full-field digital slices and determining the image blocks corresponding to each full-field digital slice, where the several full-field digital slices are full-field digital slices of the same target object; inputting the image blocks corresponding to each full-field digital slice into a trained detection network model to determine the initial prediction probability matrix corresponding to each full-field digital slice; and determining the prediction result for the target object based on the initial prediction probability matrices of the full-field digital slices. By integrating the initial probabilities of multiple full-field digital slices, the application determines the patient-level result from multi-level classification results, which improves patient-level prediction accuracy and thereby the accuracy of detecting Crohn's disease and intestinal tuberculosis.
Brief description of the drawings
To explain the technical solutions in the embodiments of this application more clearly, the drawings needed for describing the embodiments are briefly introduced below. The drawings described below are only some embodiments of this application; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Figure 1 is a flow chart of the method for detecting Crohn's disease and intestinal tuberculosis based on full-field digital slices provided by this application.
Figure 2 is a diagram of the deep learning pipeline of the method for detecting Crohn's disease and intestinal tuberculosis based on full-field digital slices provided by this application.
Figure 3 shows full-field digital slice images provided by this application and their corresponding attention heat maps.
Figure 4 is a schematic structural diagram of the device for detecting Crohn's disease and intestinal tuberculosis based on full-field digital slices provided by this application.
Figure 5 is a schematic structural diagram of the terminal device provided by this application.
Detailed description
This application provides a method and related device for detecting Crohn's disease and intestinal tuberculosis based on full-field digital slices. To make the purpose, technical solutions, and effects of this application clearer, the application is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here only explain this application and do not limit it.
Those skilled in the art will understand that, unless expressly stated otherwise, the singular forms "a", "an", "said" and "the" used herein may also include the plural forms. It should be further understood that the word "comprising" used in the description of this application indicates the presence of the stated features, integers, steps, operations, elements and/or components, but does not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components and/or groups thereof. It should be understood that when an element is said to be "connected" or "coupled" to another element, it can be directly connected or coupled to the other element, or intervening elements may be present. In addition, "connected" or "coupled" as used herein may include wireless connection or wireless coupling. The term "and/or" as used herein includes any one of, and all combinations of, one or more of the associated listed items.
Those skilled in the art will understand that, unless otherwise defined, all terms used herein (including technical and scientific terms) have the same meaning as commonly understood by a person of ordinary skill in the art to which this application belongs. It should also be understood that terms such as those defined in general-purpose dictionaries are to be understood as having a meaning consistent with their meaning in the context of the prior art and, unless specifically defined as they are here, are not to be interpreted in an idealized or overly formal sense.
It should be understood that the magnitude of the step numbers in this embodiment does not imply an order of execution. The execution order of each process is determined by its function and internal logic, and the numbering places no limitation on the implementation of the embodiments of this application.
The inventors found through research that Crohn's disease (CD) is a chronic, nonspecific granulomatous disease whose exact cause remains unclear. The incidence of Crohn's disease in China has been rising over time. Intestinal tuberculosis (ITB) is a specific infectious disease caused by Mycobacterium tuberculosis and is one of the main types of extrapulmonary tuberculosis. The two diseases share many similarities in symptoms, endoscopic findings, and pathology, so they are easily mistaken for each other.
Current AI-based approaches to distinguishing the two diseases use deep-learning-based pathological image diagnosis to improve the efficiency and accuracy of diagnosing Crohn's disease and intestinal tuberculosis. However, current pathological image diagnosis models focus mainly on the slice level and neglect patient-level classification, which is more closely tied to clinical practice; this limits the accuracy of differentiation.
To solve the above problem, the embodiments of this application acquire several full-field digital slices and determine the image blocks corresponding to each full-field digital slice, where the several full-field digital slices are full-field digital slices of the same target object; input the image blocks corresponding to each full-field digital slice into a trained detection network model to determine the initial prediction probability matrix corresponding to each full-field digital slice; and determine the prediction result for the target object based on the initial prediction probability matrices of the full-field digital slices. By integrating the initial probabilities of multiple full-field digital slices, the application determines the patient-level result from multi-level classification results, which improves patient-level prediction accuracy and thereby the accuracy of detecting Crohn's disease and intestinal tuberculosis.
The content of the application is further explained below through the description of the embodiments, in conjunction with the drawings.
本实施例提供了一种基于全视野数字切片的克罗恩病和肠结核的检测方法,如图1所示,所述方法包括:This embodiment provides a detection method for Crohn's disease and intestinal tuberculosis based on full-field digital slices, as shown in Figure 1. The method includes:
S10、获取若干全视野数字切片,并且确定每个全视野数字切片对应的若干图像块,其中,若干全视野数字切片为同一目标对象的全视野数字切片。S10. Obtain a number of full-field digital slices, and determine a number of image blocks corresponding to each full-field digital slice, where the several full-field digital slices are full-field digital slices of the same target object.
Specifically, full-field digital slices (also known as whole-slide images) are high-resolution images used for medical image analysis, usually obtained by scanning biological samples (for example, thin sections prepared from tissue or cells). Full-field digital slices are widely used in medical research fields such as pathological diagnosis, disease research, and drug effect evaluation. Here, the several full-field digital slices are multiple consecutive or randomly selected slices of the target object. For example, a Crohn's disease tissue sample from one target object may be cut into tens to hundreds of consecutive thin sections, each scanned separately to obtain multiple full-field digital slices. Consecutive full-field digital slices give researchers a continuous view of the target object, allowing abnormal regions to be identified and located more accurately.
Each full-field digital slice is large, so for ease of analysis and processing it is usually divided into several smaller image blocks. For example, a full-field digital slice of 10,000x10,000 pixels can be divided into image blocks of 224x224 pixels or 384x384 pixels. All image blocks keep the same size and resolution so that they can be analyzed and processed in batches.
In one implementation, determining the several image blocks corresponding to each full-field digital slice specifically includes:
S11. Determine the binary mask of the full-field digital slice, and fill the binary mask to obtain the foreground object contour in the full-field digital slice;
S12. Segment the full-field digital slice based on the foreground object contour to obtain a foreground image;
S13. Divide the foreground image into several image blocks to obtain the several image blocks corresponding to the full-field digital slice.
Specifically, in step S11, full-field digital slices originate from the biomedical field, for example histological or cytological microscopic images that contain target cells, tissues, and pathological regions. In the solution of the present application, the full-field digital slices come from pathological tissue samples of the target object. To locate and analyze the target pathological region accurately, a binary mask is first constructed to represent the foreground and background of the image. A binary mask is an image composed of 0s and 1s, where 1 represents the foreground (for example target cells, tissues, and pathological regions) and 0 represents the background. The binary mask can be obtained by thresholding, by image segmentation techniques, or by deep learning models (such as U-Net or YOLO). For example, when the brightness of a cell in the image is higher than a preset threshold, that cell's position is marked as 1 in the mask, and otherwise as 0. In a full-field digital slice, every pixel has an associated brightness or intensity value, which in an 8-bit grayscale image ranges from 0 to 255, so the preset threshold also takes a value between 0 and 255, for example 150.
However, because of noise or other interference in the full-field digital slice, the binary mask obtained may be imperfect (with gaps or breaks). To obtain a continuous foreground object contour, the preliminary binary mask is filled. Filling uses mathematical morphology operations such as erosion and dilation, or image processing techniques (for example median filtering, opening, and closing), to remove noise and complete the foreground object contour.
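The filling step can be illustrated with a morphological closing (dilation followed by erosion). The following is a minimal pure-Python sketch on a tiny mask, assuming a 3x3 square structuring element and zero padding at the border; the function names are hypothetical, and a real pipeline would more likely use a library routine such as OpenCV's `cv2.morphologyEx` with `cv2.MORPH_CLOSE`.

```python
def _neighbourhood(mask, y, x):
    """Values in the 3x3 window around (y, x); out-of-bounds pixels count as 0."""
    h, w = len(mask), len(mask[0])
    return [
        mask[j][i] if 0 <= j < h and 0 <= i < w else 0
        for j in (y - 1, y, y + 1)
        for i in (x - 1, x, x + 1)
    ]


def dilate(mask):
    """A pixel becomes 1 if any pixel in its 3x3 window is 1."""
    return [[int(any(_neighbourhood(mask, y, x))) for x in range(len(mask[0]))]
            for y in range(len(mask))]


def erode(mask):
    """A pixel stays 1 only if every pixel in its 3x3 window is 1."""
    return [[int(all(_neighbourhood(mask, y, x))) for x in range(len(mask[0]))]
            for y in range(len(mask))]


def close_mask(mask):
    """Morphological closing: dilation then erosion, fills small holes and gaps."""
    width = len(mask[0])
    # Pad with one ring of zeros so that foreground touching the image border
    # is not eroded away, then crop the ring off again.
    padded = [[0] * (width + 2)] + [[0] + list(row) + [0] for row in mask] + [[0] * (width + 2)]
    closed = erode(dilate(padded))
    return [row[1:-1] for row in closed[1:-1]]


ring = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 0, 1, 0],   # single-pixel hole in the middle of the foreground
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
filled = close_mask(ring)
```

After closing, the one-pixel hole is filled while the outer shape of the foreground is unchanged.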
In one implementation of the present application, when a full-field digital slice shows pathological cells of intestinal tuberculosis, a preliminary binary mask can be obtained with a U-Net model, and the mask is then filled with a morphological closing operation. This yields a continuous foreground contour of the intestinal tuberculosis pathological cells, from which researchers or clinicians can accurately analyze cell morphology and number, providing important information for further research or diagnosis.
In the implementation of the present application, determining the binary mask of the full-field digital slice specifically includes:
S111. Convert the full-field digital slice to TIFF (Tagged Image File Format) format, and convert the converted full-field digital slice from the RGB color space to the HSV (Hue, Saturation, Value) color space;
S112. Compute the binary mask of the full-field digital slice by thresholding the saturation channel of the color-space-converted full-field digital slice.
Specifically, in step S111, after the full-field digital slice is obtained, it is converted to TIFF format to ensure compatibility with a wide range of image processing tools and software and to improve storage and transmission efficiency. TIFF is a widely used image format that supports lossless compression and multi-level (pyramidal) storage, and it is suitable for medical and scientific applications that require high quality and high resolution.
Accordingly, the full-field digital slice first undergoes format conversion using preset image processing software or a custom-written tool, for example an open-source image processing library such as OpenCV or the Python Imaging Library (PIL). During conversion, image compression or quality adjustment can be used to optimize storage size while preserving image detail. In the implementation of the present application, KFBioConverter is used for the TIFF conversion.
Further, to analyze the color information in the image in more detail, the OpenCV library is typically used to convert the converted full-field digital slice from the RGB color space to the HSV color space. The RGB color space represents a color as a combination of red, green, and blue channels, whereas HSV describes a color by hue, saturation, and value. In the HSV color space, hue describes the kind of color (red, green, blue, and so on), saturation describes the purity of the color, and value describes how bright the color is.
The benefit of converting from RGB to HSV is that color variation and distribution in the image can be analyzed and processed more intuitively. This matters in practical applications such as cell staining or tissue analysis, where subtle color differences may be related to particular biological or pathological features. In addition, each channel of the HSV color space can be processed separately, which makes more specialized color adjustment and analysis easier.
In one implementation, when a specific cell color is being analyzed and only a particular hue is of interest, a threshold can be set directly on the hue channel of the HSV color space using the OpenCV library, which effectively separates the target color region.
In the implementation of the present application, the OpenSlide tool is first used to process the full-field digital slice. OpenSlide is an open-source library that supports high-resolution images, in particular biomedical slide images. Because full-field digital slices are very large, loading a whole image into memory directly is resource-intensive, so OpenSlide is used to downsample the full-field digital slice. What is read in is therefore a reduced version of the slide, with lower resolution but still retaining most of the key structural and color information, so that preliminary image processing and analysis can be performed more efficiently. Once the downsampled full-field digital slice is in memory, the OpenCV library is used to convert it from the RGB color space to the HSV color space. This is done with the cvtColor function of the OpenCV library, which takes the original RGB image and the conversion type (here cv2.COLOR_RGB2HSV) as input parameters and returns the converted HSV image. In this way, color variation and distribution in the full-field digital slice can be analyzed more precisely, so that the slide can be processed and analyzed accurately in subsequent steps.
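To make the color-space conversion concrete, the sketch below computes the HSV saturation of each pixel with Python's standard `colorsys` module rather than OpenCV, so that it runs without third-party packages. The helper name `saturation_channel` and the sample pixel values are illustrative assumptions; the result is scaled to 0-255, which matches the range OpenCV uses for the S channel of 8-bit images.

```python
import colorsys


def saturation_channel(rgb_image):
    """Return the HSV saturation of every pixel, scaled to the 0-255 range.

    rgb_image is a list of rows, each row a list of (R, G, B) tuples in 0-255.
    """
    return [
        [round(colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)[1] * 255)
         for (r, g, b) in row]
        for row in rgb_image
    ]


# Illustrative pixels: white slide background, a pinkish stained pixel,
# mid-gray, and pure red.
image = [[(255, 255, 255), (230, 150, 200)],
         [(128, 128, 128), (255, 0, 0)]]
sat = saturation_channel(image)
```

White and gray pixels have zero saturation, while stained tissue has a clearly non-zero value, which is why the saturation channel is a convenient basis for separating tissue from blank background.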
Further, in step S112, after the color-space-converted full-field digital slice is obtained, its saturation channel is thresholded to obtain the corresponding binary mask, so that specific content and structures in the slide can be processed and analyzed further. Because the saturation channel describes the purity of a color in the HSV color space, it is commonly used to separate regions of higher or lower saturation in an image.
Specifically, a suitable threshold is chosen according to the content of the full-field digital slice and the requirements of the application. For example, to extract regions of higher saturation, such as a specific stain or marker, a preset threshold such as 150 (on a 0-255 scale) is set so that only pixels whose saturation exceeds the threshold are identified. Conversely, if the aim is to identify regions of lower saturation, such as background or unstained areas, the preset threshold can be set lower, for example 50. The preset threshold can be fine-tuned based on experimental data or prior knowledge to ensure that it is effective and accurate for the specific application. For best results, the preset threshold can also be determined automatically using histogram analysis or Otsu's method.
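As an illustration of automatic threshold selection, here is a minimal pure-Python sketch of Otsu's method, which picks the threshold that maximizes the between-class variance of the histogram. The function name and the sample values are assumptions for illustration; in practice the same result is usually obtained with a library call such as OpenCV's `cv2.threshold` with the `cv2.THRESH_OTSU` flag.

```python
def otsu_threshold(values, levels=256):
    """Return the threshold t that best splits integer values into <= t and > t."""
    hist = [0] * levels
    for v in values:
        hist[v] += 1

    total = len(values)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_score = 0, -1.0
    w0 = 0    # number of pixels with value <= t
    sum0 = 0  # sum of those pixel values
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, the split is meaningless
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        # Between-class variance, up to the constant factor 1 / total**2.
        score = w0 * w1 * (mean0 - mean1) ** 2
        if score > best_score:
            best_t, best_score = t, score
    return best_t


# Hypothetical saturation values: pale background near 40, stained tissue near 180.
saturation_values = [40] * 60 + [180] * 40
t = otsu_threshold(saturation_values)
```

For this two-cluster example the selected threshold falls between the two clusters, so every background value is at or below it and every tissue value is above it.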
Applying the thresholding operation converts the pixels in the saturation channel of the full-field digital slice to binary values, "0" or "1". Pixels above the preset threshold are marked "1" (white), and pixels below or equal to the preset threshold are marked "0" (black). The result is a binary mask in which white regions represent the more saturated parts of interest and black regions represent everything else.
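The thresholding rule described above can be written in a few lines. This is a minimal sketch that assumes the saturation channel is available as a nested list of integers in 0-255; the function name is hypothetical and the default of 150 is simply the example value from the text.

```python
def threshold_mask(channel, threshold=150):
    """Mark pixels strictly above the threshold as 1 and all others as 0."""
    return [[1 if value > threshold else 0 for value in row] for row in channel]


saturation = [[0, 150, 151],
              [255, 50, 200]]
mask = threshold_mask(saturation)
```

Note that a pixel exactly equal to the threshold is treated as background, in line with the "below or equal" rule.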
In addition, to improve the accuracy of the mask and remove potential noise, the binary mask is post-processed with image processing techniques, for example morphological opening to remove small noise points or morphological closing to fill small holes.
Specifically, in step S12, the foreground object contour obtained in step S11 is used to segment the whole full-field digital slice precisely, extracting the image region associated with the foreground object, which is called the foreground image. A full-field digital slice contains several kinds of elements, such as background, noise, and foreground objects of interest (for example target cells, tissue, or specific pathological regions). To focus only on the foreground objects, the obtained foreground object contour is used to segment the original full-field digital slice. The segmentation can be done with a simple element-wise multiplication: the foreground object contour (a mask of 1s and 0s) is multiplied element by element with the original full-field digital slice, so that only the foreground region is kept and the background region is set to 0 or another specified background value.
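The element-wise multiplication can be sketched as follows. This minimal example assumes the mask and the image have the same dimensions (a mask computed on a downsampled slide would first need to be resized) and uses a hypothetical helper name; with NumPy arrays the same operation is typically a broadcast multiplication of the image by the mask.

```python
def apply_mask(rgb_image, mask, background=(0, 0, 0)):
    """Keep pixels where mask == 1 and replace the rest with a background value.

    Each channel is computed as pixel * m + background * (1 - m), which reduces
    to plain element-wise multiplication when the background value is 0.
    """
    result = []
    for image_row, mask_row in zip(rgb_image, mask):
        result.append([
            tuple(c * m + b * (1 - m) for c, b in zip(pixel, background))
            for pixel, m in zip(image_row, mask_row)
        ])
    return result


image = [[(10, 20, 30), (40, 50, 60)]]
mask = [[1, 0]]
foreground = apply_mask(image, mask)
foreground_on_white = apply_mask(image, mask, background=(255, 255, 255))
```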
In practical applications, post-processing techniques such as smoothing and sharpening can be applied to enhance the detail of the foreground image and further improve its clarity and readability. The foreground image can also undergo further color or grayscale adjustment to meet specific application or analysis requirements.
In the implementation of the present application, the full-field digital slices come from research on distinguishing Crohn's disease from intestinal tuberculosis, where the morphology and structure of disease features are of primary interest. After the foreground object contour is obtained, it is applied to the original full-field digital slice to obtain the foreground image relevant to Crohn's disease and intestinal tuberculosis. To observe the structure and characteristics of the disease more clearly, image enhancement is applied to the foreground image so that certain details of the relevant regions become more apparent.
Specifically, in step S13, the foreground image obtained in step S12 is divided into multiple small image blocks for finer analysis or for specific image processing operations. Dividing the image makes it possible to study local features, structures, or textures of the foreground image in more detail, or to process the image blocks in parallel to improve computational efficiency. The foreground image can be divided equally, or divided according to a predetermined size and stride. Image blocks can be square, rectangular, or another specified shape, and their size and number are determined by actual needs and the application scenario.
In one implementation, when the resolution of the foreground image is very high, it can be divided into larger image blocks to avoid excessive computation; conversely, when microscopic features of the foreground image need to be studied in detail, it is divided into smaller image blocks. In the implementation of the present application, OpenCV is used to divide the tissue into image blocks of 384x384 pixels at 40X magnification.
In one implementation, the foreground image is divided using a sliding window, which produces partially overlapping image blocks and gives better continuity and consistency in subsequent processing. Overlapping sliding windows are especially useful when edge information is needed or when the boundary of an image block crosses a target structure of interest, because they ensure that the target structure is fully contained in at least one image block rather than being cut apart.
In storage or parallel-processing scenarios, however, non-overlapping image blocks are preferable. In that case the foreground image can be divided uniformly with a fixed block size and spacing. For example, if the chosen block size is 64x64 pixels, a division is made every 64 pixels, so that each image block is independent and has no overlap with any other. To avoid information loss or errors when dividing into non-overlapping blocks, the chosen block size should match the size of the target structures or features in the foreground image.
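Both tiling schemes reduce to enumerating the top-left corner of each block with a given stride. The sketch below is a minimal illustration under the assumption that incomplete blocks at the right and bottom edges are simply skipped; the function name and the sample image size are hypothetical. A stride equal to the block size gives non-overlapping blocks, and a smaller stride gives an overlapping sliding window.

```python
def patch_origins(width, height, patch=384, stride=None):
    """Return the (x, y) top-left corner of every full patch inside the image.

    stride defaults to the patch size (non-overlapping tiling); a smaller
    stride yields overlapping sliding-window patches.
    """
    stride = patch if stride is None else stride
    xs = range(0, width - patch + 1, stride)
    ys = range(0, height - patch + 1, stride)
    return [(x, y) for y in ys for x in xs]


# Non-overlapping 384x384 tiles of a hypothetical 1000x800 foreground image.
tiles = patch_origins(1000, 800, patch=384)

# 50% overlap: same patch size, stride of half a patch.
overlapping = patch_origins(1000, 800, patch=384, stride=192)
```

Storing these coordinates alongside each block also makes it possible to place the block back in the original slide later.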
If the size of the foreground image is not an integer multiple of the block size, the image can be padded or cropped appropriately to ensure a complete division. In addition, to ensure that each image block contains enough information or feature points, a preset evaluation mechanism, such as the number of feature points in the block, its contrast, or its texture complexity, is used to decide whether to accept or reject the block. For example, when the full-field digital slice is a foreground image of Crohn's disease pathological tissue, it is divided into image blocks of 256x256 pixels, each block undergoes specific image analysis (such as cell counting, tissue structure analysis, or color statistics), and after all blocks have been processed they are reassembled into a complete, processed full-field digital slice.
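The padding calculation and a simple acceptance test can be sketched as below. This is an illustrative assumption rather than the evaluation mechanism of the application: contrast is measured here as the standard deviation of grayscale intensities, and the cut-off `min_std=5.0` and both function names are hypothetical.

```python
import statistics


def padded_size(size, patch):
    """Smallest multiple of the patch size that is >= size (ceiling division)."""
    return -(-size // patch) * patch


def accept_patch(gray_patch, min_std=5.0):
    """Accept a patch only if its intensity standard deviation reaches min_std.

    Nearly uniform patches (for example blank background) have a standard
    deviation close to 0 and are rejected.
    """
    pixels = [value for row in gray_patch for value in row]
    return statistics.pstdev(pixels) >= min_std


# A 1000-pixel-wide image needs 24 pixels of padding for 256-pixel patches.
target_width = padded_size(1000, 256)

blank = [[255, 255], [255, 255]]
textured = [[0, 255], [255, 0]]
```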
In the implementation of the present application, the OpenCV library is used to preprocess the full-field digital slices. First, the full-field digital slice is read with OpenSlide. Because a full-field digital slice contains several kinds of elements, such as background, noise, and the foreground objects that are actually of interest (the pathological regions of Crohn's disease and intestinal tuberculosis), the slide is first segmented so that only the foreground objects are considered.
To remove irrelevant blank regions and improve processing efficiency, OpenCV is used to segment the tissue contour of the full-field digital slice at low resolution. Once the foreground object contour is obtained, it is multiplied element-wise with the original full-field digital slice, that is, the foreground object contour (a binary image of 1s and 0s) is multiplied element by element with the original slide. As a result, the image region associated with the foreground object is retained and all other regions are set to 0 or another specified background value.
After foreground contour segmentation, for further refinement and analysis, the foreground object is divided at 40X magnification into non-overlapping image blocks (patches) of 384×384 pixels. This allows local features, structures, or textures within each image block to be studied in more detail, and processing small image blocks is also computationally more efficient than processing the whole large image.
Each image block is saved together with the pixel coordinates of its upper-left corner in the original full-field digital slice. In subsequent processing and analysis, if the whole foreground object needs to be reconstructed or the relationship between image blocks needs to be considered, these pixel coordinates are used for positioning.
Depending on the needs of the analysis, each image block can also undergo further image enhancement and processing. For example, with a specific cell staining technique, where the morphology and structure of the stained cells are of primary interest, image enhancement algorithms such as sharpening and contrast enhancement can be applied to make the details inside the cells more prominent.
After all image blocks have been processed, they can be recombined into a processed full-field digital slice, providing richer information for subsequent analysis.
S20. For each full-field digital slice, input the several image blocks corresponding to the full-field digital slice into the trained detection network model, and determine the initial prediction probability matrix corresponding to each full-field digital slice through the detection network model.
Specifically, after a full-field digital slice has been obtained and divided into several image blocks, a trained detection network model is used to process the image blocks in order to analyze and identify specific structures or content in them. The detection network model is typically a deep learning model of the kind commonly used to detect and recognize targets in images.
Next, each image block is input individually into the trained detection network model. Such a model usually consists of multiple convolutional layers, pooling layers, and fully connected layers, and is designed to extract meaningful features from the input image block and detect the relevant targets. In this way, the model generates a prediction probability matrix for each image block, representing the probability that each pixel in the image block belongs to a given target class.
For example, if the task is to identify a specific pathological tissue, each value in the prediction probability matrix represents the probability that the corresponding pixel belongs to that pathological tissue. A high probability value indicates that the pixel is likely to belong to the target pathological tissue.
For each full-field digital slice, its image blocks are input into the detection network model one after another, and a prediction probability matrix is obtained for each image block. These prediction probability matrices can be combined or merged to obtain the initial prediction probability matrix of the whole full-field digital slice.
In addition, a detection network model usually requires a large amount of annotated data for training. Annotated data consists of image blocks in which the target structures or content have been accurately labeled. With a large amount of training data, the model learns to identify and detect targets accurately, giving it high accuracy and robustness in practical applications.
In one implementation, as shown in Figure 2, the detection network model includes a feature extraction module and a classification module.
Specifically, the main task of the feature extraction module is to extract meaningful features from the input image blocks and thereby capture important information in the image, such as edges, texture, shape, and color. For this purpose, the feature extraction module usually adopts a structure of multiple convolutional layers, activation layers, and pooling layers. For example, convolutional layers slide different convolution kernels over the image to capture various spatial features, and pooling layers reduce the spatial dimensions of the features while retaining the most important information.
In a practical implementation, the feature extraction module can use a pre-trained network structure such as VGG16, ResNet50, or MobileNet. A pre-trained model has already been trained on a large amount of image data and can capture the basic features of images effectively, which speeds up and improves learning on the new task.
In the implementation of the present application, the feature extraction module uses a Vision Transformer (ViT) model pre-trained on the ImageNet dataset, ensuring that each image block receives a high-quality, low-dimensional feature representation. To further improve the performance and generalization ability of the model, each image block undergoes a series of data augmentation operations before being fed into the ViT model. First, each image block is read according to its saved coordinate information, and then one or more of the following data augmentation methods are applied at random:
1. Apply a random rotation, for example by 90°, or a horizontal flip.
2. Scale and rotate the image block with an affine transformation. Specifically, the scale ranges from 80% to 120% of the original image, and the rotation angle ranges from -30° to +30°.
3. Add Gaussian noise to the image block or apply Gaussian blur, making the model more robust.
4. Adjust the brightness and contrast of the image block, further increasing the model's fault tolerance.
5. Adjust the hue and saturation of the image block, increasing the model's ability to adapt to a variety of real-world conditions.
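A few of the augmentations listed above can be sketched in pure Python on a small single-channel image. This minimal example covers only the 90° rotation, the horizontal flip, and the brightness/contrast adjustment; the 0.5 application probability and the `alpha` and `beta` ranges are assumptions chosen for illustration, not values from the application, and a real pipeline would usually rely on an augmentation library instead.

```python
import random


def rot90(image):
    """Rotate the image 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def hflip(image):
    """Mirror the image left to right."""
    return [row[::-1] for row in image]


def adjust_brightness_contrast(image, alpha=1.0, beta=0.0):
    """Compute alpha * pixel + beta for each pixel, clipped to 0-255.

    alpha scales contrast and beta shifts brightness.
    """
    return [[min(255, max(0, round(alpha * value + beta))) for value in row]
            for row in image]


def augment(image, rng):
    """Apply each augmentation independently with probability 0.5 (assumed)."""
    if rng.random() < 0.5:
        image = rot90(image)
    if rng.random() < 0.5:
        image = hflip(image)
    if rng.random() < 0.5:
        image = adjust_brightness_contrast(
            image, alpha=rng.uniform(0.8, 1.2), beta=rng.uniform(-20, 20))
    return image


patch = [[1, 2],
         [3, 4]]
augmented = augment(patch, random.Random(0))
```

Passing a seeded `random.Random` instance keeps the augmentation reproducible, which is convenient when debugging a training run.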
After the data augmentation step, each image block is fed into the ViT model and converted into a 768-dimensional feature vector. Each full-field digital slice can thus produce thousands of such feature vectors, providing a rich information base for subsequent image analysis.
Specifically, once the feature extraction module has extracted meaningful features from the image, the task of the classification module is to classify the image blocks and make predictions based on those features. Classification and prediction usually involve several fully connected neural network layers, which transform the feature vector into the final prediction result; in the implementation of the present application, the final result is a probability matrix.
To prevent overfitting and improve the generalization ability of the model, the classification module includes regularization techniques (such as dropout or L2 regularization).
In addition, the classification module includes activation functions (such as the rectified linear unit, ReLU, or the softmax function), which increase the nonlinear capacity of the model and convert the output into probability values.
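The two activation functions mentioned above can be written directly from their definitions. The following minimal sketch uses only the standard library; subtracting the maximum before exponentiating is a common numerical-stability measure added here and does not change the result.

```python
import math


def relu(values):
    """Rectified linear unit: negative inputs become 0, others pass through."""
    return [max(0.0, v) for v in values]


def softmax(values):
    """Convert raw scores into probabilities that are positive and sum to 1."""
    peak = max(values)  # subtract the maximum to avoid overflow in exp()
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


hidden = relu([-1.0, 0.0, 2.5])
probabilities = softmax([2.0, 1.0, 0.1])
```

ReLU is typically used between the fully connected layers, while softmax is applied once at the output so that the scores for the classes can be read as probabilities.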
In summary, when the image blocks of a full-field digital slice are input into the detection network model, they first pass through the feature extraction module, which produces a set of feature vectors. The feature vectors are then passed to the classification module, which produces the final probability matrix. Combining feature extraction and classification in this way not only improves the accuracy of the model but also ensures good robustness across different types of image data.
In another implementation, inputting each of the several full-field digital slices into the trained detection network model, and determining the initial prediction probability matrix corresponding to each full-field digital slice through the detection network model, specifically includes:
S221. For each full-field digital slice, determining several image blocks corresponding to the full-field digital slice;
S222. Inputting the image blocks into the feature extraction module, and determining the feature vector corresponding to each image block through the feature extraction module;
S223. Inputting the feature vector corresponding to each image block into the classification module, and determining the initial prediction probability matrix corresponding to the full-field digital slice through the classification module.
Specifically, after the feature vector corresponding to each image block has been obtained, the classification module processes the feature vectors to determine the initial prediction probability matrix corresponding to the full-field digital slice.
In a practical implementation, the classification module is usually designed on the basis of a deep neural network (for example, multiple fully connected neural network layers), so that useful information that is decisive for classification is extracted from the feature vectors. At the end of the classification module, a softmax activation function is used so that the module can output a prediction probability matrix for each image block.
The prediction probability matrix reflects the model's confidence in the category to which each image block belongs. Specifically, when the task is a binary classification task, each element of the prediction probability matrix represents the probability that the corresponding image block is judged to belong to the preset category. For a multi-class task, each row of the prediction probability matrix gives the predicted probabilities that the image block belongs to each of the categories.
To prevent overfitting and improve the generalization ability of the model, the classification module also includes regularization techniques such as dropout.
In addition, to improve training efficiency and convergence speed, a batch normalization layer can be added to the module.
After the classification module has finished processing, every image block has a corresponding prediction probability matrix. Together, these probability matrices form the initial prediction probability matrix corresponding to the full-field digital slice, which provides the key data for subsequent analysis and decision-making.
In one implementation, the classification module includes an attention unit and a multilayer perceptron.
Specifically, after the feature vectors have been obtained, they are passed to the classification module so that an accurate classification judgment can be made from them and the initial prediction probability matrix corresponding to the full-field digital slice can be generated.
In the implementation of the present application, the classification module first processes the input feature vectors with one or more attention units. The attention mechanism has proven to be a powerful method in deep learning: it gives the model the ability to distinguish between different parts of the features, so that the model focuses on the parts that are most critical for classification. The attention unit can be implemented with self-attention or with a variant such as sparse attention. The self-attention mechanism assigns a weight to each input feature, and the weight is computed from the relationships between the input features. Features that are likely to be highly relevant to the current task are therefore given higher weights.
After processing by the attention unit, every part of the feature vector has been weighted, which provides a richer and more discriminative feature representation for the subsequent classification.
The processed features are then passed to the multilayer perceptron for further processing. The multilayer perceptron consists of multiple fully connected neural network layers; it learns a nonlinear representation of the input features and generates the classification result from that representation.
At the end of the multilayer perceptron, a softmax activation function is typically used to output the predicted probabilities, so that each input image block receives a clear classification result in probabilistic form.
In addition, to increase the robustness of the model and prevent overfitting, the classification module applies dropout and other regularization methods between or within the attention unit and the multilayer perceptron.
In one implementation, inputting the feature vector corresponding to each image block into the classification module, and determining the initial prediction probability matrix corresponding to the full-field digital slice through the classification module, specifically includes:
S231. Inputting the feature vector corresponding to each image block into the attention unit, determining the attention coefficient of each image block through the attention unit, and forming an attention feature matrix based on the attention coefficients and the feature vectors corresponding to the image blocks;
S232. Inputting the attention feature matrix into the multilayer perceptron, and determining the initial prediction probability matrix corresponding to the full-field digital slice through the multilayer perceptron.
Specifically, in step S231, after the feature vectors have been obtained, they are passed to the attention unit to determine the attention coefficient of each image block, so that the correlation between specific regions and their importance can be captured and reinforced. The attention coefficients give the model a mechanism for concentrating on the feature regions that contain key information.
In the implementation of the present application, each full-field digital slice is tiled into thousands of image blocks, and a Vision Transformer (ViT) network model is used to extract features from the image blocks. A feature vector is thus extracted for each of the thousands of image blocks of a slice, and each feature vector represents the content and structural characteristics of its image block.
To compute the attention scores, the feature vectors are first linearly transformed to obtain three new sets of vectors: queries (Query), keys (Key) and values (Value). For each query vector, the dot product with every key vector is computed and the results are passed through the softmax function to obtain the weight coefficients.
These weight coefficients are the attention scores, which represent the strength of the correlation between the current image block and all other image blocks. The attention scores are multiplied by the corresponding value vectors to obtain a weighted feature representation, also called the attention output. In this way, each image block receives a feature vector that is weighted by the contextual information of all other image blocks.
The weighted feature vectors form a new, weighted feature matrix, referred to as the "attention feature matrix". The attention feature matrix provides a richer and more discriminative feature representation, allowing the model in subsequent processing to focus on the regions that have a decisive influence on the classification decision.
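The attention computation described in step S231 can be sketched as follows. This is a minimal illustration using only the Python standard library, not the implementation of the present application: the projection matrices `Wq`, `Wk` and `Wv`, the toy dimensions and the division by the square root of the key dimension (the common scaled dot-product form, which the text does not state) are all assumptions.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def project(W, x):
    """Linear transform W @ x for a matrix stored as a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def self_attention(features, Wq, Wk, Wv):
    """Return (attention scores, attention feature matrix) for all image blocks."""
    Q = [project(Wq, f) for f in features]
    K = [project(Wk, f) for f in features]
    V = [project(Wv, f) for f in features]
    scale = math.sqrt(len(K[0]))  # assumption: scaled dot product, not stated in the text
    scores, outputs = [], []
    for q in Q:
        logits = [sum(a * b for a, b in zip(q, k)) / scale for k in K]
        w = softmax(logits)  # relevance of every block to the current block
        scores.append(w)
        # Attention output: value vectors summed with the attention scores as weights.
        outputs.append([sum(wj * v[c] for wj, v in zip(w, V)) for c in range(len(V[0]))])
    return scores, outputs


random.seed(0)
D_IN, D_K, N_BLOCKS = 8, 4, 5  # toy sizes; the text uses 768-dimensional ViT features


def random_matrix(rows, cols):
    """Hypothetical stand-in for learned weights or extracted features."""
    return [[random.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


features = random_matrix(N_BLOCKS, D_IN)
Wq, Wk, Wv = (random_matrix(D_K, D_IN) for _ in range(3))
scores, attention_features = self_attention(features, Wq, Wk, Wv)
```

Each row of `scores` sums to 1, and `attention_features` plays the role of the attention feature matrix, with one weighted vector per image block.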
Further, in step S232, after the attention feature matrix has been obtained, it is input into the multilayer perceptron to determine the prediction probability of the full-field digital slice.
The multilayer perceptron consists mainly of an input layer, several hidden layers and an output layer. Each layer contains multiple neurons, and each neuron is connected to all neurons of the previous layer.
The input layer receives the feature vectors produced by the attention mechanism. Specifically, if the feature vector is d-dimensional, the input layer has d neurons.
A hidden layer consists of multiple neurons and is used to extract and learn complex patterns in the input data. The neurons of a hidden layer are computed as:
$h_i = f(W_i \cdot h_{i-1} + b_i)$
where $h_i$ is the output of the i-th hidden layer, $h_{i-1}$ is the output of the (i-1)-th hidden layer or of the input layer, $W_i$ and $b_i$ are the weights and bias of the i-th hidden layer, and $f$ is a nonlinear activation function such as the rectified linear unit (ReLU), the logistic (sigmoid) function or the hyperbolic tangent function.
The output layer produces the prediction of the model. In a classification task, the number of neurons in the output layer usually equals the number of categories. The output layer can be expressed as:
$o = \mathrm{softmax}(W_i \cdot h_{\mathrm{last}} + b_i)$
where $o$ is the output of the output layer, that is, the predicted probability of each category; $h_{\mathrm{last}}$ is the output of the last hidden layer; $W_i$ and $b_i$ here denote the weights and bias of the output layer; and the softmax function converts the raw output of the model into probabilities.
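The two layer formulas above can be illustrated with a small forward pass. The sketch below uses only the Python standard library and assumes ReLU as the activation f; the layer sizes and weight values are made up for illustration and are not taken from the present application.

```python
import math


def relu(v):
    """Rectified linear unit applied element-wise."""
    return [max(0.0, x) for x in v]


def softmax(v):
    """Numerically stable softmax."""
    m = max(v)
    exps = [math.exp(x - m) for x in v]
    total = sum(exps)
    return [e / total for e in exps]


def dense(W, b, x):
    """Fully connected layer: W . x + b, with W stored as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def mlp_forward(x, hidden_layers, output_layer):
    """h_i = f(W_i . h_{i-1} + b_i) for each hidden layer, then a softmax output."""
    h = x
    for W, b in hidden_layers:
        h = relu(dense(W, b, h))
    W_out, b_out = output_layer
    return softmax(dense(W_out, b_out, h))


# Hypothetical toy network: 3 inputs -> 2 hidden units -> 3 classes.
x = [0.5, -1.0, 2.0]
hidden = [([[0.2, -0.1, 0.4], [0.7, 0.3, -0.5]], [0.0, 0.1])]
output = ([[1.0, -1.0], [0.5, 0.5], [-0.3, 0.8]], [0.0, 0.0, 0.0])
probs = mlp_forward(x, hidden, output)
```

The returned `probs` has one entry per category and sums to 1, matching the role of $o$ in the output-layer formula.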
In the implementation of the present application, each feature vector in the attention feature matrix is weighted by the attention score computed by the attention mechanism in step S231. This step can be understood as multiplying the feature vector of each image block by its importance within the whole slice, which yields a feature matrix that combines spatial relationships and content information.
Next, the attention scores of the image blocks obtained by the multilayer perceptron are multiplied by the corresponding feature matrix to obtain a slice-level feature matrix that reflects the weights of the individual image blocks.
The resulting weighted feature matrix is then fed into the multilayer perceptron. The multilayer perceptron further transforms this weighted feature matrix in its hidden layers and finally outputs, at the output layer, the prediction probability of the single full-field digital slice.
After the prediction probability of each full-field digital slice has been obtained, the prediction probabilities of all slices are averaged to obtain the prediction at the patient level. Taking the information of every slice into account yields a more comprehensive and robust prediction.
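The averaging step can be written in a few lines. In this sketch the function name and the probability values are illustrative assumptions, and the class order (Crohn's disease, intestinal tuberculosis, normal tissue) is chosen only for the example.

```python
def patient_level_probability(slice_probs):
    """Average the per-slice class probabilities of one patient."""
    if not slice_probs:
        raise ValueError("at least one slice is required")
    n_classes = len(slice_probs[0])
    return [sum(p[c] for p in slice_probs) / len(slice_probs) for c in range(n_classes)]


# Hypothetical slice-level outputs for one patient with three slices.
slice_probs = [
    [0.7, 0.2, 0.1],
    [0.5, 0.4, 0.1],
    [0.6, 0.3, 0.1],
]
patient_probs = patient_level_probability(slice_probs)
```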
S30. Based on the initial prediction probability matrix of each full-field digital slice, determining the prediction result of the target object, wherein the prediction result includes a Crohn's disease category, an intestinal tuberculosis category or a normal tissue category.
Specifically, after the initial prediction probability matrix of each full-field digital slice has been obtained, the initial prediction probability matrices are analyzed further to determine the specific prediction result of the target object.
In the specific implementation of the present application, each initial prediction probability matrix provides a probability distribution over three main categories: Crohn's disease, intestinal tuberculosis and normal tissue. For each full-field digital slice, the probabilities of the three categories are compared, and the category with the highest probability is taken as the prediction result for that slice.
To ensure the accuracy and robustness of the prediction, a voting mechanism or an ensemble strategy is adopted. For example, if a full-field digital slice is divided into multiple image blocks and the predictions of most image blocks point to "Crohn's disease", the final prediction result of that full-field digital slice is determined to be "Crohn's disease".
In addition, to further improve classification accuracy, a preset probability threshold can be added. For example, if the predicted probability of Crohn's disease exceeds a preset threshold such as 90%, that category is determined as the prediction result with greater confidence. Conversely, if the probabilities of all categories are low, the slice is marked as "to be reviewed" or "uncertain", and further review by an expert is recommended.
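The voting and threshold rules can be sketched as below. The 0.9 value follows the 90% example in the text; the lower `review` cut-off of 0.5, the "tentative" status for intermediate cases and the function names are hypothetical choices made for this illustration.

```python
from collections import Counter

CLASSES = ("Crohn's disease", "intestinal tuberculosis", "normal tissue")


def majority_vote(block_labels):
    """Slice-level label chosen as the most frequent image-block label."""
    return Counter(block_labels).most_common(1)[0][0]


def decide(probs, confident=0.9, review=0.5):
    """Return (label, status) for one probability vector over CLASSES."""
    best = max(range(len(probs)), key=lambda i: probs[i])
    if probs[best] >= confident:
        return CLASSES[best], "confident"
    if probs[best] < review:
        # All class probabilities are low: flag the slice for expert review.
        return "uncertain", "to be reviewed"
    return CLASSES[best], "tentative"


vote = majority_vote(["Crohn's disease", "Crohn's disease", "normal tissue"])
high = decide([0.93, 0.05, 0.02])
low = decide([0.40, 0.35, 0.25])
```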
To avoid false positive and false negative predictions, additional post-processing steps such as spatial continuity analysis or structural analysis can be introduced to ensure that the prediction results are continuous and consistent both globally and locally.
After the prediction result of each full-field digital slice has been obtained, a visual interface is provided for doctors or other relevant personnel. It displays the prediction results, the associated probability values and the abnormal regions to assist them in making the final diagnostic judgment.
In one implementation, determining the prediction result of the target object based on the initial prediction probability matrix of each full-field digital slice specifically includes:
S31. Weighting the initial prediction probability matrix of each full-field digital slice to obtain a target prediction probability matrix;
S32. Determining the prediction result of the target object based on the target prediction probability matrix.
Specifically, in step S31, after the initial prediction probability matrix of each full-field digital slice has been obtained, a weighting strategy can be applied to obtain a more accurate target prediction probability matrix and thereby further improve the predictive performance of the model.
For example, the weighting strategy is based on the following considerations:
Data quality weight: not all full-field digital slices are of equal quality. Some full-field digital slices have lower image quality because of differences in sampling, equipment or other factors. Each full-field digital slice can therefore be assigned a preset weight according to its quality (for example, on a weight scale of 0 to 1, a slice with high image sharpness receives a higher weight such as 0.8).
Context information weight: if a full-field digital slice is similar in content or features to the slices around it, its prediction result is considered more reliable. Such full-field digital slices can therefore be assigned a higher weight.
Model confidence weight: for each prediction, the model itself has an internal confidence score. Predictions about which the model is very confident are given a higher weight, and predictions about which the model is less certain are given a lower weight.
The above weights are combined to weight the initial prediction probability matrix of each full-field digital slice. Specifically, the weights can be used as weighting factors and multiplied by the initial prediction probability matrix to obtain the weighted prediction probability matrix.
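One way to combine the three weights is sketched below. Multiplying the per-slice factors together and dividing by the sum of the weights, so that the result is still a probability distribution, are assumptions of this sketch; the text only states that the weights are multiplied by the initial prediction probability matrix. All numeric values are illustrative.

```python
def target_probability(slice_probs, quality, context, confidence):
    """Weighted combination of per-slice class probabilities."""
    weights = [q * c * m for q, c, m in zip(quality, context, confidence)]
    total = sum(weights)
    if total <= 0:
        raise ValueError("the weights must sum to a positive value")
    n_classes = len(slice_probs[0])
    return [
        sum(w * p[c] for w, p in zip(weights, slice_probs)) / total
        for c in range(n_classes)
    ]


# Hypothetical example: the first slice is sharper, so it gets the larger quality weight.
slice_probs = [[0.8, 0.1, 0.1], [0.2, 0.7, 0.1]]
target = target_probability(
    slice_probs,
    quality=[0.8, 0.4],
    context=[1.0, 1.0],
    confidence=[1.0, 1.0],
)
```

With these values the first slice dominates, and `target` is approximately [0.6, 0.3, 0.1].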
Further, in step S32, after the target prediction probability matrix has been obtained, the system processes it further to determine the final prediction result of the target object.
In a practical implementation, the target prediction probability matrix provides a probability value for each image block or region, indicating the likelihood that the block or region contains the target object. This yields a fine-grained, region-level probability map that helps locate and identify the target more precisely.
To obtain a definite prediction from the probability matrix, a preset threshold is usually set. When the probability of a region exceeds the threshold, the region is judged to contain the target object; when the probability is below the threshold, the region is judged not to contain it. Choosing a suitable threshold is critical, and the threshold is usually determined by cross-validation to ensure the accuracy and robustness of the prediction.
Specifically, the choice of threshold depends on the application scenario and the characteristics of the data set. For example, in medical image analysis, where patient safety and the avoidance of missed diagnoses are important, a lower threshold tends to be chosen so that all lesion regions are detected, even at the cost of a higher false positive rate. Conversely, in other settings such as object detection, a higher threshold is chosen to avoid a large number of false alarms and improve prediction accuracy.
A common way to determine the optimal threshold is to plot the ROC (Receiver Operating Characteristic) curve and select as the threshold the point that gives the best trade-off between the true positive rate and the false positive rate.
Another approach is to use the F1 score, which is the harmonic mean of precision and recall, and to select the threshold that maximizes it. For example, in a deep learning model for intestinal nodule detection, the trained model outputs for each predicted region a probability between 0 and 1 that the region contains a nodule. To determine the optimal threshold, a cross-validation data set is used and the F1 score is computed for different thresholds. Suppose the four thresholds 0.3, 0.5, 0.7 and 0.9 are tried and the F1 score is highest at 0.5; then 0.5 can be selected as the threshold for deciding whether a nodule is present.
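The F1-based threshold search can be illustrated as follows. The candidate thresholds are those named in the example above; the probabilities and labels are made-up toy values, not results from the present application.

```python
def f1_at_threshold(probs, labels, threshold):
    """F1 score when regions with probability >= threshold are called positive."""
    tp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 0)
    fn = sum(1 for p, y in zip(probs, labels) if p < threshold and y == 1)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def best_threshold(probs, labels, candidates=(0.3, 0.5, 0.7, 0.9)):
    """Candidate threshold with the highest F1 score on the validation data."""
    return max(candidates, key=lambda t: f1_at_threshold(probs, labels, t))


# Toy validation data: predicted probabilities and ground-truth labels (1 = nodule).
probs = [0.95, 0.80, 0.60, 0.55, 0.40, 0.35, 0.20, 0.10]
labels = [1, 1, 1, 1, 0, 0, 0, 0]
chosen = best_threshold(probs, labels)
```

On this toy data the search selects 0.5, because it is the only candidate that separates the two groups without error.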
In the implementation of the present application, the training portion of the Crohn's disease and intestinal tuberculosis data sets is first input into the model described above to fine-tune its parameters, yielding a deep learning model that can distinguish well between Crohn's disease, intestinal tuberculosis and normal tissue. Each patient has a clinically established disease diagnosis, which serves as the label for that patient.
Because the classification task exists at two levels, the data use not only the diagnosis of each patient as the patient-level label but also a clinical assessment of each full-field digital slice, which provides the slice-level label.
A feature vector is extracted for each of the thousands of image blocks of every full-field digital slice. To obtain the slice-level prediction, the attention coefficients are first used to obtain the weight of each image block relative to the other image blocks, which gives the final feature matrix. The feature matrix is processed by the multilayer perceptron to obtain the prediction probability of the single full-field digital slice. The predictions of multiple full-field digital slices are then averaged to obtain the patient-level prediction.
To ensure the performance and generalization ability of the model, five-fold cross-validation is used to evaluate the overall predictive performance in the training cohort on the basis of the probability of each sample. The model is then validated further in an independent test cohort.
During training, the present application uses the Adam (Adaptive Moment Estimation) optimizer to update the network weights, with an initial learning rate of 0.0001, a weight decay of 0.0001 and 100 iterations. Because the data set is imbalanced, categories with lower frequency are deliberately assigned higher weights so that the model attends to every category equally, and the LDAM loss is used to compute the loss. This allows the model to produce stable and accurate predictions on imbalanced data.
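The text names the LDAM loss without giving its form. The sketch below follows the commonly cited label-distribution-aware margin formulation, in which class j receives a margin proportional to n_j^(-1/4), where n_j is the number of training samples of that class, and the margin is subtracted from the logit of the true class before the cross-entropy is computed. The class counts, the maximum margin of 0.5 and the optional class weights are hypothetical, and the logit scaling factor often used in practice is omitted.

```python
import math


def ldam_margins(class_counts, max_margin=0.5):
    """Per-class margins proportional to n_j ** -0.25, rescaled so the largest is max_margin."""
    raw = [1.0 / (n ** 0.25) for n in class_counts]
    scale = max_margin / max(raw)
    return [scale * r for r in raw]


def ldam_loss(logits, label, margins, class_weights=None):
    """Cross-entropy after subtracting the class margin from the true-class logit."""
    z = list(logits)
    z[label] -= margins[label]
    m = max(z)
    log_sum_exp = m + math.log(sum(math.exp(v - m) for v in z))
    loss = log_sum_exp - z[label]
    if class_weights is not None:
        loss *= class_weights[label]  # a higher weight for rarer classes
    return loss


# Hypothetical class counts: the third class is the rarest and gets the largest margin.
margins = ldam_margins([400, 100, 25])
logits = [1.2, 0.3, 0.9]
with_margin = ldam_loss(logits, 2, margins)
without_margin = ldam_loss(logits, 2, [0.0, 0.0, 0.0])
```

The margin makes the loss for a rare-class sample larger than plain cross-entropy, which pushes the decision boundary away from the rare class during training.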
In the embodiment of the present application, the method for detecting Crohn's disease and intestinal tuberculosis based on full-field digital slices further includes:
S41. For each full-field digital slice, obtaining the degree of similarity of each image block in the full-field digital slice to the other image blocks, so as to obtain the attention score of each image block;
S42. Forming an attention distribution map based on the attention scores, wherein different attention scores correspond to different image colors in the attention distribution map;
S43. Superimposing the attention distribution map on the full-field digital slice to obtain the attention heat map of the full-field digital slice.
Specifically, in step S41, when a full-field digital slice is processed, the degree of similarity between each image block and the other image blocks is key to ensuring that the model fully understands the relative importance of each region of the slice for the overall analysis. This similarity measures how the internal details of one image block relate to those of the other image blocks, and thus provides an attention score for each image block. The attention score represents how much attention the model pays to the image block and helps to analyze the pathological information more accurately.
To explain the relative importance that the model assigns to different regions at the slice level, the implementation of the present application adopts a specific quantification method. First, the full-field digital slice is cropped into multiple image blocks of 16x16 pixels. This size is chosen according to the size and distribution of pathological cells, so that every image block captures sufficient pathological information and contains information about pathological cells. In addition, the cropping is performed within the regions of pathological cells and is therefore fair, without any bias or preference.
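The cropping step can be sketched as a computation of block coordinates. The function name is hypothetical, and the sketch simply discards partial blocks at the right and bottom edges, which is an assumption; restricting the blocks to regions of pathological cells, as the text describes, would require a tissue mask that is not modelled here.

```python
def tile_coordinates(width, height, block=16):
    """Top-left (x, y) corners of all complete block x block tiles, row by row."""
    return [
        (x, y)
        for y in range(0, height - block + 1, block)
        for x in range(0, width - block + 1, block)
    ]


# A toy 64 x 48 pixel image yields a grid of 4 x 3 blocks.
coords = tile_coordinates(64, 48)
```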
Second, for each image block, its degree of similarity is quantified by computing the similarity between its feature vector and the feature vectors of the other image blocks. Specifically, the similarity is computed as the cosine similarity score between feature vectors. Based on the cosine similarity scores, an attention score is extracted for each image block, representing the importance of that image block relative to the other image blocks.
Specifically, extracting an attention score for each image block from the feature-vector similarity scores between image blocks is a way of reinforcing key information and reducing the influence of unimportant information on the analysis of the model. This process can usually be carried out in the following steps:
S411. Computing similarity scores: for a given image block A, similarity scores are computed between its feature vector and the feature vectors of all other image blocks. A commonly used similarity measure is the cosine similarity, which measures the cosine of the angle between two vectors: $\cos(A, B) = \frac{A \cdot B}{\lVert A \rVert \, \lVert B \rVert}$
where A and B are the feature vectors of the two image blocks.
S412. Normalizing the similarity scores: to keep the scores within a reasonable range, all similarity scores are normalized. This can be done with the softmax function: $\text{Attention-Score}(B) = \frac{\exp(\cos(A, B))}{\sum_{B'} \exp(\cos(A, B'))}$
where the attention score (Attention-Score) represents the importance of image block A relative to the other image blocks, and the sum runs over all image blocks B' other than A.
S413. Weighted averaging: once the normalized similarity scores have been obtained, a weighted-average feature vector is computed for each image block, where the weights are the normalized similarity scores between the selected image block and the other image blocks. The formula is:
$\text{Weighted-Feature-Vector}(A) = \sum_{B} \text{Attention-Score}(B) \times \text{Feature-Vector}(B)$
S414. Extracting the attention score: the weighted-average feature vector captures the importance of image block A relative to the other image blocks. Comparing it with the original feature vector reinforces this difference further and yields a more accurate attention score.
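Steps S411 to S413 can be sketched as follows with the Python standard library. The function names and toy feature vectors are assumptions, the feature vectors are assumed to be non-zero, and step S414 is not covered because the text does not specify how the comparison with the original feature vector is carried out.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors (S411)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def softmax(xs):
    """Numerically stable softmax used to normalize the similarities (S412)."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_feature_vector(index, features):
    """Return (normalized scores, weighted-average vector) for block `index` (S413)."""
    others = [f for j, f in enumerate(features) if j != index]
    sims = [cosine_similarity(features[index], f) for f in others]
    scores = softmax(sims)
    dim = len(features[index])
    weighted = [sum(s * f[c] for s, f in zip(scores, others)) for c in range(dim)]
    return scores, weighted


# Toy features: block 1 is identical to block 0, block 2 is orthogonal to it.
features = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
scores, weighted = weighted_feature_vector(0, features)
```

For block 0, the identical block receives the larger normalized score, so the weighted-average vector leans towards its feature vector.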
Specifically, in step S42, the attention mechanism plays a very important role in explaining the decision process of a deep learning model during training and inference. The attention scores show explicitly which parts of the input data the model attends to more, that is, which parts the model considers more important for its decision.
In the implementation of the present application, as shown in Figure 3, an attention distribution map is formed from the attention scores. The attention scores are converted to RGB colors with a diverging color map and displayed at their respective spatial positions, so that an observer can identify the different regions visually. Specifically, different attention scores correspond to different image colors in the attention distribution map, which shows intuitively which parts of the input image receive more attention from the model.
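The conversion of attention scores to colors can be sketched as below. The blue-white-red color map, the min-max normalization and the function names are assumptions made for this illustration; the text only states that a diverging color map is used.

```python
def normalize(scores):
    """Min-max normalize scores to [0, 1]; constant input maps to 0.5."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.5 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def diverging_rgb(score):
    """Map a score in [0, 1] to RGB: blue (low) -> white (middle) -> red (high)."""
    s = min(1.0, max(0.0, score))
    if s < 0.5:
        t = s / 0.5
        r, g, b = t, t, 1.0
    else:
        t = (s - 0.5) / 0.5
        r, g, b = 1.0, 1.0 - t, 1.0 - t
    return tuple(round(255 * c) for c in (r, g, b))


def attention_distribution_map(scores, grid_width):
    """Arrange per-block colors in rows matching the spatial layout of the blocks."""
    colors = [diverging_rgb(s) for s in normalize(scores)]
    return [colors[i:i + grid_width] for i in range(0, len(colors), grid_width)]


# Hypothetical attention scores for a 2 x 2 grid of image blocks.
color_map = attention_distribution_map([0.1, 0.4, 0.7, 0.9], grid_width=2)
```

Low-attention blocks appear blue and high-attention blocks appear red, so the most relevant regions stand out when the map is placed over the slice.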
在本申请的实现方式中,为了进一步确定高关注度的区域的具体性质,胃肠病理学家对它们进行了评估,并将它们分类为10组,包括巨细胞/大型肉芽肿/干酪样坏死、粘膜/腺体、炎性浸润、松散或致密的纤维结缔组织、增厚的肌层、黏膜下/浆膜下脂肪组织、增生性肌间丛、血管/红细胞、细胞外黏液和淋巴液。通过对上述区域的详细评估,为每个全视野数字切片提供了更为深入的诊断分析。In the implementation of this application, to further determine the specific nature of the areas of high concern, they were evaluated by a gastrointestinal pathologist and classified into 10 groups, including giant cells/large granulomas/caseating necrosis , mucosa/gland, inflammatory infiltrate, loose or dense fibrous connective tissue, thickened muscle layer, submucosal/subserosa adipose tissue, hyperplastic myenteric plexus, blood vessels/red blood cells, extracellular mucus and lymph fluid. Detailed evaluation of the above areas provides a more in-depth diagnostic analysis for each full-field digital slice.
Compared with traditional methods, the interpretable deep learning method based on full-field digital slices described above provides finer-grained results. In addition, it uses attention-based learning to automatically identify subregions of high diagnostic value and to classify full-field digital slices accurately, without relying on multivariate analysis or colonoscopy images as studies prior to this application did.
It is worth noting that the interpretable deep learning method based on full-field digital slices described above does not rely on information or annotations for any specific region of a full-field digital slice, because the diagnosis is made on the entire slice. This also shows that the model can provide fine-grained heat maps for interpretation without using class activation mapping or pixel-level annotations. Although the model was not specifically trained on normal tissue, the method can still effectively identify the regions most relevant to the prediction and show clinicians the relative contribution and importance of each tissue region of each full-field digital slice to the model's prediction.
Specifically, in step S43, after the initial prediction probability matrix has been determined, the distribution map produced by the attention mechanism can be combined with the full-field digital slice to show intuitively the regions the model attends to during classification, thereby generating an attention heat map.
First, the attention distribution map produced by the attention unit described above is obtained. The attention distribution map is a matrix with the same size as the input feature vector, and each value in the matrix represents the model's degree of attention to the corresponding region. The values lie between 0 and 1, with higher values indicating that the model attends more to that part when making its decision.
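The construction of such a matrix can be sketched as follows. This is only an illustrative sketch, not the method of this application: it assumes that each image block has one raw attention score and a (row, column) position on a block grid, and it assumes min-max scaling as the way to bring scores into the range 0 to 1. The names `normalise_scores` and `attention_grid` are hypothetical.

```python
from typing import List, Sequence, Tuple


def normalise_scores(scores: Sequence[float]) -> List[float]:
    """Min-max scale raw attention scores into [0, 1] (an assumed scaling)."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # All blocks received the same score; use a neutral mid value.
        return [0.5 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def attention_grid(
    coords: Sequence[Tuple[int, int]],
    scores: Sequence[float],
    rows: int,
    cols: int,
    fill: float = 0.0,
) -> List[List[float]]:
    """Place one scaled score per image block at its (row, col) grid position.

    Grid cells without an image block (for example background) keep `fill`.
    """
    grid = [[fill] * cols for _ in range(rows)]
    for (r, c), s in zip(coords, normalise_scores(scores)):
        grid[r][c] = s
    return grid


if __name__ == "__main__":
    # Three hypothetical image blocks on a 2 x 2 grid.
    grid = attention_grid([(0, 0), (0, 1), (1, 1)], [2.0, 4.0, 6.0], rows=2, cols=2)
    for row in grid:
        print(row)
```

Higher values in the resulting grid mark the blocks that would be rendered in warmer colors in the later visualization steps.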
To map the values to colors for visualization, a color-coding strategy is used. A common strategy is to represent attention intensity with a color gradient running from cool colors (for example blue, representing low attention) to warm colors (for example red, representing high attention).
Next, the color values are superimposed on the original full-field digital slice. Alpha (transparency) blending takes both the pixels of the original slice and the corresponding color values into account, which creates a visual overlay.
The result of the overlay is the "attention heat map", in which the depth and warmth of the colors show intuitively the regions the model attends to during classification. For example, if a region of the full-field digital slice appears bright red in the heat map, the model pays particular attention to that region and it plays a key role in the model's decision.
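The color coding and transparency blending described above can be sketched per pixel as follows. This is a minimal illustration under stated assumptions, not the implementation of this application: the blue-to-white-to-red ramp, the default `alpha` of 0.4, and the function names `score_to_rgb` and `blend` are all hypothetical choices.

```python
from typing import Tuple

RGB = Tuple[int, int, int]


def score_to_rgb(score: float) -> RGB:
    """Map an attention score in [0, 1] onto a cool-to-warm ramp.

    0.0 gives blue, 0.5 gives white, 1.0 gives red (an assumed diverging ramp).
    """
    s = min(max(score, 0.0), 1.0)  # clamp out-of-range scores
    if s < 0.5:
        t = s / 0.5  # 0 at blue, 1 at white
        return (round(255 * t), round(255 * t), 255)
    t = (s - 0.5) / 0.5  # 0 at white, 1 at red
    return (255, round(255 * (1 - t)), round(255 * (1 - t)))


def blend(pixel: RGB, colour: RGB, alpha: float = 0.4) -> RGB:
    """Alpha-blend a heat-map colour over a slice pixel.

    alpha = 0 keeps the original pixel; alpha = 1 shows only the colour.
    """
    r, g, b = (round((1 - alpha) * p + alpha * c) for p, c in zip(pixel, colour))
    return (r, g, b)


if __name__ == "__main__":
    tissue_pixel = (200, 150, 180)  # hypothetical pixel from a stained slice
    for score in (0.0, 0.5, 1.0):
        print(score, blend(tissue_pixel, score_to_rgb(score)))
```

Keeping `alpha` well below 1 leaves the underlying tissue morphology visible beneath the color, which is the purpose of overlaying rather than replacing the slice image.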
In addition, the attention heat map gives researchers and doctors insight into the model's decision process, and in medical image diagnosis it also gives doctors intuitive clues about the location and nature of the disease.
This embodiment identifies the disease type of the target object using a detection network model trained on full-field digital slices of Crohn's disease and full-field digital slices of intestinal tuberculosis, which allows the pathological images of a target patient to be differentiated quickly.
In summary, this embodiment provides a method for detecting Crohn's disease and intestinal tuberculosis based on full-field digital slices. The method includes: acquiring several full-field digital slices and determining several image blocks corresponding to each full-field digital slice, where the several full-field digital slices are full-field digital slices of the same target object; for each full-field digital slice, inputting the image blocks corresponding to that slice into a trained detection network model, and determining, through the detection network model, the initial prediction probability matrix corresponding to each full-field digital slice; and determining the prediction result for the target object based on the initial prediction probability matrices of the full-field digital slices, where the prediction result is the Crohn's disease category, the intestinal tuberculosis category, or the normal tissue category. With these technical means, after obtaining the patient's full-field digital slices and determining the image blocks corresponding to each slice, this application processes the image blocks with the trained detection network model to determine the initial prediction probability of each slice, and then determines the prediction probability over the full-field digital slices from the initial prediction probabilities. By integrating slice-level and patient-level classification results in this way, more accurate multi-level classification is achieved, which effectively reduces misdiagnosis and further improves the precision of treatment.
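The step from slice-level probabilities to a patient-level result can be sketched as follows. The aggregation rule is not stated in this passage, so the sketch assumes a simple average of the per-slice class probabilities followed by an arg-max; both that rule and the names `CLASSES` and `patient_prediction` are hypothetical and shown for illustration only.

```python
from typing import List, Sequence, Tuple

# Class order is an assumption made for this sketch.
CLASSES = ("crohns_disease", "intestinal_tuberculosis", "normal_tissue")


def patient_prediction(
    slice_probs: Sequence[Sequence[float]],
) -> Tuple[str, List[float]]:
    """Combine per-slice class probabilities into one patient-level result.

    Each inner sequence holds one slice's probabilities in CLASSES order.
    Assumed rule: average over slices, then pick the most probable class.
    """
    n = len(slice_probs)
    if n == 0:
        raise ValueError("at least one slice is required")
    mean = [sum(p[k] for p in slice_probs) / n for k in range(len(CLASSES))]
    best = max(range(len(CLASSES)), key=lambda k: mean[k])
    return CLASSES[best], mean


if __name__ == "__main__":
    # Three hypothetical slices from the same patient.
    probs = [
        [0.7, 0.2, 0.1],
        [0.5, 0.4, 0.1],
        [0.2, 0.6, 0.2],
    ]
    label, mean = patient_prediction(probs)
    print(label, [round(m, 3) for m in mean])
```

Averaging lets a single atypical slice be outvoted by the patient's other slices; other rules, such as majority voting or taking the maximum, would fit the same interface.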
Based on the above method for detecting Crohn's disease and intestinal tuberculosis based on full-field digital slices, this embodiment provides an apparatus for detecting Crohn's disease and intestinal tuberculosis based on full-field digital slices. As shown in Figure 4, the apparatus includes:
an acquisition module 100, configured to acquire several full-field digital slices and determine several image blocks corresponding to each full-field digital slice, where the several full-field digital slices are full-field digital slices of the same target object;
a control module 200, configured to control the trained detection network model to determine the initial prediction probability matrix corresponding to each full-field digital slice;
a determination module 300, configured to determine the prediction result for the target object based on the initial prediction probability matrices of the full-field digital slices, where the prediction result is the Crohn's disease category, the intestinal tuberculosis category, or the normal tissue category.
Based on the above method for detecting Crohn's disease and intestinal tuberculosis based on full-field digital slices, this embodiment provides a computer-readable storage medium. The computer-readable storage medium stores one or more programs, and the one or more programs can be executed by one or more processors to implement the steps of the method for detecting Crohn's disease and intestinal tuberculosis based on full-field digital slices described in the above embodiment.
Based on the above method for detecting Crohn's disease and intestinal tuberculosis based on full-field digital slices, this application also provides a terminal device. As shown in Figure 5, it includes at least one processor 20, a display screen 21, and a memory 22, and may also include a communications interface 23 and a bus 24. The processor 20, display screen 21, memory 22, and communications interface 23 can communicate with one another through the bus 24. The display screen 21 is configured to display a user guidance interface preset in the initial setting mode. The communications interface 23 can transmit information. The processor 20 can call logic instructions in the memory 22 to execute the method of the above embodiment.
In addition, when the logic instructions in the memory 22 are implemented in the form of software functional units and sold or used as an independent product, they may be stored in a computer-readable storage medium.
As a computer-readable storage medium, the memory 22 can be configured to store software programs and computer-executable programs, such as the program instructions or modules corresponding to the methods in the embodiments of the present disclosure. The processor 20 runs the software programs, instructions, or modules stored in the memory 22 to perform functional applications and data processing, that is, to implement the method of the above embodiment.
The memory 22 may include a program storage area and a data storage area. The program storage area may store an operating system and the application program required for at least one function; the data storage area may store data created through use of the terminal device, and so on. In addition, the memory 22 may include high-speed random access memory and may also include non-volatile memory, for example any of various media that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, or an optical disc. It may also be a transitory storage medium.
In addition, the specific process by which the processor loads and executes the instructions in the above storage medium and terminal device has been described in detail in the method above and is not repeated here.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be replaced with equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present application.
Claims (10)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202311452206.3A CN117392105A (en) | 2023-11-02 | 2023-11-02 | Detection method of Crohn's disease and intestinal tuberculosis based on full-field digital sectioning |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202311452206.3A CN117392105A (en) | 2023-11-02 | 2023-11-02 | Detection method of Crohn's disease and intestinal tuberculosis based on full-field digital sectioning |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN117392105A true CN117392105A (en) | 2024-01-12 |
Family
ID=89462902
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202311452206.3A Pending CN117392105A (en) | 2023-11-02 | 2023-11-02 | Detection method of Crohn's disease and intestinal tuberculosis based on full-field digital sectioning |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN117392105A (en) |
Cited By (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN119006849A (en) * | 2024-10-23 | 2024-11-22 | 江西医至初医学病理诊断管理有限公司 | Panoramic pathology scanning image descriptor generation method and electronic equipment |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US11681418B2 (en) | Multi-sample whole slide image processing in digital pathology via multi-resolution registration and machine learning | |
| Li et al. | A comprehensive review of computer-aided whole-slide image analysis: from datasets to feature extraction, segmentation, classification and detection approaches | |
| CN111524137B (en) | Cell identification counting method and device based on image identification and computer equipment | |
| US8379961B2 (en) | Mitotic figure detector and counter system and method for detecting and counting mitotic figures | |
| CN109977955B (en) | Cervical carcinoma pre-lesion identification method based on deep learning | |
| CN117015796A (en) | Method for processing tissue images and system for processing tissue images | |
| WO2012154216A1 (en) | Diagnosis support system providing guidance to a user by automated retrieval of similar cancer images with user feedback | |
| CN112990214A (en) | Medical image feature recognition prediction model | |
| CN105512612A (en) | SVM-based image classification method for capsule endoscope | |
| CN117994241B (en) | Gastric mucosa image analysis method and system for helicobacter pylori detection | |
| US20210209755A1 (en) | Automatic lesion border selection based on morphology and color features | |
| CN114080644A (en) | System and method for diagnosing small bowel cleanliness | |
| CN116993673A (en) | Method for evaluating thyroid image based on language model | |
| CN116309333A (en) | WSI image weak supervision pathological analysis method and device based on deep learning | |
| CN114648509A (en) | Thyroid cancer detection system based on multi-classification task | |
| CN113313680A (en) | Colorectal cancer pathological image prognosis auxiliary prediction method and system | |
| CN117392105A (en) | Detection method of Crohn's disease and intestinal tuberculosis based on full-field digital sectioning | |
| Qing et al. | MPSA: Multi-Position Supervised Soft Attention-based convolutional neural network for histopathological image classification | |
| KR20230095801A (en) | Artificial intelligence system and method on location cancerous region on digital pathology with customized resoluiont | |
| Kumar et al. | Enhanced breast cancer detection and classification via CAMR-Gabor filters and LSTM: A deep Learning-Based method | |
| CN120047416A (en) | Tumor CT image segmentation processing method and system | |
| CN119495422A (en) | A machine vision detection method based on deep learning | |
| Hussain et al. | Enhancing skin lesion Classification: A machine learning approach using KNN, XGBoost, and Random Forest | |
| Balkys et al. | Segmenting the eye fundus images for identification of blood vessels | |
| Kaur et al. | Kidney Tumor Detection and Classification Using Convolutional Neural Network Architecture |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| PB01 | Publication | ||
| SE01 | Entry into force of request for substantive examination | ||