
CN116630299A - A medical image processing method based on semi-supervised neural network - Google Patents


Info

Publication number
CN116630299A
CN116630299A CN202310714829.7A CN202310714829A CN116630299A CN 116630299 A CN116630299 A CN 116630299A CN 202310714829 A CN202310714829 A CN 202310714829A CN 116630299 A CN116630299 A CN 116630299A
Authority
CN
China
Prior art keywords
model
neural network
training
data
enhancement
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310714829.7A
Other languages
Chinese (zh)
Other versions
CN116630299B (en)
Inventor
陈一飞
黄凡丁
刘敏哲
林彬
秦飞巍
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Dianzi University
Original Assignee
Hangzhou Dianzi University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Dianzi University
Priority to CN202310714829.7A
Publication of CN116630299A
Application granted
Publication of CN116630299B
Legal status: Active
Anticipated expiration


Classifications

    • G06T 7/0012: Biomedical image inspection
    • G06N 3/0464: Convolutional networks [CNN, ConvNet]
    • G06N 3/0895: Weakly supervised learning, e.g. semi-supervised or self-supervised learning
    • G06T 7/11: Region-based segmentation
    • G06V 10/774: Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06V 10/82: Image or video recognition or understanding using neural networks
    • G16H 30/20: ICT specially adapted for handling medical images, e.g. DICOM, HL7 or PACS
    • G06T 2207/20081: Training; Learning
    • G06T 2207/20084: Artificial neural networks [ANN]
    • Y02T 10/40: Engine management systems


Abstract

The invention discloses a medical image processing method based on a semi-supervised neural network, comprising the following steps: S1, acquire an image dataset; S2, construct and train a DFCPS model, where the DFCPS model comprises a data augmentation strategy and a neural network, and the neural network comprises a backbone network, an ASPP module, an upsampling module, and a Softmax function. The backbone network is ResNet-50, with residual connections introduced to build a deep network; the ASPP module comprises four convolutional layers with atrous convolution rates of 1, 12, 24, and 36, respectively. S3, perform image segmentation with the trained neural network. The method trains the model and obtains satisfactory results by making rational use of unlabeled data and a small amount of labeled data, together with strong and weak data augmentation.

Description

A medical image processing method based on a semi-supervised neural network

Technical Field

The invention relates to the technical fields of data augmentation and semantic segmentation, and in particular to a medical image processing method based on a semi-supervised neural network.

Background

In medicine, to comprehensively diagnose the stage of a patient's disease, determine whether an organ is diseased, and propose a corresponding treatment plan, the examinations commonly performed on patients are DR, CT, or MRI (nuclear magnetic resonance) scans. The resulting images are then handed over to a doctor, who determines the pathological state of the patient's internal organs by visual inspection. Before the rise of deep learning, this process was typically carried out through direct observation by experienced doctors.

Although visual diagnosis by doctors has a low misdiagnosis rate and high accuracy, training doctors who meet these requirements costs enormous amounts of time and money, and doctors, being human, are affected by factors such as mood swings and fatigue from long working hours, which makes their diagnostic accuracy unstable. Therefore, to reduce the misdiagnosis rate, medical image segmentation that assists doctors in diagnosis has emerged.

As an important tool in semi-supervised learning, neural networks have achieved remarkable results in medical image segmentation. They can learn complex feature representations from large-scale image data and be optimized through end-to-end training. However, in medical image segmentation, owing to the scarcity of annotated data and the complexity of the models, neural networks supervised with only limited labeled data struggle to reach satisfactory performance. Neural networks that address this problem are therefore urgently needed.

Self-training for medical images is a semi-supervised learning method that exploits unlabeled data through continuous iterative training to improve model performance and thereby reduce the amount of labeled data required. Self-training tends to suffer from noisy pseudo-labels, and its performance depends on the quality and reliability of those pseudo-labels; confidence and noise must therefore be considered carefully when selecting pseudo-labels. Because anatomical structures are similar across patients, labeled images can provide strong reference points for matching and for transferring information to unlabeled images.

The core idea of consistency learning is that when the original sample is perturbed to varying degrees, for example with Gaussian noise, color changes, or random rotations, the model should still produce an output similar to that for the original sample. Through consistency learning, the model obtains additional training signals from unlabeled data and learns more robust and generalizable feature representations. However, if the model is poorly optimized and provides incorrect supervision, the risk of soft labels conflicting with the true labels becomes high.

Solving these two problems could be an effective breakthrough for improving the accuracy of medical image segmentation and recognition.

Summary of the Invention

To address the deficiencies of the prior art, a medical image processing method based on a semi-supervised neural network is proposed. By making rational use of unlabeled data together with a small amount of labeled data, and by applying strong and weak data augmentation, the model is trained and achieves satisfactory results.

To solve the above technical problems, the technical solution of the present invention is as follows:

A medical image processing method based on a semi-supervised neural network, comprising the following steps:

S1. Acquire an image dataset.

S2. Construct and train a DFCPS model. The DFCPS model comprises a data augmentation strategy and a neural network, and the neural network comprises a backbone network, an ASPP module, an upsampling module, and a Softmax function.

The backbone network is ResNet-50, with residual connections introduced to build a deep network. The ASPP module comprises four convolutional layers with atrous (dilated) convolution rates of 1, 12, 24, and 36, respectively.

The DFCPS model is trained as follows. First, the data augmentation strategy applies two data augmentations of different strength to the original sample X, generating two groups of augmented samples: one group of strongly augmented samples and one group of weakly augmented samples. The strongly and weakly augmented samples are grouped and combined, and the combined samples enter four neural networks for training; the loss value is minimized by continuously optimizing the model parameters.
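The grouping step above can be sketched as follows. This is a minimal NumPy illustration, not the patent's implementation: the `weak_augment`/`strong_augment` functions and the pairing of samples for the four networks are simplified stand-ins chosen for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

def weak_augment(x):
    # Weak augmentation: small additive noise only.
    return x + rng.normal(0.0, 0.01, x.shape)

def strong_augment(x):
    # Strong augmentation: heavier noise plus a random "cutout" patch.
    y = x + rng.normal(0.0, 0.1, x.shape)
    r = rng.integers(0, x.shape[0] - 4)
    c = rng.integers(0, x.shape[1] - 4)
    y[r:r + 4, c:c + 4] = 0.0
    return y

# The original sample X is augmented twice at each strength,
# giving one group of strong and one group of weak samples.
X = rng.random((16, 16))
weak_group = [weak_augment(X) for _ in range(2)]
strong_group = [strong_augment(X) for _ in range(2)]

# Pair strong and weak samples; each combined pair would then be
# fed to one of the four networks for training.
pairs = [(s, w) for s in strong_group for w in weak_group]
assert len(pairs) == 4
```

The pairing produces four strong/weak combinations from two augmentations per strength, matching the four-network arrangement described above.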

During training, the model minimizes the loss value by continuously optimizing its parameters, guarding against overfitting and underfitting, so as to improve performance and accuracy. The loss design of the present invention involves two key loss functions over the whole training process: the supervised loss L_s and the cross pseudo supervision loss L_cps.

In summary, the loss function over the whole training process mainly comprises the supervised loss and the cross pseudo supervision loss. The supervised loss guides the network's learning through the comparison between the predictions for strongly augmented samples and the pseudo-labels of weakly augmented samples. The cross pseudo supervision loss, by comparing the pseudo-segmentation maps generated between different groups, pushes the network toward more consistent segmentation results overall. This training strategy helps improve model performance and lets the network adapt better to different groups of data and labels. The strong/weak augmentation design borrows the idea of FixMatch and extends it to the medical image label segmentation task. However, particular attention must be paid to the reliability, accuracy, and cleanliness of the pseudo-labels, since their quality directly affects the performance and generalization ability of the model.

To ensure that the quality of the pseudo-labels is sufficiently high, the present invention introduces the concept of a confidence threshold, defined here as μ. By setting a confidence threshold, the model can filter for high-quality pseudo-labels and thereby avoid a negative impact on training. Specifically, the confidence of each pseudo-label is compared against the confidence threshold. If a pseudo-label's confidence is close to the threshold, i.e. around 0.5, it can be ignored, because such a pseudo-label may not be reliable enough; only the parts of the pseudo-segmentation map whose confidence exceeds the threshold μ participate in the loss computation. For strongly augmented samples, the output predictions and the pseudo-segmentation maps of the corresponding weakly augmented target samples that exceed the confidence threshold likewise enter a cross-entropy loss.
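The confidence-threshold filtering can be sketched in NumPy as below. The function name `masked_pseudo_label_loss` and the value μ = 0.75 are illustrative assumptions; the patent does not fix a specific μ here.

```python
import numpy as np

def masked_pseudo_label_loss(weak_probs, strong_probs, mu=0.75):
    """Cross-entropy between strong-branch predictions and weak-branch
    pseudo-labels, counting only pixels whose weak-branch confidence
    exceeds the threshold mu (low-confidence pixels are ignored)."""
    pseudo = np.argmax(weak_probs, axis=-1)   # hard pseudo-labels
    conf = np.max(weak_probs, axis=-1)        # pseudo-label confidence
    mask = conf > mu                          # keep only confident pixels
    if not mask.any():
        return 0.0
    # Probability the strong branch assigns to each kept pseudo-label.
    picked = strong_probs[mask, pseudo[mask]]
    return float(-np.mean(np.log(picked + 1e-12)))

# Two "pixels", two classes: the second pixel's confidence (0.55) is
# near 0.5, so it is filtered out and only the first contributes.
weak = np.array([[0.9, 0.1], [0.55, 0.45]])
strong = np.array([[0.8, 0.2], [0.6, 0.4]])
loss = masked_pseudo_label_loss(weak, strong, mu=0.75)
```

Only the pixel whose pseudo-label confidence exceeds μ enters the cross-entropy, which is exactly the filtering behavior described above.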

S3. Use the trained neural network for image segmentation.

S3-1. Take the preprocessed dataset as input and obtain feature maps through the backbone network.

S3-2. Take the obtained feature maps as input and perform multi-scale comprehensive feature extraction through the multiple parallel branches of the ASPP module.

S3-3. Refine the obtained comprehensive feature representation with upsampling to restore data detail, map the model's output to per-class probabilities with the Softmax function, and then select the most probable class as the prediction result, generating high-confidence prediction samples.

Preferably, in step S1 the image data are preprocessed with the data augmentation strategy.

Preferably, the data augmentation strategy includes random rotation, random brightness and contrast adjustment, and random translation.
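A minimal NumPy sketch of one draw from this augmentation strategy follows. The concrete parameter ranges, the 90-degree rotation stand-in, and the roll-based translation are simplifying assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(42)

def augment(img):
    """One draw from the strategy: random rotation, random brightness
    and contrast adjustment, and random translation."""
    # Random rotation by a multiple of 90 degrees (a simple stand-in
    # for arbitrary-angle rotation).
    img = np.rot90(img, k=int(rng.integers(0, 4)))
    # Random contrast (multiplicative) and brightness (additive) jitter.
    img = img * rng.uniform(0.8, 1.2) + rng.uniform(-0.1, 0.1)
    # Random translation via roll (border handling simplified).
    img = np.roll(img, shift=int(rng.integers(-3, 4)), axis=0)
    img = np.roll(img, shift=int(rng.integers(-3, 4)), axis=1)
    return np.clip(img, 0.0, 1.0)

img = rng.random((32, 32))
aug = augment(img)
assert aug.shape == img.shape
```

Weak and strong variants of the strategy would simply draw the jitter magnitudes from narrower or wider ranges.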

Preferably, in step S2, during training, the neural networks used for strong and weak augmentation within each group share parameters and weights; within each group, the pseudo-label generated from the network's prediction on the weakly augmented sample serves as the prediction target for the corresponding strongly augmented sample.

Preferably, in step S2, a supervised loss L_s is introduced during training to use the information of the strongly and weakly augmented samples to guide the learning of the DFCPS model. Specifically, the supervised loss is computed from the difference between the prediction the neural network produces for a strongly augmented sample and the pseudo-label of the corresponding weakly augmented sample, and the model parameters are iterated accordingly.

In the above technical solution, the supervised loss L_s is produced by taking the pseudo-labels generated from weakly augmented samples as the targets for strongly augmented samples. Specifically, the supervised loss is computed from the difference between the predictions the neural network produces for strongly augmented samples and the pseudo-labels of the corresponding weakly augmented samples. The purpose is to let the network learn the mapping from weakly augmented samples to strongly augmented samples, improving the model's prediction accuracy on strongly augmented samples. Minimizing the supervised loss encourages the network to learn the correct segmentation targets.

Preferably, in step S2, a cross pseudo supervision loss L_cps is introduced into the training design so that the pseudo-labels generated by different groups of weakly supervised samples constrain one another.

In the above technical solution, the cross pseudo supervision loss L_cps is used to constrain the differences between the pseudo-segmentation maps generated by different groups. The design treats the pseudo-labels generated by each group's weakly supervised samples as training targets for the other groups' weakly supervised samples, and uses the cross-entropy loss function to compute the loss between pseudo-labels. Through this cross pseudo supervision, the pseudo-labels of different groups influence and correct one another, improving overall segmentation consistency.
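The mutual-constraint idea can be sketched for two parallel networks as follows; this is a toy NumPy illustration (the function names and two-pixel example are assumptions), showing each network's hard pseudo-label supervising the other network's soft prediction.

```python
import numpy as np

def ce(probs, labels):
    # Pixel-wise cross-entropy of soft predictions against hard labels.
    return float(-np.mean(np.log(probs[np.arange(len(labels)), labels] + 1e-12)))

def cps_loss(p1, p2):
    """Cross pseudo supervision: each network's argmax pseudo-label is
    used as the training target for the other network's prediction."""
    y1 = np.argmax(p1, axis=-1)   # pseudo-labels from network 1
    y2 = np.argmax(p2, axis=-1)   # pseudo-labels from network 2
    return ce(p1, y2) + ce(p2, y1)

# Two "pixels", two classes, predictions from two networks.
p1 = np.array([[0.7, 0.3], [0.4, 0.6]])
p2 = np.array([[0.6, 0.4], [0.2, 0.8]])
loss = cps_loss(p1, p2)
assert loss > 0.0
```

Because the supervision runs in both directions, gradients flow into both networks, which is what pushes their pseudo-segmentation maps toward agreement.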

Preferably, the supervised loss L_s is given by the cross-entropy loss function, with normal supervised learning performed on the labeled samples:

L_s = (1 / |D_l|) Σ_{X ∈ D_l} (1 / S) Σ_i l_ce(p_i, y_i)    (1)

where D_u denotes the set of unlabeled samples, D_l the set of labeled samples, S the area of the input image (height × width), p_i the corresponding confidence vector, y_i the ground truth, and l_ce the cross-entropy loss function.
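A direct NumPy transcription of this supervised loss over labeled samples follows; the batch size, class count, and function name are illustrative assumptions.

```python
import numpy as np

def supervised_loss(batch_probs, batch_labels):
    """L_s: mean over labeled images of the per-pixel cross-entropy,
    normalized by the image area S (height * width)."""
    total = 0.0
    for probs, labels in zip(batch_probs, batch_labels):
        S = labels.size                    # image area S = height * width
        flat_p = probs.reshape(S, -1)      # per-pixel class probabilities
        flat_y = labels.reshape(S)         # per-pixel ground truth
        picked = flat_p[np.arange(S), flat_y]
        total += -np.sum(np.log(picked + 1e-12)) / S
    return total / len(batch_probs)

# One 2x2 labeled image, 3 classes, uniform predictions: every pixel
# contributes -log(1/3), so L_s = log(3).
probs = [np.full((2, 2, 3), 1.0 / 3.0)]
labels = [np.zeros((2, 2), dtype=int)]
ls = supervised_loss(probs, labels)
```

The outer mean over D_l and the inner 1/S normalization mirror the two sums in the formula above.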

Preferably, the cross pseudo supervision loss is defined via L_cps^l and L_cps^u, where L_cps^l denotes the supervision loss on labeled data and L_cps^u the supervision loss on unlabeled data:

L_cps^l = (1 / |D_l|) Σ_{X ∈ D_l} (1 / S) Σ_i [ l_ce(p_1i, y_2i) + l_ce(p_2i, y_1i) ]    (2)

L_cps^u = (1 / |D_u|) Σ_{X ∈ D_u} (1 / S) Σ_i [ l_ce(p_1i, y_2i) + l_ce(p_2i, y_1i) ]    (3)

where p_1i and p_2i are the confidence vectors output by the two parallel networks for pixel i, and y_1i and y_2i are the corresponding pseudo-labels. The cross pseudo supervision loss can therefore be defined as

L_cps = L_cps^l + L_cps^u    (4)

The total loss is then defined accordingly, where ω is a weight:

Loss = L_s + ω L_cps    (5)

In the above technical solution, an initial model is first trained with labeled samples carrying real labels. Next, this initial model predicts on the unlabeled samples, and the predictions are attached to those samples as pseudo-labels. The unlabeled samples with pseudo-labels and the labeled samples with real labels are then merged into an expanded training set. Finally, the model is retrained on the expanded training set. Through repeated iterations, the model can gradually exploit the information in the unlabeled data to improve performance.
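The four-step self-training cycle just described can be sketched with a deliberately tiny stand-in model; the 1-D threshold classifier and the toy data below are assumptions chosen only to make the loop concrete.

```python
import numpy as np

def fit(xs, ys):
    # "Train": place the threshold midway between the two class means.
    mean0 = np.mean([x for x, y in zip(xs, ys) if y == 0])
    mean1 = np.mean([x for x, y in zip(xs, ys) if y == 1])
    return (mean0 + mean1) / 2.0

def predict(threshold, xs):
    return [int(x > threshold) for x in xs]

labeled_x, labeled_y = [0.0, 0.2, 0.8, 1.0], [0, 0, 1, 1]
unlabeled_x = [0.1, 0.15, 0.85, 0.9]

model = fit(labeled_x, labeled_y)              # 1. initial model on real labels
for _ in range(3):                             # repeated iterations
    pseudo_y = predict(model, unlabeled_x)     # 2. pseudo-label unlabeled data
    xs = labeled_x + unlabeled_x               # 3. merge into expanded set
    ys = labeled_y + pseudo_y
    model = fit(xs, ys)                        # 4. retrain on expanded set
```

Each pass through the loop corresponds to one self-training iteration: predict, merge, retrain.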

Preferably, the specific method of step S3-1 is: using the trained DFCPS model, the preprocessed dataset is taken as input to the backbone network ResNet-50, which uses pooling layers and fully connected layers for feature extraction and classification. A deep network is built by introducing residual connections. Residual connections allow information to skip some layers directly, making it easier for the network to learn identity mappings and avoiding the degradation problem of deep networks. The structure of ResNet-50 is relatively deep, containing 50 convolutional layers organized into multiple residual blocks. Each residual block consists of two convolutional layers, one of which reduces the size of the feature map while the other keeps the feature map size unchanged.
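The identity-mapping property of residual connections can be shown with a minimal NumPy sketch, in which two linear maps stand in for the block's two convolutional layers (a simplifying assumption, not the ResNet-50 implementation):

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, w1, w2):
    """Simplified residual block: two linear maps stand in for the two
    convolutional layers; the input x is added back before the final
    activation (the residual / identity connection)."""
    out = relu(x @ w1)        # first "convolution" + activation
    out = out @ w2            # second "convolution"
    return relu(out + x)      # skip connection: add the input back

rng = np.random.default_rng(1)
x = rng.random((4, 8))
# With near-zero weights the block's residual branch contributes almost
# nothing, so the output stays close to the input: the identity mapping
# that residual connections make easy to learn.
w1 = rng.normal(0.0, 1e-3, (8, 8))
w2 = rng.normal(0.0, 1e-3, (8, 8))
y = residual_block(x, w1, w2)
assert np.allclose(y, x, atol=1e-2)
```

Because the block reduces to the identity when its weights are small, gradients can always flow through the skip path, which is why very deep stacks of such blocks avoid degradation.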

Preferably, the specific method of step S3-2 is: after the corresponding output is obtained, the samples enter the ASPP module, which uses multiple parallel branches to perform multi-scale feature extraction and enlarges the receptive field with atrous (dilated) convolution. The ASPP module makes full use of the context information in the input feature map, helps the model better understand the semantic relations between different positions in the image, and can be flexibly embedded into different network architectures. Each branch adopts a different sampling rate (also called atrous rate or dilation rate), which determines the branch's receptive field on the input feature map; the sampling rate is positively correlated with the receptive field. By using strongly and weakly augmented samples together with the ASPP module, the designed neural network can capture global and local context at different scales, improving performance on the image segmentation task.
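How the dilation rate enlarges the receptive field can be shown with a 1-D NumPy sketch (a simplification: ASPP uses 2-D convolutions; the signal length and kernel here are illustrative assumptions):

```python
import numpy as np

def dilated_conv1d(x, kernel, rate):
    """1-D dilated ('atrous') convolution: the kernel taps are spaced
    `rate` samples apart, enlarging the receptive field without adding
    any parameters."""
    k = len(kernel)
    span = (k - 1) * rate + 1          # receptive field of one output value
    out = np.empty(len(x) - span + 1)
    for i in range(len(out)):
        out[i] = sum(kernel[j] * x[i + j * rate] for j in range(k))
    return out, span

x = np.arange(40, dtype=float)
kernel = [1.0, 1.0, 1.0]               # 3-tap kernel, like a 3x3 conv row
y1, span1 = dilated_conv1d(x, kernel, rate=1)    # receptive field 3
y12, span12 = dilated_conv1d(x, kernel, rate=12) # receptive field 25
assert span1 == 3 and span12 == 25
```

With the same 3-tap kernel, raising the rate from 1 to 12 widens the receptive field from 3 to 25 samples, which is the mechanism the ASPP branches with rates 1, 12, 24, and 36 exploit in parallel.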

Preferably, the specific method of step S3-3 is: after convolution, batch normalization, and ReLU activation, the outputs of all atrous convolution branches and the spatial pyramid pooling branch are concatenated in parallel to form a comprehensive feature representation. A 1x1 convolutional layer is then applied to reduce the dimensionality of the input features, shrinking the number of channels; this lowers the computational burden, reduces computational complexity, and extracts more abstract features. Next, the comprehensive feature representation is upsampled to restore data detail and improve image quality; upsampling is a processing technique that raises low-resolution or low-frequency data to a higher resolution or frequency. The upsampled comprehensive features are concatenated with the convolved low-level features to form a still more comprehensive representation. After the upsampling and a final convolution, the convolutional layers apply nonlinear transformations and feature extraction to the input features to produce higher-level feature representations. These operations help capture more complex patterns and structures in the image and provide more discriminative features for the final classification or prediction task. The model's output is then mapped to per-class probabilities by the Softmax function, and the most probable class can be selected as the prediction result, generating high-confidence prediction samples.
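The final probability-and-prediction step can be sketched as follows; the 2x2 image, three classes, and the logit values are illustrative assumptions.

```python
import numpy as np

def softmax(logits, axis=-1):
    # Numerically stable softmax: shift by the max before exponentiating.
    z = logits - np.max(logits, axis=axis, keepdims=True)
    e = np.exp(z)
    return e / np.sum(e, axis=axis, keepdims=True)

# Per-pixel logits for a 2x2 image with 3 classes.
logits = np.array([[[2.0, 0.1, 0.1], [0.1, 3.0, 0.2]],
                   [[0.5, 0.4, 4.0], [1.0, 1.0, 1.0]]])
probs = softmax(logits)              # map model outputs to class probabilities
pred = np.argmax(probs, axis=-1)     # most probable class per pixel
conf = np.max(probs, axis=-1)        # confidence of that prediction

assert probs.shape == (2, 2, 3)
assert np.allclose(probs.sum(axis=-1), 1.0)
```

The per-pixel confidence computed here is also what the confidence threshold μ of the training stage would be compared against.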

The present invention has the following features and beneficial effects:

The present invention trains the model and obtains satisfactory results by making rational use of unlabeled data and a small amount of labeled data, together with strong and weak data augmentation. First, the original sample X undergoes two data augmentations of different strength, generating two groups of augmented samples: one group of strongly augmented samples and one group of weakly augmented samples. The strongly and weakly augmented samples are grouped and combined, and the combined samples enter four different neural networks F(θ_n) for training; the information from the strongly and weakly augmented samples is used to guide the model's learning process and improve its performance.

The backbone network used in the present invention is ResNet-50, which mainly addresses the vanishing- and exploding-gradient problems in training deep neural networks by introducing residual connections to build a deep network. Residual connections allow information to skip some layers directly, making it easier for the network to learn identity mappings and avoiding the degradation problem of deep networks. The structure of ResNet-50 is relatively deep, containing 50 convolutional layers organized into multiple residual blocks; each residual block consists of two convolutional layers, one of which reduces the size of the feature map while the other keeps it unchanged. In addition, ResNet-50 uses pooling layers and fully connected layers for feature extraction and classification. Its structure is well suited to large-scale image classification tasks and has strong expressive and generalization ability. It has been widely applied to various computer vision tasks, such as image classification, object detection, and semantic segmentation, and has achieved remarkable results in many of them.

Brief Description of the Drawings

To illustrate the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings required for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the present invention; for those of ordinary skill in the art, other drawings can be obtained from them without creative effort.

Fig. 1 is a network architecture diagram of the DFCPS model in an embodiment of the present invention.

Fig. 2 is a structural diagram of the neural network in the DFCPS model.

Fig. 3 is a design diagram of the proposed loss function.

Fig. 4 is a loss trend chart of an embodiment of the present invention.

Fig. 5 is an mIOU curve of an embodiment of the present invention.

Fig. 6 is a comparison diagram of segmentation results of an embodiment of the present invention.

Fig. 7 is an mIOU comparison curve of an embodiment of the present invention.

Detailed Description

It should be noted that, where no conflict arises, the embodiments of the present invention and the features in the embodiments can be combined with each other.

在本发明的描述中,需要理解的是,术语“中心”、“纵向”、“横向”、“上”、“下”、“前”、“后”、“左”、“右”、“竖直”、“水平”、“顶”、“底”、“内”、“外”等指示的方位或位置关系为基于附图所示的方位或位置关系,仅是为了便于描述本发明和简化描述,而不是指示或暗示所指的装置或元件必须具有特定的方位、以特定的方位构造和操作,因此不能理解为对本发明的限制。此外,术语“第一”、“第二”等仅用于描述目的,而不能理解为指示或暗示相对重要性或者隐含指明所指示的技术特征的数量。由此,限定有“第一”、“第二”等的特征可以明示或者隐含地包括一个或者更多个该特征。在本发明的描述中,除非另有说明,“多个”的含义是两个或两个以上。In describing the present invention, it should be understood that the terms "center", "longitudinal", "transverse", "upper", "lower", "front", "rear", "left", "right", " The orientations or positional relationships indicated by "vertical", "horizontal", "top", "bottom", "inner" and "outer" are based on the orientations or positional relationships shown in the drawings, and are only for the convenience of describing the present invention and Simplified descriptions, rather than indicating or implying that the device or element referred to must have a particular orientation, be constructed and operate in a particular orientation, and thus should not be construed as limiting the invention. In addition, the terms "first", "second", etc. are used for descriptive purposes only, and should not be understood as indicating or implying relative importance or implicitly specifying the quantity of the indicated technical features. Thus, a feature defined as "first", "second", etc. may expressly or implicitly include one or more of that feature. In the description of the present invention, unless otherwise specified, "plurality" means two or more.

在本发明的描述中,需要说明的是,除非另有明确的规定和限定,术语“安装”、“相连”、“连接”应做广义理解,例如,可以是固定连接,也可以是可拆卸连接,或一体地连接;可以是机械连接,也可以是电连接;可以是直接相连,也可以通过中间媒介间接相连,可以是两个元件内部的连通。对于本领域的普通技术人员而言,可以通过具体情况理解上述术语在本发明中的具体含义。In the description of the present invention, it should be noted that unless otherwise specified and limited, the terms "installation", "connection" and "connection" should be understood in a broad sense, for example, it can be a fixed connection or a detachable connection. Connected, or integrally connected; it may be mechanically connected or electrically connected; it may be directly connected or indirectly connected through an intermediary, and it may be the internal communication of two components. Those of ordinary skill in the art can understand the specific meanings of the above terms in the present invention based on specific situations.

The present invention provides a medical image processing method based on a semi-supervised neural network, comprising the following steps:

S1. Acquire an image data set.

Two data sets are used in this embodiment: the PASCAL VOC 2012 data set and the Kvasir-SEG data set.

PASCAL VOC 2012 data set: PASCAL VOC 2012 is a standard object-centric semantic segmentation data set consisting of more than 13,000 images covering 20 object classes and 1 background class. The standard training, validation and test sets consist of 1464, 1449 and 1456 images, respectively [3]. It covers a wide range of real-world scenes and object categories such as people, vehicles, animals and furniture. The images come from a variety of sources, including Internet images, annotated images and contributions from professional photographers. Each image carries pixel-level annotations marking the position and category of the objects it contains: for object detection each object is annotated with a bounding box, and for semantic segmentation each pixel is labeled with the category it belongs to. This design uses the data set as a pre-training set to obtain pre-training weights.

The Kvasir-SEG data set is based on the earlier Kvasir data set, the first multi-class data set for gastrointestinal (GI) disease detection and classification. It contains a series of endoscopic images, including gastroscopy and colonoscopy images, covering a variety of common gastrointestinal conditions such as polyps, ulcers and cancer. Each image is provided with pixel-level segmentation labels marking the boundaries of the lesion regions. The original Kvasir [6] data set contains 8,000 GI tract images from 8 categories, with 1,000 images per category. The Kvasir-SEG [6] segmented polyp data set replaces 13 polyp images with new images to improve data set quality. This experiment uses the data set of 1,000 intestinal polyp images, and the model is used to identify the specific lesion regions of intestinal polyps.

Following the partition protocol of the CPC model, the Kvasir-SEG data set is randomly divided into two groups: one group contains 1/2, 1/4, 1/8 or 1/16 of the data as labeled data, and the remaining data form the unlabeled group, used to simulate label scarcity. During evaluation, the baseline models and the proposed model are trained and tested with the same backbone network so as to measure model performance when annotated data are scarce. This setting simulates, as closely as possible, the performance differences of the evaluated models in the medical imaging domain when annotations are lacking.
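The partition protocol above can be sketched as follows (a minimal illustration; the file names and the seed are invented for the example):

```python
import random

def split_labeled_unlabeled(samples, labeled_fraction, seed=0):
    """Randomly split a data set into a labeled group containing the given
    fraction of samples and an unlabeled group holding the remainder."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    n_labeled = int(len(shuffled) * labeled_fraction)
    return shuffled[:n_labeled], shuffled[n_labeled:]

# e.g. the 1000-image Kvasir-SEG set at 1/8 label coverage:
images = [f"img_{i:04d}.png" for i in range(1000)]
labeled, unlabeled = split_labeled_unlabeled(images, 1 / 8)
print(len(labeled), len(unlabeled))  # 125 875
```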

S2. Construct and train the DFCPS model. As shown in FIG. 1, the DFCPS model comprises a data augmentation strategy and a neural network.

Further, as shown in FIG. 2, the neural network comprises a backbone network, an ASPP module, an upsampling module and a Softmax function.

The backbone network adopts ResNet-50 and introduces residual connections to build a deep network. The ASPP module comprises four convolutional layers whose atrous (dilated) convolution rates are 1, 12, 24 and 36, respectively.
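For a k×k convolution with dilation rate r, the effective spatial extent of the kernel is k + (k−1)(r−1); assuming the common 3×3 kernels (the kernel size is not stated above, so this is an assumption), the four ASPP rates sample the feature map at very different scales:

```python
def effective_extent(k, rate):
    """Effective spatial extent of a k x k kernel with the given dilation rate."""
    return k + (k - 1) * (rate - 1)

for rate in (1, 12, 24, 36):
    print(rate, effective_extent(3, rate))  # 3, 25, 49 and 73 respectively
```

This is why a small set of rates lets the parallel ASPP branches capture both local detail and large context.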

The DFCPS model is trained as follows. First, the data augmentation strategy applies two augmentations of different strength to the original sample X, generating two groups of augmented samples: one group of strongly augmented samples and one group of weakly augmented samples. The strong and weak augmented samples are grouped and combined, and the combined samples enter the four neural networks for training; the loss value is minimized by continuously optimizing the model parameters.

During training, the model minimizes the loss value by continuously optimizing its parameters, preventing over-fitting and under-fitting and thereby improving performance and accuracy. The loss design of the present invention involves two key loss functions throughout the training of the network: the supervision loss L_s and the cross pseudo-supervision loss L_cps.

Specifically, in this embodiment, pre-training is first completed on the PASCAL VOC 2012 data set. By pre-training on this rich data set, the corresponding pre-training weights are obtained and the neural network learns rich visual features, improving its performance on a variety of image segmentation tasks. The purpose of pre-training is to give the network a better understanding and representation of image features, providing a good initial state for subsequent fine-tuning. The model is then fine-tuned for the specific task of the Kvasir-SEG data set, so that the network better adapts to the characteristics and segmentation requirements of medical images. Data characteristics, task requirements and model constraints, as well as the handling and verification of pseudo-labels, are fully taken into account. For data augmentation, random cropping, rotation, Gaussian noise and random jitter with additional color flipping are used to meet the design goals. The data augmentation parameters are shown in Table 1:

Table 1: Data augmentation methods
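As an illustration of the weak/strong distinction (a hedged numpy sketch only; the crop size, noise scale and jitter strength below are invented and need not match the parameters in Table 1), weak augmentation applies a random crop and flip, while strong augmentation additionally adds Gaussian noise and per-channel color jitter:

```python
import numpy as np

def weak_augment(img, rng, crop=96):
    """Weak augmentation: random crop + random horizontal flip."""
    h, w, _ = img.shape
    y = rng.integers(0, h - crop + 1)
    x = rng.integers(0, w - crop + 1)
    out = img[y:y + crop, x:x + crop]
    if rng.random() < 0.5:
        out = out[:, ::-1]
    return out

def strong_augment(img, rng, crop=96, noise_std=0.05, jitter=0.2):
    """Strong augmentation: weak augmentation + Gaussian noise + color jitter."""
    out = weak_augment(img, rng, crop).astype(np.float64)
    out = out + rng.normal(0.0, noise_std, out.shape)            # Gaussian noise
    out = out * (1.0 + rng.uniform(-jitter, jitter, (1, 1, 3)))  # per-channel jitter
    return np.clip(out, 0.0, 1.0)

rng = np.random.default_rng(0)
img = rng.random((128, 128, 3))
xw, xs = weak_augment(img, rng), strong_augment(img, rng)
print(xw.shape, xs.shape)  # (96, 96, 3) (96, 96, 3)
```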

With the base learning rate set to 0.01, the model is trained on PASCAL VOC 2012 for 60 epochs; the pre-trained model weights are then transferred to the Kvasir-SEG data set for 100 epochs of training. The current batch_size is checked and the learning rate is adjusted adaptively: batch_size defaults to 12, the maximum learning rate is set to 1e-4, and the minimum learning rate is set to 0.01 times the maximum learning rate.
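The exact schedule between the two bounds is not spelled out above; the sketch below assumes a cosine decay from the stated maximum (1e-4) to the stated minimum (0.01 × maximum), which is a common choice, and omits the batch-size-dependent adjustment:

```python
import math

MAX_LR = 1e-4
MIN_LR = MAX_LR * 0.01  # minimum is 0.01x the maximum, as stated

def cosine_lr(epoch, total_epochs, max_lr=MAX_LR, min_lr=MIN_LR):
    """Cosine decay from max_lr (first epoch) down to min_lr (last epoch)."""
    t = epoch / max(total_epochs - 1, 1)
    return min_lr + 0.5 * (max_lr - min_lr) * (1.0 + math.cos(math.pi * t))

print(cosine_lr(0, 100))   # 1e-04 at the start
print(cosine_lr(99, 100))  # 1e-06 at the end
```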

Further, training is divided into two stages: a freezing stage and an unfreezing stage. In the freezing stage, some or all parameters of the pre-trained model are fixed and only the newly added layers are trained. In the unfreezing stage, after a certain number of training epochs, the restriction on the pre-trained model parameters is lifted so that they participate in training the entire model.

The main advantage of frozen training is faster training. The specific steps of frozen training are as follows:

1. Create a base model: first, create a base model, i.e. the backbone network, on which frozen training is performed. This base model may be an already trained model or a new, untrained one.

2. Freeze layers of the model: determine which layers will be frozen. Usually the bottom layers of the model (the layers around the input) and the top layers (the layers around the output) are retained because they carry the model's core function, while the intermediate layers are frozen.

3. Add new layers: to adapt to the specific problem, new layers are added on top of the frozen model or some intermediate layers are replaced, increasing the depth or breadth of the model.

4. Freeze the layer weights: set the weights of the frozen layers to non-trainable and lock them, so that these layers are not updated during training.

5. Compile the model: set the loss function, optimizer, evaluation metrics and other parameters and combine them with the model. This is a key step in the training process.

6. Train the model: after the loss function, optimizer and evaluation metrics are set, start training, usually feeding the model batches of data. Only the newly added layers are trained, while the weights of the frozen layers remain unchanged.

7. Unlock the frozen layers: if the model does not produce satisfactory results, fine-tuning can be performed by unlocking the previously frozen layers, allowing their weights to be further trained and adjusted to improve model performance.

8. Evaluate the model: after training, perform a final evaluation of the model using an independent test data set to assess its performance and accuracy.
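The steps above can be sketched in a framework-agnostic way (a toy parameter container stands in for a real model; the layer names and the unit "gradient" are invented for illustration):

```python
def make_model():
    """Toy model: each layer is {'weights': value, 'trainable': flag}."""
    return {name: {"weights": 0.0, "trainable": True}
            for name in ("input_conv", "mid_block1", "mid_block2", "new_head")}

def freeze(model, layer_names):
    """Step 4: lock the weights of the selected layers."""
    for name in layer_names:
        model[name]["trainable"] = False

def train_step(model, grad=1.0):
    """Step 6: only trainable layers receive updates."""
    for layer in model.values():
        if layer["trainable"]:
            layer["weights"] += grad

model = make_model()  # steps 1-3: base model with a newly added head
freeze(model, ["input_conv", "mid_block1", "mid_block2"])
train_step(model)
print({k: v["weights"] for k, v in model.items()})
# frozen layers stay at 0.0; only 'new_head' was updated
```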

Generally speaking, the early layers of a model learn low-level features, while the later layers learn higher-level features. In some cases the parameters of the early layers already capture the characteristics of the input data well enough and need no further updates, while the later layers require more training to improve performance. Training only the newly added layers greatly reduces the training computation and thus improves training efficiency. In addition, frozen training helps prevent over-fitting: in the early phase of training the model parameters may over-fit the training data, degrading generalization performance. Freezing some parameters limits the model's learning capacity, effectively reducing the risk of over-fitting and improving generalization. Because the pre-trained model has already been trained on a large-scale data set, it generalizes well; freezing its parameters avoids over-fitting on small-sample data sets and improves generalization performance. Frozen training also protects the weights of the pre-trained model: in some cases these parameters have high accuracy and quality, and freezing them prevents them from being corrupted or overwritten. Moreover, frozen training makes it possible to keep training when machine resources are limited. However, as training progresses the model gradually learns higher-level features, and the fixed parameters may become a bottleneck limiting model performance. To fully exploit the model's expressive power, unfrozen training gradually releases the fixed parameters, allowing the parameters of the entire model to be updated.

The main advantage of unfrozen training is fine-tuning the pre-trained model. The specific steps of unfrozen training are as follows:

1. Freeze some model layers: after completing the training of the backbone model, freeze some model layers layer by layer starting from the output layer, usually choosing the layers farther from the output layer to freeze.

2. Train some model layers: train the remaining unfrozen model layers, generally with a smaller learning rate.

3. Unfreeze more model layers: as the model is continuously optimized during training, keep unfreezing more model layers, gradually making the model more complex.

4. Repeat steps 2 and 3 until all model layers have been trained.

5. Overall fine-tuning: finally, fine-tune the entire model to further improve performance and accuracy.
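The gradual release described in steps 2-4 can be sketched as a simple schedule that makes one more group of layers trainable per phase, working backwards from the output side (layer names and the number of phases are invented for the example; in practice each phase would use a smaller learning rate, as step 2 notes):

```python
def unfreeze_schedule(layer_names, phases):
    """Yield, per phase, the list of layers that are trainable.
    Layers are released group by group from the output side backwards
    until the whole model is trainable."""
    per_phase = -(-len(layer_names) // phases)  # ceiling division
    for p in range(phases):
        start = len(layer_names) - (p + 1) * per_phase
        yield p, layer_names[max(start, 0):]

layers = ["conv1", "block1", "block2", "block3", "head"]
for phase, trainable in unfreeze_schedule(layers, 3):
    print(phase, trainable)
# phase 0 trains only ['block3', 'head']; by the last phase all layers train
```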

By lifting the restriction on the pre-trained model's parameters and incorporating them into the training of the whole model, performance can be further optimized. The pre-trained model already has good feature extraction ability, and unfreezing and fine-tuning allow it to better adapt to the target task. Unfrozen training improves the model's performance on the target task and further improves its generalization ability. In summary, the combination of frozen and unfrozen training exploits the advantages of the pre-trained model, both speeding up training and improving the model's generalization ability.

After model training is completed, appropriate evaluation metrics are selected to evaluate and compare the results of the different models on the training or validation set. By computing the value of each metric, the model with the best performance is selected as the final choice and applied to the segmentation task of the data set. Its segmentation results are then examined with respect to boundary clarity, accuracy, consistency, and conformity with medical expertise and clinical application requirements. Through these evaluations, the model's completion of the data set segmentation task is finally determined and its performance comprehensively assessed.

Further, as shown in FIG. 3, the loss function used throughout training mainly comprises the supervision loss and the cross pseudo-supervision loss. The supervision loss guides the network's learning through the comparison between strongly augmented samples and the pseudo-labels of weakly augmented samples. The cross pseudo-supervision loss compares the pseudo segmentation maps generated between different groups, encouraging the network to learn more consistent segmentation results overall. This training strategy helps improve model performance and enables the network to better adapt to different groups of data and labels. The strong/weak augmentation scheme of this design draws on the idea of FixMatch and extends it to the medical image segmentation task. However, particular attention must be paid to the reliability, accuracy and cleanliness of the pseudo-labels, since their quality directly affects the performance and generalization ability of the model.

The supervision loss L_s is determined by the cross-entropy loss function; standard supervised learning is performed on the labeled samples, expressed as follows:

L_s = (1/|D_l|) Σ_{X∈D_l} (1/S) Σ_{i=1}^{S} l_ce(p_i, y_i)

where D_u denotes the set of unlabeled samples, D_l denotes the set of labeled samples, S is defined as the area of the input image (computed as height × width), p_i is the corresponding confidence vector, y_i is the ground truth, and l_ce is the cross-entropy loss function.

The cross pseudo-supervision loss is composed of L_cps^l and L_cps^u, where L_cps^l denotes the cross pseudo-supervision loss on labeled data and L_cps^u denotes the cross pseudo-supervision loss on unlabeled data. With p_{1i} and p_{2i} the confidence vectors of the two cross-supervising branches, and y_{1i}, y_{2i} the corresponding pseudo-labels each branch provides for the other, the two terms are expressed as follows:

L_cps^l = (1/|D_l|) Σ_{X∈D_l} (1/S) Σ_{i=1}^{S} [ l_ce(p_{1i}, y_{2i}) + l_ce(p_{2i}, y_{1i}) ]

L_cps^u = (1/|D_u|) Σ_{X∈D_u} (1/S) Σ_{i=1}^{S} [ l_ce(p_{1i}, y_{2i}) + l_ce(p_{2i}, y_{1i}) ]

The cross pseudo-supervision loss can therefore be defined as

L_cps = L_cps^l + L_cps^u

The total loss can then be defined accordingly, where ω is the weight:

Loss=Ls+ωLcpsLoss=L s +ωL cps .

To ensure that the quality of the pseudo-labels is sufficiently high, the present invention introduces the concept of a confidence threshold, defined here as μ. By setting a confidence threshold, the model filters out higher-quality pseudo-labels and avoids a negative impact on model training. Specifically, the confidence of each pseudo-label is compared with the confidence threshold. If the confidence of a pseudo-label is close to the uninformative level, i.e. around 0.5, it can be ignored, because such a pseudo-label may not be reliable enough; only the part of the pseudo segmentation map whose confidence exceeds the threshold μ participates in the loss computation. For strongly augmented samples, a cross-entropy loss is likewise computed between the output predictions and the pseudo segmentation maps obtained from the corresponding weakly augmented target samples that exceed the confidence threshold.
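A hedged sketch of this filtering: only pixels whose peak class confidence exceeds μ contribute to the loss (μ = 0.95 below is an assumed value; the text only requires it to be well above the uninformative 0.5 level):

```python
import numpy as np

def confident_mask(probs, mu=0.95):
    """probs: (S, C) softmax outputs of the weak branch.
    Returns a boolean mask selecting pixels whose peak confidence exceeds mu."""
    return probs.max(axis=1) > mu

probs = np.array([[0.98, 0.02],   # confident        -> kept
                  [0.55, 0.45],   # near 0.5         -> ignored
                  [0.10, 0.90]])  # below threshold  -> ignored
mask = confident_mask(probs)
print(mask)  # [ True False False]
```

The mask is then used to restrict the cross-entropy sum to the retained pixels.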

After training is completed, in this embodiment, mIOU (Mean Intersection over Union) is further used as the metric for evaluating model performance. By computing the intersection-over-union between the model's predicted segmentation and the ground-truth labels for each class and averaging over classes, a comprehensive performance metric is obtained that measures the degree of overlap between the predicted segmentation results and the ground-truth labels. The higher the mIOU, the more accurately the model segments the different classes. The model is evaluated every 10 epochs.
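The mIOU computation described above can be written directly (a minimal sketch; the class count and label arrays are illustrative):

```python
import numpy as np

def miou(pred, target, num_classes):
    """Mean Intersection over Union across classes.
    pred, target: integer label arrays of the same shape."""
    ious = []
    for c in range(num_classes):
        p, t = pred == c, target == c
        union = np.logical_or(p, t).sum()
        if union == 0:          # class absent in both -> skip it
            continue
        ious.append(np.logical_and(p, t).sum() / union)
    return float(np.mean(ious))

pred = np.array([[0, 0, 1, 1]])
target = np.array([[0, 1, 1, 1]])
print(miou(pred, target, 2))  # class 0: IoU 1/2, class 1: IoU 2/3 -> 0.5833...
```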

However, observing the mIOU value in isolation is not sufficient: a higher mIOU value reflects higher generalization performance, but the loss value must be observed as well. Its magnitude can be used to judge whether the model has converged, yet what matters more is its trend, especially the trend of the validation loss. When the validation loss keeps decreasing, the model can be considered to be converging; if the validation loss is essentially unchanged, the model has essentially converged. When the model converges, the loss value usually shows a downward trend, especially on the validation set, meaning the model is continuously optimized during training and gradually approaches the optimal solution. If the validation loss remains essentially unchanged, the model has stabilized and further optimization is likely to bring only small gains.

To evaluate model performance, the model is evaluated every 10 epochs during training, so that its performance can be monitored in time and adjusted and optimized according to the evaluation results. During evaluation, image samples from the test set are used for inference and compared with the corresponding ground-truth labels to compute the mIOU metric.

During evaluation, the model that performs best on the training set is selected and applied to the test set to verify its generalization to unseen data. Applying the best model to the test set yields a more accurate and reliable evaluation result and further verifies model performance. This design ensures the stability and generalization ability of the model across different data sets.

The CPC and CPS models are used as baselines and compared with the designed model in order to evaluate its performance. Through comparative evaluation against the CPC and CPS models, a comprehensive understanding of the performance of the neural network model designed in the present invention on semi-supervised learning tasks can be obtained. This comparative analysis helps verify the advantages and innovations of the designed model and provides guidance for further improvement and optimization.

When the constructed DFCPS model meets the mIOU requirement, the segmentation model trained with DFCPS is used for simulation experiments:

1. Take the preprocessed data set as input and obtain feature maps through the backbone network.

2. Take the obtained feature maps as input and perform multi-scale comprehensive feature extraction through the multiple parallel branches of the ASPP module.

3. Upsample the obtained comprehensive feature representation to restore spatial detail, map the model output to per-class probabilities through the Softmax function, and then select the most likely class as the prediction according to the probabilities, generating highly reliable prediction samples.

4. Use the mIOU metric to compare the performance of the model's prediction results across the different experimental settings.
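Step 3 above — mapping logits to per-class probabilities and taking the most likely class per pixel — can be sketched as follows (the logit values are illustrative):

```python
import numpy as np

def predict(logits):
    """logits: (H, W, C) raw network outputs after upsampling.
    Returns (probs, labels): softmax probabilities and the per-pixel argmax."""
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))  # stable softmax
    probs = e / e.sum(axis=-1, keepdims=True)
    return probs, probs.argmax(axis=-1)

logits = np.array([[[2.0, 0.1], [0.3, 1.5]]])  # a 1x2 "image" with 2 classes
probs, labels = predict(logits)
print(labels)  # [[0 1]]
```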

In this embodiment, given that the data augmentation strategy has a significant influence on the model in FixMatch, two groups of ablation experiments are designed to verify the influence of data augmentation strategies of different strength on the DFCPS model. The augmentation strategy originally set for this model applies different strong and weak augmentations to the same group of samples; the strong and weak augmentations are combined into one group before entering the network.

In the current ablation setting, two custom data augmentation strategies are defined. In the first, the same group of image samples undergoes two strong augmentations of different degrees; with everything else unchanged, the samples are combined and fed into the network, and the results are used to observe model performance. In the second, the same group of original image samples undergoes two weak augmentations of different degrees; again with everything else unchanged, they are combined and fed into the network to observe model performance. In addition, raw samples (without any data augmentation) are fed directly into the network for training and the model's performance is evaluated. Comparing these controlled augmentation strategies with the strategy adopted by DFCPS, and analyzing the influence of augmentation strength on the DFCPS model, helps to better understand the role of data augmentation in improving model performance and provides guidance for further optimizing the model.

In this embodiment, the experimental server is configured with six Nvidia GTX 2080Ti graphics cards with 64 GB of video memory, running on the Ubuntu 20.04.1 system. The opencv_python version is 4.1.2.30.

After 100 epochs of training, the loss and mIOU were analyzed visually. Observing the loss trend and the mIOU curve shows that the loss gradually stabilizes while the mIOU value reaches more than 80%, which is a quite satisfactory result.

In the visualization of the loss trend, as shown in FIG. 4, the loss gradually decreases and levels off during training, indicating that model training has converged. In the visualization of the mIOU curve, as shown in FIG. 5, the mIOU value increases gradually as training proceeds and eventually exceeds 80%, remaining accurate in the later stages of training. This shows that the model performs well on the segmentation task: it captures and segments the object boundaries in the images fairly accurately and has good performance and generalization ability. The overall design proposed by the present invention therefore reaches the expected performance level, and these results further verify the effectiveness and feasibility of the proposed method.

Through this neural network, segmentation images of intestinal polyps can be obtained, as shown in FIG. 6. Besides numerical metrics, visual assessment is also one of the important methods for evaluating medical image segmentation results. By comparing the segmentation results with the ground-truth labels, features such as boundary clarity and contour consistency can be inspected directly. Visual inspection of the segmentation results yields the following observations:

First, the segmentation boundaries are clear and the edge contours of the intestinal polyps are clearly distinguishable, with no blurring or ambiguity. Second, the segmentation results are essentially consistent with the contours of the ground-truth labels: the segmented regions match the actual polyp locations. Furthermore, there are no missed or erroneous segmentations: no polyp region is omitted and no normal region is falsely marked as a polyp. Beyond boundary clarity and contour consistency, the segmentation results also conform to medical expertise and clinical application requirements. This means the model accurately marks the lesion regions, and the segmented regions contain the anatomical structures of interest, essentially completing the design task. Comparative evaluation against the baseline models further verifies the performance advantage of the proposed model: compared with the baselines, it performs better in contour segmentation and clarity, and its pixel accuracy is slightly higher, indicating that it performs pixel-level segmentation more accurately.

In summary, the visual assessment shows that the proposed model performs well on the intestinal polyp segmentation task. The results exhibit sharp boundaries, agree closely with the ground-truth contours, contain no missed or spurious segmentations, and accord with medical domain knowledge and clinical application requirements.

Table 2 reports the mIOU values of the different methods with a ResNet-50 backbone under different labeled-data ratios. The proposed method outperforms the CPC and CPS baselines. In particular, with a 1/2 labeling ratio, DFCPS achieves the best performance, improving mIOU by 2.21% over the CPC [2] model and by 1.65% over the CPS [4] model. These results show that the proposed method holds a clear advantage for medical image segmentation under scarce annotation and delivers the desired performance.

Table 2: mIOU values

The comparative line chart in Figure 7 shows clearly that the proposed method performs better across all labeled-data ratios. Compared with the baselines, the DFCPS model achieves higher mIOU values at the 1/2, 1/4, 1/8 and 1/16 labeling ratios, a consistent advantage that further demonstrates its stability and robustness across different dataset splits and shows that the model is better suited to medical image annotation when labels are scarce. It also indicates that the proposed method annotates medical images more accurately, improving the model's ability to detect and localize targets. The visualizations likewise show that the proposed semi-supervised neural network outperforms the baselines on every dataset split; these experimental results further support the feasibility of the proposed semi-supervised neural network for the medical image annotation problem. Compared with the baselines, the proposed method makes full use of the limited labeled data and, following the semi-supervised learning paradigm, exploits unlabeled data for model training and optimization. This not only improves training efficiency but also significantly improves model performance, offering an effective answer to the data-scarcity challenge in medical image annotation.
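The mIOU metric reported in Table 2 and Figure 7 can be computed as in the following sketch. The class count, the integer label encoding and the toy label maps are illustrative assumptions, not data from the experiments:

```python
import numpy as np

def miou(pred, target, num_classes):
    """Mean Intersection-over-Union between a predicted and a
    ground-truth label map (both integer arrays of class indices)."""
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, target == c).sum()
        union = np.logical_or(pred == c, target == c).sum()
        if union > 0:                      # skip classes absent from both maps
            ious.append(inter / union)
    return float(np.mean(ious))

# Toy binary example: background = 0, polyp = 1
pred   = np.array([[0, 0, 1, 1],
                   [0, 1, 1, 1],
                   [0, 0, 0, 0],
                   [0, 0, 0, 0]])
target = np.array([[0, 0, 1, 1],
                   [0, 0, 1, 1],
                   [0, 0, 0, 0],
                   [0, 0, 0, 0]])
print(round(miou(pred, target, num_classes=2), 4))  # → 0.8583
```

Here one false-positive polyp pixel lowers both per-class IoUs (11/12 for background, 4/5 for polyp), and mIOU averages them.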

The ablation study shows how different augmentation strategies affect model performance. As Table 3 shows, the weak-weak combination outperforms the strong-strong combination, while using the raw samples with no augmentation at all performs worst. Moreover, the proposed DFCPS model performs best with the strong-weak augmentation combination, confirming that the augmentation strategy adopted in this work is well founded.

Table 3: mIOU values under different augmentation strategies
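The strong and weak augmentation branches compared in the ablation can be sketched as follows. The concrete transform sets (flip and small translation as "weak"; added brightness and contrast jitter as "strong") are illustrative assumptions rather than the exact pipeline of the experiments:

```python
import numpy as np

rng = np.random.default_rng(0)

def weak_augment(img):
    """Weak augmentation: random horizontal flip plus a small translation.
    The concrete transform set is an illustrative assumption."""
    if rng.random() < 0.5:
        img = img[:, ::-1]
    shift = int(rng.integers(-2, 3))       # translate by at most 2 pixels
    return np.roll(img, shift, axis=1)

def strong_augment(img):
    """Strong augmentation: the weak transforms plus aggressive brightness
    and contrast jitter, with values clipped back to [0, 1]."""
    img = weak_augment(img)
    brightness = rng.uniform(-0.3, 0.3)
    contrast = rng.uniform(0.5, 1.5)
    img = (img - img.mean()) * contrast + img.mean() + brightness
    return np.clip(img, 0.0, 1.0)

x = rng.random((64, 64))                   # stand-in for a grayscale endoscopy frame
x_w, x_s = weak_augment(x), strong_augment(x)
print(x_w.shape, x_s.shape)
```

In a strong-weak combination, each original sample X would be passed once through each branch to obtain the paired enhanced views used for cross pseudo-supervision.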

Table 4 compares training time against the baseline models in a six-card 2080 Ti experimental environment. DFCPS trains longer than the CPS [4] and CPC [2] models, for two main reasons. First, DFCPS uses additional neural network models to extract features and compute a feature-consistency loss; this involves training and optimizing multiple models, which lengthens overall training. Second, DFCPS must update the policy network and the feature-extraction network via backpropagation, a process that typically takes multiple iterations and training epochs to reach good performance. By contrast, the CPS [4] and CPC [2] models omit the feature-consistency loss and the associated backpropagation step, so their training times are shorter. Once training is complete, inference with the learned policy takes roughly the same time for all methods; given DFCPS's superior accuracy, this trade-off is worthwhile.

Table 4: Training time comparison

The embodiments of the present invention have been described in detail above with reference to the accompanying drawings, but the present invention is not limited to the described embodiments. For those skilled in the art, various changes, modifications, substitutions and variations of these embodiments, including their components, made without departing from the principle and spirit of the present invention still fall within the protection scope of the present invention.
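The multi-branch atrous convolution at the heart of the ASPP module used by the segmentation network (four parallel branches with dilation rates 1, 12, 24 and 36, applied to the ResNet-50 feature map) can be sketched as follows. The averaging kernel, the feature-map size and the pure-NumPy implementation are illustrative assumptions, not the patented implementation:

```python
import numpy as np

def dilated_conv3x3(x, kernel, rate):
    """'Same'-padded 3x3 convolution with dilation `rate`: the kernel taps
    are spaced `rate` pixels apart, enlarging the receptive field without
    adding parameters."""
    h, w = x.shape
    pad = rate
    xp = np.pad(x, pad)
    out = np.zeros_like(x)
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            out += kernel[di + 1, dj + 1] * xp[pad + di * rate: pad + di * rate + h,
                                               pad + dj * rate: pad + dj * rate + w]
    return out

def aspp(x, rates=(1, 12, 24, 36)):
    """Four parallel branches with the dilation rates named in the patent,
    fused by stacking; a learned 1x1 projection would follow in practice."""
    kernel = np.full((3, 3), 1.0 / 9.0)    # toy averaging kernel per branch
    branches = [dilated_conv3x3(x, kernel, r) for r in rates]
    return np.stack(branches, axis=0)       # shape: (4, H, W)

feat = np.random.default_rng(1).random((128, 128))  # stand-in backbone feature map
print(aspp(feat).shape)                              # → (4, 128, 128)
```

Each branch sees the same input at a different effective scale, which is what gives ASPP its multi-scale context without reducing spatial resolution.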

Claims (10)

1. A medical image processing method based on a semi-supervised neural network, characterized by comprising the following steps:
S1, acquiring a picture data set;
S2, constructing and training a DFCPS model, wherein the DFCPS model comprises a data enhancement strategy and a neural network, and the neural network comprises a backbone network, an ASPP module, an up-sampling module and a Softmax function;
the backbone network adopts ResNet-50 and introduces residual connections to construct a deep network; the ASPP module comprises four convolution layers whose atrous (dilated) convolution rates are 1, 12, 24 and 36 respectively;
the training method of the DFCPS model comprises the following steps: first, strong and weak data enhancement are applied twice to an original sample X through the data enhancement strategy to generate two groups of enhanced samples, one group being strongly enhanced samples and the other being weakly enhanced samples; the strongly and weakly enhanced samples are combined in groups, the combined samples enter four neural networks for training, and the loss value is minimized by continuously optimizing the parameters of the model;
S3, performing image segmentation by using the trained neural network:
S3-1, taking the preprocessed data set as input and acquiring a feature map through the backbone network;
S3-2, taking the obtained feature map as input, extracting features through the ASPP module and fusing them to obtain a comprehensive feature representation;
S3-3, restoring spatial detail of the acquired comprehensive feature representation through up-sampling, mapping the output of the model into per-class probabilities through the Softmax function, and then selecting the most probable class as the prediction result according to the probabilities, so as to generate high-confidence prediction samples.
2. The method for processing medical images based on semi-supervised neural network as set forth in claim 1, wherein the data enhancement strategy is used for preprocessing the image data.
3. The method of claim 2, wherein the data enhancement strategy comprises random rotation, random brightness and contrast adjustment, and random translation.
4. The method according to claim 1, wherein in step S2 the neural networks within each strong-weak enhancement group share parameters and weights during training, and for each sample the pseudo label generated from the prediction output by the weakly enhanced branch of the group serves as the target for the strongly enhanced branch's prediction.
5. The method for processing medical images based on a semi-supervised neural network as set forth in claim 4, wherein in step S2 a supervision loss L_s is introduced during the training process, guiding the learning of the DFCPS model with information from both the strongly and the weakly enhanced samples.
6. The method for medical image processing based on a semi-supervised neural network as set forth in claim 5, wherein in step S2 the training process introduces a cross pseudo-supervision loss L_cps, so that the pseudo labels generated by the weakly supervised versions of samples in different groups constrain each other.
7. The method for processing medical images based on a semi-supervised neural network as set forth in claim 6, wherein the supervision loss L_s performs normal supervised learning on the labeled samples and is determined by the cross-entropy loss function, expressed as follows:
L_s = (1/|D_l|) Σ_{X∈D_l} (1/S) Σ_{i=1}^{S} l_ce(p_i, y_i) (1)
wherein D_u represents the set of unlabeled samples, D_l represents the set of labeled samples, S is defined as the area of the input image, computed as height × width, p_i is the corresponding confidence vector, y_i is the ground truth, and l_ce is the cross-entropy loss function.
8. The method for processing medical images based on a semi-supervised neural network as set forth in claim 7, wherein the cross pseudo-supervision loss is split into L_cps^l and L_cps^u, where L_cps^l denotes the cross pseudo-supervision loss on labeled data and L_cps^u denotes the cross pseudo-supervision loss on unlabeled data, expressed as follows:
L_cps^l = (1/|D_l|) Σ_{X∈D_l} (1/S) Σ_{i=1}^{S} [ l_ce(p_{1i}, ŷ_{2i}) + l_ce(p_{2i}, ŷ_{1i}) ] (2)
L_cps^u = (1/|D_u|) Σ_{X∈D_u} (1/S) Σ_{i=1}^{S} [ l_ce(p_{1i}, ŷ_{2i}) + l_ce(p_{2i}, ŷ_{1i}) ] (3)
where p_{1i} and p_{2i} are the confidence vectors predicted by the two branches and ŷ_{1i}, ŷ_{2i} are the pseudo labels derived from them; so the cross pseudo-supervision loss can be defined as
L_cps = L_cps^l + L_cps^u (4)
and the total loss can accordingly be defined as, where ω is the weight,
Loss = L_s + ω·L_cps (5).
9. The medical image processing method based on the semi-supervised neural network according to claim 1, wherein the specific method of step S3-1 is as follows: in the backbone network ResNet-50, the preprocessed data set is taken as input, the pooling layers and the fully connected layer of ResNet-50 perform feature extraction and classification, and residual connections allow information to skip some layers directly, so that the feature map is output.
10. The medical image processing method based on the semi-supervised neural network as set forth in claim 9, wherein the specific method of step S3-2 is as follows: the output of the neural network is taken as the input of the ASPP module; the ASPP module uses multiple parallel branches to perform multi-scale feature extraction, enlarges the receptive field through atrous (dilated) convolution, and finally uses the context information of the input feature map to perform semantic association across different positions, so as to output the extracted features.
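The loss formulation of claims 7 and 8 (supervised cross entropy L_s plus a weighted cross pseudo-supervision term ω·L_cps, in which each branch's argmax pseudo label supervises the other branch) can be sketched as follows for a single labeled batch. The two-branch simplification, the tensor shapes and the weight ω = 1.5 are illustrative assumptions:

```python
import numpy as np

def softmax(logits):
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def cross_entropy(probs, labels):
    """Mean pixel-wise cross entropy; `labels` is an integer class map."""
    picked = np.take_along_axis(probs, labels[..., None], axis=-1)[..., 0]
    return float(-np.log(picked + 1e-12).mean())

def dfcps_loss(logits_a, logits_b, labels, omega=1.5):
    """Total loss = L_s + omega * L_cps for one labeled batch:
    - L_s supervises both branches with the ground truth;
    - L_cps lets each branch's argmax pseudo label supervise the other."""
    p_a, p_b = softmax(logits_a), softmax(logits_b)
    l_s = cross_entropy(p_a, labels) + cross_entropy(p_b, labels)
    pseudo_a, pseudo_b = p_a.argmax(-1), p_b.argmax(-1)
    l_cps = cross_entropy(p_a, pseudo_b) + cross_entropy(p_b, pseudo_a)
    return l_s + omega * l_cps

rng = np.random.default_rng(2)
la = rng.normal(size=(8, 8, 2))            # branch-1 logits, H x W x classes
lb = rng.normal(size=(8, 8, 2))            # branch-2 logits
y = rng.integers(0, 2, size=(8, 8))        # ground-truth label map
print(dfcps_loss(la, lb, y) > 0.0)         # → True
```

For unlabeled batches only the L_cps term would apply, which is what lets the model learn from data without annotations.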
CN202310714829.7A 2023-06-16 2023-06-16 A medical image processing method based on semi-supervised neural network Active CN116630299B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310714829.7A CN116630299B (en) 2023-06-16 2023-06-16 A medical image processing method based on semi-supervised neural network

Publications (2)

Publication Number Publication Date
CN116630299A true CN116630299A (en) 2023-08-22
CN116630299B CN116630299B (en) 2025-08-26

Family

ID=87592094

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310714829.7A Active CN116630299B (en) 2023-06-16 2023-06-16 A medical image processing method based on semi-supervised neural network

Country Status (1)

Country Link
CN (1) CN116630299B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN119649848A (en) * 2024-10-28 2025-03-18 浙江大学 A method for detecting acoustic leakage in water supply network based on enhanced semi-supervised model
CN120047754A (en) * 2025-04-24 2025-05-27 泉州装备制造研究所 Image classification method and system

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115115608A (en) * 2022-07-20 2022-09-27 南京工业大学 Aero-engine damage detection method based on semi-supervised semantic segmentation
CN115359029A (en) * 2022-08-30 2022-11-18 江苏科技大学 Semi-supervised medical image segmentation method based on heterogeneous cross pseudo-supervised network
US20230053716A1 (en) * 2021-08-09 2023-02-23 Naver Corporation System and method of semi-supervised learning with few labeled images per class
WO2023047091A1 (en) * 2021-09-27 2023-03-30 Ucl Business Ltd Image segmentation
CN116051942A (en) * 2022-12-02 2023-05-02 天津大学 Semi-supervised image classification method based on evidence theory


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
XIAOQIANG LU等: "Weak-to-Strong Consistency Learning for Semisupervised Image Segmentation", IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 15 May 2023 (2023-05-15), pages 1 - 15 *
YIFEI CHEN等: "Semi-Supervised Medical Image Segmentation Method Based on Cross-Pseudo Labeling Leveraging Strong and Weak Data Augmentation Strategies", 2024 IEEE INTERNATIONAL SYMPOSIUM ON BIOMEDICAL IMAGING (ISBI), 30 May 2024 (2024-05-30), pages 1 - 5, XP034677499, DOI: 10.1109/ISBI56570.2024.10635443 *
BAI GANGXU: "Image Classification Methods Based on Semi-Supervised Learning and Their Applications", China Master's Theses Full-text Database, Information Science and Technology, 15 March 2024 (2024-03-15), pages 19 - 40 *


Also Published As

Publication number Publication date
CN116630299B (en) 2025-08-26

Similar Documents

Publication Publication Date Title
JP7707299B2 (en) Automatic liver CT segmentation method based on deep shape learning
Yamuna et al. Integrating AI for Improved Brain Tumor Detection and Classification
US11610310B2 (en) Method, apparatus, system, and storage medium for recognizing medical image
Dheir et al. Classification of anomalies in gastrointestinal tract using deep learning
Zhang et al. Diabetic retinopathy grading by a source-free transfer learning approach
Pogorelov et al. Deep learning and hand-crafted feature based approaches for polyp detection in medical videos
CN110599448A (en) Migratory learning lung lesion tissue detection system based on MaskScoring R-CNN network
CN106296699A (en) Cerebral tumor dividing method based on deep neural network and multi-modal MRI image
CN110689025A (en) Image recognition method, device and system, and endoscope image recognition method and device
Haq et al. BTS-GAN: computer-aided segmentation system for breast tumor using MRI and conditional adversarial networks
CN110335231A (en) A method for assisted screening of chronic kidney disease with ultrasound imaging based on texture features and depth features
CN116630299A (en) A medical image processing method based on semi-supervised neural network
Liu et al. Gastric pathology image recognition based on deep residual networks
Yonekura et al. Improving the generalization of disease stage classification with deep CNN for glioma histopathological images
CN116912253A (en) Lung cancer pathological image classification method based on multi-scale hybrid neural network
Ramalakshmi et al. An extensive analysis of artificial intelligence and segmentation methods transforming cancer recognition in medical imaging
Wang et al. Explainable multitask Shapley explanation networks for real-time polyp diagnosis in videos
CN114398979A (en) Ultrasonic image thyroid nodule classification method based on feature decoupling
Kumar et al. Gannet devil optimization-based deep learning for skin lesion segmentation and identification
CN118334417A (en) Medical ultrasonic image recognition system and method based on deep learning
Gong et al. Gastric cancer detection using a hybrid version of gated recurrent unit network and adjusted tyrannosaurus optimization algorithm
CN117765338A (en) Uterine cavity lesion image classification method based on granularity perception and structural modeling
CN114330484B (en) Weak supervision learning diabetic retinopathy grading and focus identification method and system
Zheng et al. Image segmentation of intestinal polyps using attention mechanism based on convolutional neural network
CN115690518A (en) A classification system for the severity of intestinal metaplasia

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant