
WO2024221231A1 - Semi-supervised learning method and apparatus based on model framework - Google Patents

Semi-supervised learning method and apparatus based on model framework Download PDF

Info

Publication number
WO2024221231A1
Authority
WO
WIPO (PCT)
Prior art keywords
model
network
student
loss
teacher
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
PCT/CN2023/090635
Other languages
French (fr)
Chinese (zh)
Inventor
胡战利
黄正勇
张娜
梁栋
郑海荣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Institute of Advanced Technology of CAS
Original Assignee
Shenzhen Institute of Advanced Technology of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Institute of Advanced Technology of CAS filed Critical Shenzhen Institute of Advanced Technology of CAS
Priority to PCT/CN2023/090635 priority Critical patent/WO2024221231A1/en
Publication of WO2024221231A1 publication Critical patent/WO2024221231A1/en
Anticipated expiration legal-status Critical
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection

Definitions

  • the present invention relates to the field of medical image segmentation, and in particular to a semi-supervised learning method and device based on a model framework.
  • the structure of the left atrium is important information for clinicians to diagnose and treat atrial fibrillation, the most common heart rhythm disorder.
  • Medical image segmentation is the basis of various medical image applications, such as determining cancer staging, formulating treatment plans, radiomics analysis, and developing personalized medical services.
  • tumor target delineation is a key step in cancer treatment, the purpose of which is to maximize the concentration of radioactive agents in the target area and minimize or even avoid damage to surrounding normal tissues and organs.
  • manual delineation of tumor targets is a time-consuming and laborious process, and the accuracy of manual annotation depends largely on the experience and knowledge of oncologists. Differences between different doctors may also lead to different annotations for the same tumor.
  • Supervised 3D medical image segmentation methods have achieved great success, but they rely on a large amount of labeled data, which greatly limits the scope of application of supervised methods.
  • Semi-supervised segmentation methods solve this problem by using a large amount of unlabeled data and a small amount of labeled data.
  • the most successful semi-supervised learning method is based on consistency learning, which minimizes the distance between model responses obtained from perturbed views of unlabeled data.
  • contrastive learning has been proven to be an effective unsupervised learning method. Therefore, it is important to study and develop a semi-supervised segmentation method for medical images based on contrastive learning, while ensuring the segmentation accuracy and minimizing the dependence on labeled data, which has important scientific significance and application prospects in the field of medical diagnosis.
  • SimCLR: A Simple Framework for Contrastive Learning of Visual Representations
  • the article proposed a new self-supervised contrastive learning method.
  • the SimCLR learning framework mainly consists of four components, including a random data augmentation module, a feature encoding module, a feature projection module, and a contrastive loss module.
  • the core idea is to maximize the consistency between different augmented views of the same data example to learn representations.
  • the embodiment of the present invention provides a semi-supervised learning method and device based on a model framework to ultimately obtain a more accurate medical image segmentation result.
  • a semi-supervised learning method based on a model framework comprising the following steps:
  • S101 Setting up the overall network framework, which includes a student model, a teacher model, a projector network, and an output layer network;
  • S104 Design a network loss function, and use the network loss function to train the overall network framework.
  • both the student model and the teacher model use V-Net as the backbone network
  • the network encoder and decoder each contain 4 convolution-pooling layers.
  • the convolution kernel of the convolution layer is 3x3x3
  • the convolution kernel of the pooling layer is 2x2x2
  • the output channels are 16, 32, 64, 128, 128, 64, 32, and 16 respectively
  • the activation function is ReLU.
  • the projector network contains two convolutional layers, the output channels of the first convolutional layer are 16, the output channels of the second convolutional layer are 8, and the convolution kernel size is 3x3x3.
  • the output layer network is a convolutional layer
  • the input is the output of V-Net
  • the number of channels is 16
  • the output channel of the output layer network is 2
  • the convolution kernel size is 1x1x1.
  • the network loss function is divided into four parts and is the sum of the student-model supervised loss, the student-teacher consistency loss, the student-teacher cross loss, and the student-teacher contrastive loss.
  • y_1i and y_2i are the outputs of the student model and the teacher model, respectively.
  • the Adam optimizer is used to optimize the loss function.
  • a semi-supervised learning device based on a model framework comprising:
  • the framework setting unit is used to set the overall network framework, which includes the student model, the teacher model, the projector network and the output layer network;
  • a projection feature representation acquisition unit is used to input the medical image into the student model and the teacher model, and pass the output of the student model and the teacher model through the projector network to obtain the projection feature representation;
  • the final segmentation result acquisition unit is used to input the projection feature representation into the output layer network to obtain the final segmentation results of the student model and the teacher model;
  • the network loss function design unit is used to design the network loss function and use the network loss function to train the overall network framework.
  • a storage medium stores a program file capable of implementing any of the above-mentioned semi-supervised learning methods based on a model framework.
  • a processor is used to run a program, wherein when the program is run, any of the above-mentioned semi-supervised learning methods based on the model framework is executed.
  • the semi-supervised learning method and device based on the model framework in the embodiment of the present invention designs a semi-supervised learning framework by considering self-supervised learning and contrastive learning methods.
  • contrastive learning is added as a loss function to the teacher-student model to further increase the accuracy of consistency learning, thereby improving the segmentation accuracy performance.
  • the present invention ultimately obtains a more accurate segmentation result.
  • FIG1 is a framework diagram of a semi-supervised segmentation network based on contrastive learning of the present invention
  • FIG. 2 is an experimental verification diagram of the present invention.
  • the present invention considers self-supervised learning and contrastive learning methods, designs a semi-supervised learning framework, combines the Mean-Teacher model to add contrastive learning as a loss function to the teacher-student model, further increases the accuracy of consistency learning, and thus improves the segmentation accuracy performance.
  • the present invention ultimately obtains a more accurate segmentation result.
  • the present invention proposes a semi-supervised learning method and device based on a model framework, which combines consistency learning and contrastive learning, and uses mean square error loss to alternately optimize the model results, ultimately obtaining a more accurate segmentation result.
  • Step 1 Set up the overall network framework
  • the network adopts the teacher-student model framework as shown in Figure 1. It mainly includes the student model, the teacher model, the projector network, and the output layer network.
  • Both the student model and the teacher model use V-Net as the backbone network.
  • the network encoder and decoder contain 4 convolution-pooling layers respectively.
  • the convolution kernel of the convolution layer is 3x3x3
  • the convolution kernel of the pooling layer is 2x2x2.
  • the output channels are 16, 32, 64, 128, 128, 64, 32, and 16 respectively, and the activation function uses ReLU.
  • Step 2 Set up the contrastive learning projector network
  • the output of the student-teacher model is passed through the projector network to obtain the projected feature representation.
  • the projector network contains two convolutional layers, the output channels of the first convolutional layer are 16, the output channels of the second convolutional layer are 8, and the convolution kernel size is 3x3x3.
  • Step 3 Set up the output layer network
  • the output layer network is a convolutional layer
  • the input is the output of V-Net
  • the number of channels is 16
  • the output channel of the output layer network is 2
  • the convolution kernel size is 1x1x1
  • the output is the final segmentation result of the student-teacher model.
  • Step 4 Design network loss function
  • the entire network loss function is divided into four parts: the labeled supervised loss of the student model, the consistency loss of the student-teacher model, the cross loss of the student-teacher model, and the contrastive loss of the student-teacher model.
  • y_1i and y_2i are the outputs of the student model and the teacher model, respectively.
  • the student-teacher contrastive loss is expressed using an indicator function, a temperature constant τ, and a cosine similarity over the projected features.
  • τ is a constant set to 2.
  • z_i and z_j are the projected outputs of the student model and the teacher model, respectively.
  • the final loss function of the network is the sum of the above four parts.
  • Step 5 For the overall network framework designed above, use Adam optimizer to optimize the loss function.
  • Step 6 Train the network.
  • Compared with the prior art, the present invention has the following beneficial effects: by considering the consistency between the student and teacher models, a cross loss is added so that the results of the two models serve as labels for each other, which improves semi-supervised learning accuracy.
  • an unsupervised learning mechanism such as contrastive learning is introduced into the network, which further improves the accuracy of segmentation with the assistance of a portion of labeled data.
  • the present invention effectively improves the network's semi-supervised segmentation performance, and the segmentation results are better.
  • the present invention is verified on MRI data and proven feasible through experiments, simulation, and use; the experimental results are shown in Figure 2.
  • the present invention can also be applied to other modal medical image data such as CT and PET.
  • a storage medium stores a program file capable of implementing any of the above-mentioned semi-supervised learning methods based on a model framework.
  • a processor is used to run a program, wherein when the program is run, any of the above-mentioned semi-supervised learning methods based on the model framework is executed.
  • the disclosed technical content can be implemented in other ways.
  • the system embodiments described above are only schematic.
  • the division of units can be a logical function division.
  • multiple units or components can be combined or integrated into another system, or some features can be ignored or not executed.
  • Another point is that the mutual coupling or direct coupling or communication connection shown or discussed can be through some interfaces, indirect coupling or communication connection of units or modules, which can be electrical or other forms.
  • the units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, that is, they may be located in one place or distributed on multiple units. Some or all of the units may be selected according to actual needs to achieve the purpose of the present embodiment.
  • each functional unit in each embodiment of the present invention may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
  • the above-mentioned integrated unit may be implemented in the form of hardware or in the form of software functional units.
  • the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium.
  • the computer software product is stored in a storage medium, including several instructions for a computer device (which can be a personal computer, server or network device, etc.) to perform all or part of the steps of the methods of each embodiment of the present invention.
  • the aforementioned storage medium includes: a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, an optical disk, or other media that can store program code.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)
  • Measuring And Recording Apparatus For Diagnosis (AREA)

Abstract

The present invention relates to a semi-supervised learning method and apparatus based on a model framework. The method and apparatus comprise: providing an overall network framework, wherein the overall network framework comprises a student model, a teacher model, a projector network, and an output layer network; inputting medical images into the student model and the teacher model, and obtaining projected feature representations from the outputs of the student model and the teacher model by means of the projector network; inputting the projected feature representations into the output layer network to obtain a final segmentation result of the student model and the teacher model; and designing a network loss function, and training the overall network framework by using the network loss function. In the present invention, a Mean-Teacher model is used, and contrastive learning as a loss function is added into a teacher-student model, so that the accuracy of consistency learning is further improved, the segmentation accuracy is thus improved, and a more accurate segmentation result is finally obtained.

Description

A semi-supervised learning method and device based on a model framework

Technical Field

The present invention relates to the field of medical image segmentation, and in particular to a semi-supervised learning method and device based on a model framework.

Background Art

The structure of the left atrium is important information for clinicians to diagnose and treat atrial fibrillation, the most common heart rhythm disorder. Medical image segmentation is the basis of various medical image applications, such as determining cancer staging, formulating treatment plans, radiomics analysis, and developing personalized medical services. Among segmentation tasks, tumor target delineation is a key step in cancer treatment; its purpose is to maximize the concentration of radioactive agents in the target area while minimizing or even avoiding damage to surrounding normal tissues and organs. However, manual delineation of tumor targets is a time-consuming and laborious process, and the accuracy of manual annotation depends largely on the experience and knowledge of oncologists; differences between doctors may also lead to different annotations of the same tumor. Supervised 3D medical image segmentation methods have achieved great success, but they rely on a large amount of labeled data, which greatly limits their scope of application. Semi-supervised segmentation methods address this problem by using a large amount of unlabeled data together with a small amount of labeled data. At present, the most successful semi-supervised learning methods are based on consistency learning, which minimizes the distance between model responses obtained from perturbed views of unlabeled data. In addition, contrastive learning has proven to be an effective unsupervised learning method. Therefore, studying and developing a contrastive-learning-based semi-supervised segmentation method for medical images that maintains segmentation accuracy while minimizing the dependence on labeled data has important scientific significance and application prospects in the field of medical diagnosis.

Lequan Yu et al. published the article "Uncertainty-aware Self-ensembling Model for Semi-supervised 3D Left Atrium Segmentation" at the MICCAI conference in 2019. The method combines the Mean-Teacher model with Monte Carlo simulation to propose an uncertainty-aware semi-supervised learning framework. The student model gradually learns from meaningful and reliable targets by exploiting the uncertainty information of the teacher model. In addition to generating target outputs, the teacher model also estimates the uncertainty of each target prediction through Monte Carlo Dropout. Guided by the estimated uncertainty, unreliable predictions are filtered out when computing the consistency loss and only reliable (low-uncertainty) predictions are retained. As a result, the student model is better optimized and receives more reliable supervision, which in turn encourages the teacher model to generate higher-quality targets.

Ting Chen et al. published the article "SimCLR: A Simple Framework for Contrastive Learning of Visual Representations" at ICML in 2020. The article proposed a new self-supervised contrastive learning method. The SimCLR framework mainly consists of four components: a random data augmentation module, a feature encoding module, a feature projection module, and a contrastive loss module. Its core idea is to learn representations by maximizing the consistency between different augmented views of the same data example.

In summary, the prior art has the following technical defects:

1. Medical image data annotation is difficult, time-consuming, and laborious;

2. Manual annotation depends on expert experience and knowledge, and annotations differ between experts;

3. Since most algorithms are designed for only a few anatomical regions, their robustness is poor.

Summary of the Invention

The embodiments of the present invention provide a semi-supervised learning method and device based on a model framework, so as to ultimately obtain more accurate medical image segmentation results.

According to an embodiment of the present invention, a semi-supervised learning method based on a model framework is provided, comprising the following steps:

S101: setting up an overall network framework, the overall network framework comprising a student model, a teacher model, a projector network, and an output layer network;

S102: inputting a medical image into the student model and the teacher model, and passing the outputs of the student model and the teacher model through the projector network to obtain projected feature representations;

S103: inputting the projected feature representations into the output layer network to obtain the final segmentation results of the student model and the teacher model;

S104: designing a network loss function, and using the network loss function to train the overall network framework.

Furthermore, both the student model and the teacher model use V-Net as the backbone network. The network encoder and decoder each contain 4 convolution-pooling layers; the convolution kernel of the convolution layers is 3x3x3 and that of the pooling layers is 2x2x2; the output channels are 16, 32, 64, 128, 128, 64, 32, and 16 respectively; and the activation function is ReLU.

Furthermore, the projector network contains two convolutional layers; the first convolutional layer has 16 output channels, the second has 8, and both kernels are 3x3x3.

Furthermore, the output layer network is a convolutional layer whose input is the 16-channel output of V-Net; the output layer network has 2 output channels and a 1x1x1 kernel.

Furthermore, the network loss function is divided into four parts and is the sum of the student-model supervised loss, the student-teacher consistency loss, the student-teacher cross loss, and the student-teacher contrastive loss.

Furthermore, for the supervised loss of the student model, given a training dataset D = {(x_1, y_1), (x_2, y_2), …, (x_N, y_N)}, where x = {x_1, x_2, …, x_N} are the network input images, y = {y_1, y_2, …, y_N} are the doctor-annotated label images, and N is the total number of training samples, the student supervised loss uses the Dice loss and the cross-entropy loss, expressed in standard form as:

L_sup = L_Dice(ŷ, y) + L_CE(ŷ, y), with L_Dice = 1 − (2·Σ_i ŷ_i y_i + ε) / (Σ_i ŷ_i + Σ_i y_i + ε) and L_CE = −Σ_i y_i log ŷ_i,

where ŷ denotes the student model's prediction on the labeled data and ε is a small constant.

Furthermore, the consistency loss and the cross loss between the student model and the teacher model both use the mean-square-error loss, expressed as:

L_con = (1/N) Σ_i ||y_1i − y_2i||²,

where y_1i and y_2i are the outputs of the student model and the teacher model, respectively.

Furthermore, the contrastive loss between the student model and the teacher model is expressed as:

l_(i,j) = −log [ exp(sim(z_i, z_j)/τ) / Σ_(k=1)^(2N) 1_[k≠i] exp(sim(z_i, z_k)/τ) ],

where 1_[k≠i] is an indicator function whose value is 1 if and only if k ≠ i and 0 otherwise; τ is a constant; sim(·,·) is a cosine similarity function; and z_i and z_j are the projected outputs of the student model and the teacher model, respectively.

Furthermore, the Adam optimizer is used to optimize the loss function.

According to another embodiment of the present invention, a semi-supervised learning device based on a model framework is provided, comprising:

a framework setting unit, configured to set up the overall network framework, the overall network framework comprising a student model, a teacher model, a projector network, and an output layer network;

a projected feature representation acquisition unit, configured to input a medical image into the student model and the teacher model and pass the outputs of the student model and the teacher model through the projector network to obtain projected feature representations;

a final segmentation result acquisition unit, configured to input the projected feature representations into the output layer network to obtain the final segmentation results of the student model and the teacher model;

a network loss function design unit, configured to design the network loss function and use the network loss function to train the overall network framework.

A storage medium stores a program file capable of implementing any one of the above semi-supervised learning methods based on a model framework.

A processor is configured to run a program, wherein any one of the above semi-supervised learning methods based on a model framework is executed when the program runs.

The semi-supervised learning method and device based on a model framework in the embodiments of the present invention design a semi-supervised learning framework by considering self-supervised learning and contrastive learning methods. In combination with the Mean-Teacher model, contrastive learning is added as a loss function to the teacher-student model, further increasing the accuracy of consistency learning and thereby improving segmentation accuracy; the present invention ultimately obtains a more accurate segmentation result.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a framework diagram of the semi-supervised segmentation network based on contrastive learning of the present invention;

FIG. 2 is an experimental verification diagram of the present invention.

DETAILED DESCRIPTION

In order to enable those skilled in the art to better understand the solution of the present invention, the technical solutions in the embodiments of the present invention will be described clearly and completely below in conjunction with the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative work shall fall within the scope of protection of the present invention.

It should be noted that the terms "first", "second", etc. in the specification, claims, and drawings of the present invention are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. It should be understood that data used in this way can be interchanged where appropriate, so that the embodiments of the present invention described herein can be implemented in an order other than those illustrated or described herein. In addition, the terms "including" and "having" and any variations thereof are intended to cover non-exclusive inclusion; for example, a process, method, system, product, or device that includes a series of steps or units is not necessarily limited to those steps or units explicitly listed, but may include other steps or units that are not explicitly listed or are inherent to the process, method, product, or device.

To solve the problem of scarce annotated data in medical image segmentation, the present invention considers self-supervised learning and contrastive learning methods, designs a semi-supervised learning framework, and, in combination with the Mean-Teacher model, adds contrastive learning as a loss function to the teacher-student model, further increasing the accuracy of consistency learning and thereby improving segmentation accuracy; the present invention ultimately obtains a more accurate segmentation result. The semi-supervised learning method and device based on a model framework proposed by the present invention combine consistency learning and contrastive learning and use the mean-square-error loss to alternately optimize the model results, ultimately obtaining a more accurate segmentation result.

To achieve the above purpose, the specific operation steps of the technical solution adopted by the present invention are as follows:

Step 1: Set up the overall network framework

The overall network adopts the teacher-student model framework, as shown in Figure 1. It mainly includes the student model, the teacher model, the projector network, and the output layer network.

Both the student model and the teacher model use V-Net as the backbone network. The network encoder and decoder each contain 4 convolution-pooling layers; the convolution kernel of the convolution layers is 3x3x3 and that of the pooling layers is 2x2x2; the output channels are 16, 32, 64, 128, 128, 64, 32, and 16 respectively; and the activation function is ReLU.

Table 1: V-Net network parameter settings
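
The channel widths and kernel sizes above can be read as a V-Net-style encoder-decoder. The PyTorch sketch below is a minimal illustration under these stated parameters; the use of max pooling, transposed-convolution upsampling, and the omission of V-Net's residual and skip connections are simplifying assumptions, and all names are illustrative rather than taken from the original filing.

```python
import torch
import torch.nn as nn

class SimpleVNetBackbone(nn.Module):
    """Simplified V-Net-style 3D encoder-decoder matching the stated
    channel widths (16, 32, 64, 128 | 128, 64, 32, 16) and kernels."""
    def __init__(self, in_channels=1):
        super().__init__()
        enc_channels = [16, 32, 64, 128]
        dec_channels = [128, 64, 32, 16]
        self.encoders = nn.ModuleList()
        prev = in_channels
        for ch in enc_channels:
            self.encoders.append(nn.Sequential(
                nn.Conv3d(prev, ch, kernel_size=3, padding=1),  # 3x3x3 convolution
                nn.ReLU(inplace=True),
                nn.MaxPool3d(kernel_size=2),                    # 2x2x2 pooling
            ))
            prev = ch
        self.decoders = nn.ModuleList()
        for ch in dec_channels:
            self.decoders.append(nn.Sequential(
                nn.ConvTranspose3d(prev, ch, kernel_size=2, stride=2),  # upsample x2
                nn.Conv3d(ch, ch, kernel_size=3, padding=1),
                nn.ReLU(inplace=True),
            ))
            prev = ch

    def forward(self, x):
        for enc in self.encoders:
            x = enc(x)
        for dec in self.decoders:
            x = dec(x)
        return x  # 16-channel feature map at the input spatial resolution
```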

Step 2: Set up the contrastive learning projector network

The outputs of the student and teacher models are first passed through the projector network to obtain the projected feature representations. The projector network contains two convolutional layers; the first convolutional layer has 16 output channels, the second has 8, and both kernels are 3x3x3.

Step 3: Set up the output layer network

As shown in Figure 1, the output layer network is a convolutional layer; its input is the 16-channel output of V-Net, its output has 2 channels, and its kernel size is 1x1x1. Its output is the final segmentation result of the student-teacher model.
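
A minimal PyTorch sketch of the projector network (Step 2) and the output layer network (Step 3) follows; it assumes the 16-channel V-Net feature map as input, as stated above, and the ReLU between the projector's two layers is an assumption.

```python
import torch.nn as nn

class Projector(nn.Module):
    """Two 3x3x3 conv layers, 16 -> 16 -> 8 channels, used for the contrastive branch."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(16, 16, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),          # activation between layers is an assumption
            nn.Conv3d(16, 8, kernel_size=3, padding=1),
        )

    def forward(self, feats):
        return self.net(feats)              # projected feature representation z

class OutputHead(nn.Module):
    """Single 1x1x1 conv: 16 input channels -> 2 output channels (segmentation logits)."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv3d(16, 2, kernel_size=1)

    def forward(self, feats):
        return self.conv(feats)             # final segmentation result (2-class logits)
```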

Step 4: Design the network loss function

The entire network loss function is divided into four parts: the labeled supervised loss of the student model, the consistency loss of the student-teacher model, the cross loss of the student-teacher model, and the contrastive loss of the student-teacher model.

For the supervised loss of the student model, given a training dataset D = {(x_1, y_1), (x_2, y_2), …, (x_N, y_N)}, where x = {x_1, x_2, …, x_N} are the network input images, y = {y_1, y_2, …, y_N} are the doctor-annotated label images, and N is the total number of training samples, the student supervised loss uses the Dice loss and the cross-entropy loss, expressed in standard form as:

L_sup = L_Dice(ŷ, y) + L_CE(ŷ, y), with L_Dice = 1 − (2·Σ_i ŷ_i y_i + ε) / (Σ_i ŷ_i + Σ_i y_i + ε) and L_CE = −Σ_i y_i log ŷ_i,

where ŷ denotes the student model's prediction on the labeled data and ε is a very small constant that keeps the denominator away from 0; it is set to 0.001 in the experiments.
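
A sketch of this supervised loss, assuming the standard soft-Dice formulation and an unweighted sum with the cross-entropy term (the exact weighting is not reproduced from the filing):

```python
import torch
import torch.nn.functional as F

def supervised_loss(logits, target, eps=0.001):
    """Dice + cross-entropy loss for the student model on labeled data.

    logits: (B, 2, D, H, W) raw student outputs; target: (B, D, H, W) integer labels.
    eps is the small constant that keeps the Dice denominator away from zero.
    """
    probs = torch.softmax(logits, dim=1)[:, 1]           # foreground probability
    target_f = (target == 1).float()
    inter = (probs * target_f).sum()
    dice = (2.0 * inter + eps) / (probs.sum() + target_f.sum() + eps)
    dice_loss = 1.0 - dice
    ce_loss = F.cross_entropy(logits, target.long())
    return dice_loss + ce_loss
```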

The consistency loss and the cross loss of the student-teacher model both use the mean-square-error loss, expressed as:

L_con = (1/N) Σ_i ||y_1i − y_2i||²,

where y_1i and y_2i are the outputs of the student model and the teacher model, respectively.
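
A corresponding sketch of the mean-square-error consistency (or cross) loss between the two models' outputs; applying a softmax before the comparison is an assumption:

```python
import torch
import torch.nn.functional as F

def consistency_loss(student_out, teacher_out):
    """MSE between student and teacher predictions (consistency / cross term)."""
    return F.mse_loss(torch.softmax(student_out, dim=1),
                      torch.softmax(teacher_out, dim=1))
```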

The student-teacher contrastive loss is expressed as:

l_(i,j) = −log [ exp(sim(z_i, z_j)/τ) / Σ_(k=1)^(2N) 1_[k≠i] exp(sim(z_i, z_k)/τ) ],

where 1_[k≠i] is an indicator function whose value is 1 if and only if k ≠ i and 0 otherwise; τ is a constant, set to 2; sim(·,·) is a cosine similarity function; and z_i and z_j are the projected outputs of the student model and the teacher model, respectively.
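
This matches the SimCLR-style formulation cited in the background. A sketch over pooled student/teacher projections follows; treating each sample's two projections as the positive pair and pooling voxel-wise projections into per-sample vectors are assumptions:

```python
import torch
import torch.nn.functional as F

def contrastive_loss(z_student, z_teacher, tau=2.0):
    """NT-Xent-style loss between student and teacher projections.

    z_student, z_teacher: (B, C) vectors, e.g. globally pooled projector outputs.
    tau: temperature constant (set to 2 in the description).
    """
    z = torch.cat([z_student, z_teacher], dim=0)          # (2B, C)
    z = F.normalize(z, dim=1)
    sim = z @ z.t() / tau                                  # cosine similarities / tau
    n = z_student.size(0)
    mask = torch.eye(2 * n, dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(mask, float('-inf'))             # exclude k == i terms
    # positives: the i-th student pairs with the i-th teacher and vice versa
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)]).to(z.device)
    return F.cross_entropy(sim, targets)
```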

The final loss function of the network is the sum of the above four parts.

Step 5: For the overall network framework designed above, use the Adam optimizer to optimize the loss function.

Step 6: Train the network.
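
A sketch of one training iteration tying the loss sketches above together; the exponential-moving-average (Mean-Teacher) update of the teacher, the EMA decay value, the shared projector and output head, and the batch composition are assumptions, while the use of the Adam optimizer follows Step 5:

```python
import torch

def ema_update(teacher, student, decay=0.99):
    """Mean-Teacher update: teacher weights track an exponential moving average
    of the student weights (the decay value is an illustrative choice)."""
    with torch.no_grad():
        for t_p, s_p in zip(teacher.parameters(), student.parameters()):
            t_p.mul_(decay).add_(s_p, alpha=1.0 - decay)

def train_step(student, teacher, projector, head, batch, optimizer):
    """One iteration: total loss = supervised + consistency/cross + contrastive."""
    x_lab, y_lab, x_unlab = batch                     # labeled images/labels, unlabeled images
    x_all = torch.cat([x_lab, x_unlab], dim=0)

    s_feat = student(x_all)
    s_logits = head(s_feat)
    z_s = projector(s_feat).mean(dim=(2, 3, 4))       # pool voxel projections (assumption)

    with torch.no_grad():                             # teacher is not trained by backprop
        t_feat = teacher(x_all)
        t_logits = head(t_feat)
        z_t = projector(t_feat).mean(dim=(2, 3, 4))

    n_lab = x_lab.size(0)
    loss = (supervised_loss(s_logits[:n_lab], y_lab)
            + consistency_loss(s_logits, t_logits)    # MSE consistency/cross term
            + contrastive_loss(z_s, z_t))             # SimCLR-style contrastive term

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    ema_update(teacher, student)                      # refresh teacher after each step
    return loss.item()

# The description specifies Adam as the optimizer, e.g.:
# optimizer = torch.optim.Adam(
#     list(student.parameters()) + list(projector.parameters()) + list(head.parameters()))
```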

Compared with the prior art, the beneficial effects of the present invention include: by considering the consistency between the student and teacher models, a cross loss is added so that the results of the two models serve as labels for each other, which improves the accuracy of semi-supervised learning. In addition, an unsupervised learning mechanism, contrastive learning, is introduced into the network, which further improves segmentation accuracy with the assistance of a portion of labeled data. The present invention effectively improves the network's semi-supervised segmentation performance, and the segmentation results are better.

The present invention has been verified using MRI data and proven feasible through experiments, simulation, and use; the experimental results are shown in Figure 2. In addition to MRI data, the present invention can also be applied to medical image data of other modalities such as CT and PET.

Example 2

A storage medium stores a program file capable of implementing any of the above-mentioned semi-supervised learning methods based on a model framework.

Example 3

A processor is used to run a program, wherein when the program is run, any of the above-mentioned semi-supervised learning methods based on the model framework is executed.

The serial numbers of the above embodiments of the present invention are only for description and do not represent the advantages or disadvantages of the embodiments.

In the above embodiments of the present invention, the description of each embodiment has its own emphasis; for parts not described in detail in a certain embodiment, reference may be made to the relevant descriptions of other embodiments.

In the several embodiments provided in this application, it should be understood that the disclosed technical content can be implemented in other ways. The system embodiments described above are only illustrative; for example, the division of units may be a logical functional division, and there may be other division methods in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be through some interfaces, and the indirect coupling or communication connection of units or modules may be electrical or in other forms.

The units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit. The above integrated unit may be implemented in the form of hardware or in the form of a software functional unit.

If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for enabling a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the methods of the embodiments of the present invention. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, or an optical disk.

The above are only preferred embodiments of the present invention. It should be pointed out that those of ordinary skill in the art can make several improvements and modifications without departing from the principle of the present invention, and these improvements and modifications should also be regarded as falling within the scope of protection of the present invention.

Claims (10)

1. A semi-supervised learning method based on a model framework, characterized by comprising the following steps:
S101: setting up an overall network framework, the overall network framework comprising a student model, a teacher model, a projector network, and an output layer network;
S102: inputting a medical image into the student model and the teacher model, and passing the outputs of the student model and the teacher model through the projector network to obtain projected feature representations;
S103: inputting the projected feature representations into the output layer network to obtain final segmentation results of the student model and the teacher model;
S104: designing a network loss function, and using the network loss function to train the overall network framework.

2. The semi-supervised learning method based on a model framework according to claim 1, characterized in that both the student model and the teacher model use V-Net as the backbone network; the network encoder and decoder each contain 4 convolution-pooling layers; the convolution kernel of the convolution layers is 3x3x3 and that of the pooling layers is 2x2x2; the output channels are 16, 32, 64, 128, 128, 64, 32, and 16 respectively; and the activation function is ReLU.

3. The semi-supervised learning method based on a model framework according to claim 1, characterized in that the projector network contains two convolutional layers, the first convolutional layer has 16 output channels, the second convolutional layer has 8 output channels, and both kernels are 3x3x3.

4. The semi-supervised learning method based on a model framework according to claim 1, characterized in that the output layer network is a convolutional layer whose input is the 16-channel output of V-Net, whose output has 2 channels, and whose kernel size is 1x1x1.

5. The semi-supervised learning method based on a model framework according to claim 1, characterized in that the network loss function is divided into four parts, being the sum of the student-model supervised loss, the student-teacher consistency loss, the student-teacher cross loss, and the student-teacher contrastive loss.

6. The semi-supervised learning method based on a model framework according to claim 5, characterized in that, for the supervised loss of the student model, given a training dataset D = {(x_1, y_1), (x_2, y_2), …, (x_N, y_N)}, where x = {x_1, x_2, …, x_N} are the network input images, y = {y_1, y_2, …, y_N} are the doctor-annotated images, and N is the total number of training samples, the student supervised loss uses the Dice loss and the cross-entropy loss, expressed as:
L_sup = L_Dice(ŷ, y) + L_CE(ŷ, y),
where ŷ represents the student model's prediction on the labeled data and ε is a small constant in the Dice term.

7. The semi-supervised learning method based on a model framework according to claim 5, characterized in that the consistency loss and the cross loss between the student model and the teacher model both use the mean-square-error loss, expressed as:
L_con = (1/N) Σ_i ||y_1i − y_2i||²,
where y_1i and y_2i are the outputs of the student model and the teacher model, respectively.

8. The semi-supervised learning method based on a model framework according to claim 5, characterized in that the contrastive loss between the student model and the teacher model is expressed as:
l_(i,j) = −log [ exp(sim(z_i, z_j)/τ) / Σ_(k=1)^(2N) 1_[k≠i] exp(sim(z_i, z_k)/τ) ],
where 1_[k≠i] is an indicator function whose value is 1 if and only if k ≠ i and 0 otherwise; τ is a constant; sim(·,·) is a cosine similarity function; and z_i and z_j are the projected outputs of the student model and the teacher model, respectively.

9. The semi-supervised learning method based on a model framework according to claim 1, characterized in that an Adam optimizer is used to optimize the loss function.

10. A semi-supervised learning device based on a model framework, characterized by comprising:
a framework setting unit, configured to set up an overall network framework, the overall network framework comprising a student model, a teacher model, a projector network, and an output layer network;
a projected feature representation acquisition unit, configured to input a medical image into the student model and the teacher model and pass the outputs of the student model and the teacher model through the projector network to obtain projected feature representations;
a final segmentation result acquisition unit, configured to input the projected feature representations into the output layer network to obtain final segmentation results of the student model and the teacher model; and
a network loss function design unit, configured to design a network loss function and use the network loss function to train the overall network framework.
PCT/CN2023/090635 2023-04-25 2023-04-25 Semi-supervised learning method and apparatus based on model framework Pending WO2024221231A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2023/090635 WO2024221231A1 (en) 2023-04-25 2023-04-25 Semi-supervised learning method and apparatus based on model framework

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2023/090635 WO2024221231A1 (en) 2023-04-25 2023-04-25 Semi-supervised learning method and apparatus based on model framework

Publications (1)

Publication Number Publication Date
WO2024221231A1 true WO2024221231A1 (en) 2024-10-31

Family

ID=93255126

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/090635 Pending WO2024221231A1 (en) 2023-04-25 2023-04-25 Semi-supervised learning method and apparatus based on model framework

Country Status (1)

Country Link
WO (1) WO2024221231A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN119991691A (en) * 2025-04-16 2025-05-13 南开大学 A cross-frequency collaborative training method for semi-supervised dental image segmentation

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210216825A1 (en) * 2020-01-09 2021-07-15 International Business Machines Corporation Uncertainty guided semi-supervised neural network training for image classification
CN113256639A (en) * 2021-05-27 2021-08-13 燕山大学 Coronary angiography blood vessel image segmentation method based on semi-supervised average teacher model
CN114266739A (en) * 2021-12-14 2022-04-01 南京邮电大学 Medical image segmentation method of semi-supervised convolutional neural network based on contrast learning
CN114418954A (en) * 2021-12-24 2022-04-29 中国科学院深圳先进技术研究院 A semi-supervised medical image segmentation method and system based on mutual learning
CN114819050A (en) * 2021-01-22 2022-07-29 三星电子株式会社 Method and apparatus for training neural network for image recognition

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210216825A1 (en) * 2020-01-09 2021-07-15 International Business Machines Corporation Uncertainty guided semi-supervised neural network training for image classification
CN114819050A (en) * 2021-01-22 2022-07-29 三星电子株式会社 Method and apparatus for training neural network for image recognition
CN113256639A (en) * 2021-05-27 2021-08-13 燕山大学 Coronary angiography blood vessel image segmentation method based on semi-supervised average teacher model
CN114266739A (en) * 2021-12-14 2022-04-01 南京邮电大学 Medical image segmentation method of semi-supervised convolutional neural network based on contrast learning
CN114418954A (en) * 2021-12-24 2022-04-29 中国科学院深圳先进技术研究院 A semi-supervised medical image segmentation method and system based on mutual learning

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN119991691A (en) * 2025-04-16 2025-05-13 南开大学 A cross-frequency collaborative training method for semi-supervised dental image segmentation

Similar Documents

Publication Publication Date Title
Ren et al. Interleaved 3D‐CNN s for joint segmentation of small‐volume structures in head and neck CT images
Men et al. More accurate and efficient segmentation of organs‐at‐risk in radiotherapy with convolutional neural networks cascades
Hu et al. Automatic construction of Chinese herbal prescriptions from tongue images using CNNs and auxiliary latent therapy topics
Men et al. Automated quality assurance of OAR contouring for lung cancer based on segmentation with deep active learning
An et al. Medical image classification algorithm based on visual attention mechanism‐MCNN
WO2020234349A1 (en) Sampling latent variables to generate multiple segmentations of an image
Zhang et al. TiM‐Net: transformer in M‐Net for retinal vessel segmentation
Meng et al. Radiomics-enhanced deep multi-task learning for outcome prediction in head and neck cancer
Vu et al. Deep convolutional neural networks for automatic segmentation of thoracic organs‐at‐risk in radiation oncology–use of non‐domain transfer learning
Jiang et al. Nested block self‐attention multiple resolution residual network for multiorgan segmentation from CT
CN114121218B (en) Virtual scene construction method, device, equipment and medium applied to surgery
CN114708952B (en) Image annotation method and device, storage medium and electronic equipment
CN114121213A (en) Anesthesia medicine information rechecking method and device, electronic equipment and storage medium
CN116563537A (en) Semi-supervised learning method and device based on model framework
CN116994695A (en) Training method, device, equipment and storage medium of report generation model
WO2024221231A1 (en) Semi-supervised learning method and apparatus based on model framework
Sharma et al. FDT− Dr 2 T: a unified Dense Radiology Report Generation Transformer framework for X-ray images
CN116612849A (en) Medical treatment scheme recommendation method, device, equipment and storage medium
CN119168887B (en) Image fusion method, device, equipment and storage medium based on constraint alignment
CN118115507B (en) Image segmentation method based on cross-domain class perception graph convolution alignment
CN117393100B (en) Diagnostic report generation method, model training method, system, device and medium
CN118135222A (en) A unimodal image segmentation method based on cross-modal consistency
CN118364234A (en) Multi-view missing data complement method and device, electronic equipment and storage equipment
Wei et al. Ctflow: Mitigating effects of computed tomography acquisition and reconstruction with normalizing flows
CN116503604A (en) Image segmentation method based on unsupervised learning and related equipment

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23934436

Country of ref document: EP

Kind code of ref document: A1