
CN116777925A - Image segmentation domain generalization method based on style transfer - Google Patents

Image segmentation domain generalization method based on style transfer

Info

Publication number
CN116777925A
CN116777925A (application CN202311036723.2A)
Authority
CN
China
Prior art keywords
image
style
domain
loss
meta
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202311036723.2A
Other languages
Chinese (zh)
Other versions
CN116777925B (en)
Inventor
王瑞
宋艳枝
杨周旺
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Science and Technology of China USTC
Original Assignee
University of Science and Technology of China USTC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Science and Technology of China USTC filed Critical University of Science and Technology of China USTC
Priority to CN202311036723.2A priority Critical patent/CN116777925B/en
Publication of CN116777925A publication Critical patent/CN116777925A/en
Application granted granted Critical
Publication of CN116777925B publication Critical patent/CN116777925B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/40Analysis of texture
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/761Proximity, similarity or dissimilarity measures
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The present invention relates to the technical field of image processing and solves the technical problem that existing style-transfer-based methods have difficulty generating diverse styles while preserving the structural information of an image. In particular, it relates to an image segmentation domain generalization method based on style transfer, comprising the following steps: S1, extracting texture structure features from a training data set P containing multiple source domains; S2, using the training data set P to train a random style generation model based on local invariance. The invention performs image generation under a local texture invariance constraint together with a domain-independent random variable, and on this basis constructs an episodic training scheme that simulates domain shift. This makes the model focus more on image content and rely less on style information, allows diverse styles to be generated while the structural information of the image is preserved, effectively improves the segmentation accuracy of the model on unseen domains, and improves the efficiency of model deployment.

Description

Image segmentation domain generalization method based on style transfer

Technical Field

The present invention relates to the technical field of image processing, and in particular to an image segmentation domain generalization method based on style transfer.

Background

Domain generalization learns a general model from multiple training data sets that generalizes well to arbitrary unseen target domains, without requiring target-domain data collection or fine-tuning. Among domain generalization approaches, data generation is a more intuitive and effective route to improving downstream tasks, because the quality of the generated samples is easier to assess.

Current style-transfer-based methods have difficulty generating diverse styles while preserving the structural information of the image, which degrades downstream task performance. For example, one family of methods obtains frequency-domain information via the Fourier transform and converts image style by swapping the low-frequency components of two images; such methods preserve the semantic information of the image well, but the generated styles are not realistic. Another family of methods builds data generation on the CycleGAN architecture and can produce image styles closer to real scenes, but it often alters the structure of the image during generation.

Summary of the Invention

In view of the shortcomings of the prior art, the present invention provides an image segmentation domain generalization method based on style transfer, which solves the technical problem that previous style-transfer-based methods have difficulty generating diverse styles while preserving the structural information of the image.

To solve the above technical problem, the present invention provides the following technical solution: an image segmentation domain generalization method based on style transfer, comprising the following steps:

S1. Extract texture structure features from the training data set P containing multiple source domains; the texture structure features are the local binary pattern features of category c extracted from the original image and from the generated image;

S2. Use the training data set P to train a random style generation model based on local invariance; the random style generation model is a cycle-consistent generative adversarial network containing two generators and two discriminators;

S3. Construct a meta-learning training data set M based on the random style generation model; the training data set M includes a meta-training set and a meta-test set;

S4. Use the training data set M to train a generalization model based on the meta-learning paradigm, promoting robust optimization by simulating domain shift during training, so as to obtain a generalization model that can cope with scenes of different styles.

Further, step S1 specifically includes:

S11. From the training data set P, extract according to the labels the local binary pattern features of category c from the original image and from the generated image;

S12. Construct a local similarity loss by computing the cosine similarity between the two sets of local binary pattern features; this loss keeps the texture structure inside the mask region unchanged during image style generation.

Further, in step S12, the local similarity loss is computed from the cosine similarity, averaged over the C categories, between the local binary pattern features of category c extracted from the original image and those extracted from the generated image.

Further, in step S2, the two generators are used to convert a real image of style S and a real image of style T, respectively, into a generated image of an intermediate style;

The discriminators are used to distinguish real samples from generated samples; the two discriminators judge the images of style S and the images of style T, respectively.

Further, in step S2, the loss function of the random style generation model is composed of an adversarial loss, a cycle consistency loss and the local similarity loss.

The adversarial loss is defined over both style directions: z denotes the domain-independent random variable, the picture sets of the source domain and the target domain supply the samples, and the expectation terms are computed over all samples of the respective data sets.

The cycle consistency loss is defined in terms of the domain-independent random variable z, the generator that converts a real image of style S into a generated image, and the generator that converts a real image of style T into a generated image; it requires that applying the two generators in sequence reconstructs the original image.

The local similarity loss is defined, as in step S12, from the cosine similarity, averaged over the C categories, between the local binary pattern features of category c extracted from the original image and those extracted from the generated image.

Further, the domain-independent random variable z ∈ [0, 1]: the closer z is to 0, the closer the style of the intermediate-state image is to the original image; conversely, the closer z is to 1, the closer the style of the intermediate-state image is to the style image.

Further, step S3 specifically includes:

S31. Randomly select a domain from the source domains S and randomly sample n images from that domain as the meta-training set;

S32. Use the trained random style generation model to convert the style of the n images from that of the original images to an intermediate state, and take the generated images as the meta-test set; each original image and its corresponding generated image share the same semantic label.

Further, in step S4, the generalization model adopts an encoder-decoder-based segmentation network as the backbone network of the meta-learning paradigm.

Further, the loss function of the segmentation network is composed of a cross-entropy loss and a Dice loss, where:

The cross-entropy loss and the Dice loss take their standard per-pixel forms, in which M denotes the total number of pixels in the image, C denotes the number of categories, and the quantities entering the two losses are, respectively, the predicted probability and the ground-truth value of the c-th category for the i-th pixel.

Through the above technical solution, the present invention provides an image segmentation domain generalization method based on style transfer that has at least the following beneficial effects:

1. The present invention performs image generation under a local texture invariance constraint together with a domain-independent random variable, and on this basis constructs an episodic training scheme that simulates domain shift. This makes the model focus more on image content and rely less on style information, allows diverse styles to be generated while the structural information of the image is preserved, effectively improves the segmentation accuracy of the model on unseen domains, and improves the efficiency of model deployment.

2. The present invention guarantees local invariance by computing the similarity of texture features and adds a domain-independent random variable to the adversarial loss function, so that pairwise style transfer models can be built across multiple source domains. While the texture characteristics of the mask region are kept unchanged, images of different shooting angles, light intensities and other styles are generated to simulate real scenes captured by different devices and from different viewpoints, which greatly reduces the cost of data collection and annotation.

3. The present invention can develop a reusable convolutional-neural-network-based image segmentation method for unseen data sets. The method generates more realistic and diverse data and, based on the diverse style images, designs a training paradigm that makes the model focus more on image content and rely less on style information.

4. Based on the random style generation model, the present invention randomly selects n images from one source domain as the meta-training set in each iteration and generates n style images from the meta-training set as the meta-test set, where the meta-training set and the meta-test set share the same semantic labels. Robust optimization is promoted by simulating domain shift during training, so that the model has good segmentation ability in unknown scenes.

Brief Description of the Drawings

The drawings described here are provided for a further understanding of the present application and constitute a part of it; the illustrative embodiments of the application and their descriptions are used to explain the application and do not unduly limit it. In the drawings:

Figure 1 is a flow chart of the image segmentation domain generalization method of the present invention;

Figure 2 is a network structure diagram of the local-invariance-based random style generation model of the present invention;

Figure 3 is a network structure diagram of the meta-learning-paradigm-based generalization model of the present invention.

Detailed Description of the Embodiments

To make the above objects, features and advantages of the present invention clearer and easier to understand, the present invention is described in further detail below with reference to the drawings and specific embodiments, so that the way this application applies technical means to solve technical problems and achieve technical effects can be fully understood and implemented.

Deep learning has made significant progress in image segmentation. However, these achievements largely rely on the assumption that source-domain data and target-domain data share the same distribution; directly applying related methods without addressing distribution shift leads to significant degradation in practical applications. Against this background, domain generalization has become an active research direction and has attracted broad interest in the image analysis community. Domain generalization learns a general model from multiple training data sets that generalizes well to arbitrary unseen target domains, without requiring target-domain data collection or fine-tuning.

Existing domain generalization research can generally be divided into three categories: domain-invariant representation learning, meta-learning, and data augmentation. Domain-invariant representation learning performs explicit feature alignment between domains and invariant risk minimization to learn domain-invariant representations. Meta-learning simulates unseen target domains by partitioning the source domains, thereby increasing the generalization performance of the model. Recently, data-augmentation-based domain generalization methods have become more popular; they focus on manipulating the input data to expand the training distribution and attempt to match the test distribution. Among them, data generation is a more intuitive and effective route for downstream tasks, because the quality of the generated samples is easier to evaluate.

Current style-transfer-based methods have difficulty generating diverse styles while preserving the structural information of the image, which degrades downstream task performance. For example, one family of methods obtains frequency-domain information via the Fourier transform and converts image style by swapping the low-frequency components of two images; such methods preserve the semantic information of the image well, but the generated styles are not realistic. Another family of methods builds data generation on the CycleGAN architecture and can produce image styles closer to real scenes, but it often alters the structure of the image during generation.

Referring to Figures 1 to 3, a specific implementation of this embodiment is shown. This embodiment uses local invariance to construct pairwise random style generation models across multiple source domains; while the texture characteristics of the mask region are kept unchanged, images of different shooting angles, light intensities and other styles are generated to simulate real scenes captured by different devices and from different viewpoints. Subsequently, during the training of the generalization model based on the meta-learning paradigm, the pre-trained random style generation model generates two groups of images as the meta-training set and the meta-test set, and robust optimization is promoted by simulating domain shift during training, so that the generalization model can cope with scenes of different styles.

Referring to Figure 1, this embodiment proposes an image segmentation domain generalization method based on style transfer, comprising the following steps:

S1. Extract texture structure features from the training data set P containing multiple source domains; the texture structure features are the local binary pattern features of category c extracted from the original image and from the generated image. In this embodiment, step S1 is implemented through the following steps:

S11. From the training data set P, extract according to the labels the local binary pattern features of category c from the original image and from the generated image;

S12. Construct a local similarity loss by computing the cosine similarity between the two sets of local binary pattern features; this loss keeps the texture structure inside the mask region unchanged during image style generation.

In step S12, the local similarity loss is computed from the cosine similarity, averaged over the C categories, between the local binary pattern features of category c extracted from the original image and those extracted from the generated image.

In this step, for a training data set containing multiple source domains, as shown in Figure 2, the local binary pattern features of category c are extracted from the original image and from the generated image according to the labels, and a local similarity loss is constructed by computing their cosine similarity. The local similarity loss keeps the texture of the local information (the mask region) identical between the original image and the generated image, which guarantees the authenticity of the mask: only the style is changed during image generation, while the texture structure inside the mask region remains unchanged.

In this step, local invariance is guaranteed by computing the similarity of texture features, and a domain-independent random variable is added to the adversarial loss function, so that pairwise style transfer models can be built across multiple source domains. While the texture characteristics of the mask region are kept unchanged, images of different shooting angles, light intensities and other styles are generated to simulate real scenes captured by different devices and from different viewpoints, which greatly reduces the cost of data collection and annotation.
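As an illustration only, the following PyTorch-style sketch shows one way the class-wise local binary pattern comparison and cosine-similarity loss described above could be implemented; the uniform-LBP histogram descriptor, the averaging over categories and all names (lbp_histogram, local_similarity_loss) are assumptions for illustration, not the patent's own implementation.

```python
import torch
import torch.nn.functional as F
from skimage.feature import local_binary_pattern

def lbp_histogram(image, mask, n_points=8, radius=1, n_bins=10):
    """Normalised LBP histogram of one grayscale image restricted to one class mask."""
    lbp = local_binary_pattern(image, n_points, radius, method="uniform")
    values = torch.as_tensor(lbp[mask], dtype=torch.float32)
    hist = torch.histc(values, bins=n_bins, min=0.0, max=float(n_bins))
    return hist / (hist.sum() + 1e-8)

def local_similarity_loss(original, generated, label, num_classes):
    """Penalise texture change inside each class mask via cosine similarity."""
    losses = []
    for c in range(num_classes):
        mask = label == c
        if mask.sum() == 0:            # skip categories absent from this image
            continue
        f_orig = lbp_histogram(original, mask)
        f_gen = lbp_histogram(generated, mask)
        cos = F.cosine_similarity(f_orig, f_gen, dim=0)
        losses.append(1.0 - cos)       # high texture similarity -> low loss
    return torch.stack(losses).mean() if losses else torch.tensor(0.0)
```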

S2. Use the training data set P to train a random style generation model based on local invariance. The random style generation model is a cycle-consistent generative adversarial network containing two generators and two discriminators; the two generators convert a real image of style S and a real image of style T, respectively, into a generated image of an intermediate style;

The discriminators are used to distinguish real samples from generated samples; the two discriminators judge the images of style S and the images of style T, respectively. The output of a discriminator is a probability value representing the probability that the input image is a real sample. For the style-S discriminator, if the input is a real sample of style S, its output is close to 1, indicating that the input is very likely a real sample; if the input is a generated sample of style S, its output is close to 0, indicating that the input is very likely a generated sample. Similarly, the other discriminator judges samples of style T.

A cycle-consistent generative adversarial network can convert images of one source domain into images of another source domain and back. On this basis, a domain-independent random variable z is additionally fed to the generators, as shown in Figure 2, so that a source-domain image of style S is converted into an image of an intermediate style rather than merely into the style of the other source-domain image, thereby yielding diverse images with random styles.
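The patent does not state how the domain-independent random variable z is injected into the generators; the sketch below shows one plausible conditioning scheme, in which z is broadcast as an extra input channel of a small convolutional generator. The architecture, layer sizes and all names are assumptions made purely for illustration.

```python
import torch
import torch.nn as nn

class StyleGenerator(nn.Module):
    """Toy generator G(x, z): image plus a broadcast scalar z -> translated image."""
    def __init__(self, channels=3, width=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels + 1, width, kernel_size=3, padding=1),  # +1 channel for z
            nn.ReLU(inplace=True),
            nn.Conv2d(width, width, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(width, channels, kernel_size=3, padding=1),
            nn.Tanh(),
        )

    def forward(self, x, z):
        # z in [0, 1] controls how far the output moves from the input style
        # toward the other style; it is tiled into a constant feature map.
        z_map = z.view(-1, 1, 1, 1).expand(-1, 1, x.size(2), x.size(3))
        return self.net(torch.cat([x, z_map], dim=1))

# Usage: two such generators and two discriminators form the CycleGAN-style setup.
G_st = StyleGenerator()                  # style S -> intermediate/target style
x_s = torch.randn(2, 3, 128, 128)        # a batch of style-S images
z = torch.rand(2)                        # domain-independent random variable
x_mid = G_st(x_s, z)                     # intermediate-style generated images
```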

In step S2, the loss function of the random style generation model is composed of the adversarial loss, the cycle consistency loss and the local similarity loss.

The adversarial loss is defined over both style directions. In its expression, z denotes the domain-independent random variable, the picture sets of the source domain and the target domain supply the samples, and the expectation terms are computed over all samples of the respective data sets.

The cycle consistency loss guarantees that a generated image can still be reconstructed back into the original image, so that the semantic information of the generated image does not change. In its expression, one generator converts a real image of style S into a generated image and the other converts a real image of style T into a generated image, both conditioned on the domain-independent random variable z.

The local similarity loss keeps the texture of the local information (the mask region) identical between the original image and the generated image in order to guarantee the authenticity of the mask; it is computed, as above, from the cosine similarity, averaged over the C categories, of the local binary pattern features of category c of the original image and of the generated image.

The domain-independent random variable z ∈ [0, 1]: the closer z is to 0, the closer the style of the intermediate-state image is to the original image; conversely, the closer z is to 1, the closer its style is to the style image.

In this step, the loss function of the local-invariance-based random style generation model is composed of the adversarial loss, the cycle consistency loss and the local similarity loss, and a style transfer model capable of generating realistic and diverse images is obtained through stochastic gradient descent.
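For illustration, the sketch below assembles the three loss terms into one generator-side training objective; the equal weighting of the terms, the binary cross-entropy form of the adversarial objective, the L1 form of the cycle term and all names are assumptions rather than the patent's exact formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

bce = nn.BCEWithLogitsLoss()

def style_model_loss(G_st, G_ts, D_s, D_t, x_s, x_t, z,
                     labels_s, labels_t, sim_loss_fn, num_classes):
    """Total generator loss: adversarial + cycle-consistency + local-similarity."""
    fake_t = G_st(x_s, z)          # style S -> intermediate/target style
    fake_s = G_ts(x_t, z)          # style T -> intermediate/source style

    # Adversarial terms: each generator tries to make its discriminator say "real".
    pred_t, pred_s = D_t(fake_t), D_s(fake_s)
    adv = bce(pred_t, torch.ones_like(pred_t)) + bce(pred_s, torch.ones_like(pred_s))

    # Cycle-consistency: translating back must reconstruct the original image,
    # so the semantic content of the generated image is preserved.
    cyc = F.l1_loss(G_ts(fake_t, z), x_s) + F.l1_loss(G_st(fake_s, z), x_t)

    # Local similarity: class-wise LBP texture in the mask regions stays unchanged
    # (sim_loss_fn is, e.g., the local_similarity_loss sketched earlier).
    sim = sim_loss_fn(x_s, fake_t, labels_s, num_classes) + \
          sim_loss_fn(x_t, fake_s, labels_t, num_classes)

    return adv + cyc + sim
```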

S3. Construct a meta-learning training data set M based on the random style generation model; the training data set M includes a meta-training set and a meta-test set. In this embodiment, step S3 is implemented through the following steps:

S31. Randomly select a domain from the source domains S and randomly sample n images from that domain as the meta-training set;

S32. Use the trained random style generation model to convert the style of the n images from that of the original images to an intermediate state, and take the generated images as the meta-test set; each original image and its corresponding generated image share the same semantic label.

In this step, as shown in part (a) of Figure 3, at each iteration a domain is first selected at random from the source domains S and n images are randomly sampled from it as the meta-training set; the trained local-invariance-based random style generation model then converts the style of these n images from that of the original images to an intermediate state, and the generated images are taken as the meta-test set, with each original image and its corresponding generated image sharing the same semantic label.

In this embodiment, based on the random style generation model, the present invention randomly selects n images from one source domain as the meta-training set in each iteration and generates n style images from the meta-training set as the meta-test set, where the meta-training set and the meta-test set share the same semantic labels. Robust optimization is promoted by simulating domain shift during training, so that the model has good segmentation ability in unknown scenes.
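A minimal sketch of the episodic data construction of steps S31–S32 follows, assuming each source domain is given as a list of (image, label) pairs and that a trained style generator (for example, the StyleGenerator sketched above) is available; the sampling helpers and names are illustrative assumptions.

```python
import random
import torch

def build_meta_sets(source_domains, style_generator, n):
    """One meta-learning episode: S31 samples a domain, S32 restyles its images."""
    domain = random.choice(source_domains)           # a list of (image, label) pairs
    episode = random.sample(domain, n)               # n images form the meta-train set

    meta_train, meta_test = [], []
    for image, label in episode:
        z = torch.rand(1)                            # domain-independent random style
        with torch.no_grad():
            restyled = style_generator(image.unsqueeze(0), z).squeeze(0)
        meta_train.append((image, label))
        meta_test.append((restyled, label))          # same semantic label as original
    return meta_train, meta_test
```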

S4. Use the training data set M to train a generalization model based on the meta-learning paradigm, promoting robust optimization by simulating domain shift during training, so as to obtain a generalization model that can cope with scenes of different styles. The generalization model adopts an encoder-decoder-based segmentation network as the backbone network of the meta-learning paradigm; the loss function of the segmentation network is composed of a cross-entropy loss and a Dice loss, where:

The cross-entropy loss and the Dice loss take their standard per-pixel forms, in which M denotes the total number of pixels in the image, C denotes the number of categories, and the quantities entering the two losses are, respectively, the predicted probability and the ground-truth value of the c-th category for the i-th pixel.
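The combined segmentation loss can be realised as in the sketch below; the softmax normalisation, smoothing constant and equal weighting of the two terms are common defaults assumed here for illustration.

```python
import torch
import torch.nn.functional as F

def segmentation_loss(logits, target, eps=1e-6):
    """Cross-entropy plus Dice loss over an image batch.

    logits: (B, C, H, W) raw network outputs; target: (B, H, W) class indices.
    """
    ce = F.cross_entropy(logits, target)              # pixel-wise cross-entropy

    probs = torch.softmax(logits, dim=1)               # predicted probability p_{i,c}
    one_hot = F.one_hot(target, probs.size(1)).permute(0, 3, 1, 2).float()  # y_{i,c}

    dims = (0, 2, 3)                                    # sum over batch and pixels
    intersection = (probs * one_hot).sum(dims)
    cardinality = probs.sum(dims) + one_hot.sum(dims)
    dice = 1.0 - (2.0 * intersection + eps) / (cardinality + eps)

    return ce + dice.mean()
```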

As shown in part (b) of Figure 3, the meta-training set obtained above is passed through the backbone network to obtain the meta-training loss, which is the segmentation network's loss function evaluated on the meta-training set as a function of the network weights; its gradient with respect to those weights is then computed.

The weights of the meta-optimizer are updated by stochastic gradient descent to obtain the updated weights, after which the meta-test loss is computed on the meta-test set using the updated weights; the meta-test loss is the segmentation network's loss function evaluated on the meta-test set as a function of the updated weights, and its gradient is likewise computed.

Finally, the weights are updated through stochastic gradient descent, where the step sizes involved are the learning rates of the meta-training phase and of the meta-test phase, respectively.
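The episodic update of step S4 could be realised along the lines of first-order MAML/MLDG, as sketched below; the single inner step, the first-order approximation and the way the meta-training and meta-test gradients are combined in the final update are assumptions made only for illustration, not the patent's exact formulas.

```python
import torch

def episode_loss(model, batch, loss_fn):
    """Average segmentation loss of the model over a list of (image, label) pairs."""
    images = torch.stack([img for img, _ in batch])
    labels = torch.stack([lbl for _, lbl in batch])
    return loss_fn(model(images), labels)

def meta_update(model, meta_train, meta_test, loss_fn, inner_lr=1e-3, outer_lr=1e-4):
    """One episode: inner step on the meta-train set, outer step with the meta-test loss."""
    params = list(model.parameters())

    # Meta-training loss and gradient; a temporary inner SGD step gives the
    # updated weights used for the meta-test evaluation.
    train_loss = episode_loss(model, meta_train, loss_fn)
    train_grads = torch.autograd.grad(train_loss, params)
    backup = [p.detach().clone() for p in params]
    with torch.no_grad():
        for p, g in zip(params, train_grads):
            p.sub_(inner_lr * g)

    # Meta-test loss evaluated with the updated weights (first-order approximation).
    test_loss = episode_loss(model, meta_test, loss_fn)
    test_grads = torch.autograd.grad(test_loss, params)

    # Final SGD step from the original weights, driven by both episode losses.
    with torch.no_grad():
        for p, b, g_tr, g_te in zip(params, backup, train_grads, test_grads):
            p.copy_(b - outer_lr * (g_tr + g_te))
```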

In this process, the random selection of a source domain at each iteration and the random value of the domain-independent random variable z guarantee that the meta-training set of the meta-training phase and the meta-test set of the meta-test phase differ in style, while local invariance guarantees the authenticity of the target region. After the training iterations, the generalization model therefore becomes more robust to image style and achieves accurate segmentation of images from unseen domains.

The image segmentation domain generalization method based on style transfer proposed in this embodiment performs image generation under a local texture invariance constraint together with a domain-independent random variable, and on this basis constructs an episodic training scheme that simulates domain shift. This makes the model focus more on image content and rely less on style information, allows diverse styles to be generated while the structural information of the image is preserved, effectively improves the segmentation accuracy of the model on unseen domains, and improves the efficiency of model deployment.

The present invention can develop a reusable convolutional-neural-network-based image segmentation method for unseen data sets. The method generates more realistic and diverse data and, based on the diverse style images, designs a training paradigm that makes the model focus more on image content and rely less on style information.

Those of ordinary skill in the art will understand that all or part of the steps of the above embodiments can be implemented by instructing the relevant hardware through a program. Therefore, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Furthermore, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM and optical storage) containing computer-usable program code.

The above embodiments describe the present invention in detail. Specific examples are used herein to explain the principles and implementations of the present invention, and the description of the above embodiments is only intended to help understand the method of the present invention and its core idea. Meanwhile, those of ordinary skill in the art may make changes to the specific implementation and the scope of application based on the ideas of the present invention. In summary, the contents of this specification should not be construed as limiting the present invention.

Claims (9)

1. An image segmentation domain generalization method based on style transfer, characterized by comprising the following steps:
S1, extracting texture structure features from a training data set P containing a plurality of source domains, wherein the texture structure features comprise the local binary pattern features of category c extracted from the original image and from the generated image;
S2, training a random style generation model based on local invariance by adopting the training data set P, wherein the random style generation model is a cycle-consistent generative adversarial network comprising two generators and two discriminators;
S3, constructing a meta-learning training data set M based on the random style generation model, wherein the training data set M comprises a meta-training set and a meta-test set;
S4, training a generalization model based on a meta-learning paradigm by adopting the training data set M, and promoting robust optimization by simulating domain shift in the training process, so as to obtain a generalization model that can cope with scenes of different styles.
2. The image segmentation domain generalization method according to claim 1, specifically comprising, in step S1:
S11, extracting, from the training data set P and according to the labels, the local binary pattern features of category c of the original image and of the generated image;
S12, constructing a local similarity loss by calculating the cosine similarity of the local binary pattern features, the loss being used to keep the texture in the mask region unchanged during image style generation.
3. The image segmentation domain generalization method according to claim 2, characterized in that in step S12, the local similarity loss is calculated from the cosine similarity, averaged over the C categories, between the local binary pattern features of category c of the original image and of the generated image.
4. The image segmentation domain generalization method according to claim 1, characterized in that in step S2, the two generators respectively convert a real image of style S and a real image of style T into a generated image of an intermediate style, wherein z represents a domain-independent random variable;
the discriminators are used for distinguishing real samples from generated samples, and the two discriminators respectively judge the images of style S and the images of style T.
5. The image segmentation domain generalization method according to claim 1, characterized in that in step S2, the loss function of the random style generation model is composed of an adversarial loss, a cycle consistency loss and the local similarity loss;
wherein the adversarial loss is defined over both style directions, z represents a domain-independent random variable, the picture sets of the source domain and the target domain supply the samples, and the expectation terms are computed over all samples of the respective data sets;
the cycle consistency loss is defined in terms of the domain-independent random variable z, the generator that converts a real image of style S into a generated image, and the generator that converts a real image of style T into a generated image;
the local similarity loss is defined, as in claim 3, from the cosine similarity, averaged over the C categories, of the local binary pattern features of category c of the original image and of the generated image.
6. The image segmentation domain generalization method according to claim 5, characterized in that the domain-independent random variable z ∈ [0,1]; the closer the random variable z is to 0, the closer the picture style of the intermediate state is to the original image, whereas the closer the random variable z is to 1, the closer the picture style of the intermediate state is to the style image.
7. The image segmentation domain generalization method according to claim 1, specifically comprising, in step S3:
S31, randomly selecting a domain from the source domains S, and randomly sampling n images from that domain as a meta-training set;
S32, converting, with the trained random style generation model, the styles of the n images from those of the original images to an intermediate state, and taking the generated images as a meta-test set, wherein each original image and its corresponding generated image share the same semantic label.
8. The image segmentation domain generalization method according to claim 1, characterized in that in step S4, the generalization model adopts an encoder-decoder-based segmentation network as the backbone network in the meta-learning paradigm.
9. The image segmentation domain generalization method according to claim 8, characterized in that the loss function of the segmentation network is composed of a cross-entropy loss and a Dice loss, wherein the cross-entropy loss and the Dice loss take their standard per-pixel forms, M represents the total number of pixels in the image, C represents the number of classes, and the quantities entering the losses are, respectively, the predicted probability and the true value of the c-th class for the i-th pixel.
CN202311036723.2A 2023-08-17 2023-08-17 Image segmentation domain generalization method based on style migration Active CN116777925B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311036723.2A CN116777925B (en) 2023-08-17 2023-08-17 Image segmentation domain generalization method based on style migration

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311036723.2A CN116777925B (en) 2023-08-17 2023-08-17 Image segmentation domain generalization method based on style migration

Publications (2)

Publication Number Publication Date
CN116777925A true CN116777925A (en) 2023-09-19
CN116777925B (en) 2024-05-14

Family

ID=88006689

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311036723.2A Active CN116777925B (en) 2023-08-17 2023-08-17 Image segmentation domain generalization method based on style migration

Country Status (1)

Country Link
CN (1) CN116777925B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118486048A (en) * 2024-05-11 2024-08-13 北京交通大学 A domain generalization person re-identification method based on dual meta-learning
CN118552811A (en) * 2024-07-30 2024-08-27 杭州长望智创科技有限公司 Target detection data set generation method

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210012486A1 (en) * 2019-07-09 2021-01-14 Shenzhen Malong Technologies Co., Ltd. Image synthesis with generative adversarial network
CN112819873A (en) * 2021-02-05 2021-05-18 四川大学 High-generalization cross-domain road scene semantic segmentation method and system
CN114359526A (en) * 2021-12-29 2022-04-15 中山大学 Cross-domain image style migration method based on semantic GAN
CN114821691A (en) * 2021-01-29 2022-07-29 中国移动通信有限公司研究院 Training method and device for face living body detection network
CN115731178A (en) * 2022-11-21 2023-03-03 华东师范大学 A Cross-Modal Unsupervised Domain Adaptive Medical Image Segmentation Method
CN115796264A (en) * 2022-11-21 2023-03-14 广州大学 A method for structure-protected generative adversarial networks based on fuzzy self-guiding

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210012486A1 (en) * 2019-07-09 2021-01-14 Shenzhen Malong Technologies Co., Ltd. Image synthesis with generative adversarial network
CN114821691A (en) * 2021-01-29 2022-07-29 中国移动通信有限公司研究院 Training method and device for face living body detection network
CN112819873A (en) * 2021-02-05 2021-05-18 四川大学 High-generalization cross-domain road scene semantic segmentation method and system
CN114359526A (en) * 2021-12-29 2022-04-15 中山大学 Cross-domain image style migration method based on semantic GAN
CN115731178A (en) * 2022-11-21 2023-03-03 华东师范大学 A Cross-Modal Unsupervised Domain Adaptive Medical Image Segmentation Method
CN115796264A (en) * 2022-11-21 2023-03-14 广州大学 A method for structure-protected generative adversarial networks based on fuzzy self-guiding

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
YUE WANG ED.: "Feature-Based Style Randomization for Domain Generalization", 《ARXIV:2106.03171V2》, pages 236 - 217 *
徐海, 谢洪涛, 张勇东: "Visual domain generalization techniques and research progress" (视觉域泛化技术及研究进展), Journal of Guangzhou University (Natural Science Edition), vol. 21, no. 2, pages 42-54
李思琦 et al.: "Research on the construction and optimization of sample sets of construction waste dumping sites" (建筑垃圾堆放点样本集构建与优化研究), Environmental Engineering (环境工程), vol. 38, no. 3, pages 39-45

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118486048A (en) * 2024-05-11 2024-08-13 北京交通大学 A domain generalization person re-identification method based on dual meta-learning
CN118552811A (en) * 2024-07-30 2024-08-27 杭州长望智创科技有限公司 Target detection data set generation method

Also Published As

Publication number Publication date
CN116777925B (en) 2024-05-14

Similar Documents

Publication Publication Date Title
CN112465111B (en) Three-dimensional voxel image segmentation method based on knowledge distillation and countermeasure training
CN110580500B (en) Character interaction-oriented network weight generation few-sample image classification method
Sharma et al. Deep hierarchical parsing for semantic segmentation
CN111898696A (en) Method, device, medium and equipment for generating pseudo label and label prediction model
CN115659281A (en) A method and device for adaptive accelerated operator fusion
CN116777925A (en) A generalization method for image segmentation domain based on style transfer
CN116189265B (en) Sketch face recognition method, device and equipment based on lightweight semantic transducer model
CN117057408A (en) GAN-based black box migration anti-attack method
Chen et al. GeneCGAN: A conditional generative adversarial network based on genetic tree for point cloud reconstruction
CN118072108A (en) Training method and device for image generator
Sawarkar Deep Learning with PyTorch Lightning
CN117636072B (en) Image classification method and system based on difficulty perception data enhancement and label correction
Kong et al. Real‐time facial expression recognition based on iterative transfer learning and efficient attention network
CN110210523A (en) A kind of model based on shape constraint diagram wears clothing image generating method and device
CN116258504B (en) Bank customer relationship management system and method thereof
CN114420075B (en) Audio processing method, device, equipment, and computer-readable storage medium
Cao et al. Sketch face recognition based on light semantic Transformer network
Xue et al. Fast and unsupervised neural architecture evolution for visual representation learning
Isaiev et al. Method of creating custom dataset to train convolutional neural network
CN116342938B (en) Domain generalization image classification method based on mixture of multiple potential domains
Zhong et al. Digital recognition of street view house numbers based on DCGAN
CN116452874A (en) A Fine Classification Method for Complex Objects Based on Dual-Channel Attention Mechanism
Cheng [Retracted] Computer Graphic Image Design and Visual Communication Design in the Internet of Things Scenario
CN114648650A (en) Neural network training method, neural network training device, target detection method, target detection device, equipment and storage medium
CN115456100A (en) Training processing method and device, electronic equipment and readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant