CN111275638A - Face restoration method based on multi-channel attention selection generative adversarial network - Google Patents
- Publication number
- CN111275638A (application CN202010044569.3A)
- Authority
- CN
- China
- Prior art keywords
- image
- channel attention
- face
- network
- map
- Prior art date
- Legal status: Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/77—Retouching; Inpainting; Scratch removal
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Biomedical Technology (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- Life Sciences & Earth Sciences (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Image Analysis (AREA)
- Image Processing (AREA)
Abstract
The invention provides a face restoration method based on a multi-channel attention selection generative adversarial network, comprising the following steps: S1, collect face data and preprocess it; S2, build the face restoration model and loss functions; S3, in the first stage, learn the image-generation subnetwork G_i and produce an initial restored image; S4, in the second stage, produce intermediate output maps I_G and learn multi-channel attention maps I_A; S5, build the multi-channel attention selection model and output the final synthesized image; S6, perform face restoration. The face restoration model comprises a generator network G_i, a parameter-sharing discriminator D, and a multi-channel attention selection network G_a; the loss function comprises an uncertainty-guided pixel loss and an adversarial loss. The method effectively learns uncertainty maps to guide the pixel loss, enabling more robust optimization and providing an improved face restoration method.
Description
【Technical Field】
The invention relates to the fields of deep learning and image processing, and in particular to a face restoration method based on a multi-channel attention selection generative adversarial network.
【Background Art】
In the field of image inpainting, and eye inpainting in particular, DNNs (deep neural networks) can produce semantically plausible and realistic-looking results, yet most deep learning techniques fail to preserve the identity of the person in the photo. For example, a DNN can learn to open a pair of closed eyes, but the DNN itself does not guarantee that the new eyes will correspond to the specific eye structure of the original person.
A GAN (generative adversarial network) is a particular type of deep network that includes a learnable adversarial loss implemented by a discriminator network. GANs have been used successfully to generate faces from scratch or to inpaint missing regions of a face, and are well suited to general face manipulation.
One GAN variant, the conditional GAN (cGAN), can constrain the generator with additional information. By adding reference information of the same identity, the GAN does not have to hallucinate texture or structure from scratch, yet still preserves the semantics of the original image, producing high-quality personalized inpainting results. In some cases, however, GANs still fail, for example when a person's eyes are partially occluded by a strand of hair, or when colors are reproduced incorrectly, producing strange artifacts.
The three-channel generation space of a conventional GAN may be insufficient for learning a good mapping. Expanding the generation space and learning an automatic selection mechanism to synthesize finer-grained results is therefore a promising direction, and makes it possible to apply the multi-channel attention selection GAN framework (SelectionGAN) to image inpainting tasks.
The present invention therefore provides a face restoration method based on a multi-channel attention selection generative adversarial network.
【Summary of the Invention】
To solve the problems of image occlusion, incorrect colorization, and strange artificial repair traces that face restoration techniques exhibit under certain conditions, the present invention provides a face restoration method based on a multi-channel attention selection generative adversarial network.
A face restoration method based on a multi-channel attention selection generative adversarial network comprises the following steps:
S1. Collect face data and preprocess it: acquire pairs of face images of the same person, including eyes-open and eyes-closed images, and preprocess the collected images;
S2. Build the face restoration model and loss functions: design and construct the face restoration model and loss functions. The model is based on a conditional generative adversarial network and comprises a generator network G_i, a parameter-sharing discriminator D, and a multi-channel attention selection network G_a; the loss function comprises an uncertainty-guided pixel loss and an adversarial loss;
S3. First stage, learn the image-generation subnetwork G_i and produce an initial restoration: the subnetwork G_i receives an image pair consisting of the labeled input image I_a and the reference image R_g, preliminarily restores the pair, and generates the restored image I'_g = G_i(I_a, R_g);
S4. Second stage, produce intermediate output maps I_G and learn multi-channel attention maps I_A: take the coarse restored image I'_g from the subnetwork G_i, the ground-truth image I_g, and the deep feature map F_i from the last layer of the generator network G_i to form a new feature F_c = concat(I'_g, F_i, I_g), where concat(·) denotes channel-wise concatenation; feed F_c into the multi-channel attention selection module G_a to produce multiple intermediate output maps I_G, while learning the same number of multi-channel attention maps I_A to guide multiple optimization losses;
S5. Build the multi-channel attention selection model and output the final synthesized image: use the multi-channel attention maps I_A to perform channel selection over the intermediate output maps I_G, obtaining the final synthesized image I''_g;
S6. Perform face restoration: input a test image into the trained face restoration model to obtain a high-quality restored face image.
Preferably, the face restoration model in step S2 adopts a cascade strategy: the generator network G_i outputs a coarse restored image, which typically exhibits blurred eye details and high pixel-level dissimilarity to the target image, and the multi-channel attention selection network G_a then uses this coarse restored image to produce a fine-grained final output.
Preferably, in step S4, feeding the new feature F_c into the multi-channel attention selection module G_a specifically comprises: selecting each pooled feature through element-wise multiplication with the input feature, rescaling the pooled features back to the input resolution, and feeding the result through a convolution layer to produce a new multi-scale feature F'_c for use in G_a. A set of M spatial scales {S_i} (i = 1..M) is applied to produce pooled features with different spatial resolutions. The pooling process is expressed as:

F'_c = concat( pl_up_{s_1}(F_c) ⊙ F_c, …, pl_up_{s_M}(F_c) ⊙ F_c ),

where concat(·) denotes channel-wise concatenation, F_c is the new feature, pl_up_s(·) denotes pooling (followed by upsampling) at scale s, and ⊙ denotes element-wise multiplication.
Preferably, in step S4, the intermediate output maps I_G are obtained by applying N convolution filters {W_G^i} followed by a tanh(·) nonlinearity, and the multi-channel attention maps I_A are obtained by applying N convolution filters {W_A^i} followed by a normalized channel-wise softmax. The intermediate output maps I_G and the multi-channel attention maps I_A are computed as:

I_G^i = tanh(W_G^i * F'_c),  I_A^i = softmax(W_A^i * F'_c),  i = 1..N,

where * denotes convolution and the softmax normalizes across the N attention channels.
Preferably, in step S5, the final synthesized image I''_g is computed as:

I''_g = (I_G^1 ⊙ I_A^1) ⊕ (I_G^2 ⊙ I_A^2) ⊕ … ⊕ (I_G^N ⊙ I_A^N),

where I''_g denotes the final synthesized image selected from multiple candidate results, I_A are the multi-channel attention maps, I_G are the intermediate output maps, ⊕ denotes element-wise addition, and ⊙ denotes element-wise multiplication.
Preferably, in the first stage the parameter-sharing discriminator D takes the coarse restored image I'_g from the image-generation subnetwork G_i and the ground-truth image I_g as input and judges whether the two are associated with each other; in the second stage, D takes the final synthesized image I''_g and the ground-truth image I_g as input, encouraging D to distinguish the diversity of image structures and to capture local perceptual information.
Preferably, the uncertainty-guided pixel loss is:

L_unc = Σ_{i=1}^{K} σ(U_i) ⊙ L_i^p,

where L_i^p denotes the i-th pixel-level loss map, U_i denotes the i-th uncertainty map, and σ(·) is the sigmoid function used for pixel-level normalization.
Preferably, the adversarial loss in the first stage distinguishes the generated pair [I_a, I'_g] from the real image pair [I_a, I_g]; in the second stage, the adversarial loss of D distinguishes the synthesized pair [I_a, I''_g] from the real pair [I_a, I_g]. The two losses are respectively:

L_cGAN(I_a, I'_g) = E[log D(I_a, I_g)] + E[log(1 − D(I_a, I'_g))],
L_cGAN(I_a, I''_g) = E[log D(I_a, I_g)] + E[log(1 − D(I_a, I''_g))].

The combined adversarial loss is: L_cGAN = L_cGAN(I_a, I'_g) + λ L_cGAN(I_a, I''_g).
The total optimization loss is:

min_{G_i, G_a} max_D  L = L_cGAN + Σ_{i=1}^{K} λ_i L_i^p + λ_tv L_tv,

where each L_i^p uses L1 reconstruction to compute the pixel loss between the generated images I'_g and I''_g and the corresponding ground-truth image, L_tv is the total variation (TV) regularization of the final synthesized image I''_g:

L_tv = Σ_{x,y} ( |I''_g(x+1, y) − I''_g(x, y)| + |I''_g(x, y+1) − I''_g(x, y)| ),

and λ_i and λ_tv are trade-off parameters controlling the relative importance of the different objectives.
Compared with the prior art, the present invention applies a multi-channel attention selection generative adversarial network to face restoration. Through the generator network G_i, the parameter-sharing discriminator D, and the multi-channel attention selection network G_a, it enlarges the generation space and automatically learns a selection mechanism that synthesizes finer-grained results; the multi-channel attention selection network G_a focuses on selecting the intermediate generated maps of interest, significantly improving the quality of the final output. The multi-channel attention module also effectively learns uncertainty maps to guide the pixel loss, enabling more robust optimization and a better face restoration method.
【Brief Description of the Drawings】
To illustrate the technical solutions in the embodiments of the present invention more clearly, the accompanying drawings used in the description of the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from them without creative effort, wherein:
Fig. 1 is a flowchart of the face restoration method based on a multi-channel attention selection generative adversarial network provided by the present invention;
Fig. 2 is a schematic diagram of the face restoration model provided by the present invention;
Fig. 3 is a network structure diagram of the multi-channel attention selection module provided by the present invention.
【Detailed Description】
The technical solutions in the embodiments of the present invention are described below clearly and completely with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on these embodiments without creative effort fall within the protection scope of the present invention.
Referring to Figs. 1-3, the present invention provides a face restoration method based on a multi-channel attention selection generative adversarial network. The steps of the method are as follows:
S1. Collect face data and preprocess it: acquire pairs of face images of the same person, including eyes-open and eyes-closed images, and preprocess the collected images. A large number of images are collected as a dataset; face detection (for example with OpenCV) is applied to each image to extract facial information, especially the eyes. The collected images are then cropped into face training images of a set size so that the eyes and mouth are centered.
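As an illustration of this preprocessing step, the sketch below uses OpenCV's Haar cascade face detector to locate and crop the face; the detector choice, the 256×256 output size, and the largest-detection heuristic are assumptions for illustration, since the patent does not fix these details.

```python
import cv2

# Minimal preprocessing sketch (assumptions: Haar cascade detector,
# 256x256 crop; the patent does not specify either).
detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def crop_face(image_path, out_size=256):
    img = cv2.imread(image_path)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    faces = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) == 0:
        return None  # no face found; skip this image
    # Take the largest detection so the eyes and mouth stay roughly centered.
    x, y, w, h = max(faces, key=lambda f: f[2] * f[3])
    face = img[y:y + h, x:x + w]
    return cv2.resize(face, (out_size, out_size))
```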
S2. Build the face restoration model and loss functions: design and construct the face restoration model and loss functions. The model is based on a conditional generative adversarial network and comprises a generator network G_i, a parameter-sharing discriminator D, and a multi-channel attention selection network G_a; the loss function comprises an uncertainty-guided pixel loss and an adversarial loss.
The face restoration model adopts a cascade strategy. The generator network G_i outputs a coarse restored image, which typically shows blurred eye details and high pixel-level dissimilarity to the target image; this coarse-to-fine generation strategy improves overall performance by building on the coarse prediction. In the second stage, the multi-channel attention selection network G_a uses the coarse restored image to produce a fine-grained final output.
S3. First stage, learn the image-generation subnetwork G_i and produce an initial restoration: the subnetwork G_i receives an image pair consisting of the labeled input image I_a and the reference image R_g, preliminarily restores the pair, and generates the restored image I'_g = G_i(I_a, R_g). The reference image R_g provides stronger supervision: this design adds more robust supervision among the input image I_a, the reference image R_g, and the ground-truth image I_g, which facilitates optimization of the network.
In the first stage, the parameter-sharing discriminator D takes the coarse restored image I'_g from the image-generation subnetwork G_i and the ground-truth image I_g as input and judges whether the two are associated with each other.
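A minimal wiring sketch of this first stage follows, with a small encoder-decoder standing in for G_i (whose internal architecture the patent does not specify); it takes the pair (I_a, R_g) concatenated channel-wise and returns both the coarse restoration I'_g and the last-layer feature F_i needed in the second stage.

```python
import torch
import torch.nn as nn

class StageOneGenerator(nn.Module):
    """Stage-1 sketch: a stand-in for G_i, not the patented architecture."""
    def __init__(self, in_ch=6, feat_ch=64):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(in_ch, feat_ch, 4, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(feat_ch, feat_ch, 4, stride=2, padding=1),
            nn.ReLU(inplace=True))
        self.up = nn.Sequential(
            nn.ConvTranspose2d(feat_ch, feat_ch, 4, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.ConvTranspose2d(feat_ch, feat_ch, 4, stride=2, padding=1),
            nn.ReLU(inplace=True))
        self.to_rgb = nn.Sequential(
            nn.Conv2d(feat_ch, 3, 3, padding=1), nn.Tanh())

    def forward(self, I_a, R_g):
        x = torch.cat([I_a, R_g], dim=1)  # channel-wise image pair [I_a, R_g]
        F_i = self.up(self.encoder(x))    # deep feature map from the last layer
        I_coarse = self.to_rgb(F_i)       # coarse restored image I'_g
        return I_coarse, F_i
```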
S4. Second stage, produce intermediate outputs and learn multi-channel attention maps: take the coarse restored image I'_g from the image-generation subnetwork G_i, the ground-truth image I_g, and the deep feature map F_i from the last layer of the generator network G_i to form a new feature F_c = concat(I'_g, F_i, I_g), where concat(·) denotes channel-wise concatenation; feed F_c into the multi-channel attention selection module G_a to produce multiple intermediate output maps I_G, while learning the same number of multi-channel attention maps I_A to guide multiple optimization losses.
A single-scale feature may fail to capture all the detail needed for fine-grained generation, so the invention proposes a multi-scale spatial pooling scheme that applies global average pooling to the same input features with a set of different kernel sizes and strides. This yields multi-scale features with different receptive fields that perceive details at different levels. Feeding the new feature F_c into the multi-channel attention selection module G_a specifically comprises: selecting each pooled feature through element-wise multiplication with the input feature, rescaling the pooled features back to the input resolution, and feeding the result through a convolution layer to produce a new multi-scale feature F'_c for use in G_a. A set of M spatial scales {S_i} (i = 1..M) is applied to produce pooled features with different spatial resolutions. The pooling process is expressed as:

F'_c = concat( pl_up_{s_1}(F_c) ⊙ F_c, …, pl_up_{s_M}(F_c) ⊙ F_c ),

where concat(·) denotes channel-wise concatenation, F_c is the new feature, pl_up_s(·) denotes pooling (followed by upsampling) at scale s, and ⊙ denotes element-wise multiplication.
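A sketch of this multi-scale pooling under the formula above; the scale set {2, 4, 8} is an assumption, as the patent only specifies a set of M scales:

```python
import torch
import torch.nn.functional as F

def multi_scale_pool(F_c, scales=(2, 4, 8)):
    """Sketch of pl_up_s(F_c) ⊙ F_c over M scales, concatenated channel-wise."""
    h, w = F_c.shape[2:]
    branches = []
    for s in scales:
        pooled = F.avg_pool2d(F_c, kernel_size=s, stride=s)   # pool at scale s
        up = F.interpolate(pooled, size=(h, w), mode="bilinear",
                           align_corners=False)               # back to input size
        branches.append(up * F_c)                             # element-wise selection
    return torch.cat(branches, dim=1)                         # -> F'_c
```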
The multi-channel attention selection module G_a automatically performs spatial and channel-wise selection over the generated candidates to synthesize a fine-grained final output. Given the multi-scale feature F'_c ∈ R^{h×w×c}, where h and w are the height and width of the feature and c is the number of channels, the intermediate output maps I_G are obtained by applying N convolution filters {W_G^i} followed by a tanh(·) nonlinearity, and the multi-channel attention maps I_A are obtained by applying N convolution filters {W_A^i} followed by a normalized channel-wise softmax. The intermediate output maps I_G and the multi-channel attention maps I_A are computed as:

I_G^i = tanh(W_G^i * F'_c),  I_A^i = softmax(W_A^i * F'_c),  i = 1..N.
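The selection head can be sketched as follows; the number of candidates N and the 3×3 kernel size are assumptions for illustration:

```python
import torch
import torch.nn as nn

class AttentionSelection(nn.Module):
    """Sketch of G_a's selection head: N intermediate RGB maps via tanh and
    N attention maps via channel-wise softmax (N and c are assumptions)."""
    def __init__(self, c, N=10):
        super().__init__()
        self.N = N
        self.to_outputs = nn.Conv2d(c, 3 * N, kernel_size=3, padding=1)
        self.to_attention = nn.Conv2d(c, N, kernel_size=3, padding=1)

    def forward(self, F_c_prime):
        I_G = torch.tanh(self.to_outputs(F_c_prime))               # N candidates
        I_A = torch.softmax(self.to_attention(F_c_prime), dim=1)   # over N channels
        return I_G, I_A
```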
In the second stage, the parameter-sharing discriminator D takes the final synthesized image I''_g and the ground-truth image I_g as input, encouraging D to distinguish the diversity of image structures and to capture local perceptual information.
S5. Build the multi-channel attention selection model and output the final synthesized image: use the multi-channel attention maps I_A to perform channel selection over the intermediate output maps I_G, obtaining the final synthesized image I''_g.
The final synthesized image I''_g is computed as:

I''_g = (I_G^1 ⊙ I_A^1) ⊕ (I_G^2 ⊙ I_A^2) ⊕ … ⊕ (I_G^N ⊙ I_A^N),

where I''_g denotes the final synthesized image selected from multiple candidate results, I_A are the multi-channel attention maps, I_G are the intermediate output maps, ⊕ denotes element-wise addition, and ⊙ denotes element-wise multiplication.
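Continuing the sketch above, the channel selection of step S5 reduces to an attention-weighted sum of the N candidates:

```python
def attention_select(I_G, I_A, N):
    """Sketch of I''_g = Σ_i I_A^i ⊙ I_G^i; shapes follow the
    AttentionSelection sketch above."""
    b, _, h, w = I_A.shape
    candidates = I_G.view(b, N, 3, h, w)      # N intermediate RGB maps
    weights = I_A.unsqueeze(2)                # broadcast over the RGB channels
    return (weights * candidates).sum(dim=1)  # final synthesized image I''_g
```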
S6. Perform face restoration: input a test image into the trained face restoration model to obtain a high-quality restored face image.
It should be noted that the restored images initially obtained from the pre-trained model are not accurate for every pixel, which would mislead training. To address this, the invention uses the generated multi-channel attention maps I_A to learn uncertainty maps that control the optimization losses. Assuming K different loss maps need guidance, the generated multi-channel attention maps I_A are first concatenated and passed to a convolution layer with K filters to produce a set of K uncertainty maps. The uncertainty-guided pixel loss is:

L_unc = Σ_{i=1}^{K} σ(U_i) ⊙ L_i^p,

where L_i^p denotes the i-th pixel-level loss map, U_i denotes the i-th uncertainty map, and σ(·) is the sigmoid function used for pixel-level normalization.
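A sketch of this uncertainty mechanism is given below; the element-wise gating of the pixel-loss maps by σ(U_i) is an assumption read from the description above, not a formula the patent spells out:

```python
import torch
import torch.nn as nn

class UncertaintyGuidedPixelLoss(nn.Module):
    """Sketch: K uncertainty maps predicted from the concatenated attention
    maps gate the K pixel-level L1 loss maps element-wise (assumed weighting)."""
    def __init__(self, attn_ch, K=2):
        super().__init__()
        self.to_uncertainty = nn.Conv2d(attn_ch, K, kernel_size=3, padding=1)

    def forward(self, I_A_cat, loss_maps):
        U = torch.sigmoid(self.to_uncertainty(I_A_cat))  # σ(U_i), pixel-wise
        total = 0.0
        for i, L_p in enumerate(loss_maps):              # K loss maps L_i^p
            total = total + (U[:, i:i + 1] * L_p).mean()
        return total
```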
The adversarial loss in the first stage distinguishes the generated pair [I_a, I'_g] from the real image pair [I_a, I_g]; in the second stage, the adversarial loss of D distinguishes the synthesized pair [I_a, I''_g] from the real pair [I_a, I_g]. The two losses are respectively:

L_cGAN(I_a, I'_g) = E[log D(I_a, I_g)] + E[log(1 − D(I_a, I'_g))],   (5)
L_cGAN(I_a, I''_g) = E[log D(I_a, I_g)] + E[log(1 − D(I_a, I''_g))].  (6)

Both losses aim to preserve local structural information and produce visually pleasing synthesized images. The adversarial loss of the proposed SelectionGAN is therefore the sum of equations (5) and (6):

L_cGAN = L_cGAN(I_a, I'_g) + λ L_cGAN(I_a, I''_g).   (7)
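A sketch of the two adversarial terms (5)-(7), assuming the discriminator D scores a (condition, image) pair and ends with a sigmoid; neither detail is fixed by the patent:

```python
import torch

def cgan_loss(D, I_a, I_real, I_fake):
    """Discriminator objective for one stage (to be maximized by D);
    the generator minimizes the fake term."""
    eps = 1e-8  # numerical safety for the logs
    real_score = D(I_a, I_real)          # D on the real pair [I_a, I_g]
    fake_score = D(I_a, I_fake.detach()) # D on the generated pair
    return (torch.log(real_score + eps)
            + torch.log(1.0 - fake_score + eps)).mean()

# Combined loss, eq. (7):
# L_cGAN = cgan_loss(D, I_a, I_g, I_g_coarse) \
#        + lam * cgan_loss(D, I_a, I_g, I_g_final)
```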
The total optimization loss is the weighted sum of the above losses. The generator network G_i, the parameter-sharing discriminator D, and the multi-channel attention selection network G_a are trained end-to-end by optimizing the following min-max objective:

min_{G_i, G_a} max_D  L = L_cGAN + Σ_{i=1}^{K} λ_i L_i^p + λ_tv L_tv,

where each L_i^p uses L1 reconstruction to compute the pixel loss between the generated images I'_g and I''_g and the corresponding ground-truth image, and L_tv is the total variation (TV) regularization of the final synthesized image I''_g:

L_tv = Σ_{x,y} ( |I''_g(x+1, y) − I''_g(x, y)| + |I''_g(x, y+1) − I''_g(x, y)| ),

where λ_i and λ_tv are trade-off parameters controlling the relative importance of the different objectives.
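The total objective can be sketched as the weighted sum below; the trade-off weights λ_i and λ_tv are hyperparameters the patent leaves unspecified:

```python
import torch

def tv_regularization(img):
    """Standard total-variation regularizer applied to the final output I''_g."""
    dh = (img[:, :, 1:, :] - img[:, :, :-1, :]).abs().mean()
    dw = (img[:, :, :, 1:] - img[:, :, :, :-1]).abs().mean()
    return dh + dw

def total_loss(L_cGAN, pixel_losses, I_g_final, lambdas, lam_tv):
    """Weighted-sum sketch of the overall objective."""
    L = L_cGAN
    for lam_i, L_i in zip(lambdas, pixel_losses):  # Σ λ_i L_i^p
        L = L + lam_i * L_i
    return L + lam_tv * tv_regularization(I_g_final)
```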
Compared with the prior art, the present invention applies a multi-channel attention selection generative adversarial network to face restoration. Through the generator network G_i, the parameter-sharing discriminator D, and the multi-channel attention selection network G_a, it enlarges the generation space and automatically learns a selection mechanism that synthesizes finer-grained results; the multi-channel attention selection network G_a focuses on selecting the intermediate generated maps of interest, significantly improving the quality of the final output. The network G_a can also effectively learn uncertainty maps to guide the pixel loss, enabling more robust optimization and a better face restoration method.
The above are only embodiments of the present invention. It should be noted that those of ordinary skill in the art can make improvements without departing from the inventive concept of the present invention, and such improvements fall within the protection scope of the present invention.
Claims (8)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202010044569.3A CN111275638B (en) | 2020-01-16 | 2020-01-16 | Face repairing method for generating confrontation network based on multichannel attention selection |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202010044569.3A CN111275638B (en) | 2020-01-16 | 2020-01-16 | Face repairing method for generating confrontation network based on multichannel attention selection |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN111275638A true CN111275638A (en) | 2020-06-12 |
| CN111275638B CN111275638B (en) | 2022-10-28 |
Family
ID=71003183
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202010044569.3A Active CN111275638B (en) | 2020-01-16 | 2020-01-16 | Face repairing method for generating confrontation network based on multichannel attention selection |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN111275638B (en) |
Cited By (7)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN112686817A (en) * | 2020-12-25 | 2021-04-20 | 天津中科智能识别产业技术研究院有限公司 | Image completion method based on uncertainty estimation |
| CN113177533A (en) * | 2021-05-28 | 2021-07-27 | 济南博观智能科技有限公司 | Face recognition method and device and electronic equipment |
| CN113673458A (en) * | 2021-08-26 | 2021-11-19 | 上海明略人工智能(集团)有限公司 | A method, device and electronic device for training an object removal model |
| CN113689356A (en) * | 2021-09-14 | 2021-11-23 | 三星电子(中国)研发中心 | Image restoration method and device |
| CN113962893A (en) * | 2021-10-27 | 2022-01-21 | 山西大学 | Face image restoration method based on multi-scale local self-attention generation countermeasure network |
| CN115471901A (en) * | 2022-11-03 | 2022-12-13 | 山东大学 | Multi-pose face frontalization method and system based on generative confrontation network |
| CN115937994A (en) * | 2023-01-06 | 2023-04-07 | 南昌大学 | Data detection method based on deep learning detection model |
Citations (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20180284752A1 (en) * | 2016-05-09 | 2018-10-04 | StrongForce IoT Portfolio 2016, LLC | Methods and systems for industrial internet of things data collection in downstream oil and gas environment |
| CN109447918A (en) * | 2018-11-02 | 2019-03-08 | 北京交通大学 | Removing rain based on single image method based on attention mechanism |
| US20190236759A1 (en) * | 2018-01-29 | 2019-08-01 | National Tsing Hua University | Method of image completion |
| CN110222628A (en) * | 2019-06-03 | 2019-09-10 | 电子科技大学 | A kind of face restorative procedure based on production confrontation network |
| CN110288537A (en) * | 2019-05-20 | 2019-09-27 | 湖南大学 | Face Image Completion Method Based on Self-Attention Deep Generative Adversarial Network |
| US20190333198A1 (en) * | 2018-04-25 | 2019-10-31 | Adobe Inc. | Training and utilizing an image exposure transformation neural network to generate a long-exposure image from a single short-exposure image |
- 2020-01-16: Application CN202010044569.3A filed; granted as patent CN111275638B (status: Active)
Patent Citations (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20180284752A1 (en) * | 2016-05-09 | 2018-10-04 | StrongForce IoT Portfolio 2016, LLC | Methods and systems for industrial internet of things data collection in downstream oil and gas environment |
| US20190236759A1 (en) * | 2018-01-29 | 2019-08-01 | National Tsing Hua University | Method of image completion |
| US20190333198A1 (en) * | 2018-04-25 | 2019-10-31 | Adobe Inc. | Training and utilizing an image exposure transformation neural network to generate a long-exposure image from a single short-exposure image |
| CN109447918A (en) * | 2018-11-02 | 2019-03-08 | 北京交通大学 | Removing rain based on single image method based on attention mechanism |
| CN110288537A (en) * | 2019-05-20 | 2019-09-27 | 湖南大学 | Face Image Completion Method Based on Self-Attention Deep Generative Adversarial Network |
| CN110222628A (en) * | 2019-06-03 | 2019-09-10 | 电子科技大学 | A kind of face restorative procedure based on production confrontation network |
Cited By (11)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN112686817A (en) * | 2020-12-25 | 2021-04-20 | 天津中科智能识别产业技术研究院有限公司 | Image completion method based on uncertainty estimation |
| CN112686817B (en) * | 2020-12-25 | 2023-04-07 | 天津中科智能识别产业技术研究院有限公司 | Image completion method based on uncertainty estimation |
| CN113177533A (en) * | 2021-05-28 | 2021-07-27 | 济南博观智能科技有限公司 | Face recognition method and device and electronic equipment |
| CN113177533B (en) * | 2021-05-28 | 2022-09-06 | 济南博观智能科技有限公司 | Face recognition method and device and electronic equipment |
| CN113673458A (en) * | 2021-08-26 | 2021-11-19 | 上海明略人工智能(集团)有限公司 | A method, device and electronic device for training an object removal model |
| CN113689356A (en) * | 2021-09-14 | 2021-11-23 | 三星电子(中国)研发中心 | Image restoration method and device |
| CN113689356B (en) * | 2021-09-14 | 2023-11-24 | 三星电子(中国)研发中心 | A method and device for image restoration |
| CN113962893A (en) * | 2021-10-27 | 2022-01-21 | 山西大学 | Face image restoration method based on multi-scale local self-attention generation countermeasure network |
| CN113962893B (en) * | 2021-10-27 | 2024-07-09 | 山西大学 | Face image restoration method based on multiscale local self-attention generation countermeasure network |
| CN115471901A (en) * | 2022-11-03 | 2022-12-13 | 山东大学 | Multi-pose face frontalization method and system based on generative confrontation network |
| CN115937994A (en) * | 2023-01-06 | 2023-04-07 | 南昌大学 | Data detection method based on deep learning detection model |
Also Published As
| Publication number | Publication date |
|---|---|
| CN111275638B (en) | 2022-10-28 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN110532871B (en) | Method and apparatus for image processing | |
| CN111275638B (en) | Face repairing method for generating confrontation network based on multichannel attention selection | |
| CN113592736B (en) | Semi-supervised image deblurring method based on fused attention mechanism | |
| KR102224253B1 (en) | Teacher-student framework for light weighted ensemble classifier combined with deep network and random forest and the classification method based on thereof | |
| CN112446270B (en) | Training method of pedestrian re-recognition network, pedestrian re-recognition method and device | |
| Rahmon et al. | Motion U-Net: Multi-cue encoder-decoder network for motion segmentation | |
| CN111861894B (en) | Image de-blurring method based on generative adversarial network | |
| CN112446476A (en) | Neural network model compression method, device, storage medium and chip | |
| US20040184657A1 (en) | Method for image resolution enhancement | |
| CN115393225A (en) | A low-light image enhancement method based on multi-level feature extraction and fusion | |
| CN112446835A (en) | Image recovery method, image recovery network training method, device and storage medium | |
| CN115731597B (en) | Automatic segmentation and restoration management platform and method for mask image of face mask | |
| CN110782503B (en) | Face image synthesis method and device based on two-branch depth correlation network | |
| Swami et al. | Candy: Conditional adversarial networks based fully end-to-end system for single image haze removal | |
| CN110969109A (en) | Blink detection model under non-limited condition and construction method and application thereof | |
| Zheng et al. | Overwater image dehazing via cycle-consistent generative adversarial network | |
| CN115035007A (en) | Face aging system for generating countermeasure network based on pixel level alignment and establishment method | |
| CN116452420B (en) | Hyper-spectral image super-resolution method based on fusion of Transformer and CNN (CNN) group | |
| CN114004758B (en) | A generative adversarial network method for image color cast removal | |
| CN115035159A (en) | Video multi-target tracking method based on deep learning and time sequence feature enhancement | |
| CN117935381B (en) | Face-swapped video detection method and system based on overall forgery traces and local detail information extraction | |
| CN120147181A (en) | An image defogging algorithm guided by fog concentration information | |
| CN117237994B (en) | Method, device and system for counting personnel and detecting behaviors in oil and gas operation area | |
| CN112232221A (en) | Method, system and program carrier for processing human image | |
| CN118447265A (en) | Self-calibration illumination learning-based target detection method in low-illumination state |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | GR01 | Patent grant | |