WO2022116161A1

WO2022116161A1 - Portrait cartooning method, robot, and storage medium

Info

Publication number: WO2022116161A1
Application number: PCT/CN2020/133928
Authority: WO
Inventors: 曾钰胜; 庞建新; 程骏
Original assignee: Ubtech Robotics Corp
Current assignee: Ubtech Robotics Corp
Priority date: 2020-12-04
Filing date: 2020-12-04
Publication date: 2022-06-09
Anticipated expiration: 2023-06-04

Abstract

A portrait cartooning method, a robot, and a storage medium. The method comprises: obtaining an original portrait to be processed (102); recognizing a face in the original portrait, and performing face alignment according to recognized face key points to obtain an aligned standard portrait (104); performing portrait segmentation on the standard portrait, and filtering background information to obtain a target portrait (106); and using the target portrait as an input of a portrait cartoonlization model, and obtaining a cartooned portrait corresponding to the target portrait and outputted by the portrait cartoonlization model, the portrait cartoonlization model being obtained by training a generative adversarial network model (108). The target portrait does not contain external environment information and only contains a person, so that the processing of a subsequent portrait cartoonalization model is simplified, the method is suitable for being deployed at a robot end having limited computing power, and the obtained cartoon effect is good due to no background information.

Description

Portrait cartoonization method, robot and storage medium

technical field

本申请涉及计算机技术领域，具体涉及一种人像卡通化方法、机器人及存储介质。The present application relates to the field of computer technology, and in particular, to a method for cartoonizing a portrait, a robot and a storage medium.

Background technique

随着生成对抗网络的不断发展，目前涌现了越来越多的基于生成对抗网络的AI应用。如黑白画上色，图像风格化，人像卡通化等。其中，人像卡通化成为了研究的热点。目前的人像卡通化模型在外部环境多样的情况下，其卡通化的效果往往较差，而如果想要好的效果，往往会增加模型的复杂度。With the continuous development of Generative Adversarial Networks, more and more AI applications based on Generative Adversarial Networks have emerged. Such as black and white painting coloring, image stylization, portrait cartoonization and so on. Among them, the cartoonization of portraits has become a research hotspot. The current portrait cartoon models tend to have poor cartoon effects when the external environment is diverse, and if a good effect is desired, the complexity of the model is often increased.

由于机器人端算力有限，复杂的人像卡通化模型在机器人端很难使用，因此，亟需一种可以在机器人端使用的人像卡通化方法。Due to the limited computing power on the robot side, complex portrait cartoon models are difficult to use on the robot side. Therefore, there is an urgent need for a portrait cartoon method that can be used on the robot side.

发明内容SUMMARY OF THE INVENTION

基于此，有必要针对上述问题，提出一种卡通化效果好且适用于在机器人端使用的人像卡通化方法、机器人及存储介质。Based on this, it is necessary to address the above problems, and to propose a method for cartoonizing a portrait, a robot and a storage medium, which has a good cartoon effect and is suitable for use on the robot side.

一种人像卡通化方法，包括：A method of cartoonizing a portrait, comprising:

获取待处理的原始人物图像；Get the original character image to be processed;

对所述原始人物图像中的人脸进行识别，根据识别得到的人脸关键点进行人脸对齐，得到对齐后的标准人物图像；Recognizing the face in the original character image, and aligning the face according to the key points of the face obtained by the identification, to obtain an aligned standard character image;

基于所述标准人物图像进行人像分割，滤除背景信息，得到目标人物图像；Perform portrait segmentation based on the standard person image, filter out background information, and obtain a target person image;

将所述目标人物图像作为人像卡通化模型的输入，获取所述人像卡通化模型输出的与所述目标人物图像对应的人物卡通化图像，所述人像卡通化模型是通过生成对抗网络模型训练得到的。The target character image is used as the input of the portrait cartoon model, and the character cartoon image corresponding to the target character image output by the portrait cartoon model is obtained, and the portrait cartoon model is obtained by training a generative confrontation network model. of.

一种机器人，包括存储器和处理器，所述存储器存储有计算机程序，所述计算机程序被所述处理器执行时，使得所述处理器执行以下步骤：A robot includes a memory and a processor, the memory stores a computer program, and when the computer program is executed by the processor, the processor executes the following steps:

一种计算机可读存储介质，存储有计算机程序，所述计算机程序被处理器执行时，使得所述处理器执行以下步骤：A computer-readable storage medium storing a computer program, when executed by a processor, the computer program causes the processor to perform the following steps:

上述人像卡通化方法、机器人及存储介质，首先，对原始人物图像中的人脸进行人脸对齐，得到对齐后的标准人物图像，然后对标准人物图像进行人像分割，滤除背景信息，得到目标人物图像，目标人物图像中不存在外部环境信息，只包含人物，所以简化了后续人像卡通化模型的处理，适用于部署在算力有限的机器人端，而且由于没有背景信息，得到的卡通效果好，即在机器人端就可以实现良好的卡通化效果。In the above-mentioned portrait cartoonization method, robot and storage medium, firstly, face alignment is performed on the face in the original character image to obtain the aligned standard character image, and then the standard character image is segmented, and the background information is filtered out to obtain the target. Character images, there is no external environment information in the target character image, only characters are included, so the processing of subsequent portrait cartoon models is simplified, which is suitable for deployment on robots with limited computing power, and because there is no background information, the obtained cartoon effect is good , that is, a good cartoon effect can be achieved on the robot side.

Description of drawings

为了更清楚地说明本申请实施例或现有技术中的技术方案，下面将对实施例或现有技术描述中所需要使用的附图作简单地介绍，显而易见地，下面描述中的附图仅仅是本申请的一些实施例，对于本领域普通技术人员来讲，在不付出创造性劳动的前提下，还可以根据这些附图获得其他的附图。In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the following briefly introduces the accompanying drawings required for the description of the embodiments or the prior art. Obviously, the drawings in the following description are only These are some embodiments of the present application. For those of ordinary skill in the art, other drawings can also be obtained based on these drawings without any creative effort.

其中：in:

图1是一个实施例中人像卡通化方法的流程图；Fig. 1 is the flow chart of the portrait cartoonization method in one embodiment;

图2是一个实施例中人像对齐的示意图；Fig. 2 is a schematic diagram of portrait alignment in one embodiment;

图3是一个实施例中单人人像卡通化前后的示意图；Fig. 3 is the schematic diagram before and after cartoonization of single person portrait in one embodiment;

图4是一个实施例中生成对抗网络模型的训练流程图；Fig. 4 is the training flow chart of generative adversarial network model in one embodiment;

图5是一个实施例中单人人像分割前后的示意图；Fig. 5 is the schematic diagram before and after single person portrait segmentation in one embodiment;

图6是一个实施例中人像卡通化装置的结构框图；Fig. 6 is the structural block diagram of the portrait cartoonization device in one embodiment;

图7是一个实施例中训练模块的结构框图；Fig. 7 is a structural block diagram of a training module in one embodiment;

图8是一个实施例中机器人的内部结构图。FIG. 8 is a diagram of the internal structure of the robot in one embodiment.

Detailed ways

下面将结合本申请实施例中的附图，对本申请实施例中的技术方案进行清楚、完整地描述，显然，所描述的实施例仅仅是本申请一部分实施例，而不是全部的实施例。基于本申请中的实施例，本领域普通技术人员在没有作出创造性劳动前提下所获得的所有其他实施例，都属于本申请保护的范围。The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application. Obviously, the described embodiments are only a part of the embodiments of the present application, but not all of the embodiments. Based on the embodiments in the present application, all other embodiments obtained by those of ordinary skill in the art without creative work fall within the protection scope of the present application.

如图1所示，提出了一种人像卡通化方法，该人像卡通化方法可以应用于终端，本实施例以应用于机器人端举例说明。该人像卡通化方法具体包括以下步骤：As shown in FIG. 1 , a method for cartoonizing a portrait is proposed, and the method for cartoonizing a portrait can be applied to a terminal. This embodiment is illustrated by being applied to a robot terminal. The portrait cartoonization method specifically includes the following steps:

步骤102，获取待处理的原始人物图像。Step 102, acquiring the original character image to be processed.

其中，原始人物图像中包含有卡通化的人物图像。不同原始人物图像中人体占比往往是不一致的，比如，有些原始人物图像中人体占比是标到肚子区域，有些人体占比是标到了腿部，还有一些人体占比是标到了肩膀等。如果直接对这些原始人物图像中的人物图像进行卡通化，那么势必需要人像卡通化模型适应多种情况，相对的模型设计会特别复杂，导致计算量大，而由于机器人端往往算力有限，所以不适用于在机器人端使用。原始人物图像可以是通过摄像头直接拍摄得到的图像，也可以是从相册中获取到的图像。在一个实施例中，终端为机器人端。Among them, the original character image includes cartoon character images. The proportion of the human body in different original character images is often inconsistent. For example, in some original human images, the proportion of the human body is marked on the stomach area, some of the human body is marked on the legs, and some are marked on the shoulders, etc. . If the character images in these original character images are directly cartoonized, the cartoonized portrait model must be adapted to various situations, and the relative model design will be particularly complicated, resulting in a large amount of calculation. Not suitable for use on the robot side. The original character image may be an image directly captured by a camera, or an image obtained from an album. In one embodiment, the terminal is a robot terminal.

步骤104，对原始人物图像中的人脸进行识别，根据识别得到的人脸关键点进行人脸对齐，得到对齐后的标准人物图像。Step 104 , identify the face in the original character image, and perform face alignment according to the identified face key points to obtain an aligned standard character image.

其中，人脸关键点是指反映人脸面部特征的特征点，包括：眉毛、眼睛、鼻子、嘴巴和脸部轮廓的特征点。为了便于后续分割，采用人脸对齐方式将原始人物图像进行对齐，得到对齐后的标准人物图像。Among them, the face key points refer to the feature points that reflect the facial features of the face, including: the feature points of eyebrows, eyes, nose, mouth and facial contour. In order to facilitate the subsequent segmentation, the face alignment method is used to align the original person images to obtain the aligned standard person images.

标准人物图像是指预设的规范化的人物图像，比如，可以设置标准任务图像中头发为起始位置，肩部区域为终点位置。人脸对齐的过程相当于等距变换+均匀尺度缩放，其效果具有角度、平行性和垂直性不变特性。对齐后的标准人物图像中人脸是正向的，人体占比符合预设的占比规则。如图2所示，为一个实施例中，人像对齐的示意图。The standard person image refers to a preset normalized person image. For example, the hair in the standard task image can be set as the starting position, and the shoulder area is the ending position. The process of face alignment is equivalent to equidistant transformation + uniform scale scaling, and its effect has the invariant characteristics of angle, parallelism and verticality. The face in the aligned standard human image is positive, and the proportion of the human body conforms to the preset proportion rule. As shown in FIG. 2 , it is a schematic diagram of portrait alignment in one embodiment.

具体地，人脸对齐的目标是人脸中的5个关键点(左眼、右眼、鼻子、嘴左、嘴右)映射到目标空间的指定位置，而其他部分发生非失真的变化。5个关键点的作用是将人脸映射为正脸，然后其他部分随之也相应地进行映射到目标空间。目标空间的选择是根据人体占比进行选择的，如果人体占比小，相应的目标空间也比较小。Specifically, the goal of face alignment is to map five key points in the face (left eye, right eye, nose, mouth left, and mouth right) to specified positions in the target space, while other parts undergo undistorted changes. The role of the 5 key points is to map the face to the frontal face, and then the other parts are mapped to the target space accordingly. The selection of the target space is based on the proportion of the human body. If the proportion of the human body is small, the corresponding target space is also relatively small.

步骤106，基于标准人物图像进行人像分割，滤除背景信息，得到目标人物图像。Step 106, perform portrait segmentation based on the standard person image, filter out the background information, and obtain the target person image.

其中，由于标准人物图像具有统一的单一特点，所以基于标注人物图像进行人像分割更加的简单，只需要采用轻量化的人像分割模型即可实现。进行人像分割后滤除背景信息，这样就得到了没有外部环境信息的目标人物图像，这样后续进行卡通化时，就不会出现因受外部环境影响而导致的卡通效果差。如图3所示，为一个实施例中，单人人像分割前后的示意图。Among them, since the standard person images have a unified and single feature, it is simpler to perform portrait segmentation based on annotated person images, and only a lightweight portrait segmentation model can be used. After the portrait is segmented, the background information is filtered out, so that the target person image without the external environment information is obtained, so that the cartoon effect will not be poor due to the influence of the external environment during subsequent cartoonization. As shown in FIG. 3 , it is a schematic diagram before and after segmentation of a single person portrait in one embodiment.

步骤108，将目标人物图像作为人像卡通化模型的输入，获取人像卡通化模型输出的与目标人物图像对应的人物卡通化图像，人像卡通化模型是通过生成对抗网络模型训练得到的。In step 108, the target character image is used as the input of the portrait cartoon model, and the character cartoon image corresponding to the target character image output by the portrait cartoon model is obtained, and the portrait cartoon model is obtained by training a generative confrontation network model.

其中，由于目标人物图像具有统一身体占比，且不具有任何背景信息干扰，这大大简化了对人像卡通化模型的训练，且有助于提高卡通化的效果。如图4所示，为一个实施例中，人物卡通化前后的示意图。Among them, since the target person image has a uniform body proportion and does not have any background information interference, this greatly simplifies the training of the portrait cartoon model and helps to improve the cartoon effect. As shown in FIG. 4 , it is a schematic diagram of characters before and after cartoonization in one embodiment.

上述人像卡通化方法，首先，对原始人物图像中的人脸进行人脸对齐，得到对齐后的标准人物图像，然后对标准人物图像进行人像分割，滤除背景信息，得到目标人物图像，目标人物图像中不存在外部环境信息，只包含人物，所以简化了后续人像卡通化模型的处理，适用于部署在算力有限的机器人端，而且由于没有背景信息，得到的卡通效果好，即在机器人端就可以实现良好的卡通化效果。In the above-mentioned portrait cartoonization method, firstly, face alignment is performed on the face in the original character image to obtain the aligned standard character image, then the standard character image is segmented, and the background information is filtered out to obtain the target character image, the target character image. There is no external environment information in the image, only characters are included, so the processing of subsequent portrait cartoon models is simplified, which is suitable for deployment on the robot side with limited computing power, and because there is no background information, the obtained cartoon effect is good, that is, on the robot side A good cartoon effect can be achieved.

在一个实施例中，所述生成对抗网络模型包括：正向生成器、反向生成器、第一判别器和第二判别器；所述正向生成器的输出与所述第一判别器连接，所述反向生成器的输出与所述第二判别器连接；所述正向生成器用于将输入的人物图像转换为人物卡通图像，所述反向生成器用于将输入的人物卡通图像转换为人物图像，所述第一判别器用于计算输入的图像属于人物卡通图像的概率，所述第二判别网络模型用于计算输入的图像属于人物图像的概率。In one embodiment, the generative adversarial network model includes: a forward generator, a reverse generator, a first discriminator and a second discriminator; the output of the forward generator is connected to the first discriminator , the output of the reverse generator is connected to the second discriminator; the forward generator is used to convert the input character image into a character cartoon image, and the reverse generator is used to convert the input character cartoon image into is a character image, the first discriminator is used to calculate the probability that the input image belongs to a character cartoon image, and the second discriminant network model is used to calculate the probability that the input image belongs to a character image.

其中，人像卡通化模型是根据生成对抗网络模型训练得到的，该人像卡通化模型由于是在无监督的条件下训练得到的，需要其他的模型来辅助训练。生成对抗网络模型包括：两个生成器，为了区分，称为“正向生成器”和“反向生成器”，然后还有两个判别器，分别称为“第一判别器”和“第二判别器”。最后训练得到的正向生成器为上述人像卡通化模型。Among them, the portrait cartoon model is obtained by training the generative adversarial network model. Since the portrait cartoon model is trained under unsupervised conditions, other models are needed to assist the training. The generative adversarial network model includes: two generators, called "forward generator" and "reverse generator" for distinction, and then two discriminators, called "first discriminator" and "first discriminator" respectively. Second Discriminator". The final trained forward generator is the above-mentioned portrait cartoon model.

具体地，正向生成器与第一判别器连接，反向生成器与第二判别器连接。正向生成器的作用是将输入的人物图像转换为人物卡通图像，反向生成器的作用是将输入的人物卡通图像转换为人物图像。第一判别器的作用是计算输入的图像属于卡通图像的概率，第二判别网络模型作用是计算输入的图像属于人物图像的概率。也就是说，第一判别器的目的是将输入的真实的人物卡通图像判断为真，将正向生成器输出的图像判断为假，如果正向生成器输出图像足以骗过第一判断器，说明正向生成器输出的图像具有了卡通化图像的特征，使得第一判别网络模型无法识别出真伪。即正向生成器与第一判别器采用的是互相对抗学习的方式。同理，反向生成器与第二判别器采用的也是互相对抗学习的方式。Specifically, the forward generator is connected to the first discriminator, and the reverse generator is connected to the second discriminator. The function of the forward generator is to convert the input character image into a character cartoon image, and the function of the reverse generator is to convert the input character cartoon image into a character image. The function of the first discriminator is to calculate the probability that the input image belongs to a cartoon image, and the function of the second discriminator network model is to calculate the probability that the input image belongs to a character image. That is to say, the purpose of the first discriminator is to judge the input real character cartoon image as true, and judge the image output by the forward generator as false, if the output image of the forward generator is enough to fool the first judger, It shows that the image output by the forward generator has the characteristics of cartoon image, so that the first discriminant network model cannot identify the authenticity. That is, the forward generator and the first discriminator adopt the way of learning against each other. In the same way, the reverse generator and the second discriminator also adopt the way of learning against each other.

如图5所示，在一个实施例中，所述生成对抗网络模型的训练步骤包括：As shown in Figure 5, in one embodiment, the training step of the generative adversarial network model includes:

步骤502，获取训练图像集，训练图像集包括：训练人物图像集和训练人物卡通化图像集。Step 502 , acquiring a training image set, where the training image set includes: a training character image set and a training character cartoon image set.

其中，生成对抗网络模型是一种无监督的训练方式，由于训练图像没有标注，所以需要两个图像集合，训练人物图像集和训练人物卡通化图像集，训练人物图像集与训练人物卡通化图像集是非对称的。Among them, the generative adversarial network model is an unsupervised training method. Since the training images are not labeled, two image sets are required, the training character image set and the training character cartoon image set, the training character image set and the training character cartoon image set. Sets are asymmetric.

步骤504，将训练人物图像输入正向生成器，得到第一输出图像、将第一输出图像输入第一判别器得到对应输出的第一概率。Step 504 , input the training person image into the forward generator to obtain the first output image, and input the first output image into the first discriminator to obtain the first probability of the corresponding output.

其中，从训练人物图像集中获取训练人物图像，将训练人物图像作为正向生成器的输入，得到输出的第一输出图像。正向生成器的目的是将输入的训练人物图像转换为人物卡通化图像，由于没有标注，所以需要将第一输出图像作为第一判别器的输入得到输出的第一概率，第一概率是指判断得到的第一输出图像属于人物卡通化图像的概率。第一判别器的目的是为了识别出第一输出图像是伪造的人物卡通图像。The training character image is obtained from the training character image set, and the training character image is used as the input of the forward generator to obtain the outputted first output image. The purpose of the forward generator is to convert the input training character image into a character cartoon image. Since there is no labeling, it is necessary to use the first output image as the input of the first discriminator to obtain the first probability of the output. The first probability refers to Determine the probability that the obtained first output image belongs to the cartoonized image of the character. The purpose of the first discriminator is to identify that the first output image is a fake character cartoon image.

步骤506，将训练人物卡通图像输入反向生成器，得到第二输出图像、将第二输出图像输入第二判别器得到对应输出的第二概率。Step 506: Input the cartoon image of the training character into the reverse generator to obtain the second output image, and input the second output image into the second discriminator to obtain the second probability of the corresponding output.

其中，从训练人物卡通图像集中获取训练人物卡通图像，将训练人物卡通图像作为反向生成器的输入，得到输出的第二输出图像。反向生成器的目的是将输入的训练人物卡通图像转换为人物图像，然后利用第二判别器来识别，第二判别器的目的是输出第二输出图像是伪造的人物图像。The training character cartoon image is obtained from the training character cartoon image set, and the training character cartoon image is used as the input of the reverse generator to obtain the output second output image. The purpose of the reverse generator is to convert the input training character cartoon image into a character image, and then use the second discriminator to identify it, and the purpose of the second discriminator is to output that the second output image is a fake character image.

步骤508，将第一输出图像输入反向生成器，得到第三输出图像；将第二输出图像输入正向生成器，得到第四输出图像。Step 508: Input the first output image into the reverse generator to obtain the third output image; input the second output image into the forward generator to obtain the fourth output image.

其中，为了使得图像转换过程中尽量保留原本图像的内容，所以希望将人物图像通过正向生成器转换为人物卡通图像后，再通过反向生成器能够得到原来的图像。所以经过循环后的图像差异性越小越好，即训练人物图像与第三输出图像的差异性越小越好，训练人物卡通图像与第四输出图像的差异性越小越好。Among them, in order to keep the content of the original image as much as possible during the image conversion process, it is hoped that after the character image is converted into a character cartoon image through the forward generator, the original image can be obtained through the reverse generator. Therefore, the smaller the difference between the images after the cycle, the better, that is, the smaller the difference between the training character image and the third output image, the better, and the smaller the difference between the training character cartoon image and the fourth output image, the better.

步骤510，计算训练人物图像与第三输出图像的第一差异值，计算训练人物卡通图像与第四输出图像的第二差异值。Step 510: Calculate the first difference value between the training character image and the third output image, and calculate the second difference value between the training character cartoon image and the fourth output image.

其中，差异值越小，说明图像之间的相似度越大。所以我们可以通过计算图像相似度的方式来计算得到差异值。即计算训练人物图像与第三输出图像之间的相似度(小于1)，然后用1减去相似度即可得到第一差异值。同样地，计算训练人物卡通图像与第四输出图像之间的相似度，然后用1减去相似度即可得到第二差异值。Among them, the smaller the difference value, the greater the similarity between the images. So we can calculate the difference value by calculating the image similarity. That is, the similarity (less than 1) between the training person image and the third output image is calculated, and then the similarity is subtracted from 1 to obtain the first difference value. Similarly, calculate the similarity between the cartoon image of the training character and the fourth output image, and then subtract the similarity from 1 to obtain the second difference value.

步骤512，根据第一概率、第二概率、第一差异值和第二差异值计算得到第一损失函数值。Step 512: Calculate and obtain a first loss function value according to the first probability, the second probability, the first difference value and the second difference value.

其中，第一概率可以用于计算正向生成器和第一判别器的对抗损失值，第二概率用于计算反向生成器和第二判别器的对抗损失值，第一差异值和第二差异值用于计算循环损失值。三者组合起来构成了第一损失函数值。第一损失函数值也是传统的损失函数值。The first probability can be used to calculate the adversarial loss value of the forward generator and the first discriminator, the second probability can be used to calculate the adversarial loss value of the reverse generator and the second discriminator, the first difference value and the second The difference value is used to calculate the cycle loss value. The three are combined to form the first loss function value. The first loss function value is also a conventional loss function value.

步骤514，将训练人物图像作为人脸特征提取模型的输入，获取人脸特征提取模型提取到的第一人脸特征。Step 514 , taking the training person image as the input of the facial feature extraction model, and acquiring the first facial feature extracted by the facial feature extraction model.

其中，为了让生成的卡通化人脸更贴近真实人脸，即与真实人脸具有更多地相同的人脸特征。本申请实施例中创新地引入了人脸特征提取模型，首先，将训练人物图像作为人脸特征提取模型的输入，提取得到第一人脸特征。人脸特征提取模型用于对训练人物图像中的人脸特征进行识别提取，在一个实施例中，人脸特征提取模型可以采用轻量级的mobilefacenet来实现。Among them, in order to make the generated cartoon face closer to the real face, that is, to have more of the same face features as the real face. A face feature extraction model is innovatively introduced in the embodiment of the present application. First, the training person image is used as the input of the face feature extraction model to extract the first face feature. The facial feature extraction model is used to identify and extract the facial features in the training character images. In one embodiment, the facial feature extraction model can be implemented by using a lightweight mobilefacenet.

步骤516，将第一输出图像作为人脸特征提取模型的输入，获取人脸特征提取模型提取到的第二人脸特征。Step 516 , using the first output image as the input of the face feature extraction model, and obtain the second face feature extracted by the face feature extraction model.

其中，将第一输出图像也作为人脸特征提取模型的输入，得到提取的第二人脸特征，便于后续与第一人脸特征进行比较，以使得第一人脸特征与第二人脸特征更接近。Wherein, the first output image is also used as the input of the face feature extraction model, and the extracted second face feature is obtained, which is convenient for subsequent comparison with the first face feature, so that the first face feature and the second face feature are Closer.

步骤518，计算第一人脸特征与第二人脸特征之间的相似度，根据相似度计算得到第二损失函数值。Step 518: Calculate the similarity between the first face feature and the second face feature, and calculate the second loss function value according to the similarity.

其中，为了使得训练得到的模型输出的第二人脸特征与第一人脸特征更相似，在训练过程中，计算第一人脸特征与第二人脸特征之间的相似度，然后根据相似度计算得到第二损失函数值，便于后续根据第二损失函数值来进行模型的调整。在一个实施例中，第二损失函数值采用如下公式计算得到：Among them, in order to make the second face feature output by the trained model more similar to the first face feature, in the training process, the similarity between the first face feature and the second face feature is calculated, and then according to the similarity The second loss function value is obtained by the degree calculation, which facilitates the subsequent adjustment of the model according to the second loss function value. In one embodiment, the second loss function value is calculated using the following formula:

loss ₂＝1-cosine(fea_realA,fea_fakeA2B) loss ₂ =1-cosine(fea_realA,fea_fakeA2B)

其中，cosine()表示相似度，fea_realA表示第一人脸特征，fea_fakeA2B表示第二人脸特征。Among them, cosine() represents similarity, fea_realA represents the first face feature, and fea_fakeA2B represents the second face feature.

步骤520，根据第一损函数值和第二损失函数值计算得到总损失值，总损失值与相似度成反相关。Step 520: Calculate a total loss value according to the first loss function value and the second loss function value, and the total loss value is inversely correlated with the similarity.

其中，将第一损失函数值和第二损失函数值都作为总损失值的一部分，然后根据总损失值来更新模型中的参数权重，以使得总损失值朝着缩小的方向更新，直到达到收敛条件。具体地，总损失函数值表示为：loss＝loss ₁+loss ₂。loss ₁表示第一损失函数值，loss ₂表示第二损失函数值。 Among them, both the first loss function value and the second loss function value are taken as part of the total loss value, and then the parameter weights in the model are updated according to the total loss value, so that the total loss value is updated in a decreasing direction until convergence is achieved condition. Specifically, the total loss function value is expressed as: loss=loss ₁ +loss ₂ . loss ₁ represents the first loss function value, and loss ₂ represents the second loss function value.

步骤522，根据总损失值更新正向生成器、反向生成器、第一判别器和第二判别器中的权重参数，依此循环，直到总损失值达到收敛条件，将训练完成得到的正向生成器作为人像卡通化模型。Step 522: Update the weight parameters in the forward generator, the reverse generator, the first discriminator and the second discriminator according to the total loss value, and repeat this cycle until the total loss value reaches the convergence condition, and the positive result obtained after training is completed. To the generator as a cartoonized model of a portrait.

其中，计算得到总损失值后，更新正向生成器、反向生成器、第一判别器和第二判别器中的权重参数，使总损失值朝着减小的方向变化，然后以此循环，直到最后总损失值达到收敛条件，训练完成，将训练完成得到的正向生成器作为人像卡通化模型。Among them, after calculating the total loss value, update the weight parameters in the forward generator, the reverse generator, the first discriminator and the second discriminator, so that the total loss value changes in the direction of decreasing, and then repeat this cycle , until the final total loss value reaches the convergence condition, the training is completed, and the forward generator obtained after the training is used as the portrait cartoon model.

在一个实施例中，所述正向生成器包括：编码器和解码器，所述编码器和所述解码器是由Hourglass模块组成的。In one embodiment, the forward generator includes an encoder and a decoder, and the encoder and the decoder are composed of Hourglass modules.

其中，为了使得生成的人物卡通图像更加的逼真，在训练阶段，将正向生成器中的编码器和解码器分别采用Hourglass模块来实现。Hourglass模块有助于提供模型的特征抽象和重建能力。Hourglass模块是由残差模块(Residual)组成的。Hourglass(漏斗)模块的特点是先进行最大池化，然后进行上采样，通过最大池化可以特征图缩小，然后再进行上采样进行扩大。实验表明，通过Hourglass模块来实现编码器和解码器的作用可以大大提升模型的卡通化的逼真度。Among them, in order to make the generated cartoon images of characters more realistic, in the training phase, the encoder and decoder in the forward generator are implemented by Hourglass modules respectively. The Hourglass module helps to provide feature abstraction and reconstruction capabilities of the model. The Hourglass module is composed of the residual module (Residual). The feature of the Hourglass (funnel) module is to perform maximum pooling first, and then perform upsampling. Through maximum pooling, the feature map can be reduced, and then upsampling is performed to expand. Experiments show that implementing the functions of encoder and decoder through Hourglass module can greatly improve the cartoon-like fidelity of the model.

在一个实施例中，所述对所述原始人物图像中的人脸进行识别，根据识别得到的人脸关键点进行人脸对齐，得到对齐后的标准人物图像，包括：根据预设的人体占比确定人脸关键点在预设空间的目标坐标位置；将所述人脸关键点映射到预设空间的所述目标坐标位置，得到预设空间内对齐后的标准人物图像。In one embodiment, the process of recognizing the face in the original character image, aligning the face according to the key points of the face obtained by identification, and obtaining the aligned standard character image, includes: according to the preset human body occupancy Determine the target coordinate position of the face key point in the preset space; map the face key point to the target coordinate position in the preset space to obtain a standard human image aligned in the preset space.

其中，传统的人脸对齐仅仅是将脸部进行对齐，预设空间往往比较小，比如，尺寸在112X112，然后在该有限的空间内，设置五个关键点映射后的坐标位置。比如，5个关键点(左眼、右眼、鼻子、嘴左、嘴右)的坐标位置分别为{[38.2946,51.6963]，[73.5318,51.5014]，[56.0252,71.7366]，[41.5493,92.3655]，[70.7299,92.2041]}。如果标准人物图像不仅需要包含人脸区域，还需要扩展到其他部分，比如，延至肩膀处。那么相应的空间需要扩大，比如，尺寸需要设置为256X256，且相应的5个关键点的坐标位置也需要发生改变，以使得脸部上方头发和脸部下方到肩膀的部位可以显示在标准人物图像中。Among them, the traditional face alignment is only to align the face, and the preset space is often relatively small, for example, the size is 112X112, and then in this limited space, set the coordinate positions of five key points after mapping. For example, the coordinate positions of the 5 key points (left eye, right eye, nose, mouth left, mouth right) are {[38.2946, 51.6963], [73.5318, 51.5014], [56.0252, 71.7366], [41.5493, 92.3655] , [70.7299, 92.2041]}. If the standard person image needs to contain not only the face area, but also needs to be extended to other parts, for example, to the shoulders. Then the corresponding space needs to be expanded. For example, the size needs to be set to 256X256, and the coordinate positions of the corresponding 5 key points also need to be changed, so that the hair above the face and the parts from the bottom of the face to the shoulders can be displayed in the standard character image. middle.

具体地，首先根据图像中预设的人体占比确定人脸关键点映射到预设空间的目标坐标位置，目标坐标位置即指定位置，比如，如果预设的人体占比是从头发开始到肩部，那么在人脸对齐时，为了给头发部分和头以下的部分预留出映射空间，需要将人脸关键点坐标尽量映射到图像中间部位。以图像左下角为原点坐标进行说明，与传统的人脸对齐相比，可以将左眼、右眼位置的纵坐标降低，将左眼的横坐标和右眼的横坐标往图像中间靠近，即扩大左眼的横坐标，减少右眼的横坐标。这样可以为上面的头发部分预留空间，同时为脸的左右部分预留空间，同样的原理，鼻子的横坐标保持不变，纵坐标减少，为脸部下面预留空间，同时嘴左和嘴右的横坐标往图像中间靠拢，纵坐标增加等。Specifically, first determine the target coordinate position where the key points of the face are mapped to the preset space according to the preset proportion of the human body in the image, and the target coordinate position is the designated position. For example, if the preset proportion of the human body starts from the hair to the shoulders In order to reserve the mapping space for the hair part and the part below the head when aligning the face, it is necessary to map the coordinates of the key points of the face to the middle part of the image as much as possible. Taking the lower left corner of the image as the origin coordinate for illustration, compared with the traditional face alignment, the ordinate of the left eye and the right eye can be lowered, and the abscissa of the left eye and the abscissa of the right eye can be moved closer to the middle of the image, that is Expand the abscissa of the left eye and decrease the abscissa of the right eye. In this way, space can be reserved for the upper part of the hair, and space is reserved for the left and right parts of the face. The same principle, the abscissa of the nose remains unchanged, the ordinate decreases, and space is reserved for the lower part of the face, while the left side of the mouth and the mouth The abscissa on the right moves closer to the middle of the image, and the ordinate increases.

在一个实施例中，所述基于所述标准人物图像进行人像分割，滤除背景信息，得到目标人物图像，包括：将所述标准人物图像作为人像分割模型的输入，所述人像分割模型用于从所述标准人物图像中分割出人物图像；基于所述人像分割模型输出的所述人物图像滤除背景信息，生成所述目标人物图像。In one embodiment, the performing portrait segmentation based on the standard human image, filtering out background information, and obtaining the target human image includes: using the standard human image as an input of a portrait segmentation model, and the portrait segmentation model is used for A person image is segmented from the standard person image; background information is filtered out based on the person image output by the portrait segmentation model, and the target person image is generated.

其中，人像分割模型用于对标准人物图像中的目标人物图像进行分割，得到目标人物图像。人像分割模型采用轻量化的卷积神经网络mobilenetv2来实现。基于人像分割模型输出的人物图像滤波背景信息，可以为分割出的人物图像加上统一的白色背景，生成目标人物图像。The portrait segmentation model is used to segment the target person image in the standard person image to obtain the target person image. The portrait segmentation model is implemented by a lightweight convolutional neural network mobilenetv2. Based on the filtered background information of the human image output by the portrait segmentation model, a uniform white background can be added to the segmented human image to generate the target human image.

在一个实施例中，所述人像分割模型采用卷积神经网络训练得到，包括多个卷积层，所述卷积层用于对图像进行特征提取；在采用所述卷积层进行特征提取之前，还包括：对所述图像进行边缘扩增，以使进行卷积后得到的图像分辨率与输入的标准人物图像的分辨率一致。In one embodiment, the portrait segmentation model is obtained by training a convolutional neural network, and includes a plurality of convolutional layers, and the convolutional layers are used to perform feature extraction on images; before using the convolutional layers for feature extraction , and further comprising: performing edge amplification on the image, so that the resolution of the image obtained after the convolution is consistent with the resolution of the input standard person image.

其中，为了使得卷积前后图像分辨率保持不变，在进行卷积之前，先对图像进行边缘扩增，即扩大图像，然后再基于扩大图像进行卷积操作，以使得卷积操作后得到的标准人物图像分辨率与原始输入的人物图像的分辨率保持一致，从而可以保证人像卡通化的精度。Among them, in order to keep the resolution of the image before and after the convolution unchanged, before the convolution, the edge of the image is enlarged, that is, the image is enlarged, and then the convolution operation is performed based on the enlarged image, so that the result obtained after the convolution operation is obtained. The resolution of the standard character image is consistent with the resolution of the original input character image, so as to ensure the accuracy of the cartoonization of the portrait.

如图6所示，在一个实施例中，提出了一种人像卡通化装置，包括：As shown in FIG. 6 , in one embodiment, a device for cartoonizing a portrait is proposed, including:

获取模块602，用于获取待处理的原始人物图像；an acquisition module 602, configured to acquire the original character image to be processed;

对齐模块604，用于对所述原始人物图像中的人脸进行识别，根据识别得到的人脸关键点进行人脸对齐，得到对齐后的标准人物图像；Alignment module 604, configured to identify the human face in the original character image, and perform face alignment according to the identified face key points to obtain an aligned standard character image;

分割模块606，用于基于所述标准人物图像进行人像分割，滤除背景信息，得到目标人物图像；A segmentation module 606, configured to perform portrait segmentation based on the standard person image, filter out background information, and obtain a target person image;

卡通化模块608，用于将所述目标人物图像作为人像卡通化模型的输入，获取所述人像卡通化模块输出的与所述目标人物图像对应的人物卡通化图像，所述人像卡通化模型是通过生成对抗网络模型训练得到的。The cartoonization module 608 is configured to use the target character image as the input of the portrait cartoonization model, and obtain the character cartoonization image corresponding to the target character image output by the portrait cartoonization module, and the portrait cartoonization model is It is obtained by training a generative adversarial network model.

在一个实施例中，所述生成对抗网络模型包括：正向生成器、反向生成器、第一判别器和第二判别器；所述正向生成器的输出与所述第一判别器连接，所述反向生成器的输出与所述第二判别器连接；所述正向生成器用于将输入的人物图像转换为人物卡通图像，所述反向生成器用于将输入的人物卡通图像转换为人物图像，所述第一判别器用于计算输入的图像属于卡通图像的概率，所述第二判别网络模型用于计算输入的图像属于人物图像的概率。In one embodiment, the generative adversarial network model includes: a forward generator, a reverse generator, a first discriminator and a second discriminator; the output of the forward generator is connected to the first discriminator , the output of the reverse generator is connected to the second discriminator; the forward generator is used to convert the input character image into a character cartoon image, and the reverse generator is used to convert the input character cartoon image into is a character image, the first discriminator is used to calculate the probability that the input image belongs to a cartoon image, and the second discriminant network model is used to calculate the probability that the input image belongs to a character image.

在一个实施例中，上述人像卡通化装置还包括：训练模块。In one embodiment, the above-mentioned device for cartoonizing a portrait further includes: a training module.

如图7所示，训练模块601包括：As shown in Figure 7, the training module 601 includes:

训练获取模块601A，用于获取训练图像集，所述训练图像集包括：训练人物图像集和训练人物卡通化图像集；A training acquisition module 601A, configured to acquire a training image set, the training image set includes: a training character image set and a training character cartoon image set;

第一输入模块601B，用于将训练人物图像输入正向生成器，得到第一输出图像、将所述第一输出图像输入第一判别器得到对应输出的第一概率；The first input module 601B is used for inputting the training character image into the forward generator to obtain the first output image, and inputting the first output image into the first discriminator to obtain the first probability of the corresponding output;

第二输入模块601C，用于将训练人物卡通图像输入反向生成器，得到第二输出图像、将所述第二输出图像输入第二判别器得到对应输出的第二概率；The second input module 601C is used to input the cartoon image of the training character into the reverse generator to obtain the second output image, and input the second output image to the second discriminator to obtain the second probability of the corresponding output;

第三输入模块601D，用于将所述第一输出图像输入所述反向生成器，得到第三输出图像；将所述第二输出图像输入所述正向生成器，得到第四输出图像；The third input module 601D is configured to input the first output image into the reverse generator to obtain a third output image; input the second output image into the forward generator to obtain a fourth output image;

第一计算模块601E，用于计算所述训练人物图像与所述第三输出图像的第一差异值，计算所述训练人物卡通图像与所述第四输出图像的第二差异值；A first calculation module 601E, configured to calculate the first difference value between the training character image and the third output image, and calculate the second difference value between the training character cartoon image and the fourth output image;

第二计算模块601F，用于根据所述第一概率、第二概率、第一差异值和第二差异值计算得到第一损失函数值；The second calculation module 601F is configured to calculate and obtain the first loss function value according to the first probability, the second probability, the first difference value and the second difference value;

第一提取模块601G，用于将所述训练人物图像作为人脸特征提取模型的输入，获取所述人脸特征提取模型提取到的第一人脸特征；The first extraction module 601G is used to use the training character image as the input of the facial feature extraction model, and obtain the first facial feature extracted by the facial feature extraction model;

第二提取模块601H，用于将所述第一输出图像作为人脸特征提取模型的输入，获取所述人脸特征提取模型提取到的第二人脸特征；The second extraction module 601H is configured to use the first output image as the input of the facial feature extraction model, and obtain the second facial feature extracted by the facial feature extraction model;

第三计算模块601I，用于计算所述第一人脸特征与所述第二人脸特征之间的相似度，根据所述相似度计算得到第二损失函数值；The third calculation module 601I is used to calculate the similarity between the first face feature and the second face feature, and calculate the second loss function value according to the similarity;

第四计算模块601J，用于根据所述第一损函数值和所述第二损失函数值计算得到总损失值，所述总损失值与所述相似度成反相关；a fourth calculation module 601J, configured to calculate and obtain a total loss value according to the first loss function value and the second loss function value, where the total loss value is inversely correlated with the similarity;

更新模块601K，用于根据所述总损失值更新所述正向生成器、所述反向生成器、所述第一判别器和所述第二判别器中的权重参数，依此循环，直到所述总损失值达到收敛条件，将训练完成得到的正向生成器作为所述人像卡通化模型。The updating module 601K is configured to update the weight parameters in the forward generator, the reverse generator, the first discriminator and the second discriminator according to the total loss value, and repeat this cycle until The total loss value reaches the convergence condition, and the forward generator obtained after training is used as the portrait cartoon model.

在一个实施例中，所述对齐模块还用于根据预设的人体占比确定人脸关键点在预设空间的目标坐标位置；将所述人脸关键点映射到预设空间的所述目标坐标位置，得到预设空间内对齐后的标准人物图像。In one embodiment, the alignment module is further configured to determine the target coordinate position of the key points of the face in the preset space according to the preset proportion of the human body; map the key points of the face to the target in the preset space Coordinate position, get the standard human image aligned in the preset space.

在一个实施例中，分割模块还用于将所述标准人物图像作为人像分割模型的输入，所述人像分割模型用于从所述标准人物图像中分割出人物图像；基于所述人像分割模型输出的所述人物图像滤除背景信息，生成所述目标人物图像。In one embodiment, the segmentation module is further configured to use the standard person image as an input of a portrait segmentation model, and the portrait segmentation model is used to segment a person image from the standard person image; output based on the portrait segmentation model The background information of the person image is filtered out to generate the target person image.

在一个实施例中，所述人像分割模型采用卷积神经网络训练得到，包括多个卷积层，所述卷积层用于对图像进行特征提取；上述装置还包括：扩增模块，用于对所述图像进行边缘扩增，以使进行卷积后得到的图像分辨率与输入的标准人物图像的分辨率一致。In one embodiment, the portrait segmentation model is obtained by training a convolutional neural network, and includes a plurality of convolutional layers, and the convolutional layers are used to perform feature extraction on the image; the above-mentioned apparatus further includes: an amplification module for Edge augmentation is performed on the image, so that the resolution of the image obtained after the convolution is consistent with the resolution of the input standard person image.

图8示出了一个实施例中机器人的内部结构图。如图8所示，该机器人包括通过系统总线连接的处理器、存储器、摄像头和网络接口。其中，存储器包括非易失性存储介质和内存储器。该机器人的非易失性存储介质存储有操作系统，还可存储有计算机程序，该计算机程序被处理器执行时，可使得处理器实现上述的人像卡通化方法。该内存储器中也可储存有计算机程序，该计算机程序被处理器执行时，可使得处理器执行上述的人像卡通化方法。本领域技术人员可以理解，图8中示出的结构，仅仅是与本申请方案相关的部分结构的框图，并不构成对本申请方案所应用于其上的机器人的限定，具体的机器人可以包括比图中所示更多或更少的部件，或者组合某些部件，或者具有不同的部件布置。Figure 8 shows a diagram of the internal structure of the robot in one embodiment. As shown in Figure 8, the robot includes a processor, memory, camera, and network interface connected through a system bus. Wherein, the memory includes a non-volatile storage medium and an internal memory. The non-volatile storage medium of the robot stores an operating system, and also stores a computer program. When the computer program is executed by the processor, the processor can realize the above-mentioned method for cartoonizing a portrait. A computer program may also be stored in the internal memory, and when the computer program is executed by the processor, the processor can execute the above-mentioned method for cartoonizing a portrait. Those skilled in the art can understand that the structure shown in FIG. 8 is only a block diagram of a partial structure related to the solution of the present application, and does not constitute a limitation on the robot to which the solution of the present application is applied. More or fewer components are shown in the figures, either in combination or with different arrangements of components.

在一个实施例中，提出了一种机器人，包括存储器和处理器，所述存储器存储有计算机程序，所述计算机程序被所述处理器执行时，使得所述处理器执行上述人像卡通化方法的步骤。In one embodiment, a robot is proposed, including a memory and a processor, wherein the memory stores a computer program, and when the computer program is executed by the processor, the processor causes the processor to perform the above-mentioned method for cartoonizing a portrait. step.

在一个实施例中，提出了一种计算机可读存储介质，存储有计算机程序，所述计算机程序被处理器执行时，使得所述处理器执行上述人像卡通化方法的步骤。In one embodiment, a computer-readable storage medium is provided, which stores a computer program, and when the computer program is executed by a processor, causes the processor to perform the steps of the above-mentioned method for cartoonizing a portrait.

本领域普通技术人员可以理解实现上述实施例方法中的全部或部分流程，是可以通过计算机程序来指令相关的硬件来完成，所述的程序可存储于一非易失性计算机可读取存储介质中，该程序在执行时，可包括如上述各方法的实施例的流程。其中，本申请所提供的各实施例中所使用的对存储器、存储、数据库或其它介质的任何引用，均可包括非易失性和/或易失性存储器。非易失性存储器可包括只读存储器(ROM)、可编程ROM(PROM)、电可编程ROM(EPROM)、电可擦除可编程ROM(EEPROM)或闪存。易失性存储器可包括随机存取存储器(RAM)或者外部高速缓冲存储器。作为说明而非局限，RAM以多种形式可得，诸如静态RAM(SRAM)、动态RAM(DRAM)、同步DRAM(SDRAM)、双数据率SDRAM(DDRSDRAM)、增强型SDRAM(ESDRAM)、同步链路(Synchlink)DRAM(SLDRAM)、存储器总线(Rambus)直接RAM(RDRAM)、直接存储器总线动态RAM(DRDRAM)、以及存储器总线动态RAM(RDRAM)等。Those of ordinary skill in the art can understand that all or part of the processes in the methods of the above embodiments can be implemented by instructing relevant hardware through a computer program, and the program can be stored in a non-volatile computer-readable storage medium , when the program is executed, it may include the flow of the above-mentioned method embodiments. Wherein, any reference to memory, storage, database or other medium used in the various embodiments provided in this application may include non-volatile and/or volatile memory. Nonvolatile memory may include read only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in various forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDRSDRAM), enhanced SDRAM (ESDRAM), synchronous chain Road (Synchlink) DRAM (SLDRAM), memory bus (Rambus) direct RAM (RDRAM), direct memory bus dynamic RAM (DRDRAM), and memory bus dynamic RAM (RDRAM), etc.

以上实施例的各技术特征可以进行任意的组合，为使描述简洁，未对上述实施例中的各个技术特征所有可能的组合都进行描述，然而，只要这些技术特征的组合不存在矛盾，都应当认为是本说明书记载的范围。The technical features of the above embodiments can be combined arbitrarily. In order to make the description simple, all possible combinations of the technical features in the above embodiments are not described. However, as long as there is no contradiction in the combination of these technical features It is considered to be the range described in this specification.

以上所述实施例仅表达了本申请的几种实施方式，其描述较为具体和详细，但并不能因此而理解为对本申请专利范围的限制。应当指出的是，对于本领域的普通技术人员来说，在不脱离本申请构思的前提下，还可以做出若干变形和改进，这些都属于本申请的保护范围。因此，本申请专利的保护范围应以所附权利要求为准。The above-mentioned embodiments only represent several embodiments of the present application, and the descriptions thereof are relatively specific and detailed, but should not be construed as a limitation on the scope of the patent of the present application. It should be pointed out that for those skilled in the art, without departing from the concept of the present application, several modifications and improvements can be made, which all belong to the protection scope of the present application. Therefore, the scope of protection of the patent of the present application shall be subject to the appended claims.

Claims

A method for cartoonizing a portrait, comprising:

Get the original character image to be processed;

Recognizing the face in the original character image, and aligning the face according to the key points of the face obtained by the identification, to obtain an aligned standard character image;

Perform portrait segmentation based on the standard person image, filter out background information, and obtain a target person image;

The target character image is used as the input of the portrait cartoon model, and the character cartoon image corresponding to the target character image output by the portrait cartoon model is obtained, and the portrait cartoon model is obtained by training a generative confrontation network model. of.

The method according to claim 1, wherein the generative adversarial network model comprises: a forward generator, a reverse generator, a first discriminator and a second discriminator; the output of the forward generator is the same as the The first discriminator is connected, and the output of the reverse generator is connected to the second discriminator;

The forward generator is used to convert the input character image into a character cartoon image, the reverse generator is used to convert the input character cartoon image into a character image, and the first discriminator is used to calculate that the input image belongs to a character cartoon. The probability of the image, the second discriminant network model is used to calculate the probability that the input image belongs to the person image.

The method according to claim 2, wherein the training step of the generative adversarial network model comprises:

Acquiring a training image set, the training image set includes: a training character image set and a training character cartoon image set;

Input the training character image into the forward generator to obtain the first output image, and input the first output image into the first discriminator to obtain the first probability of the corresponding output;

Input the training character cartoon image into the reverse generator to obtain the second output image, and input the second output image into the second discriminator to obtain the second probability of the corresponding output;

Inputting the first output image into the reverse generator to obtain a third output image; inputting the second output image into the forward generator to obtain a fourth output image;

Calculate the first difference value between the training character image and the third output image, and calculate the second difference value between the training character cartoon image and the fourth output image;

The first loss function value is obtained by calculating according to the first probability, the second probability, the first difference value and the second difference value;

Using the training character image as the input of the facial feature extraction model, obtain the first facial feature extracted by the facial feature extraction model;

Using the first output image as the input of the facial feature extraction model, obtain the second facial feature extracted by the facial feature extraction model;

Calculate the similarity between the first face feature and the second face feature, and calculate the second loss function value according to the similarity;

A total loss value is calculated according to the first loss function value and the second loss function value, and the total loss value is inversely correlated with the similarity;

Update the weight parameters in the forward generator, the reverse generator, the first discriminator and the second discriminator according to the total loss value, and repeat this cycle until the total loss value reaches Convergence condition, the forward generator obtained after training is used as the portrait cartoon model.

The method according to claim 2, wherein the forward generator comprises: an encoder and a decoder, and the encoder and the decoder are composed of Hourglass modules.

The method according to claim 1, wherein the identifying the face in the original character image, aligning the face according to the identified face key points, and obtaining the aligned standard character image, comprising: :

Determine the target coordinate position of the key point of the face in the preset space according to the preset proportion of the human body;

The face key points are mapped to the target coordinate position in the preset space to obtain a standard person image aligned in the preset space.

The method according to claim 1, wherein the performing portrait segmentation based on the standard person image, filtering out background information, and obtaining the target person image, comprising:

Using the standard person image as the input of a portrait segmentation model, the portrait segmentation model is used to segment a person image from the standard person image;

The target person image is generated by filtering out background information based on the person image output by the portrait segmentation model.

The method according to claim 6, wherein the portrait segmentation model is obtained by training with a convolutional neural network, and includes a plurality of convolutional layers, and the convolutional layers are used to perform feature extraction on the image;

Before using the convolution layer to perform feature extraction, the method further includes: performing edge enhancement on the image, so that the resolution of the image obtained after the convolution is consistent with the resolution of the input standard person image.

A robot includes a memory and a processor, the memory stores a computer program, and when the computer program is executed by the processor, the processor executes the following steps:

Get the original character image to be processed;

The robot according to claim 8, wherein the generative adversarial network model comprises: a forward generator, a reverse generator, a first discriminator and a second discriminator; the output of the forward generator is the same as the The first discriminator is connected, and the output of the reverse generator is connected to the second discriminator;

The robot according to claim 9, wherein the training step of the generative adversarial network model comprises:

The robot according to claim 9, wherein the forward generator comprises: an encoder and a decoder, and the encoder and the decoder are composed of Hourglass modules.

The robot according to claim 8, wherein the face recognition in the original character image is performed, and face alignment is performed according to the key points of the face obtained by the recognition, and an aligned standard character image is obtained, comprising: :

The robot according to claim 8, wherein the performing portrait segmentation based on the standard person image, filtering out background information, and obtaining the target person image, comprising:

Using the standard person image as an input of a portrait segmentation model, the portrait segmentation model is used to segment a person image from the standard person image;

The robot according to claim 13, wherein the portrait segmentation model is obtained by training with a convolutional neural network, and includes a plurality of convolutional layers, and the convolutional layers are used to perform feature extraction on images;

A computer-readable storage medium storing a computer program, when executed by a processor, the computer program causes the processor to perform the following steps:

Get the original character image to be processed;

The storage medium according to claim 15, wherein the generative adversarial network model comprises: a forward generator, a reverse generator, a first discriminator and a second discriminator; the output of the forward generator connected with the first discriminator, and the output of the reverse generator is connected with the second discriminator;

The storage medium according to claim 16, wherein the step of training the generative adversarial network model comprises:

The storage medium according to claim 16, wherein the forward generator comprises: an encoder and a decoder, and the encoder and the decoder are composed of Hourglass modules.

The storage medium according to claim 15, wherein the identifying the face in the original character image, aligning the face according to the identified face key points, and obtaining the aligned standard character image, include:

The storage medium according to claim 15, wherein the performing portrait segmentation based on the standard person image, filtering out background information, and obtaining the target person image, comprising:

The storage medium according to claim 15, wherein the portrait segmentation model is obtained by training a convolutional neural network, and comprises a plurality of convolutional layers, and the convolutional layers are used to perform feature extraction on images;