WO2021051562A1

WO2021051562A1 - Facial feature point positioning method and apparatus, computing device, and storage medium

Info

Publication number: WO2021051562A1
Application number: PCT/CN2019/117650
Authority: WO
Inventors: 罗天文
Original assignee: Ping An Technology Shenzhen Co Ltd
Current assignee: Ping An Technology Shenzhen Co Ltd
Priority date: 2019-09-17
Filing date: 2019-11-12
Publication date: 2021-03-25
Anticipated expiration: 2022-03-17
Also published as: CN110717405B; CN110717405A

Abstract

A facial feature point positioning method and apparatus, a computing device and a storage medium. Said method comprises: inputting a target facial image of a feature point to be positioned into a first convolutional neural network model, so as to obtain a first shallow feature image which contains a plurality of candidate feature points and is outputted by a predetermined convolutional layer of the first convolutional neural network model (210); using a bilinear interpolation algorithm to perform bilinear interpolation on the candidate feature points in the first shallow feature image, so as to obtain a second shallow feature image (220); and inputting the second shallow feature image into a second convolutional neural network model cascaded with the first convolutional neural network model, so as to obtain a facial feature image which corresponds to the target facial image of the feature point to be positioned and is outputted by the second convolutional neural network model (230), the weight of a predetermined convolutional layer of the second convolutional model and the weight of each convolutional layer before same being consistent with the weight of the convolutional layer in a corresponding layer number of the first convolutional model, respectively. By means said method, the positioning precision of the facial feature point is improved, and since convolution layers of cascaded models share weights, the calculation amount and the parameter amount can be reduced, and the training speed and convergence speed of the models can be improved.

Description

Facial feature point positioning method, device, computing equipment and storage medium

本申请基于并要求2019年9月17日申请的、申请号为CN 201910877995.2、名称为“人脸特征点定位方法、装置、介质及电子设备”的中国专利申请的优先权，其全部内容在此并入作为参考。This application is based on and claims the priority of the Chinese patent application filed on September 17, 2019 with the application number CN 201910877995. 2, titled "Face feature point positioning method, device, medium and electronic equipment", the entire content of which is here Incorporated as a reference.

Technical field

本申请涉及图像算法技术领域，特别是涉及一种人脸特征点定位方法、装置、计算设备和计算机非易失性可读存储介质。This application relates to the technical field of image algorithms, and in particular to a method, device, computing device, and computer non-volatile readable storage medium for locating facial feature points.

Background technique

卷积神经网络模型是图像处理领域重要而经典的模型之一。而人脸图像处理是图像处理领域的重要课题。The convolutional neural network model is one of the important and classic models in the field of image processing. And face image processing is an important subject in the field of image processing.

目前，人脸特征点定位模型很多都用到了卷积神经网络模型。通常情况下，人们为了对人脸图像中的特征点进行定位，通常是将人脸图像直接输入到卷积神经网络模型中，利用该卷积神经网络模型的卷积层进行人脸特征点的定位。本申请发明人意识到，采用的这种人脸特征点定位方法在实际进行特征点定位时的精度不够高。At present, many face feature point positioning models use convolutional neural network models. Normally, in order to locate the feature points in the face image, people usually input the face image directly into the convolutional neural network model, and use the convolutional layer of the convolutional neural network model to perform facial feature points. Positioning. The inventor of the present application realizes that the accuracy of the adopted facial feature point positioning method is not high enough in actual feature point positioning.

发明内容Summary of the invention

在图像算法技术领域，为了解决上述技术问题，本申请的目的在于提供一种人脸特征点定位方法、装置、计算设备和计算机非易失性可读存储介质。In the field of image algorithm technology, in order to solve the above technical problems, the purpose of this application is to provide a method, device, computing device, and computer non-volatile readable storage medium for locating facial feature points.

第一方面，提供了一种人脸特征点定位方法，包括：In the first aspect, a method for locating facial feature points is provided, including:

将待定位特征点的目标人脸图像输入至第一卷积神经网络模型，得到所述第一卷积神经网络模型的预定层卷积层输出的包含多个候选特征点的第一浅层特征图；Input the target face image of the feature points to be located into the first convolutional neural network model, and obtain the first shallow feature containing multiple candidate feature points output by the predetermined layer convolutional layer of the first convolutional neural network model Figure;

利用双线性插值算法对所述第一浅层特征图中的候选特征点进行双线性插值，得到第二浅层特征图；Using a bilinear interpolation algorithm to perform bilinear interpolation on candidate feature points in the first shallow feature map to obtain a second shallow feature map;

将所述第二浅层特征图输入至与所述第一卷积神经网络模型级联的第二卷积神经网络模型，得到所述第二卷积神经网络模型输出的与所述待定位特征点的目标人脸图像对应的人脸特征图，其中，所述第二卷积神经网络模型的预定层卷积层及所述预定层卷积层之前的所有卷积层中每一卷积层的权重分别与所述第一卷积神经网络模型对应层数的卷积层的权重一致，所述第二卷积神经网络模型的预定层卷积层在所有所述第二卷积神经网络模型的卷积层中的排序和所述第一卷积神经网络模型的预定层卷积层在所有所述第一卷积神经网络模型的卷积层中的排序一致。Input the second shallow feature map to a second convolutional neural network model cascaded with the first convolutional neural network model to obtain the output of the second convolutional neural network model and the feature to be located The face feature map corresponding to the target face image of the point, where each of the predetermined convolutional layers of the second convolutional neural network model and all convolutional layers before the predetermined convolutional layer The weights of the first convolutional neural network model correspond to the weights of the convolutional layers corresponding to the number of layers, and the predetermined convolutional layers of the second convolutional neural network model are in all the second convolutional neural network models. The ordering in the convolutional layer of is consistent with the ordering of the predetermined convolutional layer of the first convolutional neural network model in all the convolutional layers of the first convolutional neural network model.

第二方面，提供了一种人脸特征点定位装置，包括：In the second aspect, a device for locating facial feature points is provided, including:

第一获取模块，被配置为将待定位特征点的目标人脸图像输入至第一卷积神经网络模型，得到所述第一卷积神经网络模型的预定层卷积层输出的包含多个候选特征点的第一浅层特征图；The first acquisition module is configured to input the target face image of the feature point to be located into the first convolutional neural network model, and obtain the output of the predetermined layer convolutional layer of the first convolutional neural network model, which contains multiple candidates The first shallow feature map of feature points;

插值模块，被配置为利用双线性插值算法对所述第一浅层特征图中的候选特征点进行双线性插值，得到第二浅层特征图；An interpolation module, configured to perform bilinear interpolation on candidate feature points in the first shallow feature map by using a bilinear interpolation algorithm to obtain a second shallow feature map;

第二获取模块，被配置为将所述第二浅层特征图输入至与所述第一卷积神经网络模型级联的第二卷积神经网络模型，得到所述第二卷积神经网络模型输出的与所述待定位特征点的目标人脸图像对应的人脸特征图，其中，所述第二卷积神经网络模型的预定层卷积层及所述预定层卷积层之前的所有卷积层中每一卷积层的权重分别与所述第一卷积神经网络模型对应层数的卷积层的权重一致，所述第二卷积神经网络模型的预定层卷积层在所有所述第二卷积神经网络模型的卷积层中的排序和所述第一卷积神经网络模型的预定层卷积层在所有所述第一卷积神经网络模型的卷积层中的排序一致。The second acquisition module is configured to input the second shallow feature map to a second convolutional neural network model cascaded with the first convolutional neural network model to obtain the second convolutional neural network model The output face feature map corresponding to the target face image of the feature point to be located, wherein the predetermined convolutional layer of the second convolutional neural network model and all the volumes before the predetermined convolutional layer The weight of each convolutional layer in the buildup layer is consistent with the weight of the convolutional layer corresponding to the number of layers of the first convolutional neural network model. The predetermined convolutional layer of the second convolutional neural network model is in all the convolutional layers. The ordering in the convolutional layer of the second convolutional neural network model is consistent with the ordering of the predetermined convolutional layer of the first convolutional neural network model in all convolutional layers of the first convolutional neural network model .

第三方面，提供了一种计算设备，包括存储器和处理器，所述存储器用于存储所述处理器的人脸特征点定位的程序，所述处理器配置为经由执行所述人脸特征点定位的程序来执行以下处理：将待定位特征点的目标人脸图像输入至第一卷积神经网络模型，得到所述第一卷积神经网络模型的预定层卷积层输出的包含多个候选特征点的第一浅层特征图；利用双线性插值算法对所述第一浅层特征图中的候选特征点进行双线性插值，得到第二浅层特征图；将所述第二浅层特征图输入至与所述第一卷积神经网络模型级联的第二卷积神经网络模型，得到所述第二卷积神经网络模型输出的与所述待定位特征点的目标人脸图像对应的人脸特征图，其中，所述第二卷积神经网络模型的预定层卷积层及所述预定层卷积层之前的所有卷积层中每一卷积层的权重分别与所述第一卷积神经网络模型对应层数的卷积层的权重一致，所述第二卷积神经网络模型的预定层卷积层在所有所述第二卷积神经网络模型的卷积层中的排序和所述第一卷积神经网络模型的预定层卷积层在所有所述第一卷积神经网络模型的卷积层中的排序一致。In a third aspect, a computing device is provided, including a memory and a processor, the memory is configured to store a program for locating facial feature points of the processor, and the processor is configured to execute the facial feature points via The positioning program performs the following processing: input the target face image of the feature points to be located into the first convolutional neural network model, and obtain the predetermined layer convolutional layer output of the first convolutional neural network model containing multiple candidates A first shallow feature map of feature points; bilinear interpolation is performed on candidate feature points in the first shallow feature map using a bilinear interpolation algorithm to obtain a second shallow feature map; The layer feature map is input to the second convolutional neural network model cascaded with the first convolutional neural network model to obtain the target face image output by the second convolutional neural network model and the feature points to be located Corresponding face feature map, wherein the weight of each convolutional layer in the predetermined layer of the second convolutional neural network model and all convolutional layers before the predetermined layer of convolutional layer is the same as the weight of each convolutional layer. The weights of the convolutional layers corresponding to the number of layers of the first convolutional neural network model are the same, and the predetermined convolutional layers of the second convolutional neural network model are among all the convolutional layers of the second convolutional neural network model. The ordering is consistent with the ordering of the predetermined convolutional layers of the first convolutional neural network model in all convolutional layers of the first convolutional neural network model.

第四方面，提供了一种存储有计算机可读指令的计算机非易失性可读存储介质，其上存储有人脸特征点定位的程序，所述人脸特征点定位的程序被处理器执行时实现以下处理：将待定位特征点的目标人脸图像输入至第一卷积神经网络模型，得到所述第一卷积神经网络模型的预定层卷积层输出的包含多个候选特征点的第一浅层特征图；利用双线性插值算法对所述第一浅层特征图中的候选特征点进行双线性插值，得到第二浅层特征图；将所述第二浅层特征图输入至与所述第一卷积神经网络模型级联的第二卷积神经网络模型，得到所述第二卷积神经网络模型输出的与所述待定位特征点的目标人脸图像对应的人脸特征图，其中，所述第二卷积神经网络模型的预定层卷积层及所述预定层卷积层之前的所有卷积层中每一卷积层的权重分别与所述第一卷积神经网络模型对应层数的卷积层的权重一致，所述第二卷积神经网络模型的预定层卷积层在所有所述第二卷积神经网络模型的卷积层中的排序和所述第一卷积神经网络模型的预定层卷积层在所有所述第一卷积神经网络模型的卷积层中的排序一致。In a fourth aspect, a computer non-volatile readable storage medium storing computer readable instructions is provided, and a program for locating facial feature points is stored thereon. When the program for locating facial feature points is executed by a processor, there is provided The following processing is implemented: input the target face image of the feature points to be located into the first convolutional neural network model, and obtain the first convolutional layer output of the predetermined layer of the first convolutional neural network model that contains multiple candidate feature points A shallow feature map; using a bilinear interpolation algorithm to perform bilinear interpolation on candidate feature points in the first shallow feature map to obtain a second shallow feature map; input the second shallow feature map To the second convolutional neural network model cascaded with the first convolutional neural network model, to obtain the face corresponding to the target face image of the feature point to be located, which is output by the second convolutional neural network model A feature map, wherein the weight of each convolutional layer in the predetermined convolutional layer of the second convolutional neural network model and all convolutional layers before the predetermined convolutional layer is the same as that of the first convolutional layer. The weights of the convolutional layers corresponding to the number of layers of the neural network model are the same, and the order of the predetermined convolutional layers of the second convolutional neural network model in all convolutional layers of the second convolutional neural network model is the same as that of the The order of the predetermined convolutional layers of the first convolutional neural network model in all the convolutional layers of the first convolutional neural network model is consistent.

上述人脸特征点定位方法、装置、计算设备和计算机非易失性可读存储介质，首先通过两个卷积神经网络模型的级联来进行人脸特征点的定位，利用第一个卷积神经网络模型去除掉背景等干扰因素，然后将第一卷积神经网络模型的输出输入至第二卷积神经网络模型，可以提高最终对人脸特征点的定位精度，在此基础上，在将第一卷积神经网络模型的输出输入至第二卷积神经网络模型之前，通过利用双线性插值算法对第一卷积神经网络模型输出的特征图进行双线性插值，使得输入至第二卷积神经网络模型的特征图能够精确地反映原图中的特征点的位置，使得对人脸特征点的定位精度更为准确，另外，由于两个卷积神经网络模型的卷积层实现了权重共享，在模型训练时，第二卷积神经网络模型无需根据输入重新从原始输入的图片来进行学习，减少了参数计算量，加快了整体模型的训练速度和收敛速度。The above-mentioned facial feature point positioning method, device, computing device and computer non-volatile readable storage medium, firstly locate the facial feature point through the cascade of two convolutional neural network models, and use the first convolution The neural network model removes the background and other interference factors, and then inputs the output of the first convolutional neural network model to the second convolutional neural network model, which can improve the final positioning accuracy of facial feature points. On this basis, Before the output of the first convolutional neural network model is input to the second convolutional neural network model, bilinear interpolation is performed on the feature map output by the first convolutional neural network model by using a bilinear interpolation algorithm, so that the input is input to the second convolutional neural network model. The feature map of the convolutional neural network model can accurately reflect the location of the feature points in the original image, making the positioning accuracy of the facial feature points more accurate. In addition, because the convolutional layers of the two convolutional neural network models realize Weight sharing. During model training, the second convolutional neural network model does not need to relearn from the original input picture according to the input, which reduces the amount of parameter calculation and accelerates the training speed and convergence speed of the overall model.

应当理解的是，以上的一般描述和后文的细节描述仅是示例性的，并不能限制本申请。It should be understood that the above general description and the following detailed description are only exemplary and cannot limit the application.

Description of the drawings

图1是根据一示例性实施例示出的一种人脸特征点定位方法的应用场景示意图；Fig. 1 is a schematic diagram showing an application scenario of a method for locating facial feature points according to an exemplary embodiment;

图2是根据一示例性实施例示出的一种人脸特征点定位方法的流程图；Fig. 2 is a flow chart showing a method for locating facial feature points according to an exemplary embodiment;

图3是根据一示例性实施例示出的一种用于人脸特征点定位方法的级联卷积神经网络模型的结构示意图；Fig. 3 is a schematic structural diagram of a cascaded convolutional neural network model used in a method for locating facial feature points according to an exemplary embodiment;

图4是根据图2对应实施例示出的一实施例的步骤220的细节流程图；FIG. 4 is a detailed flowchart of step 220 of an embodiment shown according to the embodiment corresponding to FIG. 2;

图5是根据图4对应实施例示出的一实施例的步骤223的细节流程图；FIG. 5 is a detailed flowchart of step 223 of an embodiment shown according to the embodiment corresponding to FIG. 4;

图6是根据一示例性实施例示出的一种人脸特征点定位装置的框图；Fig. 6 is a block diagram showing a device for locating facial feature points according to an exemplary embodiment;

图7是根据一示例性实施例示出的一种实现上述人脸特征点定位方法的计算设备的示例框图；Fig. 7 is an exemplary block diagram of a computing device that implements the aforementioned method for locating facial feature points according to an exemplary embodiment;

图8是根据一示例性实施例示出的一种实现上述人脸特征点定位方法的计算机非易失性可读存储介质。Fig. 8 is a computer non-volatile readable storage medium for realizing the above-mentioned method for locating facial feature points according to an exemplary embodiment.

detailed description

这里将详细地对示例性实施例进行说明，其示例表示在附图中。下面的描述涉及附图时，除非另有表示，不同附图中的相同数字表示相同或相似的要素。以下示例性实施例中所描述的实施方式并不代表与本申请相一致的所有实施方式。相反，它们仅是与如所附权利要求书中所详述的、本申请的一些方面相一致的装置和方法的例子。The exemplary embodiments will be described in detail here, and examples thereof are shown in the accompanying drawings. When the following description refers to the drawings, unless otherwise indicated, the same numbers in different drawings indicate the same or similar elements. The implementation manners described in the following exemplary embodiments do not represent all implementation manners consistent with the present application. On the contrary, they are merely examples of devices and methods consistent with some aspects of the application as detailed in the appended claims.

此外，附图仅为本申请的示意性图解，并非一定是按比例绘制。图中相同的附图标记表示相同或类似的部分，因而将省略对它们的重复描述。附图中所示的一些方框图是功能实体，不一定必须与物理或逻辑上独立的实体相对应。In addition, the drawings are only schematic illustrations of the application, and are not necessarily drawn to scale. The same reference numerals in the figures denote the same or similar parts, and thus their repeated description will be omitted. Some of the block diagrams shown in the drawings are functional entities and do not necessarily correspond to physically or logically independent entities.

本申请首先提供了一种人脸特征点定位方法。人脸是指电子图像中的人脸，此处的电子图像既可以是照片或者图片，也可以是视频中的一帧，一般情况下，一张包含人脸的电子图像包括多个像素点。特征点是指能够代表人脸的特征信息或者特点的像素点或包含像素点的区域，人脸特征点由于对人脸最具有代表性，所以人脸特征点亦可以称之为人脸关键点。人脸特征点定位是指在一张包含人脸的电子图像中确定人脸的特征点或关键点的过程。通过使用本申请提供的人脸特征点定位方法能够高效准确地对包含人脸的图像中人脸的特征点进行定位，并且整个模型的训练过程更加高效。This application first provides a method for locating facial feature points. A human face refers to a human face in an electronic image. The electronic image here can be either a photo or a picture, or a frame in a video. Generally, an electronic image containing a human face includes multiple pixels. Feature points refer to pixels or areas containing pixel points that can represent the feature information or characteristics of a human face. Because facial feature points are the most representative of human faces, facial feature points can also be called face key points. Facial feature point positioning refers to the process of determining the feature points or key points of a human face in an electronic image containing a human face. By using the facial feature point positioning method provided by the present application, the feature points of the face in the image containing the face can be efficiently and accurately located, and the training process of the entire model is more efficient.

本申请的实施终端可以是任何具有运算和处理功能的设备，该设备可以与外部设备相连，用于接收或者发送数据，具体可以是便携移动设备，例如智能手机、平板电脑、笔记本电脑、PDA(Personal Digital Assistant)等，也可以是固定式设备，例如，计算机设备、现场终端、台式电脑、服务器、工作站等，还可以是多个设备的集合，比如云计算的物理基础设施。The implementation terminal of this application can be any device with computing and processing functions. The device can be connected to an external device to receive or send data. Specifically, it can be a portable mobile device, such as a smart phone, a tablet computer, a notebook computer, or a PDA ( Personal Digital Assistant), etc., can also be fixed devices, such as computer equipment, field terminals, desktop computers, servers, workstations, etc., or a collection of multiple devices, such as the physical infrastructure of cloud computing.

可选地，本申请的实施终端可以为服务器、云计算的物理基础设施或者带有高性能显卡的计算机设备。Optionally, the implementation terminal of the present application may be a server, a physical infrastructure of cloud computing, or a computer device with a high-performance graphics card.

图1是根据一示例性实施例示出的一种人脸特征点定位方法的应用场景示意图。如图1所示，包括服务器110、数据库120以及用户终端130，其中，数据库120和用户终端130分别通过通信链路与服务器110相连，在本实施例中，服务器110为本申请的实施终端，用户终端130也可以是任何具有运算和处理功能的设备，其可以与本申请的实施终端为相同类型的终端，也可以是不同类型的终端，可以与本申请的实施终端为同一终端，也可以为不同终端。对人脸特征点定位需要使用本申请所使用的级联的卷积神经网络模型，未经训练的级联的卷积神经网络模型首先会被嵌入到服务器110中，数据库120中事先存储了大量被准确标注了人脸特征点的图像样本，将这些图像样本输入至嵌于服务器110的级联的卷积神经网络模型中，可以对级联的卷积神经网络模型进行训练，当级联的卷积神经网络模型被训练好时，即可用于进行人脸特征点的定位。当用户需要对一张电子图像中的人脸进行人脸特征点定位时，可以通过使用用户终端130将要进行人脸特征点定位的电子图像发送至服务器110，服务器110利用已经训练好的级联的卷积神经网络模型，可以输出包含定位出的人脸特征点的电子图像并返回至用户终端130。Fig. 1 is a schematic diagram showing an application scenario of a method for locating facial feature points according to an exemplary embodiment. As shown in FIG. 1, it includes a server 110, a database 120, and a user terminal 130. The database 120 and the user terminal 130 are respectively connected to the server 110 through a communication link. In this embodiment, the server 110 is the implementation terminal of the application. The user terminal 130 can also be any device with computing and processing functions. It can be the same type of terminal as the implementation terminal of the application, or a different type of terminal, and can be the same terminal as the implementation terminal of the application, or it can be the same terminal as the implementation terminal of the application. For different terminals. It is necessary to use the cascaded convolutional neural network model used in this application to locate the facial feature points. The untrained cascaded convolutional neural network model will first be embedded in the server 110, and a large number of them are stored in the database 120 in advance. The image samples accurately labeled with facial feature points are input into the cascaded convolutional neural network model embedded in the server 110, and the cascaded convolutional neural network model can be trained. When the convolutional neural network model is trained, it can be used to locate the facial feature points. When the user needs to locate the facial feature points of a face in an electronic image, the user terminal 130 can use the user terminal 130 to send the electronic image for facial feature point positioning to the server 110, and the server 110 uses the trained cascade The convolutional neural network model can output an electronic image containing the located facial feature points and return it to the user terminal 130.

值得一提的是，虽然在本实施例中，本申请的实施终端为服务器，并且级联的卷积神经网络模型都被固定于本申请的实施终端——服务器中，但在其他实施例或者具体应用中，可以根据需要将各种终端选为本申请的实施终端，并且可以将级联的卷积神经网络模型固定于包括本申请的实施终端在内的任意两个相同或者不同终端上，本申请对此不作任何限定，本申请的保护范围也不应因此而受到任何限制。It is worth mentioning that although in this embodiment, the implementation terminal of this application is a server, and the cascaded convolutional neural network model is fixed in the implementation terminal of this application-the server, in other embodiments or In specific applications, various terminals can be selected as the implementation terminals of this application as required, and the cascaded convolutional neural network model can be fixed on any two identical or different terminals including the implementation terminals of this application. This application does not make any limitation on this, and the protection scope of this application should not be restricted in any way.

图2是根据一示例性实施例示出的一种人脸特征点定位方法的流程图。如图2所示，可以包括以下步骤：Fig. 2 is a flow chart showing a method for locating facial feature points according to an exemplary embodiment. As shown in Figure 2, the following steps can be included:

步骤210，将待定位特征点的目标人脸图像输入至第一卷积神经网络模型，得到所述第一卷积神经网络模型的预定层卷积层输出的包含多个候选特征点的第一浅层特征图。Step 210: Input the target face image of the feature points to be located into the first convolutional neural network model, and obtain the first convolutional layer output of the predetermined layer of the first convolutional neural network model that contains multiple candidate feature points. Shallow feature map.

目标人脸图像即需要进行特征点定位的图像，其可以是各种形式或者格式的电子图像，比如可以是以任意方式生成的图片、照片或者视频文件中的一帧，可以是.jpg、.jpeg、.png、.bmp等各种格式。The target face image is the image that requires feature point positioning. It can be an electronic image in various forms or formats. For example, it can be a picture, photo, or a frame in a video file generated in any way, which can be .jpg,. Various formats such as jpeg, .png, .bmp, etc.

第一卷积神经网络模型包括多层卷积层，各层卷积层是用堆叠的方式建构的。预定层卷积层可以是第一卷积神经网络模型的多层卷积层中除第一层卷积层之外的任意一层卷积层。卷积层用于提取目标人脸图像中的特征，一个卷积神经网络模型包含的卷积层越多，就能提取到更深层次的特征图，每一卷积层都可以包含卷积核，卷积核的大小可以是任意的，比如可以为3×3、5×5等。The first convolutional neural network model includes multiple convolutional layers, and each convolutional layer is constructed in a stacked manner. The predetermined convolutional layer may be any convolutional layer except the first convolutional layer in the multi-layer convolutional layer of the first convolutional neural network model. The convolutional layer is used to extract the features in the target face image. The more convolutional layers a convolutional neural network model contains, the deeper the feature map can be extracted. Each convolutional layer can contain a convolution kernel. The size of the convolution kernel can be arbitrary, for example, it can be 3×3, 5×5, and so on.

在一个实施例中，所述第一卷积神经网络模型除了包含卷积层外，还包含池化层。池化层可用于对特征图进行压缩，池化层处理特征图的方式可以是平均池化或者最大池化，平均池化为用一个区域中各像素点的像素值的均值代表整个区域，而最大池化为用一个区域中各像素点的像素值的最大值代表整个区域。In one embodiment, the first convolutional neural network model includes a pooling layer in addition to a convolutional layer. The pooling layer can be used to compress the feature map. The method of processing the feature map by the pooling layer can be average pooling or maximum pooling. Average pooling is to use the average of the pixel values of each pixel in an area to represent the entire area, and Maximum pooling is to use the maximum value of each pixel in an area to represent the entire area.

在一个实施例中，所述第一卷积神经网络模型的每一卷积层之前，包含一个池化层。In one embodiment, before each convolutional layer of the first convolutional neural network model, a pooling layer is included.

在一个实施例中，所述第一卷积神经网络模型的最后一层卷积层之前的每一卷积层之后，包含一个池化层。In one embodiment, after each convolutional layer before the last convolutional layer of the first convolutional neural network model, a pooling layer is included.

候选特征点为所述第一卷积神经网络模型的输出的第一浅层特征图中点，是对目标人脸图像粗略提取得到的特征点。The candidate feature points are the points in the first shallow feature map output by the first convolutional neural network model, and are feature points obtained by roughly extracting the target face image.

步骤220，利用双线性插值算法对所述第一浅层特征图中的候选特征点进行双线性插值，得到第二浅层特征图。Step 220: Perform bilinear interpolation on candidate feature points in the first shallow feature map by using a bilinear interpolation algorithm to obtain a second shallow feature map.

双线性插值算法可以通过在横坐标和纵坐标两个方向上分别进行一次线性插值，实现对图像的尺寸的变换，比如可以将第一尺寸的第一浅层特征图变换为第二尺寸的第二浅层特征图。双线性插值算法可以实现在将第一浅层特征图转换为第二浅层特征图的同时，使插值得到的第二浅层特征图能够准确反映第一浅层特征图中的候选特征点的位置和像素值的对应关系。The bilinear interpolation algorithm can realize the transformation of the size of the image by performing linear interpolation in the two directions of the abscissa and the ordinate respectively. For example, the first shallow feature map of the first size can be transformed into the second size. The second shallow feature map. The bilinear interpolation algorithm can convert the first shallow feature map to the second shallow feature map, and at the same time, the second shallow feature map obtained by interpolation can accurately reflect the candidate feature points in the first shallow feature map The corresponding relationship between the position and the pixel value.

在一个实施例中，步骤220的具体实现步骤可以如图4所示。图4是根据图2对应实施例示出的一实施例的步骤220的细节流程图。如图4所示，步骤220具体可以包括以下子步骤：In an embodiment, the specific implementation steps of step 220 may be as shown in FIG. 4. FIG. 4 is a detailed flowchart of step 220 of an embodiment shown according to the embodiment corresponding to FIG. 2. As shown in FIG. 4, step 220 may specifically include the following sub-steps:

步骤221，针对所述第一浅层特征图中的每一候选特征点，获取要以该候选特征点为坐标中心确定的正方形区域的目标边长。Step 221: For each candidate feature point in the first shallow feature map, obtain the target side length of the square area to be determined with the candidate feature point as the coordinate center.

在一个实施例中，针对所述第一浅层特征图中的每一候选特征点获取的所述目标边长都是相同的。这样做的好处在于，使得插值得到的第二浅层特征图中的每一点都是按照相同的双线性插值方式得到的，保证了插值的平滑性。In an embodiment, the target side length obtained for each candidate feature point in the first shallow feature map is the same. The advantage of this is that each point in the second shallow feature map obtained by interpolation is obtained according to the same bilinear interpolation method, which ensures the smoothness of interpolation.

在一个实施例中，针对所述第一浅层特征图中的每一候选特征点获取的目标边长是按照预定顺序获取的。In an embodiment, the target side length obtained for each candidate feature point in the first shallow feature map is obtained in a predetermined order.

比如，事先设置一个目标边长序列表，存储了针对所述第一浅层特征图中的各候选特征点要获取的各目标边长，各目标边长在所述序列表中按照预定顺序排序，先对所述第一浅层特征图中的每一候选特征点按照从左到右从上到下的顺序进行排序；然后按照所述排序为各候选特征点获取目标边长。具体地，从所述第一浅层特征图中的第一候选特征点开始，针对每一候选特征点，按照各候选特征点的排序，在没被标记为已获取的目标边长中按照预定顺序获取一个目标边长，将获取的目标边长作为针对对应的候选特征点的目标边长，并将该目标边长标记为已选取，直至针对所述第一浅层特征图中的每一候选特征点都获取到了目标边长，其中，若在针对候选特征点获取目标边长时，所有目标边长都被标记为已选取，则将取消所有目标边长的标记。For example, a target side length sequence table is set in advance, and each target side length to be acquired for each candidate feature point in the first shallow feature map is stored, and each target side length is sorted in a predetermined order in the sequence table , First sort each candidate feature point in the first shallow feature map in the order from left to right and top to bottom; and then obtain the target side length for each candidate feature point according to the sort. Specifically, starting from the first candidate feature point in the first shallow feature map, for each candidate feature point, according to the order of each candidate feature point, according to the predetermined target side lengths that are not marked as acquired. Obtain a target side length in sequence, use the obtained target side length as the target side length for the corresponding candidate feature point, and mark the target side length as selected, until each target side length is selected in the first shallow feature map. The candidate feature points have all acquired the target side length, and if all the target side lengths are marked as selected when the target side length is acquired for the candidate feature points, the marking of all target side lengths will be cancelled.

在一个实施例中，所述针对所述第一浅层特征图中的每一候选特征点，获取要以该候选特征点为坐标中心确定的正方形区域的目标边长，包括：In an embodiment, for each candidate feature point in the first shallow feature map, obtaining the target side length of a square area to be determined with the candidate feature point as a coordinate center includes:

以所述第一浅层特征图中的每一候选特征点为坐标中心，在所述第一浅层特征图中获取半径为预设的第一边长的圆形区域；Taking each candidate feature point in the first shallow feature map as a coordinate center, acquiring a circular area with a preset first side length in the first shallow feature map;

针对所述第一浅层特征图中的每一候选特征点，确定该候选特征点对应的圆形区域内各像素点的方差；For each candidate feature point in the first shallow feature map, determine the variance of each pixel point in the circular area corresponding to the candidate feature point;

在所述方差大于或等于预设的方差阈值的情况下，将所述预设的第一边长作为要以该候选特征点为坐标中心确定的正方形区域的目标边长；In the case that the variance is greater than or equal to the preset variance threshold, use the preset first side length as the target side length of the square area to be determined with the candidate feature point as the coordinate center;

在所述方差小于预设的方差阈值的情况下，将预设的第二边长作为要以该候选特征点为坐标中心确定的正方形区域的目标边长，其中，所述预设的第二边长小于预设的第一边长。In the case that the variance is less than the preset variance threshold, the preset second side length is used as the target side length of the square area to be determined with the candidate feature point as the coordinate center, wherein the preset second side length The side length is smaller than the preset first side length.

一般情况下，像素点距离越近像素值差距越小，因此如果以一个候选特征点为坐标中心的圆形区域内的像素点的像素值方差越大，说明该圆形区域内的像素点的像素值变化越剧烈，所以本实施例的好处在于，通过在候选特征点周围的像素点的像素值的方差足够大的情况下，选择较小的边长作为要以该候选特征点为坐标中心确定的正方形区域的目标边长，从而可以使插值得到的像素值与对应的特征点的像素值更近似，在一定程度上使得获得的第二浅层特征图能够更加准确反映目标人脸图像的特征。In general, the closer the pixel point is, the smaller the pixel value difference is. Therefore, if the pixel value variance of the pixel point in the circular area with a candidate feature point as the coordinate center is larger, it means that the pixel value in the circular area is The pixel value changes more drastically, so the advantage of this embodiment is that when the variance of the pixel values of the pixel points around the candidate feature point is large enough, the smaller side length is selected as the coordinate center of the candidate feature point. Determine the target side length of the square area, so that the pixel value obtained by interpolation can be more similar to the pixel value of the corresponding feature point, and to a certain extent, the obtained second shallow feature map can more accurately reflect the target face image. feature.

步骤222，针对所述第一浅层特征图中的每一候选特征点，以该候选特征点为坐标中心在所述第一浅层特征图中确定正方形区域。Step 222: For each candidate feature point in the first shallow feature map, a square area is determined in the first shallow feature map with the candidate feature point as a coordinate center.

其中，所述正方形区域的边长为针对该候选特征点获取的目标边长。Wherein, the side length of the square area is the target side length obtained for the candidate feature point.

在一个实施例中，若所述第一浅层特征图中的候选特征点的坐标为(x，y)，而与该候选特征点对应的目标边长为d，则以该候选特征点为中心的边长为与该候选特征点对应的目标边长的正方形区域为由如下函数组的图像包围成的正方形：In an embodiment, if the coordinates of the candidate feature point in the first shallow feature map are (x, y), and the target side length corresponding to the candidate feature point is d, then the candidate feature point is The square area where the side length of the center is the target side length corresponding to the candidate feature point is a square surrounded by the images of the following function group:

其中，f(x) _x表示该函数的因变量为横坐标，f(y) _y表示该函数的图像的因变量为纵坐标。 Among them, f(x) _x indicates that the dependent variable of the function is the abscissa, and f(y) _y indicates that the dependent variable of the image of the function is the ordinate.

步骤223，获取针对每一候选特征点确定的正方形区域的四个顶点的坐标值以及每一顶点处的像素值。Step 223: Obtain the coordinate values of the four vertices of the square region determined for each candidate feature point and the pixel value at each vertex.

在一个实施例中，所述获取针对每一候选特征点确定的正方形区域的四个顶点的坐标值以及每一顶点处的像素值，包括：In an embodiment, the obtaining the coordinate values of the four vertices of the square region determined for each candidate feature point and the pixel value at each vertex includes:

获取针对每一候选特征点确定的正方形区域的四个顶点的坐标值；Acquiring the coordinate values of the four vertices of the square area determined for each candidate feature point;

根据获取的各顶点的坐标值，确定各顶点是否位于所述第一浅层特征图内；Determine whether each vertex is located in the first shallow feature map according to the acquired coordinate value of each vertex;

当针对每一候选特征点确定的正方形区域的四个顶点中存在至少一个顶点不位于所述第一浅层特征图内时，将针对该候选特征点确定的正方形区域的四个顶点中不位于所述第一浅层特征图内的顶点的坐标值，替换为针对每一候选特征点确定的正方形区域的四个顶点中任意一个位于所述第一浅层特征图内的顶点的坐标值；When at least one of the four vertices of the square area determined for each candidate feature point is not located in the first shallow feature map, the four vertices of the square area determined for the candidate feature point are not located The coordinate value of the vertex in the first shallow feature map is replaced with the coordinate value of any one of the four vertices of the square region determined for each candidate feature point located in the first shallow feature map;

针对每一候选特征点确定的正方形区域的四个顶点的坐标值，获取各顶点的像素值。For the coordinate values of the four vertices of the square region determined by each candidate feature point, the pixel value of each vertex is obtained.

步骤224，针对每一候选特征点，基于该候选特征点对应的顶点的坐标和每一顶点处的像素值，利用如下公式获取构成所述第二浅层特征图的各个像素点的像素值，以得到第二浅层特征图：Step 224: For each candidate feature point, based on the coordinates of the vertex corresponding to the candidate feature point and the pixel value at each vertex, use the following formula to obtain the pixel value of each pixel that constitutes the second shallow feature map, To get the second shallow feature map:

其中，(x,y)是针对该候选特征点获取的第二浅层特征图中与该特征点对应的像素点的坐标值，(x ₁,y ₁)，(x ₂,y ₁),(x ₁,y ₂)和(x ₂,y ₂)分别是该候选特征点对应的正方形区域的四个顶点的坐标值，f(x ₁,y ₁)，f(x ₂,y ₁)，f(x ₁,y ₂)和f(x ₂,y ₂)是在所述第一浅层特征图中的与该候选特征点对应的正方形区域的四个顶点处的像素值。 Among them, (x, y) is the coordinate value of the pixel point corresponding to the feature point in the second shallow feature map obtained for the candidate feature point, (x ₁ , y ₁ ), (x ₂ , y ₁ ), (x ₁ ,y ₂ ) and (x ₂ ,y ₂ ) are the coordinate values of the four vertices of the square area corresponding to the candidate feature point, f(x ₁ ,y ₁ ), f(x ₂ ,y ₁ ) , F(x ₁ , y ₂ ) and f(x ₂ , y ₂ ) are the pixel values at the four vertices of the square region corresponding to the candidate feature point in the first shallow feature map.

在一个实施例中，每一点的像素值为灰度值。In one embodiment, the pixel value of each point is a gray value.

在一个实施例中，每一点的像素值为RGB值。In one embodiment, the pixel value of each point is an RGB value.

步骤230，将所述第二浅层特征图输入至与所述第一卷积神经网络模型级联的第二卷积神经网络模型，得到所述第二卷积神经网络模型输出的与所述待定位特征点的目标人脸图像对应的人脸特征图。Step 230: Input the second shallow feature map to a second convolutional neural network model cascaded with the first convolutional neural network model to obtain the output of the second convolutional neural network model and the The face feature map corresponding to the target face image of the feature point to be located.

其中，所述第二卷积神经网络模型的预定层卷积层及所述预定层卷积层之前的所有卷积层中每一卷积层的权重分别与所述第一卷积神经网络模型对应层数的卷积层的权重一致，所述第二卷积神经网络模型的预定层卷积层在所有所述第二卷积神经网络模型的卷积层中的排序和所述第一卷积神经网络模型的预定层卷积层在所有所述第一卷积神经网络模型的卷积层中的排序一致。Wherein, the weight of each convolutional layer in the predetermined convolutional layer of the second convolutional neural network model and all convolutional layers before the predetermined convolutional layer is the same as that of the first convolutional neural network model. The weights of the convolutional layers corresponding to the number of layers are the same, and the order of the predetermined convolutional layers of the second convolutional neural network model in all convolutional layers of the second convolutional neural network model is the same as that of the first convolutional layer. The order of the predetermined convolutional layers of the convolutional neural network model in all the convolutional layers of the first convolutional neural network model is consistent.

第二卷积神经网络模型包含的卷积层的层数大于或等于所述第一卷积神经网络模型的所述预定层卷积层之前的所有卷积层的层数加1，即第二卷积神经网络模型包含的卷积层的层数大于或等于所述第一卷积神经网络模型的预定层卷积层和所述预定层卷积层之前的所有卷积层的层数之和，第二卷积神经网络模型包含的各层卷积层堆叠排布。The number of convolutional layers included in the second convolutional neural network model is greater than or equal to the number of all convolutional layers before the predetermined convolutional layer of the first convolutional neural network model plus 1, that is, the second The number of convolutional layers included in the convolutional neural network model is greater than or equal to the sum of the number of convolutional layers of the predetermined convolutional layer of the first convolutional neural network model and all convolutional layers before the predetermined convolutional layer , The second convolutional neural network model contains the convolutional layers of each layer in a stacked arrangement.

由于第二卷积神经网络模型包含的卷积层的层数大于或等于预定层卷积层和所述预定层卷积层之前的所有卷积层的层数之和，并且所述第二卷积神经网络模型的预定层卷积层在所有所述第二卷积神经网络模型的卷积层中的排序和所述第一卷积神经网络模型的预定层卷积层在所有所述第一卷积神经网络模型的卷积层中的排序一致，所以对于所述第二卷积神经网络模型的预定层卷积层及所述预定层卷积层之前的每一卷积层，在所述第一卷积神经网络模型中都有与该卷积层的层数一致的卷积层，从而可以使两个卷积神经网络模型中对应层数的卷积层的权重一致。Since the number of convolutional layers contained in the second convolutional neural network model is greater than or equal to the sum of the number of convolutional layers of the predetermined layer and all convolutional layers before the predetermined layer of convolutional layer, and the second convolutional layer The order of the predetermined convolutional layer of the convolutional neural network model in all the convolutional layers of the second convolutional neural network model and the predetermined convolutional layer of the first convolutional neural network model in all the first convolutional layers The ordering in the convolutional layers of the convolutional neural network model is consistent, so for the predetermined convolutional layer of the second convolutional neural network model and each convolutional layer before the predetermined convolutional layer, in the The first convolutional neural network model has convolutional layers with the same number of layers as the convolutional layer, so that the weights of the convolutional layers corresponding to the number of layers in the two convolutional neural network models can be consistent.

在一个实施例中，第二卷积神经网络除了包含卷积层，还包括至少一层池化层以及至少一层全连接层。In one embodiment, in addition to the convolutional layer, the second convolutional neural network also includes at least one pooling layer and at least one fully connected layer.

第二卷积神经网络模型与所述第一卷积神经网络模型级联是指，将所述第一卷积神经网络模型的输出直接作为所述第二卷积神经网络模型的输入。The cascading of the second convolutional neural network model and the first convolutional neural network model refers to directly using the output of the first convolutional neural network model as the input of the second convolutional neural network model.

由于第二卷积神经网络模型同所述第一卷积神经网络模型一样，都包含了多层卷积层，所以第二卷积神经网络模型可以对所述第一卷积神经网络模型输出的特征图进行进一步的提取，得到更精细的人脸特征图。Since the second convolutional neural network model is the same as the first convolutional neural network model, it contains multiple convolutional layers, so the second convolutional neural network model can output the output of the first convolutional neural network model. The feature map is further extracted to obtain a more refined facial feature map.

卷积层的权重是指卷积层对特征图进行运算处理时所使用的参数。The weight of the convolutional layer refers to the parameters used when the convolutional layer performs arithmetic processing on the feature map.

在一个实施例中，在建立级联卷积神经网络模型时，利用反向传播算法对模型进行训练，训练的过程即确定包含卷积层的权重在内的模型中的参数的过程。In one embodiment, when the cascaded convolutional neural network model is established, the back-propagation algorithm is used to train the model, and the training process is the process of determining the parameters of the model including the weight of the convolutional layer.

在一个实施例中，卷积层的权重为提取特征图的特征时所使用的卷积核上的权重矩阵。In one embodiment, the weight of the convolution layer is the weight matrix on the convolution kernel used when extracting the features of the feature map.

通过共享第一卷积神经网络模型和第二卷积神经网络模型的卷积层的权重，可以实现对用于人脸特征点定位方法的级联卷积神经网络模型的高效训练：一方面，通过共享级联的两个卷积神经网络模型的卷积层的权重，减少参数量和计算量；另一方面，当要使用第二卷积神经网络模型对第二浅层特征图进行进一步的特征提取时，可以直接使用第一卷积神经网络模型已经学习过的抽象语义特征，第二卷积模型不需要重新从原始输入图片的层次重新学习语义特征，加快了整体模型的训练速度和收敛速度。By sharing the weights of the convolutional layers of the first convolutional neural network model and the second convolutional neural network model, efficient training of the cascaded convolutional neural network model used in the facial feature point positioning method can be realized: on the one hand, By sharing the weights of the convolutional layers of the two cascaded convolutional neural network models, the amount of parameters and calculations can be reduced; on the other hand, when the second convolutional neural network model is to be used to further the second shallow feature map When extracting features, you can directly use the abstract semantic features that the first convolutional neural network model has learned, and the second convolutional model does not need to relearn semantic features from the original input image level, which speeds up the training speed and convergence of the overall model speed.

图3是根据一示例性实施例示出的一种用于人脸特征点定位方法的级联卷积神经网络模型的结构示意图。如图3所示，包括输入的目标人脸图像310、第一卷积神经网络模型320和与第一卷积神经网络模型级联的第二卷积神经网络模型340，其中第一卷积神经网络模型320包括多层卷积层330，而第二卷积神经网络模型340包括多层卷积层350和多层全连接层360，第一卷积神经网络模型320和第二卷积神经网络模型340共同组成了级联的卷积神经网络模型。利用级联的卷积神经网络模型当进行人脸特征点定位时，具体流程是这样的：当目标人脸图像310被输入至第一卷积神经网络模型320后，第一卷积神经网络模型320中的多层卷积层330可以实现对目标人脸图像310中人脸特征点的初步粗略提取，输出第一浅层特征图；本申请的实施终端可以对第一浅层特征图进行双线性插值，得到第二浅层特征图，然后将第二浅层特征图喂给第二卷积神经网络模型340；第二浅层特征图被输入至第二卷积神经网络模型340后，第二卷积神经网络模型340利用其包括的多层卷积层350和多层全连接层360，能够对第二浅层特征图进行更进一步的精细特征提取，最终得到与目标人脸图像对应的人脸特征图。Fig. 3 is a schematic structural diagram of a cascaded convolutional neural network model used in a method for locating facial feature points according to an exemplary embodiment. As shown in FIG. 3, it includes an input target face image 310, a first convolutional neural network model 320, and a second convolutional neural network model 340 cascaded with the first convolutional neural network model, where the first convolutional neural network model The network model 320 includes a multi-layer convolutional layer 330, and the second convolutional neural network model 340 includes a multi-layer convolutional layer 350 and a multi-layer fully connected layer 360, the first convolutional neural network model 320 and the second convolutional neural network The models 340 together form a cascaded convolutional neural network model. When using the cascaded convolutional neural network model to locate the facial feature points, the specific process is as follows: after the target face image 310 is input to the first convolutional neural network model 320, the first convolutional neural network model The multi-layer convolutional layer 330 in 320 can realize the preliminary rough extraction of facial feature points in the target face image 310, and output the first shallow feature map; the implementation terminal of this application can double the first shallow feature map. Linear interpolation is used to obtain the second shallow feature map, and then feed the second shallow feature map to the second convolutional neural network model 340; after the second shallow feature map is input to the second convolutional neural network model 340, The second convolutional neural network model 340 uses the multi-layer convolutional layer 350 and the multi-layer fully connected layer 360 that it includes to perform further fine feature extraction on the second shallow feature map, and finally obtains the corresponding target face image Face feature map.

值得一提的是，图3仅为本申请的一个实施例。虽然在图3的实施例中，第一卷积神经网络模型仅包括了卷积层，而第二卷积神经网络模型仅包括了卷积层和全连接层，但在实际应用中，可以分别给第一卷积神经网络模型和第二卷积神经网络模型增加一层或多层池化层的设置，因此本申请的各实施例对用于本申请提供的人脸特征点定位方法的级联卷积神经网络模型的具体结构不作任何限定，本申请的保护范围不应因此而受到任何限制。It is worth mentioning that Fig. 3 is only an embodiment of the present application. Although in the embodiment of FIG. 3, the first convolutional neural network model only includes the convolutional layer, and the second convolutional neural network model only includes the convolutional layer and the fully connected layer, in practical applications, they can be separately One or more pooling layers are added to the first convolutional neural network model and the second convolutional neural network model. Therefore, the various embodiments of this application compare the levels used in the facial feature point positioning method provided by this application. The specific structure of the convolutional neural network model is not limited in any way, and the scope of protection of this application should not be restricted in any way.

综上所述，根据图2实施例提供的人脸特征点定位方法，通过利用级联的两个卷积神经网络模型来进行特征图的提取，实现了对人脸特征点较为准确的定位，在此基础上，通过使用双线性插值算法对要输入至第二卷积神经网络模型的第一浅层特征图进行双线性插值，可以进一步提高获取的特征图的精度，同时由于两个卷积神经网络模型的卷积层共享权重，可以减少计算量和参数量，当第二卷积神经网络模型接收到输入该模型的第二浅层特征图后，第二卷积神经网络模型无需从原始图片输入的层次学习图像的语义特征，可以加快模型的训练速度，使得损失函数快速收敛。In summary, according to the facial feature point locating method provided in the embodiment in FIG. 2, by using two cascaded convolutional neural network models to extract feature maps, more accurate facial feature points are located. On this basis, by using the bilinear interpolation algorithm to perform bilinear interpolation on the first shallow feature map to be input to the second convolutional neural network model, the accuracy of the acquired feature map can be further improved. At the same time, due to the two The convolutional layer of the convolutional neural network model shares weights, which can reduce the amount of calculation and the amount of parameters. When the second convolutional neural network model receives the second shallow feature map input to the model, the second convolutional neural network model does not need Learning the semantic features of the image from the input level of the original image can speed up the training speed of the model and make the loss function converge quickly.

图5是根据图4对应实施例示出的一实施例的步骤223的细节流程图。在图5实施例中，针对每一候选特征点获取的目标边长都为相同的预设的边长。如图5所示，可以包括以下步骤：FIG. 5 is a detailed flowchart of step 223 of an embodiment shown according to the embodiment corresponding to FIG. 4. In the embodiment of FIG. 5, the target side length obtained for each candidate feature point is the same preset side length. As shown in Figure 5, the following steps can be included:

步骤2231，针对每一候选特征点，分别利用下列表达式获取针对该候选特征点确定的正方形区域的四个顶点的坐标值：Step 2231: For each candidate feature point, use the following expressions to obtain the coordinate values of the four vertices of the square area determined for the candidate feature point:

(x ₃-r,y ₃-r),(x ₃-r,y ₃+r),(x ₃+r,y ₃-r),(x ₃+r,y ₃+r)， (x ₃ -r,y ₃ -r),(x ₃ -r,y ₃ +r),(x ₃ +r,y ₃ -r),(x ₃ +r,y ₃ +r),

其中，(x ₃，y ₃)是候选特征点的坐标值，r是所述预设的边长的二分之一。 Wherein, (x ₃ , y ₃ ) is the coordinate value of the candidate feature point, and r is one half of the preset side length.

步骤2232，针对每一候选特征点，判断针对该候选特征点确定的正方形区域的四个顶点中每一顶点的坐标值是否都位于所述第一浅层特征图内。Step 2232: For each candidate feature point, determine whether the coordinate value of each of the four vertices of the square region determined for the candidate feature point is located in the first shallow feature map.

在一个实施例中，通过将针对该候选特征点确定的正方形区域的四个顶点中每一顶点的坐标值与所述第一浅层特征图内的所有像素点的坐标值进行比对，实现对针对该候选特征点确定的正方形区域的四个顶点中每一顶点的坐标值是否都位于所述第一浅层特征图内的判断。In an embodiment, it is achieved by comparing the coordinate value of each of the four vertices of the square region determined for the candidate feature point with the coordinate values of all the pixels in the first shallow feature map. Determine whether the coordinate value of each of the four vertices of the square area determined for the candidate feature point is located in the first shallow feature map.

步骤2233，如果是，按照所述坐标值在所述第一浅层特征图中获取对应顶点的像素值。Step 2233, if yes, obtain the pixel value of the corresponding vertex in the first shallow feature map according to the coordinate value.

当针对一个候选特征点确定的正方形区域的四个顶点中每一顶点的坐标值都位于所述第一浅层特征图内时，说明每一顶点的坐标值处的像素值都能获得。When the coordinate value of each of the four vertices of the square area determined for a candidate feature point is located in the first shallow feature map, it means that the pixel value at the coordinate value of each vertex can be obtained.

步骤2234，如果否，对于针对该候选特征点确定的正方形区域的四个顶点中位于所述第一浅层特征图内的顶点，按照各顶点对应的所述坐标值在所述第一浅层特征图中获取对应顶点的像素值。Step 2234, if not, for the vertices located in the first shallow feature map among the four vertices of the square region determined for the candidate feature point, the vertices located in the first shallow layer feature map according to the coordinate value corresponding to each vertex are set in the first shallow layer. Get the pixel value of the corresponding vertex in the feature map.

当针对候选特征点确定的正方形区域的四个顶点中每一顶点的坐标值不都位于所述第一浅层特征图中时，针对该候选特征点确定的正方形区域的四个顶点中也有可能存在至少一个顶点位于所述第一浅层特征图内。When the coordinate value of each of the four vertices of the square region determined for the candidate feature point is not all located in the first shallow feature map, it is also possible that the four vertices of the square region determined for the candidate feature point There is at least one vertex located in the first shallow feature map.

步骤2235，获取针对该候选特征点确定的正方形区域的四个顶点中位于所述第一浅层特征图以外的顶点，作为辅助顶点。Step 2235: Obtain the vertices outside the first shallow feature map among the four vertices of the square region determined for the candidate feature point as auxiliary vertices.

步骤2236，针对每一辅助顶点，在所述第一浅层特征图中获取与该辅助顶点距离最近的像素点处的像素值，作为该辅助顶点的像素值。Step 2236: For each auxiliary vertex, obtain the pixel value of the pixel point closest to the auxiliary vertex in the first shallow feature map as the pixel value of the auxiliary vertex.

本实施例的好处在于，为当正方形区域中的四个顶点中存在顶点不位于第一浅层特征图内时，提供了解决方案，同时，由于在同一特征图中，两个像素点距离越近，这两个像素点对应的像素值越可能相近，所以本实施例还可以保证获取的顶点的像素值的精度，从而可以提高获取的第二浅层特征图的精度。The advantage of this embodiment is that it provides a solution when the four vertices in the square area are not located in the first shallow feature map. At the same time, because the distance between the two pixels in the same feature map is greater Nearly, the pixel values corresponding to the two pixels are more likely to be similar, so this embodiment can also ensure the accuracy of the obtained pixel values of the vertices, thereby improving the accuracy of the obtained second shallow feature map.

在一个实施例中，所述针对每一辅助顶点，在所述第一浅层特征图中获取与该辅助顶点距离最近的像素点处的像素值，作为该辅助顶点的像素值，包括：In an embodiment, for each auxiliary vertex, acquiring the pixel value at the pixel point closest to the auxiliary vertex in the first shallow feature map as the pixel value of the auxiliary vertex includes:

针对每一辅助顶点，针对所述第一浅层特征图中的每一像素点，利用如下公式确定该像素点与该辅助顶点之间的距离：For each auxiliary vertex, for each pixel in the first shallow feature map, use the following formula to determine the distance between the pixel and the auxiliary vertex:

其中，x ₄，y ₄分别是辅助顶点的横坐标和纵坐标，x，y分别是所述第一浅层特征图中像素点的横坐标和纵坐标，D是辅助顶点与所述第一浅层特征图中的像素点之间的距离；针对每一辅助顶点，在针对该辅助顶点获取的所述第一浅层特征图中的每一像素点与该辅助顶点的距离中，获取最小的距离；将最小的距离对应的像素点处的像素值，作为该辅助顶点的像素值。 Where x ₄ and y ₄ are the abscissa and ordinate of the auxiliary vertex, respectively, x, y are the abscissa and ordinate of the pixel in the first shallow feature map, and D is the auxiliary vertex and the first The distance between pixels in the shallow feature map; for each auxiliary vertex, the smallest distance between each pixel in the first shallow feature map and the auxiliary vertex acquired for the auxiliary vertex is obtained The distance; the pixel value at the pixel point corresponding to the smallest distance is used as the pixel value of the auxiliary vertex.

在一个实施例中，所述针对每一辅助顶点，在针对该辅助顶点获取的所述第一浅层特征图中的每一像素点与该辅助顶点的距离中，获取最小的距离，包括：针对每一辅助顶点，在针对该辅助顶点获取的所述第一浅层特征图中的每一像素点与该辅助顶点的距离中任取一个，标记为候选最小距离；在除所述候选最小距离之外的每一像素点与该辅助顶点的距离中，按照预定顺序针对每一像素点与该辅助顶点的距离，判断该像素点与该辅助顶点的距离是否小于所述候选最小距离；如果是，取消所有候选最小距离的标记，将该像素点与该辅助顶点的距离标记为候选最小距离；在没获取过的像素点与该辅助顶点的距离中，按照预定顺序再次针对每一像素点与该辅助顶点的距离，判断该像素点与该辅助顶点的距离是否小于所述候选最小距离，直至没有像素点与该辅助顶点的距离小于所述候选最小距离；将所述候选最小距离作为最小的距离。In one embodiment, for each auxiliary vertex, acquiring the smallest distance among the distances between each pixel point in the first shallow feature map and the auxiliary vertex acquired for the auxiliary vertex includes: For each auxiliary vertex, any one of the distances between each pixel in the first shallow feature map obtained for the auxiliary vertex and the auxiliary vertex is selected and marked as the candidate minimum distance; in addition to the candidate minimum In the distance between each pixel point outside the distance and the auxiliary vertex, determine whether the distance between the pixel point and the auxiliary vertex is less than the candidate minimum distance according to the distance between each pixel point and the auxiliary vertex in a predetermined order; if Yes, cancel the mark of all candidate minimum distances, and mark the distance between the pixel and the auxiliary vertex as the candidate minimum distance; in the distance between the unobtained pixel and the auxiliary vertex, again for each pixel in a predetermined order Determine whether the distance between the pixel and the auxiliary vertex is less than the candidate minimum distance, until there is no pixel and the auxiliary vertex is less than the candidate minimum distance; take the candidate minimum distance as the minimum the distance.

在一个实施例中，在针对每一辅助顶点，针对所述第一浅层特征图中的每一像素点，利用如下公式确定该像素点与该辅助顶点之间的距离之前，所述方法还包括：In one embodiment, for each auxiliary vertex, for each pixel in the first shallow feature map, the following formula is used to determine the distance between the pixel and the auxiliary vertex, the method further include:

针对每一辅助顶点，在所述第一浅层特征图的像素点中获取与该辅助顶点的横坐标之差小于预定坐标差值且纵坐标之差小于预定坐标差值的像素点，作为候选像素点；For each auxiliary vertex, from the pixel points of the first shallow feature map, obtain a pixel whose abscissa difference from the auxiliary vertex is less than a predetermined coordinate difference and whose ordinate difference is less than a predetermined coordinate difference, as a candidate pixel;

所述针对每一辅助顶点，针对所述第一浅层特征图中的每一像素点，利用如下公式确定该像素点与该辅助顶点之间的距离，包括：For each auxiliary vertex, for each pixel in the first shallow feature map, using the following formula to determine the distance between the pixel and the auxiliary vertex includes:

针对每一辅助顶点，针对所述第一浅层特征图中的每一候选像素点，利用如下公式确定该候选像素点与该辅助顶点之间的距离：For each auxiliary vertex, for each candidate pixel in the first shallow feature map, the following formula is used to determine the distance between the candidate pixel and the auxiliary vertex:

其中，x ₄，y ₄分别是辅助顶点的横坐标和纵坐标，x，y分别是所述第一浅层特征图中候选像素点的横坐标和纵坐标，D是辅助顶点与所述第一浅层特征图中的候选像素点之间的距离。 Where x ₄ and y ₄ are the abscissa and ordinate of the auxiliary vertex, respectively, x, y are the abscissa and ordinate of the candidate pixel in the first shallow feature map, and D is the auxiliary vertex and the first The distance between candidate pixels in a shallow feature map.

由于获取像素点与辅助顶点之间的距离的计算复杂度较高，会消耗大量资源，所以本实施例的好处在于，通过首先获取候选像素点，然后再针对候选像素点获取与辅助顶点之间的距离，使得在获取像素点与辅助顶点之间的距离时的计算任务大大降低，从而可以提高获取最小的距离的效率。Since the calculation complexity of obtaining the distance between the pixel and the auxiliary vertex is relatively high and consumes a lot of resources, the advantage of this embodiment is that by first obtaining the candidate pixel, and then obtaining the distance between the candidate pixel and the auxiliary vertex The distance between the pixel point and the auxiliary vertex greatly reduces the calculation task when obtaining the distance between the pixel point and the auxiliary vertex, so that the efficiency of obtaining the minimum distance can be improved.

本申请还提供了一种人脸特征点定位装置，以下是本申请的装置实施例。The present application also provides a device for locating facial feature points. The following are device embodiments of the present application.

图6是根据一示例性实施例示出的一种人脸特征点定位装置的框图。如图6所示，装置600包括：Fig. 6 is a block diagram showing a device for locating facial feature points according to an exemplary embodiment. As shown in FIG. 6, the apparatus 600 includes:

第一获取模块610，被配置为将待定位特征点的目标人脸图像输入至第一卷积神经网络模型，得到所述第一卷积神经网络模型的预定层卷积层输出的包含多个候选特征点的第一浅层特征图。The first acquisition module 610 is configured to input the target face image of the feature points to be located into the first convolutional neural network model, and obtain the predetermined layer convolutional layer output of the first convolutional neural network model. The first shallow feature map of candidate feature points.

插值模块620，被配置为利用双线性插值算法对所述第一浅层特征图中的候选特征点进行双线性插值，得到第二浅层特征图。The interpolation module 620 is configured to perform bilinear interpolation on candidate feature points in the first shallow feature map by using a bilinear interpolation algorithm to obtain a second shallow feature map.

第二获取模块630，被配置为将所述第二浅层特征图输入至与所述第一卷积神经网络模型级联的第二卷积神经网络模型，得到所述第二卷积神经网络模型输出的与所述待定位特征点的目标人脸图像对应的人脸特征图，其中，所述第二卷积神经网络模型的预定层卷积层及所述预定层卷积层之前的所有卷积层中每一卷积层的权重分别与所述第一卷积神经网络模型对应层数的卷积层的权重一致，所述第二卷积神经网络模型的预定层卷积层在所有所述第二卷积神经网络模型的卷积层中的排序和所述第一卷积神经网络模型的预定层卷积层在所有所述第一卷积神经网络模型的卷积层中的排序一致。The second acquisition module 630 is configured to input the second shallow feature map to a second convolutional neural network model cascaded with the first convolutional neural network model to obtain the second convolutional neural network The face feature map output by the model corresponding to the target face image of the feature point to be located, wherein the predetermined convolutional layer of the second convolutional neural network model and all preceding convolutional layers of the predetermined convolutional layer The weight of each convolutional layer in the convolutional layer is the same as the weight of the corresponding convolutional layer of the first convolutional neural network model. The predetermined convolutional layer of the second convolutional neural network model is in all the convolutional layers. The ordering in the convolutional layer of the second convolutional neural network model and the ordering of the predetermined convolutional layer of the first convolutional neural network model in all the convolutional layers of the first convolutional neural network model Unanimous.

在一个实施例中，所述插值模块被进一步配置为：In an embodiment, the interpolation module is further configured to:

针对所述第一浅层特征图中的每一候选特征点，获取要以该候选特征点为坐标中心确定的正方形区域的目标边长；For each candidate feature point in the first shallow feature map, obtain the target side length of a square area to be determined with the candidate feature point as a coordinate center;

针对所述第一浅层特征图中的每一候选特征点，以该候选特征点为坐标中心在所述第一浅层特征图中确定正方形区域，其中，所述正方形区域的边长为针对该候选特征点获取的目标边长；For each candidate feature point in the first shallow feature map, a square area is determined in the first shallow feature map with the candidate feature point as the coordinate center, wherein the side length of the square area is for The target side length obtained by the candidate feature point;

获取针对每一候选特征点确定的正方形区域的四个顶点的坐标值以及每一顶点处的像素值；Acquiring the coordinate values of the four vertices of the square area determined for each candidate feature point and the pixel value at each vertex;

针对每一候选特征点，基于该候选特征点对应的顶点的坐标和每一顶点处的像素值，利用如下公式获取构成所述第二浅层特征图的各个像素点的像素值，以得到第二浅层特征图：For each candidate feature point, based on the coordinates of the vertex corresponding to the candidate feature point and the pixel value at each vertex, the following formula is used to obtain the pixel value of each pixel that constitutes the second shallow feature map to obtain the first Two shallow feature maps:

在一个实施例中，针对每一候选特征点获取的目标边长都为相同的预设的边长，所述获取针对每一候选特征点确定的正方形区域的四个顶点的坐标值以及每一顶点处的像素值，包括：In one embodiment, the target side length obtained for each candidate feature point is the same preset side length, and the obtained coordinate values of the four vertices of the square area determined for each candidate feature point and each The pixel value at the vertex, including:

针对每一候选特征点，分别利用下列表达式获取针对该候选特征点确定的正方形区域的四个顶点的坐标值：For each candidate feature point, the following expressions are used to obtain the coordinate values of the four vertices of the square area determined for the candidate feature point:

其中，(x ₃，y ₃)是候选特征点的坐标值，r是所述预设的边长的二分之一； Wherein, (x ₃ , y ₃ ) is the coordinate value of the candidate feature point, and r is one half of the preset side length;

针对每一候选特征点，判断针对该候选特征点确定的正方形区域的四个顶点中每一顶点的坐标值是否都位于所述第一浅层特征图内；For each candidate feature point, determine whether the coordinate value of each of the four vertices of the square area determined for the candidate feature point is located in the first shallow feature map;

如果是，按照所述坐标值在所述第一浅层特征图中获取对应顶点的像素值；If yes, obtain the pixel value of the corresponding vertex in the first shallow feature map according to the coordinate value;

如果否，对于针对该候选特征点确定的正方形区域的四个顶点中位于所述第一浅层特征图内的顶点，按照各顶点对应的所述坐标值在所述第一浅层特征图中获取对应顶点的像素值；If not, for the vertices located in the first shallow feature map among the four vertices of the square area determined for the candidate feature point, in the first shallow feature map according to the coordinate value corresponding to each vertex Obtain the pixel value of the corresponding vertex;

获取针对该候选特征点确定的正方形区域的四个顶点中位于所述第一浅层特征图以外的顶点，作为辅助顶点；Acquiring a vertex outside the first shallow feature map among the four vertices of the square region determined for the candidate feature point as an auxiliary vertex;

针对每一辅助顶点，在所述第一浅层特征图中获取与该辅助顶点距离最近的像素点处的像素值，作为该辅助顶点的像素值。For each auxiliary vertex, the pixel value at the pixel point closest to the auxiliary vertex is acquired in the first shallow feature map, as the pixel value of the auxiliary vertex.

在一个实施例中，所述针对每一辅助顶点，在所述第一浅层特征图中获取与该辅助顶点距离最近的像素点处的像素值，作为该辅助顶点的像素值，包括：In an embodiment, for each auxiliary vertex, obtaining the pixel value at the pixel point closest to the auxiliary vertex in the first shallow feature map as the pixel value of the auxiliary vertex includes:

其中，x ₄，y ₄分别是辅助顶点的横坐标和纵坐标，x，y分别是所述第一浅层特征图中像素点的横坐标和纵坐标，D是辅助顶点与所述第一浅层特征图中的像素点之间的距离； Where x ₄ and y ₄ are the abscissa and ordinate of the auxiliary vertex, respectively, x, y are the abscissa and ordinate of the pixel in the first shallow feature map, and D is the auxiliary vertex and the first The distance between pixels in the shallow feature map;

针对每一辅助顶点，在针对该辅助顶点获取的所述第一浅层特征图中的每一像素点与该辅助顶点的距离中，获取最小的距离；For each auxiliary vertex, obtain the smallest distance among the distances between each pixel point in the first shallow feature map obtained for the auxiliary vertex and the auxiliary vertex;

将最小的距离对应的像素点处的像素值，作为该辅助顶点的像素值。The pixel value at the pixel point corresponding to the smallest distance is used as the pixel value of the auxiliary vertex.

在一个实施例中，所述针对每一辅助顶点，在针对该辅助顶点获取的所述第一浅层特征图中的每一像素点与该辅助顶点的距离中，获取最小的距离，包括：In one embodiment, for each auxiliary vertex, acquiring the smallest distance among the distances between each pixel point in the first shallow feature map and the auxiliary vertex acquired for the auxiliary vertex includes:

针对每一辅助顶点，在针对该辅助顶点获取的所述第一浅层特征图中的每一像素点与该辅助顶点的距离中任取一个，标记为候选最小距离；For each auxiliary vertex, any one of the distances between each pixel in the first shallow feature map obtained for the auxiliary vertex and the auxiliary vertex is selected, and it is marked as a candidate minimum distance;

在除所述候选最小距离之外的每一像素点与该辅助顶点的距离中，按照预定顺序针对每一像素点与该辅助顶点的距离，判断该像素点与该辅助顶点的距离是否小于所述候选最小距离；In the distance between each pixel point and the auxiliary vertex except the candidate minimum distance, determine whether the distance between the pixel point and the auxiliary vertex is less than the distance between each pixel point and the auxiliary vertex according to a predetermined sequence. The candidate minimum distance;

如果是，取消所有候选最小距离的标记，将该像素点与该辅助顶点的距离标记为候选最小距离；If yes, cancel the mark of all candidate minimum distances, and mark the distance between the pixel and the auxiliary vertex as the candidate minimum distance;

在没获取过的像素点与该辅助顶点的距离中，按照预定顺序再次针对每一像素点与该辅助顶点的距离，判断该像素点与该辅助顶点的距离是否小于所述候选最小距离，直至没有像素点与该辅助顶点的距离小于所述候选最小距离；In the distance between the unacquired pixel and the auxiliary vertex, the distance between each pixel and the auxiliary vertex is again determined in a predetermined order to determine whether the distance between the pixel and the auxiliary vertex is less than the candidate minimum distance, until The distance between no pixel and the auxiliary vertex is less than the candidate minimum distance;

将所述候选最小距离作为最小的距离。The candidate minimum distance is taken as the minimum distance.

在一个实施例中，在针对每一辅助顶点，针对所述第一浅层特征图中的每一像素点，利用如下公式确定该像素点与该辅助顶点之间的距离之前，还包括：In one embodiment, for each auxiliary vertex, for each pixel in the first shallow feature map, before determining the distance between the pixel and the auxiliary vertex using the following formula, the method further includes:

针对每一辅助顶点，针对所述第一浅层特征图中的每一候选像素点，利用如下公式确定该候选像素点与该辅助顶点之间的距离。For each auxiliary vertex, for each candidate pixel in the first shallow feature map, the following formula is used to determine the distance between the candidate pixel and the auxiliary vertex.

根据本申请的第三方面，还提供了一种计算设备，执行上述任一所示的人脸特征点定位方法的全部或者部分步骤。该计算设备包括：According to the third aspect of the present application, there is also provided a computing device that executes all or part of the steps of any one of the above-mentioned facial feature point positioning methods. The computing equipment includes:

至少一个处理器；以及At least one processor; and

与所述至少一个处理器通信连接的存储器；其中，A memory communicatively connected with the at least one processor; wherein,

所述存储器存储有可被所述至少一个处理器执行的指令，所述指令被所述至少一个处理器执行，以使所述至少一个处理器能够执行如上述任一个示例性实施例所示出的人脸特征点定位方法。The memory stores instructions that can be executed by the at least one processor, and the instructions are executed by the at least one processor, so that the at least one processor can execute as shown in any of the above exemplary embodiments. The method of locating facial feature points.

所属技术领域的技术人员能够理解，本申请的各个方面可以实现为系统、方法或程序产品。因此，本申请的各个方面可以具体实现为以下形式，即：完全的硬件实施方式、完全的软件实施方式(包括固件、微代码等)，或硬件和软件方面结合的实施方式，这里可以统称为“电路”、“模块”或“系统”。Those skilled in the art can understand that various aspects of the present application can be implemented as a system, a method, or a program product. Therefore, each aspect of the present application can be specifically implemented in the following forms, namely: complete hardware implementation, complete software implementation (including firmware, microcode, etc.), or a combination of hardware and software implementations, which can be collectively referred to herein as "Circuit", "Module" or "System".

下面参照图7来描述根据本申请的这种实施方式的计算设备700。图7显示的计算设备700仅仅是一个示例，不应对本申请实施例的功能和使用范围带来任何限制。The computing device 700 according to this embodiment of the present application will be described below with reference to FIG. 7. The computing device 700 shown in FIG. 7 is only an example, and should not bring any limitation to the function and use scope of the embodiments of the present application.

如图7所示，计算设备700以通用计算设备的形式表现。计算设备700的组件可以包括但不限于：上述至少一个处理单元710、上述至少一个存储单元720、连接不同系统组件(包括存储单元720和处理单元710)的总线730。As shown in FIG. 7, the computing device 700 is in the form of a general-purpose computing device. The components of the computing device 700 may include, but are not limited to: the aforementioned at least one processing unit 710, the aforementioned at least one storage unit 720, and a bus 730 connecting different system components (including the storage unit 720 and the processing unit 710).

其中，所述存储单元存储有程序代码，所述程序代码可以被所述处理单元710执行，使得所述处理单元710执行本说明书上述“实施例方法”部分中描述的根据本申请各种示例性实施方式的步骤。Wherein, the storage unit stores a program code, and the program code can be executed by the processing unit 710, so that the processing unit 710 executes the various exemplary methods described in the above-mentioned "Embodiment Method" section of this specification. Steps of implementation.

存储单元720可以包括易失性存储单元形式的可读介质，例如随机存取存储单元(RAM)721和/或高速缓存存储单元722，还可以进一步包括只读存储单元(ROM)723。The storage unit 720 may include a readable medium in the form of a volatile storage unit, such as a random access storage unit (RAM) 721 and/or a cache storage unit 722, and may further include a read-only storage unit (ROM) 723.

存储单元720还可以包括具有一组(至少一个)程序模块725的程序/实用工具724，这样的程序模块725包括但不限于：操作系统、一个或者多个应用程序、其它程序模块以及程序数据，这些示例中的每一个或某种组合中可能包括网络环境的实现。The storage unit 720 may also include a program/utility tool 724 having a set of (at least one) program module 725. Such program module 725 includes but is not limited to: an operating system, one or more application programs, other program modules, and program data, Each of these examples or some combination may include the implementation of a network environment.

总线730可以为表示几类总线结构中的一种或多种，包括存储单元总线或者存储单元控制器、外围总线、图形加速端口、处理单元或者使用多种总线结构中的任意总线结构的局域总线。The bus 730 may represent one or more of several types of bus structures, including a storage unit bus or a storage unit controller, a peripheral bus, a graphics acceleration port, a processing unit, or a local area using any bus structure among multiple bus structures. bus.

计算设备700也可以与一个或多个外部设备900(例如键盘、指向设备、蓝牙设备等)通信，还可与一个或者多个使得用户能与该计算设备700交互的设备通信，和/或与使得该计算设备700能与一个或多个其它计算设备进行通信的任何设备(例如路由器、调制解调器等等)通信。这种通信可以通过输入/输出(I/O)接口750进行。并且，计算设备700还可以通过网络适配器760与一个或者多个网络(例如局域网(LAN)，广域网(WAN)和/或公共网络，例如因特网)通信。如图所示，网络适配器760通过总线730与计算设备700的其它模块通信。应当明白，尽管图中未示出，可以结合计算设备700使用其它硬件和/或软件模块，包括但不限于：微代码、设备驱动器、冗余处理单元、外部磁盘驱动阵列、RAID系统、磁带驱动器以及数据备份存储系统等。The computing device 700 may also communicate with one or more external devices 900 (such as keyboards, pointing devices, Bluetooth devices, etc.), and may also communicate with one or more devices that enable a user to interact with the computing device 700, and/or communicate with Any device (eg, router, modem, etc.) that enables the computing device 700 to communicate with one or more other computing devices. This communication can be performed through an input/output (I/O) interface 750. In addition, the computing device 700 may also communicate with one or more networks (for example, a local area network (LAN), a wide area network (WAN), and/or a public network, such as the Internet) through the network adapter 760. As shown in the figure, the network adapter 760 communicates with other modules of the computing device 700 through the bus 730. It should be understood that although not shown in the figure, other hardware and/or software modules can be used in conjunction with the computing device 700, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives And data backup storage system, etc.

通过以上的实施方式的描述，本领域的技术人员易于理解，这里描述的示例实施方式可以通过软件实现，也可以通过软件结合必要的硬件的方式来实现。因此，根据本申请实施方式的技术方案可以以软件产品的形式体现出来，该软件产品可以存储在一个非易失性存储介质(可以是CD-ROM，U盘，移动硬盘等)中或网络上，包括若干指令以使得一台计算设备(可以是个人计算机、服务器、终端装置、或者网络设备等)执行根据本申请实施方式的方法。Through the description of the above embodiments, those skilled in the art can easily understand that the example embodiments described here can be implemented by software, or can be implemented by combining software with necessary hardware. Therefore, the technical solution according to the embodiments of the present application can be embodied in the form of a software product, which can be stored in a non-volatile storage medium (which can be a CD-ROM, U disk, mobile hard disk, etc.) or on the network , Including several instructions to make a computing device (which can be a personal computer, a server, a terminal device, or a network device, etc.) execute the method according to the embodiment of the present application.

根据本申请的第四方面，还提供了一种计算机非易失性可读存储介质，其上存储有能够实现本说明书上述方法的程序产品。在一些可能的实施方式中，本申请的各个方面还可以实现为一种程序产品的形式，其包括程序代码，当所述程序产品在终端设备上运行时，所述程序代码用于使所述终端设备执行本说明书上述“示例性方法”部分中描述的根据本申请各种示例性实施方式的步骤。According to the fourth aspect of the present application, there is also provided a computer non-volatile readable storage medium on which is stored a program product capable of implementing the above-mentioned method in this specification. In some possible implementation manners, various aspects of the present application can also be implemented in the form of a program product, which includes program code. When the program product runs on a terminal device, the program code is used to make the The terminal device executes the steps according to various exemplary embodiments of the present application described in the above-mentioned "Exemplary Method" section of this specification.

参考图8所示，描述了根据本申请的实施方式的用于实现上述方法的计算机非易失性可读存储介质800，其可以采用便携式紧凑盘只读存储器(CD-ROM)并包括程序代码，并可以在终端设备，例如个人电脑上运行。然而，本申请的程序产品不限于此，在本文件中，计算机非易失性可读存储介质可以是任何包含或存储程序的有形介质，该程序可以被指令执行系统、装置或者器件使用或者与其结合使用。Referring to FIG. 8, a computer non-volatile readable storage medium 800 for implementing the above method according to an embodiment of the present application is described, which may adopt a portable compact disk read-only memory (CD-ROM) and includes program code , And can run on terminal devices, such as personal computers. However, the program product of this application is not limited to this. In this document, the computer non-volatile readable storage medium can be any tangible medium that contains or stores a program. The program can be used by or in combination with an instruction execution system, device, or device. In conjunction with.

所述程序产品可以采用一个或多个可读介质的任意组合。可读介质可以是可读信号介质或者可读存储介质。可读存储介质例如可以为但不限于电、磁、光、电磁、红外线、或半导体的系统、装置或器件，或者任意以上的组合。可读存储介质的更具体的例子(非穷举的列表)包括：具有一个或多个导线的电连接、便携式盘、硬盘、随机存取存储器(RAM)、只读存储器(ROM)、可擦式可编程只读存储器(EPROM或闪存)、光纤、便携式紧凑盘只读存储器(CD-ROM)、光存储器件、磁存储器件、或者上述的任意合适的组合。The program product can use any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. The readable storage medium may be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, device, or device, or a combination of any of the above. More specific examples (non-exhaustive list) of readable storage media include: electrical connections with one or more wires, portable disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable Type programmable read only memory (EPROM or flash memory), optical fiber, portable compact disk read only memory (CD-ROM), optical storage device, magnetic storage device, or any suitable combination of the above.

计算机可读信号介质可以包括在基带中或者作为载波一部分传播的数据信号，其中承载了可读程序代码。这种传播的数据信号可以采用多种形式，包括但不限于电磁信号、光信号或上述的任意合适的组合。可读信号介质还可以是可读存储介质以外的任何可读介质，该可读介质可以发送、传播或者传输用于由指令执行系统、装置或者器件使用或者与其结合使用的程序。The computer-readable signal medium may include a data signal propagated in baseband or as a part of a carrier wave, and readable program code is carried therein. This propagated data signal can take many forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing. The readable signal medium may also be any readable medium other than a readable storage medium, and the readable medium may send, propagate, or transmit a program for use by or in combination with the instruction execution system, apparatus, or device.

可读介质上包含的程序代码可以用任何适当的介质传输，包括但不限于无线、有线、光缆、RF等等，或者上述的任意合适的组合。The program code contained on the readable medium can be transmitted by any suitable medium, including but not limited to wireless, wired, optical cable, RF, etc., or any suitable combination of the above.

可以以一种或多种程序设计语言的任意组合来编写用于执行本申请操作的程序代码，所述程序设计语言包括面向对象的程序设计语言—诸如Java、C++等，还包括常规的过程式程序设计语言—诸如“C”语言或类似的程序设计语言。程序代码可以完全地在用户计算设备上执行、部分地在用户设备上执行、作为一个独立的软件包执行、部分在用户计算设备上部分在远程计算设备上执行、或者完全在远程计算设备或服务器上执行。在涉及远程计算设备的情形中，远程计算设备可以通过任意种类的网络，包括局域网(LAN)或广域网(WAN)，连接到用户计算设备，或者，可以连接到外部计算设备(例如利用因特网服务提供商来通过因特网连接)。The program code used to perform the operations of the present application can be written in any combination of one or more programming languages. The programming languages include object-oriented programming languages—such as Java, C++, etc., as well as conventional procedural programming languages. Programming language-such as "C" language or similar programming language. The program code can be executed entirely on the user's computing device, partly on the user's device, executed as an independent software package, partly on the user's computing device and partly executed on the remote computing device, or entirely on the remote computing device or server Executed on. In the case of a remote computing device, the remote computing device can be connected to a user computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it can be connected to an external computing device (for example, using Internet service providers). Business to connect via the Internet).

此外，上述附图仅是根据本申请示例性实施例的方法所包括的处理的示意性说明，而不是限制目的。易于理解，上述附图所示的处理并不表明或限制这些处理的时间顺序。另外，也易于理解，这些处理可以是例如在多个模块中同步或异步执行的。In addition, the above-mentioned drawings are merely schematic illustrations of the processing included in the method according to the exemplary embodiments of the present application, and are not intended for limitation. It is easy to understand that the processing shown in the above drawings does not indicate or limit the time sequence of these processings. In addition, it is easy to understand that these processes can be executed synchronously or asynchronously in multiple modules, for example.

应当理解的是，本申请并不局限于上面已经描述并在附图中示出的精确结构，并且可以在不脱离其范围执行各种修改和改变。本申请的范围仅由所附的权利要求来限制。It should be understood that the present application is not limited to the precise structure that has been described above and shown in the drawings, and various modifications and changes can be performed without departing from its scope. The scope of the application is only limited by the appended claims.

Claims

A method for locating facial feature points, including:

Input the target face image of the feature points to be located into the first convolutional neural network model, and obtain the first shallow feature containing multiple candidate feature points output by the predetermined layer convolutional layer of the first convolutional neural network model Figure;

Using a bilinear interpolation algorithm to perform bilinear interpolation on candidate feature points in the first shallow feature map to obtain a second shallow feature map;

Input the second shallow feature map to a second convolutional neural network model cascaded with the first convolutional neural network model to obtain the output of the second convolutional neural network model and the feature to be located The face feature map corresponding to the target face image of the point, where each of the predetermined convolutional layers of the second convolutional neural network model and all convolutional layers before the predetermined convolutional layer The weights of the first convolutional neural network model correspond to the weights of the convolutional layers corresponding to the number of layers, and the predetermined convolutional layers of the second convolutional neural network model are in all the second convolutional neural network models. The ordering in the convolutional layer of is consistent with the ordering of the predetermined convolutional layer of the first convolutional neural network model in all the convolutional layers of the first convolutional neural network model.

The method according to claim 1, wherein the using a bilinear interpolation algorithm to perform bilinear interpolation on candidate feature points in the first shallow feature map to obtain a second shallow feature map comprises:

For each candidate feature point in the first shallow feature map, obtain the target side length of a square area to be determined with the candidate feature point as a coordinate center;

For each candidate feature point in the first shallow feature map, a square area is determined in the first shallow feature map with the candidate feature point as the coordinate center, wherein the side length of the square area is for The target side length obtained by the candidate feature point;

Acquiring the coordinate values of the four vertices of the square area determined for each candidate feature point and the pixel value at each vertex;

For each candidate feature point, based on the coordinates of the vertex corresponding to the candidate feature point and the pixel value at each vertex, the following formula is used to obtain the pixel value of each pixel that constitutes the second shallow feature map to obtain the first Two shallow feature maps:

Among them, (x, y) is the coordinate value of the pixel point corresponding to the feature point in the second shallow feature map obtained for the candidate feature point, (x ₁ , y ₁ ), (x ₂ , y ₁ ), (x ₁ ,y ₂ ) and (x ₂ ,y ₂ ) are the coordinate values of the four vertices of the square area corresponding to the candidate feature point, f(x ₁ ,y ₁ ), f(x ₂ ,y ₁ ) , F(x ₁ , y ₂ ) and f(x ₂ , y ₂ ) are the pixel values at the four vertices of the square region corresponding to the candidate feature point in the first shallow feature map.

The method according to claim 2, wherein the target side length acquired for each candidate feature point is the same preset side length, and the acquisition of the four vertices of the square area determined for each candidate feature point The coordinate value and the pixel value at each vertex include:

For each candidate feature point, the following expressions are used to obtain the coordinate values of the four vertices of the square area determined for the candidate feature point:

(x ₃ -r,y ₃ -r),(x ₃ -r,y ₃ +r),(x ₃ +r,y ₃ -r),(x ₃ +r,y ₃ +r),

Wherein, (x ₃ , y ₃ ) is the coordinate value of the candidate feature point, and r is one half of the preset side length;

For each candidate feature point, determine whether the coordinate value of each of the four vertices of the square area determined for the candidate feature point is located in the first shallow feature map;

If yes, obtain the pixel value of the corresponding vertex in the first shallow feature map according to the coordinate value;

If not, for the vertices located in the first shallow feature map among the four vertices of the square area determined for the candidate feature point, in the first shallow feature map according to the coordinate value corresponding to each vertex Obtain the pixel value of the corresponding vertex;

Acquiring a vertex outside the first shallow feature map among the four vertices of the square region determined for the candidate feature point as an auxiliary vertex;

For each auxiliary vertex, the pixel value at the pixel point closest to the auxiliary vertex is acquired in the first shallow feature map, as the pixel value of the auxiliary vertex.

The method according to claim 3, wherein, for each auxiliary vertex, the pixel value at the pixel point closest to the auxiliary vertex is obtained in the first shallow feature map as the pixel of the auxiliary vertex Values include:

For each auxiliary vertex, for each pixel in the first shallow feature map, use the following formula to determine the distance between the pixel and the auxiliary vertex:

Where x ₄ and y ₄ are the abscissa and ordinate of the auxiliary vertex, respectively, x, y are the abscissa and ordinate of the pixel in the first shallow feature map, and D is the auxiliary vertex and the first The distance between pixels in the shallow feature map;

For each auxiliary vertex, obtain the smallest distance among the distances between each pixel point in the first shallow feature map obtained for the auxiliary vertex and the auxiliary vertex;

The pixel value at the pixel point corresponding to the smallest distance is used as the pixel value of the auxiliary vertex.

The method according to claim 4, wherein, for each auxiliary vertex, the smallest distance between each pixel in the first shallow feature map and the auxiliary vertex obtained for the auxiliary vertex is obtained The distance includes:

For each auxiliary vertex, any one of the distances between each pixel in the first shallow feature map obtained for the auxiliary vertex and the auxiliary vertex is selected, and it is marked as a candidate minimum distance;

In the distance between each pixel point and the auxiliary vertex except the candidate minimum distance, determine whether the distance between the pixel point and the auxiliary vertex is less than the distance between each pixel point and the auxiliary vertex according to a predetermined sequence. The candidate minimum distance;

If yes, cancel the mark of all candidate minimum distances, and mark the distance between the pixel and the auxiliary vertex as the candidate minimum distance;

In the distance between the unacquired pixel and the auxiliary vertex, the distance between each pixel and the auxiliary vertex is again determined in a predetermined order to determine whether the distance between the pixel and the auxiliary vertex is less than the candidate minimum distance, until The distance between no pixel and the auxiliary vertex is less than the candidate minimum distance;

The candidate minimum distance is taken as the minimum distance.

The method according to claim 4, wherein, for each auxiliary vertex, for each pixel in the first shallow feature map, the following formula is used to determine the distance between the pixel and the auxiliary vertex , The method further includes:

For each auxiliary vertex, from the pixel points of the first shallow feature map, obtain a pixel whose abscissa difference from the auxiliary vertex is less than a predetermined coordinate difference and whose ordinate difference is less than a predetermined coordinate difference, as a candidate pixel;

For each auxiliary vertex, for each pixel in the first shallow feature map, using the following formula to determine the distance between the pixel and the auxiliary vertex includes:

For each auxiliary vertex, for each candidate pixel in the first shallow feature map, the following formula is used to determine the distance between the candidate pixel and the auxiliary vertex.

The method according to claim 2, wherein, for each candidate feature point in the first shallow feature map, acquiring the target side length of a square area to be determined with the candidate feature point as a coordinate center comprises :

Taking each candidate feature point in the first shallow feature map as a coordinate center, acquiring a circular area with a preset first side length in the first shallow feature map;

For each candidate feature point in the first shallow feature map, determine the variance of each pixel point in the circular area corresponding to the candidate feature point;

In the case that the variance is greater than or equal to the preset variance threshold, use the preset first side length as the target side length of the square area to be determined with the candidate feature point as the coordinate center;

In the case that the variance is less than the preset variance threshold, the preset second side length is used as the target side length of the square area to be determined with the candidate feature point as the coordinate center, wherein the preset second side length The side length is smaller than the preset first side length.

A device for locating facial feature points, including:

The first acquisition module is configured to input the target face image of the feature point to be located into the first convolutional neural network model, and obtain the output of the predetermined layer convolutional layer of the first convolutional neural network model, which contains multiple candidates The first shallow feature map of feature points;

An interpolation module, configured to perform bilinear interpolation on candidate feature points in the first shallow feature map by using a bilinear interpolation algorithm to obtain a second shallow feature map;

The second acquisition module is configured to input the second shallow feature map to a second convolutional neural network model cascaded with the first convolutional neural network model to obtain the second convolutional neural network model The output face feature map corresponding to the target face image of the feature point to be located, wherein the predetermined convolutional layer of the second convolutional neural network model and all the volumes before the predetermined convolutional layer The weight of each convolutional layer in the buildup layer is consistent with the weight of the convolutional layer corresponding to the number of layers of the first convolutional neural network model. The predetermined convolutional layer of the second convolutional neural network model is in all the convolutional layers. The ordering in the convolutional layer of the second convolutional neural network model is consistent with the ordering of the predetermined convolutional layer of the first convolutional neural network model in all convolutional layers of the first convolutional neural network model .

The apparatus according to claim 8, wherein the interpolation module is further configured to:

The device according to claim 9, wherein the target side length acquired for each candidate feature point is the same preset side length, and the acquisition of the four vertices of the square region determined for each candidate feature point The coordinate value and the pixel value at each vertex include:

11. The device according to claim 10, wherein, for each auxiliary vertex, the pixel value at the pixel point closest to the auxiliary vertex is obtained in the first shallow feature map as the pixel of the auxiliary vertex Values include:

11. The device according to claim 11, wherein for each auxiliary vertex, the smallest distance between each pixel in the first shallow feature map and the auxiliary vertex obtained for the auxiliary vertex is obtained The distance includes:

The candidate minimum distance is taken as the minimum distance.

11. The device of claim 11, wherein for each auxiliary vertex, for each pixel in the first shallow feature map, before determining the distance between the pixel and the auxiliary vertex using the following formula ,Also includes:

The device according to claim 9, wherein, for each candidate feature point in the first shallow feature map, acquiring the target side length of a square area to be determined with the candidate feature point as a coordinate center comprises :

A computing device includes a memory and a processor. The memory stores computer-readable instructions, and when the computer-readable instructions are executed by the processor, the processor executes:

The computing device according to claim 15, wherein said using a bilinear interpolation algorithm to perform bilinear interpolation on candidate feature points in the first shallow feature map to obtain a second shallow feature map comprises:

The computing device according to claim 16, wherein the target side length obtained for each candidate feature point is the same preset side length, and the four vertices of the square area determined for each candidate feature point are obtained The coordinate value of and the pixel value at each vertex, including:

For each auxiliary vertex, the pixel value at the pixel point closest to the auxiliary vertex is obtained in the first shallow feature map, as the pixel value of the auxiliary vertex.

The computing device according to claim 17, wherein for each auxiliary vertex, the pixel value at the pixel point closest to the auxiliary vertex is obtained in the first shallow feature map as the value of the auxiliary vertex Pixel value, including:

The computing device according to claim 18, wherein, for each auxiliary vertex, in the distance between each pixel in the first shallow feature map obtained for the auxiliary vertex and the auxiliary vertex, the The minimum distance includes:

The candidate minimum distance is taken as the minimum distance.

18. The computing device according to claim 18, wherein, for each auxiliary vertex, for each pixel in the first shallow feature map, the distance between the pixel and the auxiliary vertex is determined using the following formula Previously, when the computer-readable instructions were executed by the processor, the processor was caused to also execute:

The computing device according to claim 16, wherein, for each candidate feature point in the first shallow feature map, acquiring the target side length of a square area to be determined with the candidate feature point as a coordinate center, include:

A computer non-volatile readable storage medium storing computer readable instructions, which when executed by one or more processors, cause one or more processors to execute any one of claims 1 to 7 The method described in the item.