RU2840316C1 - Method and system for authenticating face on image

Info

Publication number: RU2840316C1
Application number: RU2024131563A
Authority: RU
Inventors: Владимир Игоревич Михеюшкин; Кирилл Сергеевич Митягин; Михаил Вячеславович Сосульников; Данил Александрович Кононыхин; Анна Андреевна Варфоломеева; Ксения Антоновна Телегина
Original assignee: Публичное Акционерное Общество "Сбербанк России" (Пао Сбербанк)
Filing date: 2024-10-21
Publication date: 2025-05-21

Abstract

FIELD: computer engineering.

SUBSTANCE: the method comprises the steps of: obtaining an original image of a user; determining the coordinates of the position of the user's face; forming a face image in accordance with those coordinates; determining the positions of face key points for the face image and of user key points for the user image; generating a first mask of face key points for the face image and a second mask of user key points for the user image; calculating a first vector representation of face features based on the face image and the first mask, and a second vector representation of user features based on the user image and the second mask; determining, based on the obtained feature vectors, a final estimate of the probability that the user in the original image is authentic; and determining, based on the value of that estimate, the class of the image, which indicates whether the face presented in the image is authentic or fake.

EFFECT: high accuracy when determining the authenticity of a face.

15 cl, 5 dwg

Description

TECHNICAL FIELD

[001] The present invention relates generally to computing technology and, in particular, to a method and system for determining the authenticity of a face in an image in order to protect against fraud.

BACKGROUND ART

[002] Several solutions are currently known for protecting facial biometrics against fraud. Fraud here means a situation in which an attacker tries to impersonate another person in a facial identification system by substituting the photograph or video recording captured from the user terminal. Examples of spoofs include a color printout of a person's photograph on paper, a photograph or video recording shown on a phone screen, and a paper or three-dimensional mask made of special materials.

[003] Facial identification protection technologies can determine whether a photograph or video recording received from a user terminal is fake or real. To do so, they perform a computational analysis of facial features, usually based on machine learning algorithms, in particular artificial neural networks (ANNs), to reach the final authenticity decision.

[004] A method is known for detecting attempts to spoof a biometric system by presenting it with a photograph of a person registered in the system (Method and device for recognizing facial relief, patent No. RU 2431190 C2, 22.06.2009); the method is based on an analysis of facial relief. The approach performs a sequence of operations aimed at identifying signs that the photographed object is three-dimensional. In particular, two images of the target object are captured: one with the illumination on, the other with it off. Regions containing the face are located in the captured images, the regions are matched between the pair of images, and an intensity distribution map is built. Analyzing how the intensity changes between the matched regions makes it possible to decide whether the face presented in the images has relief and to detect an attempt to deceive the biometric system.

[005] A solution is known for detecting attempts to deceive biometric facial recognition systems based on the analysis of an infrared image (Living body detection method and device, patent No. CN112883758B, 25.08.2023). Features of a region of interest containing key facial features are extracted from the infrared image and fed to a neural network classifier. The classifier determines whether information about the distribution of veins is present in the image, after which a decision is made on the authenticity of the original data.

[006] A method is known for detecting spoofs in biometric user identification systems based on extracting face depth information from successive images (Anti-spoofing, patent No. GB2579583B, 06.01.2021). This approach uses a machine learning algorithm that is trained to reconstruct pseudo-depth using reference three-dimensional images. A dedicated processing module is also used that adapts the parameters for matching the results of the image analysis against the real depth. The trained system checks whether the computed depth agrees with the values expected for a real face. If a discrepancy is detected, the image is classified as fake. The disadvantage of this approach is that it requires several successive frames from the user terminal, which introduces a delay in data processing.

[007] The prior art also includes a method for analyzing the authenticity of a face in an image using a trained artificial intelligence model (Method and device for determining face liveness, patent No. KR102509134B1, 14.03.2023). The system first receives a first image containing a face, from which training data is generated. To do this, the received image is processed in various ways, such as rotation, scaling and shifting, to create multiple variations of the original data. A second image is then fed to the trained model to extract a feature vector, on the basis of which a decision is made about whether a real face is present in the image.

[008] A method for protecting biometric identification from an image using neural networks is also known (Biometric task network, patent No. US11776323B2, 03.10.2023). It uses several separate neural networks: a common network extracts key image features, and these features are then passed to specialized networks that determine the face region, texture, body pose and other parameters. The results are passed to a feature-fusion neural network, which produces the decision on whether the input photograph is genuine.

[009] A method is known for protecting biometric systems from attackers based on the analysis of texture characteristics of an image (Facial liveness detection, patent No. US11244150B2, 08.02.2022). The system locates the face region in the received image and then determines specular and texture features of the face. The specular features are extracted from the Y channel of the image after it is converted to the YUV color space. The features are then combined into a final vector, on the basis of which the final decision is made.

[010] A method for determining the authenticity of a person in an image is also known (Method and apparatus with liveness detection, patent No. US11804070B2, 31.10.2023), in which detection algorithms locate the face region and various image parameters are extracted, such as hue, face tilt, color component balance, brightness and contrast. The parameters are compared with predefined ranges of values computed on the training set. If the parameters fall outside the permissible values, the system can correct them. After correcting the parameters, the system applies a neural network-based model to determine the authenticity of the image. This approach is known to be sensitive to ambient lighting conditions, changes in which are difficult to compensate for even with the correction system.

[011] The closest analogue of the claimed invention is a method for determining the authenticity of a human face (Liveness test method and apparatus, patent No. US10121059B2, 06.11.2018), which is based on processing an image with a neural network. A subregion of the face is extracted from the original image and fed to the first input layer of the neural network model, and the original face image is fed to the second input layer. The model processes the two images, extracts texture data from the first image and combines it with features obtained from the second image. The combined features are processed by the subsequent layers of the neural network, which outputs a decision on whether the person in the image is authentic. This approach uses only the face image, whereas the original, full-frame photograph from the user terminal can also be used to assess authenticity. The original image carries general texture information and may contain artifacts that are especially important for detecting certain spoofing attacks, for example a person shown on a mobile phone, where the visible edges or bezel of the screen are a strong indication of an attack.

[012] Moreover, when determining the authenticity of an image, the known solutions do not take into account information about the positions of the user's key points, which characterize, for example, the contours of the face (eyes, nose, mouth) and the location of the arms, shoulders and so on. Using such data in image analysis makes it possible to explicitly highlight regions of interest, which increases the resulting accuracy of authenticity determination models.

[013] To reliably identify signs of a spoof, the face image and the user image must also be processed jointly. In particular, the face image makes it possible to find unnatural changes in skin structure when a fake mask is used, while the user image makes it possible to detect artifacts of the overall texture and breaks in the background, which most often appear, for example, when a printed photograph is shown. Thus, analyzing the combined features of the face and user images makes the biometric system more reliable when an attacker uses various types of spoofs.

DISCLOSURE OF THE INVENTION

[014] The technical problem addressed by the present invention is to create a new efficient, simple and reliable solution for determining the authenticity of a face in an image.

[015] The technical result achieved in solving this problem is increased accuracy in determining the authenticity of a face.

[016] This technical result is achieved by a method for determining the authenticity of a face in an image, performed by at least one computing device and comprising the steps of:

a. obtaining an original image of a user;

b. determining the coordinates of the position of the user's face;

c. forming a face image in accordance with the coordinates of the position of the user's face;

d. determining the positions of face key points for the face image and the positions of user key points for the user image;

e. forming a first mask of face key points for the face image and a second mask of user key points for the user image;

f. calculating a first vector representation of face features based on the face image and the first mask of face key points, and a second vector representation of user features based on the user image and the second mask of user key points;

g. determining, based on the first and second feature vectors, a final estimate of the probability that the user in the original image is authentic;

h. determining, based on the value of the final estimate, the class of the image, which indicates whether the face presented in the image is genuine or fake.
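
The sequence of steps a-h can be sketched roughly as follows. This is only an illustrative outline, not the claimed implementation: the function names, the `(x0, y0, x1, y1)` box format, the point-only mask and the 0.5 decision threshold are all assumptions introduced here, and the detector, key point locator, feature extractors and scoring function are passed in as placeholders because the text does not fix them.

```python
import numpy as np

def crop(image, box):
    """Step c: cut the face region out of the user image (box = x0, y0, x1, y1)."""
    x0, y0, x1, y1 = box
    return image[y0:y1, x0:x1]

def point_mask(shape, points):
    """Step e (simplified): mark each key point (x, y) with 1.0 in an empty mask."""
    mask = np.zeros(shape[:2], dtype=np.float32)
    for x, y in points:
        mask[y, x] = 1.0
    return mask

def classify(image, detect_face, find_keypoints, embed_face, embed_user,
             score, threshold=0.5):
    """Run steps b-h on an already obtained image (step a).

    All callables are hypothetical stand-ins for trained components;
    the threshold value is an assumption, not taken from the source.
    """
    box = detect_face(image)                          # step b
    face = crop(image, box)                           # step c
    face_points = find_keypoints(face)                # step d
    user_points = find_keypoints(image)
    face_mask = point_mask(face.shape, face_points)   # step e
    user_mask = point_mask(image.shape, user_points)
    v_face = embed_face(face, face_mask)              # step f
    v_user = embed_user(image, user_mask)
    p = score(v_face, v_user)                         # step g
    label = "genuine" if p >= threshold else "fake"   # step h
    return label, p

# Toy stand-ins so the sketch runs end to end; real systems would use trained models.
image = np.ones((8, 8), dtype=np.float32)
toy_embed = lambda img, mask: np.array([img.mean(), (img * mask).sum()])
toy_score = lambda a, b: float(1.0 / (1.0 + np.exp(-(a.sum() + b.sum()))))
label, p = classify(image, lambda img: (2, 2, 6, 6), lambda img: [(1, 1), (2, 2)],
                    toy_embed, toy_embed, toy_score)
```

Passing the components in as callables keeps the outline independent of any particular detector or network architecture.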

[017] In one particular embodiment of the method, steps f-g are performed by a machine learning model or an ensemble of models, the model or ensemble having been trained on a data set containing genuine and fake images of users' faces.

[018] In another particular embodiment of the method, the ensemble consists of individual models, each of which is trained to detect a type of spoof specified by the developer.

[019] In another particular embodiment of the method, the masks of face and user key points are formed by a neural network trained to increase the accuracy of the final estimate of the probability of the user's authenticity.

[020] In another particular embodiment of the method, forming the masks of face and user key points comprises the steps of:

- forming closed contours in accordance with the coordinates of the key points;

- increasing the line width of the closed contours either by adding a specified number of neighboring pixels to the lines or by applying blur filters with specified parameters, for example a Gaussian filter.
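
A minimal NumPy sketch of this mask construction is given below. It assumes integer `(x, y)` key points, a single-channel mask with values in [0, 1], and a square neighborhood for the pixel-adding variant; the function names and parameter values are illustrative only.

```python
import numpy as np

def draw_closed_contour(shape, points):
    """Rasterize a closed polygon through integer (x, y) key points."""
    mask = np.zeros(shape, dtype=np.float32)
    n = len(points)
    for i in range(n):
        (x0, y0), (x1, y1) = points[i], points[(i + 1) % n]  # wrap to close the contour
        steps = max(abs(x1 - x0), abs(y1 - y0)) + 1
        xs = np.rint(np.linspace(x0, x1, steps)).astype(int)
        ys = np.rint(np.linspace(y0, y1, steps)).astype(int)
        mask[ys, xs] = 1.0
    return mask

def widen(mask, radius):
    """Thicken lines by adding `radius` neighboring pixels on every side."""
    h, w = mask.shape
    padded = np.pad(mask, radius)  # zero padding
    out = np.zeros_like(mask)
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            out = np.maximum(out, padded[dy:dy + h, dx:dx + w])
    return out

def gaussian_blur(mask, sigma):
    """Alternative widening: separable Gaussian blur (mask must be larger than the kernel)."""
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-x ** 2 / (2.0 * sigma ** 2))
    kernel /= kernel.sum()
    rows = np.apply_along_axis(lambda r: np.convolve(r, kernel, mode="same"), 1, mask)
    return np.apply_along_axis(lambda c: np.convolve(c, kernel, mode="same"), 0, rows)

contour = draw_closed_contour((16, 16), [(4, 4), (11, 4), (11, 11), (4, 11)])
wide = widen(contour, radius=1)
soft = gaussian_blur(contour, sigma=1.0)
```

The pixel-adding variant yields a hard-edged binary band, while the Gaussian variant yields a soft band whose values fall off with distance from the contour.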

[021] In another particular embodiment of the method, an additional step is performed in which the images are normalized either by bringing the pixel values to a specified mean and variance or by a linear transformation to a specified range of values.
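
Both normalization variants can be sketched in a few lines. The function names, default targets and the small `eps` guard are assumptions; the target spread is given here as a standard deviation, which is the square root of the target variance.

```python
import numpy as np

def standardize(image, mean=0.0, std=1.0, eps=1e-8):
    """Shift and scale pixel values to a target mean and standard deviation."""
    image = image.astype(np.float32)
    return (image - image.mean()) / (image.std() + eps) * std + mean

def rescale(image, lo=0.0, hi=1.0):
    """Linearly map pixel values onto the range [lo, hi]."""
    image = image.astype(np.float32)
    span = image.max() - image.min()
    if span == 0:
        return np.full_like(image, lo)  # constant image: avoid division by zero
    return (image - image.min()) / span * (hi - lo) + lo

pixels = np.arange(12, dtype=np.uint8).reshape(3, 4)
z = standardize(pixels)
r = rescale(pixels, -1.0, 1.0)
```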

[022] In another particular embodiment of the method, an additional step is performed in which the images are resized by two-dimensional interpolation to match the inputs of the image feature extraction models.
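
The text does not name a specific interpolation scheme. As one possible instance, the sketch below uses bilinear interpolation with corner-aligned sampling; both choices, and the function name, are assumptions made for illustration.

```python
import numpy as np

def resize_bilinear(image, out_h, out_w):
    """Resize an (H, W) or (H, W, C) image with bilinear interpolation."""
    in_h, in_w = image.shape[:2]
    ys = np.linspace(0, in_h - 1, out_h)   # sample positions in the source grid
    xs = np.linspace(0, in_w - 1, out_w)
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, in_h - 1)
    x1 = np.minimum(x0 + 1, in_w - 1)
    wy = (ys - y0)[:, None]                # vertical blend weights
    wx = (xs - x0)[None, :]                # horizontal blend weights
    img = image.astype(np.float32)
    if img.ndim == 3:                      # broadcast weights over channels
        wy = wy[..., None]
        wx = wx[..., None]
    top = img[y0][:, x0] * (1 - wx) + img[y0][:, x1] * wx
    bottom = img[y1][:, x0] * (1 - wx) + img[y1][:, x1] * wx
    return top * (1 - wy) + bottom * wy

small = np.array([[0.0, 2.0], [4.0, 6.0]])
resized = resize_bilinear(small, 3, 3)
```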

[023] In another particular embodiment of the method, forming the face image comprises the steps of:

- selecting a square region in the user image in accordance with the coordinates of the face position;

- enlarging the square region by a specified number of pixels, the image located within the enlarged square region being taken as the face image.

[024] In another particular embodiment of the method, additional steps are performed in which it is determined that the enlarged square region extends beyond the user image, and the boundary pixels of the user image are replicated to fill that region.
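
The cropping described in [023] and [024] might look like the sketch below. The `(x0, y0, x1, y1)` box format, the choice of the longer box side as the square side, and the function name are assumptions; boundary replication is done with `np.pad(..., mode="edge")`.

```python
import numpy as np

def crop_face_square(image, box, margin):
    """Crop a square around the face box, enlarged by `margin` pixels per side.

    Parts of the square that fall outside the image are filled by
    replicating the nearest boundary pixels.
    """
    x0, y0, x1, y1 = box
    side = max(x1 - x0, y1 - y0)
    cx, cy = (x0 + x1) // 2, (y0 + y1) // 2
    half = side // 2 + margin
    left, top = cx - half, cy - half
    right, bottom = cx + half, cy + half
    h, w = image.shape[:2]
    # Amount by which the square sticks out of the image on any side.
    pad = max(0, -left, -top, right - w, bottom - h)
    pad_width = [(pad, pad), (pad, pad)] + [(0, 0)] * (image.ndim - 2)
    padded = np.pad(image, pad_width, mode="edge")
    return padded[top + pad:bottom + pad, left + pad:right + pad]

frame = np.arange(36).reshape(6, 6)
inside = crop_face_square(frame, (1, 1, 5, 5), margin=0)
outside = crop_face_square(frame, (1, 1, 5, 5), margin=2)
```

In this toy example the second call requests an 8x8 square from a 6x6 image, so a one-pixel border of replicated edge values is added on every side.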

[025] In another particular embodiment of the method, additional steps are performed in which the following are formed:

- input data for the first model, by matrix addition or multiplication of the pixel values of the first key point mask and the processed face image;

- input data for the second model, by matrix addition or multiplication of the pixel values of the second key point mask and the processed user image, wherein, when the addition or multiplication is performed, the elements of the key point mask matrix are multiplied by a predetermined numerical coefficient.
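
As an illustration of this combination, the sketch below treats the addition and multiplication as elementwise operations on arrays of equal height and width, with the mask broadcast over color channels. The function names, the channel-last layout and the coefficient value are assumptions.

```python
import numpy as np

def _match_channels(image, mask):
    """Give a 2-D mask a trailing axis so it broadcasts over (H, W, C) images."""
    return mask[..., None] if image.ndim == 3 else mask

def combine_add(image, mask, coeff):
    """Model input = image + coeff * mask (elementwise)."""
    return image + coeff * _match_channels(image, mask)

def combine_multiply(image, mask, coeff):
    """Model input = image * (coeff * mask) (elementwise)."""
    return image * (coeff * _match_channels(image, mask))

rgb = np.ones((2, 2, 3), dtype=np.float32)
keypoints = np.array([[1.0, 0.0], [0.0, 1.0]], dtype=np.float32)
added = combine_add(rgb, keypoints, coeff=0.5)
multiplied = combine_multiply(rgb, keypoints, coeff=0.5)
```

Under this reading, addition brightens the pixels along the key point contours and leaves the rest of the image unchanged, whereas multiplication by a binary mask zeroes every pixel outside the mask.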

[026] In another particular embodiment of the method, additional steps are performed in which the following are formed:

- input data for the first model, by concatenating the matrices of the first key point mask and the processed face image; and

- input data for the second model, by concatenating the matrices of the second key point mask and the processed user image.
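
One common way to realize such a concatenation is to append the mask as an extra channel of the image. The sketch below assumes a channel-last `(H, W, C)` layout, which the text does not specify; the function name is illustrative.

```python
import numpy as np

def concat_mask(image, mask):
    """Stack the key point mask onto the image as an additional channel."""
    if image.ndim == 2:                    # grayscale: add a channel axis first
        image = image[..., None]
    return np.concatenate([image, mask[..., None]], axis=-1)

rgb_image = np.zeros((4, 4, 3), dtype=np.float32)
kp_mask = np.eye(4, dtype=np.float32)
model_input = concat_mask(rgb_image, kp_mask)   # shape (4, 4, 4)
```

Unlike addition or multiplication, concatenation leaves the original pixel values untouched and lets the model learn how to use the mask channel.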

[027] In another particular embodiment of the method, additional steps are performed in which the following are formed:

- input data for the first model, by a weighted sum of three matrices corresponding to the pixels of the first key point mask, the inverted first key point mask and the processed face image;

- input data for the second model, by a weighted sum of three matrices corresponding to the pixels of the second key point mask, the inverted second key point mask and the processed user image.
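
A weighted sum of this kind can be written as a*M + b*(1 - M) + c*I, where M is the mask and I the image. The sketch below assumes that mask values lie in [0, 1], so that the inverted mask is 1 - M; the weights and the function name are hypothetical.

```python
import numpy as np

def weighted_combine(image, mask, w_mask, w_inv, w_image):
    """Return w_mask * M + w_inv * (1 - M) + w_image * I (elementwise)."""
    if image.ndim == 3:                    # broadcast the mask over channels
        mask = mask[..., None]
    inverted = 1.0 - mask                  # assumes mask values in [0, 1]
    return w_mask * mask + w_inv * inverted + w_image * image

gray = np.full((2, 2), 2.0, dtype=np.float32)
diag_mask = np.array([[1.0, 0.0], [0.0, 1.0]], dtype=np.float32)
blended = weighted_combine(gray, diag_mask, w_mask=0.5, w_inv=0.1, w_image=1.0)
```

Separate weights for the mask and its inverse allow the regions on and off the key point contours to be offset by different amounts.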

[028] In another particular embodiment of the method, to determine the class of the image, the first and second feature vectors are processed by independent layers of a neural network.

[029] In another particular embodiment of the method, to determine the class of the image, the first and second feature vectors are concatenated and then processed by a common neural network.
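
The two fusion variants in [028] and [029] can be contrasted with a small sketch. The layer sizes, the ReLU and sigmoid activations, and in particular the averaging of the two branch outputs in the independent variant are assumptions made here for illustration; the text does not say how the outputs of the independent layers are merged.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def dense(x, w, b):
    """A single fully connected layer without activation."""
    return x @ w + b

def score_independent(v_face, v_user, params):
    """[028]-style: each vector passes through its own layer.

    Averaging the two branch probabilities is an assumed merge rule.
    """
    p_face = sigmoid(dense(v_face, params["w_face"], params["b_face"]))
    p_user = sigmoid(dense(v_user, params["w_user"], params["b_user"]))
    return float((p_face + p_user) / 2.0)

def score_concat(v_face, v_user, params):
    """[029]-style: concatenate the vectors, then apply a shared two-layer network."""
    x = np.concatenate([v_face, v_user])
    hidden = np.maximum(0.0, dense(x, params["w1"], params["b1"]))  # ReLU
    return float(sigmoid(dense(hidden, params["w2"], params["b2"])))

# Random, untrained weights purely to exercise the shapes.
rng = np.random.default_rng(0)
d, h = 8, 4
params = {
    "w_face": rng.normal(size=d), "b_face": 0.0,
    "w_user": rng.normal(size=d), "b_user": 0.0,
    "w1": rng.normal(size=(2 * d, h)), "b1": np.zeros(h),
    "w2": rng.normal(size=h), "b2": 0.0,
}
v_face, v_user = rng.normal(size=d), rng.normal(size=d)
p_independent = score_independent(v_face, v_user, params)
p_concat = score_concat(v_face, v_user, params)
```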

[030] In another preferred embodiment of the claimed solution, a system for determining the authenticity of a face in an image is provided, comprising at least one computing device and at least one memory containing machine-readable instructions which, when executed by the at least one computing device, perform the above method.

КРАТКОЕ ОПИСАНИЕ ЧЕРТЕЖЕЙBRIEF DESCRIPTION OF DRAWINGS

[031] Признаки и преимущества настоящего изобретения станут очевидными из приводимого ниже подробного описания изобретения и прилагаемых чертежей, на которых:[031] The features and advantages of the present invention will become apparent from the following detailed description of the invention and the accompanying drawings, in which:

[032] На Фиг. 1 представлена общая схема взаимодействия элементов системы определения подлинности лица.[032] Fig. 1 shows a general diagram of the interaction of elements of the face authenticity determination system.

[033] На Фиг. 2 представлен пример исходного изображения пользователя и изображения лица.[033] Fig. 2 shows an example of an original user image and a face image.

[034] Fig. 3 shows an example of a processed user image.

[035] Fig. 4 shows an example of forming a closed contour with an increased line width for the key point mask of a user's face image.

[036] Fig. 5 shows an example of a general view of a computing device.

IMPLEMENTATION OF THE INVENTION

[037] The concepts and terms necessary for understanding this technical solution are described below.

[038] In this technical solution, the term "system" means, among other things, a computer system, a computer (electronic computing machine), a CNC (computer numerical control) system, a PLC (programmable logic controller), computerized control systems, and any other devices capable of performing a given, clearly defined sequence of operations (actions, instructions).

[039] A command processing device is an electronic unit or integrated circuit (microprocessor) that executes machine instructions (programs).

[040] The command processing device reads and executes machine instructions (programs) from one or more data storage devices. The data storage devices may include, but are not limited to, hard disk drives (HDD), flash memory, ROM (read-only memory), solid-state drives (SSD), and optical drives.

[041] A program is a sequence of instructions intended for execution by the control unit of a computer or by a command processing device.

[042] A database (DB) is a collection of data organized according to a conceptual structure that describes the characteristics of these data and the relationships among them, and that supports one or more application areas (ISO/IEC 2382:2015, 2121423 "database").

[043] A signal is a material embodiment of a message for use in the transmission, processing, and storage of information.

[044] A logic element is an element that implements certain logical relationships between input and output signals. Logic elements are commonly used to build the logic circuits of computers and discrete circuits for automatic monitoring and control. All types of logic elements, regardless of their physical nature, are characterized by discrete values of input and output signals.

[045] In accordance with the diagram shown in Fig. 1, the claimed system for determining the authenticity of a face in an image comprises: an image forming device 10; an image processing device 20; key point mask generation devices 31, 32; input data forming devices 41, 42; image feature determination models 51, 52; and a feature classification device 60.

[046] The image forming device 10 may be implemented on the basis of at least one photo or video camera of any of various types, in particular a PTZ camera, an IP camera, a stationary vandal-proof camera, a 360-degree camera, or the camera of a mobile device or terminal, equipped with a face detector, for example, the one disclosed in the article by P. Viola and M. J. Jones, "Robust Real-Time Face Detection". The following approaches may also be used as the face detection algorithm: adaptive boosting (AdaBoost) and the Viola-Jones method based on it, MTCNN, elastic graph matching, Facebook's DeepFace, the YOLO object detection method, and the R-CNN family of neural network architectures (R-CNN, Fast R-CNN, Faster R-CNN).

[047] The image processing device 20 may be implemented on the basis of a computing device and equipped with transistor-based logic elements and various signal converters (DACs, ADCs, etc.) arranged so as to receive and process signals from the device 10 in the manner described below.

[048] The key point mask generation devices 31, 32 may be implemented on the basis of a computing device or of neural networks containing input, output, and other layers consisting of a plurality of neurons with corresponding weight coefficients, tuned by training the neural networks on a training data set that, in particular, characterizes the user's key points in the image for forming an attention matrix. For the initial detection of key points, known facial landmark detection algorithms are used, for example, the one disclosed in the "Face landmark detection guide", published on the Internet at: https://developers.google.com/mediapipe/solutions/vision/face_landmarker, or algorithms such as Active Appearance Models (AAM), Active Shape Models (ASM), SURF, NeoFace, SHORE, ROI, Template Matching Methods, and DPM (deformable part model).

[049] The input data forming devices 41, 42 for the models may be implemented on the basis of a computing device equipped with known software and hardware for combining the generated key point masks with the processed images received from the device 20.

[050] The image feature determination models 51, 52 may be implemented on the basis of neural networks containing input, output, and other layers consisting of a plurality of neurons with corresponding weight coefficients, tuned by training on labeled images and the corresponding data (masks) characterizing the structure of the key areas, wherein model 51 is trained on images of user faces, in contrast to model 52, which is trained on user images. Model 51 therefore analyzes characteristic facial features more effectively, in particular skin texture, variations in facial color, and the presence of small-scale artifacts such as glare or reflections. Such analysis is extremely important for detecting sophisticated types of fakes, such as silicone masks or three-dimensional masks printed on a 3D printer. Model 52, in turn, takes into account large details in the image to detect an unnatural position of body parts, performs a general analysis of the image texture, and identifies signs of a break or change in the background, which may occur, for example, when a printed photograph of a face or user is shown.

[051] Data augmentation for training one or more machine learning models may be performed using at least one of the following approaches: image scaling (enlargement, reduction); image cropping; darkening of the entire image or of individual image channels; lightening of the entire image or of individual image channels; contrast enhancement; color transformations: swapping (shuffling) color channels, amplifying or attenuating one or more color channels, conversion to grayscale, conversion to a monochrome image, removal of a color channel; image shifts and decentering; image rotations by various angles in various directions, rotation of the image or a part of it; image tilts and skews; mirroring about an arbitrary axis or line; additional lines or geometric objects in the image: with or without color transparency, colored objects, gray objects (from white to black), including removal of part of the image (placing a black object on the image) at geometric or semantic positions of the image; adding an arbitrary background to the image; glare and darkening of parts of the image; defocusing (blurring) of the image or its parts; increasing the graininess or sharpness of the image; compression and stretching along axes or lines; adding noise to the entire image or a part of it, including white or other noise; adding one or more elements of Gaussian noise or speckle noise; combining (overlaying) two or more images (or parts of images) from the training sample with different weights; elastic transformation of the image (Elastic Transform); grid distortion of the image (GridDistortion); compression of the image data by various image processing algorithms at a given quality (for example, compressing the original bmp image according to the JPEG standard at a given quality and then converting it back to a bmp image); isotropic, affine, and other transformations.
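A few of the augmentations listed in [051] can be sketched as follows. This is an illustrative sketch only: a grayscale image is represented as a list of rows of 8-bit values, and the function names, probability, brightness range, and noise level are assumptions rather than values taken from the description.

```python
import random

def flip_horizontal(img):
    """Mirror the image about its vertical axis."""
    return [row[::-1] for row in img]

def adjust_brightness(img, factor):
    """Darken (factor < 1) or lighten (factor > 1), clipping to 0..255."""
    return [[min(255, max(0, round(p * factor))) for p in row] for row in img]

def add_gaussian_noise(img, sigma, rng):
    """Add zero-mean Gaussian noise to every pixel, clipping to 0..255."""
    return [[min(255, max(0, round(p + rng.gauss(0.0, sigma)))) for p in row]
            for row in img]

def augment(img, rng):
    """Apply a random combination of the transformations above."""
    if rng.random() < 0.5:                      # flip half of the samples
        img = flip_horizontal(img)
    img = adjust_brightness(img, rng.uniform(0.7, 1.3))
    return add_gaussian_noise(img, sigma=5.0, rng=rng)
```

Passing a seeded `random.Random` instance as `rng` keeps the augmentation reproducible between training runs.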

[052] One or more of the following algorithms may be used to train the machine learning model: Adagrad (Adaptive gradient algorithm), RMS (Root mean square), RMSProp (Root mean square propagation), Rprop (Resilient backpropagation algorithm), SGD (Stochastic Gradient Descent), BGD (Batch Gradient Descent), MBGD (Mini-batch Gradient Descent), Momentum, Nesterov Momentum, NAG (Nesterov Accelerated Gradient), FussySGD, SGDNesterov (SGD+Nesterov Momentum), AdaDelta, Adam (Adaptive Moment Estimation), AMSGrad, AdamW, ASGD (Averaged Stochastic Gradient Descent), LBFGS (L-BFGS algorithm, the limited-memory Broyden-Fletcher-Goldfarb-Shanno algorithm), as well as second-order optimizers such as Newton's method, quasi-Newton methods, the Gauss-Newton algorithm, the conjugate gradient method, and the Levenberg-Marquardt algorithm.

[053] At least one of the following functions is used as the objective function when training the machine learning model: L1Loss, MSELoss, CrossEntropyLoss, CTCLoss, NLLLoss, PoissonNLLLoss, GaussianNLLLoss, KLDivLoss, BCELoss, BCEWithLogitsLoss, MarginRankingLoss, HingeEmbeddingLoss, MultiLabelMarginLoss, HuberLoss, SmoothL1Loss, SoftMarginLoss, MultiLabelSoftMarginLoss, CosineEmbeddingLoss, MultiMarginLoss, TripletMarginLoss, TripletMarginWithDistanceLoss.
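To make one pairing from [052] and [053] concrete, the sketch below combines a binary cross-entropy loss on logits (the quantity that BCEWithLogitsLoss computes for a single sample) with an SGD-with-momentum update. The description does not say which optimizer or loss is actually used, so this pairing, the learning rate, the momentum value, and the function names are assumptions chosen for illustration.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def bce_with_logits(logit, target):
    """Binary cross-entropy on a raw logit, in its overflow-safe form."""
    return max(logit, 0.0) - logit * target + math.log1p(math.exp(-abs(logit)))

def bce_grad(logit, target):
    """Derivative of the loss with respect to the logit."""
    return sigmoid(logit) - target

def sgd_momentum_step(weights, grads, velocity, lr=0.1, momentum=0.9):
    """One SGD-with-momentum update; returns new weights and velocity."""
    new_velocity = [momentum * v + g for v, g in zip(velocity, grads)]
    new_weights = [w - lr * v for w, v in zip(weights, new_velocity)]
    return new_weights, new_velocity
```

For a one-weight model `logit = w * x`, the gradient with respect to `w` is `bce_grad(logit, y) * x`, which is what would be passed in `grads`.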

[054] The feature classification device 60 may be implemented on the basis of a computing device or of neural networks containing input, output, and other layers consisting of a plurality of neurons with corresponding weight coefficients, tuned by training the neural networks on a labeled training data set containing feature vectors characteristic of genuine and fake images, in particular, for calculating the final probability of user authenticity from the features obtained from the user's images. In an alternative embodiment of the presented solution, the feature vectors contained in the labeled training data set are concatenated and then used to train a common neural network. The device 60 is additionally equipped with means for reading lines and means for comparing the elements of said lines, implemented on the basis of transistor-based logic elements.

[055] Accordingly, the original image of a user positioned in front of the image forming device 10 (see Fig. 2, left image) is captured by the device 10, after which the device 10 uses known methods to determine the presence and position of the face in the image and transmits the original user image, together with the coordinates of the face region, to the image processing device 20.

[056] Based on the original user image and the coordinates of the face region, the device 20 forms a face image (see Fig. 2, right image) as follows. In the first step, the device 20 selects a square region in the user image in accordance with the coordinates of the face region and then enlarges the selected face region by a specified number of pixels, for example, by 5-20 percent of the size of the square region. This enlargement compensates for possible inaccuracies in the face region boundaries produced by the detector and improves the resulting accuracy of determining face authenticity by adding informational context. The device 20 then extracts the part of the image that falls within the enlarged face region and forms the face image.

[057] Additionally, when enlarging the selected face region, the device 20 may be configured to determine whether said region extends beyond the boundaries of the user image. If it does, the device 20 extracts the part of the image that falls within the enlarged face region in the same way and forms the face image, wherein the missing part (i.e., the part lying outside the user image) is filled, for example, with pixels obtained by replicating the boundary pixels of the image. The device 20 then uses known methods, for example two-dimensional image interpolation, to resize the face image to match the input size of model 51.
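The cropping procedure of [056] and [057] can be sketched as below. The sketch assumes that the margin is applied on every side of the square box, that the box is given by its top-left corner and side length, and that the image is a list of pixel rows; the function name `crop_face` and the default margin are hypothetical.

```python
def crop_face(img, x, y, size, margin=0.1):
    """Crop a square face box enlarged by `margin`, replicating border pixels.

    (x, y) is the top-left corner of the detected square box, `size` its side.
    Pixels that fall outside the image are filled by clamping the coordinates,
    which is equivalent to replicating the nearest boundary pixel.
    """
    height, width = len(img), len(img[0])
    pad = round(size * margin)          # extra pixels added on each side
    x0, y0 = x - pad, y - pad
    side = size + 2 * pad
    crop = []
    for r in range(y0, y0 + side):
        rr = min(max(r, 0), height - 1)
        crop.append([img[rr][min(max(c, 0), width - 1)]
                     for c in range(x0, x0 + side)])
    return crop
```

Clamping the indices avoids building a padded copy of the whole image when only a small region around the face is needed.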

[058] The device 20 also resizes the original user image using two-dimensional image interpolation methods so that its size matches the input size of model 52 (see Fig. 3). The device 20 may also be configured to determine the shape of the user image: if the device 20 determines that the user image is not square (for example, it is rectangular), the device 20 fills in the missing part of the image to make it square, for example, with pixels obtained by replicating the boundary pixels of the user image.

[059] To match the face image and the user image to the input sizes of models 51 and 52, known two-dimensional interpolation algorithms may be used, such as the nearest neighbor method, bilinear interpolation, bicubic interpolation, spline-based interpolation, resampling based on the discrete Fourier transform, or image super-resolution methods.
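The squaring step of [058] and the simplest interpolation method named in [059] can be sketched as follows. The centered placement of the original image inside the square and the function names are assumptions; the description only states that the missing part is filled with replicated boundary pixels.

```python
def pad_to_square(img):
    """Extend a rectangular image to a square by replicating its edge pixels."""
    height, width = len(img), len(img[0])
    side = max(height, width)
    top = (side - height) // 2      # assumed: original image is centered
    left = (side - width) // 2
    return [[img[min(max(r - top, 0), height - 1)][min(max(c - left, 0), width - 1)]
             for c in range(side)]
            for r in range(side)]

def resize_nearest(img, out_size):
    """Resize to out_size x out_size using nearest-neighbor sampling."""
    height, width = len(img), len(img[0])
    return [[img[r * height // out_size][c * width // out_size]
             for c in range(out_size)]
            for r in range(out_size)]
```

Nearest-neighbor sampling is used here only because it is the shortest to write; bilinear or bicubic interpolation would normally give smoother results.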

[060] If necessary, the device 20 may perform additional image processing using known methods, for example, brightness and contrast adjustment, linear and nonlinear filtering, noise compensation, conversion of the image color format (RGB/HSV/YUV/LAB/YCbCr), histogram equalization, and pixel normalization, for example, by bringing the range of values to specified levels of mean and variance or by a linear transformation to a specified range of values, among many others. This additional processing improves the quality of training and the ability of models 51, 52 to extract features of user authenticity.
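The two pixel normalization options mentioned in [060] can be written out as a short sketch. It operates on a flat list of pixel values; the function names and the default target values are assumptions.

```python
import math

def standardize(pixels, target_mean=0.0, target_std=1.0):
    """Bring the values to a specified mean and standard deviation."""
    n = len(pixels)
    mean = sum(pixels) / n
    variance = sum((p - mean) ** 2 for p in pixels) / n
    std = math.sqrt(variance) or 1.0    # avoid division by zero on flat images
    return [(p - mean) / std * target_std + target_mean for p in pixels]

def rescale(pixels, lo=0.0, hi=1.0, src_lo=0.0, src_hi=255.0):
    """Linearly map values from [src_lo, src_hi] to [lo, hi]."""
    scale = (hi - lo) / (src_hi - src_lo)
    return [lo + (p - src_lo) * scale for p in pixels]
```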

[061] In the third step, the device 20 sends the face image and the user image to the devices 31 and 32, respectively, which determine the positions of the key points in each image: face key points for the face image and user key points for the user image. Based on the obtained points, the device 31 then forms a first mask of face key points for the face image, and the device 32 forms a second mask of user key points for the user image.

[062] A key point mask may be formed in one of two ways: either with a specialized neural network or with a closed-contour formation algorithm. The resulting key point masks make it possible to explicitly indicate regions of heightened interest during image analysis, for example, characteristic areas of the face (nose, eyes, mouth) and body (arms, neck, shoulders).

[063] For the first method of forming a mask, the devices 31 and 32 may be equipped with neural networks, each trained on binary images of key points and the corresponding output masks. A binary image of face key points contains zeros and ones, where "1" indicates that the coordinates of the image pixel coincide with the coordinates of a face key point, and "0" indicates that they do not. Accordingly, the binary images of key points are fed to the inputs of the neural networks, and the devices 31 and 32 output the first and second key point masks, for the face and the user respectively.

Thus, these neural networks adaptively generate attention matrices depending on the positions of the user's key points. Their parameters are trained jointly with other models, for example, the feature determination models, so as to maximize the resulting accuracy of determining the probability of user authenticity.
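The binary key point image described in [063], which serves as the input to the mask-generating networks, can be built as in the sketch below. Key points are assumed to be given as `(x, y)` pixel coordinates, and the function name is hypothetical.

```python
def keypoints_to_binary_image(keypoints, height, width):
    """Return a height x width grid with 1 at key point pixels and 0 elsewhere."""
    image = [[0] * width for _ in range(height)]
    for x, y in keypoints:
        col, row = round(x), round(y)
        if 0 <= row < height and 0 <= col < width:   # ignore points outside the frame
            image[row][col] = 1
    return image
```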

[064] For the second method of forming a mask, the devices 31 and 32 use known methods to form closed contours from the face key points and the user key points, respectively (see, for example, Fig. 4, left image). The devices 31 and 32 then increase the line width of the closed contours, either by adding a specified number of neighboring pixels to the lines, for example, 2-10 percent of the image size, or by applying blur filters, for example a Gaussian filter, with which the devices may be equipped (see, for example, Fig. 4, right image).

[065] Accordingly, the resulting keypoint mask may be a matrix of zeros and ones, where "1" indicates that the coordinates of an image pixel coincide with the coordinates of the widened contour of the face or user mask, and "0" indicates that they do not.
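The second mask-forming method of [064]-[065] can be sketched as follows. This is a minimal illustration, not the implementation of devices 31 and 32: the function names `closed_contour` and `widen`, the straight-line interpolation between consecutive key points, and the square neighborhood used for widening are all assumptions. The Gaussian-blur alternative mentioned in the description is not shown.

```python
import numpy as np


def closed_contour(keypoints, height, width):
    """Draw a closed polygon through (x, y) key points as a 0/1 matrix."""
    mask = np.zeros((height, width), dtype=np.uint8)
    pts = np.asarray(keypoints, dtype=float)
    # Pair every point with its successor; the last point connects to the first.
    for start, end in zip(pts, np.roll(pts, -1, axis=0)):
        steps = int(np.abs(end - start).max()) + 1
        xs = np.rint(np.linspace(start[0], end[0], steps + 1)).astype(int)
        ys = np.rint(np.linspace(start[1], end[1], steps + 1)).astype(int)
        keep = (xs >= 0) & (xs < width) & (ys >= 0) & (ys < height)
        mask[ys[keep], xs[keep]] = 1
    return mask


def widen(mask, radius):
    """Widen contour lines by `radius` neighboring pixels in every direction."""
    height, width = mask.shape
    padded = np.pad(mask, radius)  # zero padding keeps the output size unchanged
    out = np.zeros_like(mask)
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            out |= padded[dy:dy + height, dx:dx + width]
    return out


# A square contour on a 10x10 image, widened by one pixel on each side.
contour = closed_contour([(2, 2), (7, 2), (7, 7), (2, 7)], height=10, width=10)
mask = widen(contour, radius=1)
```

Here `radius` is given in pixels; a value derived from the 2-10 percent range mentioned in [064] would be computed from the image size before calling `widen`.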

[066] Next, the first keypoint mask obtained by device 31 for the face image, together with the face image from device 20, is sent to device 41, and the second keypoint mask obtained by device 32 for the user image, together with the user image from device 20, is sent to device 42. If the original user image has undergone additional processing, the processed user image is sent to device 42 instead of the user image.

[067] Upon receiving the first keypoint mask and the face image, device 41 forms the input data for model 51, for example by matrix addition, where the elements of the keypoint mask matrix may be multiplied by a predetermined numerical coefficient when the addition is performed:

input = coef * mask + img

where mask is the keypoint mask, img is the face image, and coef is the predetermined numerical coefficient.

[068] In another embodiment, device 41 may form the input data by element-wise multiplication of the matrices corresponding to the pixels of the obtained keypoint mask and the face image, where the elements of the keypoint mask matrix may be multiplied by a predetermined numerical coefficient when the multiplication is performed:

input = coef * mask * img

[069] As another option, device 41 may form the input data by concatenating the matrices corresponding to the pixels of the obtained keypoint mask and the face image:

input = [mask, img]

[070] Alternatively, device 41 may form the input data by weighted addition of three matrices corresponding to the pixels of the obtained keypoint mask, the inverted keypoint mask, and the face image:

input = coef1 * mask + coef2 * inv_mask + img

where inv_mask is the inverted keypoint mask (the values "0" and "1" are swapped), and coef1 and coef2 are predetermined coefficients.
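The four ways of forming model input described in [067]-[070] can be sketched in Python with NumPy. The function names, the channel-last array layout (image of shape H x W x C, mask of shape H x W x 1), and the sample coefficient values are assumptions made for illustration; the description does not fix them.

```python
import numpy as np


def combine_add(img, mask, coef):
    """input = coef * mask + img (mask is broadcast over image channels)."""
    return coef * mask + img


def combine_multiply(img, mask, coef):
    """input = coef * mask * img (element-wise multiplication)."""
    return coef * mask * img


def combine_concat(img, mask):
    """input = [mask, img]: the mask becomes an extra leading channel."""
    return np.concatenate([mask, img], axis=-1)


def combine_weighted(img, mask, coef1, coef2):
    """input = coef1 * mask + coef2 * inv_mask + img."""
    inv_mask = 1 - mask  # swaps the values 0 and 1
    return coef1 * mask + coef2 * inv_mask + img


# A 2x2 three-channel image and a single-channel 0/1 mask (illustrative values).
img = np.full((2, 2, 3), 0.5)
mask = np.array([[1.0, 0.0], [0.0, 1.0]])[..., np.newaxis]

added = combine_add(img, mask, coef=0.2)
multiplied = combine_multiply(img, mask, coef=2.0)
stacked = combine_concat(img, mask)
weighted = combine_weighted(img, mask, coef1=0.2, coef2=-0.1)
```

Note that concatenation changes the number of input channels (here from 3 to 4), so a model consuming this variant would need a correspondingly sized first layer, whereas the other three variants preserve the image shape.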

[071] In the same way, device 42 forms the input data for model 52 from the received keypoint mask and the user image.

[072] Devices 41 and 42 then feed the input data to models 51 and 52, respectively, whose outputs are passed to device 60 as feature vectors characterizing the authenticity of the user's image. Based on the first and second feature vectors received from models 51 and 52, respectively, device 60 determines the final value of the probability of the user's authenticity. Device 60 then compares this final probability value with a threshold value set by the developer of device 60 in order to determine the class of the image, which indicates whether the face presented in the image is genuine or fake.
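The threshold comparison at the end of [072] amounts to a simple decision rule, sketched below. The function name, the class labels, the default threshold of 0.5, and the choice to treat a probability equal to the threshold as genuine are assumptions; the description states only that the threshold is set by the developer.

```python
def classify_image(probability, threshold=0.5):
    """Map the final authenticity probability to an image class.

    Assumption: a probability at or above the threshold counts as genuine.
    """
    return "genuine" if probability >= threshold else "fake"


label_high = classify_image(0.83)
label_low = classify_image(0.12)
label_strict = classify_image(0.83, threshold=0.9)
```

Raising the threshold makes the rule stricter: the same probability of 0.83 is accepted at the default threshold but rejected at 0.9.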

[073] To determine the probability of user authenticity from the image feature vectors, device 60 may use a trained neural network, for example one based on a multilayer perceptron, or other machine learning algorithms, for example gradient boosting and binary decision trees.

[074] In one implementation of device 60, the first and second feature vectors are processed by independent layers of the neural network, producing intermediate estimates of the probability of authenticity. The final estimate of the probability of the user's authenticity is then calculated from these by weighted summation with a predetermined numerical coefficient:

prob = coef * prob1 + (1 - coef) * prob2

where prob1 and prob2 are the intermediate estimates of the probability of authenticity calculated from the first and second feature vectors, respectively, and coef is a predetermined coefficient.
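The weighted summation of the two intermediate estimates can be written directly in Python. The function name and the sample values are illustrative assumptions; the independent network layers that would produce the two intermediate estimates are not shown.

```python
def fuse_probabilities(prob1, prob2, coef):
    """Final estimate: coef * prob1 + (1 - coef) * prob2."""
    return coef * prob1 + (1 - coef) * prob2


# Face-based estimate 0.9, user-image-based estimate 0.5, weight 0.7 on the first.
final_prob = fuse_probabilities(0.9, 0.5, coef=0.7)  # approximately 0.78
```

With coef between 0 and 1 the result is a convex combination, so it always lies between the two intermediate estimates and remains a valid probability.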

[075] In another implementation of device 60, the first and second feature vectors are concatenated, after which the combined vector is processed by a common neural network to produce the final estimate of the probability of the user's authenticity.
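A minimal sketch of the concatenation variant follows, using a single hidden layer as a stand-in for the common neural network. The architecture, the layer sizes, the ReLU and sigmoid activations, and the random placeholder weights are all assumptions; in practice the weights would come from training, which is not shown.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def fused_probability(vec1, vec2, w_hidden, b_hidden, w_out, b_out):
    """Concatenate two feature vectors and run a one-hidden-layer network."""
    joint = np.concatenate([vec1, vec2])
    hidden = np.maximum(0.0, w_hidden @ joint + b_hidden)  # ReLU activation
    return float(sigmoid(w_out @ hidden + b_out))


# Placeholder feature vectors and untrained weights, for shape illustration only.
rng = np.random.default_rng(0)
vec1, vec2 = rng.normal(size=4), rng.normal(size=4)
w_hidden, b_hidden = rng.normal(size=(3, 8)), np.zeros(3)
w_out, b_out = rng.normal(size=3), 0.0

prob = fused_probability(vec1, vec2, w_hidden, b_hidden, w_out, b_out)
```

Unlike the weighted summation of intermediate estimates, this variant lets the network learn interactions between the face features and the user-image features before a single probability is produced.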

[076] Device 60 may forward the image class information to various user authorization systems, which make the appropriate decisions depending on whether the face in the image is genuine.

[077] Thus, because the final value of the probability of the user's authenticity is determined taking into account the location of the key points determined for the face image and for the user image, the resulting accuracy of determining the authenticity of the face is increased.

[078] In general form (see Fig. 5), the computing device comprises, connected by a common data exchange bus, one or more processors (101), memory means such as RAM (102) and ROM (103), input/output interfaces (104), input/output devices (105), and a network interaction device (106).

[079] The processor (101) (or several processors, a multi-core processor, etc.) may be selected from the range of devices in wide use today, for example from manufacturers such as Intel™, AMD™, Apple™, Samsung Exynos™, MediaTEK™, Qualcomm Snapdragon™, etc. The processor, or one of the processors used in the device (100), should also be understood to include a graphics processor, for example an NVIDIA or Graphcore GPU, which is likewise suitable for performing the method in full or in part and may also be used for training and applying machine learning models in various information systems.

[080] The RAM (102) is random access memory intended for storing machine-readable instructions, executable by the processor (101), for performing the necessary logical data processing operations. The RAM (102) typically contains the executable instructions of the operating system and of the corresponding software components (applications, software modules, etc.). The available memory of a graphics card or graphics processor may also serve as the RAM (102).

[081] The ROM (103) is one or more permanent data storage devices, for example a hard disk drive (HDD), a solid-state drive (SSD), flash memory (EEPROM, NAND, etc.), optical storage media (CD-R/RW, DVD-R/RW, BlueRay Disc, MD), etc.

[082] Various types of I/O interfaces (104) are used to organize the operation of the components of the device (100) and of externally connected devices. The choice of interfaces depends on the specific design of the computing device and may include, but is not limited to: PCI, AGP, PS/2, IrDa, FireWire, LPT, COM, SATA, IDE, Lightning, USB (2.0, 3.0, 3.1, micro, mini, type C), TRS/Audio jack (2.5, 3.5, 6.35), HDMI, DVI, VGA, Display Port, RJ45, RS232, etc.

[083] Various information input/output means (105) are used to enable user interaction with the device (100), for example a keyboard, a display (monitor), a touch display, a touchpad, a joystick, a mouse, a light pen, a stylus, a touch panel, a trackball, speakers, a microphone, augmented reality means, optical sensors, a tablet, light indicators, a projector, a camera, biometric identification means (a retina scanner, a fingerprint scanner, a voice recognition module), etc.

[084] The network interaction means (106) provides data transmission over an internal or external computer network, for example an intranet, the Internet, a LAN, etc. The one or more means (106) may be, but are not limited to: an Ethernet card, a GSM modem, a GPRS modem, an LTE modem, a 5G modem, a satellite communication module, an NFC module, a Bluetooth and/or BLE module, a Wi-Fi module, etc.

[085] Satellite navigation means, for example GPS, GLONASS, BeiDou, or Galileo, may additionally be used as part of the device (100). The specific selection of elements of the device (100) for implementing various hardware and software architectures may vary, provided the required functionality is preserved.

[086] Modifications and improvements of the above-described embodiments of the present technical solution will be apparent to those skilled in the art. The foregoing description is provided by way of example only and is not limiting. The scope of the present technical solution is therefore limited only by the scope of the appended claims.

Claims

1. A method for determining the authenticity of a face in an image, performed by at least one computing device, comprising the steps of:

a. obtaining an original image of the user;

b. determining the coordinates of the position of the user's face;

c. forming a face image in accordance with the coordinates of the position of the user's face;

d. determining the position of the facial key points for the face image and the position of the user key points for the user image;

e. forming a first mask of facial key points for the face image and a second mask of user key points for the user image;

f. calculating a first vector representation of features of the face based on the face image and the first mask of facial key points, and a second vector representation of features of the user based on the user image and the second mask of user key points;

g. determining, based on the first and second feature vectors, a final estimate of the probability of the authenticity of the user in the original image;

h. determining, based on the value of the final estimate of the probability of the authenticity of the user in the image, a class of the image indicating whether the face presented in the image is genuine or fake.

2. The method according to claim 1, characterized in that steps f-g are performed by a machine learning model or an ensemble of models, wherein the machine learning model or ensemble of models is trained on a data set containing genuine and fake images of users’ faces.

3. The method according to claim 2, characterized in that the ensemble of models consists of individual models, each of which is trained to identify a type of fake determined by the developer.

4. The method according to claim 1, characterized in that the step of forming the masks of key points of the face and of the user is carried out using a neural network trained to increase the accuracy of determining the final estimate of the probability of the user's authenticity.

5. The method according to claim 1, characterized in that the step of forming the masks of key points of the face and of the user comprises steps in which:

- closed contours are formed in accordance with the coordinates of the key points;

- the line width of the closed contours is increased by adding a specified number of adjacent pixels to said lines, or by using blur filters with specified parameters, such as a Gaussian filter.

6. The method according to claim 1, characterized in that an additional step is performed in which the images are normalized by bringing the pixel values to a specified mean and variance of the distribution, or by a linear transformation to a specified range of values.

7. The method according to claim 1, characterized in that a step is additionally performed in which the sizes of the images are changed to match the inputs of the models for determining image features by means of two-dimensional interpolation of the images.

8. The method according to claim 1, characterized in that the step of forming the face image comprises steps in which:

- a square area is selected on the user image in accordance with the coordinates of the face position;

- the square area is enlarged by a specified number of pixels, and the image located in the enlarged square area is selected as the face image.

9. The method according to claim 4, characterized in that additional steps are performed in which it is determined that the selected enlarged square area extends beyond the user's image and replication of the boundary pixels of the user's image is performed to fill said selected area.

10. The method according to claim 1, characterized in that additional steps are performed in which the following are formed:

- input data for the first model by matrix addition or multiplication of the pixel values of the first keypoint mask and the processed face image;

- input data for the second model by matrix addition or multiplication of the pixel values of the second keypoint mask and the processed user image, wherein, when the addition or multiplication operation is performed, the elements of the keypoint mask matrix are multiplied by a predetermined numerical coefficient.

11. The method according to claim 1, characterized in that additional steps are performed in which the following are formed:

- input data for the first model by concatenating the matrices of the first keypoint mask and the processed face image; and

- input data for the second model by concatenating the matrices of the second keypoint mask and the processed user image.

12. The method according to claim 1, characterized in that additional steps are performed in which the following are formed:

- input data for the first model by weighted addition of three matrices corresponding to the pixels of the first keypoint mask, the inverted first keypoint mask and the processed face image;

- input data for the second model by weighted addition of three matrices corresponding to the pixels of the second keypoint mask, the inverted second keypoint mask and the processed user image.

13. The method according to claim 1, characterized in that, in order to determine the class of the image, the first and second feature vectors are processed by independent layers of the neural network.

14. The method according to claim 1, characterized in that, in order to determine the class of the image, the first and second feature vectors are concatenated, after which they are processed by a common neural network.

15. A system for determining the authenticity of a face in an image, comprising at least one computing device and at least one memory containing machine-readable instructions which, when executed by the at least one computing device, perform the method according to any one of claims 1-14.