WO2018192246A1

WO2018192246A1 - Contactless emotion detection method based on machine vision

Info

Publication number: WO2018192246A1
Application number: PCT/CN2017/116060
Authority: WO
Inventors: 吕东岳; 吕相文; 刘弋锋; 廖勇
Original assignee: China Academy of Information and Communications Technology CAICT
Current assignee: China Academy of Information and Communications Technology CAICT
Priority date: 2017-04-19
Filing date: 2017-12-14
Publication date: 2018-10-25
Anticipated expiration: 2019-10-19

Abstract

Provided is a contactless emotion detection method and apparatus based on machine vision. The method comprises: acquiring a video image containing human face information, so as to obtain an image sequence to be processed; determining a human face position at each frame of a video image in the image sequence to be processed by means of a pre-set human face detection algorithm and an image stabilization algorithm; carrying out normalization processing on the image sequence to be processed, extracting a human face image from the image sequence to be processed after the normalization processing, and further obtaining a facial local image matrix; and amplifying a color change in the facial local image matrix using a pre-set video amplification algorithm, and performing signal processing on the amplified facial local image sequence, so as to obtain human body signs. In the present invention, a human face detection module and an image stabilization module are introduced, so that a subject can move his/her head modestly within the shooting range of a camera during use, and the heart rate value of the subject can be accurately obtained.

Description

A non-contact emotion detection method based on machine vision

专利的交叉引用Patent cross-reference

本申请要求2017年04月19日提交的，申请号CN201710256722.7、申请号CN201710256725.0的中国发明专利申请的优选权。The present application claims the priority of the Chinese invention patent application filed on April 19, 2017, the application number CN201710256722.7, the application number CN201710256725.0.

Technical field

本发明涉及网络技术应用技术领域，特别涉及一种基于机器视觉的非接触式情绪检测方法。The present invention relates to the field of network technology application technologies, and in particular, to a non-contact emotion detection method based on machine vision.

Background technique

心率是临床检测生命参数的重要指标，现行的方法主要采用接触式检测技术，生物医学信号的接触式检测是指利用电极或传感器直接或间接地接触人体，达到检测医学信息的目的，它可分为对人体固有信息的检测(如血压、心率测量等)和借助外能量的信息检测(X射线、B超检测等)，检测过程中对人体有一定的约束。Heart rate is an important indicator for clinical detection of life parameters. The current method mainly uses contact detection technology. The contact detection of biomedical signals refers to the direct or indirect contact with the human body through electrodes or sensors to achieve the purpose of detecting medical information. In order to detect the inherent information of the human body (such as blood pressure, heart rate measurement, etc.) and information detection by means of external energy (X-ray, B-ultrasound detection, etc.), the detection process has certain constraints on the human body.

非接触式检测是指借助于外来能量(探测媒介)，不接触人体，而且隔一定的距离，隔一定的介质，通过检测人体生理活动所引起的各种微动，进而获取各类生理信息。非接触式检测在检测过程中不需要对人体进行太多约束，测量过程更为友好，在某些特殊场合能够做到更为隐蔽的人体生理特征监测，甚至于更进一步的刑侦审讯中的特殊需求。Non-contact detection refers to the use of external energy (detection medium), does not touch the human body, and at a certain distance, separated by a certain medium, through the detection of various micro-motion caused by human physiological activities, and thus obtain various physiological information. Non-contact detection does not require too much restraint on the human body during the detection process, and the measurement process is more friendly. In some special occasions, it can achieve more concealed monitoring of human physiological characteristics, even in special criminal investigations. demand.

非接触式心率检测技术，按照采用的手段可以分为视觉检测技术与非视觉检测技术；其中视觉检测技术主要为基于成像式光电容积描记技术(IPPG)，非视觉检测技术主要为光电容积描记技术(PPG)与基于多普勒原理的雷达检测技术(微波、无线电波与声波等)。Non-contact heart rate detection technology can be divided into visual detection technology and non-visual detection technology according to the adopted methods; visual inspection technology is mainly based on imaging photoelectric plethysmography (IPPG), and non-visual detection technology is mainly photoelectric plethysmography (PPG) and radar detection technology based on Doppler principle (microwave, radio waves and sound waves, etc.).

基于机器视觉的心率检测方法已经在市场上有产品出现，最著名的为飞利浦公司开发的“生命体征相机”软件，该软件的工作流程为使用摄像头捕捉固定区域内的人脸部图像，分析人脸部图像的颜色变化进而得到人体心率的数值。但是，这种方法的缺点为在测量时需要被测者保持不动，从而使得人脸部保持固定，在此种情况下才能分析固定区域内的人脸部面色变化。这种不友好的使用方式使得被测者在使用的过程中体验较差，在一些无法主动要求被测者保持固定不动的特殊场景下(如审讯、测谎等)，则无法正常使用。The heart rate detection method based on machine vision has already appeared on the market. The most famous is the "Life Sign Camera" software developed by Philips. The workflow of the software is to capture the face image in the fixed area using the camera. The color change of the facial image further obtains the value of the human heart rate. However, the disadvantage of this method is that it needs to be protected by the tester during the measurement. The face is kept fixed, so that the facial color change in the fixed area can be analyzed in this case. This unfriendly use makes the subject experience poor in the process of use. In some special situations (such as trials, polygraphs, etc.) that cannot actively request the subject to remain fixed, they cannot be used normally.

因此，为了解决上述问题，需要能够在多种使用场景对人体进行友好检测的一种基于机器视觉的非接触式情绪检测方法。Therefore, in order to solve the above problems, a machine vision-based non-contact emotion detecting method capable of friendly detection of a human body in a plurality of use scenarios is required.

发明内容Summary of the invention

本发明的目的一个方面在于提供一种基于机器视觉的非接触式情绪检测方法，其特征在于，所述方法包括：One aspect of the object of the present invention is to provide a machine vision-based non-contact emotion detection method, characterized in that the method comprises:

获取包含人脸信息的视频图像，将所述视频图像与预先存储的若干帧视频图像合并，得到待处理图像序列；Obtaining a video image including face information, combining the video image with a plurality of pre-stored video images to obtain a sequence of images to be processed;

通过预设的人脸检测算法和稳像算法确定出所述待处理图像序列中每帧视频图像中的人脸位置；Determining a face position in each frame of the video image in the image sequence to be processed by a preset face detection algorithm and a image stabilization algorithm;

根据所述每帧视频图像中的人脸位置对所述待处理图像序列进行归一化处理，在归一化处理后的待处理图像序列中提取出人脸图像，并根据所述人脸图像得到脸部局部图像矩阵；Performing normalization processing on the image sequence to be processed according to the face position in each frame of the video image, extracting a face image in the normalized image sequence to be processed, and according to the face image Obtaining a partial image matrix of the face;

利用预设的视频放大算法放大所述脸部局部图像矩阵的颜色变化，得到放大后的脸部局部图像序列；Amplifying a color change of the partial image matrix of the face by using a preset video enlargement algorithm to obtain an enlarged partial image sequence of the face;

对所述放大后的脸部局部图像序列进行信号处理，得到人体体征，根据所述人体特征检测人体的情绪活动。Performing signal processing on the enlarged partial image sequence of the face to obtain a human body sign, and detecting the emotional activity of the human body according to the human body feature.

优选地，通过预设的人脸检测算法和稳像算法确定出所述待处理图像序列中每帧视频图像中的人脸位置，包括：Preferably, determining a face position in each frame of the video image in the image sequence to be processed by using a preset face detection algorithm and a image stabilization algorithm includes:

利用预设的人脸检测算法对所述待处理图像序列中的每帧图像进行人脸检测，得到初步的人脸位置；Performing face detection on each frame of the image sequence to be processed by using a preset face detection algorithm to obtain a preliminary face position;

通过预设的稳像算法对所述初步的人脸位置进行修正，得到每帧图像的人脸位置。The preliminary face position is corrected by a preset image stabilization algorithm to obtain a face position of each frame of image.

优选地，根据所述每帧视频图像中的人脸位置对所述待处理图像序列进行归一化处理，在归一化处理后的待处理图像序列中提取出人脸图像，并根据所述人脸图像得到脸部局部图像矩阵，包括： Preferably, the image sequence to be processed is normalized according to the face position in each frame of the video image, and the face image is extracted in the normalized image sequence to be processed, according to the The face image obtains a partial image matrix of the face, including:

采用预设的特征点提取算法提取所述待处理图像序列中每帧视频图像的特征点，对所述待处理图像序列中不属于人脸位置的特征点的亮度进行归一化处理，得到每帧视频图像的亮度归一化系数；Extracting feature points of each frame of the video image in the image sequence to be processed by using a preset feature point extraction algorithm, and normalizing the brightness of the feature points in the image sequence to be processed that are not belonging to the face position, and obtaining each The brightness normalization coefficient of the frame video image;

根据第i帧视频图像的亮度归一化系数对第i帧视频图像进行调整，得到归一化调整后的第i帧视频图像，所述第i帧视频图像为待处理图像序列中的任意一帧视频图像；Adjusting the ith frame video image according to the brightness normalization coefficient of the ith frame video image to obtain a normalized adjusted ith frame video image, where the ith frame video image is any one of the to-be-processed image sequences Frame video image

在归一化调整后的第i帧视频图像中根据第i帧视频图像中的人脸位置提取人脸图像；Extracting a face image according to a face position in the ith frame video image in the normalized adjusted ith frame video image;

调整所述人脸图像，以使在所有视频图像中提取出的人脸图像大小相同；Adjusting the face image so that the face images extracted in all video images are the same size;

根据调整后的人脸图像得到脸部局部图像矩阵。A facial partial image matrix is obtained according to the adjusted face image.

优选地，利用预设的视频放大算法放大所述脸部局部图像矩阵的颜色变化，得到放大后的脸部局部图像序列，包括：Preferably, the color variation of the partial image matrix of the face is enlarged by using a preset video enlargement algorithm to obtain an enlarged partial image sequence of the face, including:

对所述脸部局部图像矩阵序列进行多层降采样，并对最后的降采样结果进行带通滤波；Performing multi-layer downsampling on the partial image matrix sequence of the face, and performing band pass filtering on the final downsampling result;

将带通滤波后的结果进行放大，得到放大后的信息；Amplifying the result of the band pass filtering to obtain the amplified information;

通过与所述多层降采样相同层级的升采样过程将所述放大后的信息嵌入脸部局部图像序列，得到放大后的脸部局部图像序列。The enlarged partial information is embedded in the facial partial image sequence by an upsampling process of the same level as the multi-layer downsampling, to obtain an enlarged partial image sequence of the face.

优选地，统计所述放大后的脸部局部图像序列中每帧图像的RGB均值，根据所述均值的变化得到所述人体体征。Preferably, the RGB mean value of each frame image in the enlarged partial image sequence of the face is counted, and the human body sign is obtained according to the change of the mean value.

本发明的另一个方面在于提供一种基于机器视觉的非接触式情绪检测装置，包括：Another aspect of the present invention provides a machine vision-based non-contact emotion detecting apparatus, including:

预处理模块，用于获取包含人脸信息的视频图像，将所述视频图像与预先存储的若干帧视频图像合并，得到待处理图像序列；a pre-processing module, configured to acquire a video image that includes face information, and combine the video image with a plurality of pre-stored video images to obtain a sequence of images to be processed;

人脸检测模块，用于通过预设的人脸检测算法和稳像算法确定出所述待处理图像序列中每帧视频图像中的人脸位置；a face detection module, configured to determine, by using a preset face detection algorithm and a image stabilization algorithm, a face position in each frame of the video image in the image sequence to be processed;

脸部局部图像确定模块，根据所述每帧视频图像中的人脸位置对所述待处理图像序列进行归一化处理，在归一化处理后的待处理图像序列中提取出人脸图像，并根据所述人脸图像得到脸部局部图像矩阵； The facial partial image determining module normalizes the image sequence to be processed according to the face position in each frame of the video image, and extracts the face image in the normalized image sequence to be processed. And obtaining a facial partial image matrix according to the face image;

放大模块，用于利用预设的视频放大算法放大所述脸部局部图像矩阵的颜色变化，得到放大后的脸部局部图像序列；An amplifying module, configured to enlarge a color change of the partial image matrix of the face by using a preset video enlargement algorithm, to obtain an enlarged partial image sequence of the face;

信号处理模块，用于对所述放大后的脸部局部图像序列进行信号处理，得到人体体征，根据所述人体特征检测人体的情绪活动。The signal processing module is configured to perform signal processing on the enlarged partial image sequence of the face to obtain a human body sign, and detect an emotional activity of the human body according to the human body feature.

优选地，所述人脸检测模块具体用于：Preferably, the face detection module is specifically configured to:

优选地，所述脸部局部图像确定模块具体用于：Preferably, the facial partial image determining module is specifically configured to:

优选地，所述放大模块具体用于：Preferably, the amplification module is specifically configured to:

优选地，所述信号处理模块，统计所述放大后的脸部局部图像序列中每帧图像的RGB均值，根据所述均值的变化得到所述人体体征。 Preferably, the signal processing module calculates an RGB mean value of each frame image in the enlarged partial image sequence of the face, and obtains the human body sign according to the change of the mean value.

本发明实施例使用常见的网络摄像头或手机摄像头拍摄实时人脸视频，通过人脸检测算法与稳像算法确定人脸部位置，通过视频放大算法放大人脸部局部图像矩阵的颜色变化，通过信号处理算法得到精确地实时人体心率数值，根据人体心率数值检测人体的情绪活动。为了解决在视频心率测量中人体头部自由活动产生的问题，本发明引入了人脸检测与稳像模块，使被测者在使用中可以在摄像机拍摄范围内幅度不大的活动头部，并且能够准确地获得被测者的心率数值，根据心率数值检测被测者的情绪活动。In the embodiment of the present invention, a real-time face video is captured by using a common webcam or a mobile phone camera, and the face position is determined by a face detection algorithm and a image stabilization algorithm, and the color change of the partial image matrix of the face is amplified by a video enlargement algorithm, and the signal is passed. The processing algorithm obtains an accurate real-time human heart rate value, and detects the emotional activity of the human body according to the human heart rate value. In order to solve the problem caused by the free movement of the human head in the video heart rate measurement, the present invention introduces a face detection and image stabilization module, so that the subject can use a movable head with a small amplitude within the camera shooting range, and The heart rate value of the subject can be accurately obtained, and the emotional activity of the subject is detected according to the heart rate value.

应当理解，前述大体的描述和后续详尽的描述均为示例性说明和解释，并不应当用作对本发明所要求保护内容的限制。It is to be understood that the foregoing general descriptions

DRAWINGS

参考随附的附图，本发明更多的目的、功能和优点将通过本发明实施方式的如下描述得以阐明，其中：Further objects, features, and advantages of the present invention will be made apparent by the following description of the embodiments of the invention.

图1示意性示出了本发明方法实施例的基于机器视觉的非接触式情绪检测方法的流程图；1 is a flow chart schematically showing a machine vision based non-contact emotion detection method of an embodiment of the method of the present invention;

图2示出了本发明装置实施例的基于机器视觉的非接触式情绪检测装置的结构示意图；2 is a block diagram showing the structure of a machine vision-based non-contact emotion detecting device of an embodiment of the apparatus of the present invention;

图3示出了实例1基于机器视觉的非接触式情绪检测装置的结构示意图。FIG. 3 is a block diagram showing the structure of the non-contact emotion detecting device based on the machine vision of the example 1.

detailed description

通过参考示范性实施例，本发明的目的和功能以及用于实现这些目的和功能的方法将得以阐明。然而，本发明并不受限于以下所公开的示范性实施例；可以通过不同形式来对其加以实现。说明书的实质仅仅是帮助相关领域技术人员综合理解本发明的具体细节。Objects and functions of the present invention, and methods for achieving the objects and functions will be clarified by referring to the exemplary embodiments. However, the invention is not limited to the exemplary embodiments disclosed below; it can be implemented in various forms. The essence of the description is merely to assist those skilled in the relevant art to understand the specific details of the invention.

在下文中，将参考附图描述本发明的实施例，相关技术术语应当是本领域技术人员所熟知的。在附图中，相同的附图标记代表相同或类似的部件，或者相同或类似的步骤，除非另有说明。下面通过具体的实施例对本发明的内容进行说明。 Hereinafter, embodiments of the present invention will be described with reference to the drawings, and related technical terms should be well known to those skilled in the art. In the figures, the same reference numerals are used to refer to the same or similar parts, or the same or similar steps, unless otherwise stated. The contents of the present invention will be described below by way of specific examples.

为了解决在视频心率测量中人体头部自由活动产生的问题，本发明提供了一种基于机器视觉的非接触式情绪检测方法及装置，以下结合附图以及实施例，对本发明进行进一步详细说明。应当理解，此处所描述的具体实施例仅仅用以解释本发明，并不限定本发明。In order to solve the problem of free movement of the human head in the video heart rate measurement, the present invention provides a non-contact emotion detection method and apparatus based on machine vision. The present invention will be further described in detail below with reference to the accompanying drawings and embodiments. It is understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.

根据本发明的方法实施例，提供了一种基于机器视觉的非接触式情绪检测方法，图1是本发明方法实施例的基于机器视觉的非接触式情绪检测方法的流程图，本发明所述的人体体征包括心率、血压、呼吸频率等，根据具体的人体特征检测人体情绪。如图1所示，根据本发明方法实施例的基于机器视觉的非接触式情绪检测方法包括如下处理：According to a method embodiment of the present invention, a machine vision based non-contact emotion detection method is provided, and FIG. 1 is a flowchart of a machine vision based non-contact emotion detection method according to an embodiment of the method of the present invention. The body signs include heart rate, blood pressure, respiratory rate, etc., and detect human emotions according to specific human characteristics. As shown in FIG. 1, a machine vision based non-contact emotion detection method according to an embodiment of the method of the present invention includes the following processing:

步骤101，获取包含人脸信息的视频图像，将所述视频图像与预先存储的若干帧视频图像合并，得到待处理图像序列。Step 101: Acquire a video image including face information, and combine the video image with a plurality of frame video images stored in advance to obtain a sequence of images to be processed.

步骤102，通过预设的人脸检测算法和稳像算法确定出所述待处理图像序列中每帧视频图像中的人脸位置。Step 102: Determine, by using a preset face detection algorithm and a image stabilization algorithm, a face position in each frame of the video image in the image sequence to be processed.

具体的，通过预设的人脸检测算法和稳像算法确定出所述待处理图像序列中每帧视频图像中的人脸位置，包括：Specifically, determining, by using a preset face detection algorithm and a image stabilization algorithm, a face position in each frame of the video image in the image sequence to be processed, including:

步骤103，根据所述每帧视频图像中的人脸位置对所述待处理图像序列进行归一化处理，在归一化处理后的待处理图像序列中提取出人脸图像，并根据所述人脸图像得到脸部局部图像矩阵。Step 103: normalize the image sequence to be processed according to the face position in each frame of the video image, and extract a face image in the normalized image sequence to be processed, according to the The face image obtains a partial image matrix of the face.

具体的，根据所述每帧视频图像中的人脸位置对所述待处理图像序列进行归一化处理，在归一化处理后的待处理图像序列中提取出人脸图像，并根据所述人脸图像得到脸部局部图像矩阵，包括：Specifically, the image sequence to be processed is normalized according to the face position in each frame of the video image, and the face image is extracted in the normalized image sequence to be processed, according to the The face image obtains a partial image matrix of the face, including:

根据第i帧视频图像的亮度归一化系数对第i帧视频图像进行调整，得到归一化调整后的第i帧视频图像，所述第i帧视频图像为待处理图像序列中的任意一帧视频图像； Adjusting the ith frame video image according to the brightness normalization coefficient of the ith frame video image to obtain a normalized adjusted ith frame video image, where the ith frame video image is any one of the to-be-processed image sequences Frame video image

步骤104，利用预设的视频放大算法放大所述脸部局部图像矩阵的颜色变化，得到放大后的脸部局部图像序列。Step 104: The color variation of the partial image matrix of the face is enlarged by using a preset video enlargement algorithm to obtain an enlarged partial image sequence of the face.

具体的，利用预设的视频放大算法放大所述脸部局部图像矩阵的颜色变化，得到放大后的脸部局部图像序列，包括：Specifically, the color variation of the partial image matrix of the face is enlarged by using a preset video enlargement algorithm, and the enlarged partial image sequence of the face is obtained, including:

步骤105，对放大后的脸部局部图像序列进行信号处理，得到人体体征。具体的，统计放大后的脸部局部图像序列中每帧图像的RGB均值，根据均值的变化得到所述人体体征。Step 105: Perform signal processing on the enlarged partial image sequence of the face to obtain a human body sign. Specifically, the RGB mean value of each frame image in the partial image sequence of the face after the statistical enlargement is obtained according to the change of the mean value.

与本发明的方法实施例相对应，提供了一种基于机器视觉的非接触式情绪检测装置，图2是本发明装置实施例的基于机器视觉的非接触式情绪检测装置的结构示意图，如图2所示，根据本发明装置实施例的基于机器视觉的非接触式情绪检测装置包括：预处理模块20、人脸检测模块22、脸部局部图像确定模块24、放大模块26、信号处理模块28；以下对本发明实施例的各个模块进行详细的说明。Corresponding to the method embodiment of the present invention, a machine vision-based non-contact emotion detecting device is provided, and FIG. 2 is a schematic structural diagram of a machine vision-based non-contact emotion detecting device according to an embodiment of the present invention. As shown in FIG. 2, the machine vision-based non-contact emotion detecting apparatus according to the apparatus embodiment of the present invention includes: a pre-processing module 20, a face detecting module 22, a facial partial image determining module 24, an amplifying module 26, and a signal processing module 28. The respective modules of the embodiments of the present invention are described in detail below.

所述预处理模块20，用于获取包含人脸信息的视频图像，将所述视频图像与预先存储的若干帧视频图像合并，得到待处理图像序列。The pre-processing module 20 is configured to acquire a video image including face information, and combine the video image with a plurality of frame video images stored in advance to obtain a sequence of images to be processed.

所述人脸检测模块22，用于通过预设的人脸检测算法和稳像算法确定出所述待处理图像序列中每帧视频图像中的人脸位置。The face detection module 22 is configured to determine, by using a preset face detection algorithm and a image stabilization algorithm, a face position in each frame of the video image in the image sequence to be processed.

所述人脸检测模块22具体用于：The face detection module 22 is specifically configured to:

通过预设的稳像算法对所述初步的人脸位置进行修正，得到每帧图像的人脸位置。Correcting the preliminary face position by a preset image stabilization algorithm to obtain a frame per frame Like the face position.

所述脸部局部图像确定模块24，根据所述每帧视频图像中的人脸位置对所述待处理图像序列进行归一化处理，在归一化处理后的待处理图像序列中提取出人脸图像，并根据所述人脸图像得到脸部局部图像矩阵。The facial partial image determining module 24 normalizes the image sequence to be processed according to the face position in each frame of the video image, and extracts a person in the normalized image sequence to be processed. a face image, and a facial partial image matrix is obtained according to the face image.

所述脸部局部图像确定模块24具体用于：The facial partial image determining module 24 is specifically configured to:

所述放大模块26，用于利用预设的视频放大算法放大所述脸部局部图像矩阵的颜色变化，得到放大后的脸部局部图像序列。The amplifying module 26 is configured to enlarge a color change of the partial image matrix of the face by using a preset video enlargement algorithm to obtain an enlarged partial image sequence of the face.

所述放大模块26具体用于：The amplification module 26 is specifically configured to:

所述信号处理模块28，用于对所述放大后的脸部局部图像序列进行信号处理，得到人体体征。统计放大后的脸部局部图像序列中每帧图像的RGB均值，根据均值的变化得到所述人体体征。The signal processing module 28 is configured to perform signal processing on the enlarged partial image sequence of the face to obtain a human body sign. The RGB mean value of each frame of the image in the partial image sequence of the face after the quantization is obtained, and the human body sign is obtained according to the change of the mean value.

为了更加详细的说明本发明的方法实施例和装置实施实施例，示例性的以心率作为人体体征给出实例1，根据心率的检测人体情绪变化。In order to explain the method embodiment and the device embodiment of the present invention in more detail, an exemplary example 1 is given with heart rate as a human body sign, and the human mood change is detected according to the heart rate.

图3是实例1基于机器视觉的非接触式情绪检测装置的结构示意图，如图3所示，基于机器视觉的非接触式情绪检测装置，具有视频图像摄像设备31和用于视频图像处理的计算设备32，视频图像摄像设备31对人脸面部30拍摄视频流。计算设备32内置计算芯片(例如中央处理器CPU)，以及存储计算芯片执行指令的存储器(例如固态硬盘SSD)。计算设备32的计算芯片包括如下四个模块：3 is a schematic structural diagram of a machine vision-based non-contact emotion detecting device according to Example 1, as shown in FIG. 3, a machine vision-based non-contact emotion detecting device having a video image taken Like the device 31 and the computing device 32 for video image processing, the video image capturing device 31 takes a video stream to the face portion 30. The computing device 32 includes a computing chip (eg, a central processing unit CPU) and a memory (eg, a solid state drive SSD) that stores computing chip execution instructions. The computing chip of computing device 32 includes the following four modules:

模块一：视频输入模块Module 1: Video Input Module

该模块的作用为从摄像设备中获取视频流。The function of this module is to obtain a video stream from the camera device.

模块二：视频预处理模块Module 2: Video Preprocessing Module

该模块的作用为对模块一得到的视频流进行预处理，获得模块三能够处理的脸部局部图像矩阵。该模块包含三个子模块：The function of the module is to preprocess the video stream obtained by the module, and obtain a partial image matrix of the face that the module 3 can process. This module contains three submodules:

子模块一：人脸检测Sub-module one: face detection

接收到模块一传递的视频流之后，首先与内存中存储的之前若干帧图像合并，得到待处理的图像序列。对合并之后的图像序列中每帧图像的人脸进行检测，获得初步的人脸位置信息，通过光流稳像算法对人脸位置进行细微修正，得到优化过后的人脸位置信息。After receiving the video stream delivered by the module, it first merges with the previous several frames of images stored in the memory to obtain a sequence of images to be processed. The face of each frame in the image sequence after the combination is detected, the preliminary face position information is obtained, and the face position is finely corrected by the optical flow stabilization algorithm to obtain the optimized face position information.

子模块二：光照调整Submodule 2: Lighting adjustment

进行光照归一化调节。采用特征点提取算法(例如SIFT、SURF、ORB等)提取图像序列每一帧的特征点，对不属于人脸部区域的对应位置的特征点的亮度进行归一化处理，得到每帧图像亮度归一化系数，从而实现图像序列的光照归一化处理。Perform light normalization adjustment. The feature points extraction algorithm (for example, SIFT, SURF, ORB, etc.) is used to extract the feature points of each frame of the image sequence, and the brightness of the feature points that are not corresponding to the face region is normalized to obtain the brightness of each frame image. The coefficients are normalized to achieve illumination normalization of the image sequence.

子模块三：区域确定Submodule 3: Area Determination

在优化后的人脸位置信息的基础上选择前额与面颊两部分所对应的图像像素组成可供模块三后续处理的矩阵序列。具体方法是首先对图像序列的每帧图像中提取出的人脸图像进行大小调整，使所有人脸图像同样大小，其次选择人脸前额与面颊两部分区域的像素组合为模块三能够处理的矩阵序列。On the basis of the optimized face position information, the image pixel composition corresponding to the two parts of the forehead and the cheek is selected to be a matrix sequence for subsequent processing by the module 3. The specific method is to first resize the face image extracted from each frame of the image sequence, so that all face images are the same size, and then select the pixels of the forehead and cheek regions of the face to be combined into a matrix that can be processed by the module three. sequence.

模块三：心率信号处理模块Module 3: Heart Rate Signal Processing Module

该模块的作用为对模块二得到的脸部局部图像矩阵进行处理，获得模块四能够处理的人体心率信息。该模块包含两个子模块：The function of the module is to process the facial partial image matrix obtained by the module 2, and obtain the human heart rate information that the module 4 can process. This module contains two submodules:

子模块一：视频放大Submodule 1: Video Zoom

接收到模块二传递的脸部局部图像矩阵后，采用视频放大算法对矩阵进行处理，实现有效的放大人脸前额与面颊两部分的面色变化，具体方法是对矩阵序列进行多层降采样，对最后的降采样结果进行带通滤波，将滤波后的结果与放大系数相乘进行放大，在通过与降采样相同层级的升采样过程将放大之后的信息嵌入前额与面颊部分所对应的图像，得到放大后的脸部局部图像序列。After receiving the partial image matrix of the face transmitted by the module 2, the matrix is processed by the video amplification algorithm to effectively enlarge the facial color change of the forehead and the cheek of the face, specifically The method is to perform multi-layer downsampling on the matrix sequence, perform band-pass filtering on the final down-sampling result, multiply the filtered result by multiplying the amplification factor, and zoom in after the upsampling process of the same level as the downsampling. The information is embedded in the image corresponding to the forehead and the cheek portion, and the enlarged partial image sequence of the face is obtained.

子模块二：信号处理Submodule 2: Signal Processing

对放大面色后的脸部局部图像序列进行信号处理，获得准确地人体心率数值。具体地，统计所述放大后的脸部局部图像序列中每帧图像的RGB均值，根据所述均值的变化得到人体心率数值。Signal processing is performed on the partial image sequence of the face after the face color is enlarged, and an accurate human heart rate value is obtained. Specifically, the RGB mean value of each frame image in the enlarged partial image sequence of the face is counted, and the human heart rate value is obtained according to the change of the mean value.

模块四：心率结果输出模块Module 4: Heart Rate Results Output Module

该模块的作用为输出被测者的心率信息，可以采用多种方式，例如在可视化界面中显示被测者的心率曲线，心率数值等。根据心率值反映人体的情绪活动。例如当心率达到某一数值时，人体情绪为焦虑、紧张；当心率低于某一数值时，人体情绪为难过、抑郁。The function of the module is to output the heart rate information of the test subject, and can adopt various methods, for example, displaying the heart rate curve of the test subject, the heart rate value, and the like in the visual interface. Reflects the emotional activity of the human body according to the heart rate value. For example, when the heart rate reaches a certain value, the human emotion is anxiety and nervousness; when the heart rate is lower than a certain value, the human emotion is sad and depressed.

本发明的处理流程包括如下七个步骤：The processing flow of the present invention includes the following seven steps:

步骤一：模块一接收摄像设备拍摄的包含人脸部图像的视频流，摄像设备可以采用USB连接、局域网连接等多种连接方式与用于采集视频信息的计算机相连接。用于采集视频信息的计算机通过软件工具采集并解析视频流。用于采集视频信息的计算机获取到视频流之后将之保存入能够被模块二处理的矩阵内存中，传递给模块二。Step 1: The module receives the video stream of the face image captured by the camera device, and the camera device can be connected to the computer for collecting video information by using various connection methods such as USB connection and local area network connection. A computer for collecting video information collects and parses the video stream through a software tool. The computer for collecting video information acquires the video stream and saves it into the matrix memory that can be processed by module two, and passes it to module two.

步骤二：模块二的子模块一接收模块一传递的矩阵内存之后，首先与内存中存储的之前若干帧图像合并得到图像序列，对图像序列中每帧图像的人脸进行检测，获得初步的人脸位置信息。通过光流稳像算法对人脸位置进行细微修正，得到优化过后的人脸位置信息。将图像序列与优化过后的人脸位置信息传递给模块二的子模块二。Step 2: After receiving the matrix memory transmitted by the module, the submodule of the module 2 first combines with the previous several frames of images stored in the memory to obtain an image sequence, and detects the face of each frame of the image sequence to obtain a preliminary person. Face location information. The position of the face is finely corrected by the optical flow image stabilization algorithm to obtain the optimized face position information. The image sequence and the optimized face position information are transmitted to the submodule 2 of the module 2.

步骤三：模块二的子模块二接收模块二的子模块一传递的图像序列与优化过后的人脸位置信息之后，采用特征点提取算法提取图像序列每一帧的特征点，结合优化过后的人脸位置信息，对不属于人脸部区域的对应位置的特征点的亮度进行归一化处理，得到每帧图像亮度归一化系数，从而实现图像序列的光照归一化处理。将经过光照归一化处理的图像序列与优化过后的人脸位置信息传递给模块二的子模块三。Step 3: After the sub-module 2 of the module 2 receives the image sequence transmitted by the sub-module of the module 2 and the optimized face position information, the feature point extraction algorithm is used to extract the feature points of each frame of the image sequence, and the optimized person is combined. The face position information normalizes the brightness of the feature points of the corresponding positions not belonging to the face region, and obtains the brightness normalization coefficient of each frame image, thereby realizing the illumination normalization process of the image sequence. The image sequence normalized by illumination and the optimized face position information are transmitted to submodule 3 of module 2.

步骤四：模块二的子模块三接收模块二的子模块二传递的经过光照归一化处理的图像序列与优化过后的人脸位置信息后，对图像序列的每帧图像中按照优化过后的人脸位置信息提取出的人脸图像进行大小调整，使所有人脸图像同样大小，其次选择人脸前额与面颊两部分区域的像素变换组合为脸部局部图像矩阵。将脸部局部图像矩阵传递给模块三。Step 4: Submodule 3 of module 2 receives the illuminated light of submodule 2 of module 2 After the normalized image sequence and the optimized face position information, the face image extracted according to the optimized face position information in each frame of the image sequence is resized so that all face images are the same size. Secondly, the pixel transformation of the two parts of the forehead and the cheek is selected as a partial image matrix of the face. Pass the facial partial image matrix to module three.

步骤五：模块三的子模块一接收模块二传递的脸部局部图像矩阵之后，采用视频放大算法对矩阵进行处理，得到脸部局部图像序列。将脸部局部图像序列传递给模块三的子模块二。Step 5: After receiving the partial image matrix of the face transmitted by the module 2, the sub-module of the module 3 processes the matrix by using a video amplification algorithm to obtain a partial image sequence of the face. Pass the partial image sequence of the face to submodule 2 of module 3.

步骤六：模块三的子模块二接收模块三的子模块一传递的脸部局部图像序列之后，统计脸部局部图像序列中每帧图像的均值(图像为RGB模式)，通过处理获得对应的心率数值，将心率数值传递给模块四。Step 6: After the sub-module 2 of the module 3 receives the partial image sequence of the face transmitted by the sub-module of the module 3, the average value of each frame image in the partial image sequence of the face (the image is RGB mode) is obtained, and the corresponding heart rate is obtained through processing. Value, passing the heart rate value to module four.

步骤七：模块四接收模块三传递的心率数值之后，将心率数值以可视化的方式输出，并通过心率值反映人体的情绪活动。Step 7: After receiving the heart rate value transmitted by the module three, the module 4 outputs the heart rate value in a visual manner, and reflects the emotional activity of the human body through the heart rate value.

本发明为了在被测者脸部运动时能够正常获得心率数值，在通过摄像设备获得人脸视频的基础上，采用人脸检测算法与稳像算法获得稳定的人脸图像，解决了其他方法在视频心率测量中人体头部自由活动导致的测量失效问题。The invention can obtain the heart rate value normally when the face of the subject is moved, and obtains a stable face image by using the face detection algorithm and the image stabilization algorithm on the basis of obtaining the face video by the camera device, and solving other methods in the method. Measurement failure problems caused by free movement of the human head in video heart rate measurement.

本发明在实验验证时，通过摄像设备(网络摄像头、监控摄像头等)在自然光照环境下拍摄室内被测者的脸部视频，拍摄时被测者头部会轻微运动，实时输出被测者的心率数值。本发明解决了非接触状态下活动人体心率检测问题。使用常见的网络摄像头或手机摄像头拍摄实时人脸部视频，通过人脸检测算法与稳像算法确定人脸部位置，通过视频放大算法放大人脸部敏感区域的颜色变化，通过信号处理算法得到精确地实时人体心率数值。为了解决在视频心率测量中人体头部自由活动产生的问题，本发明引入了人脸检测与稳像模块，使被测者在使用中可以在摄像机拍摄范围内幅度不大的活动头部，并且能够准确地获得被测者的心率数值，根据心率值准确检测到人的情绪活动。In the experimental verification, the camera device (webcam, surveillance camera, etc.) captures the face video of the indoor subject in a natural lighting environment, and the subject's head moves slightly when shooting, and the subject is output in real time. Heart rate value. The invention solves the problem of detecting heart rate of an active human body in a non-contact state. Real-time face video is captured by a common webcam or mobile phone camera, the face position is determined by the face detection algorithm and the image stabilization algorithm, and the color change of the sensitive area of the face is amplified by the video enlargement algorithm, which is accurately obtained by the signal processing algorithm. Real-time human heart rate values. In order to solve the problem caused by the free movement of the human head in the video heart rate measurement, the present invention introduces a face detection and image stabilization module, so that the subject can use a movable head with a small amplitude within the camera shooting range, and The heart rate value of the subject can be accurately obtained, and the emotional activity of the person is accurately detected according to the heart rate value.

结合这里披露的本发明的说明和实践，本发明的其他实施例对于本领域技术人员都是易于想到和理解的。说明和实施例仅被认为是示例性的，本发明的真正范围和主旨均由权利要求所限定。 Other embodiments of the invention will be apparent to those skilled in the <RTIgt; The description and the examples are to be considered as illustrative only, and the true scope and spirit of the invention are defined by the claims.

Claims

A machine vision-based non-contact emotion detection method, characterized in that the method comprises:

Obtaining a video image including face information, combining the video image with a plurality of pre-stored video images to obtain a sequence of images to be processed;

Determining a face position in each frame of the video image in the image sequence to be processed by a preset face detection algorithm and a image stabilization algorithm;

Performing normalization processing on the image sequence to be processed according to the face position in each frame of the video image, extracting a face image in the normalized image sequence to be processed, and according to the face image Obtaining a partial image matrix of the face;

Amplifying a color change of the partial image matrix of the face by using a preset video enlargement algorithm to obtain an enlarged partial image sequence of the face;

The enlarged facial partial image sequence is subjected to signal processing to obtain human body signs, and the human body's emotional activity is detected according to the human body signs.

The method of claim 1, wherein determining a face position in each frame of the video image in the sequence of images to be processed by using a preset face detection algorithm and a image stabilization algorithm comprises:

Performing face detection on each frame of the image sequence to be processed by using a preset face detection algorithm to obtain a preliminary face position;

The preliminary face position is corrected by a preset image stabilization algorithm to obtain a face position of each frame of image.

The method according to claim 1, wherein the image sequence to be processed is normalized according to the face position in each frame of the video image, and is in the image sequence to be processed after normalization processing. Extracting a face image, and obtaining a facial partial image matrix according to the face image, including:

Extracting feature points of each frame of the video image in the image sequence to be processed by using a preset feature point extraction algorithm, and normalizing the brightness of the feature points in the image sequence to be processed that are not belonging to the face position, and obtaining each The brightness normalization coefficient of the frame video image;

Adjusting the ith frame video image according to the brightness normalization coefficient of the ith frame video image to obtain a normalized adjusted ith frame video image, where the ith frame video image is any one of the to-be-processed image sequences Frame video image

Extracting a face image according to a face position in the ith frame video image in the normalized adjusted ith frame video image;

Adjusting the face image so that the face images extracted in all video images are the same size;

A facial partial image matrix is obtained according to the adjusted face image.

The method of claim 1, wherein the color variation of the partial image matrix of the face is enlarged by using a preset video enlargement algorithm to obtain an enlarged partial image sequence of the face, comprising:

Performing multi-layer downsampling on the partial image matrix sequence of the face, and performing band pass filtering on the final downsampling result;

Amplifying the result of the band pass filtering to obtain the amplified information;

The enlarged partial information is embedded in the facial partial image sequence by an upsampling process of the same level as the multi-layer downsampling, to obtain an enlarged partial image sequence of the face.

The method according to claim 1, wherein the RGB mean value of each frame of the image of the enlarged partial image of the face is counted, and the human body sign is obtained according to the change of the mean value.

A machine vision-based non-contact emotion detecting device, comprising:

a pre-processing module, configured to acquire a video image that includes face information, and combine the video image with a plurality of pre-stored video images to obtain a sequence of images to be processed;

a face detection module, configured to determine, by using a preset face detection algorithm and a image stabilization algorithm, a face position in each frame of the video image in the image sequence to be processed;

The facial partial image determining module normalizes the image sequence to be processed according to the face position in each frame of the video image, and extracts the face image in the normalized image sequence to be processed. And obtaining a facial partial image matrix according to the face image;

An amplifying module, configured to enlarge a color change of the partial image matrix of the face by using a preset video enlargement algorithm, to obtain an enlarged partial image sequence of the face;

The signal processing module is configured to perform signal processing on the enlarged partial image sequence of the face to obtain a human body sign, and detect an emotional activity of the human body according to the human body feature.

The device according to claim 6, wherein the face detection module is specifically configured to:

The device according to claim 6, wherein the facial partial image determining module is specifically configured to:

A facial partial image matrix is obtained according to the adjusted face image.

The device according to claim 6, wherein the amplifying module is specifically configured to:

The apparatus according to claim 6, wherein said signal processing module calculates an RGB mean value of each frame of said enlarged partial image sequence of the face, according to said The change in mean results in the human body sign.