KR20100081874A - Method and apparatus for user-customized facial expression recognition

Info

Publication number: KR20100081874A
Application number: KR1020090001296A
Authority: KR
Inventors: 김대진; 천영재; 신종주
Original assignee: 포항공과대학교 산학협력단
Priority date: 2009-01-07
Filing date: 2009-01-07
Publication date: 2010-07-15
Anticipated expiration: 2029-01-07
Also published as: KR100988326B1

Abstract

A facial expression recognition method and apparatus are provided that enable real-time, person-independent facial expression recognition by extracting feature points that effectively represent a person's facial expression characteristics while remaining robust to noise and to external factors (artifacts) such as lighting and camera environment. The facial expression recognition method includes: receiving a training image sequence from a user; an expressionless image extraction step of learning a DFEPDM for the received training image sequence and extracting an expressionless image using the learned DFEPDM; receiving a test image sequence from the user; a D-AAM feature point calculation step of calculating D-AAM feature points using the differences between the AAM parameters of the expressionless image and those of the test image sequence; a manifold space projection step of projecting the D-AAM feature points into a learned manifold space to reduce their dimensionality; and a facial expression recognition step of recognizing the expression of the test image sequence from the D-AAM feature points projected into the manifold space, with reference to a gallery sequence. According to the present invention, an expressionless image can be found in real time and used as the reference for calculating differential-AAM feature points.

Description

Method and apparatus for user-customized facial expression recognition

The present invention relates to a method and system for recognizing natural facial expressions through image analysis. In particular, the present invention relates to a method for recognizing, in real time, the natural facial expressions of different people under various lighting and camera environments.

Various techniques for extracting a user's facial expression from an acquired image have been introduced. In particular, such techniques can also be applied to compact cameras, increasing user convenience by automatically detecting a smiling expression and taking the picture.

Methods for tracking the features of a subject in an input image include modeling the pixel variation between consecutive images and representing it as a vector, using active contours, and image detection. In particular, the image detection method is also used to extract high-frequency components from the image signal, calculate the resolution of the subject, and drive the imaging device so that the resolution of the subject is maximized.

As a conventional facial expression recognition method, reference is first made to Korean Patent Application No. 10-2001-0019166. That document discloses a technique that uses an extended decision function to increase the accuracy of tracking the features of the target face and improves processing speed by reducing the number of comparison operations in the search space. However, although this method recognizes facial expressions from the change from an expressionless face to a target expression, it is difficult to obtain automatically, in real time, a sequence that starts with an expressionless face and ends with the target expression. Moreover, recognition is impossible when the face changes from a specific expression (other than expressionless) to the target expression.

In addition, a paper presented at the IEEE International Conference on Fuzzy System held in Seoul in 1999 uses methods such as fuzzy logic, neural networks, and fuzzy neural networks for facial expression recognition. Another conventional method of recognizing facial expressions (Korean Patent 10-2003-0047256) proposes an overall system for facial expression recognition; see 'Soft computing-based Intention reading through the user's mouth for human-friendly human-robot interaction', Proceedings of SCIS & ISIS 2002, 23Q3-5, 2002. In that method, the G image loaded for face and expression recognition is converted to the HLS color space to obtain separate H, S, and L images, and various image processing techniques are applied to these images. However, these recognition methods achieve a high recognition rate only in environments similar to that of the training images, and they recognize only artificially produced extreme expressions.

Therefore, there is a strong need for a method and apparatus that operate efficiently even in environments different from the one in which the training images were captured and that readily perform facial expression recognition customized for each user.

An object of the present invention, devised to solve the above problems, is to provide a facial expression recognition method that enables real-time, person-independent expression recognition by extracting feature points that effectively represent the facial expression characteristics of the target person while remaining robust to noise and to external factors (artifacts) such as lighting and camera environment.

Another object of the present invention is to provide a facial expression recognition apparatus that can recognize not only extreme expressions but also natural expressions, by making the final classification of an expression from sequential information in a manifold space learned for the nonlinear space of facial expressions.

One aspect of the present invention for achieving the above objects relates to a facial expression recognition method based on an expressionless image of a user. The method includes: receiving a training image sequence from the user; an expressionless image extraction step of learning a differential facial expression probability density model (DFEPDM) for the received training image sequence and extracting an expressionless image using the learned DFEPDM; receiving a test image sequence from the user; a D-AAM feature point calculation step of calculating differential-AAM (D-AAM) feature points using the differences between the active appearance model (AAM) parameters of the expressionless image and those of the test image sequence; a manifold space projection step of projecting the D-AAM feature points into a learned manifold space to reduce their dimensionality; and a facial expression recognition step of recognizing the expression of the test image sequence from the D-AAM feature points projected into the manifold space, with reference to a gallery sequence. In particular, the gallery sequence is generated using differential-AAM feature point sequences for changes from an expressionless face to predetermined target expressions in the manifold space.

The expressionless image extraction step of the facial expression recognition method according to the present invention includes: estimating density functions of the feature points having positive and negative directions using Gaussian kernels; and generating the differential facial expression probability density model (DFEPDM) by subtracting the negative-direction density function from the positive-direction density function.

Furthermore, the D-AAM feature point calculation step includes: extracting an expressionless image in real time from the training image sequence; and calculating D-AAM feature points from the extracted expressionless image and the test image sequence.

In addition, the manifold space projection step includes reducing the dimensionality of the facial expression feature points by projecting the D-AAM feature points of the test image sequence into a manifold space learned so as to readily represent their nonlinearity.

Another aspect of the present invention for achieving the above objects relates to a facial expression recognition apparatus based on an expressionless image of a user. The apparatus includes an image receiving unit for receiving a training image sequence and a test image sequence from the user, and an image processor for analyzing the received image sequences to recognize the user's facial expression. The image processor is adapted to: learn a differential facial expression probability density model (DFEPDM) for the received training image sequence; extract an expressionless image using the learned DFEPDM; calculate D-AAM feature points using the differences between the active appearance model (AAM) parameters of the expressionless image and those of the test image sequence; project the D-AAM feature points into a learned manifold space to reduce their dimensionality; generate a gallery sequence using differential-AAM feature point sequences for changes from an expressionless face to predetermined target expressions in the manifold space; and recognize the expression of the test image sequence from the D-AAM feature points projected into the manifold space, with reference to the gallery sequence.

In particular, the image processor of the facial expression recognition apparatus according to the present invention is further adapted to recognize facial expressions using a sequence-based k-NNS classification algorithm. The image processor is also adapted to determine the similarity of two sequences by taking into account the distance between the two adjacent sequences and a weight reflecting time, and to recognize the facial expression of the test image sequence based on the similarity between the test image sequence and the gallery sequence.

According to the present invention, an expressionless image is found in real time and used as the reference for calculating differential-AAM feature points. This removes not only person-to-person variation but also variation due to lighting, camera, and the like.

In addition, by using a manifold learning technique, the nonlinear space formed by the facial expression feature points can be represented effectively while its dimensionality is reduced.

Furthermore, by adopting a k-NNS method using the directed Hausdorff distance (DHD), sequence information is used and, at the same time, greater weight is given to the distances of feature points close to the moment of recognition. Facial expressions can therefore be recognized effectively not only for changes from an expressionless face to a target expression, but also for changes from one specific expression to another and for input sequences in which a specific expression is held.

In order to fully understand the present invention, its operational advantages, and the objects achieved by its practice, reference should be made to the accompanying drawings illustrating preferred embodiments of the present invention and to the contents described therein.

Hereinafter, the present invention is described in detail by explaining preferred embodiments with reference to the accompanying drawings. The present invention may, however, be implemented in many different forms and is not limited to the embodiments described. To describe the invention clearly, parts unrelated to the description are omitted, and like reference numerals in the drawings denote like members.

Throughout the specification, when a part is said to "include" a component, this means that it may further include other components rather than excluding them, unless specifically stated otherwise. In addition, terms such as "... unit", "... device", "module", and "block" used in the specification mean a unit that processes at least one function or operation, which may be implemented in hardware, software, or a combination of hardware and software.

FIG. 1 is a flowchart illustrating a facial expression recognition method 100 according to the present invention.

First, a training image sequence is received from the user (S110). The training image sequence is a series of images acquired in order to extract the user's expressionless image. For example, the training image sequence may consist of image frames captured over several seconds with a camera device.

The active appearance model (AAM) is a modeling technique for effectively representing diverse face images; its parameters are those of a linear model obtained by principal component analysis. AAM parameters, however, capture not only facial expression characteristics but also various external factors such as lighting, white balance, and sharpness at the time of capture. To exclude these factors, the present invention does not use the AAM parameters of an image directly. Instead, it calculates the difference of AAM parameters (D-AAM) relative to the AAM parameters of an expressionless image of the same user. The AAM parameters are expressed by a shape model and an appearance model, as briefly illustrated in FIG. 3. For example, the D-AAM may be computed through concatenation of the shape parameter vector and the appearance parameter vector.

When the training image sequence is received, its differential-AAM (D-AAM) feature points are calculated as in Equation 1.

$$\Delta p(t) = p(t) - p_{\mathrm{ref}} \qquad \text{(Equation 1)}$$

In Equation 1, $\Delta p(t)$ is the D-AAM feature point of the input image at time $t$, $p(t)$ is the AAM parameter at time $t$, and $p_{\mathrm{ref}}$ is the AAM parameter of the reference face image.

As Equation 1 shows, calculating a D-AAM feature point requires the AAM parameter of a reference face image. The facial expression recognition method according to the present invention uses an expressionless image as the reference face image. An expressionless image is chosen as the reference because the expression features extracted from different users then have similar properties, and because any expression can be freely reached from an expressionless face.
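As an illustration only, the difference described by Equation 1 can be sketched in Python as follows. The function and variable names (`concat_params`, `d_aam`, `p_ref`, `p_now`) and the numeric values are hypothetical and do not appear in the specification; the sketch assumes that each frame's AAM parameters are already available as plain lists of numbers.

```python
from typing import List, Sequence


def concat_params(shape: Sequence[float], appearance: Sequence[float]) -> List[float]:
    """Concatenate shape and appearance parameters into one AAM parameter vector."""
    return list(shape) + list(appearance)


def d_aam(p_t: Sequence[float], p_ref: Sequence[float]) -> List[float]:
    """Differential-AAM feature: element-wise difference between the AAM
    parameters of the current frame and those of the reference frame."""
    if len(p_t) != len(p_ref):
        raise ValueError("parameter vectors must have the same length")
    return [a - b for a, b in zip(p_t, p_ref)]


# Hypothetical values: two shape and two appearance parameters per frame.
p_ref = concat_params([0.5, -1.0], [2.0, 0.25])  # expressionless reference frame
p_now = concat_params([1.5, -1.0], [2.5, 0.0])   # current frame
feature = d_aam(p_now, p_ref)
print(feature)
```

Because lighting and camera effects shift the parameters of both frames of the same user in a similar way, the subtraction tends to cancel them, which is the motivation given above for using a difference rather than the raw parameters.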

A method for extracting an expressionless image from a training image sequence is described below.

In the present invention, a differential facial expression probability density model (DFEPDM) is learned in order to find the image closest to an expressionless face (S120). Here, a positive direction is assigned to changes from an expressionless image toward a specific expression and, conversely, a negative direction is assigned to changes from a specific expression toward an expressionless face. Without knowing in advance which face image is expressionless, the DFEPDM can be trained to return high values for D-AAM feature points calculated with an expressionless face as the reference (positive-direction D-AAM feature points) and low values for D-AAM feature points calculated with a specific expression (anger, smile, surprise) as the reference (negative-direction D-AAM feature points).

To learn the DFEPDM, density functions are estimated for the positive- and negative-direction D-AAM feature points using Gaussian kernels $G_p(x; \sigma_p^2)$ and $G_n(x; \sigma_n^2)$, respectively, where $\sigma_p^2$ and $\sigma_n^2$ are the variances of the Gaussian functions for the positive and negative directions. The differential facial expression probability density model (DFEPDM) is then computed as in Equation 2.

$$P_{\mathrm{DFEPDM}}(x) = G_p(x; \sigma_p^2) - G_n(x; \sigma_n^2) \qquad \text{(Equation 2)}$$
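A minimal sketch of such a model is shown below, assuming that each density is a kernel density estimate built from isotropic Gaussian kernels centred on the training feature points. That reading, the function names (`gaussian_kde`, `dfepdm`), and the sample values are assumptions made for illustration; the specification only states that the negative-direction density is subtracted from the positive-direction density.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def gaussian_kde(x: Vector, samples: List[Vector], variance: float) -> float:
    """Average of isotropic Gaussian kernels centred on each sample."""
    d = len(x)
    norm = (2.0 * math.pi * variance) ** (-d / 2.0)
    total = 0.0
    for s in samples:
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, s))
        total += norm * math.exp(-sq_dist / (2.0 * variance))
    return total / len(samples)


def dfepdm(x: Vector, positive: List[Vector], negative: List[Vector],
           var_pos: float, var_neg: float) -> float:
    """Positive-direction density minus negative-direction density."""
    return gaussian_kde(x, positive, var_pos) - gaussian_kde(x, negative, var_neg)


# Hypothetical 2-D feature points for the two directions.
positive = [[1.0, 0.0], [0.8, 0.2]]
negative = [[-1.0, 0.0], [-0.9, -0.1]]
high = dfepdm([1.0, 0.0], positive, negative, 0.25, 0.25)
low = dfepdm([-1.0, 0.0], positive, negative, 0.25, 0.25)
print(high > 0.0, low < 0.0)
```

A feature point that resembles the positive-direction samples scores above zero and one that resembles the negative-direction samples scores below zero, which is the behaviour the model is trained to provide.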

For a real-time input sequence, each image is taken in turn as the reference face image, D-AAM feature points are calculated, and their values under the DFEPDM are evaluated. The reference face image that yields the highest value is the one whose D-AAM feature points best match the positive direction and least match the negative direction in the learned DFEPDM, and it is therefore estimated to be the most likely expressionless image. The expressionless face image found in this way is used to calculate D-AAM feature points as in Equation 1.
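The search over candidate reference images can be sketched as follows. This is an illustrative sketch, not the patented implementation: the specification does not say how the DFEPDM values of a candidate are aggregated, so the mean is used here as an assumption, and `toy_score` is a hypothetical stand-in for a learned DFEPDM.

```python
import math
from typing import Callable, List, Sequence

Vector = Sequence[float]


def pick_expressionless(frames: List[Vector],
                        score: Callable[[List[float]], float]) -> int:
    """Return the index of the frame that scores highest as the reference."""
    if len(frames) < 2:
        raise ValueError("need at least two frames")
    best_index, best_value = -1, -math.inf
    for r, ref in enumerate(frames):
        # D-AAM features of every other frame relative to candidate r.
        values = [score([a - b for a, b in zip(p, ref)])
                  for t, p in enumerate(frames) if t != r]
        mean_value = sum(values) / len(values)  # assumed aggregation
        if mean_value > best_value:
            best_index, best_value = r, mean_value
    return best_index


def toy_score(feature: List[float]) -> float:
    """Hypothetical stand-in for a learned DFEPDM: high for positive differences."""
    return math.tanh(feature[0])


frames = [[0.9], [0.1], [0.5], [1.2]]  # hypothetical 1-D AAM parameters
neutral_index = pick_expressionless(frames, toy_score)
print(neutral_index)
```

In this toy data the second frame has the smallest expression intensity, so every other frame differs from it in the positive direction and it is selected as the reference.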

Once the expressionless image has been determined in this way, a test image sequence is received from the user (S130). The test image sequence can of course be generated with an imaging device such as a camera, in the same way as the training image sequence.

D-AAM feature points are then calculated from the differences between the active appearance model (AAM) parameters of the expressionless image and those of the test image sequence (S140).

Because the D-AAM feature points for facial expressions vary in a nonlinear space, it is preferable to reduce their dimensionality using a nonlinear model. The D-AAM feature points are therefore projected into a learned manifold space (S150). Among manifold learning methods, the present invention learns the facial expression space using k-Isomap, proposed by Tenenbaum in "A global geometric framework for nonlinear dimensionality reduction", published in Science in 2000, but the present invention is not limited thereto. In general, manifolds differ from person to person, so a manifold space would have to be learned separately for each user. However, because the change from an expressionless image to a specific expression is similar across people, the facial expression recognition method according to the present invention can learn a common manifold space.
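To show why a nonlinear method is preferred, the sketch below implements only the first stage of Isomap: geodesic distances approximated by shortest paths over a k-nearest-neighbor graph. The embedding stage (classical multidimensional scaling on these distances) is omitted, and the function name, the sample points, and the choice of `k` are illustrative assumptions.

```python
import math
from typing import List, Sequence


def geodesic_distances(points: List[Sequence[float]], k: int) -> List[List[float]]:
    """Shortest-path distances over a symmetric k-nearest-neighbor graph."""
    n = len(points)
    euclid = [[math.dist(p, q) for q in points] for p in points]
    graph = [[math.inf] * n for _ in range(n)]
    for i in range(n):
        graph[i][i] = 0.0
        # The k nearest neighbors of point i, excluding i itself.
        others = sorted((j for j in range(n) if j != i), key=lambda j: euclid[i][j])
        for j in others[:k]:
            graph[i][j] = graph[j][i] = euclid[i][j]
    # Floyd-Warshall all-pairs shortest paths.
    for m in range(n):
        for i in range(n):
            for j in range(n):
                via = graph[i][m] + graph[m][j]
                if via < graph[i][j]:
                    graph[i][j] = via
    return graph


# Five hypothetical points on a half circle: the straight-line distance between
# the two ends is 2, but the path along the curve is longer.
arc = [(math.cos(a), math.sin(a)) for a in (0.0, math.pi / 4, math.pi / 2,
                                            3 * math.pi / 4, math.pi)]
geo = geodesic_distances(arc, k=2)
print(round(geo[0][4], 4), round(math.dist(arc[0], arc[4]), 4))
```

The geodesic distance follows the curved data rather than cutting across it, which is the property that lets a manifold method preserve the structure of expression changes when reducing dimensionality.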

Once the D-AAM feature points have been projected into the manifold space, facial expression recognition is performed using a sequence-based algorithm (S160). In particular, the present invention may use a k-nearest neighbor sequences (k-NNS) classification algorithm. Whereas the conventional k-nearest neighbors (k-NN) classification algorithm classifies statically from a single image, k-NNS extends the algorithm so that it can process dynamic data. During learning, k-NNS generates a gallery as in Equation 3.

$$G = \{(S_i, y_i)\}_{i=1}^{N_G} \qquad \text{(Equation 3)}$$

In Equation 3, $S$ denotes a sequence, $S_i$ is the $i$-th sequence obtained by projecting into the learned manifold space the D-AAM feature points that change from an expressionless face to a specific expression (expressionless, anger, smile, surprise), $y_i$ is the facial expression class of the $i$-th sequence, and $N_G$ is the total number of sequences in the gallery.

The k-NNS test procedure in real time is as follows. To measure the distance between the input sequence and the i-th reference sequence in the gallery, the directed Hausdorff distance (DHD) is defined as in Equation 4.

In Equation 4, one term is a constant corresponding to the directionality. The weight factor, which gives greater weight to recent information according to the value of a constant, is calculated as in Equation 5.

That is, when this constant is 0, the distances between feature points carry the same weight throughout the sequence; as its value increases, the distances between feature points closer to the moment of recognition take a higher weight. j is the index of the feature point in sequence Y that corresponds to the i-th feature point of sequence X, and is computed as in Equation 6.
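The weighted sequence distance can be sketched as follows. The exact forms of Equations 4 to 6 are not reproduced in this text, so the weight formula, the index mapping, and the parameter name `alpha` below are assumptions chosen only to match the stated behaviour: equal weights when the constant is 0, more weight on later feature points as it grows, and a proportional mapping from positions in X to positions in Y.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def weights(n: int, alpha: float) -> List[float]:
    """Assumed weight form: proportional to 1 + alpha * i, normalised to sum to 1."""
    raw = [1.0 + alpha * i for i in range(1, n + 1)]
    total = sum(raw)
    return [r / total for r in raw]


def matched_index(i: int, n_x: int, n_y: int) -> int:
    """Assumed mapping: 1-based index in Y at the same relative position as i in X."""
    return min(n_y, max(1, round(i * n_y / n_x)))


def dhd(x_seq: List[Vector], y_seq: List[Vector], alpha: float = 0.0) -> float:
    """Weighted sum of distances from each point of X to its matched point of Y."""
    n_x, n_y = len(x_seq), len(y_seq)
    w = weights(n_x, alpha)
    total = 0.0
    for i in range(1, n_x + 1):
        j = matched_index(i, n_x, n_y)
        total += w[i - 1] * math.dist(x_seq[i - 1], y_seq[j - 1])
    return total


# Hypothetical sequences that differ only at the most recent point.
x = [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]]
y = [[0.0, 0.0], [1.0, 0.0], [2.0, 1.0]]
print(dhd(x, y, alpha=0.0), dhd(x, y, alpha=1.0))
```

With the constant set to 0 the late mismatch contributes one third of the total weight; with a larger constant it contributes more, so recent frames dominate the comparison.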

After the distances between the input sequence and the gallery sequences have been calculated using the DHD, the k nearest sequences are found, and the facial expression of the input sequence is classified as the facial expression class that receives the most votes under majority voting (S160).
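The voting step can be sketched as follows, assuming the gallery is held as a list of (sequence, label) pairs. The function names, the labels, and `mean_pointwise_distance` (a simple stand-in used here in place of the DHD) are hypothetical.

```python
from collections import Counter
from typing import Callable, List, Sequence, Tuple

PointSequence = List[Sequence[float]]
Gallery = List[Tuple[PointSequence, str]]


def knns_classify(query: PointSequence, gallery: Gallery,
                  distance: Callable[[PointSequence, PointSequence], float],
                  k: int) -> str:
    """Label the query by majority vote among its k nearest gallery sequences."""
    ranked = sorted(gallery, key=lambda item: distance(query, item[0]))
    votes = Counter(label for _, label in ranked[:k])
    # On a tie, the label encountered first among the nearest sequences wins.
    return votes.most_common(1)[0][0]


def mean_pointwise_distance(a: PointSequence, b: PointSequence) -> float:
    """Stand-in distance for equal-length 1-D sequences (not the DHD)."""
    return sum(abs(p[0] - q[0]) for p, q in zip(a, b)) / len(a)


# Hypothetical gallery of sequences already projected into a 1-D manifold space.
gallery = [
    ([[0.0], [0.5], [1.0]], "smile"),
    ([[0.0], [0.6], [1.1]], "smile"),
    ([[0.0], [-0.5], [-1.0]], "anger"),
    ([[0.0], [0.1], [0.0]], "expressionless"),
]
query = [[0.0], [0.4], [0.9]]
label = knns_classify(query, gallery, mean_pointwise_distance, k=3)
print(label)
```

Passing the distance as a parameter keeps the voting logic independent of how sequences are compared, so a DHD implementation can be substituted without changing the classifier.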

In other words, the facial expression recognition method according to the present invention includes: estimating, for an input image sequence, the image closest to an expressionless face; calculating differential-AAM feature points, defined as the difference from the active appearance model (AAM) parameters of the expressionless image to the AAM parameters of the input image; projecting the resulting differential-AAM feature points into a manifold space learned so as to represent their nonlinear structure well; and recognizing the expression using a sequence-based nearest neighbor (k-NN) classification algorithm.

FIG. 2 is a block diagram conceptually illustrating an image processor included in a facial expression recognition apparatus according to another aspect of the present invention.

The image processor 200 includes first and second active appearance model (AAM) parameter extraction units 210 and 215, first and second D-AAM feature point extraction units 220 and 225, first and second manifold space mapping units 240 and 250, a DFEPDM calculation unit 230, a gallery sequence generation unit 260, and a sequence-based classification unit 270. The sequence-based classification unit 270 includes a DHD calculation unit 280 and a k-NNS processing unit 290. In the illustrated image processor 200, the first AAM parameter extraction unit 210, the first D-AAM feature point extraction unit 220, and the first manifold space mapping unit 240 are involved in processing the training image sequence, while the second AAM parameter extraction unit 215, the second D-AAM feature point extraction unit 225, and the second manifold space mapping unit 250 are involved in processing the test image sequence.

The first AAM parameter extraction unit 210, the first D-AAM feature point extraction unit 220, and the DFEPDM calculation unit 230 learn a differential facial expression probability density model (DFEPDM) for the received training image sequence. In particular, the DFEPDM calculation unit 230 extracts the expressionless image using the learned DFEPDM.

The learned DFEPDM is mapped into the manifold space by the first manifold space mapping unit 240. The gallery sequence generation unit 260 then generates a gallery sequence using differential-AAM feature point sequences for changes from an expressionless face to predetermined target expressions in the manifold space.

Once the manifold space and the gallery sequence have been generated in this way, the second active appearance model (AAM) parameter extractor 215 receives the test image sequence and extracts its AAM parameters. The second D-AAM feature point extractor 225 then calculates the D-AAM feature points from the difference between the AAM parameters of the previously acquired expressionless image and those of the received test image sequence. The calculated D-AAM feature points are projected into the learned manifold space by the second manifold space mapping unit 250, which reduces their dimension.
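As a minimal sketch of the differencing step described above, the following Python function subtracts the AAM parameter vector of the expressionless image from the AAM parameter vector of each test frame. The function name `d_aam_features` and the plain-list representation of the parameters are assumptions made for illustration; the specification does not prescribe an implementation, and the subsequent manifold projection is not shown.

```python
def d_aam_features(neutral_params, sequence_params):
    """Return per-frame differences between AAM parameters and the neutral frame.

    neutral_params: AAM parameter vector of the expressionless image (hypothetical layout).
    sequence_params: list of AAM parameter vectors, one per test frame.
    """
    features = []
    for frame in sequence_params:
        if len(frame) != len(neutral_params):
            raise ValueError("parameter vectors must have the same length")
        # Element-wise difference: frame parameters minus neutral parameters.
        features.append([p - n for p, n in zip(frame, neutral_params)])
    return features
```

A frame identical to the expressionless image maps to the zero vector, so the features describe only the change relative to that user's own neutral face.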

The DHD calculator 280 calculates a DHD using the projected D-AAM feature points and the gallery sequence received from the gallery sequence generator 260. The calculated DHD is input to the K-NNS processor 290. The K-NNS processor 290 then obtains the k nearest sequences, determines the similarity of two sequences in consideration of weights reflecting the distance and time between two adjacent sequences, and recognizes the facial expression of the test image sequence based on the similarity between the test image sequence and the gallery sequence.
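The classification stage can be illustrated with the following Python sketch. It assumes that DHD denotes a directed Hausdorff-style distance from the test sequence to a gallery sequence, computed here as a weighted mean of nearest-point distances; the names `dhd` and `knn_classify`, the uniform default weights, and the majority vote are assumptions for illustration and may differ from the weighting actually used in the invention.

```python
import math
from collections import Counter

def dhd(test_seq, gallery_seq, weights=None):
    """Weighted mean of nearest-point distances from test_seq to gallery_seq."""
    if weights is None:
        weights = [1.0] * len(test_seq)  # assumed uniform weights
    total = 0.0
    for w, x in zip(weights, test_seq):
        # Distance from this test point to its closest gallery point.
        total += w * min(math.dist(x, y) for y in gallery_seq)
    return total / sum(weights)

def knn_classify(test_seq, gallery, k=3):
    """Label test_seq by majority vote among the k nearest gallery sequences.

    gallery: list of (sequence, label) pairs.
    """
    scored = sorted((dhd(test_seq, seq), label) for seq, label in gallery)
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]
```

Non-uniform `weights` could, for example, emphasise later frames of a sequence, which is one way to reflect time in the similarity measure.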

FIG. 4 illustrates the D-AAM feature points learned as described above. FIG. 4A shows the D-AAM feature points and density function in the positive direction, FIG. 4B shows the D-AAM feature points and density function in the negative direction, and FIG. 4C shows the differential facial expression probability density model (DFEPDM) obtained by Equation 2.
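The construction shown in FIGS. 4A to 4C can be sketched in Python as follows: each density is estimated with a Gaussian kernel, and the negative-direction density is subtracted from the positive-direction density. The one-dimensional features, the fixed `bandwidth`, and the function names are assumptions chosen to keep the example short; they are not taken from the specification.

```python
import math

def gaussian_kde(samples, bandwidth):
    """Return a 1-D Gaussian kernel density estimate built from samples."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density

def dfepdm(positive, negative, bandwidth=0.5):
    """Positive-direction density minus negative-direction density."""
    p = gaussian_kde(positive, bandwidth)
    n = gaussian_kde(negative, bandwidth)
    return lambda x: p(x) - n(x)
```

The resulting function is positive where positive-direction features dominate and negative where negative-direction features dominate.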

Although the present invention has been described with reference to the embodiments shown in the drawings, these are merely exemplary, and those skilled in the art will understand that various modifications and other equivalent embodiments are possible.

Therefore, the true scope of technical protection of the present invention should be defined by the technical spirit of the appended claims.

The present invention can be applied to a method and apparatus for readily extracting the various expressions of a user's face from a captured image while reducing the influence of disturbances.

FIG. 1 is a flowchart illustrating a facial expression recognition method according to an aspect of the present invention.

FIG. 3 is a diagram conceptually showing that the active appearance model (AAM) applied to the present invention consists of a shape parameter and an appearance parameter.

FIGS. 4A to 4C are diagrams for explaining the differential facial expression probability density model (DFEPDM) used to find the expressionless image. FIG. 4A shows the differential-AAM feature points calculated with the expressionless face as the reference expression, together with the density function estimated from them using a Gaussian kernel.

FIG. 4B shows the differential-AAM feature points calculated with a specific facial expression as the reference expression, together with the density function estimated from them using a Gaussian kernel.

FIG. 4C shows the DFEPDM constructed from the above density functions.

Claims

A facial expression recognition method based on a user's expressionless image, the method comprising:

Receiving a training image sequence from a user;

An expressionless image extraction step of learning a differential facial expression probability density model (DFEPDM) for the received training image sequence and extracting an expressionless image using the learned DFEPDM;

Receiving a test image sequence from a user;

A D-AAM feature point calculation step of calculating a differential AAM (D-AAM) feature point using a difference between active appearance model (AAM) parameters of the expressionless image and of the test image sequence;

A manifold space projection step of projecting the D-AAM feature points into a learned manifold space to reduce their dimension; and

A facial expression recognition step of recognizing an expression of the test image sequence from the D-AAM feature points projected into the manifold space with reference to a gallery sequence;

wherein the gallery sequence is generated using a differential-AAM feature point sequence for a change from the expressionless state to a predetermined target expression in the manifold space.

The method of claim 1, wherein the expressionless image extraction step comprises:

Estimating, using a Gaussian kernel, a density function of the feature points in the positive direction and a density function of the feature points in the negative direction; and

Subtracting the density function in the negative direction from the density function in the positive direction to generate the differential facial expression probability density model (DFEPDM).

The method of claim 1, wherein the D-AAM feature point calculation step comprises:

Extracting the expressionless image in real time from the training image sequence; and

Calculating the D-AAM feature point from the extracted expressionless image and the test image sequences.

The method of claim 1, wherein the manifold space projection step comprises:

Reducing the dimension of the facial expression feature points by projecting the D-AAM feature points of the test image sequence into a manifold space trained to readily represent the non-linearity of the D-AAM feature points.

The method of claim 1, wherein the facial expression recognition step is performed using a sequence-based k-NNS classification algorithm.

The method of claim 5, wherein the facial expression recognition step comprises:

Determining the similarity of two sequences in consideration of weights reflecting the distance and time between two adjacent sequences; and

Recognizing a facial expression of the test image sequence based on a similarity between the test image sequence and the gallery sequence.

A facial expression recognition device based on a user's expressionless image, the device comprising:

An image receiver for receiving a training image sequence and a test image sequence from a user; and

An image processor for analyzing the received image sequences to recognize a facial expression of the user, wherein the image processor is adapted to:

Learn a differential facial expression probability density model (DFEPDM) for the received training image sequence and extract an expressionless image using the learned DFEPDM;

Calculate a D-AAM feature point using a difference between active appearance model (AAM) parameters of the expressionless image and of the test image sequence;

Project the D-AAM feature points into a learned manifold space to reduce their dimension, and generate a gallery sequence using a differential-AAM feature point sequence for a change from the expressionless state to a predetermined target expression in the manifold space; and

Recognize the expression of the test image sequence from the D-AAM feature points projected into the manifold space with reference to the gallery sequence.

The device of claim 7, wherein the image processor is further adapted to extract the expressionless image by estimating, using a Gaussian kernel, a density function of the feature points in the positive direction and a density function of the feature points in the negative direction, and by generating the differential facial expression probability density model (DFEPDM) through subtraction of the density function in the negative direction from the density function in the positive direction.

The device of claim 7, wherein the image processor is further adapted to calculate the D-AAM feature point by extracting the expressionless image in real time from the training image sequence and calculating the D-AAM feature point from the extracted expressionless image and the test image sequence.

The device of claim 7, wherein the image processor is further adapted to reduce the dimension of the facial expression feature points by projecting the D-AAM feature points of the test image sequence into a manifold space trained to readily represent the non-linearity of the D-AAM feature points.

The device of claim 7, wherein the image processor is further adapted to recognize the facial expression using a sequence-based k-NNS classification algorithm.

The device of claim 11, wherein the image processor is further adapted to determine the similarity of two sequences in consideration of weights reflecting the distance and time between two adjacent sequences, and to recognize a facial expression of the test image sequence based on the similarity between the test image sequence and the gallery sequence.