WO2008018460A1

WO2008018460A1 - Image processing method, image processing apparatus, image processing program, and image pickup apparatus

Info

Publication number: WO2008018460A1
Application number: PCT/JP2007/065447
Authority: WO
Inventors: Akihiko Utsugi
Original assignee: Nikon Corp
Current assignee: Nikon Corp
Priority date: 2006-08-08
Filing date: 2007-08-07
Publication date: 2008-02-14
Anticipated expiration: 2009-02-08
Also published as: JP2009258771A

Abstract

An image processing method comprises acquiring an image consisting of a plurality of pixels; detecting, based on the acquired image, the edge of a recess the pixel values of which are locally smaller than those of the periphery of the recess; and generating an edge image based on the detected edge of the recess.

Description

明細書 Specification

画像処理方法、画像処理装置、画像処理プログラム、撮像装置技術分野 Image processing method, image processing apparatus, image processing program, imaging apparatus

[0001] 本発明は、取得した画像においてエッジ検出を行う画像処理方法、画像処理装置 The present invention relates to an image processing method and an image processing apparatus for performing edge detection in an acquired image.

、画像処理プログラム、撮像装置に関する。 The present invention relates to an image processing program and an imaging apparatus.

背景技術 Background art

[0002] デジタル画像処理において、撮影された画像の中力顔画像を検出する処理の需要は高い。例えば、デジタルカメラにおいて、検出された顔領域を好ましい色や階調に変換する処理や、ビデオ画像において、特定の人物の登場場面を抽出する処理や、監視カメラにおいて、不審者の画像を抽出する処理などがある。また、撮影された顔画像に基づレ、て個人を識別したり、性別や年齢や表情を推定することなども行われている。 [0002] In digital image processing, there is a high demand for processing for detecting a medium facial image of a captured image. For example, in a digital camera, a process that converts a detected face area into a preferred color or gradation, a process that extracts the appearance of a specific person in a video image, or a suspicious person image that is extracted in a surveillance camera There is processing. In addition, based on the photographed face image, individuals are identified, and gender, age, and facial expressions are estimated.

[0003] 以上のような用途で使われる顔画像判定処理にお!/、て、従来、入力画像の高周波成分を抽出してエッジ画像を作成し、そのエッジ画像に対してニューラルネットワークなどの学習判別処理を施す方法が提案されている。エッジ画像を生成することにより、撮影時の照明条件の影響などの、顔画像判定処理に必要の無い画像情報を除去すること力 Sでき、判定処理を効率よく施すことができる。エッジ成分を抽出する方法としては、例えば、様々な方向と周波数を持つガボールフィルタを用いることにより、入力画像のエッジ構造の方向と周波数の情報を抽出する方法などが提案されている。 [0003] In the face image determination process used in the above applications! /, Conventionally, high-frequency components of the input image are extracted to create an edge image, and learning such as a neural network is performed for the edge image. A method for performing a discrimination process has been proposed. By generating an edge image, it is possible to remove image information that is not necessary for face image determination processing, such as the influence of illumination conditions at the time of shooting, and the determination processing can be performed efficiently. As a method for extracting edge components, for example, a method for extracting information on the direction and frequency of the edge structure of the input image by using a Gabor filter having various directions and frequencies has been proposed.

[0004] 特許文献 1：特開 2004— 199386号公報 [0004] Patent Document 1: Japanese Unexamined Patent Application Publication No. 2004-199386

発明の開示 Disclosure of the invention

発明が解決しょうとする課題 Problems to be solved by the invention

[0005] しかしながら、ガボールフィルタなどの従来のエッジ抽出方法では、エッジ構造の情報を充分に抽出できるとはいえない。例えば、顔画像の目、鼻、口などの位置は周辺に比べて局所的に暗い。従って、顔画像を識別する場合、目、鼻、口に対応する位置に局所的に暗い構造があるかを知ることが重要である。また、歯を見せて笑っている顔の画像では、歯の位置が周辺に比べて局所的に明るい。従って、笑顔を識別する場合、口の位置に局所的に明るい構造があるかを知ることが重要である。 [0005] However, it cannot be said that conventional edge extraction methods such as a Gabor filter can sufficiently extract edge structure information. For example, the positions of eyes, nose, mouth, etc. in the face image are locally darker than the surroundings. Therefore, when identifying facial images, it is important to know if there are locally dark structures at positions corresponding to the eyes, nose and mouth. Also, in the face image showing teeth and laughing, the teeth are locally brighter than the surroundings. Therefore, identify the smile It is important to know if there is a locally bright structure in the mouth position.

[0006] しかし、従来のエッジ抽出方法では、エッジ構造が局所的に喑!/、構造である力、、局所的に明るい構造であるか、あるいはそれ以外の構造であるかを識別することはできないという問題があった。 [0006] However, in the conventional edge extraction method, the edge structure is locally identified as 喑! /, A force that is a structure, a locally bright structure, or another structure. There was a problem that it was not possible.

課題を解決するための手段 Means for solving the problem

[0007] 本発明の第 1の態様によると、画像処理方法は、複数の画素からなる画像を取得し、取得した画像に基づき、局所的に周辺より画素値がへこんでいる凹構造のエッジを検出し、検出した凹構造のエッジに基づきエッジ画像を生成する。 [0007] According to the first aspect of the present invention, an image processing method acquires an image including a plurality of pixels, and based on the acquired image, an edge having a concave structure in which a pixel value is locally recessed from the periphery. An edge image is generated based on the detected edge of the concave structure.

本発明の第 2の態様によると、第 1の態様の画像処理方法において、取得した画像に対して、凹構造のエッジを検出する非線形フィルタを演算することによりエッジ画像を生成するのが好ましい。 According to the second aspect of the present invention, in the image processing method according to the first aspect, it is preferable that the edge image is generated by calculating a nonlinear filter that detects the edge of the concave structure with respect to the acquired image.

本発明の第 3の態様によると、第 2の態様の画像処理方法において、非線形フィノレタは、対象領域における画素値と対象領域の周辺領域における画素値の最小値との差異に基づ!/、て演算結果を出力するのが好まし!/、。 According to the third aspect of the present invention, in the image processing method according to the second aspect, the nonlinear finisher is based on a difference between a pixel value in the target region and a minimum value of the pixel values in the peripheral region of the target region! / It is preferable to output the calculation result!

本発明の第 4の態様によると、第 3の態様の画像処理方法において、非線形フィノレタは、対象領域における画素値の最小値と周辺領域における画素値の最小値との差異に基づレ、て演算結果を出力するのが好まし!/、。 According to the fourth aspect of the present invention, in the image processing method according to the third aspect, the nonlinear finisher is based on a difference between a minimum pixel value in the target region and a minimum pixel value in the peripheral region. To output the calculation result!

本発明の第 5の態様によると、第 4の態様の画像処理方法において、対象領域における画素値の最小値が対象領域の周辺領域における画素値の最小値より小さいとき、その差分に応じた値をそのエッジ画素の値とし、対象領域における画素値の最小値が対象領域の周辺領域における画素値の最小値より大きいとき、エッジ画素の値をゼロにクリッピング処理するのが好ましい。 According to the fifth aspect of the present invention, in the image processing method according to the fourth aspect, when the minimum value of the pixel value in the target area is smaller than the minimum value of the pixel value in the peripheral area of the target area, according to the difference When the value is the value of the edge pixel and the minimum value of the pixel value in the target area is larger than the minimum value of the pixel value in the peripheral area of the target area, it is preferable to perform the clipping process with the value of the edge pixel set to zero.

本発明の第 6の態様によると、画像処理方法は、複数の画素からなる画像を取得し、取得した画像に基づき、局所的に周辺より画素値が出っ張つている凸構造のエッジを検出し、検出した凸構造のエッジに基づきエッジ画像を生成する。 According to the sixth aspect of the present invention, an image processing method acquires an image composed of a plurality of pixels, and detects a convex structure edge in which pixel values protrude locally from the periphery based on the acquired image. Then, an edge image is generated based on the detected edge of the convex structure.

本発明の第 7の態様によると、第 6の態様の画像処理方法において、取得した画像に対して、凸構造のエッジを検出する非線形フィルタを演算することによりエッジ画像を生成するのが好ましい。本発明の第 8の態様によると、第 7の態様の画像処理方法において、非線形フィルタは、対象領域における画素値と対象領域の周辺領域における画素値の最大値との差異に基づ!/、て演算結果を出力するのが好まし!/、。 According to the seventh aspect of the present invention, in the image processing method according to the sixth aspect, it is preferable that the edge image is generated by calculating a nonlinear filter that detects the edge of the convex structure with respect to the acquired image. According to the eighth aspect of the present invention, in the image processing method according to the seventh aspect, the nonlinear filter is based on the difference between the pixel value in the target region and the maximum value of the pixel value in the peripheral region of the target region! / It is preferable to output the calculation result!

本発明の第 9の態様によると、第 8の態様の画像処理方法において、非線形フィルタは、対象領域における画素値の最大値と周辺領域における画素値の最大値との差異に基づレ、て演算結果を出力するのが好まし!/、。 According to the ninth aspect of the present invention, in the image processing method according to the eighth aspect, the nonlinear filter is based on a difference between the maximum pixel value in the target region and the maximum pixel value in the peripheral region. To output the calculation result!

本発明の第 10の態様によると、第 8の態様の画像処理方法において、対象領域における画素値の最大値が対象領域の周辺領域における画素値の最大値より大きいとき、その差分に応じた値をそのエッジ画素の値とし、対象領域における画素値の最大値が対象領域の周辺領域における画素値の最大値より小さいとき、エッジ画素の値をゼロにクリッピング処理するのが好ましい。 According to the tenth aspect of the present invention, in the image processing method according to the eighth aspect, when the maximum value of the pixel values in the target area is larger than the maximum value of the pixel values in the peripheral area of the target area, the difference is determined according to the difference. When the value is the value of the edge pixel and the maximum value of the pixel value in the target region is smaller than the maximum value of the pixel value in the peripheral region of the target region, the edge pixel value is preferably clipped to zero.

本発明の第 11の態様によると、第 1から第 10のいずれかの態様の画像処理方法において、取得した画像に基づき、輝度成分による輝度画像を生成し、生成した輝度画像を使用して、エッジ画像を生成するのが好ましい。 According to the eleventh aspect of the present invention, in the image processing method according to any one of the first to tenth aspects, a luminance image based on a luminance component is generated based on the acquired image, and the generated luminance image is used. Thus, it is preferable to generate an edge image.

本発明の第 12の態様によると、第 3から第 5、第 8から第 10のいずれかの態様の画像処理方法において、対象領域は対象画素のみの 1画素あるいは対象画素とその隣接画素の 2画素の領域であり、周辺領域は対象領域の外側両サイドに位置する 2 画素の領域であるのが好ましレ、。 According to a twelfth aspect of the present invention, in the image processing method according to any one of the third to fifth and eighth to tenth aspects, the target area is only one target pixel or the target pixel and its adjacent pixels. It is a 2-pixel area, and the surrounding area is preferably a 2-pixel area located on both sides of the target area.

本発明の第 13の態様によると、第 2から第 5、第 7から第 10のいずれかの態様の画像処理方法おいて、非線形フィルタを、少なくとも 2通りの方向に演算するのが好ましい。 According to the thirteenth aspect of the present invention, in the image processing method according to any one of the second to fifth and seventh to tenth aspects, the nonlinear filter is preferably operated in at least two directions. Yes.

本発明の第 14の態様によると、画像処理方法は、複数の画素からなる画像を取得し、取得した画像に基づき、局所的に周辺より画素値がへこんでいる凹構造のエッジと出っ張つている凸構造のエッジとを検出し、検出した凹構造のエッジに基づき凹構造のエッジ画像を生成し、検出した凸構造のエッジに基づき凸構造のエッジ画像を生成する。 According to the fourteenth aspect of the present invention, an image processing method acquires an image composed of a plurality of pixels, and on the basis of the acquired image, the edge of a concave structure that protrudes locally from the periphery and protrudes. The edge of the convex structure is detected, an edge image of the concave structure is generated based on the detected edge of the concave structure, and an edge image of the convex structure is generated based on the detected edge of the convex structure.

本発明の第 15の態様によると、画像処理方法は、複数の画素からなる画像を取得し、取得した画像に基づき、局所的に周辺より画素値がへこんでいる凹構造のエッジ本発明の第 16の態様によると、画像処理方法は、複数の画素からなる画像を取得し、取得した画像に基づき、局所的に周辺より画素値がへこんでいる凹構造のエッジおよび局所的に周辺より画素値が出っ張つている凸構造のエッジの少なくともいずれ力、を検出し、検出した凹構造のエッジおよび凸構造のエッジの少なくともいずれかに基づきエッジ画像を生成する。 According to the fifteenth aspect of the present invention, an image processing method acquires an image composed of a plurality of pixels, and based on the acquired image, the edge of a concave structure in which pixel values are locally recessed from the periphery. According to the sixteenth aspect of the present invention, an image processing method acquires an image composed of a plurality of pixels, and based on the acquired image, the edge of the concave structure where the pixel value is locally recessed from the periphery and locally At least one of the edges of the convex structure whose pixel value protrudes from the periphery is detected, and an edge image is generated based on at least one of the detected concave structure edge and convex structure edge.

本発明の第 17の態様によると、画像処理方法は、複数の画素からなる画像を取得し、取得した画像のエッジ成分を検出し、検出したエッジ成分に対してガンマ変換を施し、ガンマ変換されたエッジ成分によるエッジ画像を生成する。 According to the seventeenth aspect of the present invention, an image processing method acquires an image composed of a plurality of pixels, detects an edge component of the acquired image, performs gamma conversion on the detected edge component, and performs gamma conversion. An edge image based on the edge component is generated.

本発明の第 18の態様によると、第 1〜第 17のいずれかの態様の画像処理方法おいて、生成したエッジ画像を使用して顔の画像を検出するのが好ましい。 According to the eighteenth aspect of the present invention, in the image processing method according to any one of the first to seventeenth aspects, it is preferable to detect a face image using the generated edge image.

本発明の第 19の態様によると、画像処理プログラムは、第 1から第 18のいずれかの態様の画像処理方法をコンピュータに実行させる画像処理プログラムとするものであ本発明の第 20の態様によると、画像処理装置は、第 19の態様の画像処理プロダラムを搭載する画像処理装置とするものである。 According to a nineteenth aspect of the present invention, the image processing program is an image processing program for causing a computer to execute the image processing method according to any one of the first to eighteenth aspects. According to the twentieth aspect of the present invention. The image processing apparatus is an image processing apparatus equipped with the image processing program of the nineteenth aspect.

本発明の第 21の態様によると、撮像装置は、第 19の態様の画像処理プログラムを搭載する撮像装置とするものである。 According to a twenty-first aspect of the present invention, an imaging apparatus is an imaging apparatus that mounts the image processing program of the nineteenth aspect.

発明の効果 The invention's effect

[0008] 本発明は以上のように構成しているので、エッジを的確に検出することができる。例えば、請求項 1では凹構造のエッジを検出しているので、顔画像の目などの局所的に喑レ、構造を、エッジとして的確に検出することができる。 [0008] Since the present invention is configured as described above, an edge can be accurately detected. For example, since the edge of the concave structure is detected in claim 1, it is possible to accurately detect a local crease or structure such as an eye of the face image as an edge.

図面の簡単な説明 Brief Description of Drawings

[0009] [図 1]本発明の一実施の形態である画像処理装置を示す図である。 FIG. 1 is a diagram showing an image processing apparatus according to an embodiment of the present invention.

[図 2]パーソナルコンピュータ 1が実行する画像処理プログラムのフローチャートを示す図である。 [図 3]エッジ抽出対象画素と周辺画素とを座標 xyで表した図である。 FIG. 2 is a diagram showing a flowchart of an image processing program executed by the personal computer 1. FIG. 3 is a diagram showing an edge extraction target pixel and peripheral pixels with coordinates xy.

[図 4]輝度の様々な構造に対して輝度凹部画像 (x,y)を作成した結果を示す図である。 FIG. 4 is a diagram showing the result of creating a luminance concave image (x, y) for various luminance structures.

[図 5]具体的な顔の輝度画像について 4種類のエッジ画像 E (x，y)〜E (x，y)を生成 [Figure 5] Four types of edge images E (x, y) to E (x, y) are generated for a specific facial luminance image

1 4 14

した例を示す図である。 FIG.

[図 6]具体的なエッジ画像につ!/、て、顔らしさ V (x,y)を生成し、顔らしさ V を算 [Fig.6] Creates a specific edge image! /, Generates facial appearance V (x, y), and calculates facial appearance V

1 SUM1 出する処理を行った例を示す図である。 It is a figure which shows the example which performed the process which issues 1SUM1.

[図 7]ルックアップテーブル L (E)の具体的な値をエッジの大きさ毎に表した図で [Fig.7] A specific value of the lookup table L (E) is shown for each edge size.

1 (x,y) 1 (x, y)

ある。 is there.

[図 8]図 2のステップ S6の顔判定の処理において、部分画像の顔らしさ Vsum〜Vsu mを求めた後の処理のフローチャートを示す図である。 FIG. 8 is a diagram showing a flowchart of the processing after obtaining the face-likeness Vsum to Vsum of the partial image in the face determination processing in step S6 of FIG.

4 Four

[図 9]顔らしさ L (E)を求める処理のフローチャートを示す図である。 FIG. 9 is a diagram showing a flowchart of a process for obtaining facial appearance L (E).

1 (x'y) 1 (x'y)

[図 10]撮像装置であるデジタルカメラ 100の構成を示す図である。 FIG. 10 is a diagram showing a configuration of a digital camera 100 that is an imaging apparatus.

発明を実施するための最良の形態 BEST MODE FOR CARRYING OUT THE INVENTION

[0010] 第 1の実施の形態 [0010] First Embodiment

図 1は、本発明の一実施の形態である画像処理装置を示す図である。画像処理装置は、パーソナルコンピュータ 1で実現される。パーソナルコンピュータ 1は、デジタルカメラ 2、 CD— ROMなどの記録媒体 3、他のコンピュータ 4などと接続され、各種の画像（画像データ）の提供を受ける。パーソナルコンピュータ 1は、提供された画像に対して、以下に説明する画像処理を行う。コンピュータ 4は、インターネットやその他の電気通信回線 5を経由して接続される。 FIG. 1 is a diagram showing an image processing apparatus according to an embodiment of the present invention. The image processing apparatus is realized by the personal computer 1. The personal computer 1 is connected to a digital camera 2, a recording medium 3 such as a CD-ROM, another computer 4, etc., and receives various images (image data). The personal computer 1 performs the image processing described below on the provided image. The computer 4 is connected via the Internet and other telecommunications lines 5.

[0011] パーソナルコンピュータ 1が画像処理のために実行するプログラムは、図 1の構成と同様に、 CD— ROMなどの記録媒体や、インターネットやその他の電気通信回線を経由した他のコンピュータ力、ら提供され、パーソナルコンピュータ 1内にインストールされる。パーソナルコンピュータ 1は、 CPU (不図示）およびその周辺回路（不図示）から構成され、 CPUがインストールされたプログラムを実行する。 [0011] The program executed by the personal computer 1 for image processing is similar to the configuration shown in FIG. 1, such as a recording medium such as a CD-ROM, other computer power via the Internet or other electric communication line, and the like. Provided and installed in the personal computer 1. The personal computer 1 is composed of a CPU (not shown) and its peripheral circuits (not shown), and executes a program in which the CPU is installed.

[0012] プログラムがインターネットやその他の電気通信回線を経由して提供される場合は、プログラムは、電気通信回線、すなわち、伝送媒体を搬送する搬送波上の信号に変換して送信される。このように、プログラムは、記録媒体や搬送波などの種々の形態のコンピュータ読み込み可能なコンピュータプログラム製品として供給される。 [0012] If the program is provided via the Internet or other telecommunications line, the program may be connected to a telecommunications line, ie, a signal on a carrier wave carrying a transmission medium. Converted and sent. Thus, the program is supplied as a computer readable computer program product in various forms such as a recording medium and a carrier wave.

[0013] 本実施の形態のパーソナルコンピュータ 1は、撮影された画像の中力顔画像を検出する画像処理を行う。具体的には、入力した画像に基づきエッジ成分を抽出してエッジ画像を生成し、生成したエッジ画像に基づき顔の画像があるかどうかを判定する。本実施の形態における処理では、このエッジ成分の抽出方法およびエッジ画像に基づく顔の判定方法に特徴を有する。 [0013] The personal computer 1 according to the present embodiment performs image processing for detecting a medium facial image of a captured image. Specifically, an edge component is extracted based on the input image to generate an edge image, and it is determined whether there is a face image based on the generated edge image. The processing in the present embodiment is characterized by the edge component extraction method and the face determination method based on the edge image.

[0014] なお、以下では、画像に対して画像処理を行うという表現をするが、実際には入力した画像データに対して画像処理を行うことを意味する。また、本実施の形態で言うエッジとは、輝度値や画素値が周囲より小さい値を示しへこんでいる箇所 (領域、画素）、周囲より大き!/、値を示し出っ張って!/、る（突出してレ、る）箇所 (領域、画素）、段差になっている箇所 (領域、画素）のことを言う。特に、周囲よりへこんでいる箇所 (領域、画素）を凹構造のエッジ、周囲より出っ張っている箇所 (領域、画素）を凸構造のエッジと言う。 In the following, the expression “image processing is performed on an image” is used, but in practice, this means that image processing is performed on input image data. Also, in this embodiment, an edge is a portion (area, pixel) where the luminance value or pixel value is smaller than the surrounding area, is larger than the surrounding area! /, And the value is protruding! / This refers to locations (regions, pixels) that are (protruding), and locations (regions, pixels) that are not leveled. In particular, the indented area (area, pixel) is called a concave edge, and the protruding area (area, pixel) is called a convex edge.

[0015] 以下、本実施の形態のパーソナルコンピュータ 1が撮影された画像の中力顔画像を検出する画像処理について詳細に説明する。図 2は、パーソナルコンピュータ 1が実行する画像処理プログラムのフローチャートを示す図である。 [0015] Hereinafter, image processing for detecting a medium face image captured by the personal computer 1 according to the present embodiment will be described in detail. FIG. 2 is a diagram showing a flowchart of an image processing program executed by the personal computer 1.

[0016] ステップ S1では、デジタルカメラなどで撮影（撮像）した顔を検出する対象の画像（画像データ）を入力（取得)する。入力画像の各画素は R, G, Bの各色成分を含み、各色成分の範囲は 0〜255とする。ステップ S2では、入力画像の R, G, Bに基づき、輝度画像 Yを次の式で生成する。すなわち、輝度画像 Y面を生成する。 [0016] In step S1, an image (image data) to be detected for a face photographed (captured) with a digital camera or the like is input (acquired). Each pixel of the input image includes R, G, and B color components, and each color component ranges from 0 to 255. In step S2, a luminance image Y is generated by the following formula based on R, G, and B of the input image. That is, the luminance image Y plane is generated.

Y= (R+ 2G + B) /4 Y = (R + 2G + B) / 4

[0017] ステップ S3では、生成した輝度画像を階層的に縮小して出力する。例えば、 0〜3 1までの整数 nに対して縮小倍率 _κを 0.9ⁿで与え、その 32通りの縮小倍率 κで縮小された輝度画像を出力する。なお、縮小方法は、例えば Cubic変倍や線形変倍を用いればよい。このように複数通りの縮小画像を生成するのは、入力した画像にはどのようなサイズの顔画像があるかどうか不明であり、あらゆるサイズの顔の画像に対応できるようにするためである。 [0018] ステップ S4では、縮小されたそれぞれの輝度画像 Y (x，y)力も 4種類のエッジ画像 E (x，y)〜E (x，y)を以下の手順で生成する。以下では、 x方向を画像の横方向ある[0017] In step S3, the generated luminance image is hierarchically reduced and output. For example, given by the reduction ratio _κ of 0.9 ⁿ for integer n of 0 to 3 1, and outputs the luminance image which has been reduced by the reduction magnification κ of the 32 patterns. As a reduction method, for example, Cubic scaling or linear scaling may be used. The reason why multiple reduced images are generated in this way is that it is unclear whether the input image has a face image of any size, so that it can handle face images of any size. . [0018] In step S4, four types of edge images E (x, y) to E (x, y) are generated by the following procedure for each reduced luminance image Y (x, y) force. In the following, the x direction is the horizontal direction of the image

1 4 14

いは水平方向、 y方向を縦方向あるいは鉛直方向とする。 Or the horizontal direction and the y direction are vertical or vertical.

[0019] まず、以下の式より、縦方向に平滑化した画像 Y (x [0019] First, an image Y (x

LV ，y)と横方向に平滑化した画像 Y (x，y)を生成する。縦方向のエッジ成分を抽出するためには、横方向を平滑化 LV, y) and a horizontally smoothed image Y (x, y) are generated. To extract the edge component in the vertical direction, smooth the horizontal direction.

LH LH

した画像データを使用し、横方向のエッジ成分を抽出するためには、縦方向を平滑化した画像データを使用するのが好ましいからである。 This is because, in order to extract the edge component in the horizontal direction using the processed image data, it is preferable to use the image data in which the vertical direction is smoothed.

Y (x,y) = {Y(x,y-l) +2XY(x,y) +Y(x,y+1) }/4 Y (x, y) = (Y (x, y-l) + 2XY (x, y) + Y (x, y + 1)} / 4

LV LV

Y (x,y) = {Y(x-l,y) + 2XY(x,y) +Y(x+l,y) }/4 Y (x, y) = (Y (x-l, y) + 2XY (x, y) + Y (x + l, y)} / 4

LH LH

[0020] 次に、横方向を平滑化した画像 Y (x，y)を使用して、以下の式より、縦方向のエツ [0020] Next, using the image Y (x, y) smoothed in the horizontal direction, the vertical direction

LH LH

ジ画像 E (x，y)を生成する。なお、エッジ画像の各画素はエッジ画素と言う。 Di-image E (x, y) is generated. Each pixel of the edge image is called an edge pixel.

E '(x,y)=Min(Y (x,y-l), Y (x,y+2)) E '(x, y) = Min (Y (x, y-l), Y (x, y + 2))

1 LH LH 1 LH LH

-Min (Y (x,y) , Y (x,y+l)) -Min (Y (x, y), Y (x, y + l))

LH LH LH LH

E (x,y) = γ (E ' (x,y) ) E (x, y) = γ (E '(x, y))

[0021] 次に、以下の式より、縦方向のエッジ画像 E (x，y)を生成する。 Next, a vertical edge image E (x, y) is generated from the following equation.

2 2

E ' (x,y) = Y E '(x, y) = Y

2 I (x,y-l)-Y (x,y) | 2 I (x, y-l) -Y (x, y) |

LH LH LH LH

+ I Y (x,y+l)-Y (x,y) | + I Y (x, y + l) -Y (x, y) |

LH LH LH LH

E (x，y) = γ (E ' (x，y) ) E (x, y) = γ (E '(x, y))

2 2 twenty two

[0022] 次に、縦方向を平滑化した画像 Y (x，y)を使用して、以下の式より、横方向のエツ [0022] Next, using the image Y (x, y) smoothed in the vertical direction, the horizontal direction

LV LV

ジ画像 E (x，y)を生成する _c _C to generate the image E (x, y)

3 Three

E (x,y)=Min (Y (x—l，y)， Y (x+2,y)) E (x, y) = Min (Y (x—l, y), Y (x + 2, y))

3 LV LV 3 LV LV

Min (Y (x，y)， Y (x + 1，y) ) Min (Y (x, y), Y (x + 1, y))

LV LV LV LV

E (x，y) = γ (E ' (x，y) ) E (x, y) = γ (E '(x, y))

3 3 3 3

[0023] 次に、以下の式より、横方向のエッジ画像 E (x，y)を生成する。 Next, a lateral edge image E (x, y) is generated from the following equation.

4 Four

E ' (x,y) = I Y (x-l,y)-Y (x,y) | E '(x, y) = I Y (x-l, y) -Y (x, y) |

4 LV LV 4 LV LV

+ I Y (x+l,y)-Y (x,y) | + I Y (x + l, y) -Y (x, y) |

LV LV LV LV

E (x，y) = γ (E ' (x，y) ) E (x, y) = γ (E '(x, y))

4 4 4 4

[0024] ここで、 Min()は、 0の中の最小の値を戻す関数である。また、 Ί (E)は、 γ変換とクリッピングを行う関数であり、以下の演算を行い、 0〜31の整数を出力する。この Ml N ()処理は、非線形フィルタ処理である。また、 γ変換やクリッピング処理を含めて非線形フィルタ処理と言ってもょレ、。 Here, Min () is a function that returns the minimum value of 0. In addition, Ί (E) is This is a clipping function that performs the following operations and outputs an integer between 0 and 31. This Ml N () process is a non-linear filter process. It can also be called non-linear filter processing, including γ conversion and clipping processing.

Ε< 0の場合 γ (Ε) =0 If Ε <0 γ (Ε) = 0

Ε〉63の場合 _Ί (Ε) = 31 Ε> 63 ( _Ί ) = 31

0≤Ε≤63の場合 γ (E) = (int) (4 Χ ^Ε) When 0≤Ε≤63 γ (E) = (int) (4 Χ ^ Ε)

[0025] 上記エッジ画像の生成について、図 3を参照してさらに詳しく説明する。図 3は、ェッジ抽出対象画素と周辺画素とを座標 xyで表した図である。上記 E ' (x，y)は、輝度画像 Y (x，y)面において、縦方向 4画素 Y (x，y— 1)、 Y (x，y)、 Y (x，y+ l)、[0025] Generation of the edge image will be described in more detail with reference to FIG. FIG. 3 is a diagram in which the edge extraction target pixel and the surrounding pixels are represented by coordinates xy. The above E ′ (x, y) is 4 pixels in the vertical direction Y (x, y— 1), Y (x, y), Y (x, y + l),

LH LH LH LH LH LH LH LH

Y (x，y+ 2)のうち、対象画素（_x， y)を基準に、外側 2画素 Y (x，y— 1)、 Y (x，yOut of Y (x, y + 2), the outer two pixels Y (x, y— 1), Y (x, y) with reference to the target pixel ( _x , y)

LH LH LH LH LH LH

+ 2)の最小値と内側 2画素 Y (x，y)、 Y (x，y+ l)の最小値の差を求めている。 The difference between the minimum value of +2) and the minimum value of the inner two pixels Y (x, y) and Y (x, y + l) is obtained.

LH LH LH LH

[0026] E ' (x,y)の値が正の値を示すことは、対象画素（x, y)近辺の値が、縦方向周辺画素の値より小さい、すなわち画素値が縦方向の周辺よりへこんでいることを示す。従つて、このようにして生成した E (x，y)の値を画素値として取り扱い、生成された画像を縦方向輝度凹部画像と言う。 [0026] The positive value of E ′ (x, y) indicates that the value near the target pixel (x, y) is smaller than the value in the vertical peripheral pixel, that is, the pixel value is in the vertical direction. Indicates that it is dented from the surroundings. Therefore, the value of E (x, y) generated in this way is treated as a pixel value, and the generated image is called a vertical luminance concave image.

[0027] 上記 E ' (x，y)は、輝度画像 Y (x，y)面において、対象画素 (x, y)と縦方向に隣 [0027] The above E '(x, y) is adjacent to the target pixel (x, y) in the vertical direction on the luminance image Y (x, y) plane.

2 LH 2 LH

接する画素との輝度値の差分を足し込んだ値を示す。すなわち、縦方向隣接画素との間で輝度値の変化が大きい場合に大きな値が生成される。従って、このようにして生成した E (x，y)の値を画素値として取り扱い、生成された画像を縦方向隣接画素 A value obtained by adding a difference in luminance value with the adjacent pixel is shown. That is, a large value is generated when the luminance value changes greatly between the adjacent pixels in the vertical direction. Therefore, the value of E (x, y) generated in this way is treated as a pixel value, and the generated image is treated as a vertically adjacent pixel.

2 2

差分画像と言う。縦方向隣接画素差分画像は、凹部構造のエッジ、凸部構造のエツジ、段差のエッジを区別なく検出する。 This is called a difference image. The vertically adjacent pixel difference image detects the edge of the concave structure, the edge of the convex structure, and the edge of the step without distinction.

[0028] 上記 E ' (x,y)および E (x,y)、 E ' (x,y)および E (x,y)は、横方向のエッジ画像 [0028] The above E '(x, y) and E (x, y), E' (x, y) and E (x, y) are horizontal edge images

3 3 4 4 3 3 4 4

を生成するためのものである。上記 E ' ( ）ぉょび£ (x，y)、 E ' ( ）ぉょび£ (x, Is for generating. E '() selection £ (x, y), E' () selection £ (x,

1 1 2 2 y)に対して、縦と横をひっくり返して考え、後は同様に演算するものである。従って、このようにして生成された E (x，y)を横方向輝度凹部画像、 E (x，y)を横方向隣接画 For 1 1 2 2 y), the vertical and horizontal directions are reversed, and the rest is calculated in the same way. Therefore, E (x, y) generated in this way is the horizontal luminance concave image, and E (x, y) is the horizontal adjacent image.

3 4 3 4

素差分画像と言う。 This is called an elementary difference image.

[0029] 図 4は、輝度の様々な構造に対して輝度凹部画像 E (x,y)を作成した結果を示す図である。図 4 (a)は輝度が凹んでいる場合であり、図 4 (b)は輝度が突出している場合であり、図 4 (c)は輝度が段差になっている場合である。図 4を見ると、輝度が凹んでいる場合のみ輝度凹部画像が正の値を持つことがわかる。従って、輝度凹部画像 E'の負の値を 0にクリッピングすれば、輝度の凹みだけに反応するエッジ画像 E (x,y )が生成される。 FIG. 4 is a diagram showing the result of creating the luminance concave image E (x, y) for various luminance structures. Fig. 4 (a) shows the case where the luminance is concave, and Fig. 4 (b) shows the case where the luminance is protruding. Fig. 4 (c) shows the case where the brightness is stepped. As can be seen from Fig. 4, the luminance concave image has a positive value only when the luminance is concave. Therefore, if the negative value of the luminance recess image E ′ is clipped to 0, an edge image E (x, y) that reacts only to the luminance recess is generated.

[0030] この輝度凹部画像によると、目鼻口などの局所的に暗い箇所に特に良く反応する。 [0030] According to this luminance concave image, it reacts particularly well to locally dark places such as the eyes and nose and mouth.

図 5は、具体的な顔の輝度画像について上記 4種類のエッジ画像 E (x，y)〜E (x,y Fig. 5 shows the four types of edge images E (x, y) to E (x, y

1 4 14

)を生成した例を示す図である。実際、輝度凹部画像は、目鼻口の位置に鋭!/、ピークを持つ。特に、図 5の縦方向輝度凹部画像 Eでは、目、鼻の穴、口などに反応し、その中でも目、鼻の穴などには強く反応し白くなつている。すなわち、その位置の Eの値が大きな値となっている。従って、このような輝度凹部画像を解析することにより、顔を高精度に検出することができる。ただし、輝度凹部画像だけを用いるのではなく、従来の方法で作成したエッジ画像も合わせて用いることが望ましい。 FIG. In fact, the luminance depression image has a sharp peak at the position of the eyes and nose! In particular, in the vertical luminance concave image E in FIG. 5, it reacts to the eyes, nostrils, mouth, etc. Among them, it reacts strongly to the eyes, nostrils, etc. and becomes white. In other words, the value of E at that position is large. Therefore, the face can be detected with high accuracy by analyzing such a luminance concave image. However, it is desirable to use not only the luminance concave image but also the edge image created by the conventional method.

[0031] なお、上記エッジ画像をガンマ変換した理由は、エッジ量を適切な特徴量 E に変換するためである。画像解析において、ほとんどエッジがない箇所での微妙なェッジ量の違!/、は、大きなエッジがある箇所での多少のエッジ量の違!/、よりも大きな意味を持つ。エッジ量 ΕΊこ対してガンマ変換を施すことにより上記の効果が実現され、ほとんどエッジがな!/、箇所でのエッジ量の違いは特徴量 Eの大きな違いに変換され、大きなエッジがある箇所でのエッジ量の違いは特徴量 Eの小さな違いに変換される。 The reason why the edge image is gamma-converted is to convert the edge amount into an appropriate feature amount E. In image analysis, a subtle difference in the edge amount in a place where there is almost no edge has a larger meaning than a slight difference in the amount of edge in a place where there is a large edge! /. By applying gamma conversion to the edge amount, the above effect is achieved. Almost no edge! / The difference in edge amount at the location is converted into a large difference in feature amount E. The difference in edge amount is converted into a small difference in feature amount E.

[0032] 次に、図 2に戻って、ステップ S5では、縮小した画像の 1画素おきに 19 X 19画素の顔判定対象領域を設定し、その領域におけるエッジ画像の部分画像を出力する。これをすベての縮小画像において行う。 19 X 19画素の顔判定対象領域は、その領域が顔である場合に目や鼻や口などが 2画素程度で検出できるのに適したサイズである。 Next, returning to FIG. 2, in step S5, a face determination target area of 19 × 19 pixels is set for every other pixel of the reduced image, and a partial image of the edge image in that area is output. This is performed for all reduced images. The 19 x 19 pixel face detection target area is suitable for detecting the eyes, nose, mouth, etc. with about 2 pixels when the area is a face.

[0033] ステップ S6では、ステップ 5で出力したエッジ画像の各部分画像に対して、この領域が顔の画像であるかどうか判定する。本実施の形態では、この顔の画像の判定を以下に説明する手法により行う。 In step S 6, it is determined for each partial image of the edge image output in step 5 whether this area is a face image. In the present embodiment, the determination of the face image is performed by the method described below.

[0034] まず、エッジ画像 E (x，y)の部分画像の各画素位置（x，y) (0≤x≤18、 0≤y≤18 )について、次の式に基づいてその位置の顔らしさ V (x，y)を生成する。顔らしさ V ( x,y)は、各画素位置で顔らしさを数値化したもので、顔らしさの度合いや程度を示すものである。 V (x，y)は、顔として尤もらしい度合いを表す尤度といってもよい。 [0034] First, for each pixel position (x, y) (0≤x≤18, 0≤y≤18) of the partial image of the edge image E (x, y), the face of that position is based on the following equation: Produces the likelihood V (x, y). Facialness V ( x, y) is a numerical expression of the facial appearance at each pixel position, and indicates the degree and degree of facial appearance. V (x, y) may be said to be a likelihood representing a degree of likelihood as a face.

V (x,y) =L (E (x,y) ) V (x, y) = L (E (x, y))

1 1 (x,y) 1 1 1 (x, y) 1

ここで、 L (E)は、各画素位置 (x，y) (0≤x≤18, 0≤v≤18)について、後述す Here, L (E) is described later for each pixel position (x, y) (0≤x≤18, 0≤v≤18).

1 (x,y) 1 (x, y)

る統計処理によりあらかじめ作成されているルックアップテーブルであり、画素位置 (X This is a look-up table created in advance by statistical processing.

，y)のエッジ E (x,y)が Eである時のその箇所の顔らしさを表す。 , Y) represents the face likeness when the edge E (x, y) is E.

[0035] そして、生成した顔らしさ V (x，y)を全画素（x，y) (0≤x≤18, 0≤y≤18)について積算し、顔らしさ V を算出する。 Then, the generated facial appearance V (x, y) is integrated for all pixels (x, y) (0≤x≤18, 0≤y≤18) to calculate the facial appearance V.

SUM1 SUM1

[0036] 図 6は、具体的なエッジ画像について上記の処理を行った例を示す図である。図 6 の顔らしさ画像では、顔らしい箇所が白く表示され、顔らしくない箇所が黒く表示されている。図 6 (a)に示す顔のエッジ画像から生成した顔らしさ画像は、全体的に大きな値を持つ。すなわち、全体的に白っぽい画像となる。しかし、図 6 (b)に示す非顔のェッジ画像から生成した顔らしさ画像は所々小さな値を持つ。すなわち、所々黒っぽくなった画像となる。 FIG. 6 is a diagram illustrating an example in which the above processing is performed on a specific edge image. In the face-like image in Fig. 6, parts that look like faces are displayed in white, and parts that do not look like faces are displayed in black. The face-like image generated from the face edge image shown in Fig. 6 (a) has a large overall value. That is, the overall image is whitish. However, the facial image generated from the non-facial edge image shown in Fig. 6 (b) has small values in some places. That is, the image becomes dark in some places.

[0037] 図 6 (b)の非顔の例では、目の間、鼻、口の両横に対応する領域が顔らしくないとされて、顔らしさ画像ではその領域の画素値は小さな値となり黒!、画像となってレ、る。従って、非顔画像の顔らしさ画像を全画素積算した値 V は小さな値になる。 [0037] In the non-face example shown in Fig. 6 (b), the area corresponding to both sides of the eyes, the nose, and the mouth does not look like a face. In the face-like image, the pixel value of that area is a small value. Next black! Therefore, the value V obtained by integrating all the pixels of the face-like image of the non-face image is a small value.

SUM1 SUM1

[0038] 図 7は、ルックアップテーブル L (E)の具体的な値をエッジの大きさ毎に表した [0038] FIG. 7 shows specific values of the lookup table L (E) for each edge size.

1 (x,y) 1 (x, y)

図である。図 7では、顔らしさの値が大きいほど白く表示されている。図 7において、左側はエッジが小さい時の顔らしさであり、右側はエッジが大きい時の顔らしさである FIG. In FIG. 7, the larger the face-like value, the more white it is displayed. In Fig. 7, the left side is the facial appearance when the edge is small, and the right side is the facial appearance when the edge is large.

。なお、ルックアップテーブル L (E)の全ての値を図示するなら、前述の通りエツ . If all the values in the lookup table L (E) are illustrated,

1 (x,y) 1 (x, y)

ジは 0〜31ので生成されているので、 L (0)〜L (31)の 32通りの図ができ

Is generated from 0 to 31, so 32 diagrams from L (0) to L (31) can be created.

る。し力、し、図 7では、図示の便宜上そのうちの 8通りのみ表示している。 The In FIG. 7, only eight of them are shown for convenience of illustration.

[0039] なお、図 7のルックアップテーブル L (E)は、具体的な値をエッジの大きさ毎に [0039] Note that the look-up table L (E) in Fig. 7 has a specific value for each edge size.

1 (x,y) 1 (x, y)

視覚的に表した図である。実際には、画素位置 (X, y)を引数とした画素値のテープルが、エッジの値毎にメモリに格納されている。すなわち、 32個の画素位置 (X, y)を引数とした画素値のテーブル力 Sメモリに格納されている。 It is the figure represented visually. Actually, a table of pixel values with the pixel position (X, y) as an argument is stored in the memory for each edge value. In other words, the table power S of the pixel value with 32 pixel positions (X, y) as an argument is stored in the S memory.

[0040] 図 7において、左側の図はエッジが小さい時の顔らしさを表す。左側の図を見ると、目、鼻、口の箇所の顔らしさが小さな値になっている。これは、目、鼻、口の箇所のェッジが小さい場合には、その箇所は顔らしくないということを表している。例えば、図 6 ωの非顔の例では、鼻に対応する箇所のエッジが小さいので、その箇所は顔らしくないとされる。 [0040] In Fig. 7, the diagram on the left represents the facial appearance when the edge is small. Looking at the diagram on the left, The facial appearance of the eyes, nose and mouth is small. This means that if the edges of the eyes, nose, and mouth are small, the area is not likely to be a face. For example, in the non-face example in Fig. 6 ω, the edge of the part corresponding to the nose is small, so the part does not look like a face.

[0041] また、図 7の右側の図はエッジが大きい時の顔らしさを表す。右側の図を見ると、目、鼻、口以外の箇所の顔らしさが小さな値になっている。これは、目、鼻、口以外の箇所のエッジが大きい場合には、その箇所は顔らしくないということを表している。例えば、図 6 (a)の非顔の例では、目の間と口の両横に対応する箇所のエッジが大きいので、その箇所は顔らしくないとされる。 [0041] Further, the diagram on the right side of FIG. 7 represents the facial appearance when the edge is large. Looking at the figure on the right, the face-likeness of parts other than the eyes, nose, and mouth is small. This means that if the edge of a part other than the eyes, nose, or mouth is large, that part does not look like a face. For example, in the non-face example shown in Fig. 6 (a), the edge of the part corresponding to the space between the eyes and both sides of the mouth is large, so the part is not likely to be a face.

[0042] すなわち、顔の画像を特定種類の画像とし、目、鼻、口などを特定種類の画像の特徴的な要素であると考えると、特定種類の画像の特徴的な要素に対応する画素位置では、その画素のエッジ成分が大きい場合の特定種類の画像らしさの度合いを、エツジ成分が小さレ、場合の特定種類の画像らしさの度合いに比べて大きな値としてレ、る。また、特定種類の画像の特徴的な要素以外に対応する画素位置では、その画素のエッジ成分が大きい場合の特定種類の画像らしさの度合いを、エッジ成分が小さ V、場合の特定種類の画像らしさの度合いに比べて小さな値として!/、る。 [0042] That is, assuming that a face image is a specific type of image and eyes, nose, mouth, and the like are characteristic elements of the specific type of image, they correspond to characteristic elements of the specific type of image. At the pixel position, the degree of the particular kind of image when the edge component of the pixel is large is smaller than the degree of the particular kind of image when the edge component is small. In addition, at pixel positions corresponding to elements other than the characteristic elements of a specific type of image, the degree of a particular type of image when the edge component of that pixel is large is expressed as the degree of a particular type of image when the edge component is small V. As a small value compared to the degree of!

[0043] 上記ルックアップテーブルを参照する処理を整理すると、まず、エッジ画像 E (x,y) の部分画像において、 x = 0、 = 0のェッジ£の値を得る。次に、このエッジ Eの値に相当するルックアップテーブル L (E )を 32個のルックアップテーブルの中から [0043] When the process of referring to the lookup table is arranged, first, in the partial image of the edge image E (x, y), the value of the edge of x = 0, = 0 is obtained. Next, look-up table L (E) corresponding to the value of edge E is selected from 32 look-up tables.

1 (x,y) 1 1 (x, y) 1

決める。ルックアップテーブル L (E )が決まると、このルックアップテーブル L Decide. When the lookup table L (E) is determined, this lookup table L

1 (x,y) 1 1 vx.y) 1 (x, y) 1 1 vx.y)

(E )の画素位置（0, 0)の値を得る。これが、エッジ画像 E (x，y)の画素位置（0, 0) の顔らしさの値である。この処理を、 x = 0、 y= 0の画素力、ら x= 18、 y= 18の画素まで順次行い、顔らしさ画像 V (x，y)を得る。そして、 V (x，y)をすベて積算して Vsum を得る。 The value of the pixel position (0, 0) of (E) is obtained. This is the faceness value at the pixel position (0, 0) of the edge image E (x, y). This process is sequentially performed until the pixel power of x = 0, y = 0, and the pixels of x = 18, y = 18, and the face-like image V (x, y) is obtained. Then, Vsum is obtained by summing all V (x, y).

[0044] 以上の処理により、エッジ画像 E (x，y)に基づいて部分画像の顔らしさ Vsumが生成される。そして、エッジ画像 E (x，y)〜E (x，y)に基づいて部分画像の顔らしさ Vsu [0044] Through the above processing, the facial appearance Vsum of the partial image is generated based on the edge image E (x, y). Then, based on the edge images E (x, y) to E (x, y), the facial image Vsu

2 4 twenty four

m〜Vsumを生成する処理も同様に行う。 The process of generating m to Vsum is performed in the same way.

2 4 twenty four

[0045] 図 8は、図 2のステップ S6の顔判定の処理において、部分画像の顔らしさ Vsum〜 Vsumを求めた後の処理のフローチャートを示す図である。ステップ S6の顔判定処[0045] FIG. 8 shows the face-likeness Vsum of the partial image in the face determination process of step S6 of FIG. It is a figure which shows the flowchart of the process after calculating | requiring Vsum. Face determination process in step S6

4 Four

理では、上記に説明したように、顔らしさ Vsum〜Vsumを段階的に生成し、それらを In fact, as explained above, facial appearance Vsum to Vsum are generated step by step,

1 4 14

積算した評価値が閾値よりも大きければ顔とする。ただし、評価値を閾値と比較する処理を図 8に示すように各段階において行うことにより、明らかに顔ではない画像を早レ、段階で除外して、効率的な処理を行えるようにしてレ、る。 If the integrated evaluation value is larger than the threshold value, the face is determined. However, the process of comparing the evaluation value with the threshold value is performed at each stage as shown in Fig. 8, so that images that are clearly not faces are excluded at an early stage and at a stage so that efficient processing can be performed. RU

[0046] まず、ステップ S 11では、部分画像が顔の画像であるかどうかを判定する評価値を、エッジ画像 E (x，y)の顔らしさ Vsumとする。ステップ S12では、評価値が所定の閾値 thはり大きいかどうかを判定し、この評価値が閾値 thはり大きければステップ S13 に進み、この評価値が閾値 thlより大きくなければ、部分画像は顔の画像でないとして、対象の部分画像の顔判定の処理を終了する。 First, in step S 11, an evaluation value for determining whether or not the partial image is a face image is set as the face likelihood Vsum of the edge image E (x, y). In step S12, it is determined whether or not the evaluation value is larger than a predetermined threshold value th. If this evaluation value is larger than the threshold value th, the process proceeds to step S13. If this evaluation value is not larger than the threshold value thl, the partial image is a face image. If it is not an image, the face determination process for the target partial image is terminated.

[0047] ステップ S13では、評価値をステップ S 11の評価値にエッジ画像 E (x，y)の顔らし [0047] In step S13, the evaluation value is changed to the evaluation value in step S11, and the face appearance of the edge image E (x, y) is displayed.

2 2

さ Vsumを足した値とする。ステップ S 14では、この評価値が所定の閾値 th2より大き Let Vsum be the value added. In step S14, this evaluation value is greater than a predetermined threshold th2.

2 2

いかどうかを判定し、評価値が閾値 th2より大きければステップ S15に進み、この評価値が閾値 th2より大きくなければ、部分画像は顔の画像でないとして、対象の部分画像の顔判定の処理を終了する。 If the evaluation value is greater than the threshold value th2, the process proceeds to step S15.If the evaluation value is not greater than the threshold value th2, the face determination process for the target partial image is performed assuming that the partial image is not a face image. finish.

[0048] ステップ S15では、評価値をステップ S 13の評価値にエッジ画像 E (x，y)の顔らし [0048] In step S15, the evaluation value is changed to the evaluation value in step S13, and the face appearance of the edge image E (x, y) is set.

3 Three

さ Vsumを足した値とする。ステップ S 16では、この評価値が所定の閾値 th3より大き Let Vsum be the value added. In step S16, this evaluation value is greater than a predetermined threshold th3.

3 Three

いかどうかを判定し、評価値が閾値 th3より大きければステップ S17に進み、この評価値が閾値 th3より大きくなければ、部分画像は顔の画像でないとして、対象の部分画像の顔判定の処理を終了する。 If the evaluation value is greater than the threshold th3, the process proceeds to step S17.If the evaluation value is not greater than the threshold th3, the partial image is determined not to be a face image, and the face determination process for the target partial image is performed. finish.

[0049] ステップ S17では、評価値をステップ S 15の評価値にエッジ画像 E (x，y)の顔らし [0049] In step S17, the evaluation value is changed to the evaluation value in step S15, and the facial appearance of the edge image E (x, y) is displayed.

4 Four

さ Vsumを足した値とする。ステップ S 18では、この評価値が所定の閾値 th4より大き Let Vsum be the value added. In step S18, this evaluation value is greater than a predetermined threshold th4.

4 Four

いかどうかを判定する。ステップ S 18において、評価値が閾値 th4より大きければ、最終的にこの部分画像は顔の画像であると判定する。この評価値が閾値 th4より大きくなければ、部分画像は顔の画像でないとして、対象の部分画像の顔判定の処理を終了する。 Judge whether or not. If the evaluation value is larger than the threshold th4 in step S18, it is finally determined that the partial image is a face image. If this evaluation value is not greater than the threshold th4, the partial image is not a face image, and the face determination process for the target partial image is terminated.

[0050] 以上説明した部分画像の顔判定処理を、各縮小画像において、 1ビットずつずらした各部分画像についてすべて行い、顔の画像と判定できる部分画像をすベて抽出し、ステップ S 7に進む。 [0050] The partial image face determination process described above is performed for each partial image shifted by 1 bit in each reduced image, and all partial images that can be determined as face images are extracted. Proceed to step S7.

[0051] ステップ S 7では、ステップ 6によりある部分画像が顔であると判定された場合には、その部分画像の入力画像に対する顔の大きさ Sと座標 (X, Y)を出力する。 S , X, Y は、縮小画像における顔のサイズ S ' = 19と、顔とされた領域の座標 (Χ' , Υ' )と縮小倍率 κとを用いて、次の式で与えられる。 In step S 7, when it is determined in step 6 that a partial image is a face, the face size S and coordinates (X, Y) for the input image of the partial image are output. S 1, X, and Y are given by the following expression using the size S ′ = 19 of the face in the reduced image, the coordinates (Χ ′, Υ ′) of the face area and the reduction magnification κ.

S = S ' / κ S = S '/ κ

X = X' / κ X = X '/ κ

Υ = Υ' / κ Υ = Υ '/ κ

[0052] 以上のようにして、入力画像に顔の画像がある場合は、その顔の画像の位置と大きさが検出されて出力される。 As described above, when there is a face image in the input image, the position and size of the face image are detected and output.

[0053] <統計処理〉 [0053] <Statistical processing>

次に、前述した統計処理について説明する。すなわち、画素位置 (x，y)のエッジ Ε Next, the statistical processing described above will be described. That is, the edge Ε of pixel position (x, y)

(x，y)が Eであるときのその画素の顔らしさ L (E)を求める方法を説明する。図 9は A method for determining the facial appearance L (E) of the pixel when (x, y) is E will be described. Figure 9

1 (x,y) 1 (x, y)

、この顔らしさ L (E)を求める処理のフローチャートを示す図である。この処理は、 FIG. 10 is a diagram showing a flowchart of processing for obtaining the facial appearance L (E). This process is

1 (x,y) 1 (x, y)

パーソナルコンピュータ 1におレ、て実行される。 It is executed on the personal computer 1.

[0054] ステップ S 21では、数百人以上の顔の画像を取得する。すなわち、数百人以上の顔をデジタルカメラ等で撮影 (撮像)し、その画像 (画像データ)を取得する。取得する画像は、図 2のステップ S 1で入力する画像と同様な色成分で構成された画像である。ステップ S 22では、顔が撮影されている画像を、顔領域の大きさが 19 X 19画素になるように変倍して、顔領域を切り出した部分画像を顔画像サンプル群とする。 In step S 21, images of several hundred or more faces are acquired. That is, several hundred or more faces are photographed (captured) with a digital camera or the like, and the images (image data) are acquired. The acquired image is an image composed of the same color components as the image input in step S1 of FIG. In step S 22, the image of the face imaged is scaled so that the size of the face area becomes 19 × 19 pixels, and the partial images obtained by cutting out the face area are set as face image sample groups.

[0055] ステップ S 23では、 19 X 19画素の非顔画像サンプル群を、数百パターン以上取得する。これは、デジタルカメラで撮影した顔以外の画像から適宜抽出して非顔画像サンプル群とする。顔が写っている画像から、顔の領域を避けて抽出するようにしてもよい。この場合は、モニタに写された画像から、ユーザが適宜非顔画像の領域を指定すればよい。 [0055] In step S23, several hundred patterns of non-face image sample groups of 19 X 19 pixels are acquired. This is appropriately extracted from images other than the face photographed with a digital camera and made into a non-face image sample group. It is also possible to extract from an image showing a face while avoiding the face area. In this case, the user may appropriately designate the non-face image area from the image captured on the monitor.

[0056] ステップ S 24では、顔画像サンプル群からエッジ成分を抽出して、顔エッジ画像サンプル群を生成する。この処理は、顔検出処理においてエッジ画像 E (x，y)を生成する処理と同様に行う。ステップ S 25では、非顔画像サンプル群からエッジ成分を抽出して、非顔エッジ画像サンプル群を生成する。この処理も、顔検出処理においてェッジ画像 E (x，y)を生成する処理と同様に行う。 [0056] In step S24, an edge component is extracted from the face image sample group to generate a face edge image sample group. This process is the same as the process for generating the edge image E (x, y) in the face detection process. In step S25, edge components are extracted from the non-face image sample group. And a non-face edge image sample group is generated. This process is also performed in the same manner as the process for generating the edge image E (x, y) in the face detection process.

[0057] ステップ S26では、顔エッジ画像サンプル群につ!/、て、（x，y)のエッジが Eとなる頻度 P (x，y，E)を求める。すなわち、画素（X, y)の値が Eとなる画像力^、くつあるかを顔 [0057] In step S26, the frequency P (x, y, E) at which the edge of (x, y) becomes E is obtained for the face edge image sample group! In other words, the image power with a pixel (X, y) value of E ^

カウントする。ステップ S27では、非顔エッジ画像サンプル群について、同様に、（x，y Count. In step S27, for the non-face edge image sample group, (x, y

)のエッジが Eとなる頻度 P (x，y，E)を求める。 The frequency P (x, y, E) at which the edge of) becomes E is obtained.

非顔 Non-face

[0058] ステップ S28では、画素位置（x，y)のエッジ E (x,y)が Eであるときのその画素の顔らしさ L (E)を、次の式によって算出する。 In step S28, the facial appearance L (E) of the pixel when the edge E (x, y) at the pixel position (x, y) is E is calculated by the following equation.

1 (x,y) 1 (x, y)

L (E) =log{ (P (χ,γ,Ε) + 8 ) / (Ρ (χ,γ,Ε) + ε ) } L (E) = log {(P (χ, γ, Ε) + 8) / (Ρ (χ, γ, Ε) + ε)}

1 (x,y) 顔 1 非顔 2 1 (x, y) face 1 nonface 2

ここで、 ε と ε は所定の定数であり、対数の発散や過学習を抑制するために導入 Where ε and ε are predetermined constants, introduced to suppress logarithmic divergence and overlearning.

1 2 1 2

している。 ε の値は P (x，y，E)の平均的な値の 1000分の 1程度に設定すればよく is doing. The value of ε should be set to about 1/1000 of the average value of P (x, y, E).

1 顔 1 face

、 ε の値は ε の値の数十倍に設定すればよい。 The value of ε should be set to several tens of times the value of ε.

2 1 twenty one

[0059] 上記し（Ε)を求める式において、 log{ (P (χ,γ,Ε) + ε ) }は、単調増加関数 [0059] In the above equation (Ε), log {(P (χ, γ, Ε) + ε)} is a monotonically increasing function.

1 (x,y) 顔 1 1 (x, y) face 1

であり、 log{ l/ (P (χ,γ,Ε) + ε ) }は、単調減少関数である。すなわち、顔らしさ L 非顔 2 And log {l / (P (χ, γ, Ε) + ε)} is a monotonically decreasing function. That is, facial appearance L Non-face 2

(Ε)は、画素位置（x，y)のエッジ E (x，y)が Eである顔画像サンプルの分布が増 (Ε) shows an increase in the distribution of face image samples whose edge E (x, y) is E at the pixel position (x, y).

1 vx.y) 1 1 vx.y) 1

加していく方向にその値は単調増加し、画素位置 (x，y)のエッジ E (x，y)が Eである非顔画像サンプルの分布が増加していく方向にその値は単調減少していく関数である。なお、画素位置（x，y)のエッジ E (x，y)が Eである顔画像サンプルの分布、および、画素位置 (x，y)のエッジ E (x，y)が Eである非顔画像サンプルの分布は、通常正規分布している。 The value increases monotonously in the direction of addition, and the value decreases monotonously in the direction of increase in the distribution of non-face image samples whose edge E (x, y) is E at the pixel position (x, y). It is a function that does. The distribution of face image samples whose edge E (x, y) is E at the pixel position (x, y), and the non-face whose edge E (x, y) is E at the pixel position (x, y) The distribution of image samples is usually normal.

[0060] エッジ画像 E (x，y)〜E (x，y)を顔らしさに変換するルックアップテーブル L (E [0060] A look-up table L (E that converts the edge images E (x, y) to E (x, y) into facial appearance

2 4 2 (x,y) 2 4 2 (x, y)

)〜L (E)を生成するには、上記ステップ S24、ステップ S25のエッジ成分抽出処) To L (E) to generate the edge component extraction process in steps S24 and S25 above.

4 (x,y) 4 (x, y)

理を、顔検出処理におけるエッジ画像 E (x，y)〜E (x，y)を生成する処理と同様に In the same way as the processing for generating edge images E (x, y) to E (x, y) in face detection processing

2 4 twenty four

すればよい。 do it.

[0061] 以上説明した第 1の実施の形態の処理を行うと、次のような効果を奏する。 When the processing of the first embodiment described above is performed, the following effects are obtained.

(1)顔画像の目、鼻、口などの位置は周辺に比べて局所的に暗い。従来のエッジ抽出方法では、エッジ構造が局所的に暗い構造である力、、局所的に明るい構造である力、、あるいはそれ以外の構造である力、を識別することはできな力た。しかし、上記のように凹部構造のエッジを検出し、エッジ画像である凹部画像を生成することにより顔画像の局所的に暗い構造である目、鼻、口などを適切に抽出することができる。その結果、顔の画像を正確に判定することができる。 (1) The positions of eyes, nose, mouth, etc. in the face image are locally darker than the surroundings. In the conventional edge extraction method, it was impossible to distinguish between a force having a locally dark structure, a force having a locally bright structure, or a force having another structure. But above Thus, by detecting the edge of the concave structure and generating a concave image that is an edge image, it is possible to appropriately extract the eyes, nose, mouth, and the like that are locally dark structures of the face image. As a result, the face image can be accurately determined.

[0062] (2)輝度凹部画像によると、目鼻口などの局所的に暗い箇所に特に良く反応する。 [0062] (2) According to the luminance concave image, it reacts particularly well to locally dark places such as the eyes and nose and mouth.

実際、輝度凹部画像は、目鼻口の位置に鋭いピークを持つ。従って、このような輝度凹部画像を解析することにより、顔を高精度に検出することができる。本実施の形態では、輝度凹部画像だけを用いるのではなぐ従来の方法で作成したエッジ画像も合わせて用いるようにしてレ、るので、さらにより精度の高!/、顔の判定を可能としてレ、る In fact, the luminance concave image has a sharp peak at the position of the eyes, nose and mouth. Therefore, the face can be detected with high accuracy by analyzing such a luminance concave image. In this embodiment, it is possible to use an edge image created by a conventional method in addition to using only a luminance concave image, so that it is possible to determine the face with even higher accuracy! /. RU

[0063] (3)上記エッジエッジをガンマ変換した理由は、エッジ量を適切な特徴量 Eに変換するためである。画像解析において、ほとんどエッジがない箇所での微妙なエツジ量の違いは、大きなエッジがある箇所での多少のエッジ量の違!/、よりも大きな意味を持つ。エッジ量 ΕΊこ対してガンマ変換を施すことにより、ほとんどエッジがない箇所でのエッジ量の違いは特徴量 Eの大きな違いに変換され、大きなエッジがある箇所でのエッジ量の違いは特徴量 Eの小さな違いに変換される。これにより、エッジ量の違いが画像の構造の違いに一致するようになる。この結果、顔判定の精度も高くなる。 [0063] (3) The reason why the edge edge is gamma-converted is to convert the edge amount into an appropriate feature amount E. In image analysis, a subtle difference in the edge amount in a place where there is almost no edge has a larger meaning than a slight difference in edge amount in a place where there is a large edge. By applying gamma conversion to the edge amount, the difference in the edge amount at the point where there is almost no edge is converted into a large difference in the feature amount E, and the difference in the edge amount at the point where there is a large edge is the feature amount E. Is translated into a small difference. As a result, the difference in edge amount matches the difference in image structure. As a result, the accuracy of face determination is increased.

[0064] (4)上記実施の形態の図 4から明らかなように、輝度が凹んでいる場合のみ輝度凹部画像が正の値を持つことがわかる。従って、本実施の形態では、輝度凹部画像 E' の負の値を 0にクリッピングするようにした。これにより、輝度の凹みだけに反応するェッジ画像 E (x y)が生成され、エッジ画像 Eを使用する処理がしゃすくなる。 (4) As is clear from FIG. 4 of the above embodiment, it can be seen that the luminance concave image has a positive value only when the luminance is concave. Therefore, in this embodiment, the negative value of the luminance recess image E ′ is clipped to 0. As a result, an edge image E (xy) that reacts only to the luminance depression is generated, and the processing using the edge image E becomes frustrating.

[0065] (5)エッジ画像の画素値をルックアップテーブルを用いて顔らしさに変換して積算するという単純で高速な処理により、顔の画像を検出することができる。また、エッジ画像を判定することにより、画像を撮影する際の照明条件の影響を抑制する効果がある [0065] (5) A face image can be detected by a simple and high-speed process in which pixel values of an edge image are converted into facial appearance using a lookup table and integrated. Also, by determining the edge image, there is an effect of suppressing the influence of the lighting conditions when shooting the image.

[0066] 第 2の実施の形態 [0066] Second Embodiment

第 1の実施の形態では、輝度凹部画像を生成し、顔の目鼻口などの局所的に暗い箇所を適切に判断する例を説明した。しかし、歯を見せて笑っている口や、光が当たつて光っている頰ゃ鼻では、輝度が周囲に比べて局所的に明るくなつている。第 2の実施の形態では、このような顔の局所的に明るい箇所も適切に検出することにより、顔の画像を第 1の実施の形態よりより一層正確に検出する例を説明する。 In the first embodiment, an example has been described in which a luminance concave image is generated and a locally dark spot such as the eyes and nose of the face is appropriately determined. However, the brightness of the mouth laughing while showing teeth and the nose shining with light is locally brighter than the surroundings. Second In the embodiment, an example will be described in which a face image is detected more accurately than in the first embodiment by appropriately detecting a locally bright portion of the face.

[0067] 第 2の実施の形態は、第 1の実施の形態と同様に、パーソナルコンピュータ 1で実現される。従って、第 2の実施の形態の画像処理装置の構成は、第 1の実施の形態の図 1を参照することとする。また、パーソナルコンピュータ 1が実行する画像処理プログラムは、第 1の実施の形態の図 2のフローチャートと処理の流れとしては同様であるので、図 2を参照しながら以下説明をする。 [0067] The second embodiment is realized by the personal computer 1 as in the first embodiment. Therefore, for the configuration of the image processing apparatus of the second embodiment, refer to FIG. 1 of the first embodiment. Further, the image processing program executed by the personal computer 1 is the same as the flowchart of FIG. 2 of the first embodiment, and the processing flow will be described below with reference to FIG.

[0068] ステップ S1からステップ S3までは、第 1の実施の形態と同様であるので説明を省略する。 [0068] Steps S1 to S3 are the same as those in the first embodiment, and thus description thereof is omitted.

[0069] ステップ S4では、縮小されたそれぞれの輝度画像 Y (x，y)力も 6種類のエッジ画像 E (x，y)〜E (x，y)を生成する。縦方向に平滑化した画像 Y (x，y)と横方向に平滑 [0069] In step S4, each of the reduced luminance images Y (x, y) force also generates six types of edge images E (x, y) to E (x, y). Image Y (x, y) smoothed in the vertical direction and smoothed in the horizontal direction

1 6 LV 1 6 LV

化した画像 Y (x，y)の生成、および、エッジ画像 E (x，y)〜E (x，y)の生成は第 1 Generation of image Y (x, y) and edge images E (x, y) to E (x, y)

LH 1 4 LH 1 4

の実施の形態と同様であるのでその説明を省略し、以下、エッジ画像 E (x，y)と E (X Therefore, the description thereof will be omitted, and the edge images E (x, y) and E (X

5 6 5 6

，y)の生成について説明をする。 , Y) will be described.

[0070] まず、横方向を平滑化した画像 Y (x，y)を使用して、以下の式より、縦方向のエツ [0070] First, using the image Y (x, y) smoothed in the horizontal direction, the vertical direction

LH LH

ジ画像 E (x，y)を生成する。 Di-image E (x, y) is generated.

5 Five

E ' (x,y) =Max (Y (x，y— 1)， Y (x，y+ 2) ) E '(x, y) = Max (Y (x, y— 1), Y (x, y + 2))

5 LH LH 5 LH LH

-Max (Y (x,y) , Y (x,y+ l ) ) -Max (Y (x, y), Y (x, y + l))

LH LH LH LH

E (x，y) = γ (E ' (x，y) ) E (x, y) = γ (E '(x, y))

5 5 5 5

[0071] 次に、縦方向を平滑化した画像 Y (x，y)を使用して、以下の式より、横方向のエツ [0071] Next, using the image Y (x, y) smoothed in the vertical direction, the horizontal direction

LV LV

ジ画像 E (x，y)を生成する _c _C to generate the image E (x, y)

6 6

E ' (x，y)二 Max (Y (x— 1，y)， Y (x+ 2,y) ) E '(x, y) 2 Max (Y (x—1, y), Y (x + 2, y))

6 LV LV 6 LV LV

-Max (Y (x，y)， Y (x+ l,y) ) -Max (Y (x, y), Y (x + l, y))

LV LV LV LV

E (x，y) = γ ( E ' (x，y) ) E (x, y) = γ (E '(x, y))

6 6 6 6

[0072] ここで、 Max()は、 0の中の最大値を戻す関数である。また、 7 (E)は、第 1の実施の形態と同様の関数である。また、クリッピング処理も第 1の実施の形態と同様に行う。なお、以上で生成した E (x，y)を縦方向輝度凸部画像、 E (x，y)を横方向輝度凸画像と言つ c [0073] 上記エッジ画像 E (x，y)と E (x，y)の生成について、第 1の実施の形態の図 3を参 Here, Max () is a function that returns the maximum value among 0. 7 (E) is a function similar to that of the first embodiment. Clipping processing is also performed in the same way as in the first embodiment. The E (x, y) generated above is called the vertical luminance convex image, and E (x, y) is called the horizontal luminance convex image. [0073] For generation of the edge images E (x, y) and E (x, y), refer to FIG. 3 of the first embodiment.

5 6 5 6

照して説明する。上記 E ' (x,y)は、輝度画像 Y (x,y)面において、縦方向 4画素 Y This will be explained. E '(x, y) above is 4 pixels Y in the vertical direction on the luminance image Y (x, y) plane.

5 LH 5 LH

(x，y— 1)、 Y (x，y)、 Y (x，y+ 1)、 Y (x，y+ 2)のうち、対象画素（x， y)を基 Of (x, y— 1), Y (x, y), Y (x, y + 1), Y (x, y + 2), the target pixel (x, y)

LH LH LH LH LH LH LH LH

準に、外側 2画素 Y (x，y— 1)、 Y (x，y+ 2)の最大値と内側 2画素 Υ (x，y)、 Υ Similarly, the maximum value of the outer two pixels Y (x, y— 1) and Y (x, y + 2) and the inner two pixels Υ (x, y), Υ

LH LH LH L LH LH LH L

(x，y+ 1)の最大の差を求めて!/、る。 Find the maximum difference of (x, y + 1)!

H H

[0074] E ' (x，y)の値が正の値を示すことは、対象画素（x, y)近辺の値が、縦方向周辺画 [0074] The value of E ′ (x, y) indicates a positive value because the value near the target pixel (x, y)

5 Five

素の値より大きい、すなわち画素値が縦方向の周辺よりでっぱつていることを示す。従って、このようにして生成した E (x，y)を縦方向輝度凸部画像と言う。 It is larger than the prime value, that is, the pixel value is more prominent than the periphery in the vertical direction. Therefore, E (x, y) generated in this way is referred to as a vertical luminance convex image.

5 Five

[0075] 上記 E ' ( ）ぉょび£ (x，y)は、横方向のエッジ画像を生成するためのものである [0075] The above E '() selection (x, y) is for generating a lateral edge image.

6 6 6 6

。上記 E ' ( )ぉょび£ (x，y)に対して、縦と横をひっくり返して考え、後は同様に . Think about E '() selection (x, y) by turning it upside down and the same

5 5 5 5

演算するものである。従って、このようにして生成された E (x，y)を横方向輝度凸部画 It is to calculate. Therefore, E (x, y) generated in this way is used as the horizontal luminance convex image.

6 6

像と言う。 Say a statue.

[0076] ステップ S5では、縮小した画像の 1画素おきに 19 X 19画素の顔判定対象領域を設定し、その領域におけるエッジ画像 E (x，y)〜E (x，y)の部分画像を出力する。こ [0076] In step S5, a 19 x 19 pixel face determination target area is set for every other pixel of the reduced image, and partial images of edge images E (x, y) to E (x, y) in that area are set. Output. This

1 6 1 6

れをすべての縮小画像におレ、て行う。 Do this for all reduced images.

[0077] ステップ S6では、ステップ 5で出力したエッジ画像の各部分画像に対して、この領域が顔の画像であるかどうか第 1の実施の形態と同様に判定する。エッジ画像 E (X, In step S 6, it is determined for each partial image of the edge image output in step 5 whether or not this area is a face image, as in the first embodiment. Edge image E (X,

5 y)〜E (x，y)に基づいて部分画像の顔らしさ Vsum〜Vsumを生成する処理も、エツ 5) The process of generating the facial appearance Vsum to Vsum of the partial image based on y) to E (x, y)

6 5 6 6 5 6

ジ画像 E (x，y)〜E (x，y)に基づいて部分画像の顔らしさ Vsum〜Vsumを生成す The facial appearance Vsum to Vsum of the partial image is generated based on the images E (x, y) to E (x, y)

1 4 1 4 る処理と同様な考えのもとで行う。すなわち、凹部の概念を凸部の概念に置き換えて行えばよい。 This is based on the same idea as 1 4 1 4 processing. That is, the concept of the concave portion may be replaced with the concept of the convex portion.

[0078] 以上説明した部分画像の顔判定処理を、各縮小画像において、 1ビットずつずらした各部分画像についてすべて行い、顔の画像と判定できる部分画像をすベて抽出し、ステップ S 7に進む。 The partial image face determination process described above is performed for each partial image shifted by 1 bit in each reduced image, and all partial images that can be determined as face images are extracted, and the process proceeds to step S 7. move on.

[0079] ステップ S7では、ステップ 6によりある部分画像が顔であると判定された場合には、その部分画像の入力画像に対する顔の大きさ Sと座標 (X, Y)を、第 1の実施の形態と同様に出力する。 S, X, Yは、縮小画像における顔のサイズ S' = 19と、顔とされた領域の座標（X Y と縮小倍率 κとを用いて、次の式で与えられる。 S = S' / κ [0079] In step S7, when it is determined in step 6 that a partial image is a face, the size S and coordinates (X, Y) of the face with respect to the input image of the partial image are set in the first implementation. Output in the same way as S, X, and Y are given by the following equation using the size of the face S '= 19 in the reduced image and the coordinates of the area that is the face (XY and reduction magnification κ). S = S '/ κ

X = X' / K X = X '/ K

Y = Y' / K Y = Y '/ K

[0080] 以上のようにして、入力画像に顔の画像がある場合は、その顔の画像の位置と大きさが検出されて出力される。 As described above, when a face image is included in the input image, the position and size of the face image are detected and output.

[0081] 以上説明した第 2の実施の形態の処理を行うと、次のような効果を奏する。 When the processing of the second embodiment described above is performed, the following effects are obtained.

(1)歯を見せて笑っている口や、光が当たって光って頰ゃ鼻では、輝度が周囲に比ベて局所的に明るくなつている。本実施の形態によると、そのような局所的に明るい箇所も効果的に検出してエッジ画像を作成する。従って、このようにして作成したエツジ画像を使用して第 1の実施の形態の凹部画像による顔画像の判定と同様に行えば、顔の画像の局所的暗い箇所に加えて、顔の画像の局所的明るい箇所も考慮して顔判定を行うことができる。これにより、第 1の実施の形態よりもさらに高精度に顔を検出できる。 (1) Luminance is brighter locally in the mouth laughing while showing teeth, and in the nose when it shines when exposed to light. According to the present embodiment, such a locally bright spot is effectively detected to create an edge image. Therefore, if the edge image created in this way is used in the same manner as the determination of the face image by the concave image in the first embodiment, in addition to the locally dark portion of the face image, the face image It is possible to perform face determination considering local bright spots. As a result, the face can be detected with higher accuracy than in the first embodiment.

[0082] 第 3の実施の形態 [0082] Third Embodiment

第 2の実施の形態では、輝度凹部画像に加えて輝度凸部画像を生成し、顔の目鼻口などの局所的に暗い箇所に加えて、歯を見せて笑っている口や、光が当たって光つている頰ゃ鼻では、輝度が周囲に比べて局所的に明るくなつている箇所も適切に検出する例を説明した。第 3の実施の形態では、輝度凹部画像と輝度凸部画像の情報を輝度凹凸部画像にまとめて処理を行う例を説明する。 In the second embodiment, a luminance convex image is generated in addition to the luminance concave image, and in addition to a locally dark spot such as the eyes, nose and mouth of the face, a mouth laughing with a tooth or a light hits it. In the case of a nose that is shining, the example in which the location where the brightness is locally brighter than the surroundings is also detected appropriately. In the third embodiment, an example will be described in which information on the luminance concave portion image and the luminance convex portion image is combined into a luminance uneven portion image and processed.

[0083] 第 3の実施の形態は、第 1の実施の形態と同様に、パーソナルコンピュータ 1で実現される。従って、第 3の実施の形態の画像処理装置の構成は、第 1の実施の形態の図 1を参照することとする。また、パーソナルコンピュータ 1が実行する画像処理プログラムは、第 1の実施の形態で使用し、第 2の実施の形態で参照した図 2のフローチャートと、処理の流れとしては同様であるので、同様に図 2を参照しながら以下説明をする。 [0083] The third embodiment is realized by the personal computer 1 as in the first embodiment. Therefore, for the configuration of the image processing apparatus of the third embodiment, refer to FIG. 1 of the first embodiment. The image processing program executed by the personal computer 1 is the same as the flow chart of FIG. 2 used in the first embodiment and referenced in the second embodiment. Similarly, the following explanation will be made with reference to FIG.

[0084] ステップ S1からステップ S3までは、第 2の実施の形態と同様であるので説明を省略する。 [0084] Steps S1 to S3 are the same as those in the second embodiment, and thus description thereof is omitted.

[0085] ステップ S4では、第 2の実施の形態と同様にエッジ画像 E (x，y)〜E (x，y)を生成する。そして、つぎの式に基づき、縦方向輝度凹凸部画像 E₇ (x，y)と横方向輝度凹凸部画像 E (x，y)を生成する。 [0085] In step S4, edge images E (x, y) to E (x, y) are generated as in the second embodiment. To do. Then, based on the following equation, longitudinal luminance unevenness portion image E ₇ (x, y) and transverse luminance concave protrusions image E (x, y) to produce a.

8 8

[数 1] [Number 1]

【数 1】 [Equation 1]

(Α(χ, ) > ₅(χ, の場合） _{(Α (χ,)> 5} (χ, the case of a)

E₅ (x,y) ァ）≤E₅(x, )の場合） E ₅ (x, y) a) If ≤E ₅ (x,))

[数 2] [Equation 2]

【数 2】

[Equation 2]

[0086] ステップ S5では、縮小した画像の 1画素おきに 19 X 19画素の顔判定対象領域を設定し、その領域におけるエッジ画像 E (x，y)，E (x，y)，E (x，y)，E (x，y)の部分画 [0086] In step S5, a face determination target area of 19 X 19 pixels is set every other pixel of the reduced image, and edge images E (x, y), E (x, y), E (x , Y), E (x, y)

1 2 7 8 1 2 7 8

像を出力する。 Output an image.

[0087] ステップ S6では、ステップ 5で出力したエッジ画像 E (x，y)，E (x,y) ,E (x,y) ,Ε (x [0087] In step S6, the edge images E (x, y), E (x, y), E (x, y), Ε (x

1 2 7 8 1 2 7 8

，y)の各部分画像に対して、この領域が顔の画像であるかどうか第 1の実施の形態と同様に判定する。エッジ画像 E (x，y)〜E (x，y)に基づいて部分画像の顔らしさ Vsu , Y), it is determined in the same manner as in the first embodiment whether this area is a face image. The facial appearance Vsu of the partial image based on the edge images E (x, y) to E (x, y)

7 8 7 8

m〜Vsumを生成する処理も、エッジ画像 E (x，y)〜E (x，y)に基づいて部分画像 The process of generating m to Vsum is also a partial image based on the edge images E (x, y) to E (x, y)

7 8 1 4 7 8 1 4

の顔らしさ Vsum〜Vsumを生成する処理と同様な考えのもとで行う。すなわち、凹部 This is based on the same idea as the process of generating Vsum to Vsum. That is, the recess

1 4 14

の概念を凹部と凸部を組み合わせた概念に置き換えて行えばよい。 This concept may be replaced with a concept combining concave and convex portions.

[0088] ステップ S7は、第 1の実施の形態と同様である。以上のようにして、入力画像に顔の画像がある場合は、その顔の画像の位置と大きさが検出されて出力される。 [0088] Step S7 is the same as in the first embodiment. As described above, when a face image is included in the input image, the position and size of the face image are detected and output.

[0089] 以上説明した第 3の実施の形態の処理を行うと、次のような効果を奏する。 When the processing of the third embodiment described above is performed, the following effects are obtained.

(1)輝度凹部画像と輝度凸部画像の情報を輝度凹凸部画像にまとめることにより、判別処理に用いる情報量を省略しつつ、第 2の実施の形態に近い精度で顔を検出でき [0090] 一変形例 (1) By combining the information on the luminance concave portion image and the luminance convex portion image into the luminance uneven portion image, the face can be detected with an accuracy close to that of the second embodiment while omitting the amount of information used for the discrimination process. [0090] A variation

上記実施の形態では、顔の画像の判定において、を生成して処理

In the above embodiment, in the determination of the face image, is generated and processed.

する例を説明した。しかし、ステップ 5で出力したエッジ画像に対して、ニューラルネットワークなどの公知の学習判別処理を施すことにより、この領域が顔の画像であるかを判定するようにしてもよい。 The example to do was explained. However, the edge image output in step 5 may be subjected to a known learning discrimination process such as a neural network to determine whether this region is a face image.

[0091] 検出した顔画像領域におけるエッジ画像に対して、公知の技術を適用して表情判定処理を行っても良い。特に、第 2と第 3の実施の形態では、歯を見せて笑っている顔の口の輝度が局所的に高くなつていることが検出されるので、そのような笑顔を高精度に判定することができる。 A facial expression determination process may be performed by applying a known technique to the edge image in the detected face image area. In particular, in the second and third embodiments, it is detected that the brightness of the mouth of the face that is laughing while showing teeth is locally high, so such a smile is determined with high accuracy. be able to.

[0092] 縦方向輝度凹部画像を生成するエッジ検出フィルタは、次のようなものであっても良い。 [0092] The edge detection filter that generates the vertical luminance concave image may be as follows.

E ' (x,y) =Min (Y (x,y- l) , Y (x,y+ l) )— Y (x,y) E '(x, y) = Min (Y (x, y- l), Y (x, y + l)) — Y (x, y)

1 LH LH LH 1 LH LH LH

すなわち、対象画素とその縦方向隣接画素の 3画素を使用するものでもよい。横方向輝度凹部画像についても同様である。また、輝度凸部画像を生成するエッジ検出フィルタについても同様である。 That is, the target pixel and three pixels adjacent in the vertical direction may be used. The same applies to the horizontal luminance concave image. The same applies to the edge detection filter that generates the brightness convex image.

[0093] 縦方向輝度凹部画像を生成するエッジ検出フィルタは、次のようなものであっても良い。 [0093] The edge detection filter that generates the vertical luminance concave image may be as follows.

E ' (x,y) =Min (Y (x,y- l) , Y (x,y+ 2) ) E '(x, y) = Min (Y (x, y- l), Y (x, y + 2))

1 LH LH 1 LH LH

— (Y (x,y) +Y (x,y+ l) ) / 2 — (Y (x, y) + Y (x, y + l)) / 2

LH LH LH LH

横方向輝度凹部画像についても同様である。また、輝度凸部画像を生成するエッジ検出フィルタについても同様である。 The same applies to the horizontal luminance concave image. The same applies to the edge detection filter that generates the brightness convex image.

[0094] 輝度凹部画像または輝度凸部画像を作るためのフィルタのサイズは、異なる複数のサイズを用い、複数の周波数帯域の凸構造または凹構造を検出しても良い。なお、縮小倍率の異なる複数の輝度画像に対して同じサイズのフィルタを演算することにより、複数の周波数帯域の凸構造または凹構造を検出しても良い。 [0094] A plurality of different sizes may be used as the size of the filter for creating the luminance concave image or luminance convex image, and a convex structure or a concave structure in a plurality of frequency bands may be detected. Note that convex structures or concave structures in a plurality of frequency bands may be detected by calculating filters of the same size for a plurality of luminance images with different reduction magnifications.

[0095] 上記実施の形態では、 E ' (x,y)として、対象画素近辺画素の最小値と周辺画素の最小値の差分を出力する例を説明した。すなわち、対象画素近辺画素の最小値と周辺画素の最小値の差異を差分値の形で出力する例を説明した。しかし、これらの値の比を差異を表す値として出力するようにしてもよ!/、。 In the above embodiment, an example has been described in which the difference between the minimum value of the pixels near the target pixel and the minimum value of the surrounding pixels is output as E ′ (x, y). That is, the minimum value and the circumference of the pixels near the target pixel The example which outputs the difference of the minimum value of a side pixel in the form of the difference value was demonstrated. However, the ratio of these values may be output as a value representing the difference! /.

[0096] 上記実施の形態では、パーソナルコンピュータ 1が、撮影された画像の中から顔画像を検出する画像処理を行う例を説明した。しかし、デジタルスチルカメラなどの撮像装置内で、撮像した画像に対し上記説明した処理を行うようにしてもよい。 [0096] In the above embodiment, the example in which the personal computer 1 performs the image processing for detecting the face image from the photographed image has been described. However, the above-described processing may be performed on the captured image in an imaging apparatus such as a digital still camera.

[0097] 図 10は、このような撮像装置であるデジタルカメラ 100の構成を示す図である。デジタルカメラ 100は、撮影レンズ 102、 CCDなどからなる撮像素子 103、 CPUおよび周辺回路からなる制御装置 104、メモリ 105などから構成される。 FIG. 10 is a diagram showing a configuration of a digital camera 100 that is such an imaging apparatus. The digital camera 100 includes a photographing lens 102, an image sensor 103 including a CCD, a control device 104 including a CPU and peripheral circuits, a memory 105, and the like.

[0098] 撮像素子 103は、被写体 101を撮影レンズ 102を介して撮影（撮像）し、撮影した画像データを制御装置 104へ出力する。制御装置 104は、撮像素子 103で撮影された画像（画像データ）に対して、上記で説明した顔画像を検出する画像処理を行う。そして、制御装置 104は、顔画像の検出結果に基づき撮影した画像に対し、ホワイトバランスの調整やその他の各種の画像処理を行!/、、画像処理後の画像データを適宜メモリ 105に格納する。また、制御装置 104は、顔画像の検出結果を、オートフォ一カス処理などにも利用することができる。なお、制御装置 104が実行する画像処理プログラムは、不図示の ROMに格納されている。 The image sensor 103 captures (captures) the subject 101 via the photographing lens 102 and outputs the captured image data to the control device 104. The control device 104 performs image processing for detecting the face image described above on the image (image data) captured by the image sensor 103. Then, the control device 104 performs white balance adjustment and various other image processing on the image captured based on the detection result of the face image! /, And stores the image data after the image processing in the appropriate memory 105. Store. Further, the control device 104 can also use the detection result of the face image for autofocus processing or the like. The image processing program executed by the control device 104 is stored in a ROM (not shown).

[0099] また、上記説明した処理をビデオカメラにも適用できる。さらに、不審者を監視する監視カメラや、撮影された顔画像に基づいて個人を識別したり、性別や年齢や表情を推定するような装置にも適用できる。すなわち、顔の画像など特定種類の画像を抽出して処理する画像処理装置や撮像装置などのあらゆる装置に本発明を適用すること力 Sできる。 [0099] The processing described above can also be applied to a video camera. Furthermore, it can also be applied to surveillance cameras that monitor suspicious individuals and devices that identify individuals based on captured face images and estimate gender, age, and facial expressions. That is, the present invention can be applied to all devices such as an image processing device and an imaging device that extract and process a specific type of image such as a face image.

[0100] 上記では、種々の実施の形態および変形例を説明したが、本発明はこれらの内容に限定されるものではない。本発明の技術的思想の範囲内で考えられるその他の態様も本発明の範囲内に含まれる。 [0100] While various embodiments and modifications have been described above, the present invention is not limited to these contents. Other forms conceivable within the scope of the technical idea of the present invention are also included in the scope of the present invention.

[0101] 次の優先権基礎出願の開示内容は引用文としてここに組み込まれる。 [0101] The disclosure of the following priority application is hereby incorporated by reference.

日本国特許出願 2006年第 215944号（2006年 8月 8日出願） Japanese patent application No. 215944 (filed Aug. 8, 2006)

Claims

The scope of the claims

[1] An image processing method,

Acquire an image consisting of multiple pixels,

Based on the acquired image, the edge of the concave structure where the pixel value is recessed from the periphery locally is detected,

An image processing method for generating an edge image based on the detected edge of the concave structure.

[2] In the image processing method according to claim 1,

An image processing method for generating the edge image by calculating a non-linear filter for detecting the edge of the concave structure with respect to the acquired image.

[3] In the image processing method according to claim 2,

The non-linear filter is an image processing method for outputting a calculation result based on a difference between a pixel value in a target region and a minimum pixel value in a peripheral region of the target region.

[4] In the image processing method according to claim 3,

The non-linear filter is an image processing method for outputting a calculation result based on a difference between a minimum pixel value in the target region and a minimum pixel value in the peripheral region.

[5] In the image processing method according to claim 4,

When the minimum pixel value in the target area is smaller than the minimum pixel value in the peripheral area of the target area, the value corresponding to the difference is set as the edge pixel value, and the minimum pixel value in the target area is An image processing method for clipping the edge pixel value to zero when the pixel value is larger than the minimum pixel value in the peripheral area of the target area.

[6] An image processing method,

Acquire an image consisting of multiple pixels,

Based on the acquired image, the edge of the convex structure where the pixel value protrudes locally from the periphery is detected,

An image processing method for generating an edge image based on the detected edge of the convex structure.

[7] The image processing method according to claim 6,

An image processing method for generating the edge image by calculating a non-linear filter for detecting the edge of the convex structure on the acquired image.

[8] In the image processing method according to claim 7,

The non-linear filter is an image processing method for outputting a calculation result based on a difference between a pixel value in a target region and a maximum value of pixel values in a peripheral region of the target region.

[9] The image processing method according to claim 8,

The non-linear filter is an image processing method for outputting a calculation result based on a difference between a maximum pixel value in the target region and a maximum pixel value in the peripheral region.

[10] The image processing method according to claim 8,

When the maximum pixel value in the target region is larger than the maximum pixel value in the peripheral region of the target region, the value corresponding to the difference is set as the value of the edge pixel, and the maximum pixel value in the target region is An image processing method in which the edge pixel value is clipped to zero when the pixel value is smaller than the maximum value in the surrounding area of the target area.

[11] In the image processing method according to any one of claims 1 to 10,

Based on the acquired image, generate a luminance image by a luminance component,

An image processing method for generating the edge image using the generated luminance image.

[12] The image processing method according to any one of claims 3 to 5 and 8 to 10,

The image processing method, wherein the target area is a single pixel of only the target pixel or a two-pixel area of the target pixel and its neighboring pixels, and the peripheral area is a two-pixel area located on both outer sides of the target area.

[13] In the image processing method according to any one of claims 2 to 5 and 7 to 10,

An image processing method for calculating the nonlinear filter in at least two directions.

[14] An image processing method comprising:

Acquire an image consisting of multiple pixels,

Based on the acquired image, a concave structure edge where the pixel value is recessed locally from the periphery and a protruding convex structure edge are detected,

An image processing method for generating an edge image of a concave structure based on the detected edge of the concave structure, and generating an edge image of the convex structure based on the detected edge of the convex structure.

[15] An image processing method comprising: Acquire an image consisting of multiple pixels,

An image processing method for generating an edge image including both a concave structure edge and a convex structure edge based on the detected concave structure edge and convex structure edge.

[16] An image processing method comprising:

Acquire an image consisting of multiple pixels,

Based on the acquired image, at least one of the edge of the concave structure where the pixel value is locally recessed from the periphery and the edge of the convex structure where the pixel value protrudes locally from the periphery is detected. ,

An image processing method for generating an edge image based on at least one of the detected edge of the concave structure and the edge of the convex structure.

[17] An image processing method comprising:

Acquire an image consisting of multiple pixels,

Detecting an edge component of the acquired image;

Gamma conversion is performed on the detected edge component,

An image processing method for generating an edge image based on the gamma-converted edge component.

[18] In the image processing method according to any one of claims 1 to 17;

An image processing method for detecting a face image using the generated edge image.

[19] An image processing program for causing a computer to execute the image processing method according to any one of claims 1 to 18.

20. An image processing apparatus equipped with the image processing program according to claim 19.

[21] An imaging device equipped with the image processing program according to [19].