
WO2019155570A1 - Line-of-sight estimation device, line-of-sight estimation method, and recording medium - Google Patents

Line-of-sight estimation device, line-of-sight estimation method, and recording medium

Info

Publication number
WO2019155570A1
Authority
WO
WIPO (PCT)
Prior art keywords
eye
image
line
sight
feature amount
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/JP2018/004370
Other languages
English (en)
Japanese (ja)
Inventor
雄介 森下
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
NEC Corp
Original Assignee
NEC Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by NEC Corp filed Critical NEC Corp
Priority to PCT/JP2018/004370 priority Critical patent/WO2019155570A1/fr
Priority to JP2019570215A priority patent/JP7040539B2/ja
Publication of WO2019155570A1 publication Critical patent/WO2019155570A1/fr
Anticipated expiration legal-status Critical
Priority to JP2022033164A priority patent/JP7255721B2/ja
Ceased legal-status Critical Current

Classifications

    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B 3/00 Apparatus for testing the eyes; Instruments for examining the eyes
    • A61B 3/10 Objective types, i.e. instruments for examining the eyes independent of the patients' perceptions or reactions
    • A61B 3/113 Objective types, i.e. instruments for examining the eyes independent of the patients' perceptions or reactions, for determining or recording eye movement
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/70 Determining position or orientation of objects or cameras

Definitions

  • The present disclosure relates to a gaze estimation apparatus, a gaze estimation method, and a recording medium, and more particularly to a gaze estimation apparatus that estimates the gaze of a person included in a captured image.
  • A person's line of sight (the direction in which the eyes are looking) can be an important clue in analyzing the person's actions and intentions. For example, an object or event that the person is gazing at can be identified from the person's line of sight.
  • Techniques for estimating a person's line of sight have therefore been developed, in particular techniques that estimate the line of sight using an image including a person's face (hereinafter referred to as a "face image").
  • Patent Documents 1 to 3 and Non-Patent Documents 1 and 2 describe techniques for estimating a line of sight based on a face image.
  • Patent Document 1 discloses a feature-based method using a feature point (image feature point) included in a face image.
  • Non-Patent Document 1 discloses a method for estimating a line of sight from a face image including only one eye.
  • Patent Document 2 and Non-Patent Document 2 each disclose an example of "appearance-based gaze estimation".
  • In these techniques, the relationship between a face and a line of sight is learned by deep learning based on a CNN (Convolutional Neural Network) model, using a given face image data set.
  • However, the related techniques described above have a problem in that the accuracy of gaze estimation varies depending on the shape of the person's eyes.
  • For example, the technique disclosed in Patent Document 2 can accurately estimate the line of sight of a person with large eyes, but may fail to estimate the line of sight of a person with small eyes with the same accuracy. That is, with the related techniques, it is difficult to estimate the line of sight with high accuracy regardless of the feature amounts (for example, size and inclination) related to the shape of the eyes.
  • The present invention has been made in view of the above problems, and an object of the present invention is to estimate a person's line of sight with high accuracy regardless of the shape of the person's eyes.
  • A gaze estimation apparatus according to one aspect of the present disclosure includes: image acquisition means for acquiring an image including a person's face; eye detection means for detecting the person's eyes from the image; feature amount calculation means for calculating a feature amount related to the shape of the detected eyes; image conversion means for extracting a partial image including the detected eyes from the image and converting the shape of the partial image so that, in the extracted partial image, at least one feature amount related to the shape of the eyes becomes equal to a feature amount reference; gaze estimation means for estimating the person's line of sight using the converted partial image; and output means for outputting information on the estimated line of sight.
  • A gaze estimation method according to one aspect of the present disclosure includes: acquiring an image including a person's face; detecting the person's eyes from the image; calculating a feature amount related to the shape of the detected eyes; extracting a partial image including the detected eyes from the image; converting the shape of the partial image so that, in the extracted partial image, at least one feature amount related to the shape of the eyes becomes equal to a feature amount reference; estimating the person's line of sight using the converted partial image; and outputting information on the estimated line of sight.
  • A non-transitory recording medium according to one aspect of the present disclosure records a program that causes a computer to execute: acquiring an image including a person's face; detecting the person's eyes from the image; calculating a feature amount related to the shape of the detected eyes; extracting a partial image including the detected eyes from the image; converting the shape of the partial image so that, in the extracted partial image, at least one feature amount related to the shape of the eyes becomes equal to a feature amount reference; estimating the person's line of sight using the converted partial image; and outputting information on the estimated line of sight.
  • According to the present disclosure, the line of sight of a person can be estimated with high accuracy regardless of the shape of the person's eyes.
  • FIG. 1 is a block diagram illustrating a configuration of a line-of-sight estimation apparatus 100 according to the first embodiment.
  • The gaze estimation apparatus 100 is an apparatus for estimating the line of sight of a person included in an image. As shown in FIG. 1, the gaze estimation apparatus 100 includes at least an image acquisition unit 110, an eye detection unit 120, a feature amount calculation unit 130, a normalization unit 140, a gaze estimation unit 150, and an output unit 160.
  • The line-of-sight estimation apparatus 100 may include other components not shown.
  • The image acquisition unit 110 acquires image data including a person's face.
  • For example, the image acquisition unit 110 may acquire image data transmitted from another device.
  • The other device here may be an imaging device such as a monitoring camera or the built-in camera of an electronic device, or may be a storage device such as a database in which image data is recorded.
  • The image acquisition unit 110 outputs the acquired image data to the eye detection unit 120.
  • The image data acquired by the image acquisition unit 110 is expressed by the luminance values of a plurality of pixels.
  • The number of pixels, the number of colors (color components), the number of gradations, and the like of the image data are not limited to specific values.
  • The image acquisition unit 110 may accept only image data having a predetermined number of pixels and colors, or may place no limit on the number of pixels and colors of the image data.
  • The image data may be a still image or a moving image. For convenience of explanation, the image data acquired by the image acquisition unit 110 is hereinafter referred to as the "input image".
  • In the following description, it is assumed that each input image includes the face of only one person.
  • When an input image includes a plurality of faces, the image acquisition unit 110 may divide it into a plurality of input images each including only one face.
  • The image acquisition unit 110 generates a face image from the acquired input image and supplies the generated face image to the eye detection unit 120 and the normalization unit 140.
  • A face image refers to an image including part or all of a person's face. In other words, a face image is obtained by removing elements other than the person's face (the background, objects, the person's body, and the like) from the input image.
  • FIG. 2 shows a face image 400 that the image acquisition unit 110 generates from an input image.
  • The face image 400 shown in FIG. 2 includes facial parts (eyebrows, nose, and mouth) in addition to the eyes. However, it is sufficient that the face image 400 includes at least one eye, because in the present embodiment only an eye region image (described later) extracted from the face image 400 is used.
  • When the input image is a moving image, it is composed of a plurality of images (frames).
  • In this case, the image acquisition unit 110 may extract only one or more frames including a human face from the moving image and supply the extracted frames, as face images, to the eye detection unit 120 and the normalization unit 140. With this configuration, the line-of-sight estimation apparatus 100 can improve the efficiency of the line-of-sight estimation process (described later).
  • The image acquisition unit 110 may supply the input image as-is to the eye detection unit 120 and the normalization unit 140 as a face image, or may process the input image and supply the processed image to the eye detection unit 120 and the normalization unit 140 as a face image.
  • For example, the image acquisition unit 110 may detect a human face from the input image, extract the part of the input image including the detected face as a face image, and supply the extracted face image to the eye detection unit 120 and the normalization unit 140.
  • The face image may be a monochrome image or a color image. That is, the face image may be composed of pixels including a plurality of color components, such as R (red), G (green), and B (blue).
  • The image acquisition unit 110 may convert the face image so that the number of colors or the number of gradations becomes a predetermined value, and supply the converted face image to the eye detection unit 120 and the normalization unit 140.
  • For example, the image acquisition unit 110 may convert a face image that is a color image into a single-color face image represented by a single-component grayscale, because the color information (saturation and hue) included in the face image is not used in the present embodiment.
  • The face image converted in this way is also simply referred to as a "face image" hereinafter.
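  • As an illustration of such a conversion, a minimal sketch using OpenCV is shown below (an assumption; the patent does not name any library, input file, or conversion method):

```python
import cv2

# Minimal sketch, assuming OpenCV; the patent does not prescribe a library.
def to_single_channel(face_image_bgr):
    """Reduce a color face image to single-component grayscale, since
    saturation and hue are not used in this embodiment."""
    return cv2.cvtColor(face_image_bgr, cv2.COLOR_BGR2GRAY)

face_image = cv2.imread("face.jpg")        # hypothetical input file
gray_face = to_single_channel(face_image)  # shape (H, W), one channel
```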
  • The eye detection unit 120 detects eyes from the face image 400 (see FIG. 2) supplied from the image acquisition unit 110.
  • Specifically, the eye detection unit 120 detects the center of the pupil of each eye detected from the face image 400 and a plurality of points on the outline of the eye.
  • The center of the pupil and the plurality of points on the outline of the eye detected by the eye detection unit 120 are hereinafter referred to as eye feature points.
  • For example, the eye detection unit 120 specifies four points, namely the inner eye angle, the outer eye angle, the center of the upper eyelid, and the center of the lower eyelid, as eye feature points in addition to the center of the pupil.
  • The inner eye angle (the so-called inner corner of the eye) is the point on the inner side of the face among the two points where the upper and lower eyelids meet at the ends of the outline of the eye.
  • The outer eye angle (the so-called outer corner of the eye) is the point on the outer side of the face among the two points where the upper and lower eyelids meet.
  • The center of the upper eyelid is the center, in the lateral direction, of the boundary between the upper eyelid and the eyeball.
  • The center of the lower eyelid is the center, in the lateral direction, of the boundary between the lower eyelid and the eyeball.
  • The eye detection unit 120 may use any known method, such as the method described in Patent Document 3, to detect the eye feature points.
  • Alternatively, the eye detection unit 120 may use general machine learning such as supervised learning. In this configuration, the eye detection unit 120 learns the features and positions of the pupil and eye contours in the faces of a plurality of persons, using given face images.
  • The eye detection unit 120 outputs information about the eye feature points detected from the face image 400 to the feature amount calculation unit 130.
  • Based on the information about the eye feature points detected by the eye detection unit 120, the feature amount calculation unit 130 calculates an index indicating a feature related to the shape of the eyes included in the face image 400 (see FIG. 2) (hereinafter referred to as a "feature amount related to the eye shape" or simply a "feature amount").
  • FIG. 3 is an enlarged view of parts of the face image 400 shown in FIG. 2.
  • The face image 410 shown in FIG. 3 includes the left eye of the face image 400 shown in FIG. 2, and the face image 420 includes the right eye of the face image 400 shown in FIG. 2.
  • In FIG. 3, the point I is the inner eye angle, the point O is the outer eye angle, the point H is the center of the upper eyelid, the point L is the center of the lower eyelid, and the point P is the center of the pupil.
  • The feature amount calculation unit 130 may use the eye height in the face images 410 and 420 as the feature amount related to the eye shape.
  • The eye height y is the distance between the center H of the upper eyelid and the center L of the lower eyelid.
  • The center H of the upper eyelid and the center L of the lower eyelid are detected by the eye detection unit 120. Therefore, the feature amount calculation unit 130 can calculate the eye height y using the information on the eye feature points (including the points H and L) acquired from the eye detection unit 120.
  • An eye with a large (small) eye height y is what is generally called a large (narrow) eye.
  • The feature amount calculation unit 130 may use the average of the heights of the left and right eyes as the feature amount, or may use the heights of both eyes as feature amounts.
  • Alternatively, the feature amount calculation unit 130 may use the eye width x in the face image 410 as the feature amount related to the eye shape.
  • The eye width x is the distance between the inner eye angle I and the outer eye angle O.
  • The inner eye angle I and the outer eye angle O are detected by the eye detection unit 120. Therefore, the feature amount calculation unit 130 can calculate the eye width x using the information on the eye feature points (including the points I and O) acquired from the eye detection unit 120.
  • The feature amount calculation unit 130 may use the average of the widths of the left and right eyes as the feature amount, or may use the widths of both eyes as feature amounts.
  • Alternatively, the feature amount calculation unit 130 may use the eye inclination θ shown in the face image 420 of FIG. 3 as the feature amount related to the eye shape.
  • In this case, the feature amount calculation unit 130 first calculates a first line segment that passes through the centers P of the left and right pupils, and a second line segment that passes through the inner eye angle I and the outer eye angle O.
  • Then, the feature amount calculation unit 130 calculates the inclination θ of the second line segment with respect to the first line segment.
  • The feature amount calculation unit 130 may use the average of the inclinations of the left and right eyes as the feature amount, or may use the inclinations of both eyes as feature amounts.
  • Alternatively, the feature amount calculation unit 130 may use the eye outline detected by the eye detection unit 120 (that is, the boundary between the upper and lower eyelids and the eyeball) as the feature amount.
  • The feature amount calculation unit 130 may also calculate two or more of the feature amounts described above.
  • For example, the feature amount calculation unit 130 may use both the eye height and the eye width as feature amounts related to the eye shape.
  • The feature amounts related to the eye shape are not limited to the examples described above.
  • The feature amount calculation unit 130 may calculate another element related to the shape of the eyes as one of the feature amounts. The sketch below illustrates these calculations.
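  • The following sketch computes the eye height y, the eye width x, the inclination θ, and the inter-eye distance w from the five eye feature points (an illustration only; the function name and the landmark format are assumptions, not the patent's implementation):

```python
import numpy as np

def eye_feature_amounts(I, O, H, L, P_left, P_right):
    """I: inner eye angle, O: outer eye angle, H: upper-eyelid center,
    L: lower-eyelid center, P_left/P_right: left and right pupil centers.
    Each point is an (x, y) pair in image coordinates."""
    I, O, H, L = map(np.asarray, (I, O, H, L))
    P_left, P_right = np.asarray(P_left), np.asarray(P_right)

    height_y = np.linalg.norm(H - L)   # eye height y (distance H-L)
    width_x = np.linalg.norm(O - I)    # eye width x (distance I-O)

    # Inclination theta: angle of the second line segment (through I and O)
    # relative to the first line segment (through the pupil centers).
    dx1, dy1 = P_right - P_left
    dx2, dy2 = O - I
    theta = np.degrees(np.arctan2(dy2, dx2) - np.arctan2(dy1, dx1))

    inter_eye_w = np.linalg.norm(P_right - P_left)  # inter-eye distance w
    return height_y, width_x, theta, inter_eye_w
```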
  • The normalization unit 140 acquires the face image 400 (see FIG. 2) from the image acquisition unit 110. Then, the normalization unit 140 generates an eye region image (a normalized face image) by performing normalization processing on the face image 400, using the feature amount information acquired from the feature amount calculation unit 130.
  • Specifically, the normalization unit 140 first determines four reference coordinates that define the size of the eye area image on the face image 400.
  • To this end, the normalization unit 140 calculates the distance w between the centers P of the left and right pupils on the face image 400 (hereinafter referred to as the "inter-eye distance"). Since the centers P of the left and right pupils are detected by the eye detection unit 120, the normalization unit 140 can calculate the distance w using the information on the eye feature points (including the points P) acquired from the eye detection unit 120.
  • Next, the normalization unit 140 calculates the width X0 and the height Y0 of the eye region image according to the following equation (1):
  • X0 = Y0 = k × w … (1)
  • That is, the width X0 and the height Y0 of the eye region image are proportional to the distance w between the centers P of the left and right pupils. Here, k is a predetermined constant; k may be, for example, 0.75.
  • Then, the normalization unit 140 sets the four points separated from the center P of the pupil by (±X0 / 2, ±Y0 / 2) on the orthogonal coordinate system as the reference coordinates A to D of the eye area image.
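  • Read concretely, equation (1) and the reference coordinates might be computed as follows (a sketch; the helper name and the corner ordering of A to D are assumptions, and k = 0.75 follows the example value in the text):

```python
def eye_region_reference_coords(P, w, k=0.75):
    """P: pupil center (x, y); w: inter-eye distance in pixels."""
    X0 = Y0 = k * w                    # equation (1)
    x, y = P
    A = (x - X0 / 2, y - Y0 / 2)       # top-left
    B = (x + X0 / 2, y - Y0 / 2)       # top-right
    C = (x - X0 / 2, y + Y0 / 2)       # bottom-left
    D = (x + X0 / 2, y + Y0 / 2)       # bottom-right
    return (X0, Y0), (A, B, C, D)
```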
  • Next, generation of an eye region image by the normalization unit 140 will be described with reference to FIGS. 4A and 4B.
  • FIG. 4A shows face images 434 to 436 acquired by the normalization unit 140 from the image acquisition unit 110.
  • FIG. 4B shows eye area images 437 to 439 generated by normalizing the face images 434 to 436.
  • In FIGS. 4A and 4B, illustration of facial parts other than the eyes is omitted.
  • The face images 434 to 436 shown in FIG. 4A include the faces of different persons.
  • The sizes of the eyes included in the face images 434 to 436 differ from one another. Specifically, the eyes included in the face image 435 are large, and the eyes included in the face image 436 are small.
  • The eyes included in the face image 434 are smaller than those in the face image 435 but larger than those in the face image 436.
  • The normalization unit 140 first determines reference coordinates A′ to D′ that define the sizes of the face images 434 to 436, based on the feature amounts related to the shape of the eyes included in the face images 434 to 436.
  • Then, the normalization unit 140 performs normalization processing on the face images 434 to 436 so that the feature amount related to the eye shape (the eye size in the present embodiment) is equalized. The eye area images 437 to 439 shown in FIG. 4B are thereby generated from the face images 434 to 436 shown in FIG. 4A.
  • The normalization processing includes, for example, an affine transformation of the face images 434 to 436. Specific examples of the normalization processing executed by the normalization unit 140 will be described later.
  • For example, the original face images 434 to 436 may be composed of 640 × 480 pixels, while the eye area images 437 to 439 may be composed of 50 × 50 pixels.
  • The normalization unit 140 can calculate the pixel values of the eye region images 437 to 439 using any known method, such as the bilinear method (linear interpolation) or the bicubic method.
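  • A sketch of this step is shown below: the region delimited by the reference coordinates A′ to D′ is mapped onto a fixed-size eye region image by an affine transformation with bilinear interpolation (OpenCV and the corner ordering are assumptions; the patent only requires a known interpolation method):

```python
import cv2
import numpy as np

def extract_eye_region(face_image, A, B, C, out_size=(50, 50)):
    """A: top-left, B: top-right, C: bottom-left reference coordinates of the
    region in the face image; out_size: (width, height) of the eye region
    image, e.g. 50 x 50 pixels as in the example above."""
    out_w, out_h = out_size
    src = np.float32([A, B, C])
    dst = np.float32([[0, 0], [out_w - 1, 0], [0, out_h - 1]])
    M = cv2.getAffineTransform(src, dst)           # affine transformation
    return cv2.warpAffine(face_image, M, (out_w, out_h),
                          flags=cv2.INTER_LINEAR)  # bilinear interpolation
```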
  • <Example 1: When the feature amount is the eye height>
  • In this example, the normalization unit 140 normalizes the face image so that the position and the height of the eyes are constant. The height Y and the width X of the region extracted from the face image are determined according to the following equations (2) and (3):
  • Y = J1 × Y0 … (2)
  • X = J2 × X0 … (3)
  • J1 in the above equation (2) depends on the eye height y in the face images 434 to 436.
  • Specifically, J1 is represented by the following equation (4):
  • J1 = j0 × y ÷ w … (4)
  • Here, j0 is the ratio of the height Y0 of the eye area images 437 to 439 to the eye height y0 in those images (j0 = Y0 ÷ y0), and is a constant; j0 may be, for example, 5.0. w is the inter-eye distance described above (see FIG. 3).
  • According to equations (1), (2), and (4), the height Y of the face images 434 to 436 is expressed as the following equation (5):
  • Y = k × j0 × y … (5)
  • The normalization unit 140 thus determines the height Y of the face images 434 to 436.
  • On the other hand, J2 = 1 in this example. That is, X is represented by the following equation (6):
  • X = k × w … (6)
  • The normalization unit 140 then determines the four reference coordinates A′ to D′ of the face images 434 to 436, with the height Y and the width X, around the center P of the pupil.
  • In addition, the normalization unit 140 may rotate the eye region images 437 to 439 so that the line segment connecting the centers P of the left and right pupils is horizontal. Specifically, if the slope of the line segment connecting the centers P of the left and right pupils is φ (see the face image 420 in FIG. 3), the normalization unit 140 rotates the eye region images 437 to 439 by −φ about the center P of the pupil.
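  • As a worked reading of equations (4) to (6), the following sketch computes the crop size for Example 1 (the function name is an assumption; k = 0.75 and j0 = 5.0 follow the example values in the text):

```python
def crop_size_for_eye_height(y, w, k=0.75, j0=5.0):
    """y: eye height in the face image; w: inter-eye distance."""
    J1 = j0 * y / w    # equation (4)
    Y0 = k * w         # equation (1)
    Y = J1 * Y0        # equation (2); simplifies to k * j0 * y, equation (5)
    X = k * w          # equation (6), since J2 = 1
    return X, Y

# For example, y = 12 px and w = 60 px give X = 45.0 px and Y = 45.0 px.
print(crop_size_for_eye_height(12, 60))
```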
  • <Example 2: When the feature amount is the eye width>
  • In this example, the normalization unit 140 normalizes the face image so that the eye width is constant.
  • The normalization unit 140 first determines the set of parameters (X0, Y0) that defines the size of the eye area image, as in the case where the feature amount is the eye height.
  • Let X be the width of the face image.
  • As in Example 1, the width X0 and the height Y0 of the eye area image are proportional to the inter-eye distance w (see FIG. 3).
  • The ratio of the width X0 of the eye area image to the eye width x0 in the eye area image is denoted j1.
  • The width X of the face images 434 to 436 is then expressed by the following equation (7):
  • X = k × j1 × x … (7)
  • Here, j1 is the ratio of the width X0 of the eye area images 437 to 439 to the eye width x0 in those images (j1 = X0 ÷ x0), and is a constant; j1 may be, for example, 1.25.
  • J2 in the above equation (3) depends on the eye width x in the face images 434 to 436.
  • Specifically, J2 is represented by the following equation (8):
  • J2 = j1 × x ÷ w … (8)
  • Here, j1 is the constant described above, and w is the inter-eye distance (see FIG. 3).
  • The normalization unit 140 thus determines the width X of the face images 434 to 436.
  • On the other hand, J1 = 1 in this example. That is, Y is represented by the following equation (9):
  • Y = k × w … (9)
  • The normalization unit 140 then determines the four reference coordinates A′ to D′ of the face images 434 to 436, with the height Y and the width X, around the center P of the pupil. Further, as in the case where the feature amount is the eye height, the normalization unit 140 may rotate the eye region image so that the line segment connecting the centers P of the left and right pupils is horizontal.
  • <Example 3: When the feature amount is the eye inclination>
  • In this example, the normalization unit 140 normalizes the face image so that the eye inclination θ (see FIG. 3) is constant.
  • The normalization unit 140 first calculates the reference coordinates that define the size of the eye region image, as in the case where the feature amount is the eye height. Next, the normalization unit 140 calculates, in the face image, a first line segment that connects the centers P of the left and right pupils, and a second line segment that connects the inner eye angle and the outer eye angle.
  • Let φ be the inclination angle of the first line segment with respect to the horizontal, and θ be the inclination angle of the second line segment with respect to the first line segment.
  • The normalization unit 140 then normalizes the coordinate system of the face image by rotating it by the angle −(φ + θ) about the center of the pupil. The inclination of the eyes thereby becomes constant across the eye region images.
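  • A sketch of this rotation using OpenCV follows (an assumption; angle units and sign follow OpenCV's convention of counterclockwise-positive degrees, which is one possible reading of the geometry):

```python
import cv2

def rotate_about_pupil(face_image, pupil_center, phi_deg, theta_deg):
    """Rotate the face image by -(phi + theta) about the pupil center so that
    the eye inclination becomes constant across eye region images."""
    h, w = face_image.shape[:2]
    M = cv2.getRotationMatrix2D(pupil_center, -(phi_deg + theta_deg), 1.0)
    return cv2.warpAffine(face_image, M, (w, h), flags=cv2.INTER_LINEAR)
```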
  • Owing to this normalization, the line-of-sight estimation unit 150 does not need to make the line-of-sight estimator 151 learn the relationship between changes in the size or inclination of the eyes and the line of sight. Therefore, the line-of-sight estimation unit 150 can estimate the line of sight more accurately using the eye region image.
  • The line-of-sight estimation unit 150 estimates a person's line of sight from the orientation of the face and the orientation of the eyes (pupils) included in the face image.
  • The line of sight indicates the orientation (more precisely, the direction) in which a person's eyes are looking.
  • Specifically, the gaze estimation unit 150 estimates the gaze from the eye area image normalized by the normalization unit 140.
  • The gaze estimation unit 150 can use any known gaze estimation method.
  • For example, the line-of-sight estimation unit 150 causes the line-of-sight estimator 151 to learn the relationship between the appearance of the face and the line of sight, using face images in which the line of sight is specified in advance (face images with correct answers).
  • Then, the gaze estimation unit 150 estimates the gaze using the trained gaze estimator 151.
  • The gaze estimation unit 150 outputs the gaze estimation result data to the output unit 160.
  • For example, the line-of-sight estimator 151 calculates a line-of-sight vector (g_x, g_y) indicating in which direction the line of sight is directed, using the following equation (10):
  • g_x = u_x · f, g_y = u_y · f … (10)
  • Here, g_x satisfies −90 ≤ g_x ≤ 90 [deg], and g_y satisfies −90 ≤ g_y ≤ 90 [deg].
  • The weight vectors u_x and u_y are obtained by learning.
  • f in equation (10) is an image feature amount, and u_x and u_y are weight vectors; each product in equation (10) is a scalar.
  • The line-of-sight vector (g_x, g_y) represents a direction relative to the front of the face. Therefore, the direction in which the photographed person is actually looking is not specified by the line-of-sight vector (g_x, g_y) alone, but by the line-of-sight vector (g_x, g_y) together with the orientation of the person's face.
  • Alternatively, the line-of-sight estimator 151 may use the camera direction as the reference instead of the front of the face.
  • When the line of sight coincides with the reference direction, the line-of-sight vector (g_x, g_y) = (0, 0).
  • The image feature amount f indicates the direction and magnitude of the luminance change in the eye area, with a predetermined number of dimensions (for example, several hundred to several thousand).
  • For example, the image feature amount f may be a feature related to the gradient of the luminance of the image, such as HOG (Histograms of Oriented Gradients).
  • The image feature amount f is expressed as a column vector having a predetermined number of elements.
  • The weight vectors u_x and u_y are row vectors having the same number of elements as the image feature amount f. Therefore, the line-of-sight estimator 151 can calculate the inner products of the image feature amount f with the weight vectors u_x and u_y.
  • The weight vectors u_x and u_y can be learned by a well-known method such as support vector regression (SVR) or linear regression by the least squares method.
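  • A minimal training-and-inference sketch for equation (10) follows, assuming scikit-image's HOG as the image feature amount f and scikit-learn's linear SVR as the learner (both are assumptions; the patent only requires a known method):

```python
import numpy as np
from skimage.feature import hog
from sklearn.svm import LinearSVR

def feature_f(eye_region_image):
    """HOG feature of a normalized eye region image (2D grayscale array).
    For a 50 x 50 image these settings yield a few hundred dimensions."""
    return hog(eye_region_image, orientations=8,
               pixels_per_cell=(10, 10), cells_per_block=(2, 2))

def train_estimator(eye_images, gaze_angles):
    """eye_images: normalized eye region images; gaze_angles: array of
    (g_x, g_y) pairs in degrees. Returns one regressor per component,
    playing the roles of u_x and u_y."""
    F = np.stack([feature_f(img) for img in eye_images])
    reg_x = LinearSVR().fit(F, gaze_angles[:, 0])   # learns u_x
    reg_y = LinearSVR().fit(F, gaze_angles[:, 1])   # learns u_y
    return reg_x, reg_y

def estimate_gaze(reg_x, reg_y, eye_region_image):
    f = feature_f(eye_region_image).reshape(1, -1)
    return float(reg_x.predict(f)[0]), float(reg_y.predict(f)[0])
```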
  • The output unit 160 outputs data indicating the gaze estimated by the gaze estimation unit 150 (hereinafter also referred to as "gaze data").
  • The gaze data represents the direction indicated by the line of sight determined by the gaze estimation unit 150, according to a predetermined rule.
  • The output by the output unit 160 may be, for example, supplying the gaze data to another device such as a display device, or writing the gaze data to a recording medium included in the line-of-sight estimation apparatus 100.
  • The configuration of the gaze estimation apparatus 100 is as described above.
  • The line-of-sight estimation apparatus 100 having such a configuration operates, for example, as described below.
  • However, the specific operation of the line-of-sight estimation apparatus 100 is not limited to the operation example described here.
  • FIG. 5 is a flowchart illustrating a gaze estimation method executed by the gaze estimation apparatus 100 according to the present embodiment.
  • The line-of-sight estimation apparatus 100 estimates the line of sight from a face image by sequentially executing the processing of each step shown in FIG. 5.
  • The line-of-sight estimation apparatus 100 can start the processing shown in FIG. 5 at an appropriate timing, such as a timing designated by the user or a timing at which an input image is transmitted from another apparatus.
  • It is assumed that the image data input to the line-of-sight estimation apparatus 100 includes a human face.
  • The coordinates on an image are represented by an orthogonal coordinate system having a predetermined position (for example, the center of the image) as its origin.
  • In step S11 shown in FIG. 5, the image acquisition unit 110 acquires an input image.
  • The image acquisition unit 110 generates one or more face images from the acquired input image.
  • Each face image includes the face of one person.
  • In step S12, the eye detection unit 120 detects the eyes included in the face images generated in step S11 and detects the feature points of the detected eyes. Specifically, the eye detection unit 120 detects the center of the pupil, the inner eye angle, the outer eye angle, the center of the upper eyelid, and the center of the lower eyelid of each eye.
  • In step S13, the feature amount calculation unit 130 calculates a feature amount related to the eye shape using the eye feature points detected in step S12. For example, as described above, the feature amount calculation unit 130 calculates the eye height, the eye width, or the eye inclination as the feature amount related to the eye shape.
  • In step S14, the normalization unit 140 extracts an eye region image from the face image generated in step S11. Then, the normalization unit 140 normalizes the eye region image using the feature amount calculated in step S13.
  • In step S15, the line-of-sight estimation unit 150 estimates the person's line of sight from the normalized eye region image, using the line-of-sight estimator 151 trained in advance by machine learning.
  • In step S16, the output unit 160 outputs line-of-sight data indicating the line of sight (g_x, g_y) calculated by the line-of-sight estimation unit 150.
  • The line-of-sight data is visualized by being output to a display device (not shown), for example.
  • The line-of-sight data may be displayed as numerical values, or may be displayed as an arrow indicating the line of sight superimposed on the face image, as sketched below.
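  • One possible visualization (an assumption; the patent leaves the display format open) draws the estimated line of sight as an arrow anchored at the pupil center:

```python
import cv2
import numpy as np

def draw_gaze_arrow(image, pupil_center, g_x, g_y, length=60):
    """g_x, g_y: estimated gaze angles in degrees, relative to the reference
    direction. The mapping of angles to pixel offsets is illustrative only."""
    px, py = map(int, pupil_center)
    dx = int(length * np.sin(np.radians(g_x)))
    dy = int(-length * np.sin(np.radians(g_y)))  # image y-axis points down
    cv2.arrowedLine(image, (px, py), (px + dx, py + dy), (0, 255, 0), 2)
    return image
```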
  • <Modification 1> The line-of-sight estimation unit 150 may estimate the orientation of the face using a known face orientation estimation technique.
  • The gaze estimation unit 150 may then use the face orientation estimated in this way as the reference.
  • <Modification 2> The user may input the feature points, such as the centers of the right and left eyes, and the eye region image. In this case, the line-of-sight estimation apparatus 100 need not detect the feature points or generate the eye area image.
  • The shape of the eye area image is not necessarily limited to a rectangle.
  • A part of the face that does not directly affect the gaze estimation (for example, a part including the eyebrows or the nose) may be excluded.
  • The eye region image does not necessarily include only one eye (the left eye or the right eye).
  • For example, the eye area image may include both eyes.
  • The gaze learning method of the gaze estimator 151 is not limited to the machine learning described above.
  • For example, the line-of-sight estimator 151 may learn a non-linear function for estimating the line of sight using an ensemble learning algorithm such as random forest.
  • The use of the line of sight estimated by the line-of-sight estimation apparatus 100 is not particularly limited.
  • For example, the gaze estimation apparatus 100 may be applied to a system that estimates the gaze of a person imaged by a surveillance camera installed in a store and identifies a suspicious person from the estimated gaze.
  • The line-of-sight estimation apparatus 100 may also be applied to a system that estimates a user's line of sight with respect to a screen on which information is displayed, and estimates the user's interests based on the gaze estimation result.
  • Furthermore, the line-of-sight estimation apparatus 100 may be applied to an electronic device that can be operated by movement of the line of sight, or to driving assistance for an automobile or the like.
  • The specific hardware configuration of the line-of-sight estimation apparatus 100 allows many variations and is not limited to a particular configuration.
  • For example, an apparatus according to the present disclosure may be realized using software, or may be configured so that a plurality of pieces of hardware share the various processes. The configuration of this modification will be described in detail in the second embodiment.
  • As described above, the line-of-sight estimation apparatus 100 generates a normalized eye area image so that a feature amount related to the shape of a person's eyes becomes constant, and estimates the person's line of sight based on the normalized eye area image.
  • By using eye region images in which the feature amount related to the eye shape is normalized in this manner as correct-answer images for machine learning, a robust estimation result can be obtained stably.
  • In particular, a linear learner, such as linear regression based on the least squares method, has relatively low expressive power compared to a non-linear learner, so differences in the feature amount related to the eye shape tend to affect the accuracy of gaze estimation.
  • According to the configuration of the present embodiment, however, the gaze estimation performance of a linear learner is dramatically improved, because the line of sight can be estimated with high accuracy regardless of differences in the feature amount related to the eye shape.
  • FIG. 6 is a block diagram illustrating an example of a hardware configuration of a computer that realizes the line-of-sight estimation apparatus 300 according to the second embodiment.
  • As shown in FIG. 6, the line-of-sight estimation apparatus 300 includes a CPU (Central Processing Unit) 301, a ROM (Read Only Memory) 302, a RAM (Random Access Memory) 303, a storage device 304, a drive device 305, a communication interface 306, and an input/output interface 307.
  • The line-of-sight estimation apparatus 300 according to the second embodiment can be realized by the hardware configuration (or a part thereof) shown in FIG. 6.
  • The CPU 301 executes a program 308 read into the RAM 303.
  • The program 308 may be stored in the ROM 302.
  • The program 308 may also be recorded on a recording medium 309 such as a memory card and read by the drive device 305, or may be transmitted from an external device to the line-of-sight estimation apparatus 300 via a network 310.
  • The communication interface 306 exchanges data with external devices via the network 310.
  • The input/output interface 307 exchanges data with peripheral devices (such as an input device and a display device).
  • The communication interface 306 and the input/output interface 307 can function as components for acquiring and outputting data.
  • The components of the line-of-sight estimation apparatus 300 may be configured by a single circuit (such as a processor) or by a combination of a plurality of circuits.
  • The circuits here may be either dedicated or general-purpose.
  • For example, a part of the gaze estimation apparatus according to the present disclosure may be realized by a dedicated processor, and the remaining part by a general-purpose processor.
  • The line-of-sight estimation apparatus 300 need not be realized by a single computer.
  • For example, the components of the line-of-sight estimation apparatus 300 may be distributed among a plurality of computers.
  • The line-of-sight estimation apparatus 300 according to the present embodiment may also be realized by the cooperation of a plurality of computer apparatuses using cloud computing technology.
  • The present invention has been described above by way of the exemplary embodiments and modifications. However, the present invention is not limited to these embodiments and modifications.
  • The present invention may include embodiments to which various modifications or applications that can be understood by those skilled in the art are applied within the scope of the present invention.
  • The present invention may also include embodiments in which the matters described in this specification are appropriately combined or replaced as necessary. For example, matters described using a specific embodiment can also be applied to other embodiments as long as no contradiction arises.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Molecular Biology (AREA)
  • Public Health (AREA)
  • Medical Informatics (AREA)
  • Biomedical Technology (AREA)
  • Surgery (AREA)
  • Animal Behavior & Ethology (AREA)
  • General Health & Medical Sciences (AREA)
  • Heart & Thoracic Surgery (AREA)
  • Veterinary Medicine (AREA)
  • Ophthalmology & Optometry (AREA)
  • Biophysics (AREA)
  • Human Computer Interaction (AREA)
  • Image Analysis (AREA)
  • Eye Examination Apparatus (AREA)
  • Image Processing (AREA)

Abstract

The present invention estimates a person's line of sight with high accuracy regardless of the shape of the person's eyes. An image acquisition unit (110) acquires an image including a person's face. An eye detection unit (120) detects the eyes from the image. A feature value calculation unit (130) calculates a feature value related to the shape of the eyes, such as the size or orientation of the eyes. A normalization unit (140) extracts a region including the eyes from each image and converts the extracted image so as to equalize the feature value related to the shape of the eyes. A gaze estimation unit (150) estimates the line of sight using the converted image.
PCT/JP2018/004370 2018-02-08 2018-02-08 Line-of-sight estimation device, line-of-sight estimation method, and recording medium Ceased WO2019155570A1 (fr)

Priority Applications (3)

Application Number Priority Date Filing Date Title
PCT/JP2018/004370 WO2019155570A1 (fr) 2018-02-08 2018-02-08 Line-of-sight estimation device, line-of-sight estimation method, and recording medium
JP2019570215A JP7040539B2 (ja) 2018-02-08 2018-02-08 Gaze estimation device, gaze estimation method, and program
JP2022033164A JP7255721B2 (ja) 2018-02-08 2022-03-04 Gaze estimation device, gaze estimation method, and program

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2018/004370 WO2019155570A1 (fr) 2018-02-08 2018-02-08 Line-of-sight estimation device, line-of-sight estimation method, and recording medium

Publications (1)

Publication Number Publication Date
WO2019155570A1 true WO2019155570A1 (fr) 2019-08-15

Family

ID=67549300

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2018/004370 Ceased WO2019155570A1 (fr) 2018-02-08 2018-02-08 Line-of-sight estimation device, line-of-sight estimation method, and recording medium

Country Status (2)

Country Link
JP (1) JP7040539B2 (fr)
WO (1) WO2019155570A1 (fr)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2021195124 (ja) * 2019-12-16 2021-12-27 NVIDIA Corporation Gaze determination using glare as an input
US20220327734A1 (en) * 2021-04-08 2022-10-13 EMOCOG Co., Ltd. Apparatus and method for gaze tracking based on machine learning
JP7164231B1 (ja) 2021-06-01 2022-11-01 株式会社プロモデルスタジオ Casting device, method, and program
CN118506430A (zh) * 2024-07-17 2024-08-16 江苏富翰医疗产业发展有限公司 Gaze estimation method and system

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2012038106A (ja) * 2010-08-06 2012-02-23 Canon Inc Information processing device, information processing method, and program
JP2012037934A (ja) * 2010-08-03 2012-02-23 Canon Inc Gaze detection device, gaze detection method, and program

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP6525635B2 (ja) 2015-02-25 2019-06-05 Canon Inc. Image processing device, image processing method, and program

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2012037934A (ja) * 2010-08-03 2012-02-23 Canon Inc Gaze detection device, gaze detection method, and program
JP2012038106A (ja) * 2010-08-06 2012-02-23 Canon Inc Information processing device, information processing method, and program

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2021195124 (ja) * 2019-12-16 2021-12-27 NVIDIA Corporation Gaze determination using glare as an input
JP7696764B2 (ja) 2019-12-16 2025-06-23 NVIDIA Corporation Gaze determination using glare as an input
US20220327734A1 (en) * 2021-04-08 2022-10-13 EMOCOG Co., Ltd. Apparatus and method for gaze tracking based on machine learning
JP2024513497A (ja) 2021-04-08 2024-03-25 EMOCOG Inc. Machine-learning-based gaze tracking device and method
JP7645572B2 (ja) 2021-04-08 2025-03-14 EMOCOG Inc. Machine-learning-based gaze tracking device and method
US12412296B2 (en) * 2021-04-08 2025-09-09 EMOCOG Co., Ltd. Apparatus and method for gaze tracking based on machine learning
JP7164231B1 (ja) 2021-06-01 2022-11-01 株式会社プロモデルスタジオ Casting device, method, and program
JP2022184465A (ja) 2021-06-01 2022-12-13 株式会社プロモデルスタジオ Casting device, method, and program
CN118506430A (zh) * 2024-07-17 2024-08-16 江苏富翰医疗产业发展有限公司 Gaze estimation method and system

Also Published As

Publication number Publication date
JP7040539B2 (ja) 2022-03-23
JPWO2019155570A1 (ja) 2021-01-14

Similar Documents

Publication Publication Date Title
US12142076B2 (en) Facial authentication device, facial authentication method, and program recording medium
US11232585B2 (en) Line-of-sight estimation device, line-of-sight estimation method, and program recording medium
Sánchez et al. Differential optical flow applied to automatic facial expression recognition
JP2008194146A (ja) Gaze detection device and method therefor
JPWO2010137157A1 (ja) Image processing device, method, and program
JP6071002B2 (ja) Reliability acquisition device, reliability acquisition method, and reliability acquisition program
JP7040539B2 (ja) Gaze estimation device, gaze estimation method, and program
US20230360433A1 (en) Estimation device, estimation method, and storage medium
JP4569186B2 (ja) Image processing device and method, recording medium, and program
JP6410450B2 (ja) Object identification device, object identification method, and program
WO2020195732A1 (fr) Image processing device, image processing method, and recording medium storing a program
Azzopardi et al. Fast gender recognition in videos using a novel descriptor based on the gradient magnitudes of facial landmarks
JP7255721B2 (ja) Gaze estimation device, gaze estimation method, and program
JP7103443B2 (ja) Information processing device, information processing method, and program
JP2025062846A (ja) Information processing device, information processing method, and program
WO2025191724A1 (fr) Information processing device, information processing method, and recording medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18904962

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2019570215

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18904962

Country of ref document: EP

Kind code of ref document: A1