WO2025216110A1 - Information processing device, information processing method, and program - Google Patents
Information processing device, information processing method, and program
- Publication number
- WO2025216110A1 (PCT/JP2025/013086)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- head pose
- correlation
- rotation
- information processing
- target
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B3/00—Apparatus for testing the eyes; Instruments for examining the eyes
- A61B3/10—Objective types, i.e. instruments for examining the eyes independent of the patients' perceptions or reactions
- A61B3/113—Objective types, i.e. instruments for examining the eyes independent of the patients' perceptions or reactions for determining or recording eye movement
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/03—Arrangements for converting the position or the displacement of a member into a coded form
- G06F3/033—Pointing devices displaced or positioned by the user, e.g. mice, trackballs, pens or joysticks; Accessories therefor
- G06F3/0346—Pointing devices displaced or positioned by the user, e.g. mice, trackballs, pens or joysticks; Accessories therefor with detection of the device orientation or free movement in a 3D space, e.g. 3D mice, 6-DOF [six degrees of freedom] pointers using gyroscopes, accelerometers or tilt-sensors
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/03—Arrangements for converting the position or the displacement of a member into a coded form
- G06F3/033—Pointing devices displaced or positioned by the user, e.g. mice, trackballs, pens or joysticks; Accessories therefor
- G06F3/038—Control and interface arrangements therefor, e.g. drivers or device-embedded control circuitry
Definitions
- the present invention relates to an information processing device, an information processing method, and a program.
- gaze estimation technology is applied to reduce the load on rendering processing (Foveated Rendering).
- further improvements in the accuracy of gaze estimation are required to realize gaze-based UIs (User Interfaces), express DoF (Depth of Field), and correct display distortion caused by eye movements (Swim Correction).
- Patent Document 1 discloses technology that corrects the positional relationship between the optical axis and visual axis by estimating the direction of gravity using an acceleration sensor and estimating the rotation of the eyeball from the direction of gravity.
- Patent Document 2 discloses technology that estimates the rotation of the eyeball from the iris pattern.
- detecting iris patterns places a high processing load on the system, making it difficult to operate constantly in an HMD.
- this disclosure proposes an information processing device, information processing method, and program that are capable of highly accurate gaze estimation.
- the present disclosure provides an information processing device having a headpose-rotation correlation unit that acquires the correlation between eye rotation and head pose as a headpose-rotation correlation, a headpose estimation unit that acquires the head pose, and an optical axis-visual axis correlation unit that applies the head pose to the headpose-rotation correlation to estimate the eye rotation and corrects the visual axis of the eye based on the estimated eye rotation.
- the present disclosure also provides an information processing method in which the information processing of the information processing device is executed by a computer, and a program that causes a computer to implement the information processing of the information processing device.
- FIG. 1 is an explanatory diagram of gaze estimation taking into account eye rotation.
- FIG. 2 is an explanatory diagram of gaze estimation taking into account eye rotation.
- FIG. 3 is a diagram illustrating an example of a configuration of a display system.
- FIG. 4 is a diagram showing a conventional display system serving as a comparative example.
- FIG. 5 is a diagram showing an example of a calibration image for investigating the correlation between eye rotation and head pose.
- FIG. 6 is a diagram showing an example of a calibration image for examining the correlation between the optical axis and the visual axis.
- FIG. 7 is a diagram showing an example of a processing flow for detecting head pose-rotation correlation by a calibration operation before viewing content.
- FIG. 8 is a diagram illustrating an example of a processing flow for estimating a line of sight when viewing content.
- FIG. 9 is a diagram illustrating another configuration example of the display system.
- FIG. 10 illustrates an example of a hardware configuration of an information processing device.
- [1. Gaze estimation taking into account eye rotation] [2. System configuration example] [3. Display example of calibration image] [4. Calibration processing flow] [5. Gaze estimation processing flow] [6. Sequential update of headpose-rotation correlation] [7. Hardware configuration example] [8. Effects]
- [1. Gaze estimation taking into account eye rotation] FIGS. 1 and 2 are explanatory diagrams of gaze estimation taking into account eye rotation.
- Eye tracking is used in a variety of fields. Eye tracking is a technology that tracks what a user US is looking at in real time based on eye movements.
- the eye tracker detects the optical axis LA of the eyeball EB based on an image of the eye (ocular image).
- the optical axis LA is the axis that passes through the center of the cornea CR and the center of the pupil PU, but the line of sight does not necessarily coincide with the optical axis LA of the eyeball EB.
- the distribution of photoreceptors on the retina is not uniform, and the line of sight is the line connecting the area with a high density of photoreceptors (the fovea centralis, the depression at the center of the macula MC) and the nodal point (the center of the posterior surface of the lens).
- the axis along which the line of sight passes is called the visual axis VA.
- the visual axis VA cannot be determined directly from the eye image.
- the visual axis VA is tilted at about 5° with respect to the optical axis LA.
- the offset between the visual axis VA and the optical axis LA varies from person to person, and this individual difference must be adjusted through calibration.
- the visual axis VA (line of sight) is estimated from the optical axis LA based on the calibration information.
- when the head is tilted, the eyeball EB rotates.
- when the eyeball EB rotates, the three-dimensional positional relationship between the optical axis LA and the visual axis VA changes.
- the correlation between eyeball rotation and head pose is obtained in advance, and the visual axis VA is corrected based on the eyeball rotation estimated from the head pose.
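As a minimal illustration of this correction step, the Python sketch below rotates a calibrated LA-to-VA offset about the optical axis LA by the eye torsion predicted from the head pose. The constant-offset model and all names are assumptions made for illustration; the disclosure does not prescribe a specific formulation.

```python
import numpy as np

def rotate_about_axis(v, axis, angle_rad):
    """Rodrigues' rotation of vector v about a unit axis by angle_rad."""
    axis = axis / np.linalg.norm(axis)
    return (v * np.cos(angle_rad)
            + np.cross(axis, v) * np.sin(angle_rad)
            + axis * np.dot(axis, v) * (1.0 - np.cos(angle_rad)))

def corrected_visual_axis(optical_axis, la_to_va_offset, torsion_rad):
    """Estimate the visual axis VA from the optical axis LA.

    la_to_va_offset is a per-user LA->VA offset vector obtained by
    calibration; torsion_rad is the eyeball rotation estimated from the
    head pose. The offset is rotated about LA because eye torsion swings
    the visual axis around the optical axis.
    """
    offset = rotate_about_axis(la_to_va_offset, optical_axis, torsion_rad)
    va = optical_axis + offset
    return va / np.linalg.norm(va)
```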
- FIG. 3 is a diagram showing an example of the configuration of the display system 1.
- the display system 1 has an information processing device 10 and an HMD 20.
- the HMD 20 presents 3D images to the user US who is wearing the HMD.
- the information processing device 10 detects the user US's line of sight based on the eye movements of the user US and controls the display according to the line of sight.
- Figure 3 selectively shows the configuration related to gaze estimation; configurations used only for processing regular content (movies, games, etc.) are omitted from the illustration.
- the HMD 20 has an IMU (Inertial Measurement Unit) 21, an external camera 22, a display 23, and an eye camera 24.
- the external camera 22 captures images of the surroundings of the HMD 20. Sensing information from the IMU 21 and external camera 22 is used to estimate the head pose of the user US.
- the display 23 displays an image for calibration.
- the eye camera 24 captures images of the eyes of the user US and acquires eye images.
- the information processing device 10 has a SLAM signal processing unit 11, a head pose estimation unit 12, a head pose-rotation correlation unit 13, a target display unit 14, an iris detection unit 15, a rotation detection unit 16, an optical axis estimation unit 17, an optical axis-visual axis correlation unit 18, and a visual axis estimation unit 19.
- the SLAM signal processing unit 11 estimates the user's position from sensing information from the IMU 21 and external camera 22. Self-position estimation can be performed using SLAM (Simultaneous Localization and Mapping) technology.
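For illustration, a head pose can be read out of the rotation part of a SLAM-estimated pose as Euler angles; the sketch below assumes a ZYX convention, which is an illustrative choice rather than one stated in the disclosure.

```python
import numpy as np

def head_pose_from_rotation(R):
    """Extract (roll, pitch, yaw) in radians from the 3x3 rotation matrix
    of a SLAM-estimated pose, assuming R = Rz(yaw) @ Ry(pitch) @ Rx(roll).
    The axis convention is an assumption made for this illustration."""
    pitch = -np.arcsin(np.clip(R[2, 0], -1.0, 1.0))
    roll = np.arctan2(R[2, 1], R[2, 2])
    yaw = np.arctan2(R[1, 0], R[0, 0])
    return roll, pitch, yaw
```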
- the head pose estimation unit 12 acquires the head pose of the user US based on the estimated self-position information.
- the headpose-rotation correlation unit 13 obtains the correlation between eye rotation and head pose as the headpose-rotation correlation.
- the headpose-rotation correlation is determined through a prior calibration process by the user US.
- the headpose-rotation correlation unit 13 applies the headpose estimated by the headpose estimation unit 12 to the headpose-rotation correlation to obtain the eye rotation (direction and angle of rotation) corresponding to the head pose.
- the headpose-rotation correlation unit 13 stores, as calibration information, the correlation between eye rotation and headpose acquired for the user US whose headpose is to be acquired when viewing content.
- the headpose-rotation correlation may also be obtained by aggregating headpose and eye rotation data collected from multiple people.
- the iris detection unit 15 extracts the iris from the eye image acquired from the eye camera 24.
- the rotation detection unit 16 detects the rotation of the eyeball EB based on the iris image.
- Information on eyeball rotation is output to the headpose-rotation correlation unit 13.
- the headpose-rotation correlation unit 13 links the eyeball rotation information acquired from the rotation detection unit 16 with the headpose information acquired from the headpose estimation unit 12. In this way, the headpose-rotation correlation is acquired.
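One simple realization of this linking step is a least-squares fit over the paired calibration samples, sketched below under the assumption of a linear relation between head roll and eye torsion (ocular counter-roll is often approximated as linear for moderate tilts). The model class and names are illustrative choices, not ones fixed by the disclosure.

```python
import numpy as np

class HeadposeRotationCorrelation:
    """Illustrative headpose-rotation correlation: torsion ~ gain * head_roll + bias."""

    def __init__(self, gain=0.0, bias=0.0):
        self.gain, self.bias = gain, bias

    def fit(self, head_rolls, torsions):
        """Least-squares fit over paired (head pose, eye rotation) samples
        collected during the calibration operation."""
        A = np.stack([np.asarray(head_rolls, dtype=float),
                      np.ones(len(head_rolls))], axis=1)
        (self.gain, self.bias), *_ = np.linalg.lstsq(
            A, np.asarray(torsions, dtype=float), rcond=None)
        return self

    def predict(self, head_roll):
        """Eye torsion (signed angle, i.e. direction and magnitude) for a head roll."""
        return self.gain * head_roll + self.bias
```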
- the optical axis estimation unit 17 acquires the optical axis LA of the eyeball EB based on the eye image acquired from the eye camera 24.
- the optical axis-visual axis correlation unit 18 acquires the correlation between the optical axis LA of the eyeball EB and the visual axis VA as the optical axis-visual axis correlation.
- the optical axis-visual axis correlation is determined through a prior calibration operation by the user US.
- the optical axis-visual axis correlation unit 18 stores, as calibration information, the correlation between the optical axis LA and the visual axis VA acquired for the user US from whom the optical axis LA is acquired when viewing content.
- the optical axis-visual axis correlation may also be acquired by aggregating data on the optical axis LA and visual axis VA collected from multiple people.
- the target display unit 14 generates an image (calibration image) to be used in the calibration process and displays it on the display 23.
- the calibration image includes a target TG (see Figures 5 and 6) that is the gaze target.
- the target display unit 14 displays the target TG that the user US is looking at on the display 23.
- the target display unit 14 moves the position of the target TG to achieve natural line of sight and head movements required during calibration.
- the target display unit 14 generates calibration images, including an image for examining the correlation between eye rotation and head pose (see Figure 5) and an image for examining the correlation between the optical axis LA and the visual axis VA (see Figure 6).
- the calibration image encourages the eyes to follow the target TG, the object of gaze, and thereby induces changes in eyeball EB rotation and head pose.
- the rotation detection unit 16 detects the eyeball rotation of the user US that accompanies the movement of the target TG.
- the head pose estimation unit 12 acquires the head pose of the user US that fluctuates as the target TG moves.
- the head pose-rotation correlation unit 13 acquires the correlation between the eyeball rotation and head pose that accompanies the movement of the target TG as the head pose-rotation correlation.
- the correlation between the optical axis LA and the visual axis VA can be obtained based on the position of the target TG and information about the optical axis LA.
- the optical axis-visual axis correlation unit 18 estimates the visual axis VA of the eyeball EB based on the position of the target TG obtained from the target display unit 14.
- the optical axis-visual axis correlation unit 18 obtains the optical axis-visual axis correlation by linking the estimated visual axis VA with the optical axis LA obtained from the optical axis estimation unit 17.
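The sketch below illustrates this linking: the "true" visual axis implied by each known target position is paired with the measured optical axis, and a mean LA-to-VA offset is stored. The constant-offset model and the function names are assumptions for illustration.

```python
import numpy as np

def fit_optical_visual_correlation(target_positions, eye_centers, optical_axes):
    """Return a mean LA->VA offset vector from calibration samples.

    For each sample, the visual axis is the unit vector from the eye
    center to the displayed target TG; it is linked with the measured
    optical axis LA, and the average difference is kept as the optical
    axis-visual axis correlation (constant-offset sketch)."""
    offsets = []
    for tg, eye, la in zip(target_positions, eye_centers, optical_axes):
        va = np.asarray(tg, dtype=float) - np.asarray(eye, dtype=float)
        va /= np.linalg.norm(va)
        la = np.asarray(la, dtype=float) / np.linalg.norm(la)
        offsets.append(va - la)
    return np.mean(offsets, axis=0)
```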
- the optical axis-visual axis correlation unit 18 obtains the eye rotation corresponding to the headpose from the headpose-rotation correlation unit 13.
- the eye rotation is obtained by applying the headpose estimated by the headpose estimation unit 12 to the headpose-rotation correlation.
- the optical axis-visual axis correlation unit 18 estimates the visual axis VA by applying the optical axis LA obtained from the optical axis estimation unit 17 to the optical axis-visual axis correlation.
- the optical axis-visual axis correlation unit 18 corrects the visual axis VA of the eye EB based on the eye rotation obtained from the headpose-rotation correlation unit 13.
- the visual axis estimation unit 19 outputs the visual axis VA, whose orientation has been corrected by the correction, as the visual axis VA corresponding to the headpose.
- Fig. 4 is a diagram showing a conventional display system 1C as a comparative example.
- Fig. 4 selectively depicts configurations related to gaze estimation, and omits configurations used only for processing regular content (movies, games, etc.).
- gaze estimation is performed based only on the correlation between the optical axis LA and the visual axis VA.
- the target display unit 14C displays only an image for examining the correlation between the optical axis LA and the visual axis VA. Since the correlation between head pose and eye rotation and the effect of eye rotation on the visual axis VA are not taken into consideration, appropriate gaze estimation is not performed when the head moves. In the method of the present disclosure shown in FIG. 3, the relationship between head pose and the visual axis VA is taken into consideration, and therefore appropriate gaze estimation is performed even when the head moves.
- Figure 5 shows an example of a calibration image for investigating the correlation between eye rotation and head pose.
- a spherical object floating in virtual space is presented as the target TG.
- the head pose estimation unit 12 tracks the head movement of the user US who is following the target TG.
- the target display unit 14 guides the head movement by varying the display position of the target TG.
- the target display unit 14 displays the target TG in a position that induces a change in head pose in order to achieve natural head movement.
- the target display unit 14 displays an obstruction OB in front of the target TG, encouraging the head pose to be changed as if peering at the target TG behind the obstruction OB.
- the target display unit 14 can also notify the user by voice or text of a message encouraging the head pose to be changed.
- the calibration process is performed as a preliminary step before viewing regular content (movies, games, etc.). It is preferable that the subject of the calibration process be the same user US as the viewer of the regular content (the user US whose head pose is to be acquired when viewing the regular content). This allows for appropriate correction of the visual axis VA, taking into account the individual differences of the user US.
- the visual axis VA can also be corrected based on standard correlation data obtained from a large number of subjects.
- the headpose-rotation correlation unit 13 can store the correlation between average eye rotation and headpose based on data from multiple subjects as the headpose-rotation correlation. In this case, generally good gaze estimation can be performed without prior calibration.
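Such a default correlation could, for instance, be built by averaging per-subject fits, as in the short sketch below (it assumes models exposing the gain/bias parameters of the illustrative linear fit shown earlier).

```python
def average_correlation(subject_models):
    """Average per-subject (gain, bias) fits into a default headpose-rotation
    correlation usable without per-user calibration (illustrative only)."""
    n = len(subject_models)
    gain = sum(m.gain for m in subject_models) / n
    bias = sum(m.bias for m in subject_models) / n
    return gain, bias
```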
- FIG. 6 shows an example of a calibration image for investigating the correlation between the optical axis LA and the visual axis VA.
- one point is selected from multiple radially arranged points, and the selected point is presented as the target TG by, for example, lighting up.
- the calibration image is presented as an image (headlock image HI) that allows all points to be seen with the head fixed.
- the iris detection unit 15 tracks the movement (eye rotation) of the eyeball EB of the user US as it follows the target TG.
- the target display unit 14 guides the movement of the eyeball EB by varying the display position of the target TG.
- the iris detection unit 15 estimates eye rotation based on the iris pattern. Detecting the iris pattern requires high processing power, making it difficult to perform constantly when viewing normal content. However, if the iris pattern detection process is limited to calibration work, processing load issues are unlikely to arise. When viewing normal content, the processing load is reduced by performing gaze estimation based on the process of estimating the optical axis LA and the process of converting the optical axis LA to the visual axis VA.
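As an illustration of iris-pattern-based rotation detection, the sketch below unwraps the iris annulus of a grayscale eye image into an angle-indexed strip and takes the circular shift that best correlates with a reference strip as the torsion angle. The parameters and the correlation scheme are assumptions; production eye trackers use more robust variants.

```python
import numpy as np

def unwrap_iris(eye_img, cx, cy, r_in, r_out, n_angles=360, n_radii=8):
    """Sample the iris annulus of a 2D grayscale eye image into a
    (radius x angle) strip by polar unwrapping around the pupil center."""
    thetas = np.linspace(0.0, 2.0 * np.pi, n_angles, endpoint=False)
    radii = np.linspace(r_in, r_out, n_radii)
    xs = np.rint(cx + np.outer(radii, np.cos(thetas))).astype(int)
    ys = np.rint(cy + np.outer(radii, np.sin(thetas))).astype(int)
    h, w = eye_img.shape
    return eye_img[ys.clip(0, h - 1), xs.clip(0, w - 1)]

def torsion_from_iris(strip_ref, strip_cur):
    """Torsion in degrees (for a 360-column strip): the circular shift that
    maximizes correlation between the current and reference iris profiles."""
    a = strip_ref.mean(axis=0) - strip_ref.mean()
    b = strip_cur.mean(axis=0) - strip_cur.mean()
    scores = [float(np.dot(a, np.roll(b, s))) for s in range(len(a))]
    shift = int(np.argmax(scores))
    # Map shifts past the half-circle to negative angles.
    return shift if shift <= len(a) // 2 else shift - len(a)
```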
- FIG. 7 is a diagram showing an example of a processing flow for detecting head pose-rotation correlation through a calibration operation before viewing content.
- the target display unit 14 displays a calibration image including the target TG on the display 23 (step S1).
- the SLAM signal processing unit 11 estimates the self-position of the user US during the calibration operation based on sensor information acquired from the IMU 21 and external camera 22.
- the head pose estimation unit 12 acquires the head pose of the user US during the calibration operation based on the estimated self-position (step S2).
- the eye camera 24 acquires an image of the eye of the user US following the target TG.
- the iris detection unit 15 detects the iris from the eye image.
- the rotation detection unit 16 detects eye rotation based on the tilt of the iris pattern (step S3).
- the head pose-rotation correlation unit 13 links the head pose acquired from the head pose estimation unit 12 with the eye rotation acquired from the rotation detection unit 16, and records the correspondence between the two as a head pose-rotation correlation (step S4). Note that steps S2 and S3 are performed in parallel, and either may be performed first.
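Tying steps S1-S4 together, a calibration loop might look like the sketch below; the unit interfaces (show_target, current, current_torsion) are hypothetical stand-ins for the blocks of FIG. 3.

```python
def run_calibration(target_display, head_pose_estimator, iris_pipeline,
                    correlation_unit, n_samples=100):
    """Calibration flow of FIG. 7 (illustrative sketch).

    S1: display the moving target TG; S2/S3 (in parallel, either first):
    acquire the head pose and the iris-based eye rotation; S4: record the
    pair and fit the head pose-rotation correlation."""
    head_rolls, torsions = [], []
    for step in range(n_samples):
        target_display.show_target(step)                    # S1
        roll, _pitch, _yaw = head_pose_estimator.current()  # S2
        torsion = iris_pipeline.current_torsion()           # S3
        head_rolls.append(roll)
        torsions.append(torsion)
    correlation_unit.fit(head_rolls, torsions)              # S4
    return correlation_unit
```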
- FIG. 8 is a diagram showing an example of a processing flow of gaze estimation during content viewing.
- the SLAM signal processing unit 11 estimates the user US's self-position while viewing content based on sensor information acquired from the IMU 21 and external camera 22.
- the head pose estimation unit 12 acquires the head pose of the user US while viewing content based on the estimated self-position (step S11).
- the optical axis-visual axis correlation unit 18 applies the acquired head pose to the head pose-rotation correlation to estimate the eye rotation of the user US (step S12).
- the eye camera 24 acquires an image of the eye of the user US while viewing content.
- the optical axis estimation unit 17 estimates the optical axis LA of the eyeball EB based on the eye image (step S13).
- the optical axis-visual axis correlation unit 18 applies the estimated optical axis LA to the optical axis-visual axis correlation to estimate the visual axis VA of the user US while viewing content (step S14).
- the optical axis-visual axis correlation unit 18 corrects the direction of the visual axis VA based on the eye rotation estimated from the head pose (step S15). Note that steps S11-S12 and steps S13-S14 are performed in parallel, and either may be performed first.
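Steps S11-S15 can be chained as in the sketch below, reusing the illustrative corrected_visual_axis helper and linear correlation model from the earlier sketches; the interfaces remain hypothetical.

```python
def estimate_gaze(head_pose_estimator, optical_axis_estimator,
                  hp_rot_correlation, la_va_offset):
    """Gaze estimation flow of FIG. 8 (illustrative sketch).

    S11: head pose; S12: eye rotation from the head pose-rotation
    correlation; S13: optical axis LA from the eye image; S14: LA -> VA via
    the optical axis-visual axis correlation; S15: correct VA by the eye
    rotation. S11-S12 and S13-S14 may run in parallel."""
    roll, _pitch, _yaw = head_pose_estimator.current()        # S11
    torsion = hp_rot_correlation.predict(roll)                # S12
    la = optical_axis_estimator.current()                     # S13
    return corrected_visual_axis(la, la_va_offset, torsion)   # S14 + S15
```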
- [6. Sequential update of headpose-rotation correlation] FIG. 9 is a diagram showing another example of the configuration of a display system. The following description will focus on the differences from the display system 1 shown in FIG. 3.
- the display system omits the target display unit 14 shown in Figure 3.
- no dedicated image is generated for the calibration process.
- an image of the content being viewed is extracted based on some kind of trigger, and the extracted image is used as the calibration image.
- Triggers can be set arbitrarily by the system developer. For example, a trigger can be set when a scene containing a clear target of gaze (attractive area), such as a scene of a glowing object flying in the dark, is detected.
- the headpose-rotation correlation unit 13 sequentially updates the headpose-rotation correlation based on eye rotation and headpose data of the user US acquired intermittently based on preset triggers.
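A sequential update could, for example, be a small gradient step on each triggered sample, as sketched below; the update rule and learning rate are illustrative assumptions, not part of the disclosure.

```python
class OnlineHeadposeRotationCorrelation:
    """Sequentially updated linear model torsion ~ gain * head_roll + bias
    (standalone sketch mirroring the batch-fitted model shown earlier)."""

    def __init__(self, gain=0.0, bias=0.0):
        self.gain, self.bias = gain, bias

    def predict(self, head_roll):
        return self.gain * head_roll + self.bias

    def update(self, head_roll, torsion, lr=0.05):
        # One stochastic-gradient step on the squared prediction error,
        # applied per (head pose, eye rotation) sample acquired on a trigger.
        error = torsion - self.predict(head_roll)
        self.gain += lr * error * head_roll
        self.bias += lr * error
```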
- the information processing device 30 may include a scene detection unit.
- the scene detection unit detects a video scene including an attention region as a peculiar scene.
- an attention region is a display area of an object that is likely to attract the attention of the user US and is likely to induce head movement of the user US by moving over a wide area of the screen. In a scene in which a glowing object is flying in the dark, the glowing flying object becomes the attention region.
- the rotation detection unit 16 detects eye rotation of the user US accompanying movement of the attention region when a peculiar scene is detected.
- the head pose estimation unit 12 acquires the head pose of the user US, which changes as the attention region moves.
- the head pose-rotation correlation unit 13 updates the head pose-rotation correlation based on the correlation between the eye rotation accompanying movement of the attention region and the head pose.
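A trigger of the "glowing object in the dark" kind could be detected with a simple luminance heuristic, as in the sketch below; the thresholds and the rule itself are illustrative assumptions rather than a method stated in the disclosure.

```python
import numpy as np

def is_peculiar_scene(frame_gray, bright_thresh=220, dark_thresh=40,
                      max_bright_frac=0.02, min_dark_frac=0.8):
    """Return True for frames where most pixels are dark and a small
    fraction is very bright: a clear, gaze-attracting attention region
    (e.g., a glowing object flying in the dark)."""
    frame = np.asarray(frame_gray, dtype=float)
    bright_frac = float((frame >= bright_thresh).mean())
    dark_frac = float((frame <= dark_thresh).mean())
    return dark_frac >= min_dark_frac and 0.0 < bright_frac <= max_bright_frac
```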
- FIG. 10 is a diagram illustrating an example of the hardware configuration of the information processing device 10.
- Computer 1000 has a CPU (Central Processing Unit) 1100, RAM (Random Access Memory) 1200, ROM (Read Only Memory) 1300, HDD (Hard Disk Drive) 1400, communication interface 1500, and input/output interface 1600. Each part of computer 1000 is connected by bus 1050.
- CPU 1100 operates based on programs (program data 1450) stored in ROM 1300 or HDD 1400, and controls each component. For example, CPU 1100 loads programs stored in ROM 1300 or HDD 1400 into RAM 1200 and executes processing corresponding to the various programs.
- ROM 1300 stores a boot program, such as a BIOS (Basic Input Output System), executed by CPU 1100 when computer 1000 starts up, as well as programs that depend on the hardware of computer 1000.
- HDD 1400 is a computer-readable, non-transitory recording medium that records programs executed by CPU 1100 and the data used by those programs.
- HDD 1400 is a recording medium that records an information processing program according to an embodiment, which is an example of program data 1450.
- the communication interface 1500 is an interface that allows the computer 1000 to connect to an external network 1550 (such as the Internet).
- the CPU 1100 receives data from other devices and transmits data generated by the CPU 1100 to other devices via the communication interface 1500.
- the input/output interface 1600 is an interface for connecting the input/output device 1650 and the computer 1000.
- the CPU 1100 receives data from input devices such as a keyboard or mouse via the input/output interface 1600.
- the CPU 1100 also transmits data to output devices such as a display device, speaker, or printer via the input/output interface 1600.
- the input/output interface 1600 may also function as a media interface that reads programs recorded on a specified recording medium. Examples of media include optical recording media such as DVDs (Digital Versatile Discs) and PDs (Phase Change Rewritable Disks), magneto-optical recording media such as MOs (Magneto-Optical Disks), tape media, magnetic recording media, or semiconductor memory.
- HDD 1400 also stores the information processing program, various models, and various data according to the present disclosure. CPU 1100 reads and executes program data 1450 from HDD 1400; as another example, these programs may be obtained from other devices via the external network 1550.
- the information processing device 10 has a head pose-rotation correlation unit 13, a head pose estimation unit 12, and an optical axis-visual axis correlation unit 18.
- the head pose-rotation correlation unit 13 obtains the correlation between eye rotation and head pose as a head pose-rotation correlation.
- the head pose estimation unit 12 obtains the head pose.
- the optical axis-visual axis correlation unit 18 corrects the visual axis VA of the eye EB based on the eye rotation obtained by applying the head pose to the head pose-rotation correlation.
- the processing of the information processing device 10 is executed by a computer 1000.
- a program disclosed herein causes the computer 1000 to realize the processing of the information processing device 10.
- the visual axis VA is corrected while appropriately taking into account the rotation of the eyeball EB. This enables highly accurate gaze estimation.
- the head pose-rotation correlation unit 13 stores the correlation between eye rotation and head pose acquired for the user US whose head pose is to be acquired.
- This configuration allows for appropriate correction of the visual axis VA, taking into account individual differences among users US.
- the information processing device 10 has a target display unit 14 and a rotation detection unit 16.
- the target display unit 14 displays, on the display 23, a target TG that the user US focuses on.
- the rotation detection unit 16 detects eye rotation of the user US that accompanies movement of the target TG.
- the head pose estimation unit 12 acquires the head pose of the user US that changes as the target TG moves.
- the head pose-rotation correlation unit 13 acquires the correlation between the eye rotation and head pose that accompanies movement of the target TG as the head pose-rotation correlation.
- the target display unit 14 displays a message encouraging the user to change their head pose.
- This configuration ensures that head pose changes are implemented reliably.
- the target display unit 14 displays the target TG in a position that induces a change in head pose.
- This configuration ensures that head pose changes are implemented reliably.
- the target display unit 14 displays an obstruction OB in front of the target TG. This encourages the head pose movement to look at the target TG behind the obstruction OB.
- This configuration ensures that head pose changes are implemented reliably.
- the headpose-rotation correlation unit 13 sequentially updates the headpose-rotation correlation based on eye rotation and headpose data of the user US acquired intermittently based on preset triggers.
- This configuration allows for accurate headpose-rotation correlation that reflects individual differences among users US.
- the information processing device 10 has a scene detection unit.
- the scene detection unit detects video scenes including an attention region as peculiar scenes.
- the rotation detection unit 16, triggered by the detection of a peculiar scene, detects eye rotation of the user US accompanying movement of the attention region.
- the head pose estimation unit 12 acquires the head pose of the user US, which fluctuates as the attention region moves.
- the head pose-rotation correlation unit 13 updates the head pose-rotation correlation based on the correlation between the eye rotation accompanying movement of the attention region and the head pose.
- head pose-rotation correlation can be acquired naturally during viewing of normal content (movies, games, etc.).
- the headpose-rotation correlation unit 13 stores the correlation between average eye rotation and headpose based on data from multiple subjects as the headpose-rotation correlation.
- This configuration allows for generally good gaze estimation while omitting prior calibration.
- the information processing device 10 has an optical axis estimation unit 17.
- the optical axis estimation unit 17 acquires the optical axis LA of the eyeball EB.
- the optical axis-visual axis correlation unit 18 acquires the correlation between the optical axis LA and the visual axis VA as the optical axis-visual axis correlation.
- the optical axis-visual axis correlation unit 18 applies the optical axis LA to the optical axis-visual axis correlation to estimate the visual axis VA.
- the optical axis-visual axis correlation unit 18 corrects the estimated visual axis VA based on eye rotation.
- the visual axis VA can be estimated with high accuracy from the optical axis LA. This allows for highly accurate gaze estimation.
- the optical axis-visual axis correlation unit 18 stores the correlation between the optical axis LA and the visual axis VA obtained for the user US for whom the optical axis LA is to be obtained.
- This configuration enables accurate gaze estimation that reflects individual differences among users US.
- the present technology can also be configured as follows.
- an information processing device having: a headpose-rotation correlation unit that acquires a correlation between eye rotation and head pose as a headpose-rotation correlation; a head pose estimation unit that acquires the head pose; and an optical axis-visual axis correlation unit that corrects the visual axis of the eyeball based on the eyeball rotation obtained by applying the headpose to the headpose-rotation correlation.
- the head pose-rotation correlation unit stores the correlation between the eye rotation and the head pose acquired for the user whose head pose is to be acquired.
- the information processing device according to (1) above.
- further comprising: a target display unit that displays, on a display, a target that the user is paying attention to; and a rotation detection unit that detects the eye rotation of the user accompanying the movement of the target, wherein the head pose estimation unit acquires the head pose of the user that varies in accordance with the movement of the target, and the headpose-rotation correlation unit acquires a correlation between the eye rotation accompanying the movement of the target and the headpose as the headpose-rotation correlation.
- the target display unit notifies a message prompting the user to change the head pose.
- the target display unit displays the target at a position that causes a change in the head pose.
- the target display unit displays an obstruction in front of the target, and prompts the user to move the head pose in a manner that looks at the target behind the obstruction;
- the information processing device according to (5) above.
- the headpose-rotation correlator sequentially updates the headpose-rotation correlation based on data of the eye rotation and the headpose of the user intermittently acquired based on a preset trigger;
- the information processing device according to any one of (2) to (6) above.
- a scene detection unit that detects a video scene including an attention region as a peculiar scene
- a rotation detection unit that detects the eye rotation of the user accompanying the movement of the attention region, using the detection of the peculiar scene as the trigger
- the head pose estimation unit acquires the head pose of the user that varies in accordance with the movement of the attention region
- the head pose-rotation correlation unit updates the head pose-rotation correlation based on the correlation between the eye rotation accompanying the movement of the attention region and the head pose.
- the headpose-rotation correlation unit stores an average correlation between the eye rotation and the headpose based on data of a plurality of subjects as the headpose-rotation correlation;
- the information processing device according to (1) above.
- the optical axis-visual axis correlation unit: obtains a correlation between the optical axis and the visual axis as an optical axis-visual axis correlation; applies the optical axis to the optical axis-visual axis correlation to estimate the visual axis; and corrects the estimated visual axis based on the eye rotation.
- the information processing device according to any one of (1) to (9) above.
- the optical axis-visual axis correlation unit stores the correlation between the optical axis and the visual axis acquired for the user from whom the optical axis is to be acquired.
- the information processing device according to (10) above.
- a computer-implemented information processing method; (13) the information processing method according to (12) above, comprising storing the correlation between the eye rotation and the head pose acquired for the user whose head pose is to be acquired.
- the process of acquiring the head pose-rotation correlation includes acquiring a correlation between the eye rotation accompanying the movement of the target and the head pose as the head pose-rotation correlation; the information processing method according to (13) above.
- notifying a message prompting the head pose change; the information processing method according to (14) above.
- the target display process displays the target at a position that causes a change in the head pose.
- the target display process includes displaying an obstruction in front of the target, and prompting the user to move the head pose so as to look at the target behind the obstruction.
- the process of acquiring the head pose-rotation correlation includes sequentially updating the head pose-rotation correlation based on data of the eye rotation and the head pose of the user acquired intermittently based on a preset trigger.
- the process of acquiring the head pose-rotation correlation includes storing an average correlation between the eye rotation and the head pose based on data of a plurality of subjects as the head pose-rotation correlation.
- 12 Head pose estimation unit, 13 Head pose-rotation correlation unit, 14 Target display unit, 16 Rotation detection unit, 17 Optical axis estimation unit, 18 Optical axis-visual axis correlation unit, 23 Display, EB Eyeball, LA Optical axis, OB Obstruction, TG Target, US User, VA Visual axis
Description
The present invention relates to an information processing device, an information processing method, and a program.

In devices such as HMDs (Head Mounted Displays), gaze estimation technology is applied to reduce the load on rendering processing (Foveated Rendering). However, further improvements in the accuracy of gaze estimation are required to realize gaze-based UIs (User Interfaces), express DoF (Depth of Field), and correct display distortion caused by eye movements (Swim Correction).

The optical axis and visual axis of the eye generally do not coincide, and tilting the head causes the eyeball to rotate. When the eyeball rotates, the relationship between the optical axis and visual axis changes accordingly. Patent Document 1 discloses technology that corrects the positional relationship between the optical axis and visual axis by estimating the direction of gravity using an acceleration sensor and estimating the rotation of the eyeball from the direction of gravity. However, because the rotation of the eyeball does not perfectly coincide with the direction of gravity, highly accurate correction is not possible. Patent Document 2 discloses technology that estimates the rotation of the eyeball from the iris pattern. However, detecting iris patterns places a high processing load on the system, making it difficult to operate constantly in an HMD.

Therefore, this disclosure proposes an information processing device, information processing method, and program that are capable of highly accurate gaze estimation.

The present disclosure provides an information processing device having a headpose-rotation correlation unit that acquires the correlation between eye rotation and head pose as a headpose-rotation correlation, a headpose estimation unit that acquires the head pose, and an optical axis-visual axis correlation unit that applies the head pose to the headpose-rotation correlation to estimate the eye rotation and corrects the visual axis of the eye based on the estimated eye rotation. The present disclosure also provides an information processing method in which the information processing of the information processing device is executed by a computer, and a program that causes a computer to implement the information processing of the information processing device.

Embodiments of the present disclosure will be described in detail below with reference to the drawings. In each of the following embodiments, identical components will be designated by the same reference numerals, and duplicate descriptions will be omitted.
The explanation will be given in the following order.
[1. Gaze estimation taking into account eye rotation]
[2. System configuration example]
[3. Display example of calibration image]
[4. Calibration processing flow]
[5. Gaze estimation processing flow]
[6. Sequential update of headpose-rotation correlation]
[7. Hardware configuration example]
[8. Effects]
[1. Gaze estimation taking into account eye rotation]
FIGS. 1 and 2 are explanatory diagrams of gaze estimation taking into account eye rotation.
Eye tracking is used in a variety of fields. Eye tracking is a technology that tracks what a user US is looking at in real time based on eye movements. The eye tracker detects the optical axis LA of the eyeball EB based on an image of the eye (ocular image). The optical axis LA is the axis that passes through the center of the cornea CR and the center of the pupil PU, but the line of sight does not necessarily coincide with the optical axis LA of the eyeball EB. The distribution of photoreceptors on the retina is not uniform, and the line of sight is the line connecting the area with a high density of photoreceptors (the fovea centralis, the depression at the center of the macula MC) and the nodal point (the center of the posterior surface of the lens).

The axis along which the line of sight passes is called the visual axis VA. The visual axis VA cannot be determined directly from the eye image. The visual axis VA is tilted at about 5° with respect to the optical axis LA. The offset between the visual axis VA and the optical axis LA varies from person to person, and this individual difference must be adjusted through calibration. The visual axis VA (line of sight) is estimated from the optical axis LA based on the calibration information.
When the head is tilted, the eyeball EB rotates. When the eyeball EB rotates, the three-dimensional positional relationship between the optical axis LA and the visual axis VA changes. To perform accurate gaze estimation, it is necessary to correct the visual axis VA taking into account the rotation of the eyeball EB. In this disclosure, the correlation between eyeball rotation and head pose is obtained in advance, and the visual axis VA is corrected based on the eyeball rotation estimated from the head pose. Below, a display system for realizing this method is described in detail.
[2. System configuration example]
FIG. 3 is a diagram showing an example of the configuration of the display system 1.
The display system 1 has an information processing device 10 and an HMD 20. The HMD 20 presents 3D images to the user US who is wearing the HMD. The information processing device 10 detects the user US's line of sight based on the eye movements of the user US and controls the display according to the line of sight. Figure 3 selectively shows the configuration related to gaze estimation; configurations used only for processing regular content (movies, games, etc.) are omitted from the illustration.
ãïŒïŒ€ïŒïŒã¯ãïŒïŒµïŒïŒ©ïœïœ ïœïœïœïœïœãïŒïœ ïœïœïœïœïœ ïœïœ ïœïœãïœïœïœïŒïŒïŒãå€ã«ã¡ã©ïŒïŒããã£ã¹ãã¬ã€ïŒïŒããã³ã¢ã€ã«ã¡ã©ïŒïŒãæãããå€ã«ã¡ã©ïŒïŒã¯ãïŒïŒ€ïŒïŒã®åšå²ãæ®åœ±ãããïŒïŒµïŒïŒãšå€ã«ã¡ã©ïŒïŒã®ã»ã³ã·ã³ã°æ å ±ã¯ããŠãŒã¶ïŒµïŒ³ã®ãããããŒãºã®æšå®åŠçã«çšããããããã£ã¹ãã¬ã€ïŒïŒã¯ããã£ãªãã¬ãŒã·ã§ã³çšç»åã衚瀺ãããã¢ã€ã«ã¡ã©ïŒïŒã¯ããŠãŒã¶ïŒµïŒ³ã®çŒãæ®åœ±ããçŒç»åãååŸããã The HMD 20 has an IMU (Inertial Measurement Unit) 21, an external camera 22, a display 23, and an eye camera 24. The external camera 22 captures images of the surroundings of the HMD 20. Sensing information from the IMU 21 and external camera 22 is used to estimate the head pose of the user US. The display 23 displays an image for calibration. The eye camera 24 captures images of the eyes of the user US and acquires eye images.
ãæ å ±åŠçè£ çœ®ïŒïŒã¯ãïŒä¿¡å·åŠçéšïŒïŒããããããŒãºæšå®éšïŒïŒããããããŒãºïŒåæçžé¢éšïŒïŒãã¿ãŒã²ãã衚瀺éšïŒïŒãè¹åœ©æ€åºéšïŒïŒãåææ€åºéšïŒïŒãå 軞æšå®éšïŒïŒãå 軞ïŒèŠè»žçžé¢éšïŒïŒããã³èŠè»žæšå®éšïŒïŒãæããã The information processing device 10 has a SLAM signal processing unit 11, a head pose estimation unit 12, a head pose-rotation correlation unit 13, a target display unit 14, an iris detection unit 15, a rotation detection unit 16, an optical axis estimation unit 17, an optical axis-visual axis correlation unit 18, and a visual axis estimation unit 19.
ãïŒä¿¡å·åŠçéšïŒïŒã¯ãïŒïŒµïŒïŒãšå€ã«ã¡ã©ïŒïŒã®ã»ã³ã·ã³ã°æ å ±ããèªå·±äœçœ®æšå®ãè¡ããèªå·±äœçœ®æšå®ã¯ãïŒïŒïŒ³ïœïœïœïœïœïœïœïœ ïœïœïœãïœïœïœïœïœïœïœïœïœïœïœãïœïœïœãïŒïœïœïœïœïœïœïŒæè¡ãçšããŠè¡ãããšãã§ããããããããŒãºæšå®éšïŒïŒã¯ãæšå®ãããèªå·±äœçœ®ã®æ å ±ã«åºã¥ããŠãŠãŒã¶ïŒµïŒ³ã®ãããããŒãºãååŸããã The SLAM signal processing unit 11 estimates the user's position from sensing information from the IMU 21 and external camera 22. Self-position estimation can be performed using SLAM (Simultaneous Localization and Mapping) technology. The head pose estimation unit 12 acquires the head pose of the user US based on the estimated self-position information.
ããããããŒãºïŒåæçžé¢éšïŒïŒã¯ãçŒçåæãšãããããŒãºãšã®éã®çžé¢ããããããŒãºïŒåæçžé¢ãšããŠååŸãããå³ïŒã®äŸã§ã¯ããŠãŒã¶ïŒµïŒ³ã«ããäºåã®ãã£ãªãã¬ãŒã·ã§ã³äœæ¥ã«ãã£ãŠãããããŒãºïŒåæçžé¢ãæ±ããããããããããŒãºïŒåæçžé¢éšïŒïŒã¯ããããããŒãºæšå®éšïŒïŒã§æšå®ããããããããŒãºããããããŒãºïŒåæçžé¢ã«é©çšããããšã«ããããããããŒãºã«å¯Ÿå¿ããçŒçåæïŒåæã®æ¹åãè§åºŠïŒãååŸããã The headpose-rotation correlation unit 13 obtains the correlation between eye rotation and head pose as the headpose-rotation correlation. In the example of Figure 3, the headpose-rotation correlation is determined through a prior calibration process by the user US. The headpose-rotation correlation unit 13 applies the headpose estimated by the headpose estimation unit 12 to the headpose-rotation correlation to obtain the eye rotation (direction and angle of rotation) corresponding to the head pose.
ãäŸãã°ããããããŒãºïŒåæçžé¢éšïŒïŒã¯ããã£ãªãã¬ãŒã·ã§ã³æ å ±ã®å©çšè ïŒã³ã³ãã³ãèŠèŽæã«ãããããŒãºã®ååŸå¯Ÿè±¡ãšãªããŠãŒã¶ïŒµïŒ³ïŒã«ã€ããŠååŸããããçŒçåæãšãããããŒãºãšã®éã®çžé¢ãèšæ¶ããããããããããããŒãºïŒåæçžé¢ã¯ãå€äººæ°ããéãããããããŒãºãšçŒçåæã®ããŒã¿ãéçŽããããšã«ããæ±ããããŠãããã For example, the headpose-rotation correlation unit 13 stores the correlation between eye rotation and headpose acquired for the user of the calibration information (the user US whose headpose is to be acquired when viewing content). However, the headpose-rotation correlation may also be obtained by aggregating headpose and eye rotation data collected from multiple people.
ãè¹åœ©æ€åºéšïŒïŒã¯ãã¢ã€ã«ã¡ã©ïŒïŒããååŸããçŒç»åããè¹åœ©ãæœåºãããåææ€åºéšïŒïŒã¯ãè¹åœ©ã®ç»åã«åºã¥ããŠçŒçã®åæãæ€åºãããçŒçåæã®æ å ±ã¯ããããããŒãºïŒåæçžé¢éšïŒïŒã«åºåãããããããããŒãºïŒåæçžé¢éšïŒïŒã¯ãåææ€åºéšïŒïŒããååŸããçŒçåæã®æ å ±ãããããããŒãºæšå®éšïŒïŒããååŸãããããããŒãºã®æ å ±ãšçŽã¥ãããããã«ããããããããŒãºïŒåæçžé¢ãååŸãããã The iris detection unit 15 extracts the iris from the eye image acquired from the eye camera 24. The rotation detection unit 16 detects the rotation of the eyeball EB based on the iris image. Information on eyeball rotation is output to the headpose-rotation correlation unit 13. The headpose-rotation correlation unit 13 links the eyeball rotation information acquired from the rotation detection unit 16 with the headpose information acquired from the headpose estimation unit 12. In this way, the headpose-rotation correlation is acquired.
ãå 軞æšå®éšïŒïŒã¯ãã¢ã€ã«ã¡ã©ïŒïŒããååŸããçŒç»åã«åºã¥ããŠçŒçã®å 軞ãååŸãããå 軞ïŒèŠè»žçžé¢éšïŒïŒã¯ãçŒçã®å 軞ãšèŠè»žïŒ¶ïŒ¡ãšã®éã®çžé¢ãå 軞ïŒèŠè»žçžé¢ãšããŠååŸãããå³ïŒã®äŸã§ã¯ããŠãŒã¶ïŒµïŒ³ã«ããäºåã®ãã£ãªãã¬ãŒã·ã§ã³äœæ¥ã«ãã£ãŠå 軞ïŒèŠè»žçžé¢ãæ±ãããããå 軞ïŒèŠè»žçžé¢éšïŒïŒã¯ããã£ãªãã¬ãŒã·ã§ã³æ å ±ã®å©çšè ïŒã³ã³ãã³ãèŠèŽæã«å 軞ã®ååŸå¯Ÿè±¡ãšãªããŠãŒã¶ïŒµïŒ³ïŒã«ã€ããŠååŸããããå 軞ãšèŠè»žïŒ¶ïŒ¡ãšã®éã®çžé¢ãèšæ¶ãããããããå 軞ïŒèŠè»žçžé¢ã¯ãå€äººæ°ããéããå 軞ãšèŠè»žïŒ¶ïŒ¡ã®ããŒã¿ãéçŽããããšã«ããæ±ããããŠãããã The optical axis estimation unit 17 acquires the optical axis LA of the eyeball EB based on the eye image acquired from the eye camera 24. The optical axis-visual axis correlation unit 18 acquires the correlation between the optical axis LA of the eyeball EB and the visual axis VA as the optical axis-visual axis correlation. In the example of Figure 3, the optical axis-visual axis correlation is determined through a prior calibration operation by the user US. The optical axis-visual axis correlation unit 18 stores the correlation between the optical axis LA and the visual axis VA acquired for the user of the calibration information (the user US from whom the optical axis LA is acquired when viewing content). However, the optical axis-visual axis correlation may also be acquired by aggregating data on the optical axis LA and visual axis VA collected from multiple people.
ãã¿ãŒã²ãã衚瀺éšïŒïŒã¯ããã£ãªãã¬ãŒã·ã§ã³äœæ¥ã§çšããç»åïŒãã£ãªãã¬ãŒã·ã§ã³çšç»åïŒãçæãããã£ã¹ãã¬ã€ïŒïŒã«è¡šç€ºããããã£ãªãã¬ãŒã·ã§ã³çšç»åã¯ã泚èŠå¯Ÿè±¡ãšãªãã¿ãŒã²ããïŒå³ïŒãå³ïŒåç §ïŒãå«ããã¿ãŒã²ãã衚瀺éšïŒïŒã¯ããŠãŒã¶ïŒµïŒ³ã泚ç®ããã¿ãŒã²ããããã£ã¹ãã¬ã€ïŒïŒã«è¡šç€ºãããã¿ãŒã²ãã衚瀺éšïŒïŒã¯ãã¿ãŒã²ããã®äœçœ®ãç§»åãããŠããã£ãªãã¬ãŒã·ã§ã³æã«å¿ èŠãªèŠç·ãé éšã®èªç¶ãªåããå®çŸããã The target display unit 14 generates an image (calibration image) to be used in the calibration process and displays it on the display 23. The calibration image includes a target TG (see Figures 5 and 6) that is the gaze target. The target display unit 14 displays the target TG that the user US is looking at on the display 23. The target display unit 14 moves the position of the target TG to achieve natural line of sight and head movements required during calibration.
ãã¿ãŒã²ãã衚瀺éšïŒïŒã¯ããã£ãªãã¬ãŒã·ã§ã³çšç»åãšããŠãçŒçåæãšãããããŒãºãšã®çžé¢ã調ã¹ãããã®ç»åïŒå³ïŒåç §ïŒãããã³ãå 軞ãšèŠè»žïŒ¶ïŒ¡ãšã®çžé¢ã調ã¹ãããã®ç»åïŒå³ïŒåç §ïŒãçæããã The target display unit 14 generates calibration images, including an image for examining the correlation between eye rotation and head pose (see Figure 5) and an image for examining the correlation between the optical axis LA and the visual axis VA (see Figure 6).
ããã£ãªãã¬ãŒã·ã§ã³çšç»åã¯ã泚èŠå¯Ÿè±¡ãšãªãã¿ãŒã²ãããçŒã§è¿œãããããšã«ãã£ãŠçŒçããããããŒãºã®å€åãä¿ããäŸãã°ãåææ€åºéšïŒïŒã¯ãã¿ãŒã²ããã®ç§»åã«äŒŽããŠãŒã¶ïŒµïŒ³ã®çŒçåæãæ€åºããããããããŒãºæšå®éšïŒïŒã¯ãã¿ãŒã²ããã®ç§»åã«äŒŽã£ãŠå€åãããŠãŒã¶ïŒµïŒ³ã®ãããããŒãºãååŸããããããããŒãºïŒåæçžé¢éšïŒïŒã¯ãã¿ãŒã²ããã®ç§»åã«äŒŽãçŒçåæãšãããããŒãºãšã®éã®çžé¢ããããããŒãºïŒåæçžé¢ãšããŠååŸããã The calibration image encourages the eyes to follow the target TG, which is the object of gaze, and thereby encourages fluctuations in the eyeballs EB and head pose. For example, the rotation detection unit 16 detects the eyeball rotation of the user US that accompanies the movement of the target TG. The head pose estimation unit 12 acquires the head pose of the user US that fluctuates as the target TG moves. The head pose-rotation correlation unit 13 acquires the correlation between the eyeball rotation and head pose that accompanies the movement of the target TG as the head pose-rotation correlation.
ãå 軞ãšèŠè»žïŒ¶ïŒ¡ãšã®çžé¢ã¯ãã¿ãŒã²ããã®äœçœ®ãšå è»žïŒ¬ïŒ¡ã®æ å ±ã«åºã¥ããŠååŸããããšãã§ãããäŸãã°ãå 軞ïŒèŠè»žçžé¢éšïŒïŒã¯ãã¿ãŒã²ãã衚瀺éšïŒïŒããååŸããã¿ãŒã²ããã®äœçœ®ã«åºã¥ããŠçŒçã®èŠè»žïŒ¶ïŒ¡ãæšå®ãããå 軞ïŒèŠè»žçžé¢éšïŒïŒã¯ãæšå®ãããèŠè»žïŒ¶ïŒ¡ããå 軞æšå®éšïŒïŒããååŸããå 軞ãšçŽã¥ããããšã«ãããå 軞ïŒèŠè»žçžé¢ãååŸããã The correlation between the optical axis LA and the visual axis VA can be obtained based on the position of the target TG and information about the optical axis LA. For example, the optical axis-visual axis correlation unit 18 estimates the visual axis VA of the eyeball EB based on the position of the target TG obtained from the target display unit 14. The optical axis-visual axis correlation unit 18 obtains the optical axis-visual axis correlation by linking the estimated visual axis VA with the optical axis LA obtained from the optical axis estimation unit 17.
ãå 軞ïŒèŠè»žçžé¢éšïŒïŒã¯ããããããŒãºïŒåæçžé¢éšïŒïŒããããããããŒãºã«å¯Ÿå¿ããçŒçåæãååŸãããçŒçåæã¯ããããããŒãºæšå®éšïŒïŒã§æšå®ããããããããŒãºããããããŒãºïŒåæçžé¢ã«é©çšããŠåŸããããå 軞ïŒèŠè»žçžé¢éšïŒïŒã¯ãå 軞æšå®éšïŒïŒããååŸããå 軞ãå 軞ïŒèŠè»žçžé¢ã«é©çšããŠèŠè»žïŒ¶ïŒ¡ãæšå®ãããå 軞ïŒèŠè»žçžé¢éšïŒïŒã¯ããããããŒãºïŒåæçžé¢éšïŒïŒããååŸããçŒçåæã«åºã¥ããŠçŒçã®èŠè»žïŒ¶ïŒ¡ãè£æ£ãããèŠè»žæšå®éšïŒïŒã¯ãè£æ£ã«ãã£ãŠåããä¿®æ£ãããèŠè»žïŒ¶ïŒ¡ãããããããŒãºã«å¯Ÿå¿ããèŠè»žïŒ¶ïŒ¡ãšããŠåºåããã The optical axis-visual axis correlation unit 18 obtains the eye rotation corresponding to the headpose from the headpose-rotation correlation unit 13. The eye rotation is obtained by applying the headpose estimated by the headpose estimation unit 12 to the headpose-rotation correlation. The optical axis-visual axis correlation unit 18 estimates the visual axis VA by applying the optical axis LA obtained from the optical axis estimation unit 17 to the optical axis-visual axis correlation. The optical axis-visual axis correlation unit 18 corrects the visual axis VA of the eye EB based on the eye rotation obtained from the headpose-rotation correlation unit 13. The visual axis estimation unit 19 outputs the visual axis VA, whose orientation has been corrected by the correction, as the visual axis VA corresponding to the headpose.
ãå³ïŒã¯ãæ¯èŒäŸãšãªãåŸæ¥ã®è¡šç€ºã·ã¹ãã ïŒïŒ£ã瀺ãå³ã§ãããå³ïŒã§ã¯ãèŠç·æšå®ã«é¢ããæ§æãéžæçã«èšèŒãããéåžžã³ã³ãã³ãïŒæ ç»ãã²ãŒã ãªã©ïŒã®åŠçã®ã¿ã«çšããããæ§æã®å³ç€ºã¯çç¥ãããŠããã Fig. 4 is a diagram showing a conventional display system 1C as a comparative example. Fig. 4 selectively depicts configurations related to gaze estimation, and omits configurations used only for processing regular content (movies, games, etc.).
ãæ¯èŒäŸã®è¡šç€ºã·ã¹ãã ïŒïŒ£ã§ã¯ãå 軞ãšèŠè»žïŒ¶ïŒ¡ãšã®éã®çžé¢ã®ã¿ã«åºã¥ããŠèŠç·æšå®ãè¡ããããã¿ãŒã²ãã衚瀺éšïŒïŒïŒ£ã¯ãå 軞ãšèŠè»žïŒ¶ïŒ¡ãšã®çžé¢ã調ã¹ãããã®ç»åã®ã¿ã衚瀺ããããããããŒãºãšçŒçåæãšã®çžé¢ãããã³ãçŒçåæãèŠè»žïŒ¶ïŒ¡ã«äžãã圱é¿ãèæ ®ãããªããããé éšãåãããšãã«é©åãªèŠç·æšå®ãè¡ãããªããå³ïŒã«ç€ºããæ¬éç€ºã®ææ³ã§ã¯ããããããŒãºãšèŠè»žïŒ¶ïŒ¡ãšã®é¢ä¿ãèæ ®ããããããé éšãåããŠãé©åãªèŠç·æšå®ãè¡ãããã In the display system 1C of the comparative example, gaze estimation is performed based only on the correlation between the optical axis LA and the visual axis VA. The target display unit 14C displays only an image for examining the correlation between the optical axis LA and the visual axis VA. Since the correlation between head pose and eye rotation and the effect of eye rotation on the visual axis VA are not taken into consideration, appropriate gaze estimation is not performed when the head moves. In the method of the present disclosure shown in FIG. 3, the relationship between head pose and the visual axis VA is taken into consideration, and therefore appropriate gaze estimation is performed even when the head moves.
[3. Display example of calibration image]
FIGS. 5 and 6 are diagrams showing display examples of calibration images.
FIG. 5 shows an example of a calibration image for investigating the correlation between eye rotation and head pose. In the example of FIG. 5, a spherical object floating in virtual space is presented as the target TG. The head pose estimation unit 12 tracks the head movement of the user US following the target TG. The target display unit 14 guides the head movement by varying the display position of the target TG.

For example, the target display unit 14 displays the target TG at a position that induces a change in head pose so as to elicit natural head movement. In the example of FIG. 5, the target display unit 14 displays an obstruction OB in front of the target TG, encouraging a head pose change as if peering at the target TG behind the obstruction OB. At this time, the target display unit 14 can also notify the user of a message encouraging the head pose change, by voice or by text.

The calibration work is performed as a preliminary step before viewing regular content (movies, games, and the like). The subject of the calibration work is preferably the same user US as the viewer of the regular content (the user US whose head pose is to be acquired when viewing the regular content). This allows the visual axis VA to be corrected appropriately, taking into account the individual differences of the user US.
However, if individual variation among users US is small, the visual axis VA can also be corrected based on standard correlation data obtained from a large number of subjects. For example, the head pose-rotation correlation unit 13 can store, as the head pose-rotation correlation, the correlation between the average eye rotation and the head pose based on data from multiple subjects. In this case, generally good gaze estimation can be performed while omitting prior calibration.
FIG. 6 shows an example of a calibration image for investigating the correlation between the optical axis LA and the visual axis VA. In the example of FIG. 6, one point is selected from multiple radially arranged points, and the selected point is presented as the target TG by, for example, lighting up. The calibration image is presented as a head-locked image in which all the points can be seen with the head fixed. The iris detection unit 15 tracks the movement (eye rotation) of the eyeball EB of the user US following the target TG. The target display unit 14 guides the movement of the eyeball EB by varying the display position of the target TG.
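As one hypothetical way to turn the pairs of estimated optical axes and target-derived visual axes collected with this image into an optical axis-visual axis correlation, a best-fit rotation can be solved in closed form (Wahba's problem via SVD); the disclosure does not fix this model, and the names below are assumptions:

```python
import numpy as np

def fit_la_va_rotation(optical_axes, visual_axes):
    """Fit a 3x3 rotation mapping optical axes LA to visual axes VA.

    optical_axes, visual_axes: (M, 3) arrays of matched unit vectors
    collected during the calibration of FIG. 6. Solves Wahba's problem
    with the SVD-based (Kabsch) method; an illustrative model only.
    """
    LA = np.asarray(optical_axes, dtype=float)
    VA = np.asarray(visual_axes, dtype=float)
    # Cross-covariance between the two sets of directions.
    B = VA.T @ LA
    U, _, Vt = np.linalg.svd(B)
    # Force a proper rotation (determinant +1).
    d = np.sign(np.linalg.det(U @ Vt))
    return U @ np.diag([1.0, 1.0, d]) @ Vt
```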
The iris detection unit 15 estimates eye rotation based on the iris pattern. Detecting the iris pattern requires high processing power, and it is difficult to perform constantly while viewing regular content. However, if the iris pattern detection process is limited to the calibration work, processing load is unlikely to become a problem. While viewing regular content, the processing load is reduced by performing gaze estimation based on the process of estimating the optical axis LA and the process of converting the optical axis LA into the visual axis VA.
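As a rough illustration of rotation detection from the iris pattern (the disclosure leaves the method open), the torsion angle can be recovered by circular cross-correlation of two angular intensity profiles sampled around the pupil; the polar sampling itself is assumed to be done elsewhere:

```python
import numpy as np

def torsion_from_iris_profiles(reference, current):
    """Estimate ocular torsion from two angular iris-intensity profiles.

    reference, current: 1-D arrays of iris intensity sampled at N
    equally spaced angles around the pupil (a polar unwrap of the
    iris pattern). Returns the torsion angle in radians.
    """
    n = len(reference)
    # Circular cross-correlation via FFT; the peak gives the shift
    # of `current` relative to `reference` in samples.
    spectrum = np.fft.fft(current) * np.conj(np.fft.fft(reference))
    xcorr = np.real(np.fft.ifft(spectrum))
    shift = int(np.argmax(xcorr))
    if shift > n // 2:          # map shifts into (-N/2, N/2]
        shift -= n
    return 2.0 * np.pi * shift / n
```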
[4. Calibration processing flow]
FIG. 7 is a diagram showing an example of a processing flow for detecting the head pose-rotation correlation through calibration work performed before viewing content.
The target display unit 14 displays a calibration image including the target TG on the display 23 (step S1). The SLAM signal processing unit 11 estimates the self-position of the user US during the calibration work based on sensor information acquired from the IMU 21 and the external camera 22. The head pose estimation unit 12 acquires the head pose of the user US during the calibration work based on the estimated self-position (step S2).
The eye camera 24 acquires an eye image of the user US following the target TG. The iris detection unit 15 detects the iris from the eye image. The rotation detection unit 16 detects the eye rotation based on the tilt of the iris pattern (step S3). The head pose-rotation correlation unit 13 links the head pose acquired from the head pose estimation unit 12 with the eye rotation acquired from the rotation detection unit 16, and records the correspondence between the two as the head pose-rotation correlation (step S4). Note that steps S2 and S3 may be performed in parallel, or either may be performed first.
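A minimal sketch of steps S1 to S4, under the assumption that the head pose-rotation correlation is kept as an affine least-squares fit over the recorded (head pose, eye rotation) pairs; the disclosure only requires that simultaneously measured values be linked, so the model and names are illustrative:

```python
import numpy as np

def fit_head_pose_rotation_correlation(samples):
    """Fit a head pose-rotation correlation from calibration samples.

    samples: list of (head_pose, eye_rotation) pairs recorded in
    steps S2-S4, where head_pose is (pitch, yaw, roll) in radians and
    eye_rotation is the torsion angle measured at the same instant.
    Returns a callable usable as `pose_to_rotation` above.
    """
    poses = np.array([p for p, _ in samples], dtype=float)      # (M, 3)
    rotations = np.array([r for _, r in samples], dtype=float)  # (M,)
    # Affine least squares: rotation ~ poses @ w + b.
    A = np.hstack([poses, np.ones((len(poses), 1))])
    coef, *_ = np.linalg.lstsq(A, rotations, rcond=None)
    w, b = coef[:3], coef[3]
    return lambda head_pose: float(np.asarray(head_pose) @ w + b)
```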
[5. Gaze estimation processing flow]
FIG. 8 is a diagram showing an example of a processing flow of gaze estimation during content viewing.
ãïŒä¿¡å·åŠçéšïŒïŒã¯ãïŒïŒµïŒïŒããã³å€ã«ã¡ã©ïŒïŒããååŸããã»ã³ãµæ å ±ã«åºã¥ããŠãã³ã³ãã³ãèŠèŽæã®ãŠãŒã¶ïŒµïŒ³ã®èªå·±äœçœ®ãæšå®ããããããããŒãºæšå®éšïŒïŒã¯ãæšå®ãããèªå·±äœçœ®ã«åºã¥ããŠãã³ã³ãã³ãèŠèŽæã®ãŠãŒã¶ïŒµïŒ³ã®ãããããŒãºãååŸããïŒã¹ãããïŒïŒïŒãå 軞ïŒèŠè»žçžé¢éšïŒïŒã¯ãååŸããããããããŒãºããããããŒãºïŒåæçžé¢ã«é©çšãããŠãŒã¶ïŒµïŒ³ã®çŒçåæãæšå®ããïŒã¹ãããïŒïŒïŒã The SLAM signal processing unit 11 estimates the user US's self-position while viewing content based on sensor information acquired from the IMU 21 and external camera 22. The head pose estimation unit 12 acquires the head pose of the user US while viewing content based on the estimated self-position (step S11). The optical axis-visual axis correlation unit 18 applies the acquired head pose to the head pose-rotation correlation to estimate the eye rotation of the user US (step S12).
The eye camera 24 acquires an eye image of the user US during content viewing. The optical axis estimation unit 17 estimates the optical axis LA of the eyeball EB based on the eye image (step S13). The optical axis-visual axis correlation unit 18 applies the estimated optical axis LA to the optical axis-visual axis correlation to estimate the visual axis VA of the user US during content viewing (step S14). The optical axis-visual axis correlation unit 18 corrects the orientation of the visual axis VA based on the eye rotation estimated from the head pose (step S15). Note that steps S11-S12 and steps S13-S14 may be performed in parallel, or either pair may be performed first.
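Putting the pieces together, a single frame of the flow of FIG. 8 could look like the following sketch, which reuses the hypothetical helpers above; the two branches (steps S11-S12 and S13-S14) are independent, matching the note that they may run in parallel:

```python
def estimate_gaze_for_frame(sensor_data, eye_image,
                            estimate_head_pose, estimate_optical_axis,
                            la_va_rotation, pose_to_rotation):
    """One frame of gaze estimation during content viewing (sketch).

    estimate_head_pose / estimate_optical_axis stand in for the SLAM
    and eye-image pipelines; la_va_rotation and pose_to_rotation are
    the two stored correlations fitted during calibration.
    """
    head_pose = estimate_head_pose(sensor_data)        # S11
    optical_axis = estimate_optical_axis(eye_image)    # S13
    # S12, S14, S15: predict torsion from the head pose, convert the
    # optical axis to a visual axis, and correct it for the torsion.
    return estimate_corrected_visual_axis(
        optical_axis, head_pose, la_va_rotation, pose_to_rotation)
```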
[6. Sequential update of the head pose-rotation correlation]
FIG. 9 is a diagram showing another configuration example of the display system. The following description focuses on the differences from the display system 1 shown in FIG. 3.
The display system of this example omits the target display unit 14 shown in FIG. 3. In this example, no dedicated image is generated for calibration work. Instead, an image of the content being viewed is extracted based on some trigger, and the extracted image is used as the calibration image.
The trigger can be set arbitrarily by the system developer. For example, the trigger can be set to the detection of a scene containing a clear gaze target (attention region), such as a scene of a glowing object flying in the dark. The head pose-rotation correlation unit 13 sequentially updates the head pose-rotation correlation based on eye rotation and head pose data of the user US acquired intermittently based on the preset trigger.
When scene detection is used as the trigger as described above, the information processing device 30 can include a scene detection unit. The scene detection unit detects a video scene including an attention region as a peculiar scene. For example, an attention region is the display area of an object that is likely to attract the attention of the user US and likely to induce head movement of the user US by moving over a wide area of the screen. In a scene in which a glowing object flies through the dark, the glowing flying object is the attention region.
The rotation detection unit 16 detects the eye rotation of the user US accompanying the movement of the attention region, with the detection of a peculiar scene as the trigger. The head pose estimation unit 12 acquires the head pose of the user US, which changes as the attention region moves. The head pose-rotation correlation unit 13 updates the head pose-rotation correlation based on the correlation between the eye rotation accompanying the movement of the attention region and the head pose.
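The sequential update can be sketched, under the same assumed affine model, as recursive least squares with exponential forgetting, fed only when a trigger fires; the forgetting factor and the class below are illustrative, not part of the disclosure:

```python
import numpy as np

class HeadPoseRotationCorrelation:
    """Running head pose-rotation correlation, updated on triggers."""

    def __init__(self, dim=3, forget=0.99):
        self.forget = forget
        # Regularized normal equations for rotation ~ w . pose + b.
        self.A = np.eye(dim + 1) * 1e-3
        self.y = np.zeros(dim + 1)

    def update(self, head_pose, eye_rotation):
        """Fold in one (head pose, eye rotation) sample captured
        while a peculiar scene is on screen."""
        x = np.append(np.asarray(head_pose, dtype=float), 1.0)
        self.A = self.forget * self.A + np.outer(x, x)
        self.y = self.forget * self.y + eye_rotation * x

    def predict(self, head_pose):
        """Predicted eye rotation (radians) for a head pose."""
        x = np.append(np.asarray(head_pose, dtype=float), 1.0)
        coef = np.linalg.solve(self.A, self.y)
        return float(coef @ x)
```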
[7. Hardware configuration example]
FIG. 10 is a diagram showing an example of the hardware configuration of the information processing device 10.
The information processing of the information processing device 10 is realized, for example, by a computer 1000. The computer 1000 includes a CPU (Central Processing Unit) 1100, a RAM (Random Access Memory) 1200, a ROM (Read Only Memory) 1300, an HDD (Hard Disk Drive) 1400, a communication interface 1500, and an input/output interface 1600. The units of the computer 1000 are connected by a bus 1050.
ãïŒïŒïŒïŒã¯ãïŒïŒïŒïŒïŒãŸãã¯ïŒšïŒ€ïŒ€ïŒïŒïŒïŒã«æ ŒçŽãããããã°ã©ã ïŒããã°ã©ã ããŒã¿ïŒïŒïŒïŒïŒã«åºã¥ããŠåäœããåéšã®å¶åŸ¡ãè¡ããããšãã°ãïŒïŒïŒïŒã¯ãïŒïŒïŒïŒïŒãŸãã¯ïŒšïŒ€ïŒ€ïŒïŒïŒïŒã«æ ŒçŽãããããã°ã©ã ãïŒïŒïŒïŒïŒã«å±éããåçš®ããã°ã©ã ã«å¯Ÿå¿ããåŠçãå®è¡ããã CPU 1100 operates based on programs (program data 1450) stored in ROM 1300 or HDD 1400, and controls each component. For example, CPU 1100 loads programs stored in ROM 1300 or HDD 1400 into RAM 1200 and executes processing corresponding to the various programs.
ãïŒïŒïŒïŒïŒã¯ãã³ã³ãã¥ãŒã¿ïŒïŒïŒïŒã®èµ·åæã«ïŒ£ïŒ°ïŒµïŒïŒïŒïŒã«ãã£ãŠå®è¡ãããïŒïŒ¢ïœïœïœïœãïœïœïœïœãïœïœïœïœïœãïœïœïœïœ ïœïŒãªã©ã®ããŒãããã°ã©ã ããã³ã³ãã¥ãŒã¿ïŒïŒïŒïŒã®ããŒããŠã§ã¢ã«äŸåããããã°ã©ã ãªã©ãæ ŒçŽããã ROM 1300 stores boot programs such as BIOS (Basic Input Output System) that are executed by CPU 1100 when computer 1000 starts up, as well as programs that depend on the computer 1000's hardware.
ãïŒïŒïŒïŒã¯ãïŒïŒïŒïŒã«ãã£ãŠå®è¡ãããããã°ã©ã ãããã³ããããããã°ã©ã ã«ãã£ãŠäœ¿çšãããããŒã¿ãªã©ãéäžæçã«èšé²ãããã³ã³ãã¥ãŒã¿ãèªã¿åãå¯èœãªéäžæçèšé²åªäœã§ãããå ·äœçã«ã¯ãïŒïŒïŒïŒã¯ãããã°ã©ã ããŒã¿ïŒïŒïŒïŒã®äžäŸãšããŠã®ã宿œåœ¢æ ã«ãããæ å ±åŠçããã°ã©ã ãèšé²ããèšé²åªäœã§ããã HDD 1400 is a computer-readable, non-transitory recording medium that non-temporarily records programs executed by CPU 1100 and data used by such programs. Specifically, HDD 1400 is a recording medium that records an information processing program according to an embodiment, which is an example of program data 1450.
The communication interface 1500 is an interface for connecting the computer 1000 to an external network 1550 (for example, the Internet). For example, the CPU 1100 receives data from other devices and transmits data generated by the CPU 1100 to other devices via the communication interface 1500.

The input/output interface 1600 is an interface for connecting an input/output device 1650 to the computer 1000. For example, the CPU 1100 receives data from an input device such as a keyboard or a mouse via the input/output interface 1600. The CPU 1100 also transmits data to an output device such as a display device, a speaker, or a printer via the input/output interface 1600. The input/output interface 1600 may also function as a media interface that reads programs and the like recorded on a predetermined recording medium. Examples of such media include optical recording media such as DVDs (Digital Versatile Discs) and PDs (Phase change rewritable Disks), magneto-optical recording media such as MOs (Magneto-Optical disks), tape media, magnetic recording media, and semiconductor memories.
For example, when the computer 1000 functions as the information processing device 10 according to an embodiment, the CPU 1100 of the computer 1000 realizes the functions of each of the aforementioned units by executing an information processing program loaded onto the RAM 1200. The HDD 1400 stores the information processing program according to the present disclosure, as well as various models and various data. The CPU 1100 reads the program data 1450 from the HDD 1400 and executes it, but as another example, these programs may be acquired from another device via the external network 1550.
[8. Effects]
The information processing device 10 has a head pose-rotation correlation unit 13, a head pose estimation unit 12, and an optical axis-visual axis correlation unit 18. The head pose-rotation correlation unit 13 acquires the correlation between eye rotation and head pose as the head pose-rotation correlation. The head pose estimation unit 12 acquires the head pose. The optical axis-visual axis correlation unit 18 corrects the visual axis VA of the eyeball EB based on the eye rotation obtained by applying the head pose to the head pose-rotation correlation. In the information processing method of the present disclosure, the processing of the information processing device 10 is executed by the computer 1000. The program of the present disclosure causes the computer 1000 to realize the processing of the information processing device 10.
With this configuration, the visual axis VA is corrected while appropriately taking into account the rotation of the eyeball EB. This enables highly accurate gaze estimation.

The head pose-rotation correlation unit 13 stores the correlation between eye rotation and head pose acquired for the user US whose head pose is to be acquired.

This configuration allows the visual axis VA to be corrected appropriately, taking into account individual differences among users US.
The information processing device 10 has a target display unit 14 and a rotation detection unit 16. The target display unit 14 displays, on the display 23, a target TG to which the user US pays attention. The rotation detection unit 16 detects the eye rotation of the user US accompanying the movement of the target TG. The head pose estimation unit 12 acquires the head pose of the user US, which changes as the target TG moves. The head pose-rotation correlation unit 13 acquires, as the head pose-rotation correlation, the correlation between the eye rotation accompanying the movement of the target TG and the head pose.

With this configuration, the user US is prompted to change head pose in the course of following the target TG with the eyes, so the head pose-rotation correlation is acquired naturally.
The target display unit 14 notifies the user of a message encouraging a change in head pose.

This configuration ensures that the head pose change is carried out reliably.

The target display unit 14 displays the target TG at a position that induces a change in head pose.

This configuration ensures that the head pose change is carried out reliably.

The target display unit 14 displays an obstruction OB in front of the target TG. This encourages a head pose movement to look at the target TG behind the obstruction OB.

This configuration ensures that the head pose change is carried out reliably.
The head pose-rotation correlation unit 13 sequentially updates the head pose-rotation correlation based on eye rotation and head pose data of the user US acquired intermittently based on a preset trigger.

This configuration yields an accurate head pose-rotation correlation that reflects individual differences among users US.

The information processing device 10 has a scene detection unit. The scene detection unit detects a video scene including an attention region as a peculiar scene. The rotation detection unit 16, triggered by the detection of a peculiar scene, detects the eye rotation of the user US accompanying the movement of the attention region. The head pose estimation unit 12 acquires the head pose of the user US, which changes as the attention region moves. The head pose-rotation correlation unit 13 updates the head pose-rotation correlation based on the correlation between the eye rotation accompanying the movement of the attention region and the head pose.

With this configuration, the head pose-rotation correlation is acquired naturally during the viewing of regular content (movies, games, and the like).

The head pose-rotation correlation unit 13 stores, as the head pose-rotation correlation, the correlation between the average eye rotation and the head pose based on data from multiple subjects.

This configuration allows generally good gaze estimation to be performed while omitting prior calibration.
The information processing device 10 has an optical axis estimation unit 17. The optical axis estimation unit 17 acquires the optical axis LA of the eyeball EB. The optical axis-visual axis correlation unit 18 acquires the correlation between the optical axis LA and the visual axis VA as the optical axis-visual axis correlation. The optical axis-visual axis correlation unit 18 applies the optical axis LA to the optical axis-visual axis correlation to estimate the visual axis VA. The optical axis-visual axis correlation unit 18 corrects the estimated visual axis VA based on the eye rotation.

With this configuration, the visual axis VA can be estimated with high accuracy from the optical axis LA. This enables highly accurate gaze estimation.

The optical axis-visual axis correlation unit 18 stores the correlation between the optical axis LA and the visual axis VA acquired for the user US whose optical axis LA is to be acquired.

This configuration enables accurate gaze estimation that reflects individual differences among users US.

Note that the effects described in this specification are merely examples and are not limiting, and other effects may also be obtained.
[Note]
The present technology can also be configured as follows.
(1)
a head pose-rotation correlation unit that acquires a correlation between eye rotation and head pose as a head pose-rotation correlation;
a head pose estimation unit that acquires the head pose; and
an optical axis-visual axis correlation unit that corrects the visual axis of the eyeball based on the eye rotation obtained by applying the head pose to the head pose-rotation correlation.
An information processing device having the above.
(2)
The head pose-rotation correlation unit stores the correlation between the eye rotation and the head pose acquired for the user whose head pose is to be acquired.
The information processing device according to (1) above.
(3)
a target display unit that displays a target that the user is paying attention to on a display;
a rotation detection unit that detects the eye rotation of the user accompanying the movement of the target;
and
the head pose estimation unit acquires the head pose of the user that varies in accordance with movement of the target;
the head pose-rotation correlation unit acquires a correlation between the eye rotation accompanying the movement of the target and the head pose as the head pose-rotation correlation.
The information processing device according to (2) above.
(4)
the target display unit notifies a message prompting the user to change the head pose.
The information processing device according to (3) above.
(5)
the target display unit displays the target at a position that causes a change in the head pose.
The information processing device according to (3) or (4) above.
(6)
the target display unit displays an obstruction in front of the target, thereby prompting a head pose movement to look at the target behind the obstruction.
The information processing device according to (5) above.
(7)
the head pose-rotation correlation unit sequentially updates the head pose-rotation correlation based on data of the eye rotation and the head pose of the user acquired intermittently based on a preset trigger.
The information processing device according to any one of (2) to (6) above.
(8)
a scene detection unit that detects a video scene including an attention region as a peculiar scene;
a rotation detection unit that detects the eye rotation of the user accompanying the movement of the attention region, with the detection of the peculiar scene as the trigger;
and
the head pose estimation unit acquires the head pose of the user that varies in accordance with the movement of the attention region;
the head pose-rotation correlation unit updates the head pose-rotation correlation based on the correlation between the eye rotation accompanying the movement of the attention region and the head pose.
The information processing device according to (7) above.
(9)
the head pose-rotation correlation unit stores, as the head pose-rotation correlation, an average correlation between the eye rotation and the head pose based on data of a plurality of subjects.
The information processing device according to (1) above.
(10)
an optical axis estimation unit that acquires the optical axis of the eyeball,
The optical axis-visual axis correlation unit
acquires a correlation between the optical axis and the visual axis as an optical axis-visual axis correlation,
applies the optical axis to the optical axis-visual axis correlation to estimate the visual axis, and
corrects the estimated visual axis based on the eye rotation.
The information processing device according to any one of (1) to (9) above.
(11)
The optical axis-visual axis correlation unit stores the correlation between the optical axis and the visual axis acquired for the user from whom the optical axis is to be acquired.
The information processing device according to (10) above.
(12)
acquiring a correlation between eye rotation and head pose as a head pose-rotation correlation;
acquiring the head pose; and
correcting the visual axis of the eyeball based on the eye rotation obtained by applying the head pose to the head pose-rotation correlation.
An information processing method executed by a computer, comprising the above.
(13)
storing the correlation between the eye rotation and the head pose acquired for the user whose head pose is to be acquired;
The information processing method according to (12) above.
(14)
displaying, on a display, a target to which the user pays attention;
detecting the eye rotation of the user accompanying the movement of the target;
and
the head pose acquisition process acquires the head pose of the user that changes in accordance with the movement of the target;
the head pose-rotation correlation acquisition process acquires a correlation between the eye rotation accompanying the movement of the target and the head pose as the head pose-rotation correlation.
The information processing method according to (13) above.
(15)
notifying a message prompting the head pose change;
The information processing method according to (14) above.
(16)
The target display process displays the target at a position that causes a change in the head pose.
The information processing method according to (14) or (15) above.
(17)
The target display process includes displaying an obstruction in front of the target, and prompting the user to move the head pose so as to look at the target behind the obstruction.
The information processing method according to (16) above.
(18)
The process of acquiring the head pose-rotation correlation includes sequentially updating the head pose-rotation correlation based on data of the eye rotation and the head pose of the user acquired intermittently based on a preset trigger.
The information processing method according to any one of (13) to (17) above.
(19)
The process of acquiring the head pose-rotation correlation includes storing an average correlation between the eye rotation and the head pose based on data of a plurality of subjects as the head pose-rotation correlation.
The information processing method according to (12) above.
(20)
acquiring a correlation between eye rotation and head pose as a head pose-rotation correlation;
acquiring the head pose; and
correcting the visual axis of the eyeball based on the eye rotation obtained by applying the head pose to the head pose-rotation correlation.
A program that causes a computer to realize the above.
10, 30 Information processing device
12 Head pose estimation unit
13 Head pose-rotation correlation unit
14 Target display unit
16 Rotation detection unit
17 Optical axis estimation unit
18 Optical axis-visual axis correlation unit
23 Display
EB Eyeball
LA Optical axis
OB Obstruction object
TG Target
US User
VA Visual axis
Claims (20)
1. An information processing device comprising:
a head pose-rotation correlation unit that acquires a correlation between eye rotation and head pose as a head pose-rotation correlation;
a head pose estimation unit that acquires the head pose; and
an optical axis-visual axis correlation unit that corrects a visual axis of an eyeball based on the eye rotation obtained by applying the head pose to the head pose-rotation correlation.

2. The information processing device according to claim 1, wherein the head pose-rotation correlation unit stores the correlation between the eye rotation and the head pose acquired for a user whose head pose is to be acquired.

3. The information processing device according to claim 2, further comprising:
a target display unit that displays, on a display, a target to which the user pays attention; and
a rotation detection unit that detects the eye rotation of the user accompanying movement of the target,
wherein the head pose estimation unit acquires the head pose of the user that varies in accordance with the movement of the target, and
the head pose-rotation correlation unit acquires a correlation between the eye rotation accompanying the movement of the target and the head pose as the head pose-rotation correlation.

4. The information processing device according to claim 3, wherein the target display unit notifies a message prompting a change in the head pose.

5. The information processing device according to claim 3, wherein the target display unit displays the target at a position that causes a change in the head pose.

6. The information processing device according to claim 5, wherein the target display unit displays an obstruction in front of the target and prompts a head pose movement to look at the target behind the obstruction.

7. The information processing device according to claim 2, wherein the head pose-rotation correlation unit sequentially updates the head pose-rotation correlation based on data of the eye rotation and the head pose of the user acquired intermittently based on a preset trigger.

8. The information processing device according to claim 7, further comprising:
a scene detection unit that detects a video scene including an attention region as a peculiar scene; and
a rotation detection unit that detects, with the detection of the peculiar scene as the trigger, the eye rotation of the user accompanying movement of the attention region,
wherein the head pose estimation unit acquires the head pose of the user that varies in accordance with the movement of the attention region, and
the head pose-rotation correlation unit updates the head pose-rotation correlation based on the correlation between the eye rotation accompanying the movement of the attention region and the head pose.

9. The information processing device according to claim 1, wherein the head pose-rotation correlation unit stores, as the head pose-rotation correlation, a correlation between an average of the eye rotation and the head pose based on data of a plurality of subjects.

10. The information processing device according to claim 1, further comprising an optical axis estimation unit that acquires an optical axis of the eyeball,
wherein the optical axis-visual axis correlation unit
acquires a correlation between the optical axis and the visual axis as an optical axis-visual axis correlation,
applies the optical axis to the optical axis-visual axis correlation to estimate the visual axis, and
corrects the estimated visual axis based on the eye rotation.

11. The information processing device according to claim 10, wherein the optical axis-visual axis correlation unit stores the correlation between the optical axis and the visual axis acquired for a user whose optical axis is to be acquired.

12. An information processing method executed by a computer, the method comprising:
acquiring a correlation between eye rotation and head pose as a head pose-rotation correlation;
acquiring the head pose; and
correcting a visual axis of an eyeball based on the eye rotation obtained by applying the head pose to the head pose-rotation correlation.

13. The information processing method according to claim 12, further comprising storing the correlation between the eye rotation and the head pose acquired for a user whose head pose is to be acquired.

14. The information processing method according to claim 13, further comprising:
displaying, on a display, a target to which the user pays attention; and
detecting the eye rotation of the user accompanying movement of the target,
wherein the acquisition of the head pose acquires the head pose of the user that changes in accordance with the movement of the target, and
the acquisition of the head pose-rotation correlation acquires a correlation between the eye rotation accompanying the movement of the target and the head pose as the head pose-rotation correlation.

15. The information processing method according to claim 14, further comprising notifying a message prompting a change in the head pose.

16. The information processing method according to claim 14, wherein the displaying of the target displays the target at a position that causes a change in the head pose.

17. The information processing method according to claim 16, wherein the displaying of the target displays an obstruction in front of the target and prompts a head pose movement to look at the target behind the obstruction.

18. The information processing method according to claim 13, wherein the acquisition of the head pose-rotation correlation sequentially updates the head pose-rotation correlation based on data of the eye rotation and the head pose of the user acquired intermittently based on a preset trigger.

19. The information processing method according to claim 12, wherein the acquisition of the head pose-rotation correlation stores, as the head pose-rotation correlation, an average correlation between the eye rotation and the head pose based on data of a plurality of subjects.

20. A program causing a computer to realize:
acquiring a correlation between eye rotation and head pose as a head pose-rotation correlation;
acquiring the head pose; and
correcting a visual axis of an eyeball based on the eye rotation obtained by applying the head pose to the head pose-rotation correlation.
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2024063170 | 2024-04-10 | ||
| JP2024-063170 | 2024-04-10 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2025216110A1 true WO2025216110A1 (en) | 2025-10-16 |
Family
ID=97349821
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/JP2025/013086 (WO2025216110A1, pending) | Information processing device, information processing method, and program | 2024-04-10 | 2025-03-31 |
Country Status (1)
| Country | Link |
|---|---|
| WO (1) | WO2025216110A1 (en) |
Patent Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP2015169959A (en) * | 2014-03-04 | 2015-09-28 | National University Corporation Shizuoka University | Rotation angle calculation method, gazing point detection method, information input method, rotation angle calculation device, gazing point detection device, information input device, rotation angle calculation program, gazing point detection program, and information input program |
| JP2016198387A (en) * | 2015-04-13 | 2016-12-01 | Crewt Medical Systems, Inc. | Visual inspection device, visual target correction method for visual inspection device, and display device |
| JP2017129898A (en) * | 2016-01-18 | 2017-07-27 | Sony Corporation | Information processing apparatus, information processing method, and program |
| US20210349528A1 (en) * | 2018-10-24 | 2021-11-11 | Pcms Holdings, Inc. | Systems and methods for region of interest estimation for virtual reality |
| JP2021069008A (en) * | 2019-10-23 | 2021-04-30 | Canon Inc. | Electronic equipment, control method for electronic equipment, program and storage medium |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 25786393; Country of ref document: EP; Kind code of ref document: A1 |