

Semiconductor device, method, and head-mounted display

Info

Publication number
WO2025243410A1
Authority
WO
WIPO (PCT)
Prior art keywords
display
image
data
image data
positional relationship
Prior art date
Legal status
Pending
Application number
PCT/JP2024/018748
Other languages
French (fr)
Japanese (ja)
Inventor
有司 梅津
竜志 大塚
功太郎 江崎
Current Assignee
Socionext Inc
Original Assignee
Socionext Inc
Application filed by Socionext Inc
Priority to PCT/JP2024/018748
Publication of WO2025243410A1

Classifications

    • GPHYSICS
    • G09EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09GARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
    • G09G5/00Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators
    • G09G5/36Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators characterised by the display of a graphic pattern, e.g. using an all-points-addressable [APA] memory
    • G09G5/38Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators characterised by the display of a graphic pattern, e.g. using an all-points-addressable [APA] memory with means for controlling the display position
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00Details of television systems
    • H04N5/64Constructional details of receivers, e.g. cabinets or dust covers

Definitions

  • the present invention relates to a semiconductor device, a method, and a head-mounted display.
  • the user can simultaneously view both the transmitted image through the transmissive display and the captured image displayed on the transmissive display.
  • Patent No. 7246708; Japanese Patent Application Laid-Open No. 2008-96867; Patent No. 5855206
  • the present invention aims to improve the visibility of the boundary between a transmitted image and a captured image when the captured image is displayed on a transmissive display.
  • the semiconductor device of the embodiment includes a memory, a display data generation circuit, and a display control circuit.
  • the memory stores first positional relationship data indicating the positional relationship between a subject included in a transmitted image transmitted through the transmissive display and a subject included in first image data captured by a first camera that captures the direction in which the back of the transmissive display faces.
  • the display data generation circuit extracts and deforms at least a partial area of the first image data based on the first positional relationship data, and generates display data such that the outline of the subject included in the transmitted image and the outline of the subject included in the area of the first image data are continuous on the transmissive display.
  • the display control circuit displays the display data on the transmissive display.
  • FIG. 1 is a diagram illustrating an example of the overall configuration of a head-mounted display according to the first embodiment.
  • FIG. 2 is a diagram illustrating an example of the configuration of an SoC according to the first embodiment.
  • FIG. 3 is a diagram showing an example of the difference between a video see-through display image and a transmission image according to the first embodiment.
  • FIG. 4 is a diagram showing an example of the installation position of the calibration camera according to the first embodiment.
  • FIG. 5 is a diagram illustrating an example of a subject for the calibration process according to the first embodiment.
  • FIG. 6 is a diagram showing an example of a subject 9b photographed from the side by the calibration camera according to the first embodiment.
  • FIG. 7 is a flowchart showing an example of the flow of the calibration process according to the first embodiment.
  • FIG. 8 is a diagram showing an example of the processing content of each step in the flowchart of FIG. 7.
  • FIG. 9 is a diagram showing an example of distortion contained in each image data in the calibration process according to the first embodiment.
  • FIG. 10 is a diagram illustrating an example of distortion contained in each image data when the head-mounted display according to the first embodiment is used.
  • FIG. 11 is a diagram showing an example of the breakdown of the deformation process B in FIG. 10.
  • FIG. 12 is a flowchart showing an example of the flow of a process for generating display data when the head-mounted display according to the first embodiment is used.
  • FIG. 13 is a diagram showing an example of correction of video see-through image data according to the first embodiment.
  • FIG. 14 is a diagram illustrating the principle of visual assistance using the transmissive display and liquid crystal shutter according to the first embodiment.
  • FIG. 15 is a diagram illustrating an example of a display mode of the transmissive display according to the first embodiment.
  • FIG. 16 is a diagram showing an example of the flow of the bright place/dark place region extraction process according to the first embodiment.
  • FIG. 17 is a diagram illustrating an example of the flow of the EV value acquisition process according to the first embodiment.
  • FIG. 18 is a diagram showing an example of a threshold value determined based on a histogram according to the first embodiment.
  • FIG. 19 is a diagram showing an example of a general EV/Lux conversion table.
  • FIG. 20 is a diagram showing an example of the relationship between the EV value and the transmittance of the lens according to the first embodiment.
  • FIG. 21 is a diagram showing an example of a process for estimating EV values for each rectangular block of video see-through image data according to the first embodiment.
  • FIG. 22 is a diagram showing an example of the relationship between the brightness of a video see-through display image and the EV value in a scene according to the first embodiment.
  • FIG. 23 is a diagram showing an example of the relationship between the brightness of the video see-through display image according to the first embodiment and the luminance difference of the EV value in the scene.
  • FIG. 24 is a diagram showing an example of the relationship between the brightness of the video see-through display image and the average value of luminance for each block according to the first embodiment.
  • FIG. 25 is a diagram showing an example of the brightness adjustment process for video see-through image data according to the first embodiment.
  • FIG. 26 is a diagram showing an example of video see-through image data before and after correction according to the first embodiment.
  • FIG. 27 is a diagram illustrating an example of a light-blocking target region according to the first modification of the first embodiment.
  • FIG. 28 is a diagram illustrating an example of a case where a bright area exists according to the second modification of the first embodiment.
  • FIG. 29 is a diagram illustrating an example of a light-blocking target region according to the second modification of the first embodiment.
  • FIG. 30 is a diagram illustrating an example of the overall configuration of a head-mounted display according to the second embodiment.
  • Figure 31 is a diagram showing an example of the positional relationship between the eye tracking camera and the user's eyes in the second embodiment.
  • Figure 32 is a diagram showing another example of the positional relationship between the eye tracking camera and the user's eyes in the second embodiment.
  • FIG. 33 is a diagram illustrating an example of the configuration of an SoC according to the second embodiment.
  • FIG. 34 is a flowchart showing an example of the flow of processing for acquiring the correction amount of pupil position misalignment according to the second embodiment.
  • FIG. 35 is a flowchart showing an example of the flow of a process for generating display data when using the head-mounted display according to the second embodiment.
  • (First embodiment) FIG. 1 is a diagram showing an example of the overall configuration of a head-mounted display 1a according to the first embodiment.
  • the head-mounted display 1a is equipped with high-sensitivity cameras (video see-through cameras 41a and 41b) that can capture images even in dark places, and supports the user's vision by displaying camera images in dark places.
  • the head-mounted display 1a includes, for example, an eyeglass body 10, lenses 2a and 2b, transparent displays 3a and 3b, video see-through cameras 41a and 41b, display projectors 5a and 5b, an ambient light sensor 60, head tracking cameras 63a and 63b, and a system on a chip (SoC) 100a.
  • the lenses 2a, 2b, transmissive displays 3a, 3b, video see-through cameras 41a, 41b, display projectors 5a, 5b, ambient light sensor 60, head tracking cameras 63a, 63b, and SoC 100a are fixed to the eyeglass body 10.
  • the eyeglass body 10 is shaped so that it can be worn on the user's head, similar to the frame of regular eyeglasses.
  • the eyeglass body 10 includes, for example, a front portion that secures the lenses 2a and 2b, and temple portions that can be hooked over the user's ears.
  • Lens 2a and 2b are transparent eyeglass lenses positioned in front of the eyes of a user wearing head-mounted display 1a. Lenses 2a and 2b also function as liquid crystal shutters. This liquid crystal shutter function allows lenses 2a and 2b to adjust the amount of external light that passes through lenses 2a and 2b. Hereinafter, when there is no need to distinguish between individual lenses 2a and 2b, they will simply be referred to as lenses 2.
  • Transparent displays 3a and 3b are provided on lenses 2a and 2b and are capable of displaying images. Furthermore, transparent displays 3a and 3b transmit external light. Therefore, a user wearing head-mounted display 1a can see both the transmitted image transmitted through transparent displays 3a and 3b and the display image displayed on transparent displays 3a and 3b.
  • the display images displayed on transparent displays 3a and 3b are, for example, images obtained by subjecting captured image data captured by video see-through cameras 41a and 41b (described below) to processing such as deformation. Note that deformation is an example of correction.
  • the transparent displays 3a and 3b are provided in at least a portion of the area of the lenses 2a and 2b.
  • the transparent displays 3a and 3b are provided in a portion of the area located at the center of the lenses 2a and 2b.
  • the transparent displays 3a and 3b are, for example, screens for optical see-through displays such as waveguides.
  • Of the two surfaces of the transparent display 3, the surface facing the user wearing the head-mounted display 1a is referred to as the front surface, and the surface facing away from the user is referred to as the back surface.
  • the user wearing the head-mounted display 1a is an example of an observer of the transparent display 3.
  • the video see-through cameras 41a and 41b are cameras that capture images in the direction in which the back of the transmissive display 3 faces.
  • the direction in which the back of the transmissive display 3 faces corresponds to the front direction for a user wearing the head-mounted display 1a.
  • the video see-through cameras 41a and 41b are provided, for example, in positions above the centers of the lenses 2a and 2b of the eyeglass body 10.
  • the video see-through cameras 41a and 41b are an example of a first camera in this embodiment.
  • the captured image data captured by the video see-through cameras 41a and 41b is an example of first image data in this embodiment.
  • Video see-through cameras 41a and 41b use image sensors capable of capturing images in dark places, such as a SPAD (Single Photon Avalanche Diode) sensor or a high-sensitivity CMOS (Complementary Metal Oxide Semiconductor) sensor.
  • Alternatively, the video see-through cameras 41a and 41b may be IR (Infrared Rays) sensors capable of capturing images in the dark by irradiating the surroundings with IR light.
  • the captured image data captured by the video see-through camera 41 is referred to as video see-through image data.
  • When the video see-through image data is projected onto the transmissive display 3 by the display projector 5, the image displayed on the transmissive display 3 is referred to as a video see-through display image.
  • "when video see-through image data is projected onto the transmissive display 3 by the display projector 5" also includes when video see-through image data after various corrections is projected onto the transmissive display 3.
  • Display projectors 5a and 5b display images on transmissive displays 3a and 3b under the control of SoC 100a (described below).
  • Display projectors 5a and 5b are, for example, micro LEDs (Light Emitting Diodes).
  • Hereinafter, when there is no need to distinguish between the individual display projectors 5a and 5b, they will simply be referred to as display projectors 5.
  • the Ambient Light Sensor 60, also known as an illuminance sensor, acquires ambient light information.
  • the Ambient Light Sensor 60 measures the intensity of ambient light.
  • the head tracking cameras 63a and 63b detect the movement of the head of a user wearing the head-mounted display 1a.
  • Hereinafter, when there is no need to distinguish between the individual head tracking cameras 63a and 63b, they will simply be referred to as head tracking cameras 63.
  • SoC 100a is a computer that controls each component of head-mounted display 1a.
  • SoC 100a is an example of a semiconductor device in this embodiment.
  • the head-mounted display 1a may further include a depth sensor that measures distance, an IMU (Inertial Measurement Unit) that measures the posture of the user wearing the head-mounted display 1a, and the like.
  • depth sensors that measure distance include, but are not limited to, a ToF (Time of Flight) sensor or a stereo camera.
  • FIG. 2 is a diagram showing an example of the configuration of the SoC 100a according to the first embodiment.
  • the SoC 100a includes, for example, I2C (Inter-Integrated Circuit) interfaces 11 and 22, a Mono ISP (Image Signal Processor) 12, a DSP (Digital Signal Processor) & AI (Artificial Intelligence) Accelerator 13, SRAMs (Static Random Access Memory) 14a to 14c, and the other circuits described below.
  • the head mounted display 1a further includes an IMU 61, a ToF sensor 62, a flash memory 31, and a DRAM 32 outside the SoC 100a.
  • the flash memory 31 and the DRAM 32 store various types of data under the control of the SoC 100a.
  • the calibration cameras 42a and 42b shown in FIG. 2 are cameras provided outside the head-mounted display 1a.
  • the calibration camera 42 is an example of the second camera in this embodiment. Details of the calibration camera 42 will be described later with reference to FIG. 4.
  • I2C interfaces 11 and 22 are communication interfaces that perform synchronous serial communication.
  • I2C interface 11 acquires acceleration and angular velocity from IMU 61.
  • I2C interface 22 acquires the intensity of ambient light from Ambient Light Sensor 60.
  • Mono ISP 12 acquires information from various sensors used to recognize the surrounding situation and corrects the acquired information. For example, Mono ISP 12 acquires distance measurement data from surrounding objects from ToF sensor 62. Mono ISP 12 also acquires image data from head tracking cameras 63a and 63b, for example, and corrects brightness, etc.
  • the DSP & AI Accelerator 13 includes a DSP 131 and an AI Accelerator 132.
  • the DSP & AI Accelerator 13 performs various tracking processes based on various data acquired and corrected by the Mono ISP 12. For example, the DSP & AI Accelerator 13 may perform head tracking processing to detect head movement of a user wearing the head-mounted display 1a. Alternatively, the DSP & AI Accelerator 13 may perform hand tracking processing or eye tracking processing, depending on the type of sensor.
  • the DSP & AI Accelerator 13 outputs tracking information generated as a result of various tracking processes to the Time Warp 17. Note that various tracking processes are not required in this embodiment.
  • SRAMs 14a to 14c store various data used in various processes and data generated by various processes.
  • SRAM 14b is used as a temporary storage location for rendered image data and captured image data. It may also store first positional relationship data used in the image data transformation process performed by Warp 18, described below. SRAM capacity is usually small and may not be able to store the entire image. In such cases, SRAM may be used as a FIFO (First in First out) buffer to store only a portion of the image that has not yet been used in subsequent processing.
  • SRAMs 14a to 14c are an example of memory in this embodiment. Note that each process performed by SoC 100a uses SRAMs 14a to 14c as a buffer for temporary data storage.
  • Flash memory 31 is an external storage device, and may be NAND memory, an SSD, an SD card, or the like.
  • the GPU 15 performs rendering processing of the image data displayed on the transmissive display 3.
  • Time Warp 17 is a processing circuit that corrects delays in image data caused by processing by GPU 15.
  • the time correction performed by Time Warp 17 has the effect of reducing the discomfort, known as "VR (Virtual Reality) sickness," experienced by users who view image data displayed on the transmissive display 3.
  • Time Warp 17 performs correction processing using, for example, the results of head tracking processing performed by DSP & AI Accelerator 13.
  • Color ISP 16 and Warp 18 acquire captured image data from the video see-through camera 41 and the calibration camera 42.
  • Color ISP 16 converts the acquired image data into RGB (Red-Green-Blue color model) image data.
  • Warp 18 performs correction processing on the captured image data of the video see-through camera 41 that has been converted into RGB image data by Color ISP 16.
  • the correction processing by Warp 18 includes, for example, a transformation processing based on the first positional relationship data for at least a partial area of the captured image data, and a processing for extracting at least a partial area of the captured image data as a display target.
  • Warp 18 generates video see-through image data so that the outline of the subject included in the transmitted image and the outline of the subject included in part of the captured image data are continuous on the transmissive display 3.
  • the video see-through image data is an example of display data in this embodiment.
  • Warp 18 is an example of a display data generation circuit in this embodiment.
  • STAT 21 identifies various pieces of information from the image data captured by the video see-through camera 41, which has been converted into an RGB image by Color ISP 16. For example, STAT 21 detects the information required for AE (Auto Exposure) and AWB (Auto White Balance). STAT 21 also extracts a histogram from the captured image data, calculates the average brightness value for each rectangular block of the captured image data, and counts the number of saturated pixels.
  • a histogram is a graph that shows the distribution of brightness values of pixels in image data, and typically the horizontal axis represents brightness and the vertical axis represents the number of pixels.
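
The histogram and per-block statistics described above can be sketched in software. The following Python snippet only illustrates the kind of statistics STAT 21 is described as producing (a 256-bin histogram, the average brightness value per rectangular block, and the saturated-pixel count); the function name, block size, and saturation level are assumptions, not details from the publication.

```python
import numpy as np

def luminance_stats(gray, block=32, sat_level=255):
    """Histogram, per-block mean luminance, and saturated-pixel count
    for an 8-bit grayscale image (illustrative stand-in for STAT 21)."""
    hist = np.bincount(gray.ravel(), minlength=256)      # 256-bin brightness histogram
    h, w = gray.shape
    hb, wb = h // block, w // block
    blocks = gray[:hb * block, :wb * block].reshape(hb, block, wb, block)
    block_mean = blocks.mean(axis=(1, 3))                # average value per rectangular block
    saturated = int((gray >= sat_level).sum())           # number of saturated pixels
    return hist, block_mean, saturated
```
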
  • the CPU 23 performs AE and AWB based on information acquired via the STAT 21 and the I2C interface 22.
  • the CPU 23 also performs extraction processing of bright and dark areas based on the brightness of the scene and the luminance of the captured image data. Based on the results of these processing, the CPU 23 generates transmittance data for controlling the liquid crystal shutter of the lens 2a. Details of the extraction processing of bright and dark areas by the CPU 23 will be described later.
  • the CPU 23 is an example of an image processing circuit in this embodiment.
  • the display controller 19 controls the display projector 5 to display an image on the transmissive display 3.
  • the display controller 19 also controls the degree of transmittance of the liquid crystal shutter of the lens 2 based on transmittance data generated by the CPU 23.
  • the display controller 19 also converts the video see-through image data corrected by the Warp 18 and the rendering image data rendered by the GPU 15 into a format suitable for display on the transmissive display 3.
  • the display controller 19 is an example of a display control circuit in this embodiment.
  • the Display Controller 19 comprises an EN (Enable) block 191 and a Blend block 192.
  • the EN block 191 determines whether the display of saturated pixels is on or off. If the EN block 191 determines that the pixel is off, it corrects the pixel to be hidden (black). If the EN block 191 determines that the pixel is on, it corrects the brightness of the displayed image corresponding to the pixel.
  • the Blend block 192 combines the rendering image data and the video see-through image data.
  • the DRAM controller 24 controls the storage of various data in the DRAM 32 and the reading of various data from the DRAM 32.
  • the configuration of the head-mounted display 1a in this embodiment is not limited to the example shown in Figures 1 and 2.
  • the head-mounted display 1a does not necessarily include the head tracking camera 63a, IMU 61, and ToF sensor 62.
  • the head-mounted display 1a may also include other sensors, etc.
  • the calibration process is performed, for example, before the head-mounted display 1a is shipped. More specifically, the calibration process is a process for making corrections that take into account differences between the eye position of a user wearing the head-mounted display 1a and the mounting position of the video see-through camera 41.
  • When a user wears the head-mounted display 1a, the user's eyes are generally positioned near the centers of the lenses 2a and 2b, respectively. Therefore, the transmitted image that passes through the lenses 2a and 2b and the transmissive displays 3a and 3b is based on a viewpoint near the centers of the lenses 2a and 2b.
  • FIG. 3 is a diagram showing an example of the difference between the video see-through display images 931, 931a and the transmission image 941 according to the first embodiment.
  • the video see-through display image 931 is an image in which the uncorrected video see-through image data is displayed as is on the transmissive display 3.
  • a difference occurs in the position and/or size of the object (subject 9a) included in the video see-through display image 931 and the object (subject 9a) included in the transmission image.
  • In FIG. 3, the video see-through display image 931 and the transmission image 941 are shown separately, but in reality, the video see-through display image 931 is displayed superimposed on the transmission image 941 on the transmissive display 3.
  • the contours of the subject 9a in the video see-through display image 931 and the subject 9a in the transmission image 941 are not continuous, resulting in a double image.
  • a user wearing the head-mounted display 1a sees both the subject 9a in the video see-through display image 931 and the subject 9a in the transmission image 941 as a double image, reducing visibility.
  • the "subject" includes not only objects depicted in the captured image data, but also objects included in the transmission image. Note that objects also include people.
  • the SoC 100a corrects the video see-through image data before displaying it on the transmissive display 3.
  • the position and size of the subject 9a in the video see-through display image 931a based on the corrected video see-through image data match the position and size of the subject 9a in the transmissive image 941.
  • the subject 9a does not appear as a double image on the transmissive display 3, so user visibility is not reduced.
  • the calibration process acquires first positional relationship data used for such correction during use.
  • the first positional relationship data is data indicating the positional relationship between the subject 9a included in the transmissive image 941 transmitted through the transmissive display 3 and the subject included in the first image data captured by the video see-through camera 41.
  • the first positional relationship data represents the amount of alignment correction of the first image data in the warp process.
  • the alignment correction amount is defined, for example, by coordinates that represent the positional relationship of feature points before and after the warp process.
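
As a hedged illustration of how feature-point coordinates before and after the warp can define an alignment correction amount, the sketch below fits a single homography with OpenCV and returns a function that applies it. The publication does not state that a homography is used; the model choice, the RANSAC option, and all names here are assumptions.

```python
import cv2
import numpy as np

def alignment_warp(src_pts, dst_pts, size):
    """src_pts: Nx2 float32 feature-point positions before the warp,
    dst_pts: Nx2 float32 positions they must move to so that contours match,
    size: (width, height) of the output image. All inputs are hypothetical."""
    H, _ = cv2.findHomography(src_pts, dst_pts, cv2.RANSAC)
    def apply(image):
        # Warp the image so the feature points land on their target positions.
        return cv2.warpPerspective(image, H, size)
    return apply
```
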
  • FIG. 4 is a diagram showing an example of the installation positions of the calibration cameras 42a and 42b according to the first embodiment.
  • the calibration cameras 42a and 42b are installed at positions corresponding to the eye positions of a user wearing the head-mounted display 1a.
  • the calibration cameras 42a and 42b are installed near the centers of the lenses 2a and 2b.
  • the calibration cameras 42a and 42b pass through the transparent display 3 from the front side of the transparent display 3 and capture an image in the direction in which the rear side of the transparent display 3 faces. Therefore, the calibration cameras 42a and 42b pass through the transparent display 3 to capture an image of a subject located on the rear side of the transparent display 3.
  • the calibration camera 42 is not used after the calibration process and is therefore not included in the configuration of the head-mounted display 1a.
  • It is preferable that the subject used for the calibration process be one from which feature points can be easily extracted.
  • FIG. 5 is a diagram showing an example of a subject 9b for calibration processing according to the first embodiment.
  • a checkerboard chart in which black and white rectangles are arranged in a grid pattern is used as the subject 9b for calibration processing.
  • the grid points of the checkerboard chart are examples of feature points 90. Note that while FIG. 5 shows one grid point as an example of a feature point 90, all grid points in the image data captured by the calibration camera 42 and the video see-through camera 41 become feature points 90. Note that the subject 9b for calibration processing is not limited to the example shown in FIG. 5.
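
A minimal sketch of extracting the grid points of such a checkerboard chart as feature points 90, using OpenCV's chessboard detector; the pattern size and the sub-pixel refinement settings are illustrative assumptions.

```python
import cv2

def checkerboard_feature_points(gray, pattern=(9, 6)):
    """Detect the inner grid points of a checkerboard chart in a grayscale image."""
    found, corners = cv2.findChessboardCorners(gray, pattern)
    if not found:
        return None
    # Refine the corners to sub-pixel accuracy for more stable deformation parameters.
    criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 1e-3)
    return cv2.cornerSubPix(gray, corners, (11, 11), (-1, -1), criteria)
```
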
  • FIG. 6 is a diagram showing an example of a side view of subject 9b photographed by the calibration camera 42 according to the first embodiment.
  • both the video see-through camera 41 and the calibration camera 42 photograph subject 9b located toward the back surface 302 of the transmissive display 3.
  • the calibration camera 42 photographs subject 9b from the front surface 301 of the transmissive display 3 toward the back surface 302, passing through the lens 2 and the transmissive display 3.
  • the video see-through camera 41 photographs subject 9b without passing through the lens 2 or the transmissive display 3.
  • FIG. 7 is a flowchart showing an example of the flow of calibration processing according to the first embodiment.
  • FIG. 8 is a diagram showing an example of the processing content of each step in the flowchart of FIG. 7. Steps 1-7 in FIG. 8 correspond to S1-S7 in FIG. 7.
  • Color ISP 16 acquires image data (first calibration image data 901 shown in FIG. 8) of subject 9b photographed by calibration camera 42 (S1).
  • the first calibration image data 901 is image data of subject 9b photographed from the front surface 301 of the transparent display 3, passing through the transparent display 3 to the rear surface 302 of the transparent display 3.
  • the first calibration image data 901 is an example of second image data in this embodiment.
  • calibration camera 42 does not have to be mounted on a head-mounted display, and a commercially available digital camera, web camera, etc. may be used.
  • the photographed image data is transferred from the digital camera, web camera, etc. to SoC 100a after photographing, and subsequent processing is performed.
  • the CPU 23 extracts feature points 90 of the subject 9b in the first calibration image data 901 (S2).
  • the feature points 90 extracted from the first calibration image data 901 are designated as first feature points. Extracting feature points 90 means identifying the positions of the feature points 90 (for example, grid points of a checkerboard chart) depicted in the first calibration image data 901.
  • the first calibration image data 901 is a pseudo-image of the transmission image seen by a user wearing the head-mounted display 1a, but differs from the transmission image actually seen by the user due to lens distortion of the calibration camera 42, etc. Distortion caused by the characteristics of the calibration camera 42, including lens distortion, is called camera distortion.
  • the CPU 23 corrects the first calibration image data 901 using the internal parameters of the calibration camera 42 before extracting the first feature points in S2. Correction using the internal parameters is an inverse correction that cancels out the camera distortion of the calibration camera 42.
  • the CPU 23 converts the first calibration image data 901 into a state equivalent to the transmission image seen by a user wearing the head-mounted display 1a. In Figure 8, the converted image data is shown as pseudo transmission image data 911.
  • the internal parameters of the calibration camera 42 are correction parameters for removing the effects of camera distortion of the calibration camera 42.
  • the internal parameters can be calculated using a known calibration tool such as OpenCV (registered trademark).
  • the internal parameters may be calculated by an information processing device external to the SoC 100a before the processing of Figure 7 and stored in memory within the SoC 100a.
  • the video see-through camera 41 like the calibration camera 42, also has camera distortion. Therefore, the internal parameters for removing the effects of camera distortion of the video see-through camera 41 may also be stored in memory within the SoC 100a.
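
Since the text notes that the internal parameters can be calculated with a known calibration tool such as OpenCV, the sketch below shows one common way this is done: estimating a camera matrix and distortion coefficients from several checkerboard captures and then undistorting an image. The variable names and this particular workflow are assumptions, not the embodiment's exact procedure.

```python
import cv2

def internal_parameters(obj_pts, img_pts, image_size):
    """obj_pts/img_pts: lists of 3D chart points and detected 2D corners from
    several (hypothetical) captures of the checkerboard chart."""
    _, camera_matrix, dist_coeffs, _, _ = cv2.calibrateCamera(
        obj_pts, img_pts, image_size, None, None)
    return camera_matrix, dist_coeffs

def remove_camera_distortion(image, camera_matrix, dist_coeffs):
    # Inverse correction that cancels the camera distortion of the capturing camera.
    return cv2.undistort(image, camera_matrix, dist_coeffs)
```
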
  • Color ISP 16 acquires image data (video see-through image data 921 shown in Figure 8) of subject 9b captured by video see-through camera 41 from video see-through camera 41 (S3).
  • the display controller 19 controls the display projector 5 to display the video see-through image data 921 on the transparent display 3 (S4).
  • the video see-through image data 921 displayed on the transparent display 3 is shown as a video see-through display image 932.
  • the subject 9b is removed from in front of the head-mounted display 1a by an operator or the like. As a result, the subject 9b is no longer within the angle of view of the calibration camera 42.
  • Color ISP 16 acquires image data from the calibration camera 42, which is an image of the transmissive display 3 displaying the video see-through display image 932 (S5).
  • This image data is the second calibration image data 902 shown in FIG. 8.
  • the second calibration image data 902 is also an example of the third image data in this embodiment.
  • the CPU 23 extracts feature points 90 of the subject 9b in the second calibration image data 902 (S6).
  • the feature points 90 extracted from the second calibration image data 902 are designated as second feature points. More specifically, similar to the processing in S2, the CPU 23 corrects camera distortion of the second calibration image data 902 using internal parameters of the calibration camera 42 before extracting the second feature points.
  • the corrected second calibration image data 902 is a pseudo-reproduction of the video see-through display image seen by a user wearing the head-mounted display 1a, and is therefore referred to as pseudo see-through display image data 940.
  • the CPU 23 extracts the second feature points from the pseudo see-through display image data 940.
  • the CPU 23 calculates deformation parameters from the first feature points extracted in S2 and the second feature points extracted in S6 (S7).
  • the deformation parameters are the first positional relationship data described above.
  • the deformation parameters are data indicating the positional relationship between the subject 9b included in the pseudo-transmitted image data 911 and the subject 9b included in the video see-through image data 921 captured by the video see-through camera 41.
  • the pseudo-transmitted image data 911 in the calibration process corresponds to the transmitted image when the head-mounted display 1a is actually in use.
  • the pseudo-see-through display image data 940 generated from the video see-through image data 921 in the calibration process corresponds to the video see-through display image when the head-mounted display 1a is actually in use. Therefore, the deformation parameters calculated in S7 indicate the positional relationship between the subject included in the transmitted image and the subject included in the video see-through image data captured by the video see-through camera 41.
  • the CPU 23 calculates the positional relationship between the first feature point and the second feature point, thereby calculating deformation parameters that can match the first feature point and the second feature point. As shown in Step 7-1 of FIG. 8, the CPU 23 calculates deformation parameters that can correct the pseudo see-through display image data 940 so that it matches the pseudo transmitted image data 911. Also, as shown in Step 7-2 of FIG. 8, the CPU 23 calculates the amount of distortion caused by the display system based on the feature point 90 of the subject 9b in the video see-through image data 921 and the feature point 90 of the subject 9b in the pseudo see-through display image data 940.
  • the amount of distortion caused by the display system is the magnitude of the image distortion caused by the characteristics of the lens 2 and the transparent display 3.
  • the characteristics of the lens 2 and the transparent display 3 include, for example, the curvature of the lens 2 and the transparent display 3.
  • the CPU 23 calculates the deformation parameters (first positional relationship data) based on the deformation parameters calculated in Step 7-1 and the distortion amount calculated in Step 7-2. More specifically, the CPU 23 performs a conversion process on the deformation parameters calculated in Step 7-1. This conversion process will be described later with reference to Figures 9-11.
  • the CPU 23 stores the calculated transformation parameters (first positional relationship data) in a memory such as the SRAM 14b or the flash memory 31 (S8). At this point, the processing of this flowchart ends.
  • the calibration process shown in FIG. 7 may be executed by the SoC 100a as described above, or may be executed by another information processing device external to the head-mounted display 1a.
  • the other information processing device may be, for example, a high-performance PC (Personal Computer).
  • the first positional relationship data generated by the calibration process executed by the SoC 100a of one head-mounted display 1a may be stored in the memory of multiple head-mounted displays 1a.
  • FIG. 9 is a diagram showing an example of distortion contained in each image data in the calibration process according to the first embodiment.
  • the deformation parameter that can be directly calculated in Step 7-1 shown in FIG. 8 above is deformation parameter A for the pseudo see-through display image data 940 based on the second calibration image data 902 captured by the calibration camera 42.
  • the video see-through image data 921 acquired in S3 of FIG. 7 is affected by distortion due to the display system when it is displayed on the transmissive display 3 in Step 4.
  • the second calibration image data 902 acquired in Step 5 also includes distortion due to the display system.
  • the second calibration image data 902 also includes distortion (camera distortion) due to the calibration camera 42.
  • the CPU 23 removes this camera distortion through correction using internal parameters.
  • the CPU 23 calculates a deformation parameter A that can correct the pseudo see-through display image data 940 from which the camera distortion has been removed so that it matches the pseudo transmission image data 911.
  • the pseudo see-through display image data 940a after deformation using the deformation parameter A matches the pseudo transmission image data 911.
  • When the head-mounted display 1a is in use, the calibration camera 42 is not used, and therefore the object to be deformed during use is the video see-through image data 921.
  • FIG. 10 is a diagram showing an example of distortion contained in each image data when the head-mounted display 1a according to the first embodiment is in use. Also, FIG. 11 is a diagram showing an example of the breakdown of deformation process B in FIG. 10.
  • the correction process when using the head-mounted display 1a aims to ensure that the feature points of the subject in the video see-through display image 933 displayed on the transmissive display 3 match the feature points of the subject in the transmissive image.
  • deformation process B when using the head-mounted display 1a requires deformation parameters B that take into account the distortion that occurs in the video see-through image data 921a after deformation.
  • When the video see-through image data 921 is displayed on the transmissive display 3, it is affected by distortion due to the display system and by camera distortion of the video see-through camera 41.
  • the effect of camera distortion is removed by correction using internal parameters, so it does not need to be taken into account in the transformation process.
  • transformation process B is a combination process of transformation process S701 for distortion caused by the display system, transformation process S702 using transformation parameter A calculated in the calibration process, and inverse transformation process S703 for distortion caused by the display system.
  • the CPU 23 calculates transformation parameter B by combining the transformation parameters from processes S701 to S703. Transformation parameter B indicates, for example, the difference in the positions of feature points 90 (e.g., lattice points) of subject 9b before and after processes S701 to S703.
  • This transformation parameter B is the first positional relationship data.
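
One way to picture how the processes S701 to S703 combine into a single transformation parameter B is to represent each process as a dense remap field and compose the fields, as sketched below. Representing the deformation parameters as remap fields is an assumption made for illustration; the publication only states that the parameters indicate feature-point positions before and after the processes.

```python
import cv2

def compose_maps(map_outer, map_inner):
    """Compose two dense remap fields (each a pair of float32 map_x, map_y arrays)
    so that one cv2.remap with the result equals applying `map_inner` first and
    then `map_outer`."""
    x_o, y_o = map_outer
    x_i, y_i = map_inner
    x = cv2.remap(x_i, x_o, y_o, cv2.INTER_LINEAR)  # sample the inner map at the outer map's coordinates
    y = cv2.remap(y_i, x_o, y_o, cv2.INTER_LINEAR)
    return x, y

# Illustrative composition: S701 applied first, then S702 (deformation parameter A),
# then S703 (inverse of the display-system distortion).
# map_b = compose_maps(map_s703, compose_maps(map_s702, map_s701))
```
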
  • FIG. 12 is a flowchart showing an example of the flow of the display data generation process when using the head-mounted display 1a according to the first embodiment. Before the process in FIG. 12, it is assumed that the calibration process in FIG. 7 has been completed and the transformation parameters (first positional relationship data) have already been stored in a memory such as SRAM 14b or flash memory 31.
  • Color ISP 16 acquires video see-through image data from video see-through camera 41 (S21). Color ISP 16 then converts the acquired video see-through image data into RGB image data. STAT 21 extracts a histogram, calculates the average brightness value for each rectangular block, and counts the number of saturated pixels for the video see-through image data converted into an RGB image by Color ISP 16. Color ISP 16 and STAT 21 store the video see-through image data converted into RGB image data, the histogram, the calculation results of the average brightness value for each rectangular block, and the number of saturated pixels in memory such as SRAM 14a-14c or DRAM 32.
  • video see-through image data converted into RGB image data will also be simply referred to as video see-through image data.
  • Warp 18 acquires the transformation parameters (first positional relationship data) generated in the calibration process of FIG. 7 from SRAMs 14a to 14c, etc. (S22).
  • the CPU 23 also performs a process of extracting bright and dark areas from the video see-through image data converted into RGB image data (S23). For example, the CPU 23 extracts bright areas from the video see-through image data, and calculates the bright areas of the transmitted image based on the bright areas from the video see-through image data and the first positional relationship data. Details of the process of extracting bright and dark areas will be described later.
  • Warp 18 deforms at least a portion of the video see-through image data according to the alignment correction amount defined by the first positional relationship data (S24). This deformation makes it possible to align the contour of the subject included in the transmitted image with the contour of the subject included in the video see-through display image. Therefore, even when the transmitted image is visible, it is possible to prevent the contour of the subject included in the transmitted image and the contour of the subject included in the video see-through display image from appearing double.
  • the head-mounted display 1a displays a video see-through image to aid vision in dark places. For this reason, not all areas of the video see-through image data are necessarily subject to display. For this reason, Warp 18 corrects only the areas that require display, based on the results of the bright and dark area extraction process in S23.
  • FIG. 13 is a diagram showing an example of correction of video see-through image data 922 according to the first embodiment.
  • Warp 18 deforms only areas of the video see-through image data 922 that have been determined to be dark or dimly lit in the process of extracting bright and dark areas. Warp 18 does not deform areas determined to be bright. For example, Warp 18 deforms the parts of the video see-through image data 922 that are dark or dimly lit and extracted as display targets, and that border the bright areas of the transmitted image, to generate display data.
  • Warp 18 sets the value "0" to areas not subject to correction.
  • the areas at both ends of the video see-through image data 922 are determined to be bright areas, so Warp 18 sets the value "0" to those areas.
  • That is, the original image data in those areas is discarded.
  • In the areas of the corrected video see-through image data 922a where the value "0" is set, nothing is displayed when the data is displayed on the transmissive display 3.
  • the areas shown in black in the corrected video see-through image data 922a in Figure 13 have the value "0" set, so when actually displayed on the transmissive display 3, the background is visible through them. Therefore, in areas where the value "0" is set, the user only sees the transmissive image.
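
A minimal sketch, assuming the dark/twilight display-target areas are available as a boolean mask, of setting the value "0" everywhere else so that only the transmitted image is visible there; the mask name and data layout are assumptions.

```python
import numpy as np

def mask_non_display(corrected, display_target_mask):
    """Zero out every pixel outside the dark/twilight display-target areas.
    corrected: HxWx3 (or HxW) image array; display_target_mask: HxW boolean array."""
    out = corrected.copy()
    out[~display_target_mask] = 0  # value "0" means nothing is shown on the transmissive display
    return out
```
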
  • By narrowing down the areas subject to correction in this way, Warp 18 reduces the amount of calculation required for the correction.
  • Warp 18 outputs the corrected video see-through image data 922a to the Display Controller 19.
  • the Display Controller 19 adjusts the brightness of a portion of the corrected video see-through image data 922a based on the results of the bright and dark area extraction process of S23 (S25).
  • the Display Controller 19 performs, for example, gamma correction to correct the brightness.
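
As a hedged illustration of the brightness adjustment in S25, the snippet below applies a simple LUT-based gamma correction; the gamma value and the LUT approach are assumptions, since the text only says that gamma correction (and optionally color correction) is performed.

```python
import numpy as np

def gamma_correct(image_u8, gamma=2.2):
    """Apply gamma correction to an 8-bit image via a 256-entry lookup table."""
    lut = (np.linspace(0.0, 1.0, 256) ** (1.0 / gamma) * 255.0).astype(np.uint8)
    return lut[image_u8]
```
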
  • the Display Controller 19 may also perform color correction to adjust the colors.
  • the Display Controller 19 generates display data by converting the corrected video see-through image data 922a into a format that can be displayed on the transmissive display 3 (S26).
  • the processing content differs depending on the transmissive display 3, but the Display Controller 19 performs, for example, resizing to match the resolution.
  • the Display Controller 19 outputs the generated display data and the transmittance data generated in the bright and dark area extraction process of S23 (S27). More specifically, the generated display data is output to the display projector 5, causing the transmissive display 3 to display a video see-through display image.
  • the Display Controller 19 also controls the transmittance of the liquid crystal shutter of the lens 2 based on the transmittance data. For example, the Display Controller 19 controls the transmittance of the area of the transmissive display 3 corresponding to the bright area of the transmitted image to a second transmittance that is smaller than the first transmittance used when the bright area of the transmitted image was calculated.
  • the processing of this flowchart ends.
  • the processing of the flowchart in Figure 12 is repeatedly executed while the head-mounted display 1a is being used by the user. Note that, in order to reduce the user's visual discomfort, it is desirable that the processing of Figure 12 be executed at a refresh rate of 60 to 90 Hz, as an example.
  • FIG. 14 is a diagram illustrating the principle of visual assistance using the transmissive display 3 and liquid crystal shutter according to the first embodiment.
  • FIG. 14 shows a state in which a video see-through display image is displayed on the transmissive display 3 and light reduction is performed by the liquid crystal shutter of the lens 2 as a result of the processing of S27 in FIG. 12.
  • the lens 2 with liquid crystal shutter function and the transmissive display 3 are located between the eye 8 of the user wearing the head-mounted display 1a and the real image (subject).
  • In FIG. 14, there are dark areas (dark regions) 71 and bright areas (bright regions) 72 in the real image.
  • the Display Controller 19 displays a video see-through display image in the areas where the dark areas of the real image on the transmissive display 3 are transmitted. Therefore, in the areas where the dark areas 71 are transmitted, the user sees a bright video see-through display image 934, not a dark transmitted image. Note that even if a small amount of the transmitted image is visible in the areas where the dark areas 71 are transmitted, no double image occurs because the video see-through display image 934 has been corrected to match the transmitted image by the transformation process in S24.
  • a value of "0" is set for each pixel of the video see-through display image 934, so the video see-through display image 934 is not displayed on the transmissive display 3. Furthermore, depending on the brightness of the bright area 72, the liquid crystal shutter function of the lens 2 dims the area where the bright area 72 is transmitted.
  • FIG. 15 is a diagram showing an example of the display mode of the transmissive display 3 according to the first embodiment.
  • Due to darkness, the user may be unable to see objects in the region 310 of the transmissive display 3 through which the dark area 71 is transmitted.
  • Similarly, in the region 320 of the transmissive display 3 through which the bright area 72 is transmitted, the user may be unable to see objects due to glare.
  • By displaying the video see-through display image 934, the user's visibility is improved in the region 310 through which the dark area 71 is transmitted. Furthermore, in the region 320 through which the bright area 72 is transmitted, the liquid crystal shutter reduces the glare, thereby improving the user's visibility.
  • Figure 16 is a diagram showing an example of the flow of the bright and dark area extraction process according to the first embodiment.
  • the CPU 23 acquires the brightness of the scene (EV (Exposure Value) value) (S231).
  • In AE, the brightness of the scene is estimated, and the shutter speed and sensitivity are determined accordingly.
  • the CPU 23 uses this AE information to acquire the brightness of dark areas of the scene (EV value).
  • the brightness of the scene is the brightness around the user wearing the head-mounted display 1a.
  • the CPU 23 also determines the transmittance of lens 2 based on the acquired EV value (S232).
  • the transmittance of lens 2 indicates the degree of light blocking by the liquid crystal shutter.
  • the CPU 23 extracts bright and dark areas from the video see-through image data 923 (S233). Extracting bright and dark areas means, in other words, identifying the ranges that correspond to bright and dark areas in the video see-through image data 923. At this point, the processing of this flowchart ends.
  • Figure 17 is a diagram showing an example of the flow of the EV value acquisition process according to the first embodiment.
  • the processes of S301 to S307 shown in Figure 17 are executed by the CPU 23.
  • the CPU 23 acquires the histogram and the calculation results of the average brightness value for each rectangular block generated by STAT 21 from the video see-through image data 923 acquired by Color ISP 16.
  • the CPU 23 calculates a threshold value that serves as a reference for dark places from the histogram. For example, the CPU 23 integrates the histogram from element 0 and sets the element number when the integrated value exceeds N% of the total number of pixels as the threshold value (S301).
  • Figure 18 is a diagram showing an example of a threshold value determined based on a histogram according to the first embodiment. By calculating the threshold value from the histogram, the CPU 23 can dynamically set a threshold value that corresponds to the video see-through image data 923.
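
The dark-place threshold of S301 can be sketched as a cumulative-histogram lookup: integrate the histogram from element 0 and return the first bin whose integrated count exceeds N% of the total number of pixels. The value of N is not given in the text, so the default below is an assumption.

```python
import numpy as np

def dark_threshold(hist, percent_n=20.0):
    """First bin index at which the cumulative pixel count exceeds N% of all pixels."""
    cum = np.cumsum(hist)
    target = cum[-1] * percent_n / 100.0
    return int(np.searchsorted(cum, target, side='right'))
```
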
  • the CPU 23 obtains the average value of the area where the brightness is equal to or less than the threshold value from the average brightness values of rectangular blocks obtained from the STAT 21 (S302). This average value is called the STAT average value.
  • the CPU 23 estimates the brightness (EV value) of the scene from the STAT average value and the current sensor module settings (exposure time, sensor gain, and aperture) (S303).
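
Formula (1) and the exact estimation method are not reproduced in the text, so the following is only a common photographic approximation of scene EV from exposure time, sensitivity, aperture, and the measured STAT average; the mid-gray target and the final log2 offset are assumptions.

```python
import math

def estimate_scene_ev(stat_mean, exposure_time_s, iso, f_number, mid_gray=118.0):
    """Rough scene-brightness estimate in EV from the current sensor settings,
    offset by how far the measured mean sits from a mid-gray target."""
    ev_settings = math.log2(f_number ** 2 / exposure_time_s) - math.log2(iso / 100.0)
    return ev_settings + math.log2(max(stat_mean, 1.0) / mid_gray)
```
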
  • the CPU 23 also sets new sensor setting values for various sensors (such as the Ambient Light Sensor 60) based on the estimated brightness (S304). This allows the sensor sensitivity to be adjusted to match the actual level of brightness.
  • the CPU 23 also acquires the ambient light brightness (Lux) from the Ambient Light Sensor 60 via the I2C interface 22 (S305).
  • FIG. 19 is a diagram showing an example of a common EV/Lux conversion table.
  • the EV value is a numerical value used when determining the exposure compensation value of a camera, and it is known that there is a correspondence relationship between the EV value and illuminance (Lux value), a common unit of brightness, roughly as shown in the table in Figure 19. The higher the EV value, the brighter the scene, and the lower the EV value, the darker the scene.
  • the CPU 23 may determine that an EV value of "-4" or greater and less than "0" is a "dark place," an EV value of "0" or greater and less than "2" is a "twilight" place, and an EV value of "15" or greater is a "bright place." Note that the criteria for "dark place," "twilight," and "bright place" shown in Figure 19 are merely examples and are not limited to these. Also, in Figure 19, EV values of "2" or greater and less than "15" are considered "daytime" and distinguished from "bright places," but "daytime" may also be included in "bright places."
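
A small sketch of the classification implied by the example thresholds of FIG. 19 (dark place below EV 0, twilight up to EV 2, daytime up to EV 15, bright place at EV 15 and above); as the text notes, these criteria are examples only.

```python
def classify_scene(ev):
    """Classify scene brightness following the example ranges of FIG. 19.
    The text gives -4 <= EV < 0 as "dark place"; treating values below -4
    as dark as well is an assumption."""
    if ev < 0:
        return "dark place"
    if ev < 2:
        return "twilight"
    if ev < 15:
        return "daytime"   # the text notes daytime may also be grouped with bright places
    return "bright place"
```
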
  • the CPU 23 uses the detection results (ALS brightness) of the Ambient Light Sensor 60; if the measurement results are unavailable, the CPU 23 uses the AE information (AE brightness) (S307).
  • When the Ambient Light Sensor 60 is used, the update interval is longer than when AE information is used, but the power consumption of the scene brightness estimation process can be reduced.
  • Because the sensor settings are set so that dark areas can be captured brightly, bright areas become saturated, and AE information based on the captured image data may not accurately measure the brightness of bright places.
  • the CPU 23 outputs an EV value based on either the ALS brightness or the AE brightness to the Display Controller 19.
  • the CPU 23 may use both the ALS brightness and the AE brightness to estimate the brightness of a scene.
  • the CPU 23 corrects the EV value based on the maximum value of the histogram and the average STAT value to obtain the brightness of a bright place. Specifically, the CPU 23 corrects the EV value based on the AE information using the following formula (1).
  • the CPU 23 determines the transmittance of lens 2 according to the EV value indicating the brightness of the scene obtained in S231.
  • the CPU 23 reduces the decrease in visibility due to brightness by controlling the transmittance of the lens 2 according to the degree of brightness. For every 1 increase in the EV value, the brightness (Lux) approximately doubles. For this reason, for example, the CPU 23 may define an EV value of 15 or higher as a bright place, as shown in Figure 19, and double the degree of light blocking by the lens 2 for every 1 increase in the EV value from 15. By controlling the liquid crystal shutter in this way, the brightness of the transmitted image through the lens 2 will not exceed an EV value of 14, reducing the glare felt by the user.
  • FIG. 20 is a diagram showing an example of the relationship between the EV value and the transmittance of the lens 2 according to the first embodiment.
  • the transmittance changes as shown in the graph in FIG. 20.
  • the maximum transmittance of the lens 2 cannot be set to 100%, so in FIG. 20, the maximum transmittance is set to 80% as an example.
  • the transmittance is halved every time the EV value increases by 1 (brightness doubles).
  • the CPU 23 sets the degree of light blocking of the liquid crystal shutter based on the determined transmittance.
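
A minimal sketch of the transmittance control suggested by FIG. 20: hold the maximum transmittance (80% in the example) below the bright-place threshold and halve it for every +1 EV above it. The exact breakpoint and curve used by the embodiment may differ; this only illustrates the described behavior.

```python
def lens_transmittance(ev, max_transmittance=0.8, bright_ev=15):
    """Liquid-crystal-shutter transmittance of lens 2 versus scene EV (S232)."""
    if ev < bright_ev:
        return max_transmittance
    # Halve the transmittance (double the light blocking) for each +1 EV above the threshold.
    return max_transmittance * (0.5 ** (ev - bright_ev + 1))
```
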
  • Figure 21 is a diagram showing an example of the EV value estimation process for rectangular blocks of video see-through image data 923 according to the first embodiment.
  • the CPU 23 estimates the brightness for each location on the video see-through image data 923 to determine areas where visual correction is required. This brightness estimation process can be performed on a pixel-by-pixel basis, but the amount of calculation can be reduced by estimating the brightness on a block-by-block basis using the average value for each block obtained from STAT 21. Note that the various numerical values shown in Figure 21 are merely examples and are not limited to these.
  • the CPU 23 converts the average value for each block into an EV value.
  • the CPU 23 estimates the brightness (EV value) for each block using the following equation (2):
  • the video see-through image data 923 is captured without passing through the lens 2, whereas the brightness of the scene actually seen by the user's eye 8 is the brightness after transmission through the lens 2.
  • the CPU 23 corrects the EV value of each block based on the transmittance of the lens 2 at the time of processing, using the following equation (3). Through this correction, the CPU 23 estimates the EV value indicating the brightness of the actual transmitted image.
  • the EV value after correction based on the transmittance of the lens 2 is also called the visual EV value.
  • the CPU 23 may divide the area into “bright areas” that are visually recognizable to the user and “dark areas” that are difficult to recognize, depending on the brightness (visual EV value).
  • the CPU 23 may also divide the area into "twilight areas” that are brighter than dark areas but somewhat difficult to recognize.
  • the CPU 23 can identify the relevant area using the absolute value of the EV value.
  • the dynamic range of human vision is said to be around 80 to 120 dB, and if there is a brightness difference greater than this, it becomes difficult to see dark areas. For this reason, in order to include these two patterns, the CPU 23 determines that areas where the absolute value of the EV value is below a certain threshold, or where the difference from the maximum EV value is below a certain value, are difficult to recognize.
  • the CPU 23 determines that an EV value in the range of "0" to "1" is "twilight," and, according to the degree of visibility, gradually begins providing visual assistance using video see-through image data. Furthermore, if the EV value falls below "0," the CPU 23 determines that the location is a "dark place" and provides visual assistance using video see-through image data.
  • FIG. 22 is a diagram showing an example of the relationship between the brightness of the video see-through display image and the EV value in a scene according to the first embodiment.
  • the vertical axis of FIG. 22 is the brightness (luminance) of the video see-through display image, and the horizontal axis is the EV value.
  • "Gradually starting visual assistance" means, for example, as shown in the graph in FIG. 22, that the brightness of the video see-through display image displayed on the transmissive display 3 gradually increases as the EV value decreases.
  • the CPU 23 determines the brightness of the video see-through display image based on the difference between the EV value of the block and the maximum EV value for the entire video see-through image data 923. For example, the CPU 23 determines that it is "twilight” if "EV value - maximum EV value" falls below "-4.” In the case of "twilight,” visual recognition becomes somewhat difficult, so the CPU 23 gradually begins providing visual assistance using the video see-through image data. Furthermore, the CPU 23 determines that it is a "dark place” if "EV value - maximum EV value” falls below “-7.” If the CPU 23 determines that it is a "dark place,” it provides visual assistance using the video see-through image data.
  • FIG. 23 is a diagram showing an example of the relationship between the brightness of the video see-through display image according to the first embodiment and the luminance difference in EV value in a scene.
  • the vertical axis of FIG. 23 is the brightness (luminance) of the video see-through display image, and the horizontal axis is the difference (luminance difference) between the EV value of the block and the maximum EV value in the entire video see-through image data 923.
  • the CPU 23 gradually brightens the brightness of the video see-through display image displayed on the transmissive display 3 as the luminance difference increases.
  • the CPU 23 can determine the brightness of each area corresponding to "bright place," "dark place," and "twilight" based on the EV value (see the classification sketch after this list).
  • the video see-through image data 923 may contain locally bright areas such as blown-out highlights in addition to the brightness of the entire scene. This is because when a normal image sensor is used in the video see-through camera 41, the dynamic range is narrow, causing blown-out highlights (a state in which pixel values are saturated) in areas with relatively high brightness. An example of an area with relatively high brightness is an area with strong backlight. If the video see-through display image displayed on the transmissive display 3 includes such blown-out highlight areas, this can actually impair vision, which is undesirable. Therefore, the CPU 23 also determines such relatively bright areas to be bright areas. Areas of the video see-through display image that the CPU 23 determines to be bright areas are not displayed on the transmissive display 3.
  • FIG. 24 is a diagram showing an example of the relationship between the brightness of a video see-through display image and the average luminance value for each block according to the first embodiment.
  • the CPU 23 performs a conversion as shown in the graph in FIG. 24 on the average luminance value for each block output from STAT 21.
  • the Display Controller 19 adjusts the brightness of each area of the video see-through display image. For example, the Display Controller 19 displays the video see-through display image on the transmissive display 3 only in areas that it has determined should be displayed in all three of the patterns, described above, in which visual recognition is difficult due to darkness or brightness. For example, the Display Controller 19 adjusts the brightness of the video see-through image data 923 by multiplying each pixel value of the video see-through image data 923 by the brightness determined for the corresponding area (see the brightness-adjustment sketch after this list).
  • the CPU 23 determines the "bright” and “dark” areas of the video see-through image data 924 based on the brightness (EV value) of the transmitted image estimated by the CPU 23 and the average brightness value for each block of the video see-through image data 924 output from the STAT 21.
  • the CPU 23 also identifies areas of relatively high brightness, such as blown-out highlights, in the video see-through image data 924.
  • the Display Controller 19 adjusts the brightness of the video see-through image data 924 based on the processing results of the CPU 23.
  • the Display Controller 19 identifies the area to be displayed in the video see-through image data 924 by multiplying the determination result of the "bright” and “dark” areas of the video see-through image data 924 by the identification result of the relatively high brightness areas, such as blown-out highlights. Warp18 sets "0" to areas of the video see-through image data 924 that it determines not to be displayed (for example, areas that fall into a "bright place”). Therefore, in the video see-through image data 924a after brightness adjustment, areas that fall into a "bright place” in any of the three pattern determinations are set to "0".
  • the Display Controller 19 begins providing visual assistance using video see-through image data not only in “dark places” but also in “twilight” areas. Unlike “dark places,” such "twilight” areas allow the user to faintly see the transmitted image. Therefore, in order to prevent double images, the Warp 18 uses the first positional relationship data to deform the video see-through image data for at least the "twilight” area. This deformation causes the position and size of the subject in the transmitted image to match the subject in the video see-through display image on the transmissive display 3. Note that in “dark places,” the transmitted image is not visible and no double images occur, so deformation of the video see-through image data can be omitted. In this case, the amount of calculation required for deformation can be reduced.
  • it is preferable for Warp 18 to target for deformation both the "dark" and "twilight" areas of the video see-through image data that are displayed on the transmissive display 3 as the video see-through display image.
  • FIG. 26 shows an example of video see-through image data 925, 925a before and after correction according to the first embodiment.
  • Warp18 deforms the "twilight” area of the video see-through image data 925.
  • An example is shown in which areas in “dark places” that do not overlap with the transmitted image are not deformed, but it is also possible to deform both "dark places” and “twilight” areas.
  • Warp 18 sets the value "0" for "bright place" areas that are not subject to correction.
  • the SoC 100a of this embodiment extracts and deforms at least a partial area of the video see-through image data based on the first positional relationship data. Furthermore, the SoC 100a of this embodiment generates display data so that the outline of the subject included in the transmitted image and the outline of the subject included in part of the video see-through image data are continuous on the transmissive display 3, and displays the generated display data on the transmissive display 3. Therefore, the SoC 100a of this embodiment can improve visibility at the boundary between the transmitted image and the video see-through display image when displaying a video see-through display image on the transmissive display 3.
  • the SoC 100a of this embodiment calculates feature points of the transmitted image based on the video see-through image data, and generates first positional relationship data based on the feature points of the transmitted image and the feature points of the video see-through image data.
  • the SoC 100a outputs the generated first positional relationship data to memory. Therefore, according to the SoC 100a of this embodiment, the first positional relationship data generated and stored in advance is used when the user uses the head-mounted display 1a, thereby improving the processing speed for correction.
  • the SoC 100a of this embodiment also extracts a first feature point of the transmitted image from first calibration image data 901 captured by the calibration camera 42 from the front surface 301 of the transmissive display 3.
  • the SoC 100a also extracts a second feature point of the subject from second calibration image data 902 captured by the calibration camera 42 of the transmissive display 3, on which display data based on the video see-through image data captured by the video see-through camera 41 is displayed.
  • the SoC 100a of this embodiment then generates first positional relationship data based on the positional relationship between the first and second feature points. Therefore, the SoC 100a of this embodiment can correct the misalignment between the video see-through image data and the transmitted image by taking into account actual lens distortion, display system distortion, etc.
  • the head-mounted display 1a of this embodiment has the functions of providing visual assistance in dark places and suppressing reduced visibility due to glare in bright places. Therefore, for example, when a user wears the head-mounted display 1a while driving a vehicle, visual assistance is provided by the display of a video see-through display image in dark places such as behind objects, and visual assistance is provided by the light blocking liquid crystal shutter in bright places such as backlight. Furthermore, the head-mounted display 1a of this embodiment can also be used as a visual assistance when working in dark places such as at night or underground.
  • in the embodiment described above, the SoC 100a reduced the occurrence of double images in the "dark" and "twilight" areas on the transmissive display 3 by transforming the video see-through image data to match the transmitted image.
  • Another way to make double images less noticeable is to increase the degree of light blocking by the LCD shutter to reduce the effects of external light, and then display video see-through image data. With this method, the transmitted image becomes invisible due to the light blocking, thereby preventing double images.
  • Figure 27 is a diagram showing an example of a light-blocking target area 201 according to variant 1 of the first embodiment.
  • the transmissive display 3 has a narrower angle of view than the lens 2, and therefore can often only display a portion of the user's field of view.
  • the Display Controller 19 of this variant does not block light over the entire field of view of the user (the entire lens 2), but rather sets only the area where the transmissive display 3 is present as the light-blocking target area 201. By setting this area as the light-blocking target area 201, the Display Controller 19 can prevent double images without obstructing the field of view outside the range of the transmissive display 3.
  • as described above, the SoC 100a of Modification 1 defines only the area of the entire lens 2 where the transmissive display 3 is present as the light-blocking target area 201. With such a light-blocking target area, if there is an area with a relatively or absolutely high EV value, such as strong backlight, outside the area of the transmissive display 3, the glare there cannot be reduced.
  • FIG. 28 is a diagram showing an example of a case where a bright area exists according to Variation 2 of the first embodiment.
  • even if the SoC 100a uses the liquid crystal shutter to block light only in the area where the transmissive display 3 exists, the user will still feel dazzled.
  • meanwhile, areas outside the range of the transmissive display 3 are not subject to visual assistance by the video see-through display image even when they are dark. Therefore, if the SoC 100a blocks light over the entire surface of the lens 2 using the liquid crystal shutter, areas outside the range of the transmissive display 3 remain dark, resulting in poor visibility.
  • the SoC 100a of this modified example controls the liquid crystal shutter to block light only in areas that are dazzling due to strong backlight, etc.
  • FIG. 29 is a diagram showing an example of a light-shielding target area 202 according to Modification 2 of the first embodiment.
  • the light-shielding target area 202 also includes areas outside the range of the transmissive display 3.
  • the CPU 23 of the SoC 100a of this modification calculates the transmittance for each area of the lens 2 from the average brightness value for each rectangular block, and controls the transmittance of the liquid crystal shutter for each area (see the per-area shutter sketch after this list).
  • the liquid crystal shutter of the lens 2 of this modification has a liquid crystal panel whose transmittance can be partially controlled.
  • the head-mounted display 1a of this modified example can reduce the reduction in visibility for the user when there is a bright area with relatively or absolutely high brightness outside the range of the transmissive display 3 within the field of view of the lens 2. Furthermore, the head-mounted display 1a of this modified example can reduce the reduction in visibility for the user even when there is a mixture of bright and dark areas within the field of view of the lens 2.
  • the SoC 100a corrects the misalignment between the video see-through display image and the transmitted image based on the difference between the position of the user's eye 8 wearing the head-mounted display 1a and the mounting position of the video see-through camera 41.
  • the SoC 100a determines the first positional relationship data on the assumption that the user's eye 8 is located near the center of the lens 2.
  • the position of the eye 8 varies depending on the individual user, and even for the same person, the relative position between the eye 8 and the video see-through camera 41 is expected to constantly change depending on the wearing conditions. For this reason, in this second embodiment, the misalignment during mounting is also corrected.
  • FIG. 30 is a diagram showing an example of the overall configuration of a head-mounted display 1b according to the second embodiment.
  • the head-mounted display 1b of this embodiment includes an eyeglass body 10, lenses 2a and 2b, transparent displays 3a and 3b, video see-through cameras 41a and 41b, display projectors 5a and 5b, an ambient light sensor 60, head tracking cameras 63a and 63b, and an SoC 100b.
  • the head-mounted display 1b of this embodiment also includes eye tracking cameras 43a and 43b.
  • the eyeglasses main body 10, lenses 2a, 2b, transmissive displays 3a, 3b, video see-through cameras 41a, 41b, display projectors 5a, 5b, ambient light sensor 60, and head tracking cameras 63a, 63b have the same functions as in the first embodiment.
  • the head-mounted display 1b has an IMU 61, a ToF sensor 62, a flash memory 31, and a DRAM 32, just like in the first embodiment.
  • Eye Tracking cameras 43a and 43b capture images in the direction in which the front of the transmissive display 3 faces.
  • Figure 31 is a diagram showing an example of the positional relationship between the eye tracking cameras 43a, 43b and the user's eyes 8a, 8b according to the second embodiment. As shown in Figure 31, the eye tracking cameras 43a, 43b capture images of the eyes 8a, 8b of a user wearing the head-mounted display 1b.
  • Figure 32 is a diagram showing another example of the positional relationship between the eye tracking cameras 43a-43d and the user's eyes 8a, 8b according to the second embodiment. As shown in Figure 32, when the head-mounted display 1b is equipped with four eye tracking cameras 43a-43d, it is possible to capture images of one eye 8 using two cameras. With this configuration, it is also possible to estimate the distance from the eyes 8a, 8b to the lenses 2a, 2b using the principle of triangulation.
  • Eye tracking camera 43 is an example of the third camera in this embodiment.
  • image data captured by eye tracking camera 43 is an example of the fourth image data in this embodiment.
  • FIG. 33 is a diagram showing an example of the configuration of SoC 100b according to the second embodiment.
  • SoC 100b includes I2C interfaces 11 and 22, Mono ISP 12, DSP & AI Accelerator 13, SRAMs 14a to 14c, GPU 15, Color ISP 16, Time Warp 17, Warp 18, Display Controller 19, STAT 21, CPU 23, and DRAM Controller 24.
  • the Mono ISP 12 of this embodiment also acquires captured image data from the eye tracking camera 43 and corrects brightness, etc.
  • the DSP & AI Accelerator 13 of this embodiment also performs eye tracking processing to detect the position of the pupil of the user's eye 8 based on image data captured by the eye tracking camera 43 acquired and corrected by the Mono ISP 12.
  • the DSP & AI Accelerator 13 is an example of a pupil detection circuit in this embodiment.
  • the DSP 131 in particular generates second positional relationship data based on the position of the user's pupil and feature points of the video see-through image data, and outputs it to a memory such as the SRAM 14a.
  • the DSP 131 is an example of an image processing circuit in this embodiment.
  • Warp 18 of this embodiment also corrects the position of the subject in the video see-through image data based on second positional relationship data.
  • the second positional relationship data represents the positional relationship between the position of the user's pupil and the video see-through camera 41.
  • the CPU 23 performs the pre-use calibration process for the head-mounted display 1b in the same manner as in the first embodiment, and stores the first positional relationship data in memory.
  • Figure 34 is a flowchart showing an example of the flow of the process for obtaining the pupil position misalignment correction amount according to the second embodiment.
  • Mono ISP 12 acquires image data of the eyes of a user (wearer) wearing head-mounted display 1b from eye tracking camera 43 (S31).
  • the image data captured by eye tracking camera 43 is referred to as eye tracking image data.
  • the DSP & AI Accelerator 13 detects the position of the user's pupil from the image data for eye tracking (S32).
  • the data indicating the position of the user's pupil is an example of the pupil position data in this embodiment.
  • the DSP in the DSP & AI Accelerator 13 acquires the relative position between the user's eye 8 and the video see-through camera 41 based on the detection result of the user's pupil position (S33).
  • the DSP 131 outputs second positional relationship data indicating the acquired relative position between the user's eye 8 and the video see-through camera 41 to a memory such as the SRAM 14a.
  • the DSP 131 may also generate second positional relationship data indicating the relative position between the user's eye 8 and the video see-through camera 41 based on feature points extracted from the video see-through image data.
  • the DSP 131 may also store pupil position data indicating the position of the user's pupil in a memory such as the SRAM 14a.
  • the method of eye tracking processing is not particularly limited, and any known method can be used.
  • data indicating the positional relationship between the video see-through camera 41 and the eye tracking camera 43 may be stored in advance in a memory such as the SRAM 14a.
  • the DSP 131 may convert the relative relationship between the user's eyes 8 and the eye tracking camera 43 into the relative position between the user's eyes 8 and the video see-through camera 41 based on the data indicating this positional relationship (see the pupil-offset sketch after this list).
  • the processing of the flowchart in Figure 34 is repeatedly executed while the head-mounted display 1b is being used by the user.
  • FIG. 35 is a flowchart showing an example of the flow of the display data generation process when using the head-mounted display 1b according to the second embodiment.
  • the image data acquisition process in S21 and the bright and dark area extraction process in S23 are the same as the processes in the first embodiment described in FIG. 12.
  • Warp 18 acquires first positional relationship data and second positional relationship data from a memory such as SRAM 14a.
  • Warp 18 then corrects the positional relationship of the subject in the video see-through image data based on the second positional relationship data (S41).
  • the processes from the transformation process of the video see-through image data in S24 to the output process of the display data and transmittance data in S27 are the same as those in the first embodiment described in FIG. 12. At this point, the processing of this flowchart ends.
  • the SoC 100b of this embodiment detects the position of the user's pupils based on the Eye Tracking image data, and generates second positional relationship data that represents the positional relationship between the pupil positions and the video see-through camera 41. Therefore, the SoC 100b of this embodiment can reflect the deviation in pupil position when actually worn and individual differences between users in the video see-through display image displayed on the transmissive display 3.
  • the DSP&AI accelerator 13 detects the position of the user's pupil from the eye tracking image data.
  • the DSP&AI accelerator 13 extracts pupil reflection image data reflected in the user's pupil from the eye tracking image data.
  • the DSP&AI accelerator 13 is an example of a pupil reflection image extraction circuit in this modification.
  • the DSP 131 of this modified example generates second positional relationship data based on the pupil reflection image data and feature points of the video see-through image data, and outputs the data to a memory such as the SRAM 14a.
  • the DSP 131 is an example of an image processing circuit in this modified example.
  • the SoC 100b of this modified example extracts pupil reflection image data reflected in the user's eyes from the eye tracking image data captured by the eye tracking camera 43, and generates second positional relationship data based on that pupil reflection image data. Therefore, the SoC 100b of this modified example can correct video see-through image data by taking into account the image that the user is actually viewing.
  • the SoCs 100a and 100b are mounted on the head-mounted displays 1a and 1b, but the SoCs 100a and 100b may be applied to other display devices.
  • for example, the SoCs 100a and 100b may be applied to display devices for head-up displays provided on the windshield of a vehicle.
  • the head-mounted displays 1a and 1b may be goggle-type devices equipped with two lenses 2a and 2b, or may be goggle-type devices equipped with one large lens 2 for both eyes.
  • the transmissive display 3 may be provided over the entire surface of the lens 2, rather than on only a part of it.
  • the various processes of the SoCs 100a and 100b may be stored, for example, as a program on a non-volatile storage medium.
  • the CPU 23 of the SoCs 100a and 100b may read the program to execute the various processes described above.
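The transmittance rule described above for bright scenes (a maximum transmittance of about 80 % and a halving of the transmittance for every 1 EV step above the bright-place threshold of EV 15, per FIG. 19 and FIG. 20) can be written as a small function. The following Python sketch is only an illustration of that rule under those stated assumptions; the function and parameter names are not taken from the embodiment.

```python
# Hypothetical sketch of the EV-to-transmittance mapping described above
# (Figures 19 and 20): transmittance is capped at 80% and halved for every
# 1 EV step above the "bright place" threshold of EV 15.

def lens_transmittance(scene_ev: float,
                       max_transmittance: float = 0.8,
                       bright_threshold_ev: float = 15.0) -> float:
    """Return the liquid crystal shutter transmittance for a scene EV value."""
    if scene_ev <= bright_threshold_ev:
        return max_transmittance  # not a "bright place": minimal light blocking
    # Each +1 EV doubles scene brightness, so halve the transmittance per step
    # to keep the brightness of the transmitted image roughly constant.
    return max_transmittance * 0.5 ** (scene_ev - bright_threshold_ev)

if __name__ == "__main__":
    for ev in (12, 15, 16, 18, 20):
        print(ev, round(lens_transmittance(ev), 4))
```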
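The classification of blocks into "bright place," "twilight," and "dark place" combines an absolute EV criterion (twilight between EV 0 and 1, dark below 0) with a relative criterion against the maximum EV in the frame (twilight below -4, dark below -7). The classification sketch below is a hedged reading of that logic; equations (2) and (3) are not reproduced in this text, so the log2-based correction for the lens transmittance is an assumption derived from the statement that each +1 EV corresponds to a doubling of brightness.

```python
import math

def visual_ev(block_ev: float, transmittance: float) -> float:
    """EV of the transmitted image seen through the lens. The log2 correction
    is an assumption standing in for equation (3), based on the rule that
    +1 EV corresponds to a doubling of brightness."""
    return block_ev + math.log2(transmittance)

def classify_block(v_ev: float, max_v_ev: float) -> str:
    """Label a block as 'dark', 'twilight', or 'bright' using both the absolute
    thresholds (0 and 1) and the relative thresholds (-7 and -4) described
    above. The thresholds are the example values from the description."""
    diff = v_ev - max_v_ev
    if v_ev < 0.0 or diff < -7.0:
        return "dark"       # transmitted image invisible: full video see-through assistance
    if v_ev <= 1.0 or diff < -4.0:
        return "twilight"   # faintly visible: gradually blend in the video see-through image
    return "bright"         # clearly visible: do not display the video see-through image
```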
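The brightness adjustment of the video see-through display image (FIG. 22 to FIG. 25) multiplies each pixel by a per-area brightness and leaves "bright place" and blown-out areas at zero so that they are not displayed. The brightness-adjustment sketch below uses a simple linear ramp between the twilight (-4) and dark (-7) thresholds as a stand-in for the curves in those figures; the actual transfer curves, block size, and data layout are assumptions.

```python
import numpy as np

def display_gain(v_ev: float, max_v_ev: float) -> float:
    """Brightness of the video see-through display image for one block:
    0 in 'bright' areas, ramping up through 'twilight' and saturating at 1
    in 'dark' areas (a linear stand-in for the curves of Figures 22 and 23)."""
    diff = v_ev - max_v_ev
    # Ramp between the 'twilight' threshold (-4) and the 'dark' threshold (-7).
    return float(np.clip((-4.0 - diff) / 3.0, 0.0, 1.0))

def adjust_brightness(frame: np.ndarray, gains: np.ndarray, block: int) -> np.ndarray:
    """Multiply each pixel by the gain of its block; blocks with gain 0
    (bright places, blown-out highlights) end up black, i.e. not displayed."""
    out = frame.astype(np.float32)
    for by in range(gains.shape[0]):
        for bx in range(gains.shape[1]):
            out[by * block:(by + 1) * block, bx * block:(bx + 1) * block] *= gains[by, bx]
    return out.clip(0, 255).astype(frame.dtype)
```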
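For Modification 2, where the liquid crystal shutter transmittance is controlled per area of the lens 2, the same halving rule can be applied block by block to the per-block scene EV estimates. The per-area shutter sketch below is an illustration only; how the EV map is spatially aligned to the shutter segments is not specified in the description.

```python
import numpy as np

def shutter_map(block_evs: np.ndarray,
                max_transmittance: float = 0.8,
                bright_threshold_ev: float = 15.0) -> np.ndarray:
    """Per-area transmittance for a partially controllable liquid crystal
    shutter: areas at or below the bright-place threshold keep the maximum
    transmittance, brighter areas are attenuated by halving per +1 EV,
    mirroring the global rule sketched earlier for the whole lens."""
    excess = np.clip(block_evs - bright_threshold_ev, 0.0, None)
    return max_transmittance * np.power(0.5, excess)
```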
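In the second embodiment, the relative position between the user's eye 8 and the video see-through camera 41 is obtained from the detected pupil position and pre-stored data describing the positional relationship between the eye tracking camera 43 and the video see-through camera 41. The pupil-offset sketch below assumes that this stored relationship is a rigid transform (rotation R and translation t); the actual representation of the second positional relationship data is not specified.

```python
import numpy as np

def eye_to_vst_camera_offset(pupil_in_eyetrack_cam: np.ndarray,
                             R_et_to_vst: np.ndarray,
                             t_et_to_vst: np.ndarray) -> np.ndarray:
    """Convert the pupil position measured in the eye-tracking camera frame
    into the video see-through camera frame, using a pre-stored rigid
    transform between the two cameras. Representing the stored positional
    relationship as (R, t) is an assumption made for this sketch."""
    return R_et_to_vst @ pupil_in_eyetrack_cam + t_et_to_vst
```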

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)

Abstract

This semiconductor device includes a memory, a display data generation circuit, and a display control circuit. In the memory, first positional relationship data is stored, the data indicating a positional relationship between a subject included in a transmission image transmitted through a transmissive display and the subject included in first image data captured by a first camera that captures an image of a direction in which the back face of the transmissive display faces. The display data generation circuit uses the first positional relationship data to extract and deform at least a partial region of the first image data, and to generate display data such that the contour of the subject included in the transmission image is continuous with the contour of the subject included in the first image data region on the transmissive display. The display control circuit displays the display data on the transmissive display.

Description

Semiconductor device, method, and head-mounted display

The present invention relates to a semiconductor device, a method, and a head-mounted display.

Conventionally, technologies such as head-mounted displays and head-up displays are known that superimpose images captured by a camera onto a see-through display.

With this technology, the user can simultaneously view both the transmitted image through the transmissive display and the captured image displayed on the transmissive display.

Patent No. 7246708; Japanese Patent Application Laid-Open No. 2008-96867; Patent No. 5855206

However, when a captured image is displayed on a transmissive display, visibility may be reduced at the boundary between the transmitted image and the captured image.

In one aspect, the present invention aims to improve the visibility of the boundary between a transmitted image and a captured image when the captured image is displayed on a transmissive display.

The semiconductor device of the embodiment includes a memory, a display data generation circuit, and a display control circuit. The memory stores first positional relationship data indicating the positional relationship between a subject included in a transmitted image transmitted through the transmissive display and a subject included in first image data captured by a first camera that captures the direction in which the back of the transmissive display faces. The display data generation circuit extracts and deforms at least a partial area of the first image data based on the first positional relationship data, and generates display data such that the outline of the subject included in the transmitted image and the outline of the subject included in the area of the first image data are continuous on the transmissive display. The display control circuit displays the display data on the transmissive display.

FIG. 1 is a diagram illustrating an example of the overall configuration of a head-mounted display according to the first embodiment.
FIG. 2 is a diagram illustrating an example of the configuration of an SoC according to the first embodiment.
FIG. 3 is a diagram showing an example of the difference between a video see-through display image and a transmission image according to the first embodiment.
FIG. 4 is a diagram showing an example of the installation position of the calibration camera according to the first embodiment.
FIG. 5 is a diagram illustrating an example of a subject for the calibration process according to the first embodiment.
FIG. 6 is a diagram showing an example of a subject 9b photographed from the side by the calibration camera according to the first embodiment.
FIG. 7 is a flowchart showing an example of the flow of the calibration process according to the first embodiment.
FIG. 8 is a diagram showing an example of the processing content of each step in the flowchart of FIG. 7.
FIG. 9 is a diagram showing an example of distortion contained in each image data in the calibration process according to the first embodiment.
FIG. 10 is a diagram illustrating an example of distortion contained in each image data when the head-mounted display according to the first embodiment is used.
FIG. 11 is a diagram showing an example of the breakdown of the modification process B in FIG. 10.
FIG. 12 is a flowchart showing an example of the flow of a process for generating display data when the head-mounted display according to the first embodiment is used.
FIG. 13 is a diagram showing an example of correction of video see-through image data according to the first embodiment.
FIG. 14 is a diagram illustrating the principle of visual assistance using the transmissive display and liquid crystal shutter according to the first embodiment.
FIG. 15 is a diagram illustrating an example of a display mode of the transmissive display according to the first embodiment.
FIG. 16 is a diagram showing an example of the flow of the bright place/dark place region extraction process according to the first embodiment.
FIG. 17 is a diagram illustrating an example of the flow of the EV value acquisition process according to the first embodiment.
FIG. 18 is a diagram showing an example of a threshold value determined based on a histogram according to the first embodiment.
FIG. 19 is a diagram showing an example of a general EV/Lux conversion table.
FIG. 20 is a diagram showing an example of the relationship between the EV value and the transmittance of the lens according to the first embodiment.
FIG. 21 is a diagram showing an example of a process for estimating EV values for each rectangular block of video see-through image data according to the first embodiment.
FIG. 22 is a diagram showing an example of the relationship between the brightness of a video see-through display image and the EV value in a scene according to the first embodiment.
FIG. 23 is a diagram showing an example of the relationship between the brightness of the video see-through display image according to the first embodiment and the luminance difference of the EV value in the scene.
FIG. 24 is a diagram showing an example of the relationship between the brightness of the video see-through display image and the average value of luminance for each block according to the first embodiment.
FIG. 25 is a diagram showing an example of the brightness adjustment process for video see-through image data according to the first embodiment.
FIG. 26 is a diagram showing an example of video see-through image data before and after correction according to the first embodiment.
FIG. 27 is a diagram illustrating an example of a light-blocking target region according to Modification 1 of the first embodiment.
FIG. 28 is a diagram illustrating an example of a case where a bright area exists according to Modification 2 of the first embodiment.
FIG. 29 is a diagram illustrating an example of a light-blocking target region according to Modification 2 of the first embodiment.
FIG. 30 is a diagram illustrating an example of the overall configuration of a head-mounted display according to the second embodiment.
FIG. 31 is a diagram showing an example of the positional relationship between the eye tracking camera and the user's eyes according to the second embodiment.
FIG. 32 is a diagram showing another example of the positional relationship between the eye tracking camera and the user's eyes according to the second embodiment.
FIG. 33 is a diagram illustrating an example of the configuration of an SoC according to the second embodiment.
FIG. 34 is a flowchart showing an example of the flow of processing for acquiring the correction amount of pupil position misalignment according to the second embodiment.
FIG. 35 is a flowchart showing an example of the flow of a process for generating display data when using the head-mounted display according to the second embodiment.

Embodiments of the semiconductor device, method, and head-mounted display disclosed herein will be described in detail below with reference to the accompanying drawings. Note that the following embodiments do not limit the disclosed technology. Furthermore, the embodiments can be combined as appropriate to the extent that the processing content is not contradictory.

(First embodiment)
FIG. 1 is a diagram showing an example of the overall configuration of a head-mounted display 1a according to the first embodiment. The head-mounted display 1a is equipped with high-sensitivity cameras (video see-through cameras 41a and 41b) that can capture images even in dark places, and supports the user's vision by displaying camera images in dark places.

Specifically, the head-mounted display 1a includes, for example, an eyeglass body 10, lenses 2a and 2b, transparent displays 3a and 3b, video see-through cameras 41a and 41b, display projectors 5a and 5b, an ambient light sensor 60, head tracking cameras 63a and 63b, and a system on a chip (SoC) 100a.

The lenses 2a, 2b, transmissive displays 3a, 3b, video see-through cameras 41a, 41b, display projectors 5a, 5b, ambient light sensor 60, head tracking cameras 63a, 63b, and SoC are fixed to the eyeglass body 10.

The eyeglass body 10 is shaped so that it can be worn on the user's head, similar to the frame of regular eyeglasses. The eyeglass body 10 includes, for example, a front portion that secures the lenses 2a and 2b, and temple portions that can be fastened over the user's ears.

Lenses 2a and 2b are transparent eyeglass lenses positioned in front of the eyes of a user wearing the head-mounted display 1a. Lenses 2a and 2b also function as liquid crystal shutters. This liquid crystal shutter function allows lenses 2a and 2b to adjust the amount of external light that passes through lenses 2a and 2b. Hereinafter, when there is no need to distinguish between the individual lenses 2a and 2b, they will simply be referred to as the lens 2.

Transparent displays 3a and 3b are provided on lenses 2a and 2b and are capable of displaying images. Furthermore, transparent displays 3a and 3b transmit external light. Therefore, a user wearing head-mounted display 1a can see both the transmitted image transmitted through transparent displays 3a and 3b and the display image displayed on transparent displays 3a and 3b. The display images displayed on transparent displays 3a and 3b are, for example, images obtained by subjecting captured image data captured by video see-through cameras 41a and 41b (described below) to processing such as deformation. Note that deformation is an example of correction.

The transparent displays 3a and 3b are provided in at least a portion of the area of the lenses 2a and 2b. In this embodiment, the transparent displays 3a and 3b are provided in a portion of the area located at the center of the lenses 2a and 2b. The transparent displays 3a and 3b are, for example, screens for optical see-through displays such as waveguides. Hereinafter, when there is no need to distinguish between the individual transparent displays 3a and 3b, they will simply be referred to as the transparent display 3.

Of the two surfaces of the transparent display 3, the surface facing the user wearing the head-mounted display 1a is referred to as the front surface. Furthermore, of the two surfaces of the transparent display 3, the surface facing away from the user wearing the head-mounted display 1a is referred to as the back surface. The user wearing the head-mounted display 1a is an example of an observer of the transparent display 3.

The video see-through cameras 41a and 41b are cameras that capture images in the direction in which the back of the transmissive display 3 faces. The direction in which the back of the transmissive display 3 faces corresponds to the front direction for a user wearing the head-mounted display 1a. More specifically, the video see-through cameras 41a and 41b are provided, for example, in positions above the centers of the lenses 2a and 2b of the eyeglass body 10. The video see-through cameras 41a and 41b are an example of a first camera in this embodiment. Furthermore, the captured image data captured by the video see-through cameras 41a and 41b is an example of first image data in this embodiment.

Video see-through cameras 41a and 41b are image sensors capable of capturing images in dark places, such as a SPAD (Single Photon Avalanche Diode) sensor or a highly sensitive CMOS (Complementary Metal Oxide Semiconductor) sensor. Alternatively, video see-through cameras 41a and 41b may be IR (Infrared Rays) sensors capable of capturing images in the dark by irradiating IR light. Hereinafter, when there is no need to distinguish between the individual video see-through cameras 41a and 41b, they will simply be referred to as the video see-through camera 41.

In this embodiment, the captured image data captured by the video see-through camera 41 is referred to as video see-through image data. Furthermore, when the video see-through image data is projected onto the transmissive display 3 by the display projector 5, the image displayed on the transmissive display 3 is referred to as a video see-through display image. Note that "when video see-through image data is projected onto the transmissive display 3 by the display projector 5" also includes when video see-through image data after various corrections is projected onto the transmissive display 3.

Display projectors 5a and 5b display images on transmissive displays 3a and 3b under the control of SoC 100a (described below). Display projectors 5a and 5b are, for example, micro LEDs (Light Emitting Diodes). Hereinafter, when there is no need to distinguish between individual display projectors 5a and 5b, they will simply be referred to as the display projector 5.

The Ambient Light Sensor 60, also known as an environmental light sensor, acquires ambient light information. For example, the Ambient Light Sensor 60 measures the intensity of ambient light.

The head tracking cameras 63a and 63b detect the movement of the head of a user wearing the head-mounted display 1a. Hereinafter, when there is no need to distinguish between the individual head tracking cameras 63a and 63b, they will simply be referred to as the head tracking camera 63.

The SoC 100a is a computer that controls each component of the head-mounted display 1a. The SoC 100a is an example of a semiconductor device in this embodiment.

The head-mounted display 1a may further include a depth sensor that measures distance, an IMU (Inertial Measurement Unit) that measures the posture of the user wearing the head-mounted display 1a, and the like. Examples of depth sensors that measure distance include, but are not limited to, a ToF (Time of Flight) sensor or a stereo camera.

Here, the configuration of the SoC 100a will be described in detail. FIG. 2 is a diagram showing an example of the configuration of the SoC 100a according to the first embodiment. As shown in FIG. 2, the SoC 100a includes I2C (Inter-Integrated Circuit) interfaces 11 and 22, a Mono ISP (Image Signal Processor) 12, a DSP (Digital Signal Processor) & AI (Artificial Intelligence) Accelerator 13, SRAMs (Static Random Access Memory) 14a to 14c, a GPU (Graphics Processing Unit) 15, a Color ISP 16, a Time Warp 17, a Warp 18, a Display Controller 19, a STAT (statistics block) 21, a CPU (Central Processing Unit) 23, and a DRAM (Dynamic Random Access Memory) Controller 24.

Although not shown in FIG. 1, the head-mounted display 1a further includes an IMU 61, a ToF sensor 62, a flash memory 31, and a DRAM 32 outside the SoC 100a. The flash memory 31 and the DRAM 32 store various types of data under the control of the SoC 100a.

Furthermore, the calibration cameras 42a and 42b shown in FIG. 2 are cameras provided outside the head-mounted display 1a. Hereinafter, when there is no need to distinguish between the individual calibration cameras 42a and 42b, they will simply be referred to as the calibration camera 42. The calibration camera 42 is an example of the second camera in this embodiment. Details of the calibration camera 42 will be described later with reference to FIG. 4.

I2C interfaces 11 and 22 are communication interfaces that perform synchronous serial communication. I2C interface 11 acquires acceleration and angular velocity from IMU 61. I2C interface 22 acquires the intensity of ambient light from Ambient Light Sensor 60.

Mono ISP 12 acquires information from various sensors used to recognize the surrounding situation and corrects the acquired information. For example, Mono ISP 12 acquires distance measurement data from surrounding objects from ToF sensor 62. Mono ISP 12 also acquires image data from head tracking cameras 63a and 63b, for example, and corrects brightness, etc.

The DSP & AI Accelerator 13 includes a DSP 131 and an AI Accelerator 132. The DSP & AI Accelerator 13 performs various tracking processes based on various data acquired and corrected by the Mono ISP 12. For example, the DSP & AI Accelerator 13 may perform head tracking processing to detect head movement of a user wearing the head-mounted display 1a. Alternatively, the DSP & AI Accelerator 13 may perform hand tracking processing or eye tracking processing, depending on the type of sensor. The DSP & AI Accelerator 13 outputs tracking information generated as a result of various tracking processes to the Time Warp 17. Note that various tracking processes are not required in this embodiment.

SRAMs 14a to 14c store various data used in various processes and data generated by various processes. For example, SRAM 14b is used as a temporary storage location for rendered image data and captured image data. It may also store first positional relationship data used in the image data transformation process performed by Warp 18, described below. SRAM capacity is usually small and may not be able to store the entire image. In such cases, SRAM may be used as a FIFO (First in First out) buffer to store only a portion of the image that has not yet been used in subsequent processing. SRAMs 14a to 14c are an example of memory in this embodiment. Note that each process performed by SoC 100a uses SRAMs 14a to 14c as a buffer for temporary data storage. While DRAM 32 can also be used as a buffer, using SRAMs 14a to 14c reduces power consumption. Note that the first positional relationship data stored in SRAM 14b uses data previously stored in flash memory 31. For this reason, flash memory 31 is included as an example of memory in this embodiment. Flash memory 31 is an external storage device, and may be NAND memory, an SSD, an SD card, or the like.

The GPU 15 performs rendering processing of the image data displayed on the transmissive display 3.

Time Warp 17 is a processing circuit that corrects delays in image data caused by processing by GPU 15. The time correction performed by Time Warp 17 has the effect of reducing the discomfort experienced by users who view image data displayed on the transmissive display 3, known as "VR (Virtual Reality) sickness." Time Warp 17 performs correction processing using, for example, the results of head tracking processing performed by DSP & AI Accelerator 13.

Color ISP 16 and Warp 18 acquire captured image data from the video see-through camera 41 and the calibration camera 42. Color ISP 16 converts the acquired image data into RGB (Red-Green-Blue color model) image data.

Warp 18 performs correction processing on the captured image data of the video see-through camera 41 that has been converted into RGB image data by Color ISP 16. The correction processing by Warp 18 includes, for example, a transformation processing based on the first positional relationship data for at least a partial area of the captured image data, and a processing for extracting at least a partial area of the captured image data as a display target. Through this transformation and extraction processing, Warp 18 generates video see-through image data so that the outline of the subject included in the transmitted image and the outline of the subject included in part of the captured image data are continuous on the transmissive display 3. The video see-through image data is an example of display data in this embodiment. Warp 18 is an example of a display data generation circuit in this embodiment.
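As a rough illustration of the kind of deformation Warp 18 performs, the sketch below warps video see-through image data into display coordinates by inverse mapping through a single 3x3 homography. Modelling the first positional relationship data as one homography is an assumption made only for this sketch; the embodiment describes the data more generally as the positional relationship between the subjects in the transmitted image and the captured image.

```python
import numpy as np

def warp_with_homography(src: np.ndarray, H_disp_to_src: np.ndarray,
                         out_h: int, out_w: int) -> np.ndarray:
    """Minimal nearest-neighbour warp: for every display pixel, look up the
    corresponding video see-through pixel through a 3x3 homography."""
    ys, xs = np.mgrid[0:out_h, 0:out_w]
    ones = np.ones_like(xs)
    pts = np.stack([xs, ys, ones], axis=-1).reshape(-1, 3).T   # 3 x N homogeneous coords
    mapped = H_disp_to_src @ pts
    mx = (mapped[0] / mapped[2]).round().astype(int)
    my = (mapped[1] / mapped[2]).round().astype(int)
    valid = (mx >= 0) & (mx < src.shape[1]) & (my >= 0) & (my < src.shape[0])
    out = np.zeros((out_h, out_w) + src.shape[2:], dtype=src.dtype)
    # Pixels that map outside the source stay black, i.e. are not displayed.
    out.reshape(out_h * out_w, *src.shape[2:])[valid] = src[my[valid], mx[valid]]
    return out
```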

STAT 21 identifies various pieces of information from the image data captured by the video see-through camera 41, which has been converted into an RGB image by Color ISP 16. For example, STAT 21 detects the information required for AE (Auto Exposure) and AWB (Auto White Balance). STAT 21 also extracts a histogram from the captured image data, calculates the average brightness value for each rectangular block of the captured image data, and counts the number of saturated pixels. A histogram is a graph that shows the distribution of brightness values of pixels in image data, and typically the horizontal axis represents brightness and the vertical axis represents the number of pixels.
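As a rough sketch of the statistics STAT 21 is described as producing, the function below computes a per-block average luminance map, a luminance histogram, and a saturated-pixel count from an 8-bit RGB frame. The block size, luma weights, and bin count are assumptions for illustration only.

```python
import numpy as np

def block_luma_stats(rgb: np.ndarray, block: int = 64):
    """Per-block average luminance, a 256-bin luminance histogram, and a
    saturated-pixel count for an HxWx3 uint8 frame (all parameters assumed)."""
    luma = rgb[..., 0] * 0.299 + rgb[..., 1] * 0.587 + rgb[..., 2] * 0.114
    h, w = luma.shape
    by, bx = -(-h // block), -(-w // block)          # ceiling division
    means = np.zeros((by, bx), dtype=np.float32)
    for i in range(by):
        for j in range(bx):
            means[i, j] = luma[i * block:(i + 1) * block,
                               j * block:(j + 1) * block].mean()
    hist, _ = np.histogram(luma, bins=256, range=(0, 256))
    saturated = int((rgb >= 255).any(axis=-1).sum())  # pixels with any saturated channel
    return means, hist, saturated
```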

By performing processing by Color ISP 16, Warp 18, and STAT 21 with low latency, it is possible to reduce VR sickness experienced by a user wearing the head-mounted display 1a.

The CPU 23 performs AE and AWB based on information acquired via the STAT 21 and the I2C interface 22. The CPU 23 also performs extraction processing of bright and dark areas based on the brightness of the scene and the luminance of the captured image data. Based on the results of these processes, the CPU 23 generates transmittance data for controlling the liquid crystal shutter of the lens 2a. Details of the extraction processing of bright and dark areas by the CPU 23 will be described later. The CPU 23 is an example of an image processing circuit in this embodiment.

The display controller 19 controls the display projector 5 to display an image on the transmissive display 3. The display controller 19 also controls the degree of transmittance of the liquid crystal shutter of the lens 2 based on transmittance data generated by the CPU 23. The display controller 19 also corrects the video see-through image data corrected by the Warp 18 and the rendering image data rendered by the GPU 15 into a format suitable for display on the transmissive display 3. The display controller 19 is an example of a display control circuit in this embodiment.

More specifically, the Display Controller 19 comprises an EN (Enable) block 191 and a Blend block 192. The EN block 191 determines whether the display of saturated pixels is on or off. If the EN block 191 determines that the pixel is off, it corrects the pixel to be hidden (black). If the EN block 191 determines that the pixel is on, it corrects the brightness of the displayed image corresponding to the pixel. The Blend block 192 combines the rendering image data and the video see-through image data.
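A toy model of the EN block 191 and Blend block 192 might look like the following: per-pixel enable flags force non-displayed pixels to black, and the remaining video see-through pixels are combined with the rendered layer. Treating the combination as a simple additive blend with a single global alpha is an assumption made only for this illustration.

```python
import numpy as np

def compose_display(rendered: np.ndarray, vst: np.ndarray,
                    enable: np.ndarray, alpha: float = 1.0) -> np.ndarray:
    """Combine HxWx3 uint8 rendered and video see-through layers using an
    HxW boolean enable mask (assumed shapes for this sketch)."""
    vst_masked = np.where(enable[..., None], vst, 0)   # EN: display ON/OFF per pixel
    out = rendered.astype(np.float32) + alpha * vst_masked.astype(np.float32)
    return out.clip(0, 255).astype(np.uint8)           # black pixels emit no light on the display
```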

The DRAM controller 24 controls the storage of various data in the DRAM 32 and the reading of various data from the DRAM 32.

Note that the configuration of the head-mounted display 1a in this embodiment is not limited to the example shown in Figures 1 and 2. For example, the head-mounted display 1a does not necessarily include the head tracking camera 63a, IMU 61, and ToF sensor 62. Furthermore, the head-mounted display 1a may also include other sensors, etc.

 次に、ヘッドマウントディスプレイ1aの使用前のキャリブレーション処理について説明する。キャリブレーション処理は、例えば、ヘッドマウントディスプレイ1aの出荷前に行われる処理である。より詳細には、キャリブレーション処理は、ヘッドマウントディスプレイ1aを装着したユーザの目の位置とビデオシースルー用カメラ41の搭載位置の差異等を加味した補正のための処理である。 Next, we will explain the calibration process performed before using the head-mounted display 1a. The calibration process is performed, for example, before the head-mounted display 1a is shipped. More specifically, the calibration process is a process for making corrections that take into account differences between the eye position of a user wearing the head-mounted display 1a and the mounting position of the video see-through camera 41.

 ヘッドマウントディスプレイ1aをユーザが装着する場合、当該ユーザの両眼は、一般的にはそれぞれレンズ2a,2bの中心付近に位置する。このため、レンズ2a,2b及び透過ディスプレイ3a,3bを透過した透過像は、レンズ2a,2bの中心付近の視点を基準とした像となる。これに対して、ビデオシースルー用カメラ41a,41bをユーザの目の光軸の位置に合わせて搭載することは構成上困難である。このため、図1に示したように、ビデオシースルー用カメラ41a,41bは、レンズ2a,2bの中心付近ではなく、ヘッドマウントディスプレイ1aの眼鏡本体部10に設けられる。 When a user wears the head-mounted display 1a, the user's eyes are generally positioned near the centers of the lenses 2a and 2b, respectively. Therefore, the transmitted image that passes through the lenses 2a and 2b and the transmissive displays 3a and 3b is based on a viewpoint near the centers of the lenses 2a and 2b. However, it is structurally difficult to mount the video see-through cameras 41a and 41b so that they are aligned with the optical axis of the user's eyes. For this reason, as shown in Figure 1, the video see-through cameras 41a and 41b are provided in the eyeglass body 10 of the head-mounted display 1a, rather than near the centers of the lenses 2a and 2b.

 このような理由により、ビデオシースルー用カメラ41a,41bによって撮影されたビデオシースルー画像データの視点と、ヘッドマウントディスプレイ1aを装着したユーザの視点とに差異が生じる。また、透過ディスプレイ3とビデオシースルー用カメラ41の画角の違いやレンズ歪による映像の歪も、ずれを生じる要因となる。なお、レンズ歪には、ビデオシースルー用カメラ41のレンズの特性と、ヘッドマウントディスプレイ1aのレンズ2の歪の両方がある。 For these reasons, there is a difference between the viewpoint of the video see-through image data captured by the video see-through cameras 41a and 41b and the viewpoint of the user wearing the head-mounted display 1a. Differences in the angle of view between the transparent display 3 and the video see-through camera 41, as well as distortion of the image due to lens distortion, also contribute to the misalignment. Note that lens distortion is caused by both the characteristics of the lens of the video see-through camera 41 and distortion of the lens 2 of the head-mounted display 1a.

 FIG. 3 is a diagram showing an example of the difference between the video see-through display images 931 and 931a and the transmitted image 941 according to the first embodiment. The video see-through display image 931 is an image in which the uncorrected video see-through image data is displayed as is on the transmissive display 3. In this case, as shown in FIG. 3, a difference arises between the position and/or size of the object (subject 9a) included in the video see-through display image 931 and those of the object (subject 9a) included in the transmitted image.

 図3ではビデオシースルー表示画像931と透過像941とを個別に表示しているが、実際にはビデオシースルー表示画像931は透過像941に重畳して透過ディスプレイ3上に表示される。この場合、ビデオシースルー表示画像931内の被写体9aと透過像941内の被写体9aの輪郭が連続せず、2重像になる。このため、ヘッドマウントディスプレイ1aを装着したユーザは、ビデオシースルー表示画像931内の被写体9aと透過像941内の被写体9aの両方を2重像として見ることとなり、視認性が低下する。なお、本実施形態において、「被写体」は、撮影画像データに描出された物体だけではなく、透過像に含まれる物体も含む。なお、物体には人も含む。 In Figure 3, the video see-through display image 931 and the transmission image 941 are displayed separately, but in reality, the video see-through display image 931 is displayed superimposed on the transmission image 941 on the transmission display 3. In this case, the contours of the subject 9a in the video see-through display image 931 and the subject 9a in the transmission image 941 are not continuous, resulting in a double image. As a result, a user wearing the head-mounted display 1a sees both the subject 9a in the video see-through display image 931 and the subject 9a in the transmission image 941 as a double image, reducing visibility. Note that in this embodiment, the "subject" includes not only objects depicted in the captured image data, but also objects included in the transmission image. Note that objects also include people.

 このため、SoC100aは、ヘッドマウントディスプレイ1aの使用時においてビデオシースルー画像データを補正した上で透過ディスプレイ3に表示させる。補正後のビデオシースルー画像データに基づくビデオシースルー表示画像931a内の被写体9aの位置及び大きさは、透過像941内の被写体9aの位置及び大きさと一致する。この場合、被写体9aが透過ディスプレイ3上で2重像にならないため、ユーザの視認性は低下しない。キャリブレーション処理では、このような使用時の補正に使用される第1位置関係データを取得する。第1位置関係データは、透過ディスプレイ3を透過する透過像941に含まれる被写体9aと、ビデオシースルー用カメラ41により撮影される第1画像データに含まれる被写体との位置関係を示すデータである。例えば、第1位置関係データは、変形(Warp)処理における第1画像データの位置合わせ補正量を表す。位置合わせ補正量は、例えば、変形処理の前後の特徴点の位置関係を表す座標により定義される。 For this reason, when the head-mounted display 1a is in use, the SoC 100a corrects the video see-through image data before displaying it on the transmissive display 3. The position and size of the subject 9a in the video see-through display image 931a based on the corrected video see-through image data match the position and size of the subject 9a in the transmissive image 941. In this case, the subject 9a does not appear as a double image on the transmissive display 3, so user visibility is not reduced. The calibration process acquires first positional relationship data used for such correction during use. The first positional relationship data is data indicating the positional relationship between the subject 9a included in the transmissive image 941 transmitted through the transmissive display 3 and the subject included in the first image data captured by the video see-through camera 41. For example, the first positional relationship data represents the amount of alignment correction of the first image data in the warp process. The alignment correction amount is defined, for example, by coordinates that represent the positional relationship of feature points before and after the warp process.

 FIG. 4 is a diagram showing an example of the installation positions of the calibration cameras 42a and 42b according to the first embodiment. As shown in FIG. 4, the calibration cameras 42a and 42b are installed at positions corresponding to the eye positions of a user wearing the head-mounted display 1a. For example, as shown in FIG. 4, the calibration cameras 42a and 42b are installed near the centers of the lenses 2a and 2b. From these positions, the calibration cameras 42a and 42b capture, through the transmissive display 3 from its front side, the direction in which the back of the transmissive display 3 faces. The calibration cameras 42a and 42b therefore photograph, through the transmissive display 3, a subject located on the back side of the transmissive display 3. Note that the calibration camera 42 is not used after the calibration process and is therefore not included in the configuration of the head-mounted display 1a.

 キャリブレーション処理用の被写体としては、特徴点の抽出が容易なものが望ましい。 It is desirable for the subject used for calibration processing to be one from which feature points can be easily extracted.

 図5は、第1の実施形態に係るキャリブレーション処理の被写体9bの一例を示す図である。本実施形態においては、図5に示すように、キャリブレーション処理用の被写体9bとして白黒の矩形が格子状に並んだチェッカーボードチャートを使用する。チェッカーボードチャートの格子点は、特徴点90の一例である。なお、図5では1つの格子点を特徴点90の一例として図示しているが、キャリブレーション用カメラ42及びビデオシースルー用カメラ41の撮影画像データにおける全ての格子点が特徴点90となる。なお、キャリブレーション処理用の被写体9bは図5に示す例に限定されない。 FIG. 5 is a diagram showing an example of a subject 9b for calibration processing according to the first embodiment. In this embodiment, as shown in FIG. 5, a checkerboard chart in which black and white rectangles are arranged in a grid pattern is used as the subject 9b for calibration processing. The grid points of the checkerboard chart are examples of feature points 90. Note that while FIG. 5 shows one grid point as an example of a feature point 90, all grid points in the image data captured by the calibration camera 42 and the video see-through camera 41 become feature points 90. Note that the subject 9b for calibration processing is not limited to the example shown in FIG. 5.

 図6は、第1の実施形態に係るキャリブレーション用カメラ42による被写体9bの撮影を横側から見た一例を示す図である。図6に示すように、ビデオシースルー用カメラ41及びキャリブレーション用カメラ42は共に、透過ディスプレイ3の背面302側の方向に位置する被写体9bを撮影する。また、キャリブレーション用カメラ42は、レンズ2及び透過ディスプレイ3を透過して、透過ディスプレイ3の前面301側から、背面302側に向けて被写体9bを撮影する。ビデオシースルー用カメラ41は、レンズ2及び透過ディスプレイ3は透過せずに、被写体9bを撮影する。 FIG. 6 is a diagram showing an example of a side view of subject 9b photographed by the calibration camera 42 according to the first embodiment. As shown in FIG. 6, both the video see-through camera 41 and the calibration camera 42 photograph subject 9b located toward the back surface 302 of the transmissive display 3. The calibration camera 42 photographs subject 9b from the front surface 301 of the transmissive display 3 toward the back surface 302, passing through the lens 2 and the transmissive display 3. The video see-through camera 41 photographs subject 9b without passing through the lens 2 or the transmissive display 3.

 ここで、図4~6で説明したキャリブレーション用カメラ42を用いたキャリブレーション処理の流れについて説明する。 Here, we will explain the flow of the calibration process using the calibration camera 42 described in Figures 4 to 6.

 図7は、第1の実施形態に係るキャリブレーション処理の流れの一例を示すフローチャートである。また、図8は、図7のフローチャートの各ステップの処理内容の一例を示す図である。図8のStep1-7は、図7のS1~S7に対応する。 FIG. 7 is a flowchart showing an example of the flow of calibration processing according to the first embodiment. FIG. 8 is a diagram showing an example of the processing content of each step in the flowchart of FIG. 7. Steps 1-7 in FIG. 8 correspond to S1-S7 in FIG. 7.

 まず、Color ISP16は、キャリブレーション用カメラ42で被写体9bを撮影した画像データ(図8に示す第1のキャリブレーション用画像データ901)を取得する(S1)。第1のキャリブレーション用画像データ901は、透過ディスプレイ3の前面301側から透過ディスプレイ3を透過して透過ディスプレイ3の背面302側に位置する被写体9bが撮影された画像データである。第1のキャリブレーション用画像データ901は、本実施形態における第2画像データの一例である。なお、キャリブレーション用カメラ42はヘッドマウントディスプレイには搭載せず、市販のデジタルカメラやWebカメラ等を用いても良い。その場合、撮影した画像データを撮影後にデジタルカメラやWebカメラ等からSoC100aに転送して以降の処理を行う。 First, Color ISP 16 acquires image data (first calibration image data 901 shown in FIG. 8) of subject 9b photographed by calibration camera 42 (S1). The first calibration image data 901 is image data of subject 9b photographed from the front surface 301 of the transparent display 3, passing through the transparent display 3 to the rear surface 302 of the transparent display 3. The first calibration image data 901 is an example of second image data in this embodiment. Note that calibration camera 42 does not have to be mounted on a head-mounted display, and a commercially available digital camera, web camera, etc. may be used. In this case, the photographed image data is transferred from the digital camera, web camera, etc. to SoC 100a after photographing, and subsequent processing is performed.

 そして、CPU23は、第1のキャリブレーション用画像データ901内の被写体9bの特徴点90を抽出する(S2)。第1のキャリブレーション用画像データ901から抽出された特徴点90を、第1の特徴点とする。特徴点90を抽出するとは、第1のキャリブレーション用画像データ901に描出された特徴点90(例えば、チェッカーボードチャートの格子点)の位置を特定することである。 Then, the CPU 23 extracts feature points 90 of the subject 9b in the first calibration image data 901 (S2). The feature points 90 extracted from the first calibration image data 901 are designated as first feature points. Extracting feature points 90 means identifying the positions of the feature points 90 (for example, grid points of a checkerboard chart) depicted in the first calibration image data 901.

 ここで、第1のキャリブレーション用画像データ901は、ヘッドマウントディスプレイ1aを装着したユーザから見える透過像を疑似的に画像化したものであるが、キャリブレーション用カメラ42のレンズ歪等のため、実際にユーザから見える透過像とは異なる。なお、レンズ歪を含むキャリブレーション用カメラ42の特性に起因する歪みを、カメラ歪という。CPU23は、カメラ歪の除去のため、S2の第1の特徴点の抽出の前に、キャリブレーション用カメラ42の内部パラメータを用いて、第1のキャリブレーション用画像データ901を補正する。内部パラメータによる補正は、表示系による歪を打ち消す逆補正である。当該処理により、CPU23は、第1のキャリブレーション用画像データ901を、ヘッドマウントディスプレイ1aを装着したユーザから見える透過像に相当する状態に変換する。図8では、変換後の画像データを疑似透過像データ911として示す。 Here, the first calibration image data 901 is a pseudo-image of the transmission image seen by a user wearing the head-mounted display 1a, but differs from the transmission image actually seen by the user due to lens distortion of the calibration camera 42, etc. Distortion caused by the characteristics of the calibration camera 42, including lens distortion, is called camera distortion. To remove the camera distortion, the CPU 23 corrects the first calibration image data 901 using the internal parameters of the calibration camera 42 before extracting the first feature point in S2. Correction using the internal parameters is an inverse correction that cancels out distortion caused by the display system. Through this processing, the CPU 23 converts the first calibration image data 901 into a state equivalent to the transmission image seen by a user wearing the head-mounted display 1a. In Figure 8, the converted image data is shown as pseudo transmission image data 911.

 キャリブレーション用カメラ42の内部パラメータは、キャリブレーション用カメラ42のカメラ歪の影響を除去するための補正パラメータである。内部パラメータは、例えば、OpenCV(登録商標)等の公知のキャリブレーションツールを使用して算出することが可能である。内部パラメータは、例えば、SoC100aの外部の情報処理装置等によって図7の処理の前に算出済みで、SoC100a内のメモリ等に記憶されていてもよい。また、ビデオシースルー用カメラ41も、キャリブレーション用カメラ42と同様にカメラ歪を有する。このため、ビデオシースルー用カメラ41のカメラ歪の影響を除去するための内部パラメータも同様に、SoC100a内のメモリ等に記憶されていてもよい。 The internal parameters of the calibration camera 42 are correction parameters for removing the effects of camera distortion of the calibration camera 42. The internal parameters can be calculated using a known calibration tool such as OpenCV (registered trademark). The internal parameters may be calculated by an information processing device external to the SoC 100a before the processing of Figure 7 and stored in memory within the SoC 100a. Furthermore, the video see-through camera 41, like the calibration camera 42, also has camera distortion. Therefore, the internal parameters for removing the effects of camera distortion of the video see-through camera 41 may also be stored in memory within the SoC 100a.
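For reference, the following sketch shows how internal parameters of this kind could be estimated with OpenCV from several captures of the checkerboard chart, and how they would then be used to remove camera distortion. The corner-pattern size and square length are placeholders; only the general use of a standard calibration tool is suggested by the description.

```python
import cv2
import numpy as np

def estimate_internal_parameters(gray_images, pattern=(9, 6), square=0.025):
    """Estimate camera intrinsics and distortion coefficients with OpenCV.

    gray_images: grayscale captures of the checkerboard chart. The corner
    pattern (9 x 6) and the 25 mm square size are placeholders; only the use
    of a standard calibration tool is suggested by the description.
    """
    objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
    objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2) * square
    obj_pts, img_pts = [], []
    for gray in gray_images:
        found, corners = cv2.findChessboardCorners(gray, pattern)
        if found:
            obj_pts.append(objp)
            img_pts.append(corners)
    _, K, dist, _, _ = cv2.calibrateCamera(
        obj_pts, img_pts, gray_images[0].shape[::-1], None, None)
    return K, dist

# Removing camera distortion with the estimated internal parameters:
#   undistorted = cv2.undistort(captured_image, K, dist)
```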

 次に、Color ISP16は、ビデオシースルー用カメラ41から、ビデオシースルー用カメラ41で被写体9bを撮影した画像データ(図8に示すビデオシースルー画像データ921)を取得する(S3)。 Next, Color ISP 16 acquires image data (video see-through image data 921 shown in Figure 8) of subject 9b captured by video see-through camera 41 from video see-through camera 41 (S3).

 そして、Display Controller19は、ディスプレイプロジェクタ5を制御して、ビデオシースルー画像データ921を透過ディスプレイ3に表示させる(S4)。図8では、透過ディスプレイ3に表示されたビデオシースルー画像データ921を、ビデオシースルー表示画像932として示す。また、S4の処理の後、被写体9bは、ヘッドマウントディスプレイ1aの前から作業者等によって除かれる。このため、キャリブレーション用カメラ42の画角に被写体9bが入らない状態となる。 Then, the display controller 19 controls the display projector 5 to display the video see-through image data 921 on the transparent display 3 (S4). In Figure 8, the video see-through image data 921 displayed on the transparent display 3 is shown as a video see-through display image 932. Furthermore, after the processing of S4, the subject 9b is removed from in front of the head-mounted display 1a by an operator or the like. As a result, the subject 9b is no longer within the angle of view of the calibration camera 42.

 次に、Color ISP16は、ビデオシースルー表示画像932が表示された透過ディスプレイ3をキャリブレーション用カメラ42で撮影した画像データを、キャリブレーション用カメラ42から取得する(S5)。当該画像データは、図8に示す第2のキャリブレーション用画像データ902である。また、第2のキャリブレーション用画像データ902は、本実施形態における第3画像データの一例である。 Next, Color ISP 16 acquires image data from the calibration camera 42, which is an image of the transmissive display 3 displaying the video see-through display image 932 (S5). This image data is the second calibration image data 902 shown in FIG. 8. The second calibration image data 902 is also an example of the third image data in this embodiment.

 そして、CPU23は、第2のキャリブレーション用画像データ902内の被写体9bの特徴点90を抽出する(S6)。第2のキャリブレーション用画像データ902から抽出された特徴点90を、第2の特徴点とする。より詳細には、S2における処理と同様に、CPU23は、第2の特徴点の抽出の前にキャリブレーション用カメラ42の内部パラメータを用いて、第2のキャリブレーション用画像データ902のカメラ歪を補正する。補正後の第2のキャリブレーション用画像データ902は、ヘッドマウントディスプレイ1aを装着したユーザから見えるビデオシースルー表示画像を疑似的に再現したものであるため、疑似シースルー表示画像データ940という。CPU23は、疑似シースルー表示画像データ940から第2の特徴点を抽出する。 Then, the CPU 23 extracts feature points 90 of the subject 9b in the second calibration image data 902 (S6). The feature points 90 extracted from the second calibration image data 902 are designated as second feature points. More specifically, similar to the processing in S2, the CPU 23 corrects camera distortion of the second calibration image data 902 using internal parameters of the calibration camera 42 before extracting the second feature points. The corrected second calibration image data 902 is a pseudo-reproduction of the video see-through display image seen by a user wearing the head-mounted display 1a, and is therefore referred to as pseudo see-through display image data 940. The CPU 23 extracts the second feature points from the pseudo see-through display image data 940.

 そして、CPU23は、S2で抽出した第1の特徴点と、S6で抽出した第2の特徴点から変形パラメータを算出する(S7)。当該変形パラメータは、上述の第1位置関係データである。当該変形パラメータは、疑似透過像データ911に含まれる被写体9bと、ビデオシースルー用カメラ41により撮影されるビデオシースルー画像データ921に含まれる被写体9bとの位置関係を示すデータである。キャリブレーション処理における疑似透過像データ911は実際のヘッドマウントディスプレイ1aの使用時における透過像に相当する。また、キャリブレーション処理においてビデオシースルー画像データ921から生成された疑似シースルー表示画像データ940は実際のヘッドマウントディスプレイ1aの使用時におけるビデオシースルー表示画像に相当する。このため、S7で算出される変形パラメータは、透過像に含まれる被写体と、ビデオシースルー用カメラ41により撮影されるビデオシースルー画像データに含まれる被写体との位置関係を示す。 The CPU 23 then calculates deformation parameters from the first feature points extracted in S2 and the second feature points extracted in S6 (S7). The deformation parameters are the first positional relationship data described above. The deformation parameters are data indicating the positional relationship between the subject 9b included in the pseudo-transmitted image data 911 and the subject 9b included in the video see-through image data 921 captured by the video see-through camera 41. The pseudo-transmitted image data 911 in the calibration process corresponds to the transmitted image when the head-mounted display 1a is actually in use. Furthermore, the pseudo-see-through display image data 940 generated from the video see-through image data 921 in the calibration process corresponds to the video see-through display image when the head-mounted display 1a is actually in use. Therefore, the deformation parameters calculated in S7 indicate the positional relationship between the subject included in the transmitted image and the subject included in the video see-through image data captured by the video see-through camera 41.

 より詳細には、S7の変形パラメータ(第1位置関係データ)の算出処理では、CPU23は、第1の特徴点と第2の特徴点との位置関係を計算することにより、第1の特徴点と第2の特徴点とを一致させることができる変形パラメータを算出する。図8のStep7-1に示すように、CPU23は、疑似シースルー表示画像データ940を疑似透過像データ911と一致するように補正可能な変形パラメータを算出する。また、図8のStep7-2に示すように、CPU23は、ビデオシースルー画像データ921における被写体9bの特徴点90と疑似シースルー表示画像データ940における被写体9bの特徴点90とに基づいて、表示系によって生じる歪量を算出する。表示系によって生じる歪量とは、レンズ2及び透過ディスプレイ3の特性により生じる画像の歪の大きさである。レンズ2及び透過ディスプレイ3の特性とは、例えば、レンズ2及び透過ディスプレイ3の屈曲等である。CPU23は、Step7-1で算出した変形パラメータとStep7-2で算出した歪量とに基づいて、変形パラメータ(第1位置関係データ)を算出する。さらに詳細には、CPU23は、Step7-1で算出した変形パラメータに対する変換処理を行う。当該変換処理については図9-11で後述する。 More specifically, in the calculation process of the deformation parameters (first positional relationship data) in S7, the CPU 23 calculates the positional relationship between the first feature point and the second feature point, thereby calculating deformation parameters that can match the first feature point and the second feature point. As shown in Step 7-1 of FIG. 8, the CPU 23 calculates deformation parameters that can correct the pseudo see-through display image data 940 so that it matches the pseudo transmitted image data 911. Also, as shown in Step 7-2 of FIG. 8, the CPU 23 calculates the amount of distortion caused by the display system based on the feature point 90 of the subject 9b in the video see-through image data 921 and the feature point 90 of the subject 9b in the pseudo see-through display image data 940. The amount of distortion caused by the display system is the magnitude of the image distortion caused by the characteristics of the lens 2 and the transparent display 3. The characteristics of the lens 2 and the transparent display 3 include, for example, the curvature of the lens 2 and the transparent display 3. The CPU 23 calculates the deformation parameters (first positional relationship data) based on the deformation parameters calculated in Step 7-1 and the distortion amount calculated in Step 7-2. More specifically, the CPU 23 performs a conversion process on the deformation parameters calculated in Step 7-1. This conversion process will be described later with reference to Figures 9-11.
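As a concrete illustration of Step 7-1, the sketch below fits a warp that maps the second feature points onto the first feature points. The description does not fix the warp model, so a planar homography computed with OpenCV is used here purely as an example; a mesh- or grid-based deformation could be fitted in the same way.

```python
import cv2
import numpy as np

def estimate_deformation_a(pts_second, pts_first):
    """Fit a warp that maps the second feature points (pseudo see-through
    display image) onto the first feature points (pseudo transmitted image).

    The warp model is not fixed by the description; a planar homography is
    used here only as an illustrative choice. pts_* are N x 2 arrays of
    corner coordinates in pixels.
    """
    H, _ = cv2.findHomography(pts_second.astype(np.float32),
                              pts_first.astype(np.float32), method=0)
    return H

# Applying the fitted warp to the pseudo see-through display image:
#   aligned = cv2.warpPerspective(pseudo_see_through, H, (width, height))
```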

 The CPU 23 then stores the calculated deformation parameters (first positional relationship data) in a memory such as the SRAM 14b or the Flash memory 31 (S8). At this point, the processing of this flowchart ends.

 図7に示すキャリブレーション処理は、上述のようにSoC100aによって実行されてもよいし、ヘッドマウントディスプレイ1aの外部の他の情報処理装置等によって実行されてもよい。他の情報処理装置は、例えば、高性能なPC(Personal Computer)等であってもよい。また、1つのヘッドマウントディスプレイ1aのSoC100aによって実行されたキャリブレーション処理で生成された第1位置関係データが、複数のヘッドマウントディスプレイ1aのメモリに格納されてもよい。 The calibration process shown in FIG. 7 may be executed by the SoC 100a as described above, or may be executed by another information processing device external to the head-mounted display 1a. The other information processing device may be, for example, a high-performance PC (Personal Computer). In addition, the first positional relationship data generated by the calibration process executed by the SoC 100a of one head-mounted display 1a may be stored in the memory of multiple head-mounted displays 1a.

 ここで、図7、8で説明した変形パラメータ(第1位置関係データ)の算出処理について、図9-11を用いてより具体的に説明する。 Here, the calculation process for the transformation parameters (first positional relationship data) described in Figures 7 and 8 will be explained in more detail using Figures 9-11.

 図9は、第1の実施形態に係るキャリブレーション処理における各画像データに含まれる歪の一例を示す図である。上述の図8に図示されたStep7-1で直接的に算出可能な変形パラメータは、キャリブレーション用カメラ42によって撮影された第2のキャリブレーション用画像データ902に基づく疑似シースルー表示画像データ940に対する変形パラメータAである。 FIG. 9 is a diagram showing an example of distortion contained in each image data in the calibration process according to the first embodiment. The deformation parameter that can be directly calculated in Step 7-1 shown in FIG. 8 above is deformation parameter A for the pseudo see-through display image data 940 based on the second calibration image data 902 captured by the calibration camera 42.

 図9に示すように、図7のS3で取得されたビデオシースルー画像データ921は、Step4で透過ディスプレイ3に表示される際に表示系による歪の影響を受ける。このため、Step5で取得される第2のキャリブレーション用画像データ902も、当該表示系による歪を含む。また、第2のキャリブレーション用画像データ902はキャリブレーション用カメラ42による歪(カメラ歪)を含む。当該カメラ歪についてはStep6においてCPU23が内部パラメータを用いた補正で除去する。そして、CPU23は、Step7-1で、カメラ歪が除去された疑似シースルー表示画像データ940を疑似透過像データ911と一致するように補正可能な変形パラメータAを算出する。変形パラメータAによる変形後の疑似シースルー表示画像データ940aは、疑似透過像データ911と一致する。 As shown in FIG. 9, the video see-through image data 921 acquired in S3 of FIG. 7 is affected by distortion due to the display system when it is displayed on the transmissive display 3 in Step 4. For this reason, the second calibration image data 902 acquired in Step 5 also includes distortion due to the display system. The second calibration image data 902 also includes distortion (camera distortion) due to the calibration camera 42. In Step 6, the CPU 23 removes this camera distortion through correction using internal parameters. Then, in Step 7-1, the CPU 23 calculates a deformation parameter A that can correct the pseudo see-through display image data 940 from which the camera distortion has been removed so that it matches the pseudo transmission image data 911. The pseudo see-through display image data 940a after deformation using the deformation parameter A matches the pseudo transmission image data 911.

 しかしながら、ヘッドマウントディスプレイ1aの使用時には、キャリブレーション用カメラ42は用いられないため、使用時における変形対象はビデオシースルー画像データ921となる。 However, when the head-mounted display 1a is in use, the calibration camera 42 is not used, and therefore the object to be deformed during use is the video see-through image data 921.

 図10は、第1の実施形態に係るヘッドマウントディスプレイ1aの使用時における各画像データに含まれる歪の一例を示す図である。また、図11は、図10の変形処理Bの内訳の一例を示す図である。 FIG. 10 is a diagram showing an example of distortion contained in each image data when the head-mounted display 1a according to the first embodiment is in use. Also, FIG. 11 is a diagram showing an example of the breakdown of deformation process B in FIG. 10.

 図10に示すように、ヘッドマウントディスプレイ1aの使用時における補正処理は、透過ディスプレイ3に表示されたビデオシースルー表示画像933内の被写体の特徴点が透過像内の被写体の特徴点と一致することを目的とする。 As shown in Figure 10, the correction process when using the head-mounted display 1a aims to ensure that the feature points of the subject in the video see-through display image 933 displayed on the transmissive display 3 match the feature points of the subject in the transmissive image.

 このため、ヘッドマウントディスプレイ1aの使用時の変形処理Bでは、変形後のビデオシースルー画像データ921aに生じる歪みを考慮した変形パラメータBが必要となる。 For this reason, deformation process B when using the head-mounted display 1a requires deformation parameters B that take into account the distortion that occurs in the video see-through image data 921a after deformation.

 具体的には、ビデオシースルー画像データ921が透過ディスプレイ3に表示される場合、表示系による歪、及びビデオシースルー用カメラ41のカメラ歪の影響がある。カメラ歪の影響については、内部パラメータによる補正によって除去されるため、変形処理では考慮が不要となる。 Specifically, when the video see-through image data 921 is displayed on the transmissive display 3, it is affected by distortion due to the display system and camera distortion of the video see-through camera 41. The effect of camera distortion is removed by correction using internal parameters, so it does not need to be taken into account in the transformation process.

 As shown in FIG. 11, deformation process B is therefore a composition of the deformation process S701 that applies the distortion caused by the display system, the deformation process S702 that applies deformation parameter A calculated in the calibration process, and the inverse deformation process S703 that cancels the distortion caused by the display system. The CPU 23 obtains deformation parameter B by combining the deformation parameters of processes S701 to S703. Deformation parameter B indicates, for example, the difference in the positions of the feature points 90 (for example, lattice points) of the subject 9b before and after processes S701 to S703. This deformation parameter B is the first positional relationship data.
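A minimal sketch of this composition is shown below, assuming each of the three steps is available as a function that maps pixel coordinates to warped coordinates. Representing the warps as coordinate functions (rather than, say, dense remap tables) is an illustrative choice.

```python
import numpy as np

def compose_deformation_b(display_distort, warp_a, display_undistort, grid_points):
    """Compose the three steps of FIG. 11 into deformation parameter B.

    Each of the first three arguments is assumed to be a function mapping an
    (N, 2) array of pixel coordinates to warped coordinates; grid_points holds
    the lattice (feature point) positions. Representing the warps as coordinate
    functions is an illustrative choice.
    """
    p = display_distort(grid_points)    # S701: apply the display-system distortion
    p = warp_a(p)                       # S702: apply deformation parameter A
    p = display_undistort(p)            # S703: cancel the display-system distortion
    # Deformation parameter B: displacement of each lattice point over S701-S703.
    return p - np.asarray(grid_points, dtype=np.float64)
```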

 次に、ヘッドマウントディスプレイ1aの使用時の処理について説明する。 Next, we will explain the processing that occurs when using the head-mounted display 1a.

 FIG. 12 is a flowchart showing an example of the flow of the display data generation process when the head-mounted display 1a according to the first embodiment is in use. It is assumed that, before the processing of FIG. 12, the calibration process of FIG. 7 has been completed and the deformation parameters (first positional relationship data) have been stored in a memory such as the SRAM 14b or the Flash memory 31.

 まず、Color ISP16は、ビデオシースルー用カメラ41からビデオシースルー画像データを取得する(S21)。また、Color ISP16は、取得したビデオシースルー画像データをRGB画像データに変換する。STAT21は、Color ISP16によってRGB画像に変換されたビデオシースルー画像データに対して、ヒストグラムの抽出、矩形ブロック単位の輝度の平均値の算出、及び飽和画素数のカウントを行う。Color ISP16及びSTAT21は、RGB画像データに変換されたビデオシースルー画像データ、ヒストグラム、矩形ブロック単位の輝度の平均値の算出結果、及び飽和画素数をSRAM14a~14cまたはDRAM32等のメモリに格納する。以下、特に区別する必要がある場合を除き、RGB画像データに変換されたビデオシースルー画像データも、単にビデオシースルー画像データという。 First, Color ISP 16 acquires video see-through image data from video see-through camera 41 (S21). Color ISP 16 then converts the acquired video see-through image data into RGB image data. STAT 21 extracts a histogram, calculates the average brightness value for each rectangular block, and counts the number of saturated pixels for the video see-through image data converted into an RGB image by Color ISP 16. Color ISP 16 and STAT 21 store the video see-through image data converted into RGB image data, the histogram, the calculation results of the average brightness value for each rectangular block, and the number of saturated pixels in memory such as SRAM 14a-14c or DRAM 32. Hereinafter, unless a distinction is needed, video see-through image data converted into RGB image data will also be simply referred to as video see-through image data.

 Next, the Warp 18 acquires the deformation parameters (first positional relationship data) generated in the calibration process of FIG. 7 from the SRAMs 14a to 14c or the like (S22).

 また、CPU23は、RGB画像データに変換されたビデオシースルー画像データに対する明所・暗所領域の抽出処理を行う(S23)。例えば、CPU23は、ビデオシースルー画像データの明所の領域を抽出し、ビデオシースルー画像データの明所の領域と第1位置関係データとに基づいて透過像の明所の領域を算出する。明所・暗所領域の抽出処理の詳細については後述する。 The CPU 23 also performs a process of extracting bright and dark areas from the video see-through image data converted into RGB image data (S23). For example, the CPU 23 extracts bright areas from the video see-through image data, and calculates the bright areas of the transmitted image based on the bright areas from the video see-through image data and the first positional relationship data. Details of the process of extracting bright and dark areas will be described later.

 また、Warp18は、第1位置関係データにより定義された位置合わせ補正量に応じて、ビデオシースルー画像データの少なくとも一部を変形する(S24)。当該変形によって、透過像に含まれる被写体の輪郭とビデオシースルー表示画像に含まれる被写体の輪郭を揃えることができる。このため、透過像が視認可能な場合でも透過像に含まれる被写体の輪郭とビデオシースルー表示画像に含まれる被写体の輪郭が二重に見えることを抑制することができる。 Furthermore, Warp 18 deforms at least a portion of the video see-through image data according to the alignment correction amount defined by the first positional relationship data (S24). This deformation makes it possible to align the contour of the subject included in the transmitted image with the contour of the subject included in the video see-through display image. Therefore, even when the transmitted image is visible, it is possible to prevent the contour of the subject included in the transmitted image and the contour of the subject included in the video see-through display image from appearing double.

 As described above, the head-mounted display 1a displays the video see-through display image as a visual aid in dark places, so not all of the video see-through image data is necessarily displayed. The Warp 18 therefore corrects only the regions that need to be displayed, based on the result of the bright/dark area extraction process of S23.

 図13は、第1の実施形態に係るビデオシースルー画像データ922の補正の一例を示す図である。図13に示すように、Warp18は、ビデオシースルー画像データ922のうち、明所・暗所領域の抽出処理で暗所もしくは薄明りと判定された領域のみ変形を行う。Warp18は、明所と判定された領域については変形しない。例えば、Warp18はビデオシースルー画像データ922のうち、表示対象として抽出した暗所もしくは薄明りに該当する領域のうち、透過像の明所の領域と接する部分を変形させて表示データを生成する。 FIG. 13 is a diagram showing an example of correction of video see-through image data 922 according to the first embodiment. As shown in FIG. 13, Warp 18 deforms only areas of the video see-through image data 922 that have been determined to be dark or dimly lit in the process of extracting bright and dark areas. Warp 18 does not deform areas determined to be bright. For example, Warp 18 deforms the parts of the video see-through image data 922 that are dark or dimly lit and extracted as display targets, and that border the bright areas of the transmitted image, to generate display data.

 Warp18は、補正対象外の領域には値“0”を設定する。図13では、ビデオシースルー画像データ922のうち、両端の領域については明所と判定されたため、Warp18が当該領域に値“0”を設定する。値“0”が設定された領域においては、元の画像は削除される。補正後のビデオシースルー画像データ922aのうち値“0”が設定された領域は、透過ディスプレイ3の表示の際に何も表示されない。図13の補正後のビデオシースルー画像データ922aにおいて黒色で示される領域は、値“0”が設定されているため、実際に透過ディスプレイ3に表示された場合は背景を透過する。このため、値“0”が設定された領域については、ユーザは透過像のみを視認する。このようにWarp18が補正対象領域を絞ることにより、補正に必要な演算量を削減することができる。Warp18は、補正後のビデオシースルー画像データ922aをDisplay Controller19に出力する。 Warp 18 sets the value "0" to areas not subject to correction. In Figure 13, the areas at both ends of the video see-through image data 922 are determined to be bright areas, so Warp 18 sets the value "0" to those areas. In areas where the value "0" is set, the original image is deleted. In areas of the corrected video see-through image data 922a where the value "0" is set, nothing is displayed when displayed on the transmissive display 3. The areas shown in black in the corrected video see-through image data 922a in Figure 13 have the value "0" set, so when actually displayed on the transmissive display 3, the background is visible through them. Therefore, in areas where the value "0" is set, the user only sees the transmissive image. By Warp 18 narrowing down the areas subject to correction in this way, the amount of calculation required for correction can be reduced. Warp 18 outputs the corrected video see-through image data 922a to the Display Controller 19.
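The sketch below illustrates this masking, assuming the deformation defined by the first positional relationship data has been expanded into dense remap tables. For simplicity the whole frame is remapped and then masked; an implementation that follows the description more closely would restrict the remap itself to the dark and twilight blocks to reduce the computation further.

```python
import cv2
import numpy as np

def warp_display_regions(vst_rgb, map_x, map_y, display_mask):
    """Warp the video see-through data and keep only the display-target regions.

    map_x / map_y: float32 remap tables derived from the first positional
    relationship data; display_mask: boolean H x W array that is True for the
    dark and twilight blocks. For simplicity the whole frame is remapped and
    then masked.
    """
    warped = cv2.remap(vst_rgb, map_x, map_y, interpolation=cv2.INTER_LINEAR)
    out = np.zeros_like(warped)          # value "0" everywhere by default
    out[display_mask] = warped[display_mask]
    return out
```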

 図12に戻り、Display Controller19は、S23の明所・暗所領域の抽出処理の結果に基づいて、補正後のビデオシースルー画像データ922aの一部の明るさを調整する(S25)。Display Controller19は、例えば、明るさを補正するガンマ補正を実施する。また、Display Controller19は、色を調整するための色補正等を実施してもよい。 Returning to FIG. 12, the Display Controller 19 adjusts the brightness of a portion of the corrected video see-through image data 922a based on the results of the bright and dark area extraction process of S23 (S25). The Display Controller 19 performs, for example, gamma correction to correct the brightness. The Display Controller 19 may also perform color correction to adjust the colors.

 そして、Display Controller19は、補正後のビデオシースルー画像データ922aを透過ディスプレイ3に表示可能な形式に変換することにより、表示データを生成する(S26)。透過ディスプレイ3によって処理内容は異なるが、Display Controller19は、例えば、解像度を合わせるためのリサイズ等を実施する。 Then, the Display Controller 19 generates display data by converting the corrected video see-through image data 922a into a format that can be displayed on the transmissive display 3 (S26). The processing content differs depending on the transmissive display 3, but the Display Controller 19 performs, for example, resizing to match the resolution.

 そして、Display Controller19は、生成した表示データ及びS23の明所・暗所領域の抽出処理で生成された透過率データを出力する(S27)。より詳細には、生成した表示データをディスプレイプロジェクタ5に出力して透過ディスプレイ3にビデオシースルー表示画像を表示させる。また、Display Controller19は、透過率データに基づいて、レンズ2の液晶シャッタの透過度合を制御する。例えば、Display Controller19は、透過像の明所の領域に対応する透過ディスプレイ3の領域の透過率を、透過像の明所の領域を算出したときの第1透過率よりも小さい第2透過率に制御する。 Then, the Display Controller 19 outputs the generated display data and the transmittance data generated in the bright and dark area extraction process of S23 (S27). More specifically, the generated display data is output to the display projector 5, causing the transmissive display 3 to display a video see-through display image. The Display Controller 19 also controls the transmittance of the liquid crystal shutter of the lens 2 based on the transmittance data. For example, the Display Controller 19 controls the transmittance of the area of the transmissive display 3 corresponding to the bright area of the transmitted image to a second transmittance that is smaller than the first transmittance used when the bright area of the transmitted image was calculated.

 ここで、このフローチャートの処理は終了する。図12のフローチャートの処理は、ヘッドマウントディスプレイ1aがユーザに使用されている間は繰り返し実行される。なお、ユーザの視覚的な違和感を低減させるためには、図12の処理は、一例として、90~60Hzのリフレッシュレートで実行されることが望ましい。 Here, the processing of this flowchart ends. The processing of the flowchart in Figure 12 is repeatedly executed while the head-mounted display 1a is being used by the user. Note that, in order to reduce the user's visual discomfort, it is desirable that the processing of Figure 12 be executed at a refresh rate of 90 to 60 Hz, as an example.

 図14は、第1の実施形態に係る透過ディスプレイ3と液晶シャッタによる視覚補助の原理について説明する図である。図14では、図12のS27の処理により、透過ディスプレイ3にビデオシースルー表示画像が表示され、レンズ2の液晶シャッタによる減光が実施された状態を示す。ヘッドマウントディスプレイ1aを装着したユーザの目8と、実像(被写体)との間には、液晶シャッタ機能付きレンズ2と、透過ディスプレイ3とが存在する。 FIG. 14 is a diagram illustrating the principle of visual assistance using the transmissive display 3 and liquid crystal shutter according to the first embodiment. FIG. 14 shows a state in which a video see-through display image is displayed on the transmissive display 3 and light reduction is performed by the liquid crystal shutter of the lens 2 as a result of the processing of S27 in FIG. 12. The lens 2 with liquid crystal shutter function and the transmissive display 3 are located between the eye 8 of the user wearing the head-mounted display 1a and the real image (subject).

 例えば、図14では、実像において暗所(暗い領域)71と明所(明るい領域)72が存在する。Display Controller19は、透過ディスプレイ3上の実像の暗所が透過する領域ではビデオシースルー表示画像を表示させる。このため、暗所71が透過する領域においては、ユーザには暗い透過像ではなく、明るいビデオシースルー表示画像934が見える。なお、暗所71が透過する領域において、仮に、透過像が少し見えていたとしても、S24の変形処理によってビデオシースルー表示画像934が透過像に合わせて補正されているため、二重像は発生しない。 For example, in Figure 14, there are dark areas (dark regions) 71 and bright areas (bright regions) 72 in the real image. The Display Controller 19 displays a video see-through display image in the areas where the dark areas of the real image on the transmissive display 3 are transmitted. Therefore, in the areas where the dark areas 71 are transmitted, the user sees a bright video see-through display image 934, not a dark transmitted image. Note that even if a small amount of the transmitted image is visible in the areas where the dark areas 71 are transmitted, no double image occurs because the video see-through display image 934 has been corrected to match the transmitted image by the transformation process in S24.

 また、透過ディスプレイ3上の明所72が透過する領域においては、ビデオシースルー表示画像934の各画素に値“0”が設定されているため、ビデオシースルー表示画像934は透過ディスプレイ3に表示されない。また、明所72の明るさの程度によっては、レンズ2の液晶シャッタ機能により、明所72が透過する領域が減光される。 Furthermore, in the area on the transmissive display 3 where the bright area 72 is transmitted, a value of "0" is set for each pixel of the video see-through display image 934, so the video see-through display image 934 is not displayed on the transmissive display 3. Furthermore, depending on the brightness of the bright area 72, the liquid crystal shutter function of the lens 2 dims the area where the bright area 72 is transmitted.

 図15は、第1の実施形態に係る透過ディスプレイ3の表示態様の一例を示す図である。図15に示すように、ビデオシースルー表示画像934及び液晶シャッタによる視覚補助がない状態では、透過ディスプレイ3のうち暗所71が透過する領域310においては、ユーザは暗さのために対象物を見ることができない場合がある。また、透過ディスプレイ3のうち明所72が透過する領域320においては、ユーザはまぶしさを感じるために対象物を見ることができない場合がある。 FIG. 15 is a diagram showing an example of the display mode of the transmissive display 3 according to the first embodiment. As shown in FIG. 15, without the visual aids of the video see-through display image 934 and the liquid crystal shutter, the user may be unable to see objects in the region 310 of the transmissive display 3 through which the dark area 71 is transmitted due to darkness. Furthermore, in the region 320 of the transmissive display 3 through which the bright area 72 is transmitted, the user may be unable to see objects due to glare.

 これに対して、ビデオシースルー表示画像934及び液晶シャッタによる視覚補助がある状態では、透過ディスプレイ3のうち暗所71が透過する領域310においては、ビデオシースルー表示画像934の表示によりユーザの視認性が向上する。また、透過ディスプレイ3のうち明所72が透過する領域320においては、液晶シャッタの遮光によってまぶしさが軽減されることによりユーザの視認性が向上する。 In contrast, with the visual aids of the video see-through display image 934 and the liquid crystal shutter, the user's visibility is improved in the region 310 of the transmissive display 3 through which the dark area 71 is transmitted by the video see-through display image 934. Furthermore, in the region 320 of the transmissive display 3 through which the bright area 72 is transmitted by the liquid crystal shutter, the glare is reduced, thereby improving the user's visibility.

 次に、上述の図12のS23の明所・暗所領域の抽出処理の詳細について説明する。図16は、第1の実施形態に係る明所・暗所領域の抽出処理の流れの一例を示す図である。 Next, the details of the bright and dark area extraction process in S23 of Figure 12 described above will be explained. Figure 16 is a diagram showing an example of the flow of the bright and dark area extraction process according to the first embodiment.

 まず、CPU23は、シーンの明るさ(EV(Exposure Value)値)を取得する(S231)。一般に、通常のカメラにおいて、撮影される画像データの明るさを一定に保つため、AEによってシーンの明るさを推定し、シャッタ速度や感度を決定する処理がある。本実施形態のCPU23は、このAEの情報を用いることで、シーンの暗部の明るさ(EV値)を取得する。シーンの明るさとは、ヘッドマウントディスプレイ1aを装着したユーザの周囲の明るさである。 First, the CPU 23 acquires the brightness of the scene (EV (Exposure Value) value) (S231). Generally, in a normal camera, in order to keep the brightness of the captured image data constant, the brightness of the scene is estimated using AE and the shutter speed and sensitivity are determined. In this embodiment, the CPU 23 uses this AE information to acquire the brightness of dark areas of the scene (EV value). The brightness of the scene is the brightness around the user wearing the head-mounted display 1a.

 また、CPU23は、取得したEV値に基づいて、レンズ2の透過率を決定する(S232)。レンズ2の透過率は、液晶シャッタの遮光の度合いを示す。 The CPU 23 also determines the transmittance of lens 2 based on the acquired EV value (S232). The transmittance of lens 2 indicates the degree of light blocking by the liquid crystal shutter.

 そして、CPU23は、ビデオシースルー画像データ923における明所領域及び暗所領域を抽出する(S233)。明所領域及び暗所領域の抽出は、換言すれば、ビデオシースルー画像データ923における明所領域に該当する範囲、及び暗所領域に該当する範囲の特定である。ここで、このフローチャートの処理は終了する。 Then, the CPU 23 extracts bright and dark areas from the video see-through image data 923 (S233). Extracting bright and dark areas means, in other words, identifying the ranges that correspond to bright and dark areas in the video see-through image data 923. At this point, the processing of this flowchart ends.

 ここで、図16のS231のEV値の取得処理の詳細について説明する。図17は、第1の実施形態に係るEV値の取得処理の流れの一例を示す図である。図17に示すS301~S307の処理は、CPU23により実行される。CPU23は、Color ISP16が取得したビデオシースルー画像データ923からSTAT21が生成した、ヒストグラム、及び矩形ブロック単位の輝度の平均値の算出結果を取得する。 Here, the details of the EV value acquisition process of S231 in Figure 16 will be explained. Figure 17 is a diagram showing an example of the flow of the EV value acquisition process according to the first embodiment. The processes of S301 to S307 shown in Figure 17 are executed by the CPU 23. The CPU 23 acquires the histogram and the calculation results of the average brightness value for each rectangular block generated by STAT21 from the video see-through image data 923 acquired by Color ISP16.

 CPU23は、ヒストグラムから暗所の基準となる閾値を算出する。例えば、CPU23は、ヒストグラムを要素0から積算し全画素数のN%を超えたときの要素番号を閾値とする(S301)。図18は、第1の実施形態に係るヒストグラムに基づいて決定される閾値の一例を示す図である。CPU23は、ヒストグラムから閾値を算出することにより、ビデオシースルー画像データ923に応じた閾値を動的に設定することができる。 The CPU 23 calculates a threshold value that serves as a reference for dark places from the histogram. For example, the CPU 23 integrates the histogram from element 0 and sets the element number when the integrated value exceeds N% of the total number of pixels as the threshold value (S301). Figure 18 is a diagram showing an example of a threshold value determined based on a histogram according to the first embodiment. By calculating the threshold value from the histogram, the CPU 23 can dynamically set a threshold value that corresponds to the video see-through image data 923.
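A minimal sketch of this threshold calculation is shown below; the percentage N is not specified in the description, so the default used here is only a placeholder.

```python
import numpy as np

def dark_threshold(hist, n_percent=10.0):
    """Return the luminance bin at which the cumulative histogram first reaches
    N% of all pixels (the dark-area reference threshold of S301).

    The value of N is not given in the description; 10% is only a placeholder.
    """
    cumulative = np.cumsum(hist)
    return int(np.searchsorted(cumulative, cumulative[-1] * n_percent / 100.0))
```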

 図17に戻り、CPU23は、STAT21から取得した矩形ブロック単位の輝度の平均値から、輝度が閾値以下の領域の平均値を取得する(S302)。当該平均値を、STAT平均値という。 Returning to FIG. 17, the CPU 23 obtains the average value of the area where the brightness is equal to or less than the threshold value from the average brightness values of rectangular blocks obtained from the STAT 21 (S302). This average value is called the STAT average value.

 そして、CPU23は、STAT平均値と現在のセンサモジュール設定値(露光時間・センサゲイン・絞り)からシーンの明るさ(EV値)を推定する(S303)。 Then, the CPU 23 estimates the brightness (EV value) of the scene from the STAT average value and the current sensor module settings (exposure time, sensor gain, and aperture) (S303).
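The estimation formula itself is not reproduced in this text. The sketch below therefore uses a standard photographic approximation rather than the document's own equation: the exposure value of the current sensor settings is normalised to ISO 100 and offset by how far the measured dark-block average deviates from a nominal mid-grey level. The target level and the function name are assumptions.

```python
import math

def estimate_scene_ev(stat_avg, exposure_s, iso, f_number, target_avg=46.0):
    """Rough scene-brightness estimate (EV) from the dark-block average and the
    current sensor settings.

    This is a standard photographic approximation, not the document's own
    formula; target_avg is an illustrative mid-grey level for 8-bit data.
    """
    ev = math.log2(f_number ** 2 / exposure_s)          # EV of the current settings
    ev -= math.log2(iso / 100.0)                        # normalise to ISO 100
    ev += math.log2(max(stat_avg, 1e-3) / target_avg)   # scene brighter/darker than metered
    return ev
```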

 また、CPU23は、推定した明るさに基づいて、新しいセンサ設定値を各種センサ(Ambient Light Sensor60等)に設定する(S304)。これにより、実際の明るさの程度に合わせでセンサの感度が調整可能である。 The CPU 23 also sets new sensor setting values for various sensors (such as the Ambient Light Sensor 60) based on the estimated brightness (S304). This allows the sensor sensitivity to be adjusted to match the actual level of brightness.

 また、CPU23は、I2Cインタフェース22を介してAmbient Light Sensor60から環境光の明るさ(Lux)を取得する(S305)。 The CPU 23 also acquires the ambient light brightness (Lux) from the Ambient Light Sensor 60 via the I2C interface 22 (S305).

 そして、CPU23は、公知のEV/Lux変換表に基づいて、LuxをEV値に変換する(S306)。図19は、一般的なEV/Lux変換表の一例を示す図である。EV値はカメラの露出補正値を決定する際に用いるための数値で、一般的な明るさを表す単位である照度(Lux値)とは、おおよそ図19の表のような対応関係があることが知られている。EV値が高いほどシーンが明るく、EV値が低いほどシーンが暗い。CPU23は、図19に示す例のように、EV値“-4”以上“0”未満を「暗所」、EV値“0”以上“2”未満を「薄明り」、EV値“15”以上を「明所」と判定してもよい。なお、図19に示す「暗所」、「薄明り」、「明所」の基準は一例であり、これに限定されるものではない。また、図19ではEV値“2”以上“15”未満を「日中」として「明所」と区別しているが、「日中」も「明所」に含まれてもよい。 Then, the CPU 23 converts Lux to an EV value based on a known EV/Lux conversion table (S306). Figure 19 is a diagram showing an example of a common EV/Lux conversion table. The EV value is a numerical value used when determining the exposure compensation value of a camera, and it is known that there is a correspondence relationship between the EV value and illuminance (Lux value), a common unit of brightness, roughly as shown in the table in Figure 19. The higher the EV value, the brighter the scene, and the lower the EV value, the darker the scene. As in the example shown in Figure 19, the CPU 23 may determine that an EV value of "-4" or greater and less than "0" is a "dark place," an EV value of "0" or greater and less than "2" is a "twilight" place, and an EV value of "15" or greater is a "bright place." Note that the criteria for "dark place," "twilight," and "bright place" shown in Figure 19 are merely examples and are not limited to these. Also, in Figure 19, EV values of "2" or greater and less than "15" are considered "daytime" and distinguished from "bright places," but "daytime" may also be included in "bright places."
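A sketch of this conversion and of the example classification is given below. It relies on the common approximation lux ≈ 2.5 × 2^EV; the EV/Lux table and the category boundaries of FIG. 19 are examples, so the values used here are illustrative.

```python
import math

def lux_to_ev(lux):
    """Convert an ambient-light-sensor reading (lux) to an EV value, using the
    common approximation lux = 2.5 x 2^EV (ISO 100, incident-light constant
    C = 250). The EV/Lux table of FIG. 19 may differ slightly."""
    return math.log2(max(lux, 1e-3) / 2.5)

def classify_ev(ev):
    """Illustrative classification following the example ranges of FIG. 19."""
    if ev < 0:
        return "dark"
    if ev < 2:
        return "twilight"
    if ev >= 15:
        return "bright"
    return "daytime"
```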

 図17に戻り、CPU23は、Ambient Light Sensor60からの環境光の明るさの計測結果が取得可能な場合はAmbient Light Sensor60の検出結果(ALS明るさ)、取得不可の場合はAEの情報(AE明るさ)を使用する(S307)。Ambient Light Sensor60による計測結果を使用する場合、AEの情報を使用する場合よりも更新間隔は長くなるが、シーンの明るさの推定処理のための消費電力を低減することができる。また、撮影画像データは暗所が明るく撮影できるようセンサ設定値を設定しているため明所は飽和しており、撮影画像データに基づいたAEの情報は明所の明るさを正確に測定できない場合がある。明所の明るさの度合いを含むALSの検出結果を用いる方がより正確な周辺環境の明るさ情報が取得可能となる。CPU23は、ALS明るさまたはAE明るさのいずれかに基づくEV値をDisplay Controller19に出力する。なお、CPU23は、シーンの明るさの推定にALS明るさとAE明るさの両方を使用してもよい。 Returning to FIG. 17, if the ambient light brightness measurement results from the Ambient Light Sensor 60 are available, the CPU 23 uses the detection results (ALS brightness) of the Ambient Light Sensor 60; if the measurement results are unavailable, the CPU 23 uses the AE information (AE brightness) (S307). When the measurement results from the Ambient Light Sensor 60 are used, the update interval is longer than when AE information is used, but power consumption for scene brightness estimation processing can be reduced. Furthermore, since the sensor settings are set so that dark areas can be captured brightly, bright areas are saturated, and AE information based on the captured image data may not accurately measure the brightness of bright areas. Using the ALS detection results, which include the degree of brightness in bright areas, makes it possible to obtain more accurate information about the brightness of the surrounding environment. The CPU 23 outputs an EV value based on either the ALS brightness or the AE brightness to the Display Controller 19. The CPU 23 may use both the ALS brightness and the AE brightness to estimate the brightness of a scene.

 CPU23は、シーンの明るさの推定にAEの情報を用いる場合、明所の明るさを取得するために、ヒストグラムの最大値とSTAT平均値に基づいてEV値を補正する。具体的には、CPU23は、AEの情報に基づくEV値を下記の式(1)で補正する。 When using AE information to estimate the brightness of a scene, the CPU 23 corrects the EV value based on the maximum value of the histogram and the average STAT value to obtain the brightness of a bright place. Specifically, the CPU 23 corrects the EV value based on the AE information using the following formula (1).
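Formula (1) is not reproduced in this text, so the sketch below must not be read as the patent's equation; it only shows one plausible form of such a correction, in which the EV offset is the base-2 logarithm of the ratio between the brightest populated histogram bin and the dark-block average.

```python
import math

def correct_ev_for_bright_area(ev_dark, brightest_bin, stat_avg):
    """Shift an AE-based EV estimate toward the bright-area brightness.

    This is NOT formula (1), which is not reproduced in this text; it is only
    one plausible form, taking the EV offset as log2 of the ratio between the
    brightest populated histogram bin and the dark-block average luminance.
    """
    return ev_dark + math.log2(max(brightest_bin, 1.0) / max(stat_avg, 1.0))
```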

 ここで、図16のS232のレンズ2の透過率の決定処理の詳細について説明する。CPU23は、S231で取得したシーンの明るさを示すEV値に応じて、レンズ2の透過率を決定する。 Here, the process of determining the transmittance of lens 2 in S232 of FIG. 16 will be described in detail. The CPU 23 determines the transmittance of lens 2 according to the EV value indicating the brightness of the scene obtained in S231.

 When the EV value is at or above a certain threshold, the scene is so bright that it becomes visually difficult for the user to see. The CPU 23 therefore mitigates this loss of visibility by controlling the transmittance of the lens 2 according to the degree of brightness. Each increase of 1 in the EV value roughly doubles the brightness (Lux). For example, the CPU 23 may treat an EV value of 15 or higher as a bright place, as shown in FIG. 19, and double the degree of light blocking by the lens 2 for every increase of 1 in the EV value above 15. By controlling the liquid crystal shutter in this way, the brightness of the image transmitted through the lens 2 is kept from reaching an EV value of 14 or higher, which reduces the glare perceived by the user.

 図20は、第1の実施形態に係るEV値とレンズ2の透過率との関係の一例を示す図である。一例として、図20のグラフのように透過率は推移する。通常においてレンズ2の最大透過率は100%にすることはできないため、図20では例として最大透過率を80%とした。図20に示す例では、EV値が1増える(明るさが2倍になる)ごとに透過率が半分になる。CPU23は、決定した透過率に基づいて液晶シャッタの遮光度合を設定する。 FIG. 20 is a diagram showing an example of the relationship between the EV value and the transmittance of the lens 2 according to the first embodiment. As an example, the transmittance changes as shown in the graph in FIG. 20. Normally, the maximum transmittance of the lens 2 cannot be set to 100%, so in FIG. 20, the maximum transmittance is set to 80% as an example. In the example shown in FIG. 20, the transmittance is halved every time the EV value increases by 1 (brightness doubles). The CPU 23 sets the degree of light blocking of the liquid crystal shutter based on the determined transmittance.
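The example of FIG. 20 can be written directly as a small control function, shown below; the EV threshold of 15 and the 80% ceiling are the example values given above.

```python
def lens_transmittance(ev, bright_ev=15.0, max_transmittance=0.8):
    """Transmittance control following the example of FIG. 20.

    The transmittance stays at the 80% ceiling up to the bright-place threshold
    (EV 15) and is halved for every further increase of 1 EV; both numbers are
    the example values given in the description.
    """
    if ev <= bright_ev:
        return max_transmittance
    return max_transmittance * (0.5 ** (ev - bright_ev))
```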

 ここで、図16のS233明所・暗所領域の抽出処理の詳細について説明する。図21は、第1の実施形態に係るビデオシースルー画像データ923の矩形ブロック単位のEV値の推定処理の一例を示す図である。 Here, the details of the bright and dark area extraction process S233 in Figure 16 will be explained. Figure 21 is a diagram showing an example of the EV value estimation process for rectangular blocks of video see-through image data 923 according to the first embodiment.

 CPU23は、視覚補正が必要な箇所を判定するために、ビデオシースルー画像データ923上の場所ごとの明るさを推定する。当該明るさの推定処理は画素単位で行うことも可能だが、STAT21から取得されるブロックごとの平均値を用いてブロック単位で明るさを推定することにより演算量が削減可能となる。なお、図21に示す各種数値は一例であり、これに限定されるものではない。 The CPU 23 estimates the brightness for each location on the video see-through image data 923 to determine areas where visual correction is required. This brightness estimation process can be performed on a pixel-by-pixel basis, but the amount of calculation can be reduced by estimating the brightness on a block-by-block basis using the average value for each block obtained from STAT 21. Note that the various numerical values shown in Figure 21 are merely examples and are not limited to these.

 CPU23は、ブロックごとの平均値をEV値に変換する。 The CPU 23 converts the average value for each block into an EV value.

 具体的には、CPU23は、下記の式(2)を用いてブロックごとの明るさ(EV値)の推定処理を行う。 Specifically, the CPU 23 estimates the brightness (EV value) for each block using the following equation (2):

 また、ビデオシースルー画像データ923はレンズ2を介さずに撮影された画像であるが、実際にユーザの目8で見えるシーンの明るさはレンズ2を透過した明るさである。このため、CPU23は、下記の式(3)を用いて、処理時点のレンズ2の透過率に基づいて各ブロックのEV値を補正する。CPU23は、当該補正により、実際の透過像の明るさを示すEV値を推定する。レンズ2の透過率による補正後のEV値を、視覚EV値ともいう。 Furthermore, although the video see-through image data 923 is an image captured without using the lens 2, the brightness of the scene actually seen by the user's eye 8 is the brightness transmitted through the lens 2. For this reason, the CPU 23 corrects the EV value of each block based on the transmittance of the lens 2 at the time of processing, using the following equation (3). Through this correction, the CPU 23 estimates the EV value indicating the brightness of the actual transmitted image. The EV value after correction based on the transmittance of the lens 2 is also called the visual EV value.
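Equations (2) and (3) are not reproduced in this text. The sketch below shows one plausible form of each step: the block EV is taken as the scene EV offset by the base-2 logarithm of the luminance ratio, and the visual EV is obtained by adding the base-2 logarithm of the current lens transmittance, since the light reaching the eye is attenuated by that factor.

```python
import math

def block_ev(block_avg, ref_avg, ref_ev):
    """Estimate the EV of one block from its average luminance.

    Equation (2) is not reproduced in this text; here the block EV is taken as
    the reference (scene) EV offset by log2 of the luminance ratio, which is
    one plausible form of such an estimate.
    """
    return ref_ev + math.log2(max(block_avg, 1.0) / max(ref_avg, 1.0))

def visual_ev(ev, transmittance):
    """Correct a block EV for the current lens transmittance (cf. equation (3),
    which is likewise not reproduced here). The light reaching the eye is
    attenuated by the transmittance, so the perceived EV drops accordingly."""
    return ev + math.log2(max(transmittance, 1e-3))
```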

 例えば、CPU23は、明るさ(視覚EV値)に応じて、ユーザにとって視覚的に認識可能な領域を「明所」、認識が困難な領域を「暗所」として特定して領域分割する。また、CPU23は、暗所よりも明るいが、やや認識が困難な領域を「薄明り」として特定して分割してもよい。 For example, the CPU 23 may divide the area into "bright areas" that are visually recognizable to the user and "dark areas" that are difficult to recognize, depending on the brightness (visual EV value). The CPU 23 may also divide the area into "twilight areas" that are brighter than dark areas but somewhat difficult to recognize.

 There are two patterns in which darkness or brightness makes visual recognition difficult: in the first pattern, the area is simply too dark to recognize; in the second pattern, recognition is difficult because the luminance difference is large. In the first pattern, the CPU 23 can identify the relevant areas from the absolute EV value. Regarding the second pattern, the dynamic range of human vision is said to be around 80 to 120 dB, and when the luminance difference exceeds this range, dark areas become difficult to see. To cover both patterns, the CPU 23 therefore judges areas whose absolute EV value is at or below a certain threshold, or whose difference from the maximum EV value is at or below a certain value, to be difficult to recognize.

 Specifically, in the case of the first pattern, for example, the CPU 23 treats an EV value in the range of "0" to "1" as "twilight"; because visual recognition is fairly difficult in this range, the CPU 23 gradually begins visual assistance using the video see-through image data. If the EV value falls below "0," the CPU 23 determines that the area is a "dark place" and provides visual assistance using the video see-through image data.

 図22は、第1の実施形態に係るビデオシースルー表示画像の明るさと、シーンにおけるEV値との関係の一例を示す図である。図22の縦軸はビデオシースルー表示画像の明るさ(輝度)、横軸はEV値である。「視覚補助を徐々に開始する」とは、例えば、図22に示すグラフのように、EV値が小さくなるほど、透過ディスプレイ3に表示されるビデオシースルー表示画像の明るさを、徐々に明るくすることをいう。 FIG. 22 is a diagram showing an example of the relationship between the brightness of the video see-through display image and the EV value in a scene according to the first embodiment. The vertical axis of FIG. 22 is the brightness (luminance) of the video see-through display image, and the horizontal axis is the EV value. "Gradually starting visual assistance" means, for example, as shown in the graph in FIG. 22, that the brightness of the video see-through display image displayed on the transmissive display 3 gradually increases as the EV value decreases.

 また、例えば2つ目のパターンの場合、CPU23は、ブロックのEV値と、ビデオシースルー画像データ923全体におけるEV値の最大値との差分に基づいてビデオシースルー表示画像の明るさを決定する。例えば、CPU23は、“EV値-EV値の最大値”が“-4”を下回ったら「薄明り」と判定する。「薄明り」の場合、視覚的な認識がやや困難になるため、CPU23は、ビデオシースルー画像データによる視覚補助を徐々に開始する。また、CPU23は、“EV値-EV値の最大値”が“-7”を下回ったら「暗所」と判定する。CPU23は、「暗所」と判定した場合、ビデオシースルー画像データによる視覚補助を実施する。 Furthermore, in the case of the second pattern, for example, the CPU 23 determines the brightness of the video see-through display image based on the difference between the EV value of the block and the maximum EV value for the entire video see-through image data 923. For example, the CPU 23 determines that it is "twilight" if "EV value - maximum EV value" falls below "-4." In the case of "twilight," visual recognition becomes somewhat difficult, so the CPU 23 gradually begins providing visual assistance using the video see-through image data. Furthermore, the CPU 23 determines that it is a "dark place" if "EV value - maximum EV value" falls below "-7." If the CPU 23 determines that it is a "dark place," it provides visual assistance using the video see-through image data.

 FIG. 23 is a diagram showing an example of the relationship between the brightness of the video see-through display image and the EV-value luminance difference in the scene according to the first embodiment. The vertical axis of FIG. 23 is the brightness (luminance) of the video see-through display image, and the horizontal axis is the luminance difference, that is, the difference between the EV value of a block and the maximum EV value in the entire video see-through image data 923. For example, as shown in the graph of FIG. 23, the CPU 23 gradually increases the brightness of the video see-through display image displayed on the transmissive display 3 as this luminance difference becomes larger.

 By converting the brightness of the video see-through image data 923 based on the two graphs shown in FIGS. 22 and 23, the CPU 23 can determine the display brightness for each of the areas classified as "bright place," "dark place," or "twilight" on the basis of the EV value.
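A sketch of combining the two curves is given below. The exact curve shapes of FIGS. 22 and 23 and the way they are merged are not specified, so linear ramps over the stated ranges and a simple maximum are used as illustrative choices.

```python
def ramp(x, start, end):
    """Linear ramp from 0 at 'start' to 1 at 'end' (clamped), for decreasing x."""
    if start == end:
        return 1.0 if x <= end else 0.0
    return min(max((start - x) / (start - end), 0.0), 1.0)

def display_brightness(block_visual_ev, max_visual_ev):
    """Display brightness weight (0..1) for one block, combining the curves of
    FIGS. 22 and 23.

    Linear ramps over the stated ranges (EV 1 down to 0 for the absolute
    pattern, a difference of -4 down to -7 for the relative pattern) and a
    simple maximum are illustrative choices; the actual curves are not given.
    """
    absolute = ramp(block_visual_ev, start=1.0, end=0.0)
    relative = ramp(block_visual_ev - max_visual_ev, start=-4.0, end=-7.0)
    return max(absolute, relative)
```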

 As a third pattern, the video see-through image data 923 may contain locally bright areas, such as blown-out highlights, in addition to the overall brightness of the scene. This is because, when an ordinary image sensor is used in the video see-through camera 41, its narrow dynamic range causes blown-out highlights (saturated pixel values) in areas of relatively high luminance. An area of relatively high luminance is, for example, an area lit by strong backlight. If the video see-through display image displayed on the transmissive display 3 includes such blown-out areas, it actually impairs vision, which is undesirable. The CPU 23 therefore also classifies such relatively high-luminance areas as bright places. Portions of the video see-through display image that the CPU 23 classifies as bright places are not displayed on the transmissive display 3.

 FIG. 24 is a diagram showing an example of the relationship between the brightness of the video see-through display image and the average luminance of each block according to the first embodiment. The CPU 23 applies a conversion such as the graph shown in FIG. 24 to the per-block average luminance output from the STAT 21.
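
A possible form of the FIG. 24 conversion is sketched below; the knee and saturation thresholds are illustrative assumptions, not values taken from the document.

```python
import numpy as np

def highlight_mask(block_mean_y: np.ndarray,
                   knee: float = 200.0, sat: float = 240.0) -> np.ndarray:
    """Per-block weight that fades the video see-through image out of
    near-saturated (blown-out) blocks (FIG. 24 sketch).

    knee/sat are assumed 8-bit luminance thresholds: weight is 1 below the
    knee and falls to 0 at saturation."""
    w = (sat - block_mean_y) / (sat - knee)
    return np.clip(w, 0.0, 1.0)
```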

 In response to the results of this bright/dark area extraction by the CPU 23, the Display Controller 19 adjusts the brightness of each area of the video see-through display image. For example, the Display Controller 19 displays the video see-through display image on the transmissive display 3 only in areas determined to be display targets under all three of the patterns described above in which darkness or brightness makes visual recognition difficult. For example, the Display Controller 19 adjusts the brightness of the video see-through image data 923 by multiplying each pixel value of the video see-through image data 923 by the brightness value determined for the image data.

 FIG. 25 is a diagram showing an example of the brightness adjustment process for the video see-through image data 923 according to the first embodiment. As shown in FIG. 25, the CPU 23 determines the "bright place" and "dark place" areas of the video see-through image data 924 based on the brightness (EV value) of the transmitted image estimated by the CPU 23 and the per-block average luminance of the video see-through image data 924 output from the STAT 21. The CPU 23 also identifies areas of relatively high luminance, such as blown-out highlights, in the video see-through image data 924. The Display Controller 19 adjusts the brightness of the video see-through image data 924 based on the processing results of the CPU 23. For example, the Display Controller 19 identifies the display-target areas of the video see-through image data 924 by multiplying the determination result for the "bright place" and "dark place" areas by the identification result for the relatively high-luminance areas such as blown-out highlights. The Warp 18 sets "0" for areas of the video see-through image data 924 that are determined not to be display targets (for example, areas classified as "bright place"). Therefore, in the brightness-adjusted video see-through image data 924a, areas classified as "bright place" by any of the three pattern determinations are set to "0."
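
The multiplication-based combination described above can be sketched as follows, assuming the per-block weights have already been upsampled to per-pixel maps (the function and parameter names are illustrative assumptions):

```python
import numpy as np

def adjust_brightness(vst_img: np.ndarray,      # HxWx3 video see-through frame
                      w_ev: np.ndarray,         # HxW weight from the FIG. 22 curve
                      w_diff: np.ndarray,       # HxW weight from the FIG. 23 curve
                      w_highlight: np.ndarray   # HxW weight from the FIG. 24 curve
                      ) -> np.ndarray:
    """Multiply the three weights; pixels weighted 0 under any pattern
    (e.g. "bright place") end up 0 and are not shown on the display."""
    weight = w_ev * w_diff * w_highlight              # values in [0, 1]
    out = vst_img.astype(np.float32) * weight[..., None]
    return out.astype(vst_img.dtype)
```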

 The Display Controller 19 begins providing visual assistance using the video see-through image data not only for "dark place" areas but also for "twilight" areas. Unlike "dark place" areas, in "twilight" areas the user can still faintly see the transmitted image. Therefore, to suppress double images, the Warp 18 deforms the video see-through image data using the first positional relationship data at least for the "twilight" areas. This deformation makes the position and size of the subject in the transmitted image match those of the subject in the video see-through display image on the transmissive display 3. In a "dark place," the transmitted image is not visible and no double image occurs, so deformation of the video see-through image data can be omitted; in that case, the amount of computation required for the deformation can be reduced. However, the user may then experience discomfort at the boundary between the "dark place" area and the area outside it, and at the boundary between the inside and outside of the transmissive display 3. For this reason, the Warp 18 preferably deforms both the "dark place" and "twilight" areas of the video see-through image data that are displayed on the transmissive display 3 as the video see-through display image.

 FIG. 26 is a diagram showing an example of the video see-through image data 925 and 925a before and after correction according to the first embodiment. As shown in FIG. 26, the Warp 18 deforms the "twilight" area of the video see-through image data 925. The example shows the "dark place" area, where no double image with the transmitted image occurs, left undeformed, but both the "dark place" and "twilight" areas may be deformed. As also shown in FIG. 26, the Warp 18 sets the value "0" for the "bright place" areas that are excluded from correction.
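
A sketch of this selective deformation, modeling the first positional relationship data as a 3x3 homography applied only to the masked assistance regions; the homography model and the OpenCV-based implementation are assumptions, not the document's stated method.

```python
import cv2
import numpy as np

def warp_assist_regions(vst_img: np.ndarray,
                        H: np.ndarray,            # 3x3 homography modeling the positional relationship
                        region_mask: np.ndarray   # uint8 mask of "twilight" (and "dark place") pixels
                        ) -> np.ndarray:
    """Warp only the regions shown as assistance so that the subject lines up
    with the transmitted image; excluded ("bright place") pixels stay 0."""
    h, w = vst_img.shape[:2]
    masked = cv2.bitwise_and(vst_img, vst_img, mask=region_mask)
    return cv2.warpPerspective(masked, H, (w, h))
```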

 In this way, the SoC 100a of this embodiment extracts and deforms at least a partial area of the video see-through image data based on the first positional relationship data. The SoC 100a also generates display data so that the outline of the subject included in the transmitted image and the outline of the subject included in the partial area of the video see-through image data are continuous on the transmissive display 3, and displays the generated display data on the transmissive display 3. Therefore, the SoC 100a of this embodiment can improve the visibility of the boundary between the transmitted image and the video see-through display image when the video see-through display image is displayed on the transmissive display 3.

 The SoC 100a of this embodiment also calculates feature points of the transmitted image based on the video see-through image data, generates the first positional relationship data based on the feature points of the transmitted image and the feature points of the video see-through image data, and outputs the generated first positional relationship data to the memory. Because the first positional relationship data generated and stored in advance is used while the user is wearing the head-mounted display 1a, the processing speed of the correction can be improved.

 The SoC 100a of this embodiment also extracts first feature points of the transmitted image from the first calibration image data 901 captured by the calibration camera 42 from the front surface 301 side of the transmissive display 3. The SoC 100a further extracts second feature points of the subject from the second calibration image data 902, obtained by capturing, with the calibration camera 42, the transmissive display 3 on which display data based on the video see-through image data captured by the video see-through camera 41 is displayed. The SoC 100a then generates the first positional relationship data based on the positional relationship between the first feature points and the second feature points. Therefore, the SoC 100a of this embodiment can correct the misalignment between the video see-through image data and the transmitted image while taking actual lens distortion, display-system distortion, and the like into account.
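
One way to realize this calibration is sketched below, assuming grayscale calibration images and modeling the first positional relationship data as a RANSAC-fitted homography between matched ORB feature points; the feature detector and the homography model are assumptions, not requirements stated in the document.

```python
import cv2
import numpy as np

def estimate_first_positional_relationship(calib_img1: np.ndarray,
                                           calib_img2: np.ndarray) -> np.ndarray:
    """Match feature points between the first calibration image (the scene seen
    through the display) and the second calibration image (the displayed
    video see-through frame), both grayscale uint8, and fit a homography."""
    orb = cv2.ORB_create(2000)
    k1, d1 = orb.detectAndCompute(calib_img1, None)
    k2, d2 = orb.detectAndCompute(calib_img2, None)
    matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(d1, d2)
    matches = sorted(matches, key=lambda m: m.distance)[:200]
    src = np.float32([k2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([k1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    return H  # maps the displayed frame onto the transmitted-image view
```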

 As described above, the head-mounted display 1a of this embodiment provides visual assistance in dark places and visual assistance that suppresses the loss of visibility caused by glare in bright places. For example, when the user wears the head-mounted display 1a while driving a vehicle, dark areas such as shadows behind objects are assisted by displaying the video see-through display image, and bright areas such as backlight are assisted by light blocking with the liquid crystal shutter. The head-mounted display 1a of this embodiment can also be used as a visual aid when working in dark places such as at night or underground.

(Modification 1 of the First Embodiment)
 In the first embodiment described above, the SoC 100a reduces the occurrence of double images by deforming the video see-through image data to match the transmitted image in, for example, the "dark place" and "twilight" areas on the transmissive display 3.

 Another way to make double images less noticeable is to increase the light-blocking level of the liquid crystal shutter to reduce the influence of external light and then display the video see-through image data. With this method, the transmitted image becomes invisible due to the light blocking, so double images can be suppressed.

 FIG. 27 is a diagram showing an example of the light-blocking target area 201 according to Modification 1 of the first embodiment. In general, the transmissive display 3 has a narrower angle of view than the lens 2 and can therefore often cover only part of the user's field of view. For this reason, the Display Controller 19 of this modification does not block light over the user's entire field of view (the entire lens 2) but sets only the range occupied by the transmissive display 3 as the light-blocking target area 201. By limiting the light-blocking target area 201 to this range, the Display Controller 19 can suppress double images without obstructing the field of view outside the range of the transmissive display 3.
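
A minimal sketch of this region-limited dimming, assuming the shutter accepts a per-pixel transmittance map, an axis-aligned display rectangle, and a dimmed transmittance of 0.2 (all three are assumptions):

```python
import numpy as np

def shutter_transmittance_map(lens_shape, display_rect, dimmed=0.2):
    """Transmittance map for the liquid crystal shutter: dim only the
    rectangle covered by the transmissive display, leave the rest clear."""
    t = np.ones(lens_shape, dtype=np.float32)   # 1.0 = fully transparent
    x, y, w, h = display_rect                   # (x, y, width, height) in shutter pixels
    t[y:y + h, x:x + w] = dimmed
    return t
```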

(Modification 2 of the First Embodiment)
 In Modification 1 of the first embodiment described above, the SoC 100a sets only the range of the lens 2 occupied by the transmissive display 3 as the light-blocking target area 201. With such a light-blocking range, if there is an area with a high relative or absolute EV value, such as strong backlight, glare cannot be reduced outside the range of the transmissive display 3.

 FIG. 28 is a diagram showing an example of a case in which a very bright area exists, according to Modification 2 of the first embodiment. In such a case, even if the SoC 100a uses the liquid crystal shutter to block light only over the range occupied by the transmissive display 3, the user still feels glare. In the example shown in FIG. 28, a dark place also exists within the field of view of the lens 2. Even in a dark place, areas outside the range of the transmissive display 3 cannot receive visual assistance from the video see-through display image. Therefore, if the SoC 100a blocks light over the entire surface of the lens 2 with the liquid crystal shutter, the areas outside the range of the transmissive display 3 remain dark, resulting in poor visibility.

 In such a case, the SoC 100a of this modification controls the liquid crystal shutter so as to block light only in areas that are dazzling due to strong backlight or the like.

 FIG. 29 is a diagram showing an example of the light-blocking target area 202 according to Modification 2 of the first embodiment. As shown in FIG. 29, the light-blocking target area 202 also includes areas outside the range of the transmissive display 3. In this modification, the CPU 23 of the SoC 100a, for example, calculates the transmittance for each area of the lens 2 from the average luminance of the rectangular blocks and controls the transmittance of the liquid crystal shutter for each area. The liquid crystal shutter of the lens 2 in this modification has a liquid crystal panel whose transmittance can be controlled partially.
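
A sketch of the per-block transmittance control, assuming a linear mapping from block average luminance to transmittance; the thresholds and minimum transmittance are illustrative values, not taken from the document.

```python
import numpy as np

def per_block_transmittance(block_mean_y: np.ndarray,
                            y_lo: float = 128.0, y_hi: float = 240.0,
                            t_min: float = 0.1) -> np.ndarray:
    """Per-block shutter transmittance from per-block average luminance:
    blocks at or below y_lo stay clear (1.0), blocks at y_hi are dimmed to t_min."""
    ratio = np.clip((block_mean_y - y_lo) / (y_hi - y_lo), 0.0, 1.0)
    return 1.0 - ratio * (1.0 - t_min)
```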

 By controlling the light-blocking range in this way, the head-mounted display 1a of this modification can reduce the loss of visibility when a bright place with relatively or absolutely high luminance exists outside the range of the transmissive display 3 within the field of view of the lens 2. The head-mounted display 1a of this modification can also reduce the loss of visibility when bright places and dark places coexist within the field of view of the lens 2.

(Second Embodiment)
 In the first embodiment described above, the SoC 100a corrects the misalignment between the video see-through display image and the transmitted image caused by the difference between the position of the eye 8 of the user wearing the head-mounted display 1a and the mounting position of the video see-through camera 41. In the calibration process of the first embodiment, the SoC 100a determines the first positional relationship data on the assumption that the user's eye 8 is located near the center of the lens 2. However, the position of the eye 8 differs between individuals, and even for the same person, the relative position between the eye 8 and the video see-through camera 41 is expected to change constantly depending on how the display is worn. For this reason, the second embodiment further corrects the misalignment that occurs when the display is actually worn.

 FIG. 30 is a diagram showing an example of the overall configuration of the head-mounted display 1b according to the second embodiment. The head-mounted display 1b of this embodiment includes the eyeglass body 10, the lenses 2a and 2b, the transmissive displays 3a and 3b, the video see-through cameras 41a and 41b, the display projectors 5a and 5b, the Ambient Light Sensor 60, the Head Tracking cameras 63a and 63b, and the SoC 100b. The head-mounted display 1b of this embodiment further includes the Eye Tracking cameras 43a and 43b.

 The eyeglass body 10, the lenses 2a and 2b, the transmissive displays 3a and 3b, the video see-through cameras 41a and 41b, the display projectors 5a and 5b, the Ambient Light Sensor 60, and the Head Tracking cameras 63a and 63b have the same functions as in the first embodiment. The head-mounted display 1b also includes the IMU 61, the ToF sensor 62, the Flash memory 31, and the DRAM 32, as in the first embodiment.

 The Eye Tracking cameras 43a and 43b capture images in the direction in which the front surface of the transmissive display 3 faces.

 FIG. 31 is a diagram showing an example of the positional relationship between the Eye Tracking cameras 43a and 43b and the user's eyes 8a and 8b according to the second embodiment. As shown in FIG. 31, the Eye Tracking cameras 43a and 43b capture the eyes 8a and 8b of the user wearing the head-mounted display 1b.

 The number of Eye Tracking cameras is not limited to two. FIG. 32 is a diagram showing another example of the positional relationship between the Eye Tracking cameras 43a to 43d and the user's eyes 8a and 8b according to the second embodiment. As shown in FIG. 32, when the head-mounted display 1b includes four Eye Tracking cameras 43a to 43d, each eye 8 can be captured by two cameras. With this configuration, the distance from the eyes 8a and 8b to the lenses 2a and 2b can also be estimated by the principle of triangulation.
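
The triangulation mentioned here reduces to the standard stereo depth relation; a sketch under the assumption of rectified cameras with a known baseline and focal length:

```python
def eye_distance_from_stereo(disparity_px: float,
                             baseline_mm: float,
                             focal_px: float) -> float:
    """Two eye-tracking cameras with a known baseline see the pupil at slightly
    different image positions; depth follows from Z = f * B / d.
    e.g. f = 800 px, B = 60 mm, d = 1600 px -> Z = 30 mm."""
    return focal_px * baseline_mm / disparity_px
```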

 Hereinafter, when there is no need to distinguish the individual Eye Tracking cameras 43a to 43d, they are simply referred to as the Eye Tracking camera 43. The Eye Tracking camera 43 is an example of the third camera in this embodiment. The image data captured by the Eye Tracking camera 43 is an example of the fourth image data in this embodiment.

 FIG. 33 is a diagram showing an example of the configuration of the SoC 100b according to the second embodiment. As in the first embodiment, the SoC 100b includes the I2C interfaces 11 and 22, the Mono ISP 12, the DSP&AI Accelerator 13, the SRAMs 14a to 14c, the GPU 15, the Color ISP 16, the Time Warp 17, the Warp 18, the Display Controller 19, the STAT 21, the CPU 23, and the DRAM Controller 24.

 In addition to the functions of the first embodiment, the Mono ISP 12 of this embodiment acquires captured image data from the Eye Tracking camera 43 and corrects its brightness and the like.

 In addition to the functions of the first embodiment, the DSP&AI Accelerator 13 of this embodiment executes Eye Tracking processing that detects the position of the pupil of the user's eye 8 based on the image data of the Eye Tracking camera 43 acquired and corrected by the Mono ISP 12. The DSP&AI Accelerator 13 is an example of the pupil detection circuit in this embodiment. Within the DSP&AI Accelerator 13, the DSP 131 in particular generates second positional relationship data based on the position of the user's pupil and the feature points of the video see-through image data, and outputs it to a memory such as the SRAM 14a. The DSP 131 is an example of the image processing circuit in this embodiment.

 In addition to the functions of the first embodiment, the Warp 18 of this embodiment corrects the position of the subject in the video see-through image data based on the second positional relationship data. The second positional relationship data represents the positional relationship between the position of the user's pupil and the video see-through camera 41.

 The CPU 23 executes the calibration process before the head-mounted display 1b is used in the same manner as in the first embodiment and stores the first positional relationship data in the memory.

 Next, the flow of the process for acquiring the correction amount for pupil position misalignment in this embodiment will be described. FIG. 34 is a flowchart showing an example of the flow of the process for acquiring the correction amount for pupil position misalignment according to the second embodiment.

 First, the Mono ISP 12 acquires, from the Eye Tracking camera 43, image data of the pupils of the user (wearer) wearing the head-mounted display 1b (S31). The image data captured by the Eye Tracking camera 43 is referred to as Eye Tracking image data.

 The DSP&AI Accelerator 13 then detects the position of the user's pupil from the Eye Tracking image data (S32). The data indicating the position of the user's pupil is an example of pupil position data.

 The DSP in the DSP&AI Accelerator 13 then acquires the relative position between the user's eye 8 and the video see-through camera 41 based on the detection result of the position of the user's pupil (S33). The DSP 131 outputs second positional relationship data indicating the acquired relative position between the user's eye 8 and the video see-through camera 41 to a memory such as the SRAM 14a. The DSP 131 may also generate the second positional relationship data further based on feature points extracted from the video see-through image data. The DSP 131 may further store pupil position data indicating the position of the user's pupil in a memory such as the SRAM 14a. The processing of the flowchart in FIG. 34 then ends.

 The method of the Eye Tracking processing is not particularly limited, and a known method can be used. For example, data indicating the positional relationship between the video see-through camera 41 and the Eye Tracking camera 43 may be stored in advance in a memory such as the SRAM 14a. In this case, the DSP 131 may convert the relative relationship between the user's eye 8 and the Eye Tracking camera 43 into the relative position between the user's eye 8 and the video see-through camera 41 based on the data indicating this positional relationship. The processing of the flowchart in FIG. 34 is executed repeatedly while the head-mounted display 1b is being used by the user.
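
A sketch of this conversion, assuming the stored camera-to-camera positional relationship is kept as a 4x4 rigid transform (the storage format and names are assumptions):

```python
import numpy as np

def eye_to_vst_camera_offset(eye_pos_in_et: np.ndarray,
                             et_to_vst: np.ndarray) -> np.ndarray:
    """Convert the pupil position measured in the Eye Tracking camera frame
    into the video see-through camera frame using the fixed extrinsic
    transform between the two cameras.

    eye_pos_in_et : (3,) pupil position in the Eye Tracking camera frame
    et_to_vst     : (4, 4) rigid transform, Eye Tracking frame -> VST camera frame
    """
    p = np.append(eye_pos_in_et, 1.0)   # homogeneous coordinates
    return (et_to_vst @ p)[:3]          # relative position used as second positional relationship data
```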

 FIG. 35 is a flowchart showing an example of the flow of the display data generation process during use of the head-mounted display 1b according to the second embodiment. The image data acquisition process in S21 and the bright/dark area extraction process in S23 are the same as the processes of the first embodiment described with reference to FIG. 12.

 In the process of acquiring the positional relationship data of the subject in the image data in S22, the Warp 18 acquires the first positional relationship data and the second positional relationship data from a memory such as the SRAM 14a.

 The Warp 18 then corrects the positional relationship of the subject in the video see-through image data based on the second positional relationship data (S41). The processes from the deformation of the video see-through image data in S24 to the output of the display data and the transmittance data in S27 are the same as in the first embodiment described with reference to FIG. 12. The processing of this flowchart then ends.

 In this way, the SoC 100b of this embodiment detects the position of the user's pupil based on the Eye Tracking image data and generates the second positional relationship data representing the positional relationship between the pupil position and the video see-through camera 41. Therefore, the SoC 100b of this embodiment can reflect the pupil position shift that occurs when the display is actually worn, as well as individual differences between users, in the video see-through display image displayed on the transmissive display 3.

(Modification 1 of the Second Embodiment)
 In the second embodiment described above, the DSP&AI Accelerator 13 detects the position of the user's pupil from the Eye Tracking image data. In this modification, the DSP&AI Accelerator 13 extracts, from the Eye Tracking image data, pupil reflection image data reflected in the user's pupil. The DSP&AI Accelerator 13 is an example of the pupil reflection image extraction circuit in this modification.

 The DSP 131 of this modification generates second positional relationship data based on the pupil reflection image data and the feature points of the video see-through image data, and outputs it to a memory such as the SRAM 14a. The DSP 131 is an example of the image processing circuit in this modification.

 The SoC 100b of this modification extracts the pupil reflection image data reflected in the user's pupil from the Eye Tracking image data captured by the Eye Tracking camera 43 and generates the second positional relationship data based on the pupil reflection image data. Therefore, the SoC 100b of this modification can correct the video see-through image data while taking into account the image that the user is actually seeing.

(Other Modifications)
 In each of the embodiments described above, the SoCs 100a and 100b are mounted on the head-mounted displays 1a and 1b, but the SoCs 100a and 100b may be applied to other types of display devices. For example, the SoCs 100a and 100b may be used in a head-up display provided on the windshield of a vehicle.

 In each of the embodiments described above, the head-mounted displays 1a and 1b may be goggle-type devices having the two lenses 2a and 2b, or goggle-type devices having one large lens 2 for both eyes. The transmissive display 3 may also be provided over the entire surface of the lens 2 rather than over only part of it.

 The various processes executed by the SoCs 100a and 100b in the embodiments and modifications described above may be stored, for example, as a program on a non-volatile storage medium. The various processes described above may then be executed by, for example, the CPU 23 of the SoC 100a or 100b reading the program.

 Although the embodiments and modifications have been described above, the image processing device, image processing method, and image processing program disclosed in the present application are not limited to the embodiments described above as they are, and the components can be modified and embodied at the implementation stage without departing from the gist of the invention. Various inventions can also be formed by appropriately combining the plurality of components disclosed in the embodiments described above. For example, some components may be deleted from all the components shown in the embodiments.

1a, 1b Head-mounted display
2, 2a, 2b Lens
3, 3a, 3b Transmissive display
5, 5a, 5b Display projector
8, 8a, 8b Eye
9a, 9b Subject
10 Eyeglass body
11, 22 I2C interface
12 Mono ISP
13 DSP&AI Accelerator
14a to 14c SRAM
15 GPU
16 Color ISP
17 Time Warp
18 Warp
19 Display Controller
21 STAT
23 CPU
24 DRAM Controller
31 Flash memory
32 DRAM
41, 41a, 41b Video see-through camera
42, 42a, 42b Calibration camera
43, 43a, 43b Eye Tracking camera
60 Ambient Light Sensor
61 IMU
62 ToF sensor
63, 63a, 63b Head Tracking camera
71 Dark place
72 Bright place
90 Feature point
100a, 100b SoC
131 DSP
132 AI Accelerator
191 EN block
192 Blend block
201, 202 Light-blocking target area
301 Front surface
302 Back surface
310 Area
320 Area
901 First calibration image data
902 Second calibration image data
911 Pseudo-transmitted image data
921 to 925, 921a, 922a, 924a, 925a Video see-through image data
931 to 934, 931a Video see-through display image
940, 940a Pseudo-see-through display image data
941 Transmitted image

Claims (12)

1. A semiconductor device comprising:
 a memory that stores first positional relationship data indicating a positional relationship between a subject included in a transmission image transmitted through a transmissive display and the subject included in first image data captured by a first camera that captures an image in a direction in which a back surface of the transmissive display faces;
 a display data generation circuit that extracts and deforms at least a partial area of the first image data based on the first positional relationship data, and generates display data so that a contour of the subject included in the transmission image and a contour of the subject included in the area of the first image data are continuous on the transmissive display; and
 a display control circuit that displays the display data on the transmissive display.

2. The semiconductor device according to claim 1, further comprising an image processing circuit that calculates feature points of the transmission image based on the first image data, generates the first positional relationship data based on the feature points of the transmission image and the feature points of the first image data, and outputs the first positional relationship data to the memory.

3. The semiconductor device according to claim 2, wherein the image processing circuit
 extracts first feature points of the transmission image from second image data obtained by photographing, with a second camera from a front side of the transmissive display, the subject located on a back side of the transmissive display through the transmissive display,
 extracts second feature points of the subject from third image data obtained by capturing, with the second camera, the transmissive display on which the display data based on the first image data captured by the first camera is displayed, and
 generates the first positional relationship data based on a positional relationship between the first feature points and the second feature points.
4. The semiconductor device according to claim 1, further comprising an image processing circuit that extracts a bright area of the first image data and calculates a bright area of the transmission image based on the bright area of the first image data and the first positional relationship data,
 wherein the display data generation circuit generates the display data by deforming a portion of the extracted part of the first image data that is in contact with the bright area of the transmission image.

5. The semiconductor device according to claim 1, further comprising an image processing circuit that extracts a dark area of the first image data and calculates a dark area of the transmission image based on the dark area of the first image data and the first positional relationship data,
 wherein the display control circuit extracts a portion of the first image data corresponding to the dark area of the transmission image and adjusts its brightness.

6. The semiconductor device according to claim 1, further comprising an image processing circuit that extracts a bright area of the first image data and calculates a bright area of the transmission image based on the bright area of the first image data and the first positional relationship data,
 wherein the display control circuit controls a transmittance of a region of the transmissive display corresponding to the bright area of the transmission image to a second transmittance that is smaller than a first transmittance at the time when the image processing circuit calculated the bright area of the transmission image.
7. The semiconductor device according to claim 1, further comprising:
 a pupil detection circuit that detects a position of a pupil of an observer of the transmissive display based on fourth image data captured by a third camera that captures an image in a direction in which a front surface of the transmissive display faces; and
 an image processing circuit that generates second positional relationship data representing a positional relationship between the position of the pupil and the first camera based on the position of the pupil and the feature points of the first image data, and outputs the second positional relationship data to the memory.

8. The semiconductor device according to claim 1, further comprising:
 a pupil reflection image extraction circuit that extracts, from fourth image data captured by a third camera that captures an image in a direction in which a front surface of the transmissive display faces, pupil reflection image data reflected in a pupil of an observer of the transmissive display; and
 an image processing circuit that generates second positional relationship data representing a positional relationship between the position of the pupil and the first camera based on the pupil reflection image data and the feature points of the first image data, and outputs the second positional relationship data to the memory.

9. The semiconductor device according to claim 1, wherein the memory further stores pupil position data indicating a position of a pupil of an observer of the transmissive display, and
 the semiconductor device further comprises an image processing circuit that generates second positional relationship data representing a positional relationship between the position of the pupil and the first camera based on the pupil position data and the feature points of the first image data, and outputs the second positional relationship data to the memory.

10. The semiconductor device according to any one of claims 7 to 9, wherein the display data generation circuit further corrects the position of the subject in the first image data based on the second positional relationship data.
11. A method comprising:
 a first positional relationship data storage step of storing first positional relationship data indicating a positional relationship between a subject included in a transmission image transmitted through a transmissive display and the subject included in first image data captured by a first camera that captures an image in a direction in which a back surface of the transmissive display faces;
 a display data generation step of extracting and deforming at least a partial area of the first image data based on the first positional relationship data, and generating display data so that a contour of the subject included in the transmission image and a contour of the subject included in the area of the first image data are continuous on the transmissive display; and
 a display control step of displaying the display data on the transmissive display.

12. A head-mounted display comprising:
 at least one transmissive display;
 a display projector that displays display data on the transmissive display;
 a memory that stores first positional relationship data indicating a positional relationship between a subject included in a transmission image transmitted through the transmissive display and the subject included in first image data captured by a first camera that captures an image in a direction in which a back surface of the transmissive display faces;
 a display data generation circuit that extracts and deforms at least a partial area of the first image data based on the first positional relationship data, and generates the display data so that a contour of the subject included in the transmission image and a contour of the subject included in the area of the first image data are continuous on the transmissive display; and
 a display control circuit that displays the display data on the transmissive display.
PCT/JP2024/018748 2024-05-21 2024-05-21 Semiconductor device, method, and head-mounted display Pending WO2025243410A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/JP2024/018748 WO2025243410A1 (en) 2024-05-21 2024-05-21 Semiconductor device, method, and head-mounted display

Publications (1)

Publication Number Publication Date
WO2025243410A1 true WO2025243410A1 (en) 2025-11-27

Family

ID=97794899

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2024/018748 Pending WO2025243410A1 (en) 2024-05-21 2024-05-21 Semiconductor device, method, and head-mounted display

Country Status (1)

Country Link
WO (1) WO2025243410A1 (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2012172719A1 (en) * 2011-06-16 2012-12-20 パナソニック株式会社 Head-mounted display and misalignment correction method thereof
JP2016122177A (en) * 2014-12-25 2016-07-07 セイコーエプソン株式会社 Display device and control method of display device
JP7246708B2 (en) * 2019-04-18 2023-03-28 ViXion株式会社 head mounted display
CN110782499A (en) * 2019-10-23 2020-02-11 Oppo广东移动通信有限公司 A calibration method, calibration device and terminal device for augmented reality equipment
