

Semiconductor device, method, and head-mounted display

Info

Publication number
WO2025243410A1
Authority
WO
WIPO (PCT)
Prior art keywords
display
image
data
image data
positional relationship
Prior art date
Legal status
Pending
Application number
PCT/JP2024/018748
Other languages
French (fr)
Japanese (ja)
Inventor
有司 梅津
竜志 大塚
功太郎 江崎
Current Assignee
Socionext Inc
Original Assignee
Socionext Inc
Application filed by Socionext Inc
Priority to PCT/JP2024/018748
Publication of WO2025243410A1

Classifications

    • GPHYSICS
    • G09EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09GARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
    • G09G5/00Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators
    • G09G5/36Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators characterised by the display of a graphic pattern, e.g. using an all-points-addressable [APA] memory
    • G09G5/38Control arrangements or circuits for visual indicators common to cathode-ray tube indicators and other visual indicators characterised by the display of a graphic pattern, e.g. using an all-points-addressable [APA] memory with means for controlling the display position
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00Details of television systems
    • H04N5/64Constructional details of receivers, e.g. cabinets or dust covers

Definitions

  • the present invention relates to a semiconductor device, a method, and a head-mounted display.
  • the user can simultaneously view both the transmitted image through the transmissive display and the captured image displayed on the transmissive display.
  • Patent No. 7246708; Japanese Patent Application Laid-Open No. 2008-96867; Patent No. 5855206
  • the present invention aims to improve the visibility of the boundary between a transmitted image and a captured image when the captured image is displayed on a transmissive display.
  • the semiconductor device of the embodiment includes a memory, a display data generation circuit, and a display control circuit.
  • the memory stores first positional relationship data indicating the positional relationship between a subject included in a transmitted image transmitted through the transmissive display and a subject included in first image data captured by a first camera that captures the direction in which the back of the transmissive display faces.
  • the display data generation circuit extracts and deforms at least a partial area of the first image data based on the first positional relationship data, and generates display data such that the outline of the subject included in the transmitted image and the outline of the subject included in the area of the first image data are continuous on the transmissive display.
  • the display control circuit displays the display data on the transmissive display.
  • FIG. 1 is a diagram illustrating an example of the overall configuration of a head-mounted display according to the first embodiment.
  • FIG. 2 is a diagram illustrating an example of the configuration of an SoC according to the first embodiment.
  • FIG. 3 is a diagram showing an example of the difference between a video see-through display image and a transmission image according to the first embodiment.
  • FIG. 4 is a diagram showing an example of the installation position of the calibration camera according to the first embodiment.
  • FIG. 5 is a diagram illustrating an example of a subject for the calibration process according to the first embodiment.
  • FIG. 6 is a diagram showing an example of a subject 9b photographed from the side by the calibration camera according to the first embodiment.
  • FIG. 7 is a flowchart showing an example of the flow of the calibration process according to the first embodiment.
  • FIG. 8 is a diagram showing an example of the processing content of each step in the flowchart of FIG. 7.
  • FIG. 9 is a diagram showing an example of distortion contained in each image data in the calibration process according to the first embodiment.
  • FIG. 10 is a diagram illustrating an example of distortion contained in each image data when the head-mounted display according to the first embodiment is used.
  • FIG. 11 is a diagram showing an example of the breakdown of the deformation process B in FIG. 10.
  • FIG. 12 is a flowchart showing an example of the flow of a process for generating display data when the head-mounted display according to the first embodiment is used.
  • FIG. 13 is a diagram showing an example of correction of video see-through image data according to the first embodiment.
  • FIG. 14 is a diagram illustrating the principle of visual assistance using the transmissive display and liquid crystal shutter according to the first embodiment.
  • FIG. 15 is a diagram illustrating an example of a display mode of the transmissive display according to the first embodiment.
  • FIG. 16 is a diagram showing an example of the flow of the bright place/dark place region extraction process according to the first embodiment.
  • FIG. 17 is a diagram illustrating an example of the flow of the EV value acquisition process according to the first embodiment.
  • FIG. 18 is a diagram showing an example of a threshold value determined based on a histogram according to the first embodiment.
  • FIG. 19 is a diagram showing an example of a general EV/Lux conversion table.
  • FIG. 20 is a diagram showing an example of the relationship between the EV value and the transmittance of the lens according to the first embodiment.
  • FIG. 21 is a diagram showing an example of a process for estimating EV values for each rectangular block of video see-through image data according to the first embodiment.
  • FIG. 22 is a diagram showing an example of the relationship between the brightness of a video see-through display image and the EV value in a scene according to the first embodiment.
  • FIG. 23 is a diagram showing an example of the relationship between the brightness of the video see-through display image according to the first embodiment and the luminance difference of the EV value in the scene.
  • FIG. 24 is a diagram showing an example of the relationship between the brightness of the video see-through display image and the average value of luminance for each block according to the first embodiment.
  • FIG. 25 is a diagram showing an example of the brightness adjustment process for video see-through image data according to the first embodiment.
  • FIG. 26 is a diagram showing an example of video see-through image data before and after correction according to the first embodiment.
  • FIG. 27 is a diagram illustrating an example of a light-blocking target region according to the first modification of the first embodiment.
  • FIG. 28 is a diagram illustrating an example of a case where a bright area exists according to the second modification of the first embodiment.
  • FIG. 29 is a diagram illustrating an example of a light-blocking target region according to the second modification of the first embodiment.
  • FIG. 30 is a diagram illustrating an example of the overall configuration of a head-mounted display according to the second embodiment.
  • Figure 31 is a diagram showing an example of the positional relationship between the eye tracking camera and the user's eyes in the second embodiment.
  • Figure 32 is a diagram showing another example of the positional relationship between the eye tracking camera and the user's eyes in the second embodiment.
  • FIG. 33 is a diagram illustrating an example of the configuration of an SoC according to the second embodiment.
  • FIG. 34 is a flowchart showing an example of the flow of processing for acquiring the correction amount of pupil position misalignment according to the second embodiment.
  • FIG. 35 is a flowchart showing an example of the flow of a process for generating display data when using the head-mounted display according to the second embodiment.
  • (First embodiment) FIG. 1 is a diagram showing an example of the overall configuration of a head-mounted display 1a according to the first embodiment.
  • the head-mounted display 1a is equipped with high-sensitivity cameras (video see-through cameras 41a and 41b) that can capture images even in dark places, and supports the user's vision by displaying camera images in dark places.
  • the head-mounted display 1a includes, for example, an eyeglass body 10, lenses 2a and 2b, transparent displays 3a and 3b, video see-through cameras 41a and 41b, display projectors 5a and 5b, an ambient light sensor 60, head tracking cameras 63a and 63b, and a system on a chip (SoC) 100a.
  • the lenses 2a, 2b, transmissive displays 3a, 3b, video see-through cameras 41a, 41b, display projectors 5a, 5b, ambient light sensor 60, head tracking cameras 63a, 63b, and SoC 100a are fixed to the eyeglass body 10.
  • the eyeglass body 10 is shaped so that it can be worn on the user's head, similar to the frame of regular eyeglasses.
  • the eyeglass body 10 includes, for example, a front portion that secures the lenses 2a and 2b, and temple portions that can be hooked over the user's ears.
  • Lens 2a and 2b are transparent eyeglass lenses positioned in front of the eyes of a user wearing head-mounted display 1a. Lenses 2a and 2b also function as liquid crystal shutters. This liquid crystal shutter function allows lenses 2a and 2b to adjust the amount of external light that passes through lenses 2a and 2b. Hereinafter, when there is no need to distinguish between individual lenses 2a and 2b, they will simply be referred to as lenses 2.
  • Transparent displays 3a and 3b are provided on lenses 2a and 2b and are capable of displaying images. Furthermore, transparent displays 3a and 3b transmit external light. Therefore, a user wearing head-mounted display 1a can see both the transmitted image transmitted through transparent displays 3a and 3b and the display image displayed on transparent displays 3a and 3b.
  • the display images displayed on transparent displays 3a and 3b are, for example, images obtained by subjecting captured image data captured by video see-through cameras 41a and 41b (described below) to processing such as deformation. Note that deformation is an example of correction.
  • the transparent displays 3a and 3b are provided in at least a portion of the area of the lenses 2a and 2b.
  • the transparent displays 3a and 3b are provided in a portion of the area located at the center of the lenses 2a and 2b.
  • the transparent displays 3a and 3b are, for example, screens for optical see-through displays such as waveguides.
  • Of the two surfaces of the transparent display 3, the surface facing the user wearing the head-mounted display 1a is referred to as the front surface, and the surface facing away from the user is referred to as the back surface.
  • the user wearing the head-mounted display 1a is an example of an observer of the transparent display 3.
  • the video see-through cameras 41a and 41b are cameras that capture images in the direction in which the back of the transmissive display 3 faces.
  • the direction in which the back of the transmissive display 3 faces corresponds to the front direction for a user wearing the head-mounted display 1a.
  • the video see-through cameras 41a and 41b are provided, for example, in positions above the centers of the lenses 2a and 2b of the eyeglass body 10.
  • the video see-through cameras 41a and 41b are an example of a first camera in this embodiment.
  • the captured image data captured by the video see-through cameras 41a and 41b is an example of first image data in this embodiment.
  • Video see-through cameras 41a and 41b use image sensors capable of capturing images in dark places, such as a SPAD (Single Photon Avalanche Diode) sensor or a high-sensitivity CMOS (Complementary Metal Oxide Semiconductor) sensor.
  • Alternatively, the video see-through cameras 41a and 41b may be IR (Infrared Rays) sensors capable of capturing images in the dark by irradiating the surroundings with IR light.
  • the captured image data captured by the video see-through camera 41 is referred to as video see-through image data.
  • When the video see-through image data is projected onto the transmissive display 3 by the display projector 5, the image displayed on the transmissive display 3 is referred to as a video see-through display image.
  • "when video see-through image data is projected onto the transmissive display 3 by the display projector 5" also includes when video see-through image data after various corrections is projected onto the transmissive display 3.
  • Display projectors 5a and 5b display images on transmissive displays 3a and 3b under the control of SoC 100a (described below).
  • Display projectors 5a and 5b are, for example, micro LEDs (Light Emitting Diodes).
  • Hereinafter, when there is no need to distinguish between the individual display projectors 5a and 5b, they will simply be referred to as display projectors 5.
  • the Ambient Light Sensor 60, also known as an illuminance sensor, acquires ambient light information.
  • the Ambient Light Sensor 60 measures the intensity of ambient light.
  • the head tracking cameras 63a and 63b detect the movement of the head of a user wearing the head-mounted display 1a.
  • Hereinafter, when there is no need to distinguish between the individual head tracking cameras 63a and 63b, they will simply be referred to as head tracking cameras 63.
  • SoC 100a is a computer that controls each component of head-mounted display 1a.
  • SoC 100a is an example of a semiconductor device in this embodiment.
  • the head-mounted display 1a may further include a depth sensor that measures distance, an IMU (Inertial Measurement Unit) that measures the posture of the user wearing the head-mounted display 1a, and the like.
  • depth sensors that measure distance include, but are not limited to, a ToF (Time of Flight) sensor or a stereo camera.
  • FIG. 2 is a diagram showing an example of the configuration of the SoC 100a according to the first embodiment.
  • the SoC 100a includes, for example, I2C (Inter-Integrated Circuit) interfaces 11 and 22, a Mono ISP (Image Signal Processor) 12, a DSP (Digital Signal Processor) & AI (Artificial Intelligence) Accelerator 13, SRAMs (Static Random Access Memory) 14a to 14c, and the other circuits described below.
  • the head mounted display 1a further includes an IMU 61, a ToF sensor 62, a flash memory 31, and a DRAM 32 outside the SoC 100a.
  • the flash memory 31 and the DRAM 32 store various types of data under the control of the SoC 100a.
  • the calibration cameras 42a and 42b shown in FIG. 2 are cameras provided outside the head-mounted display 1a.
  • the calibration camera 42 is an example of the second camera in this embodiment. Details of the calibration camera 42 will be described later with reference to FIG. 4.
  • I2C interfaces 11 and 22 are communication interfaces that perform synchronous serial communication.
  • I2C interface 11 acquires acceleration and angular velocity from IMU 61.
  • I2C interface 22 acquires the intensity of ambient light from Ambient Light Sensor 60.
  • Mono ISP 12 acquires information from various sensors used to recognize the surrounding situation and corrects the acquired information. For example, Mono ISP 12 acquires distance measurement data from surrounding objects from ToF sensor 62. Mono ISP 12 also acquires image data from head tracking cameras 63a and 63b, for example, and corrects brightness, etc.
  • the DSP & AI Accelerator 13 includes a DSP 131 and an AI Accelerator 132.
  • the DSP & AI Accelerator 13 performs various tracking processes based on various data acquired and corrected by the Mono ISP 12. For example, the DSP & AI Accelerator 13 may perform head tracking processing to detect head movement of a user wearing the head-mounted display 1a. Alternatively, the DSP & AI Accelerator 13 may perform hand tracking processing or eye tracking processing, depending on the type of sensor.
  • the DSP & AI Accelerator 13 outputs tracking information generated as a result of various tracking processes to the Time Warp 17. Note that various tracking processes are not required in this embodiment.
  • SRAMs 14a to 14c store various data used in various processes and data generated by various processes.
  • SRAM 14b is used as a temporary storage location for rendered image data and captured image data. It may also store first positional relationship data used in the image data transformation process performed by Warp 18, described below. SRAM capacity is usually small and may not be able to store the entire image. In such cases, SRAM may be used as a FIFO (First in First out) buffer to store only a portion of the image that has not yet been used in subsequent processing.
  • SRAMs 14a to 14c are an example of memory in this embodiment. Note that each process performed by SoC 100a uses SRAMs 14a to 14c as a buffer for temporary data storage.
  • Flash memory 31 is an external storage device, and may be NAND memory, an SSD, an SD card, or the like.
  • the GPU 15 performs rendering processing of the image data displayed on the transmissive display 3.
  • Time Warp 17 is a processing circuit that corrects delays in image data caused by processing by GPU 15.
  • the time correction performed by Time Warp 17 has the effect of reducing the discomfort, known as "VR (Virtual Reality) sickness," experienced by users who view image data displayed on the transmissive display 3.
  • Time Warp 17 performs correction processing using, for example, the results of head tracking processing performed by DSP & AI Accelerator 13.
  • Color ISP 16 and Warp 18 acquire captured image data from the video see-through camera 41 and the calibration camera 42.
  • Color ISP 16 converts the acquired image data into RGB (Red-Green-Blue color model) image data.
  • Warp 18 performs correction processing on the captured image data of the video see-through camera 41 that has been converted into RGB image data by Color ISP 16.
  • the correction processing by Warp 18 includes, for example, a transformation processing based on the first positional relationship data for at least a partial area of the captured image data, and a processing for extracting at least a partial area of the captured image data as a display target.
  • Warp 18 generates video see-through image data so that the outline of the subject included in the transmitted image and the outline of the subject included in part of the captured image data are continuous on the transmissive display 3.
  • the video see-through image data is an example of display data in this embodiment.
  • Warp 18 is an example of a display data generation circuit in this embodiment.
  • STAT 21 identifies various pieces of information from the image data captured by the video see-through camera 41, which has been converted into an RGB image by Color ISP 16. For example, STAT 21 detects the information required for AE (Auto Exposure) and AWB (Auto White Balance). STAT 21 also extracts a histogram from the captured image data, calculates the average brightness value for each rectangular block of the captured image data, and counts the number of saturated pixels.
  • a histogram is a graph that shows the distribution of brightness values of pixels in image data, and typically the horizontal axis represents brightness and the vertical axis represents the number of pixels.
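
The histogram and per-block statistics described above can be sketched in software. The following Python snippet only illustrates the kind of statistics STAT 21 is described as producing (a 256-bin histogram, the average brightness value per rectangular block, and the saturated-pixel count); the function name, block size, and saturation level are assumptions, not details from the publication.

```python
import numpy as np

def luminance_stats(gray, block=32, sat_level=255):
    """Histogram, per-block mean luminance, and saturated-pixel count
    for an 8-bit grayscale image (illustrative stand-in for STAT 21)."""
    hist = np.bincount(gray.ravel(), minlength=256)      # 256-bin brightness histogram
    h, w = gray.shape
    hb, wb = h // block, w // block
    blocks = gray[:hb * block, :wb * block].reshape(hb, block, wb, block)
    block_mean = blocks.mean(axis=(1, 3))                # average value per rectangular block
    saturated = int((gray >= sat_level).sum())           # number of saturated pixels
    return hist, block_mean, saturated
```
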
  • the CPU 23 performs AE and AWB based on information acquired via the STAT 21 and the I2C interface 22.
  • the CPU 23 also performs extraction processing of bright and dark areas based on the brightness of the scene and the luminance of the captured image data. Based on the results of these processing, the CPU 23 generates transmittance data for controlling the liquid crystal shutter of the lens 2a. Details of the extraction processing of bright and dark areas by the CPU 23 will be described later.
  • the CPU 23 is an example of an image processing circuit in this embodiment.
  • the display controller 19 controls the display projector 5 to display an image on the transmissive display 3.
  • the display controller 19 also controls the degree of transmittance of the liquid crystal shutter of the lens 2 based on transmittance data generated by the CPU 23.
  • the display controller 19 also converts the video see-through image data corrected by the Warp 18 and the rendering image data rendered by the GPU 15 into a format suitable for display on the transmissive display 3.
  • the display controller 19 is an example of a display control circuit in this embodiment.
  • the Display Controller 19 comprises an EN (Enable) block 191 and a Blend block 192.
  • the EN block 191 determines whether the display of saturated pixels is on or off. If the EN block 191 determines that the pixel is off, it corrects the pixel to be hidden (black). If the EN block 191 determines that the pixel is on, it corrects the brightness of the displayed image corresponding to the pixel.
  • the Blend block 192 combines the rendering image data and the video see-through image data.
  • the DRAM controller 24 controls the storage of various data in the DRAM 32 and the reading of various data from the DRAM 32.
  • the configuration of the head-mounted display 1a in this embodiment is not limited to the example shown in Figures 1 and 2.
  • the head-mounted display 1a does not necessarily include the head tracking camera 63a, IMU 61, and ToF sensor 62.
  • the head-mounted display 1a may also include other sensors, etc.
  • the calibration process is performed, for example, before the head-mounted display 1a is shipped. More specifically, the calibration process is a process for making corrections that take into account differences between the eye position of a user wearing the head-mounted display 1a and the mounting position of the video see-through camera 41.
  • When a user wears the head-mounted display 1a, the user's eyes are generally positioned near the centers of the lenses 2a and 2b, respectively. Therefore, the transmitted image that passes through the lenses 2a and 2b and the transmissive displays 3a and 3b is based on a viewpoint near the centers of the lenses 2a and 2b.
  • FIG. 3 is a diagram showing an example of the difference between the video see-through display images 931, 931a and the transmission image 941 according to the first embodiment.
  • the video see-through display image 931 is an image in which the uncorrected video see-through image data is displayed as is on the transmissive display 3.
  • a difference occurs in the position and/or size of the object (subject 9a) included in the video see-through display image 931 and the object (subject 9a) included in the transmission image.
  • In FIG. 3, the video see-through display image 931 and the transmission image 941 are shown separately, but in reality, the video see-through display image 931 is displayed superimposed on the transmission image 941 on the transmissive display 3.
  • the contours of the subject 9a in the video see-through display image 931 and the subject 9a in the transmission image 941 are not continuous, resulting in a double image.
  • a user wearing the head-mounted display 1a sees both the subject 9a in the video see-through display image 931 and the subject 9a in the transmission image 941 as a double image, reducing visibility.
  • the "subject" includes not only objects depicted in the captured image data, but also objects included in the transmission image. Note that objects also include people.
  • the SoC 100a corrects the video see-through image data before displaying it on the transmissive display 3.
  • the position and size of the subject 9a in the video see-through display image 931a based on the corrected video see-through image data match the position and size of the subject 9a in the transmissive image 941.
  • the subject 9a does not appear as a double image on the transmissive display 3, so user visibility is not reduced.
  • the calibration process acquires first positional relationship data used for such correction during use.
  • the first positional relationship data is data indicating the positional relationship between the subject 9a included in the transmissive image 941 transmitted through the transmissive display 3 and the subject included in the first image data captured by the video see-through camera 41.
  • the first positional relationship data represents the amount of alignment correction of the first image data in the warp process.
  • the alignment correction amount is defined, for example, by coordinates that represent the positional relationship of feature points before and after the warp process.
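
As a hedged illustration of how feature-point coordinates before and after the warp can define an alignment correction amount, the sketch below fits a single homography with OpenCV and returns a function that applies it. The publication does not state that a homography is used; the model choice, the RANSAC option, and all names here are assumptions.

```python
import cv2
import numpy as np

def alignment_warp(src_pts, dst_pts, size):
    """src_pts: Nx2 float32 feature-point positions before the warp,
    dst_pts: Nx2 float32 positions they must move to so that contours match,
    size: (width, height) of the output image. All inputs are hypothetical."""
    H, _ = cv2.findHomography(src_pts, dst_pts, cv2.RANSAC)
    def apply(image):
        # Warp the image so the feature points land on their target positions.
        return cv2.warpPerspective(image, H, size)
    return apply
```
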
  • FIG. 4 is a diagram showing an example of the installation positions of the calibration cameras 42a and 42b according to the first embodiment.
  • the calibration cameras 42a and 42b are installed at positions corresponding to the eye positions of a user wearing the head-mounted display 1a.
  • the calibration cameras 42a and 42b are installed near the centers of the lenses 2a and 2b.
  • the calibration cameras 42a and 42b pass through the transparent display 3 from the front side of the transparent display 3 and capture an image in the direction in which the rear side of the transparent display 3 faces. Therefore, the calibration cameras 42a and 42b pass through the transparent display 3 to capture an image of a subject located on the rear side of the transparent display 3.
  • the calibration camera 42 is not used after the calibration process and is therefore not included in the configuration of the head-mounted display 1a.
  • It is preferable that the subject used for the calibration process be one from which feature points can be easily extracted.
  • FIG. 5 is a diagram showing an example of a subject 9b for calibration processing according to the first embodiment.
  • a checkerboard chart in which black and white rectangles are arranged in a grid pattern is used as the subject 9b for calibration processing.
  • the grid points of the checkerboard chart are examples of feature points 90. Note that while FIG. 5 shows one grid point as an example of a feature point 90, all grid points in the image data captured by the calibration camera 42 and the video see-through camera 41 become feature points 90. Note that the subject 9b for calibration processing is not limited to the example shown in FIG. 5.
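
A minimal sketch of extracting the grid points of such a checkerboard chart as feature points 90, using OpenCV's chessboard detector; the pattern size and the sub-pixel refinement settings are illustrative assumptions.

```python
import cv2

def checkerboard_feature_points(gray, pattern=(9, 6)):
    """Detect the inner grid points of a checkerboard chart in a grayscale image."""
    found, corners = cv2.findChessboardCorners(gray, pattern)
    if not found:
        return None
    # Refine the corners to sub-pixel accuracy for more stable deformation parameters.
    criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 1e-3)
    return cv2.cornerSubPix(gray, corners, (11, 11), (-1, -1), criteria)
```
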
  • FIG. 6 is a diagram showing an example of a side view of subject 9b photographed by the calibration camera 42 according to the first embodiment.
  • both the video see-through camera 41 and the calibration camera 42 photograph subject 9b located toward the back surface 302 of the transmissive display 3.
  • the calibration camera 42 photographs subject 9b from the front surface 301 of the transmissive display 3 toward the back surface 302, passing through the lens 2 and the transmissive display 3.
  • the video see-through camera 41 photographs subject 9b without passing through the lens 2 or the transmissive display 3.
  • FIG. 7 is a flowchart showing an example of the flow of calibration processing according to the first embodiment.
  • FIG. 8 is a diagram showing an example of the processing content of each step in the flowchart of FIG. 7. Steps 1-7 in FIG. 8 correspond to S1-S7 in FIG. 7.
  • Color ISP 16 acquires image data (first calibration image data 901 shown in FIG. 8) of subject 9b photographed by calibration camera 42 (S1).
  • the first calibration image data 901 is image data of subject 9b photographed from the front surface 301 of the transparent display 3, passing through the transparent display 3 to the rear surface 302 of the transparent display 3.
  • the first calibration image data 901 is an example of second image data in this embodiment.
  • calibration camera 42 does not have to be mounted on a head-mounted display, and a commercially available digital camera, web camera, etc. may be used.
  • the photographed image data is transferred from the digital camera, web camera, etc. to SoC 100a after photographing, and subsequent processing is performed.
  • the CPU 23 extracts feature points 90 of the subject 9b in the first calibration image data 901 (S2).
  • the feature points 90 extracted from the first calibration image data 901 are designated as first feature points. Extracting feature points 90 means identifying the positions of the feature points 90 (for example, grid points of a checkerboard chart) depicted in the first calibration image data 901.
  • the first calibration image data 901 is a pseudo-image of the transmission image seen by a user wearing the head-mounted display 1a, but differs from the transmission image actually seen by the user due to lens distortion of the calibration camera 42, etc. Distortion caused by the characteristics of the calibration camera 42, including lens distortion, is called camera distortion.
  • the CPU 23 corrects the first calibration image data 901 using the internal parameters of the calibration camera 42 before extracting the first feature points in S2. Correction using the internal parameters is an inverse correction that cancels out the camera distortion of the calibration camera 42.
  • the CPU 23 converts the first calibration image data 901 into a state equivalent to the transmission image seen by a user wearing the head-mounted display 1a. In Figure 8, the converted image data is shown as pseudo transmission image data 911.
  • the internal parameters of the calibration camera 42 are correction parameters for removing the effects of camera distortion of the calibration camera 42.
  • the internal parameters can be calculated using a known calibration tool such as OpenCV (registered trademark).
  • the internal parameters may be calculated by an information processing device external to the SoC 100a before the processing of Figure 7 and stored in memory within the SoC 100a.
  • the video see-through camera 41 like the calibration camera 42, also has camera distortion. Therefore, the internal parameters for removing the effects of camera distortion of the video see-through camera 41 may also be stored in memory within the SoC 100a.
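
Since the text notes that the internal parameters can be calculated with a known calibration tool such as OpenCV, the sketch below shows one common way this is done: estimating a camera matrix and distortion coefficients from several checkerboard captures and then undistorting an image. The variable names and this particular workflow are assumptions, not the embodiment's exact procedure.

```python
import cv2

def internal_parameters(obj_pts, img_pts, image_size):
    """obj_pts/img_pts: lists of 3D chart points and detected 2D corners from
    several (hypothetical) captures of the checkerboard chart."""
    _, camera_matrix, dist_coeffs, _, _ = cv2.calibrateCamera(
        obj_pts, img_pts, image_size, None, None)
    return camera_matrix, dist_coeffs

def remove_camera_distortion(image, camera_matrix, dist_coeffs):
    # Inverse correction that cancels the camera distortion of the capturing camera.
    return cv2.undistort(image, camera_matrix, dist_coeffs)
```
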
  • Color ISP 16 acquires image data (video see-through image data 921 shown in Figure 8) of subject 9b captured by video see-through camera 41 from video see-through camera 41 (S3).
  • the display controller 19 controls the display projector 5 to display the video see-through image data 921 on the transparent display 3 (S4).
  • the video see-through image data 921 displayed on the transparent display 3 is shown as a video see-through display image 932.
  • the subject 9b is removed from in front of the head-mounted display 1a by an operator or the like. As a result, the subject 9b is no longer within the angle of view of the calibration camera 42.
  • Color ISP 16 acquires image data from the calibration camera 42, which is an image of the transmissive display 3 displaying the video see-through display image 932 (S5).
  • This image data is the second calibration image data 902 shown in FIG. 8.
  • the second calibration image data 902 is also an example of the third image data in this embodiment.
  • the CPU 23 extracts feature points 90 of the subject 9b in the second calibration image data 902 (S6).
  • the feature points 90 extracted from the second calibration image data 902 are designated as second feature points. More specifically, similar to the processing in S2, the CPU 23 corrects camera distortion of the second calibration image data 902 using internal parameters of the calibration camera 42 before extracting the second feature points.
  • the corrected second calibration image data 902 is a pseudo-reproduction of the video see-through display image seen by a user wearing the head-mounted display 1a, and is therefore referred to as pseudo see-through display image data 940.
  • the CPU 23 extracts the second feature points from the pseudo see-through display image data 940.
  • the CPU 23 calculates deformation parameters from the first feature points extracted in S2 and the second feature points extracted in S6 (S7).
  • the deformation parameters are the first positional relationship data described above.
  • the deformation parameters are data indicating the positional relationship between the subject 9b included in the pseudo-transmitted image data 911 and the subject 9b included in the video see-through image data 921 captured by the video see-through camera 41.
  • the pseudo-transmitted image data 911 in the calibration process corresponds to the transmitted image when the head-mounted display 1a is actually in use.
  • the pseudo-see-through display image data 940 generated from the video see-through image data 921 in the calibration process corresponds to the video see-through display image when the head-mounted display 1a is actually in use. Therefore, the deformation parameters calculated in S7 indicate the positional relationship between the subject included in the transmitted image and the subject included in the video see-through image data captured by the video see-through camera 41.
  • the CPU 23 calculates the positional relationship between the first feature point and the second feature point, thereby calculating deformation parameters that can match the first feature point and the second feature point. As shown in Step 7-1 of FIG. 8, the CPU 23 calculates deformation parameters that can correct the pseudo see-through display image data 940 so that it matches the pseudo transmitted image data 911. Also, as shown in Step 7-2 of FIG. 8, the CPU 23 calculates the amount of distortion caused by the display system based on the feature point 90 of the subject 9b in the video see-through image data 921 and the feature point 90 of the subject 9b in the pseudo see-through display image data 940.
  • the amount of distortion caused by the display system is the magnitude of the image distortion caused by the characteristics of the lens 2 and the transparent display 3.
  • the characteristics of the lens 2 and the transparent display 3 include, for example, the curvature of the lens 2 and the transparent display 3.
  • the CPU 23 calculates the deformation parameters (first positional relationship data) based on the deformation parameters calculated in Step 7-1 and the distortion amount calculated in Step 7-2. More specifically, the CPU 23 performs a conversion process on the deformation parameters calculated in Step 7-1. This conversion process will be described later with reference to Figures 9-11.
  • the CPU 23 stores the calculated transformation parameters (first positional relationship data) in a memory such as the SRAM 14b or the flash memory 31 (S8). At this point, the processing of this flowchart ends.
  • the calibration process shown in FIG. 7 may be executed by the SoC 100a as described above, or may be executed by another information processing device external to the head-mounted display 1a.
  • the other information processing device may be, for example, a high-performance PC (Personal Computer).
  • the first positional relationship data generated by the calibration process executed by the SoC 100a of one head-mounted display 1a may be stored in the memory of multiple head-mounted displays 1a.
  • FIG. 9 is a diagram showing an example of distortion contained in each image data in the calibration process according to the first embodiment.
  • the deformation parameter that can be directly calculated in Step 7-1 shown in FIG. 8 above is deformation parameter A for the pseudo see-through display image data 940 based on the second calibration image data 902 captured by the calibration camera 42.
  • the video see-through image data 921 acquired in S3 of FIG. 7 is affected by distortion due to the display system when it is displayed on the transmissive display 3 in Step 4.
  • the second calibration image data 902 acquired in Step 5 also includes distortion due to the display system.
  • the second calibration image data 902 also includes distortion (camera distortion) due to the calibration camera 42.
  • the CPU 23 removes this camera distortion through correction using internal parameters.
  • the CPU 23 calculates a deformation parameter A that can correct the pseudo see-through display image data 940 from which the camera distortion has been removed so that it matches the pseudo transmission image data 911.
  • the pseudo see-through display image data 940a after deformation using the deformation parameter A matches the pseudo transmission image data 911.
  • When the head-mounted display 1a is in use, the calibration camera 42 is not used, and therefore the object to be deformed during use is the video see-through image data 921.
  • FIG. 10 is a diagram showing an example of distortion contained in each image data when the head-mounted display 1a according to the first embodiment is in use. Also, FIG. 11 is a diagram showing an example of the breakdown of deformation process B in FIG. 10.
  • the correction process when using the head-mounted display 1a aims to ensure that the feature points of the subject in the video see-through display image 933 displayed on the transmissive display 3 match the feature points of the subject in the transmissive image.
  • deformation process B when using the head-mounted display 1a requires deformation parameters B that take into account the distortion that occurs in the video see-through image data 921a after deformation.
  • When the video see-through image data 921 is displayed on the transmissive display 3, it is affected by distortion due to the display system and by camera distortion of the video see-through camera 41.
  • the effect of camera distortion is removed by correction using internal parameters, so it does not need to be taken into account in the transformation process.
  • transformation process B is a combination process of transformation process S701 for distortion caused by the display system, transformation process S702 using transformation parameter A calculated in the calibration process, and inverse transformation process S703 for distortion caused by the display system.
  • the CPU 23 calculates transformation parameter B by combining the transformation parameters from processes S701 to S703. Transformation parameter B indicates, for example, the difference in the positions of feature points 90 (e.g., lattice points) of subject 9b before and after processes S701 to S703.
  • This transformation parameter B is the first positional relationship data.
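
One way to picture how the processes S701 to S703 combine into a single transformation parameter B is to represent each process as a dense remap field and compose the fields, as sketched below. Representing the deformation parameters as remap fields is an assumption made for illustration; the publication only states that the parameters indicate feature-point positions before and after the processes.

```python
import cv2

def compose_maps(map_outer, map_inner):
    """Compose two dense remap fields (each a pair of float32 map_x, map_y arrays)
    so that one cv2.remap with the result equals applying `map_inner` first and
    then `map_outer`."""
    x_o, y_o = map_outer
    x_i, y_i = map_inner
    x = cv2.remap(x_i, x_o, y_o, cv2.INTER_LINEAR)  # sample the inner map at the outer map's coordinates
    y = cv2.remap(y_i, x_o, y_o, cv2.INTER_LINEAR)
    return x, y

# Illustrative composition: S701 applied first, then S702 (deformation parameter A),
# then S703 (inverse of the display-system distortion).
# map_b = compose_maps(map_s703, compose_maps(map_s702, map_s701))
```
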
  • FIG. 12 is a flowchart showing an example of the flow of the display data generation process when using the head-mounted display 1a according to the first embodiment. Before the process in FIG. 12, it is assumed that the calibration process in FIG. 7 has been completed and the transformation parameters (first positional relationship data) have already been stored in a memory such as SRAM 14b or flash memory 31.
  • Color ISP 16 acquires video see-through image data from video see-through camera 41 (S21). Color ISP 16 then converts the acquired video see-through image data into RGB image data. STAT 21 extracts a histogram, calculates the average brightness value for each rectangular block, and counts the number of saturated pixels for the video see-through image data converted into an RGB image by Color ISP 16. Color ISP 16 and STAT 21 store the video see-through image data converted into RGB image data, the histogram, the calculation results of the average brightness value for each rectangular block, and the number of saturated pixels in memory such as SRAM 14a-14c or DRAM 32.
  • video see-through image data converted into RGB image data will also be simply referred to as video see-through image data.
  • Warp 18 acquires the transformation parameters (first positional relationship data) generated in the calibration process of FIG. 7 from SRAMs 14a to 14c, etc. (S22).
  • the CPU 23 also performs a process of extracting bright and dark areas from the video see-through image data converted into RGB image data (S23). For example, the CPU 23 extracts bright areas from the video see-through image data, and calculates the bright areas of the transmitted image based on the bright areas from the video see-through image data and the first positional relationship data. Details of the process of extracting bright and dark areas will be described later.
  • Warp 18 deforms at least a portion of the video see-through image data according to the alignment correction amount defined by the first positional relationship data (S24). This deformation makes it possible to align the contour of the subject included in the transmitted image with the contour of the subject included in the video see-through display image. Therefore, even when the transmitted image is visible, it is possible to prevent the contour of the subject included in the transmitted image and the contour of the subject included in the video see-through display image from appearing double.
  • the head-mounted display 1a displays a video see-through image to aid vision in dark places. For this reason, not all areas of the video see-through image data are necessarily subject to display. For this reason, Warp 18 corrects only the areas that require display, based on the results of the bright and dark area extraction process in S23.
  • FIG. 13 is a diagram showing an example of correction of video see-through image data 922 according to the first embodiment.
  • Warp 18 deforms only areas of the video see-through image data 922 that have been determined to be dark or dimly lit in the process of extracting bright and dark areas. Warp 18 does not deform areas determined to be bright. For example, Warp 18 deforms the parts of the video see-through image data 922 that are dark or dimly lit and extracted as display targets, and that border the bright areas of the transmitted image, to generate display data.
  • Warp 18 sets the value "0" to areas not subject to correction.
  • the areas at both ends of the video see-through image data 922 are determined to be bright areas, so Warp 18 sets the value "0" to those areas.
  • That is, the original image data in those areas is discarded.
  • In the areas of the corrected video see-through image data 922a where the value "0" is set, nothing is displayed when the data is displayed on the transmissive display 3.
  • the areas shown in black in the corrected video see-through image data 922a in Figure 13 have the value "0" set, so when actually displayed on the transmissive display 3, the background is visible through them. Therefore, in areas where the value "0" is set, the user only sees the transmissive image.
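
A minimal sketch, assuming the dark/twilight display-target areas are available as a boolean mask, of setting the value "0" everywhere else so that only the transmitted image is visible there; the mask name and data layout are assumptions.

```python
import numpy as np

def mask_non_display(corrected, display_target_mask):
    """Zero out every pixel outside the dark/twilight display-target areas.
    corrected: HxWx3 (or HxW) image array; display_target_mask: HxW boolean array."""
    out = corrected.copy()
    out[~display_target_mask] = 0  # value "0" means nothing is shown on the transmissive display
    return out
```
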
  • By narrowing down the areas subject to correction in this way, Warp 18 reduces the amount of calculation required for the correction.
  • Warp 18 outputs the corrected video see-through image data 922a to the Display Controller 19.
  • the Display Controller 19 adjusts the brightness of a portion of the corrected video see-through image data 922a based on the results of the bright and dark area extraction process of S23 (S25).
  • the Display Controller 19 performs, for example, gamma correction to correct the brightness.
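
As a hedged illustration of the brightness adjustment in S25, the snippet below applies a simple LUT-based gamma correction; the gamma value and the LUT approach are assumptions, since the text only says that gamma correction (and optionally color correction) is performed.

```python
import numpy as np

def gamma_correct(image_u8, gamma=2.2):
    """Apply gamma correction to an 8-bit image via a 256-entry lookup table."""
    lut = (np.linspace(0.0, 1.0, 256) ** (1.0 / gamma) * 255.0).astype(np.uint8)
    return lut[image_u8]
```
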
  • the Display Controller 19 may also perform color correction to adjust the colors.
  • the Display Controller 19 generates display data by converting the corrected video see-through image data 922a into a format that can be displayed on the transmissive display 3 (S26).
  • the processing content differs depending on the transmissive display 3, but the Display Controller 19 performs, for example, resizing to match the resolution.
  • the Display Controller 19 outputs the generated display data and the transmittance data generated in the bright and dark area extraction process of S23 (S27). More specifically, the generated display data is output to the display projector 5, causing the transmissive display 3 to display a video see-through display image.
  • the Display Controller 19 also controls the transmittance of the liquid crystal shutter of the lens 2 based on the transmittance data. For example, the Display Controller 19 controls the transmittance of the area of the transmissive display 3 corresponding to the bright area of the transmitted image to a second transmittance that is smaller than the first transmittance used when the bright area of the transmitted image was calculated.
  • the processing of this flowchart ends.
  • the processing of the flowchart in Figure 12 is repeatedly executed while the head-mounted display 1a is being used by the user. Note that, in order to reduce the user's visual discomfort, it is desirable that the processing of Figure 12 be executed at a refresh rate of 60 to 90 Hz, as an example.
  • FIG. 14 is a diagram illustrating the principle of visual assistance using the transmissive display 3 and liquid crystal shutter according to the first embodiment.
  • FIG. 14 shows a state in which a video see-through display image is displayed on the transmissive display 3 and light reduction is performed by the liquid crystal shutter of the lens 2 as a result of the processing of S27 in FIG. 12.
  • the lens 2 with liquid crystal shutter function and the transmissive display 3 are located between the eye 8 of the user wearing the head-mounted display 1a and the real image (subject).
  • In FIG. 14, there are dark areas (dark regions) 71 and bright areas (bright regions) 72 in the real image.
  • the Display Controller 19 displays a video see-through display image in the areas where the dark areas of the real image on the transmissive display 3 are transmitted. Therefore, in the areas where the dark areas 71 are transmitted, the user sees a bright video see-through display image 934, not a dark transmitted image. Note that even if a small amount of the transmitted image is visible in the areas where the dark areas 71 are transmitted, no double image occurs because the video see-through display image 934 has been corrected to match the transmitted image by the transformation process in S24.
  • a value of "0" is set for each pixel of the video see-through display image 934, so the video see-through display image 934 is not displayed on the transmissive display 3. Furthermore, depending on the brightness of the bright area 72, the liquid crystal shutter function of the lens 2 dims the area where the bright area 72 is transmitted.
  • FIG. 15 is a diagram showing an example of the display mode of the transmissive display 3 according to the first embodiment.
  • Due to darkness, the user may be unable to see objects in the region 310 of the transmissive display 3 through which the dark area 71 is transmitted.
  • Similarly, in the region 320 of the transmissive display 3 through which the bright area 72 is transmitted, the user may be unable to see objects due to glare.
  • By displaying the video see-through display image 934, the user's visibility is improved in the region 310 through which the dark area 71 is transmitted. Furthermore, in the region 320 through which the bright area 72 is transmitted, the liquid crystal shutter reduces the glare, thereby improving the user's visibility.
  • Figure 16 is a diagram showing an example of the flow of the bright and dark area extraction process according to the first embodiment.
  • the CPU 23 acquires the brightness of the scene (EV (Exposure Value) value) (S231).
  • In AE, the brightness of the scene is estimated, and the shutter speed and sensitivity are determined accordingly.
  • the CPU 23 uses this AE information to acquire the brightness of dark areas of the scene (EV value).
  • the brightness of the scene is the brightness around the user wearing the head-mounted display 1a.
  • the CPU 23 also determines the transmittance of lens 2 based on the acquired EV value (S232).
  • the transmittance of lens 2 indicates the degree of light blocking by the liquid crystal shutter.
  • the CPU 23 extracts bright and dark areas from the video see-through image data 923 (S233). Extracting bright and dark areas means, in other words, identifying the ranges that correspond to bright and dark areas in the video see-through image data 923. At this point, the processing of this flowchart ends.
  • Figure 17 is a diagram showing an example of the flow of the EV value acquisition process according to the first embodiment.
  • the processes of S301 to S307 shown in Figure 17 are executed by the CPU 23.
  • the CPU 23 acquires the histogram and the calculation results of the average brightness value for each rectangular block generated by STAT 21 from the video see-through image data 923 acquired by Color ISP 16.
  • the CPU 23 calculates a threshold value that serves as a reference for dark places from the histogram. For example, the CPU 23 integrates the histogram from element 0 and sets the element number when the integrated value exceeds N% of the total number of pixels as the threshold value (S301).
  • Figure 18 is a diagram showing an example of a threshold value determined based on a histogram according to the first embodiment. By calculating the threshold value from the histogram, the CPU 23 can dynamically set a threshold value that corresponds to the video see-through image data 923.
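
The dark-place threshold of S301 can be sketched as a cumulative-histogram lookup: integrate the histogram from element 0 and return the first bin whose integrated count exceeds N% of the total number of pixels. The value of N is not given in the text, so the default below is an assumption.

```python
import numpy as np

def dark_threshold(hist, percent_n=20.0):
    """First bin index at which the cumulative pixel count exceeds N% of all pixels."""
    cum = np.cumsum(hist)
    target = cum[-1] * percent_n / 100.0
    return int(np.searchsorted(cum, target, side='right'))
```
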
  • the CPU 23 obtains the average value of the area where the brightness is equal to or less than the threshold value from the average brightness values of rectangular blocks obtained from the STAT 21 (S302). This average value is called the STAT average value.
  • the CPU 23 estimates the brightness (EV value) of the scene from the STAT average value and the current sensor module settings (exposure time, sensor gain, and aperture) (S303).
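
Formula (1) and the exact estimation method are not reproduced in the text, so the following is only a common photographic approximation of scene EV from exposure time, sensitivity, aperture, and the measured STAT average; the mid-gray target and the final log2 offset are assumptions.

```python
import math

def estimate_scene_ev(stat_mean, exposure_time_s, iso, f_number, mid_gray=118.0):
    """Rough scene-brightness estimate in EV from the current sensor settings,
    offset by how far the measured mean sits from a mid-gray target."""
    ev_settings = math.log2(f_number ** 2 / exposure_time_s) - math.log2(iso / 100.0)
    return ev_settings + math.log2(max(stat_mean, 1.0) / mid_gray)
```
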
  • the CPU 23 also sets new sensor setting values for various sensors (such as the Ambient Light Sensor 60) based on the estimated brightness (S304). This allows the sensor sensitivity to be adjusted to match the actual level of brightness.
  • the CPU 23 also acquires the ambient light brightness (Lux) from the Ambient Light Sensor 60 via the I2C interface 22 (S305).
  • FIG. 19 is a diagram showing an example of a common EV/Lux conversion table.
  • the EV value is a numerical value used when determining the exposure compensation value of a camera, and it is known that there is a correspondence relationship between the EV value and illuminance (Lux value), a common unit of brightness, roughly as shown in the table in Figure 19. The higher the EV value, the brighter the scene, and the lower the EV value, the darker the scene.
  • the CPU 23 may determine that an EV value of "-4" or greater and less than "0" is a "dark place," an EV value of "0" or greater and less than "2" is a "twilight" place, and an EV value of "15" or greater is a "bright place." Note that the criteria for "dark place," "twilight," and "bright place" shown in Figure 19 are merely examples and are not limited to these. Also, in Figure 19, EV values of "2" or greater and less than "15" are considered "daytime" and distinguished from "bright places," but "daytime" may also be included in "bright places."
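
A small sketch of the classification implied by the example thresholds of FIG. 19 (dark place below EV 0, twilight up to EV 2, daytime up to EV 15, bright place at EV 15 and above); as the text notes, these criteria are examples only.

```python
def classify_scene(ev):
    """Classify scene brightness following the example ranges of FIG. 19.
    The text gives -4 <= EV < 0 as "dark place"; treating values below -4
    as dark as well is an assumption."""
    if ev < 0:
        return "dark place"
    if ev < 2:
        return "twilight"
    if ev < 15:
        return "daytime"   # the text notes daytime may also be grouped with bright places
    return "bright place"
```
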
  • the CPU 23 uses the detection results (ALS brightness) of the Ambient Light Sensor 60; if the measurement results are unavailable, the CPU 23 uses the AE information (AE brightness) (S307).
  • When the Ambient Light Sensor 60 is used, the update interval is longer than when AE information is used, but the power consumption of the scene brightness estimation process can be reduced.
  • Because the sensor settings are set so that dark areas can be captured brightly, bright areas become saturated, and AE information based on the captured image data may not accurately measure the brightness of bright places.
  • the CPU 23 outputs an EV value based on either the ALS brightness or the AE brightness to the Display Controller 19.
  • the CPU 23 may use both the ALS brightness and the AE brightness to estimate the brightness of a scene.
  • the CPU 23 corrects the EV value based on the maximum value of the histogram and the average STAT value to obtain the brightness of a bright place. Specifically, the CPU 23 corrects the EV value based on the AE information using the following formula (1).
  • the CPU 23 determines the transmittance of lens 2 according to the EV value indicating the brightness of the scene obtained in S231.
  • the CPU 23 reduces the decrease in visibility due to brightness by controlling the transmittance of the lens 2 according to the degree of brightness. For every 1 increase in the EV value, the brightness (Lux) approximately doubles. For this reason, for example, the CPU 23 may define an EV value of 15 or higher as a bright place, as shown in Figure 19, and double the degree of light blocking by the lens 2 for every 1 increase in the EV value from 15. By controlling the liquid crystal shutter in this way, the brightness of the transmitted image through the lens 2 will not exceed an EV value of 14, reducing the glare felt by the user.
  • FIG. 20 is a diagram showing an example of the relationship between the EV value and the transmittance of the lens 2 according to the first embodiment.
  • the transmittance changes as shown in the graph in FIG. 20.
  • the maximum transmittance of the lens 2 cannot be set to 100%, so in FIG. 20, the maximum transmittance is set to 80% as an example.
  • the transmittance is halved every time the EV value increases by 1 (brightness doubles).
  • the CPU 23 sets the degree of light blocking of the liquid crystal shutter based on the determined transmittance.
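
A minimal sketch of the transmittance control suggested by FIG. 20: hold the maximum transmittance (80% in the example) below the bright-place threshold and halve it for every +1 EV above it. The exact breakpoint and curve used by the embodiment may differ; this only illustrates the described behavior.

```python
def lens_transmittance(ev, max_transmittance=0.8, bright_ev=15):
    """Liquid-crystal-shutter transmittance of lens 2 versus scene EV (S232)."""
    if ev < bright_ev:
        return max_transmittance
    # Halve the transmittance (double the light blocking) for each +1 EV above the threshold.
    return max_transmittance * (0.5 ** (ev - bright_ev + 1))
```
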
  • Figure 21 is a diagram showing an example of the EV value estimation process for rectangular blocks of video see-through image data 923 according to the first embodiment.
  • the CPU 23 estimates the brightness for each location on the video see-through image data 923 to determine areas where visual correction is required. This brightness estimation process can be performed on a pixel-by-pixel basis, but the amount of calculation can be reduced by estimating the brightness on a block-by-block basis using the average value for each block obtained from STAT 21. Note that the various numerical values shown in Figure 21 are merely examples and are not limited to these.
  • the CPU 23 converts the average value for each block into an EV value.
  • the CPU 23 estimates the brightness (EV value) for each block using the following equation (2):
  • the video see-through image data 923 is captured without passing through the lens 2, whereas the brightness of the scene actually seen by the user's eye 8 is the brightness after transmission through the lens 2.
  • the CPU 23 corrects the EV value of each block based on the transmittance of the lens 2 at the time of processing, using the following equation (3). Through this correction, the CPU 23 estimates the EV value indicating the brightness of the actual transmitted image.
  • the EV value after correction based on the transmittance of the lens 2 is also called the visual EV value.
  • the CPU 23 may divide the area into “bright areas” that are visually recognizable to the user and “dark areas” that are difficult to recognize, depending on the brightness (visual EV value).
  • the CPU 23 may also divide the area into "twilight areas” that are brighter than dark areas but somewhat difficult to recognize.
  • the CPU 23 can identify the relevant area using the absolute value of the EV value.
  • the dynamic range of human vision is said to be around 80 to 120 dB, and if there is a brightness difference greater than this, it becomes difficult to see dark areas. For this reason, in order to include these two patterns, the CPU 23 determines that areas where the absolute value of the EV value is below a certain threshold, or where the difference from the maximum EV value is below a certain value, are difficult to recognize.
  • the CPU 23 determines that an EV value in the range of "0" to "1" is "twilight," and, according to the degree of visibility, gradually begins providing visual assistance using video see-through image data. Furthermore, if the EV value falls below "0," the CPU 23 determines that the location is a "dark place" and provides visual assistance using video see-through image data.
  • FIG. 22 is a diagram showing an example of the relationship between the brightness of the video see-through display image and the EV value in a scene according to the first embodiment.
  • the vertical axis of FIG. 22 is the brightness (luminance) of the video see-through display image, and the horizontal axis is the EV value.
  • "Gradually starting visual assistance" means, for example, as shown in the graph in FIG. 22, that the brightness of the video see-through display image displayed on the transmissive display 3 gradually increases as the EV value decreases.
  • the CPU 23 determines the brightness of the video see-through display image based on the difference between the EV value of the block and the maximum EV value for the entire video see-through image data 923. For example, the CPU 23 determines that it is "twilight” if "EV value - maximum EV value" falls below "-4.” In the case of "twilight,” visual recognition becomes somewhat difficult, so the CPU 23 gradually begins providing visual assistance using the video see-through image data. Furthermore, the CPU 23 determines that it is a "dark place” if "EV value - maximum EV value” falls below “-7.” If the CPU 23 determines that it is a "dark place,” it provides visual assistance using the video see-through image data.
  • FIG. 23 is a diagram showing an example of the relationship between the brightness of the video see-through display image according to the first embodiment and the luminance difference in EV value in a scene.
  • the vertical axis of FIG. 23 is the brightness (luminance) of the video see-through display image, and the horizontal axis is the difference (luminance difference) between the EV value of the block and the maximum EV value in the entire video see-through image data 923.
  • the CPU 23 gradually brightens the brightness of the video see-through display image displayed on the transmissive display 3 as the luminance difference increases.
  • the CPU 23 can determine the brightness of each area corresponding to "bright place," "dark place," and "twilight" based on the EV value (see the classification sketch after this list).
  • the video see-through image data 923 may contain locally bright areas such as blown-out highlights in addition to the brightness of the entire scene. This is because when a normal image sensor is used in the video see-through camera 41, the dynamic range is narrow, causing blown-out highlights (a state in which pixel values are saturated) in areas with relatively high brightness. An example of an area with relatively high brightness is an area with strong backlight. If the video see-through display image displayed on the transmissive display 3 includes such blown-out highlight areas, this can actually impair vision, which is undesirable. Therefore, the CPU 23 also determines such relatively bright areas to be bright areas. Areas of the video see-through display image that the CPU 23 determines to be bright areas are not displayed on the transmissive display 3.
  • FIG. 24 is a diagram showing an example of the relationship between the brightness of a video see-through display image and the average luminance value for each block according to the first embodiment.
  • the CPU 23 performs a conversion as shown in the graph in FIG. 24 on the average luminance value for each block output from STAT 21.
  • the Display Controller 19 adjusts the brightness of each area of the video see-through display image. For example, the Display Controller 19 displays the video see-through display image on the transmissive display 3 only in areas that it has determined should be displayed in all three of the patterns, described above, in which visual recognition is difficult due to darkness or brightness. For example, the Display Controller 19 adjusts the brightness of the video see-through image data 923 by multiplying each pixel value of the video see-through image data 923 by the brightness determined for the corresponding area (see the brightness-adjustment sketch after this list).
  • the CPU 23 determines the "bright” and “dark” areas of the video see-through image data 924 based on the brightness (EV value) of the transmitted image estimated by the CPU 23 and the average brightness value for each block of the video see-through image data 924 output from the STAT 21.
  • the CPU 23 also identifies areas of relatively high brightness, such as blown-out highlights, in the video see-through image data 924.
  • the Display Controller 19 adjusts the brightness of the video see-through image data 924 based on the processing results of the CPU 23.
  • the Display Controller 19 identifies the area to be displayed in the video see-through image data 924 by multiplying the determination result of the "bright” and “dark” areas of the video see-through image data 924 by the identification result of the relatively high brightness areas, such as blown-out highlights. Warp18 sets "0" to areas of the video see-through image data 924 that it determines not to be displayed (for example, areas that fall into a "bright place”). Therefore, in the video see-through image data 924a after brightness adjustment, areas that fall into a "bright place” in any of the three pattern determinations are set to "0".
  • the Display Controller 19 begins providing visual assistance using video see-through image data not only in “dark places” but also in “twilight” areas. Unlike “dark places,” such "twilight” areas allow the user to faintly see the transmitted image. Therefore, in order to prevent double images, the Warp 18 uses the first positional relationship data to deform the video see-through image data for at least the "twilight” area. This deformation causes the position and size of the subject in the transmitted image to match the subject in the video see-through display image on the transmissive display 3. Note that in “dark places,” the transmitted image is not visible and no double images occur, so deformation of the video see-through image data can be omitted. In this case, the amount of calculation required for deformation can be reduced.
  • it is preferable for Warp 18 to target for deformation both the "dark" and "twilight" areas of the video see-through image data that are displayed on the transmissive display 3 as the video see-through display image.
  • FIG. 26 shows an example of video see-through image data 925, 925a before and after correction according to the first embodiment.
  • Warp18 deforms the "twilight” area of the video see-through image data 925.
  • An example is shown in which areas in “dark places” that do not overlap with the transmitted image are not deformed, but it is also possible to deform both "dark places” and “twilight” areas.
  • Warp 18 sets the value "0" for "bright place" areas that are not subject to correction.
  • the SoC 100a of this embodiment extracts and deforms at least a partial area of the video see-through image data based on the first positional relationship data. Furthermore, the SoC 100a of this embodiment generates display data so that the outline of the subject included in the transmitted image and the outline of the subject included in part of the video see-through image data are continuous on the transmissive display 3, and displays the generated display data on the transmissive display 3. Therefore, the SoC 100a of this embodiment can improve visibility at the boundary between the transmitted image and the video see-through display image when displaying a video see-through display image on the transmissive display 3.
  • the SoC 100a of this embodiment calculates feature points of the transmitted image based on the video see-through image data, and generates first positional relationship data based on the feature points of the transmitted image and the feature points of the video see-through image data.
  • the SoC 100a outputs the generated first positional relationship data to memory. Therefore, according to the SoC 100a of this embodiment, the first positional relationship data generated and stored in advance is used when the user uses the head-mounted display 1a, thereby improving the processing speed for correction.
  • the SoC 100a of this embodiment also extracts a first feature point of the transmitted image from first calibration image data 901 captured by the calibration camera 42 from the front surface 301 of the transmissive display 3.
  • the SoC 100a also extracts a second feature point of the subject from second calibration image data 902 captured by the calibration camera 42 of the transmissive display 3, on which display data based on the video see-through image data captured by the video see-through camera 41 is displayed.
  • the SoC 100a of this embodiment then generates first positional relationship data based on the positional relationship between the first and second feature points. Therefore, the SoC 100a of this embodiment can correct the misalignment between the video see-through image data and the transmitted image by taking into account actual lens distortion, display system distortion, etc.
  • the head-mounted display 1a of this embodiment has the functions of providing visual assistance in dark places and suppressing reduced visibility due to glare in bright places. Therefore, for example, when a user wears the head-mounted display 1a while driving a vehicle, visual assistance is provided by the display of a video see-through display image in dark places such as behind objects, and visual assistance is provided by the light blocking liquid crystal shutter in bright places such as backlight. Furthermore, the head-mounted display 1a of this embodiment can also be used as a visual assistance when working in dark places such as at night or underground.
  • in the embodiment described above, the SoC 100a reduced the occurrence of double images in the "dark" and "twilight" areas on the transmissive display 3 by transforming the video see-through image data to match the transmitted image.
  • Another way to make double images less noticeable is to increase the degree of light blocking by the LCD shutter to reduce the effects of external light, and then display video see-through image data. With this method, the transmitted image becomes invisible due to the light blocking, thereby preventing double images.
  • Figure 27 is a diagram showing an example of a light-blocking target area 201 according to variant 1 of the first embodiment.
  • the transmissive display 3 has a narrower angle of view than the lens 2, and therefore can often only display a portion of the user's field of view.
  • the Display Controller 19 of this variant does not block light over the entire field of view of the user (the entire lens 2), but rather sets only the area where the transmissive display 3 is present as the light-blocking target area 201. By setting this area as the light-blocking target area 201, the Display Controller 19 can prevent double images without obstructing the field of view outside the range of the transmissive display 3.
  • as described above, the SoC 100a of Modification 1 defines only the area of the entire lens 2 where the transmissive display 3 is present as the light-blocking target area 201. With such a light-blocking target area, if there is an area with a relatively or absolutely high EV value, such as strong backlight, outside the area of the transmissive display 3, the glare there cannot be reduced.
  • FIG. 28 is a diagram showing an example of a case where a bright area exists according to Variation 2 of the first embodiment.
  • even if the SoC 100a uses the liquid crystal shutter to block light only in the area where the transmissive display 3 exists, the user will still feel dazzled.
  • meanwhile, areas outside the range of the transmissive display 3 are not subject to visual assistance by the video see-through display image even when they are dark. Therefore, if the SoC 100a blocks light over the entire surface of the lens 2 using the liquid crystal shutter, areas outside the range of the transmissive display 3 remain dark, resulting in poor visibility.
  • the SoC 100a of this modified example controls the liquid crystal shutter to block light only in areas that are dazzling due to strong backlight, etc.
  • FIG. 29 is a diagram showing an example of a light-shielding target area 202 according to Modification 2 of the first embodiment.
  • the light-shielding target area 202 also includes areas outside the range of the transmissive display 3.
  • the CPU 23 of the SoC 100a of this modification calculates the transmittance for each area of the lens 2 from the average brightness value for each rectangular block, and controls the transmittance of the liquid crystal shutter for each area (see the per-area shutter sketch after this list).
  • the liquid crystal shutter of the lens 2 of this modification has a liquid crystal panel whose transmittance can be partially controlled.
  • the head-mounted display 1a of this modified example can reduce the reduction in visibility for the user when there is a bright area with relatively or absolutely high brightness outside the range of the transmissive display 3 within the field of view of the lens 2. Furthermore, the head-mounted display 1a of this modified example can reduce the reduction in visibility for the user even when there is a mixture of bright and dark areas within the field of view of the lens 2.
  • the SoC 100a corrects the misalignment between the video see-through display image and the transmitted image based on the difference between the position of the user's eye 8 wearing the head-mounted display 1a and the mounting position of the video see-through camera 41.
  • the SoC 100a determines the first positional relationship data on the assumption that the user's eye 8 is located near the center of the lens 2.
  • the position of the eye 8 varies depending on the individual user, and even for the same person, the relative position between the eye 8 and the video see-through camera 41 is expected to constantly change depending on the wearing conditions. For this reason, in this second embodiment, the misalignment during mounting is also corrected.
  • FIG. 30 is a diagram showing an example of the overall configuration of a head-mounted display 1b according to the second embodiment.
  • the head-mounted display 1b of this embodiment includes an eyeglass body 10, lenses 2a and 2b, transparent displays 3a and 3b, video see-through cameras 41a and 41b, display projectors 5a and 5b, an ambient light sensor 60, head tracking cameras 63a and 63b, and an SoC 100b.
  • the head-mounted display 1b of this embodiment also includes eye tracking cameras 43a and 43b.
  • the eyeglasses main body 10, lenses 2a, 2b, transmissive displays 3a, 3b, video see-through cameras 41a, 41b, display projectors 5a, 5b, ambient light sensor 60, and head tracking cameras 63a, 63b have the same functions as in the first embodiment.
  • the head-mounted display 1b has an IMU 61, a ToF sensor 62, a flash memory 31, and a DRAM 32, just like in the first embodiment.
  • Eye Tracking cameras 43a and 43b capture images in the direction in which the front of the transmissive display 3 faces.
  • Figure 31 is a diagram showing an example of the positional relationship between the eye tracking cameras 43a, 43b and the user's eyes 8a, 8b according to the second embodiment. As shown in Figure 31, the eye tracking cameras 43a, 43b capture images of the eyes 8a, 8b of a user wearing the head-mounted display 1b.
  • Figure 32 is a diagram showing another example of the positional relationship between the eye tracking cameras 43a-43d and the user's eyes 8a, 8b according to the second embodiment. As shown in Figure 32, when the head-mounted display 1b is equipped with four eye tracking cameras 43a-43d, it is possible to capture images of one eye 8 using two cameras. With this configuration, it is also possible to estimate the distance from the eyes 8a, 8b to the lenses 2a, 2b using the principle of triangulation.
  • Eye tracking camera 43 is an example of the third camera in this embodiment.
  • image data captured by eye tracking camera 43 is an example of the fourth image data in this embodiment.
  • FIG. 33 is a diagram showing an example of the configuration of SoC 100b according to the second embodiment.
  • SoC 100b includes I2C interfaces 11 and 22, Mono ISP 12, DSP & AI Accelerator 13, SRAMs 14a to 14c, GPU 15, Color ISP 16, Time Warp 17, Warp 18, Display Controller 19, STAT 21, CPU 23, and DRAM Controller 24.
  • the Mono ISP 12 of this embodiment also acquires captured image data from the eye tracking camera 43 and corrects brightness, etc.
  • the DSP & AI Accelerator 13 of this embodiment also performs eye tracking processing to detect the position of the pupil of the user's eye 8 based on image data captured by the eye tracking camera 43 acquired and corrected by the Mono ISP 12.
  • the DSP & AI Accelerator 13 is an example of a pupil detection circuit in this embodiment.
  • the DSP 131 in particular generates second positional relationship data based on the position of the user's pupil and feature points of the video see-through image data, and outputs it to a memory such as the SRAM 14a.
  • the DSP 131 is an example of an image processing circuit in this embodiment.
  • Warp 18 of this embodiment also corrects the position of the subject in the video see-through image data based on second positional relationship data.
  • the second positional relationship data represents the positional relationship between the position of the user's pupil and the video see-through camera 41.
  • the CPU 23 performs the pre-use calibration process for the head-mounted display 1b in the same manner as in the first embodiment, and stores the first positional relationship data in memory.
  • Figure 34 is a flowchart showing an example of the flow of the process for obtaining the pupil position misalignment correction amount according to the second embodiment.
  • Mono ISP 12 acquires image data of the eyes of a user (wearer) wearing head-mounted display 1b from eye tracking camera 43 (S31).
  • the image data captured by eye tracking camera 43 is referred to as eye tracking image data.
  • the DSP & AI Accelerator 13 detects the position of the user's pupil from the image data for eye tracking (S32).
  • the data indicating the position of the user's pupil is an example of the pupil position data in this embodiment.
  • the DSP in the DSP & AI Accelerator 13 acquires the relative position between the user's eye 8 and the video see-through camera 41 based on the detection result of the user's pupil position (S33).
  • the DSP 131 outputs second positional relationship data indicating the acquired relative position between the user's eye 8 and the video see-through camera 41 to a memory such as the SRAM 14a.
  • the DSP 131 may also generate second positional relationship data indicating the relative position between the user's eye 8 and the video see-through camera 41 based on feature points extracted from the video see-through image data.
  • the DSP 131 may also store pupil position data indicating the position of the user's pupil in a memory such as the SRAM 14a.
  • the method of eye tracking processing is not particularly limited, and any known method can be used.
  • data indicating the positional relationship between the video see-through camera 41 and the eye tracking camera 43 may be stored in advance in a memory such as the SRAM 14a.
  • the DSP 131 may convert the relative relationship between the user's eyes 8 and the eye tracking camera 43 into the relative position between the user's eyes 8 and the video see-through camera 41 based on the data indicating this positional relationship (see the pupil-offset sketch after this list).
  • the processing of the flowchart in Figure 34 is repeatedly executed while the head-mounted display 1b is being used by the user.
  • FIG. 35 is a flowchart showing an example of the flow of the display data generation process when using the head-mounted display 1b according to the second embodiment.
  • the image data acquisition process in S21 and the bright and dark area extraction process in S23 are the same as the processes in the first embodiment described in FIG. 12.
  • Warp 18 acquires first positional relationship data and second positional relationship data from a memory such as SRAM 14a.
  • Warp 18 then corrects the positional relationship of the subject in the video see-through image data based on the second positional relationship data (S41).
  • the processes from the transformation process of the video see-through image data in S24 to the output process of the display data and transmittance data in S27 are the same as those in the first embodiment described in FIG. 12. At this point, the processing of this flowchart ends.
  • the SoC 100b of this embodiment detects the position of the user's pupils based on the Eye Tracking image data, and generates second positional relationship data that represents the positional relationship between the pupil positions and the video see-through camera 41. Therefore, the SoC 100b of this embodiment can reflect the deviation in pupil position when actually worn and individual differences between users in the video see-through display image displayed on the transmissive display 3.
  • the DSP&AI accelerator 13 detects the position of the user's pupil from the eye tracking image data.
  • the DSP&AI accelerator 13 extracts pupil reflection image data reflected in the user's pupil from the eye tracking image data.
  • the DSP&AI accelerator 13 is an example of a pupil reflection image extraction circuit in this modification.
  • the DSP 131 of this modified example generates second positional relationship data based on the pupil reflection image data and feature points of the video see-through image data, and outputs the data to a memory such as the SRAM 14a.
  • the DSP 131 is an example of an image processing circuit in this modified example.
  • the SoC 100b of this modified example extracts pupil reflection image data reflected in the user's eyes from the eye tracking image data captured by the eye tracking camera 43, and generates second positional relationship data based on that pupil reflection image data. Therefore, the SoC 100b of this modified example can correct video see-through image data by taking into account the image that the user is actually viewing.
  • the SoCs 100a and 100b are mounted on the head-mounted displays 1a and 1b, but the SoCs 100a and 100b may be applied to other display devices.
  • for example, the SoCs 100a and 100b may be applied to display devices for head-up displays provided on the windshield of a vehicle.
  • the head-mounted displays 1a and 1b may be goggle-type devices equipped with two lenses 2a and 2b, or may be goggle-type devices equipped with one large lens 2 for both eyes.
  • the transmissive display 3 may be provided over the entire surface of the lens 2, rather than on only a part of it.
  • the various processes of the SoCs 100a and 100b may be stored, for example, as a program on a non-volatile storage medium.
  • the CPU 23 of the SoCs 100a and 100b may read the program to execute the various processes described above.
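The transmittance rule described above for bright scenes (a maximum transmittance of about 80 % and a halving of the transmittance for every 1 EV step above the bright-place threshold of EV 15, per FIG. 19 and FIG. 20) can be written as a small function. The following Python sketch is only an illustration of that rule under those stated assumptions; the function and parameter names are not taken from the embodiment.

```python
# Hypothetical sketch of the EV-to-transmittance mapping described above
# (Figures 19 and 20): transmittance is capped at 80% and halved for every
# 1 EV step above the "bright place" threshold of EV 15.

def lens_transmittance(scene_ev: float,
                       max_transmittance: float = 0.8,
                       bright_threshold_ev: float = 15.0) -> float:
    """Return the liquid crystal shutter transmittance for a scene EV value."""
    if scene_ev <= bright_threshold_ev:
        return max_transmittance  # not a "bright place": minimal light blocking
    # Each +1 EV doubles scene brightness, so halve the transmittance per step
    # to keep the brightness of the transmitted image roughly constant.
    return max_transmittance * 0.5 ** (scene_ev - bright_threshold_ev)

if __name__ == "__main__":
    for ev in (12, 15, 16, 18, 20):
        print(ev, round(lens_transmittance(ev), 4))
```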
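The classification of blocks into "bright place," "twilight," and "dark place" combines an absolute EV criterion (twilight between EV 0 and 1, dark below 0) with a relative criterion against the maximum EV in the frame (twilight below -4, dark below -7). The classification sketch below is a hedged reading of that logic; equations (2) and (3) are not reproduced in this text, so the log2-based correction for the lens transmittance is an assumption derived from the statement that each +1 EV corresponds to a doubling of brightness.

```python
import math

def visual_ev(block_ev: float, transmittance: float) -> float:
    """EV of the transmitted image seen through the lens. The log2 correction
    is an assumption standing in for equation (3), based on the rule that
    +1 EV corresponds to a doubling of brightness."""
    return block_ev + math.log2(transmittance)

def classify_block(v_ev: float, max_v_ev: float) -> str:
    """Label a block as 'dark', 'twilight', or 'bright' using both the absolute
    thresholds (0 and 1) and the relative thresholds (-7 and -4) described
    above. The thresholds are the example values from the description."""
    diff = v_ev - max_v_ev
    if v_ev < 0.0 or diff < -7.0:
        return "dark"       # transmitted image invisible: full video see-through assistance
    if v_ev <= 1.0 or diff < -4.0:
        return "twilight"   # faintly visible: gradually blend in the video see-through image
    return "bright"         # clearly visible: do not display the video see-through image
```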
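The brightness adjustment of the video see-through display image (FIG. 22 to FIG. 25) multiplies each pixel by a per-area brightness and leaves "bright place" and blown-out areas at zero so that they are not displayed. The brightness-adjustment sketch below uses a simple linear ramp between the twilight (-4) and dark (-7) thresholds as a stand-in for the curves in those figures; the actual transfer curves, block size, and data layout are assumptions.

```python
import numpy as np

def display_gain(v_ev: float, max_v_ev: float) -> float:
    """Brightness of the video see-through display image for one block:
    0 in 'bright' areas, ramping up through 'twilight' and saturating at 1
    in 'dark' areas (a linear stand-in for the curves of Figures 22 and 23)."""
    diff = v_ev - max_v_ev
    # Ramp between the 'twilight' threshold (-4) and the 'dark' threshold (-7).
    return float(np.clip((-4.0 - diff) / 3.0, 0.0, 1.0))

def adjust_brightness(frame: np.ndarray, gains: np.ndarray, block: int) -> np.ndarray:
    """Multiply each pixel by the gain of its block; blocks with gain 0
    (bright places, blown-out highlights) end up black, i.e. not displayed."""
    out = frame.astype(np.float32)
    for by in range(gains.shape[0]):
        for bx in range(gains.shape[1]):
            out[by * block:(by + 1) * block, bx * block:(bx + 1) * block] *= gains[by, bx]
    return out.clip(0, 255).astype(frame.dtype)
```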
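For Modification 2, where the liquid crystal shutter transmittance is controlled per area of the lens 2, the same halving rule can be applied block by block to the per-block scene EV estimates. The per-area shutter sketch below is an illustration only; how the EV map is spatially aligned to the shutter segments is not specified in the description.

```python
import numpy as np

def shutter_map(block_evs: np.ndarray,
                max_transmittance: float = 0.8,
                bright_threshold_ev: float = 15.0) -> np.ndarray:
    """Per-area transmittance for a partially controllable liquid crystal
    shutter: areas at or below the bright-place threshold keep the maximum
    transmittance, brighter areas are attenuated by halving per +1 EV,
    mirroring the global rule sketched earlier for the whole lens."""
    excess = np.clip(block_evs - bright_threshold_ev, 0.0, None)
    return max_transmittance * np.power(0.5, excess)
```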
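In the second embodiment, the relative position between the user's eye 8 and the video see-through camera 41 is obtained from the detected pupil position and pre-stored data describing the positional relationship between the eye tracking camera 43 and the video see-through camera 41. The pupil-offset sketch below assumes that this stored relationship is a rigid transform (rotation R and translation t); the actual representation of the second positional relationship data is not specified.

```python
import numpy as np

def eye_to_vst_camera_offset(pupil_in_eyetrack_cam: np.ndarray,
                             R_et_to_vst: np.ndarray,
                             t_et_to_vst: np.ndarray) -> np.ndarray:
    """Convert the pupil position measured in the eye-tracking camera frame
    into the video see-through camera frame, using a pre-stored rigid
    transform between the two cameras. Representing the stored positional
    relationship as (R, t) is an assumption made for this sketch."""
    return R_et_to_vst @ pupil_in_eyetrack_cam + t_et_to_vst
```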

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)

Abstract

This semiconductor device includes a memory, a display data generation circuit, and a display control circuit. In the memory, first positional relationship data is stored, the data indicating a positional relationship between a subject included in a transmission image transmitted through a transmissive display and the subject included in first image data captured by a first camera that captures an image of a direction in which the back face of the transmissive display faces. The display data generation circuit uses the first positional relationship data to extract and deform at least a partial region of the first image data, and to generate display data such that the contour of the subject included in the transmission image is continuous with the contour of the subject included in the first image data region on the transmissive display. The display control circuit displays the display data on the transmissive display.

Description

Semiconductor device, method, and head-mounted display

The present invention relates to a semiconductor device, a method, and a head-mounted display.

Conventionally, technologies such as head-mounted displays and head-up displays are known that superimpose images captured by a camera onto a see-through display.

With this technology, the user can simultaneously view both the transmitted image through the transmissive display and the captured image displayed on the transmissive display.

Patent No. 7246708; Japanese Patent Application Laid-Open No. 2008-96867; Patent No. 5855206

However, when a captured image is displayed on a transmissive display, visibility may be reduced at the boundary between the transmitted image and the captured image.

In one aspect, the present invention aims to improve the visibility of the boundary between a transmitted image and a captured image when the captured image is displayed on a transmissive display.

The semiconductor device of the embodiment includes a memory, a display data generation circuit, and a display control circuit. The memory stores first positional relationship data indicating the positional relationship between a subject included in a transmitted image transmitted through the transmissive display and a subject included in first image data captured by a first camera that captures the direction in which the back of the transmissive display faces. The display data generation circuit extracts and deforms at least a partial area of the first image data based on the first positional relationship data, and generates display data such that the outline of the subject included in the transmitted image and the outline of the subject included in the area of the first image data are continuous on the transmissive display. The display control circuit displays the display data on the transmissive display.

FIG. 1 is a diagram illustrating an example of the overall configuration of a head-mounted display according to the first embodiment.
FIG. 2 is a diagram illustrating an example of the configuration of an SoC according to the first embodiment.
FIG. 3 is a diagram showing an example of the difference between a video see-through display image and a transmission image according to the first embodiment.
FIG. 4 is a diagram showing an example of the installation position of the calibration camera according to the first embodiment.
FIG. 5 is a diagram illustrating an example of a subject for the calibration process according to the first embodiment.
FIG. 6 is a diagram showing an example of a subject 9b photographed from the side by the calibration camera according to the first embodiment.
FIG. 7 is a flowchart showing an example of the flow of the calibration process according to the first embodiment.
FIG. 8 is a diagram showing an example of the processing content of each step in the flowchart of FIG. 7.
FIG. 9 is a diagram showing an example of distortion contained in each image data in the calibration process according to the first embodiment.
FIG. 10 is a diagram illustrating an example of distortion contained in each image data when the head-mounted display according to the first embodiment is used.
FIG. 11 is a diagram showing an example of the breakdown of the modification process B in FIG. 10.
FIG. 12 is a flowchart showing an example of the flow of a process for generating display data when the head-mounted display according to the first embodiment is used.
FIG. 13 is a diagram showing an example of correction of video see-through image data according to the first embodiment.
FIG. 14 is a diagram illustrating the principle of visual assistance using the transmissive display and liquid crystal shutter according to the first embodiment.
FIG. 15 is a diagram illustrating an example of a display mode of the transmissive display according to the first embodiment.
FIG. 16 is a diagram showing an example of the flow of the bright place/dark place region extraction process according to the first embodiment.
FIG. 17 is a diagram illustrating an example of the flow of the EV value acquisition process according to the first embodiment.
FIG. 18 is a diagram showing an example of a threshold value determined based on a histogram according to the first embodiment.
FIG. 19 is a diagram showing an example of a general EV/Lux conversion table.
FIG. 20 is a diagram showing an example of the relationship between the EV value and the transmittance of the lens according to the first embodiment.
FIG. 21 is a diagram showing an example of a process for estimating EV values for each rectangular block of video see-through image data according to the first embodiment.
FIG. 22 is a diagram showing an example of the relationship between the brightness of a video see-through display image and the EV value in a scene according to the first embodiment.
FIG. 23 is a diagram showing an example of the relationship between the brightness of the video see-through display image according to the first embodiment and the luminance difference of the EV value in the scene.
FIG. 24 is a diagram showing an example of the relationship between the brightness of the video see-through display image and the average value of luminance for each block according to the first embodiment.
FIG. 25 is a diagram showing an example of the brightness adjustment process for video see-through image data according to the first embodiment.
FIG. 26 is a diagram showing an example of video see-through image data before and after correction according to the first embodiment.
FIG. 27 is a diagram illustrating an example of a light-blocking target region according to Modification 1 of the first embodiment.
FIG. 28 is a diagram illustrating an example of a case where a bright area exists according to Modification 2 of the first embodiment.
FIG. 29 is a diagram illustrating an example of a light-blocking target region according to Modification 2 of the first embodiment.
FIG. 30 is a diagram illustrating an example of the overall configuration of a head-mounted display according to the second embodiment.
FIG. 31 is a diagram showing an example of the positional relationship between the eye tracking camera and the user's eyes according to the second embodiment.
FIG. 32 is a diagram showing another example of the positional relationship between the eye tracking camera and the user's eyes according to the second embodiment.
FIG. 33 is a diagram illustrating an example of the configuration of an SoC according to the second embodiment.
FIG. 34 is a flowchart showing an example of the flow of processing for acquiring the correction amount of pupil position misalignment according to the second embodiment.
FIG. 35 is a flowchart showing an example of the flow of a process for generating display data when using the head-mounted display according to the second embodiment.

Embodiments of the semiconductor device, method, and head-mounted display disclosed herein will be described in detail below with reference to the accompanying drawings. Note that the following embodiments do not limit the disclosed technology. Furthermore, the embodiments can be combined as appropriate to the extent that the processing content is not contradictory.

(First embodiment)
FIG. 1 is a diagram showing an example of the overall configuration of a head-mounted display 1a according to the first embodiment. The head-mounted display 1a is equipped with high-sensitivity cameras (video see-through cameras 41a and 41b) that can capture images even in dark places, and supports the user's vision by displaying camera images in dark places.

Specifically, the head-mounted display 1a includes, for example, an eyeglass body 10, lenses 2a and 2b, transparent displays 3a and 3b, video see-through cameras 41a and 41b, display projectors 5a and 5b, an ambient light sensor 60, head tracking cameras 63a and 63b, and a system on a chip (SoC) 100a.

The lenses 2a, 2b, transmissive displays 3a, 3b, video see-through cameras 41a, 41b, display projectors 5a, 5b, ambient light sensor 60, head tracking cameras 63a, 63b, and SoC are fixed to the eyeglass body 10.

The eyeglass body 10 is shaped so that it can be worn on the user's head, similar to the frame of regular eyeglasses. The eyeglass body 10 includes, for example, a front portion that secures the lenses 2a and 2b, and temple portions that can be fastened over the user's ears.

Lenses 2a and 2b are transparent eyeglass lenses positioned in front of the eyes of a user wearing the head-mounted display 1a. Lenses 2a and 2b also function as liquid crystal shutters. This liquid crystal shutter function allows lenses 2a and 2b to adjust the amount of external light that passes through lenses 2a and 2b. Hereinafter, when there is no need to distinguish between the individual lenses 2a and 2b, they will simply be referred to as the lens 2.

Transparent displays 3a and 3b are provided on lenses 2a and 2b and are capable of displaying images. Furthermore, transparent displays 3a and 3b transmit external light. Therefore, a user wearing head-mounted display 1a can see both the transmitted image transmitted through transparent displays 3a and 3b and the display image displayed on transparent displays 3a and 3b. The display images displayed on transparent displays 3a and 3b are, for example, images obtained by subjecting captured image data captured by video see-through cameras 41a and 41b (described below) to processing such as deformation. Note that deformation is an example of correction.

The transparent displays 3a and 3b are provided in at least a portion of the area of the lenses 2a and 2b. In this embodiment, the transparent displays 3a and 3b are provided in a portion of the area located at the center of the lenses 2a and 2b. The transparent displays 3a and 3b are, for example, screens for optical see-through displays such as waveguides. Hereinafter, when there is no need to distinguish between the individual transparent displays 3a and 3b, they will simply be referred to as the transparent display 3.

Of the two surfaces of the transparent display 3, the surface facing the user wearing the head-mounted display 1a is referred to as the front surface. Furthermore, of the two surfaces of the transparent display 3, the surface facing away from the user wearing the head-mounted display 1a is referred to as the back surface. The user wearing the head-mounted display 1a is an example of an observer of the transparent display 3.

The video see-through cameras 41a and 41b are cameras that capture images in the direction in which the back of the transmissive display 3 faces. The direction in which the back of the transmissive display 3 faces corresponds to the front direction for a user wearing the head-mounted display 1a. More specifically, the video see-through cameras 41a and 41b are provided, for example, in positions above the centers of the lenses 2a and 2b of the eyeglass body 10. The video see-through cameras 41a and 41b are an example of a first camera in this embodiment. Furthermore, the captured image data captured by the video see-through cameras 41a and 41b is an example of first image data in this embodiment.

Video see-through cameras 41a and 41b are image sensors capable of capturing images in dark places, such as a SPAD (Single Photon Avalanche Diode) sensor or a highly sensitive CMOS (Complementary Metal Oxide Semiconductor) sensor. Alternatively, video see-through cameras 41a and 41b may be IR (Infrared Rays) sensors capable of capturing images in the dark by irradiating IR light. Hereinafter, when there is no need to distinguish between the individual video see-through cameras 41a and 41b, they will simply be referred to as the video see-through camera 41.

In this embodiment, the captured image data captured by the video see-through camera 41 is referred to as video see-through image data. Furthermore, when the video see-through image data is projected onto the transmissive display 3 by the display projector 5, the image displayed on the transmissive display 3 is referred to as a video see-through display image. Note that "when video see-through image data is projected onto the transmissive display 3 by the display projector 5" also includes when video see-through image data after various corrections is projected onto the transmissive display 3.

Display projectors 5a and 5b display images on transmissive displays 3a and 3b under the control of SoC 100a (described below). Display projectors 5a and 5b are, for example, micro LEDs (Light Emitting Diodes). Hereinafter, when there is no need to distinguish between individual display projectors 5a and 5b, they will simply be referred to as the display projector 5.

The Ambient Light Sensor 60, also known as an environmental light sensor, acquires ambient light information. For example, the Ambient Light Sensor 60 measures the intensity of ambient light.

The head tracking cameras 63a and 63b detect the movement of the head of a user wearing the head-mounted display 1a. Hereinafter, when there is no need to distinguish between the individual head tracking cameras 63a and 63b, they will simply be referred to as the head tracking camera 63.

The SoC 100a is a computer that controls each component of the head-mounted display 1a. The SoC 100a is an example of a semiconductor device in this embodiment.

The head-mounted display 1a may further include a depth sensor that measures distance, an IMU (Inertial Measurement Unit) that measures the posture of the user wearing the head-mounted display 1a, and the like. Examples of depth sensors that measure distance include, but are not limited to, a ToF (Time of Flight) sensor or a stereo camera.

Here, the configuration of the SoC 100a will be described in detail. FIG. 2 is a diagram showing an example of the configuration of the SoC 100a according to the first embodiment. As shown in FIG. 2, the SoC 100a includes I2C (Inter-Integrated Circuit) interfaces 11 and 22, a Mono ISP (Image Signal Processor) 12, a DSP (Digital Signal Processor) & AI (Artificial Intelligence) Accelerator 13, SRAMs (Static Random Access Memory) 14a to 14c, a GPU (Graphics Processing Unit) 15, a Color ISP 16, a Time Warp 17, a Warp 18, a Display Controller 19, a STAT (statistics block) 21, a CPU (Central Processing Unit) 23, and a DRAM (Dynamic Random Access Memory) Controller 24.

Although not shown in FIG. 1, the head-mounted display 1a further includes an IMU 61, a ToF sensor 62, a flash memory 31, and a DRAM 32 outside the SoC 100a. The flash memory 31 and the DRAM 32 store various types of data under the control of the SoC 100a.

Furthermore, the calibration cameras 42a and 42b shown in FIG. 2 are cameras provided outside the head-mounted display 1a. Hereinafter, when there is no need to distinguish between the individual calibration cameras 42a and 42b, they will simply be referred to as the calibration camera 42. The calibration camera 42 is an example of the second camera in this embodiment. Details of the calibration camera 42 will be described later with reference to FIG. 4.

I2C interfaces 11 and 22 are communication interfaces that perform synchronous serial communication. I2C interface 11 acquires acceleration and angular velocity from IMU 61. I2C interface 22 acquires the intensity of ambient light from Ambient Light Sensor 60.

Mono ISP 12 acquires information from various sensors used to recognize the surrounding situation and corrects the acquired information. For example, Mono ISP 12 acquires distance measurement data from surrounding objects from ToF sensor 62. Mono ISP 12 also acquires image data from head tracking cameras 63a and 63b, for example, and corrects brightness, etc.

The DSP & AI Accelerator 13 includes a DSP 131 and an AI Accelerator 132. The DSP & AI Accelerator 13 performs various tracking processes based on various data acquired and corrected by the Mono ISP 12. For example, the DSP & AI Accelerator 13 may perform head tracking processing to detect head movement of a user wearing the head-mounted display 1a. Alternatively, the DSP & AI Accelerator 13 may perform hand tracking processing or eye tracking processing, depending on the type of sensor. The DSP & AI Accelerator 13 outputs tracking information generated as a result of various tracking processes to the Time Warp 17. Note that various tracking processes are not required in this embodiment.

SRAMs 14a to 14c store various data used in various processes and data generated by various processes. For example, SRAM 14b is used as a temporary storage location for rendered image data and captured image data. It may also store first positional relationship data used in the image data transformation process performed by Warp 18, described below. SRAM capacity is usually small and may not be able to store the entire image. In such cases, SRAM may be used as a FIFO (First in First out) buffer to store only a portion of the image that has not yet been used in subsequent processing. SRAMs 14a to 14c are an example of memory in this embodiment. Note that each process performed by SoC 100a uses SRAMs 14a to 14c as a buffer for temporary data storage. While DRAM 32 can also be used as a buffer, using SRAMs 14a to 14c reduces power consumption. Note that the first positional relationship data stored in SRAM 14b uses data previously stored in flash memory 31. For this reason, flash memory 31 is included as an example of memory in this embodiment. Flash memory 31 is an external storage device, and may be NAND memory, an SSD, an SD card, or the like.

The GPU 15 performs rendering processing of the image data displayed on the transmissive display 3.

Time Warp 17 is a processing circuit that corrects delays in image data caused by processing by GPU 15. The time correction performed by Time Warp 17 has the effect of reducing the discomfort experienced by users who view image data displayed on the transmissive display 3, known as "VR (Virtual Reality) sickness." Time Warp 17 performs correction processing using, for example, the results of head tracking processing performed by DSP & AI Accelerator 13.

Color ISP 16 and Warp 18 acquire captured image data from the video see-through camera 41 and the calibration camera 42. Color ISP 16 converts the acquired image data into RGB (Red-Green-Blue color model) image data.

Warp 18 performs correction processing on the captured image data of the video see-through camera 41 that has been converted into RGB image data by Color ISP 16. The correction processing by Warp 18 includes, for example, a transformation processing based on the first positional relationship data for at least a partial area of the captured image data, and a processing for extracting at least a partial area of the captured image data as a display target. Through this transformation and extraction processing, Warp 18 generates video see-through image data so that the outline of the subject included in the transmitted image and the outline of the subject included in part of the captured image data are continuous on the transmissive display 3. The video see-through image data is an example of display data in this embodiment. Warp 18 is an example of a display data generation circuit in this embodiment.
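As a rough illustration of the kind of deformation Warp 18 performs, the sketch below warps video see-through image data into display coordinates by inverse mapping through a single 3x3 homography. Modelling the first positional relationship data as one homography is an assumption made only for this sketch; the embodiment describes the data more generally as the positional relationship between the subjects in the transmitted image and the captured image.

```python
import numpy as np

def warp_with_homography(src: np.ndarray, H_disp_to_src: np.ndarray,
                         out_h: int, out_w: int) -> np.ndarray:
    """Minimal nearest-neighbour warp: for every display pixel, look up the
    corresponding video see-through pixel through a 3x3 homography."""
    ys, xs = np.mgrid[0:out_h, 0:out_w]
    ones = np.ones_like(xs)
    pts = np.stack([xs, ys, ones], axis=-1).reshape(-1, 3).T   # 3 x N homogeneous coords
    mapped = H_disp_to_src @ pts
    mx = (mapped[0] / mapped[2]).round().astype(int)
    my = (mapped[1] / mapped[2]).round().astype(int)
    valid = (mx >= 0) & (mx < src.shape[1]) & (my >= 0) & (my < src.shape[0])
    out = np.zeros((out_h, out_w) + src.shape[2:], dtype=src.dtype)
    # Pixels that map outside the source stay black, i.e. are not displayed.
    out.reshape(out_h * out_w, *src.shape[2:])[valid] = src[my[valid], mx[valid]]
    return out
```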

STAT 21 identifies various pieces of information from the image data captured by the video see-through camera 41, which has been converted into an RGB image by Color ISP 16. For example, STAT 21 detects the information required for AE (Auto Exposure) and AWB (Auto White Balance). STAT 21 also extracts a histogram from the captured image data, calculates the average brightness value for each rectangular block of the captured image data, and counts the number of saturated pixels. A histogram is a graph that shows the distribution of brightness values of pixels in image data, and typically the horizontal axis represents brightness and the vertical axis represents the number of pixels.
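As a rough sketch of the statistics STAT 21 is described as producing, the function below computes a per-block average luminance map, a luminance histogram, and a saturated-pixel count from an 8-bit RGB frame. The block size, luma weights, and bin count are assumptions for illustration only.

```python
import numpy as np

def block_luma_stats(rgb: np.ndarray, block: int = 64):
    """Per-block average luminance, a 256-bin luminance histogram, and a
    saturated-pixel count for an HxWx3 uint8 frame (all parameters assumed)."""
    luma = rgb[..., 0] * 0.299 + rgb[..., 1] * 0.587 + rgb[..., 2] * 0.114
    h, w = luma.shape
    by, bx = -(-h // block), -(-w // block)          # ceiling division
    means = np.zeros((by, bx), dtype=np.float32)
    for i in range(by):
        for j in range(bx):
            means[i, j] = luma[i * block:(i + 1) * block,
                               j * block:(j + 1) * block].mean()
    hist, _ = np.histogram(luma, bins=256, range=(0, 256))
    saturated = int((rgb >= 255).any(axis=-1).sum())  # pixels with any saturated channel
    return means, hist, saturated
```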

By performing processing by Color ISP 16, Warp 18, and STAT 21 with low latency, it is possible to reduce VR sickness experienced by a user wearing the head-mounted display 1a.

The CPU 23 performs AE and AWB based on information acquired via the STAT 21 and the I2C interface 22. The CPU 23 also performs extraction processing of bright and dark areas based on the brightness of the scene and the luminance of the captured image data. Based on the results of these processes, the CPU 23 generates transmittance data for controlling the liquid crystal shutter of the lens 2a. Details of the extraction processing of bright and dark areas by the CPU 23 will be described later. The CPU 23 is an example of an image processing circuit in this embodiment.

The display controller 19 controls the display projector 5 to display an image on the transmissive display 3. The display controller 19 also controls the degree of transmittance of the liquid crystal shutter of the lens 2 based on transmittance data generated by the CPU 23. The display controller 19 also corrects the video see-through image data corrected by the Warp 18 and the rendering image data rendered by the GPU 15 into a format suitable for display on the transmissive display 3. The display controller 19 is an example of a display control circuit in this embodiment.

More specifically, the Display Controller 19 comprises an EN (Enable) block 191 and a Blend block 192. The EN block 191 determines whether the display of saturated pixels is on or off. If the EN block 191 determines that the pixel is off, it corrects the pixel to be hidden (black). If the EN block 191 determines that the pixel is on, it corrects the brightness of the displayed image corresponding to the pixel. The Blend block 192 combines the rendering image data and the video see-through image data.
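A toy model of the EN block 191 and Blend block 192 might look like the following: per-pixel enable flags force non-displayed pixels to black, and the remaining video see-through pixels are combined with the rendered layer. Treating the combination as a simple additive blend with a single global alpha is an assumption made only for this illustration.

```python
import numpy as np

def compose_display(rendered: np.ndarray, vst: np.ndarray,
                    enable: np.ndarray, alpha: float = 1.0) -> np.ndarray:
    """Combine HxWx3 uint8 rendered and video see-through layers using an
    HxW boolean enable mask (assumed shapes for this sketch)."""
    vst_masked = np.where(enable[..., None], vst, 0)   # EN: display ON/OFF per pixel
    out = rendered.astype(np.float32) + alpha * vst_masked.astype(np.float32)
    return out.clip(0, 255).astype(np.uint8)           # black pixels emit no light on the display
```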

The DRAM controller 24 controls the storage of various data in the DRAM 32 and the reading of various data from the DRAM 32.

Note that the configuration of the head-mounted display 1a in this embodiment is not limited to the example shown in Figures 1 and 2. For example, the head-mounted display 1a does not necessarily include the head tracking camera 63a, IMU 61, and ToF sensor 62. Furthermore, the head-mounted display 1a may also include other sensors, etc.

 次に、ヘッドマウントディスプレイ1aの使用前のキャリブレーション処理について説明する。キャリブレーション処理は、例えば、ヘッドマウントディスプレイ1aの出荷前に行われる処理である。より詳細には、キャリブレーション処理は、ヘッドマウントディスプレイ1aを装着したユーザの目の位置とビデオシースルー用カメラ41の搭載位置の差異等を加味した補正のための処理である。 Next, we will explain the calibration process performed before using the head-mounted display 1a. The calibration process is performed, for example, before the head-mounted display 1a is shipped. More specifically, the calibration process is a process for making corrections that take into account differences between the eye position of a user wearing the head-mounted display 1a and the mounting position of the video see-through camera 41.

 ヘッドマウントディスプレイ1aをユーザが装着する場合、当該ユーザの両眼は、一般的にはそれぞれレンズ2a,2bの中心付近に位置する。このため、レンズ2a,2b及び透過ディスプレイ3a,3bを透過した透過像は、レンズ2a,2bの中心付近の視点を基準とした像となる。これに対して、ビデオシースルー用カメラ41a,41bをユーザの目の光軸の位置に合わせて搭載することは構成上困難である。このため、図1に示したように、ビデオシースルー用カメラ41a,41bは、レンズ2a,2bの中心付近ではなく、ヘッドマウントディスプレイ1aの眼鏡本体部10に設けられる。 When a user wears the head-mounted display 1a, the user's eyes are generally positioned near the centers of the lenses 2a and 2b, respectively. Therefore, the transmitted image that passes through the lenses 2a and 2b and the transmissive displays 3a and 3b is based on a viewpoint near the centers of the lenses 2a and 2b. However, it is structurally difficult to mount the video see-through cameras 41a and 41b so that they are aligned with the optical axis of the user's eyes. For this reason, as shown in Figure 1, the video see-through cameras 41a and 41b are provided in the eyeglass body 10 of the head-mounted display 1a, rather than near the centers of the lenses 2a and 2b.

 このような理由により、ビデオシースルー用カメラ41a,41bによって撮影されたビデオシースルー画像データの視点と、ヘッドマウントディスプレイ1aを装着したユーザの視点とに差異が生じる。また、透過ディスプレイ3とビデオシースルー用カメラ41の画角の違いやレンズ歪による映像の歪も、ずれを生じる要因となる。なお、レンズ歪には、ビデオシースルー用カメラ41のレンズの特性と、ヘッドマウントディスプレイ1aのレンズ2の歪の両方がある。 For these reasons, there is a difference between the viewpoint of the video see-through image data captured by the video see-through cameras 41a and 41b and the viewpoint of the user wearing the head-mounted display 1a. Differences in the angle of view between the transparent display 3 and the video see-through camera 41, as well as distortion of the image due to lens distortion, also contribute to the misalignment. Note that lens distortion is caused by both the characteristics of the lens of the video see-through camera 41 and distortion of the lens 2 of the head-mounted display 1a.

 FIG. 3 is a diagram showing an example of the difference between the video see-through display images 931 and 931a and the transmitted image 941 according to the first embodiment. The video see-through display image 931 is an image in which the uncorrected video see-through image data is displayed as is on the transmissive display 3. In this case, as shown in FIG. 3, a difference arises between the position and/or size of the object (subject 9a) included in the video see-through display image 931 and those of the object (subject 9a) included in the transmitted image.

 図3ではビデオシースルー表示画像931と透過像941とを個別に表示しているが、実際にはビデオシースルー表示画像931は透過像941に重畳して透過ディスプレイ3上に表示される。この場合、ビデオシースルー表示画像931内の被写体9aと透過像941内の被写体9aの輪郭が連続せず、2重像になる。このため、ヘッドマウントディスプレイ1aを装着したユーザは、ビデオシースルー表示画像931内の被写体9aと透過像941内の被写体9aの両方を2重像として見ることとなり、視認性が低下する。なお、本実施形態において、「被写体」は、撮影画像データに描出された物体だけではなく、透過像に含まれる物体も含む。なお、物体には人も含む。 In Figure 3, the video see-through display image 931 and the transmission image 941 are displayed separately, but in reality, the video see-through display image 931 is displayed superimposed on the transmission image 941 on the transmission display 3. In this case, the contours of the subject 9a in the video see-through display image 931 and the subject 9a in the transmission image 941 are not continuous, resulting in a double image. As a result, a user wearing the head-mounted display 1a sees both the subject 9a in the video see-through display image 931 and the subject 9a in the transmission image 941 as a double image, reducing visibility. Note that in this embodiment, the "subject" includes not only objects depicted in the captured image data, but also objects included in the transmission image. Note that objects also include people.

 このため、SoC100aは、ヘッドマウントディスプレイ1aの使用時においてビデオシースルー画像データを補正した上で透過ディスプレイ3に表示させる。補正後のビデオシースルー画像データに基づくビデオシースルー表示画像931a内の被写体9aの位置及び大きさは、透過像941内の被写体9aの位置及び大きさと一致する。この場合、被写体9aが透過ディスプレイ3上で2重像にならないため、ユーザの視認性は低下しない。キャリブレーション処理では、このような使用時の補正に使用される第1位置関係データを取得する。第1位置関係データは、透過ディスプレイ3を透過する透過像941に含まれる被写体9aと、ビデオシースルー用カメラ41により撮影される第1画像データに含まれる被写体との位置関係を示すデータである。例えば、第1位置関係データは、変形(Warp)処理における第1画像データの位置合わせ補正量を表す。位置合わせ補正量は、例えば、変形処理の前後の特徴点の位置関係を表す座標により定義される。 For this reason, when the head-mounted display 1a is in use, the SoC 100a corrects the video see-through image data before displaying it on the transmissive display 3. The position and size of the subject 9a in the video see-through display image 931a based on the corrected video see-through image data match the position and size of the subject 9a in the transmissive image 941. In this case, the subject 9a does not appear as a double image on the transmissive display 3, so user visibility is not reduced. The calibration process acquires first positional relationship data used for such correction during use. The first positional relationship data is data indicating the positional relationship between the subject 9a included in the transmissive image 941 transmitted through the transmissive display 3 and the subject included in the first image data captured by the video see-through camera 41. For example, the first positional relationship data represents the amount of alignment correction of the first image data in the warp process. The alignment correction amount is defined, for example, by coordinates that represent the positional relationship of feature points before and after the warp process.

 FIG. 4 is a diagram showing an example of the installation positions of the calibration cameras 42a and 42b according to the first embodiment. As shown in FIG. 4, the calibration cameras 42a and 42b are installed at positions corresponding to the eye positions of a user wearing the head-mounted display 1a. For example, as shown in FIG. 4, the calibration cameras 42a and 42b are installed near the centers of the lenses 2a and 2b. From these positions, the calibration cameras 42a and 42b capture, through the transmissive display 3 from its front side, the direction in which the back of the transmissive display 3 faces. The calibration cameras 42a and 42b therefore photograph, through the transmissive display 3, a subject located on the back side of the transmissive display 3. Note that the calibration camera 42 is not used after the calibration process and is therefore not included in the configuration of the head-mounted display 1a.

 キャリブレーション処理用の被写体としては、特徴点の抽出が容易なものが望ましい。 It is desirable for the subject used for calibration processing to be one from which feature points can be easily extracted.

 図5は、第1の実施形態に係るキャリブレーション処理の被写体9bの一例を示す図である。本実施形態においては、図5に示すように、キャリブレーション処理用の被写体9bとして白黒の矩形が格子状に並んだチェッカーボードチャートを使用する。チェッカーボードチャートの格子点は、特徴点90の一例である。なお、図5では1つの格子点を特徴点90の一例として図示しているが、キャリブレーション用カメラ42及びビデオシースルー用カメラ41の撮影画像データにおける全ての格子点が特徴点90となる。なお、キャリブレーション処理用の被写体9bは図5に示す例に限定されない。 FIG. 5 is a diagram showing an example of a subject 9b for calibration processing according to the first embodiment. In this embodiment, as shown in FIG. 5, a checkerboard chart in which black and white rectangles are arranged in a grid pattern is used as the subject 9b for calibration processing. The grid points of the checkerboard chart are examples of feature points 90. Note that while FIG. 5 shows one grid point as an example of a feature point 90, all grid points in the image data captured by the calibration camera 42 and the video see-through camera 41 become feature points 90. Note that the subject 9b for calibration processing is not limited to the example shown in FIG. 5.

 図6は、第1の実施形態に係るキャリブレーション用カメラ42による被写体9bの撮影を横側から見た一例を示す図である。図6に示すように、ビデオシースルー用カメラ41及びキャリブレーション用カメラ42は共に、透過ディスプレイ3の背面302側の方向に位置する被写体9bを撮影する。また、キャリブレーション用カメラ42は、レンズ2及び透過ディスプレイ3を透過して、透過ディスプレイ3の前面301側から、背面302側に向けて被写体9bを撮影する。ビデオシースルー用カメラ41は、レンズ2及び透過ディスプレイ3は透過せずに、被写体9bを撮影する。 FIG. 6 is a diagram showing an example of a side view of subject 9b photographed by the calibration camera 42 according to the first embodiment. As shown in FIG. 6, both the video see-through camera 41 and the calibration camera 42 photograph subject 9b located toward the back surface 302 of the transmissive display 3. The calibration camera 42 photographs subject 9b from the front surface 301 of the transmissive display 3 toward the back surface 302, passing through the lens 2 and the transmissive display 3. The video see-through camera 41 photographs subject 9b without passing through the lens 2 or the transmissive display 3.

 ここで、図4~6で説明したキャリブレーション用カメラ42を用いたキャリブレーション処理の流れについて説明する。 Here, we will explain the flow of the calibration process using the calibration camera 42 described in Figures 4 to 6.

 図7は、第1の実施形態に係るキャリブレーション処理の流れの一例を示すフローチャートである。また、図8は、図7のフローチャートの各ステップの処理内容の一例を示す図である。図8のStep1-7は、図7のS1~S7に対応する。 FIG. 7 is a flowchart showing an example of the flow of calibration processing according to the first embodiment. FIG. 8 is a diagram showing an example of the processing content of each step in the flowchart of FIG. 7. Steps 1-7 in FIG. 8 correspond to S1-S7 in FIG. 7.

 まず、Color ISP16は、キャリブレーション用カメラ42で被写体9bを撮影した画像データ(図8に示す第1のキャリブレーション用画像データ901)を取得する(S1)。第1のキャリブレーション用画像データ901は、透過ディスプレイ3の前面301側から透過ディスプレイ3を透過して透過ディスプレイ3の背面302側に位置する被写体9bが撮影された画像データである。第1のキャリブレーション用画像データ901は、本実施形態における第2画像データの一例である。なお、キャリブレーション用カメラ42はヘッドマウントディスプレイには搭載せず、市販のデジタルカメラやWebカメラ等を用いても良い。その場合、撮影した画像データを撮影後にデジタルカメラやWebカメラ等からSoC100aに転送して以降の処理を行う。 First, Color ISP 16 acquires image data (first calibration image data 901 shown in FIG. 8) of subject 9b photographed by calibration camera 42 (S1). The first calibration image data 901 is image data of subject 9b photographed from the front surface 301 of the transparent display 3, passing through the transparent display 3 to the rear surface 302 of the transparent display 3. The first calibration image data 901 is an example of second image data in this embodiment. Note that calibration camera 42 does not have to be mounted on a head-mounted display, and a commercially available digital camera, web camera, etc. may be used. In this case, the photographed image data is transferred from the digital camera, web camera, etc. to SoC 100a after photographing, and subsequent processing is performed.

 そして、CPU23は、第1のキャリブレーション用画像データ901内の被写体9bの特徴点90を抽出する(S2)。第1のキャリブレーション用画像データ901から抽出された特徴点90を、第1の特徴点とする。特徴点90を抽出するとは、第1のキャリブレーション用画像データ901に描出された特徴点90(例えば、チェッカーボードチャートの格子点)の位置を特定することである。 Then, the CPU 23 extracts feature points 90 of the subject 9b in the first calibration image data 901 (S2). The feature points 90 extracted from the first calibration image data 901 are designated as first feature points. Extracting feature points 90 means identifying the positions of the feature points 90 (for example, grid points of a checkerboard chart) depicted in the first calibration image data 901.

 ここで、第1のキャリブレーション用画像データ901は、ヘッドマウントディスプレイ1aを装着したユーザから見える透過像を疑似的に画像化したものであるが、キャリブレーション用カメラ42のレンズ歪等のため、実際にユーザから見える透過像とは異なる。なお、レンズ歪を含むキャリブレーション用カメラ42の特性に起因する歪みを、カメラ歪という。CPU23は、カメラ歪の除去のため、S2の第1の特徴点の抽出の前に、キャリブレーション用カメラ42の内部パラメータを用いて、第1のキャリブレーション用画像データ901を補正する。内部パラメータによる補正は、表示系による歪を打ち消す逆補正である。当該処理により、CPU23は、第1のキャリブレーション用画像データ901を、ヘッドマウントディスプレイ1aを装着したユーザから見える透過像に相当する状態に変換する。図8では、変換後の画像データを疑似透過像データ911として示す。 Here, the first calibration image data 901 is a pseudo-image of the transmission image seen by a user wearing the head-mounted display 1a, but differs from the transmission image actually seen by the user due to lens distortion of the calibration camera 42, etc. Distortion caused by the characteristics of the calibration camera 42, including lens distortion, is called camera distortion. To remove the camera distortion, the CPU 23 corrects the first calibration image data 901 using the internal parameters of the calibration camera 42 before extracting the first feature point in S2. Correction using the internal parameters is an inverse correction that cancels out distortion caused by the display system. Through this processing, the CPU 23 converts the first calibration image data 901 into a state equivalent to the transmission image seen by a user wearing the head-mounted display 1a. In Figure 8, the converted image data is shown as pseudo transmission image data 911.

 キャリブレーション用カメラ42の内部パラメータは、キャリブレーション用カメラ42のカメラ歪の影響を除去するための補正パラメータである。内部パラメータは、例えば、OpenCV(登録商標)等の公知のキャリブレーションツールを使用して算出することが可能である。内部パラメータは、例えば、SoC100aの外部の情報処理装置等によって図7の処理の前に算出済みで、SoC100a内のメモリ等に記憶されていてもよい。また、ビデオシースルー用カメラ41も、キャリブレーション用カメラ42と同様にカメラ歪を有する。このため、ビデオシースルー用カメラ41のカメラ歪の影響を除去するための内部パラメータも同様に、SoC100a内のメモリ等に記憶されていてもよい。 The internal parameters of the calibration camera 42 are correction parameters for removing the effects of camera distortion of the calibration camera 42. The internal parameters can be calculated using a known calibration tool such as OpenCV (registered trademark). The internal parameters may be calculated by an information processing device external to the SoC 100a before the processing of Figure 7 and stored in memory within the SoC 100a. Furthermore, the video see-through camera 41, like the calibration camera 42, also has camera distortion. Therefore, the internal parameters for removing the effects of camera distortion of the video see-through camera 41 may also be stored in memory within the SoC 100a.
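For reference, the following sketch shows how internal parameters of this kind could be estimated with OpenCV from several captures of the checkerboard chart, and how they would then be used to remove camera distortion. The corner-pattern size and square length are placeholders; only the general use of a standard calibration tool is suggested by the description.

```python
import cv2
import numpy as np

def estimate_internal_parameters(gray_images, pattern=(9, 6), square=0.025):
    """Estimate camera intrinsics and distortion coefficients with OpenCV.

    gray_images: grayscale captures of the checkerboard chart. The corner
    pattern (9 x 6) and the 25 mm square size are placeholders; only the use
    of a standard calibration tool is suggested by the description.
    """
    objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
    objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2) * square
    obj_pts, img_pts = [], []
    for gray in gray_images:
        found, corners = cv2.findChessboardCorners(gray, pattern)
        if found:
            obj_pts.append(objp)
            img_pts.append(corners)
    _, K, dist, _, _ = cv2.calibrateCamera(
        obj_pts, img_pts, gray_images[0].shape[::-1], None, None)
    return K, dist

# Removing camera distortion with the estimated internal parameters:
#   undistorted = cv2.undistort(captured_image, K, dist)
```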

 次に、Color ISP16は、ビデオシースルー用カメラ41から、ビデオシースルー用カメラ41で被写体9bを撮影した画像データ(図8に示すビデオシースルー画像データ921)を取得する(S3)。 Next, Color ISP 16 acquires image data (video see-through image data 921 shown in Figure 8) of subject 9b captured by video see-through camera 41 from video see-through camera 41 (S3).

 そして、Display Controller19は、ディスプレイプロジェクタ5を制御して、ビデオシースルー画像データ921を透過ディスプレイ3に表示させる(S4)。図8では、透過ディスプレイ3に表示されたビデオシースルー画像データ921を、ビデオシースルー表示画像932として示す。また、S4の処理の後、被写体9bは、ヘッドマウントディスプレイ1aの前から作業者等によって除かれる。このため、キャリブレーション用カメラ42の画角に被写体9bが入らない状態となる。 Then, the display controller 19 controls the display projector 5 to display the video see-through image data 921 on the transparent display 3 (S4). In Figure 8, the video see-through image data 921 displayed on the transparent display 3 is shown as a video see-through display image 932. Furthermore, after the processing of S4, the subject 9b is removed from in front of the head-mounted display 1a by an operator or the like. As a result, the subject 9b is no longer within the angle of view of the calibration camera 42.

 次に、Color ISP16は、ビデオシースルー表示画像932が表示された透過ディスプレイ3をキャリブレーション用カメラ42で撮影した画像データを、キャリブレーション用カメラ42から取得する(S5)。当該画像データは、図8に示す第2のキャリブレーション用画像データ902である。また、第2のキャリブレーション用画像データ902は、本実施形態における第3画像データの一例である。 Next, Color ISP 16 acquires image data from the calibration camera 42, which is an image of the transmissive display 3 displaying the video see-through display image 932 (S5). This image data is the second calibration image data 902 shown in FIG. 8. The second calibration image data 902 is also an example of the third image data in this embodiment.

 そして、CPU23は、第2のキャリブレーション用画像データ902内の被写体9bの特徴点90を抽出する(S6)。第2のキャリブレーション用画像データ902から抽出された特徴点90を、第2の特徴点とする。より詳細には、S2における処理と同様に、CPU23は、第2の特徴点の抽出の前にキャリブレーション用カメラ42の内部パラメータを用いて、第2のキャリブレーション用画像データ902のカメラ歪を補正する。補正後の第2のキャリブレーション用画像データ902は、ヘッドマウントディスプレイ1aを装着したユーザから見えるビデオシースルー表示画像を疑似的に再現したものであるため、疑似シースルー表示画像データ940という。CPU23は、疑似シースルー表示画像データ940から第2の特徴点を抽出する。 Then, the CPU 23 extracts feature points 90 of the subject 9b in the second calibration image data 902 (S6). The feature points 90 extracted from the second calibration image data 902 are designated as second feature points. More specifically, similar to the processing in S2, the CPU 23 corrects camera distortion of the second calibration image data 902 using internal parameters of the calibration camera 42 before extracting the second feature points. The corrected second calibration image data 902 is a pseudo-reproduction of the video see-through display image seen by a user wearing the head-mounted display 1a, and is therefore referred to as pseudo see-through display image data 940. The CPU 23 extracts the second feature points from the pseudo see-through display image data 940.

 そして、CPU23は、S2で抽出した第1の特徴点と、S6で抽出した第2の特徴点から変形パラメータを算出する(S7)。当該変形パラメータは、上述の第1位置関係データである。当該変形パラメータは、疑似透過像データ911に含まれる被写体9bと、ビデオシースルー用カメラ41により撮影されるビデオシースルー画像データ921に含まれる被写体9bとの位置関係を示すデータである。キャリブレーション処理における疑似透過像データ911は実際のヘッドマウントディスプレイ1aの使用時における透過像に相当する。また、キャリブレーション処理においてビデオシースルー画像データ921から生成された疑似シースルー表示画像データ940は実際のヘッドマウントディスプレイ1aの使用時におけるビデオシースルー表示画像に相当する。このため、S7で算出される変形パラメータは、透過像に含まれる被写体と、ビデオシースルー用カメラ41により撮影されるビデオシースルー画像データに含まれる被写体との位置関係を示す。 The CPU 23 then calculates deformation parameters from the first feature points extracted in S2 and the second feature points extracted in S6 (S7). The deformation parameters are the first positional relationship data described above. The deformation parameters are data indicating the positional relationship between the subject 9b included in the pseudo-transmitted image data 911 and the subject 9b included in the video see-through image data 921 captured by the video see-through camera 41. The pseudo-transmitted image data 911 in the calibration process corresponds to the transmitted image when the head-mounted display 1a is actually in use. Furthermore, the pseudo-see-through display image data 940 generated from the video see-through image data 921 in the calibration process corresponds to the video see-through display image when the head-mounted display 1a is actually in use. Therefore, the deformation parameters calculated in S7 indicate the positional relationship between the subject included in the transmitted image and the subject included in the video see-through image data captured by the video see-through camera 41.

 より詳細には、S7の変形パラメータ(第1位置関係データ)の算出処理では、CPU23は、第1の特徴点と第2の特徴点との位置関係を計算することにより、第1の特徴点と第2の特徴点とを一致させることができる変形パラメータを算出する。図8のStep7-1に示すように、CPU23は、疑似シースルー表示画像データ940を疑似透過像データ911と一致するように補正可能な変形パラメータを算出する。また、図8のStep7-2に示すように、CPU23は、ビデオシースルー画像データ921における被写体9bの特徴点90と疑似シースルー表示画像データ940における被写体9bの特徴点90とに基づいて、表示系によって生じる歪量を算出する。表示系によって生じる歪量とは、レンズ2及び透過ディスプレイ3の特性により生じる画像の歪の大きさである。レンズ2及び透過ディスプレイ3の特性とは、例えば、レンズ2及び透過ディスプレイ3の屈曲等である。CPU23は、Step7-1で算出した変形パラメータとStep7-2で算出した歪量とに基づいて、変形パラメータ(第1位置関係データ)を算出する。さらに詳細には、CPU23は、Step7-1で算出した変形パラメータに対する変換処理を行う。当該変換処理については図9-11で後述する。 More specifically, in the calculation process of the deformation parameters (first positional relationship data) in S7, the CPU 23 calculates the positional relationship between the first feature point and the second feature point, thereby calculating deformation parameters that can match the first feature point and the second feature point. As shown in Step 7-1 of FIG. 8, the CPU 23 calculates deformation parameters that can correct the pseudo see-through display image data 940 so that it matches the pseudo transmitted image data 911. Also, as shown in Step 7-2 of FIG. 8, the CPU 23 calculates the amount of distortion caused by the display system based on the feature point 90 of the subject 9b in the video see-through image data 921 and the feature point 90 of the subject 9b in the pseudo see-through display image data 940. The amount of distortion caused by the display system is the magnitude of the image distortion caused by the characteristics of the lens 2 and the transparent display 3. The characteristics of the lens 2 and the transparent display 3 include, for example, the curvature of the lens 2 and the transparent display 3. The CPU 23 calculates the deformation parameters (first positional relationship data) based on the deformation parameters calculated in Step 7-1 and the distortion amount calculated in Step 7-2. More specifically, the CPU 23 performs a conversion process on the deformation parameters calculated in Step 7-1. This conversion process will be described later with reference to Figures 9-11.
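As a concrete illustration of Step 7-1, the sketch below fits a warp that maps the second feature points onto the first feature points. The description does not fix the warp model, so a planar homography computed with OpenCV is used here purely as an example; a mesh- or grid-based deformation could be fitted in the same way.

```python
import cv2
import numpy as np

def estimate_deformation_a(pts_second, pts_first):
    """Fit a warp that maps the second feature points (pseudo see-through
    display image) onto the first feature points (pseudo transmitted image).

    The warp model is not fixed by the description; a planar homography is
    used here only as an illustrative choice. pts_* are N x 2 arrays of
    corner coordinates in pixels.
    """
    H, _ = cv2.findHomography(pts_second.astype(np.float32),
                              pts_first.astype(np.float32), method=0)
    return H

# Applying the fitted warp to the pseudo see-through display image:
#   aligned = cv2.warpPerspective(pseudo_see_through, H, (width, height))
```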

 The CPU 23 then stores the calculated deformation parameters (first positional relationship data) in a memory such as the SRAM 14b or the Flash memory 31 (S8). At this point, the processing of this flowchart ends.

 図7に示すキャリブレーション処理は、上述のようにSoC100aによって実行されてもよいし、ヘッドマウントディスプレイ1aの外部の他の情報処理装置等によって実行されてもよい。他の情報処理装置は、例えば、高性能なPC(Personal Computer)等であってもよい。また、1つのヘッドマウントディスプレイ1aのSoC100aによって実行されたキャリブレーション処理で生成された第1位置関係データが、複数のヘッドマウントディスプレイ1aのメモリに格納されてもよい。 The calibration process shown in FIG. 7 may be executed by the SoC 100a as described above, or may be executed by another information processing device external to the head-mounted display 1a. The other information processing device may be, for example, a high-performance PC (Personal Computer). In addition, the first positional relationship data generated by the calibration process executed by the SoC 100a of one head-mounted display 1a may be stored in the memory of multiple head-mounted displays 1a.

 ここで、図7、8で説明した変形パラメータ(第1位置関係データ)の算出処理について、図9-11を用いてより具体的に説明する。 Here, the calculation process for the transformation parameters (first positional relationship data) described in Figures 7 and 8 will be explained in more detail using Figures 9-11.

 図9は、第1の実施形態に係るキャリブレーション処理における各画像データに含まれる歪の一例を示す図である。上述の図8に図示されたStep7-1で直接的に算出可能な変形パラメータは、キャリブレーション用カメラ42によって撮影された第2のキャリブレーション用画像データ902に基づく疑似シースルー表示画像データ940に対する変形パラメータAである。 FIG. 9 is a diagram showing an example of distortion contained in each image data in the calibration process according to the first embodiment. The deformation parameter that can be directly calculated in Step 7-1 shown in FIG. 8 above is deformation parameter A for the pseudo see-through display image data 940 based on the second calibration image data 902 captured by the calibration camera 42.

 図9に示すように、図7のS3で取得されたビデオシースルー画像データ921は、Step4で透過ディスプレイ3に表示される際に表示系による歪の影響を受ける。このため、Step5で取得される第2のキャリブレーション用画像データ902も、当該表示系による歪を含む。また、第2のキャリブレーション用画像データ902はキャリブレーション用カメラ42による歪(カメラ歪)を含む。当該カメラ歪についてはStep6においてCPU23が内部パラメータを用いた補正で除去する。そして、CPU23は、Step7-1で、カメラ歪が除去された疑似シースルー表示画像データ940を疑似透過像データ911と一致するように補正可能な変形パラメータAを算出する。変形パラメータAによる変形後の疑似シースルー表示画像データ940aは、疑似透過像データ911と一致する。 As shown in FIG. 9, the video see-through image data 921 acquired in S3 of FIG. 7 is affected by distortion due to the display system when it is displayed on the transmissive display 3 in Step 4. For this reason, the second calibration image data 902 acquired in Step 5 also includes distortion due to the display system. The second calibration image data 902 also includes distortion (camera distortion) due to the calibration camera 42. In Step 6, the CPU 23 removes this camera distortion through correction using internal parameters. Then, in Step 7-1, the CPU 23 calculates a deformation parameter A that can correct the pseudo see-through display image data 940 from which the camera distortion has been removed so that it matches the pseudo transmission image data 911. The pseudo see-through display image data 940a after deformation using the deformation parameter A matches the pseudo transmission image data 911.

 しかしながら、ヘッドマウントディスプレイ1aの使用時には、キャリブレーション用カメラ42は用いられないため、使用時における変形対象はビデオシースルー画像データ921となる。 However, when the head-mounted display 1a is in use, the calibration camera 42 is not used, and therefore the object to be deformed during use is the video see-through image data 921.

 図10は、第1の実施形態に係るヘッドマウントディスプレイ1aの使用時における各画像データに含まれる歪の一例を示す図である。また、図11は、図10の変形処理Bの内訳の一例を示す図である。 FIG. 10 is a diagram showing an example of distortion contained in each image data when the head-mounted display 1a according to the first embodiment is in use. Also, FIG. 11 is a diagram showing an example of the breakdown of deformation process B in FIG. 10.

 図10に示すように、ヘッドマウントディスプレイ1aの使用時における補正処理は、透過ディスプレイ3に表示されたビデオシースルー表示画像933内の被写体の特徴点が透過像内の被写体の特徴点と一致することを目的とする。 As shown in Figure 10, the correction process when using the head-mounted display 1a aims to ensure that the feature points of the subject in the video see-through display image 933 displayed on the transmissive display 3 match the feature points of the subject in the transmissive image.

 このため、ヘッドマウントディスプレイ1aの使用時の変形処理Bでは、変形後のビデオシースルー画像データ921aに生じる歪みを考慮した変形パラメータBが必要となる。 For this reason, deformation process B when using the head-mounted display 1a requires deformation parameters B that take into account the distortion that occurs in the video see-through image data 921a after deformation.

 具体的には、ビデオシースルー画像データ921が透過ディスプレイ3に表示される場合、表示系による歪、及びビデオシースルー用カメラ41のカメラ歪の影響がある。カメラ歪の影響については、内部パラメータによる補正によって除去されるため、変形処理では考慮が不要となる。 Specifically, when the video see-through image data 921 is displayed on the transmissive display 3, it is affected by distortion due to the display system and camera distortion of the video see-through camera 41. The effect of camera distortion is removed by correction using internal parameters, so it does not need to be taken into account in the transformation process.

 As shown in FIG. 11, deformation process B is therefore a composition of the deformation process S701 that applies the distortion caused by the display system, the deformation process S702 that applies deformation parameter A calculated in the calibration process, and the inverse deformation process S703 that cancels the distortion caused by the display system. The CPU 23 obtains deformation parameter B by combining the deformation parameters of processes S701 to S703. Deformation parameter B indicates, for example, the difference in the positions of the feature points 90 (for example, lattice points) of the subject 9b before and after processes S701 to S703. This deformation parameter B is the first positional relationship data.
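A minimal sketch of this composition is shown below, assuming each of the three steps is available as a function that maps pixel coordinates to warped coordinates. Representing the warps as coordinate functions (rather than, say, dense remap tables) is an illustrative choice.

```python
import numpy as np

def compose_deformation_b(display_distort, warp_a, display_undistort, grid_points):
    """Compose the three steps of FIG. 11 into deformation parameter B.

    Each of the first three arguments is assumed to be a function mapping an
    (N, 2) array of pixel coordinates to warped coordinates; grid_points holds
    the lattice (feature point) positions. Representing the warps as coordinate
    functions is an illustrative choice.
    """
    p = display_distort(grid_points)    # S701: apply the display-system distortion
    p = warp_a(p)                       # S702: apply deformation parameter A
    p = display_undistort(p)            # S703: cancel the display-system distortion
    # Deformation parameter B: displacement of each lattice point over S701-S703.
    return p - np.asarray(grid_points, dtype=np.float64)
```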

 次に、ヘッドマウントディスプレイ1aの使用時の処理について説明する。 Next, we will explain the processing that occurs when using the head-mounted display 1a.

 FIG. 12 is a flowchart showing an example of the flow of the display data generation process when the head-mounted display 1a according to the first embodiment is in use. It is assumed that, before the processing of FIG. 12, the calibration process of FIG. 7 has been completed and the deformation parameters (first positional relationship data) have been stored in a memory such as the SRAM 14b or the Flash memory 31.

 まず、Color ISP16は、ビデオシースルー用カメラ41からビデオシースルー画像データを取得する(S21)。また、Color ISP16は、取得したビデオシースルー画像データをRGB画像データに変換する。STAT21は、Color ISP16によってRGB画像に変換されたビデオシースルー画像データに対して、ヒストグラムの抽出、矩形ブロック単位の輝度の平均値の算出、及び飽和画素数のカウントを行う。Color ISP16及びSTAT21は、RGB画像データに変換されたビデオシースルー画像データ、ヒストグラム、矩形ブロック単位の輝度の平均値の算出結果、及び飽和画素数をSRAM14a~14cまたはDRAM32等のメモリに格納する。以下、特に区別する必要がある場合を除き、RGB画像データに変換されたビデオシースルー画像データも、単にビデオシースルー画像データという。 First, Color ISP 16 acquires video see-through image data from video see-through camera 41 (S21). Color ISP 16 then converts the acquired video see-through image data into RGB image data. STAT 21 extracts a histogram, calculates the average brightness value for each rectangular block, and counts the number of saturated pixels for the video see-through image data converted into an RGB image by Color ISP 16. Color ISP 16 and STAT 21 store the video see-through image data converted into RGB image data, the histogram, the calculation results of the average brightness value for each rectangular block, and the number of saturated pixels in memory such as SRAM 14a-14c or DRAM 32. Hereinafter, unless a distinction is needed, video see-through image data converted into RGB image data will also be simply referred to as video see-through image data.

 Next, the Warp 18 acquires the deformation parameters (first positional relationship data) generated in the calibration process of FIG. 7 from the SRAMs 14a to 14c or the like (S22).

 また、CPU23は、RGB画像データに変換されたビデオシースルー画像データに対する明所・暗所領域の抽出処理を行う(S23)。例えば、CPU23は、ビデオシースルー画像データの明所の領域を抽出し、ビデオシースルー画像データの明所の領域と第1位置関係データとに基づいて透過像の明所の領域を算出する。明所・暗所領域の抽出処理の詳細については後述する。 The CPU 23 also performs a process of extracting bright and dark areas from the video see-through image data converted into RGB image data (S23). For example, the CPU 23 extracts bright areas from the video see-through image data, and calculates the bright areas of the transmitted image based on the bright areas from the video see-through image data and the first positional relationship data. Details of the process of extracting bright and dark areas will be described later.

 また、Warp18は、第1位置関係データにより定義された位置合わせ補正量に応じて、ビデオシースルー画像データの少なくとも一部を変形する(S24)。当該変形によって、透過像に含まれる被写体の輪郭とビデオシースルー表示画像に含まれる被写体の輪郭を揃えることができる。このため、透過像が視認可能な場合でも透過像に含まれる被写体の輪郭とビデオシースルー表示画像に含まれる被写体の輪郭が二重に見えることを抑制することができる。 Furthermore, Warp 18 deforms at least a portion of the video see-through image data according to the alignment correction amount defined by the first positional relationship data (S24). This deformation makes it possible to align the contour of the subject included in the transmitted image with the contour of the subject included in the video see-through display image. Therefore, even when the transmitted image is visible, it is possible to prevent the contour of the subject included in the transmitted image and the contour of the subject included in the video see-through display image from appearing double.

 As described above, the head-mounted display 1a displays the video see-through display image as a visual aid in dark places, so not all of the video see-through image data is necessarily displayed. The Warp 18 therefore corrects only the regions that need to be displayed, based on the result of the bright/dark area extraction process of S23.

 図13は、第1の実施形態に係るビデオシースルー画像データ922の補正の一例を示す図である。図13に示すように、Warp18は、ビデオシースルー画像データ922のうち、明所・暗所領域の抽出処理で暗所もしくは薄明りと判定された領域のみ変形を行う。Warp18は、明所と判定された領域については変形しない。例えば、Warp18はビデオシースルー画像データ922のうち、表示対象として抽出した暗所もしくは薄明りに該当する領域のうち、透過像の明所の領域と接する部分を変形させて表示データを生成する。 FIG. 13 is a diagram showing an example of correction of video see-through image data 922 according to the first embodiment. As shown in FIG. 13, Warp 18 deforms only areas of the video see-through image data 922 that have been determined to be dark or dimly lit in the process of extracting bright and dark areas. Warp 18 does not deform areas determined to be bright. For example, Warp 18 deforms the parts of the video see-through image data 922 that are dark or dimly lit and extracted as display targets, and that border the bright areas of the transmitted image, to generate display data.

 Warp18は、補正対象外の領域には値“0”を設定する。図13では、ビデオシースルー画像データ922のうち、両端の領域については明所と判定されたため、Warp18が当該領域に値“0”を設定する。値“0”が設定された領域においては、元の画像は削除される。補正後のビデオシースルー画像データ922aのうち値“0”が設定された領域は、透過ディスプレイ3の表示の際に何も表示されない。図13の補正後のビデオシースルー画像データ922aにおいて黒色で示される領域は、値“0”が設定されているため、実際に透過ディスプレイ3に表示された場合は背景を透過する。このため、値“0”が設定された領域については、ユーザは透過像のみを視認する。このようにWarp18が補正対象領域を絞ることにより、補正に必要な演算量を削減することができる。Warp18は、補正後のビデオシースルー画像データ922aをDisplay Controller19に出力する。 Warp 18 sets the value "0" to areas not subject to correction. In Figure 13, the areas at both ends of the video see-through image data 922 are determined to be bright areas, so Warp 18 sets the value "0" to those areas. In areas where the value "0" is set, the original image is deleted. In areas of the corrected video see-through image data 922a where the value "0" is set, nothing is displayed when displayed on the transmissive display 3. The areas shown in black in the corrected video see-through image data 922a in Figure 13 have the value "0" set, so when actually displayed on the transmissive display 3, the background is visible through them. Therefore, in areas where the value "0" is set, the user only sees the transmissive image. By Warp 18 narrowing down the areas subject to correction in this way, the amount of calculation required for correction can be reduced. Warp 18 outputs the corrected video see-through image data 922a to the Display Controller 19.
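The sketch below illustrates this masking, assuming the deformation defined by the first positional relationship data has been expanded into dense remap tables. For simplicity the whole frame is remapped and then masked; an implementation that follows the description more closely would restrict the remap itself to the dark and twilight blocks to reduce the computation further.

```python
import cv2
import numpy as np

def warp_display_regions(vst_rgb, map_x, map_y, display_mask):
    """Warp the video see-through data and keep only the display-target regions.

    map_x / map_y: float32 remap tables derived from the first positional
    relationship data; display_mask: boolean H x W array that is True for the
    dark and twilight blocks. For simplicity the whole frame is remapped and
    then masked.
    """
    warped = cv2.remap(vst_rgb, map_x, map_y, interpolation=cv2.INTER_LINEAR)
    out = np.zeros_like(warped)          # value "0" everywhere by default
    out[display_mask] = warped[display_mask]
    return out
```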

 図12に戻り、Display Controller19は、S23の明所・暗所領域の抽出処理の結果に基づいて、補正後のビデオシースルー画像データ922aの一部の明るさを調整する(S25)。Display Controller19は、例えば、明るさを補正するガンマ補正を実施する。また、Display Controller19は、色を調整するための色補正等を実施してもよい。 Returning to FIG. 12, the Display Controller 19 adjusts the brightness of a portion of the corrected video see-through image data 922a based on the results of the bright and dark area extraction process of S23 (S25). The Display Controller 19 performs, for example, gamma correction to correct the brightness. The Display Controller 19 may also perform color correction to adjust the colors.

 そして、Display Controller19は、補正後のビデオシースルー画像データ922aを透過ディスプレイ3に表示可能な形式に変換することにより、表示データを生成する(S26)。透過ディスプレイ3によって処理内容は異なるが、Display Controller19は、例えば、解像度を合わせるためのリサイズ等を実施する。 Then, the Display Controller 19 generates display data by converting the corrected video see-through image data 922a into a format that can be displayed on the transmissive display 3 (S26). The processing content differs depending on the transmissive display 3, but the Display Controller 19 performs, for example, resizing to match the resolution.

 そして、Display Controller19は、生成した表示データ及びS23の明所・暗所領域の抽出処理で生成された透過率データを出力する(S27)。より詳細には、生成した表示データをディスプレイプロジェクタ5に出力して透過ディスプレイ3にビデオシースルー表示画像を表示させる。また、Display Controller19は、透過率データに基づいて、レンズ2の液晶シャッタの透過度合を制御する。例えば、Display Controller19は、透過像の明所の領域に対応する透過ディスプレイ3の領域の透過率を、透過像の明所の領域を算出したときの第1透過率よりも小さい第2透過率に制御する。 Then, the Display Controller 19 outputs the generated display data and the transmittance data generated in the bright and dark area extraction process of S23 (S27). More specifically, the generated display data is output to the display projector 5, causing the transmissive display 3 to display a video see-through display image. The Display Controller 19 also controls the transmittance of the liquid crystal shutter of the lens 2 based on the transmittance data. For example, the Display Controller 19 controls the transmittance of the area of the transmissive display 3 corresponding to the bright area of the transmitted image to a second transmittance that is smaller than the first transmittance used when the bright area of the transmitted image was calculated.

 ここで、このフローチャートの処理は終了する。図12のフローチャートの処理は、ヘッドマウントディスプレイ1aがユーザに使用されている間は繰り返し実行される。なお、ユーザの視覚的な違和感を低減させるためには、図12の処理は、一例として、90~60Hzのリフレッシュレートで実行されることが望ましい。 Here, the processing of this flowchart ends. The processing of the flowchart in Figure 12 is repeatedly executed while the head-mounted display 1a is being used by the user. Note that, in order to reduce the user's visual discomfort, it is desirable that the processing of Figure 12 be executed at a refresh rate of 90 to 60 Hz, as an example.

 図14は、第1の実施形態に係る透過ディスプレイ3と液晶シャッタによる視覚補助の原理について説明する図である。図14では、図12のS27の処理により、透過ディスプレイ3にビデオシースルー表示画像が表示され、レンズ2の液晶シャッタによる減光が実施された状態を示す。ヘッドマウントディスプレイ1aを装着したユーザの目8と、実像(被写体)との間には、液晶シャッタ機能付きレンズ2と、透過ディスプレイ3とが存在する。 FIG. 14 is a diagram illustrating the principle of visual assistance using the transmissive display 3 and liquid crystal shutter according to the first embodiment. FIG. 14 shows a state in which a video see-through display image is displayed on the transmissive display 3 and light reduction is performed by the liquid crystal shutter of the lens 2 as a result of the processing of S27 in FIG. 12. The lens 2 with liquid crystal shutter function and the transmissive display 3 are located between the eye 8 of the user wearing the head-mounted display 1a and the real image (subject).

 例えば、図14では、実像において暗所(暗い領域)71と明所(明るい領域)72が存在する。Display Controller19は、透過ディスプレイ3上の実像の暗所が透過する領域ではビデオシースルー表示画像を表示させる。このため、暗所71が透過する領域においては、ユーザには暗い透過像ではなく、明るいビデオシースルー表示画像934が見える。なお、暗所71が透過する領域において、仮に、透過像が少し見えていたとしても、S24の変形処理によってビデオシースルー表示画像934が透過像に合わせて補正されているため、二重像は発生しない。 For example, in Figure 14, there are dark areas (dark regions) 71 and bright areas (bright regions) 72 in the real image. The Display Controller 19 displays a video see-through display image in the areas where the dark areas of the real image on the transmissive display 3 are transmitted. Therefore, in the areas where the dark areas 71 are transmitted, the user sees a bright video see-through display image 934, not a dark transmitted image. Note that even if a small amount of the transmitted image is visible in the areas where the dark areas 71 are transmitted, no double image occurs because the video see-through display image 934 has been corrected to match the transmitted image by the transformation process in S24.

 また、透過ディスプレイ3上の明所72が透過する領域においては、ビデオシースルー表示画像934の各画素に値“0”が設定されているため、ビデオシースルー表示画像934は透過ディスプレイ3に表示されない。また、明所72の明るさの程度によっては、レンズ2の液晶シャッタ機能により、明所72が透過する領域が減光される。 Furthermore, in the area on the transmissive display 3 where the bright area 72 is transmitted, a value of "0" is set for each pixel of the video see-through display image 934, so the video see-through display image 934 is not displayed on the transmissive display 3. Furthermore, depending on the brightness of the bright area 72, the liquid crystal shutter function of the lens 2 dims the area where the bright area 72 is transmitted.

 図15は、第1の実施形態に係る透過ディスプレイ3の表示態様の一例を示す図である。図15に示すように、ビデオシースルー表示画像934及び液晶シャッタによる視覚補助がない状態では、透過ディスプレイ3のうち暗所71が透過する領域310においては、ユーザは暗さのために対象物を見ることができない場合がある。また、透過ディスプレイ3のうち明所72が透過する領域320においては、ユーザはまぶしさを感じるために対象物を見ることができない場合がある。 FIG. 15 is a diagram showing an example of the display mode of the transmissive display 3 according to the first embodiment. As shown in FIG. 15, without the visual aids of the video see-through display image 934 and the liquid crystal shutter, the user may be unable to see objects in the region 310 of the transmissive display 3 through which the dark area 71 is transmitted due to darkness. Furthermore, in the region 320 of the transmissive display 3 through which the bright area 72 is transmitted, the user may be unable to see objects due to glare.

 これに対して、ビデオシースルー表示画像934及び液晶シャッタによる視覚補助がある状態では、透過ディスプレイ3のうち暗所71が透過する領域310においては、ビデオシースルー表示画像934の表示によりユーザの視認性が向上する。また、透過ディスプレイ3のうち明所72が透過する領域320においては、液晶シャッタの遮光によってまぶしさが軽減されることによりユーザの視認性が向上する。 In contrast, with the visual aids of the video see-through display image 934 and the liquid crystal shutter, the user's visibility is improved in the region 310 of the transmissive display 3 through which the dark area 71 is transmitted by the video see-through display image 934. Furthermore, in the region 320 of the transmissive display 3 through which the bright area 72 is transmitted by the liquid crystal shutter, the glare is reduced, thereby improving the user's visibility.

 次に、上述の図12のS23の明所・暗所領域の抽出処理の詳細について説明する。図16は、第1の実施形態に係る明所・暗所領域の抽出処理の流れの一例を示す図である。 Next, the details of the bright and dark area extraction process in S23 of Figure 12 described above will be explained. Figure 16 is a diagram showing an example of the flow of the bright and dark area extraction process according to the first embodiment.

 まず、CPU23は、シーンの明るさ(EV(Exposure Value)値)を取得する(S231)。一般に、通常のカメラにおいて、撮影される画像データの明るさを一定に保つため、AEによってシーンの明るさを推定し、シャッタ速度や感度を決定する処理がある。本実施形態のCPU23は、このAEの情報を用いることで、シーンの暗部の明るさ(EV値)を取得する。シーンの明るさとは、ヘッドマウントディスプレイ1aを装着したユーザの周囲の明るさである。 First, the CPU 23 acquires the brightness of the scene (EV (Exposure Value) value) (S231). Generally, in a normal camera, in order to keep the brightness of the captured image data constant, the brightness of the scene is estimated using AE and the shutter speed and sensitivity are determined. In this embodiment, the CPU 23 uses this AE information to acquire the brightness of dark areas of the scene (EV value). The brightness of the scene is the brightness around the user wearing the head-mounted display 1a.

 また、CPU23は、取得したEV値に基づいて、レンズ2の透過率を決定する(S232)。レンズ2の透過率は、液晶シャッタの遮光の度合いを示す。 The CPU 23 also determines the transmittance of lens 2 based on the acquired EV value (S232). The transmittance of lens 2 indicates the degree of light blocking by the liquid crystal shutter.

 そして、CPU23は、ビデオシースルー画像データ923における明所領域及び暗所領域を抽出する(S233)。明所領域及び暗所領域の抽出は、換言すれば、ビデオシースルー画像データ923における明所領域に該当する範囲、及び暗所領域に該当する範囲の特定である。ここで、このフローチャートの処理は終了する。 Then, the CPU 23 extracts bright and dark areas from the video see-through image data 923 (S233). Extracting bright and dark areas means, in other words, identifying the ranges that correspond to bright and dark areas in the video see-through image data 923. At this point, the processing of this flowchart ends.

 ここで、図16のS231のEV値の取得処理の詳細について説明する。図17は、第1の実施形態に係るEV値の取得処理の流れの一例を示す図である。図17に示すS301~S307の処理は、CPU23により実行される。CPU23は、Color ISP16が取得したビデオシースルー画像データ923からSTAT21が生成した、ヒストグラム、及び矩形ブロック単位の輝度の平均値の算出結果を取得する。 Here, the details of the EV value acquisition process of S231 in Figure 16 will be explained. Figure 17 is a diagram showing an example of the flow of the EV value acquisition process according to the first embodiment. The processes of S301 to S307 shown in Figure 17 are executed by the CPU 23. The CPU 23 acquires the histogram and the calculation results of the average brightness value for each rectangular block generated by STAT21 from the video see-through image data 923 acquired by Color ISP16.

 CPU23は、ヒストグラムから暗所の基準となる閾値を算出する。例えば、CPU23は、ヒストグラムを要素0から積算し全画素数のN%を超えたときの要素番号を閾値とする(S301)。図18は、第1の実施形態に係るヒストグラムに基づいて決定される閾値の一例を示す図である。CPU23は、ヒストグラムから閾値を算出することにより、ビデオシースルー画像データ923に応じた閾値を動的に設定することができる。 The CPU 23 calculates a threshold value that serves as a reference for dark places from the histogram. For example, the CPU 23 integrates the histogram from element 0 and sets the element number when the integrated value exceeds N% of the total number of pixels as the threshold value (S301). Figure 18 is a diagram showing an example of a threshold value determined based on a histogram according to the first embodiment. By calculating the threshold value from the histogram, the CPU 23 can dynamically set a threshold value that corresponds to the video see-through image data 923.
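A minimal sketch of this threshold calculation is shown below; the percentage N is not specified in the description, so the default used here is only a placeholder.

```python
import numpy as np

def dark_threshold(hist, n_percent=10.0):
    """Return the luminance bin at which the cumulative histogram first reaches
    N% of all pixels (the dark-area reference threshold of S301).

    The value of N is not given in the description; 10% is only a placeholder.
    """
    cumulative = np.cumsum(hist)
    return int(np.searchsorted(cumulative, cumulative[-1] * n_percent / 100.0))
```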

 図17に戻り、CPU23は、STAT21から取得した矩形ブロック単位の輝度の平均値から、輝度が閾値以下の領域の平均値を取得する(S302)。当該平均値を、STAT平均値という。 Returning to FIG. 17, the CPU 23 obtains the average value of the area where the brightness is equal to or less than the threshold value from the average brightness values of rectangular blocks obtained from the STAT 21 (S302). This average value is called the STAT average value.

 そして、CPU23は、STAT平均値と現在のセンサモジュール設定値(露光時間・センサゲイン・絞り)からシーンの明るさ(EV値)を推定する(S303)。 Then, the CPU 23 estimates the brightness (EV value) of the scene from the STAT average value and the current sensor module settings (exposure time, sensor gain, and aperture) (S303).
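The estimation formula itself is not reproduced in this text. The sketch below therefore uses a standard photographic approximation rather than the document's own equation: the exposure value of the current sensor settings is normalised to ISO 100 and offset by how far the measured dark-block average deviates from a nominal mid-grey level. The target level and the function name are assumptions.

```python
import math

def estimate_scene_ev(stat_avg, exposure_s, iso, f_number, target_avg=46.0):
    """Rough scene-brightness estimate (EV) from the dark-block average and the
    current sensor settings.

    This is a standard photographic approximation, not the document's own
    formula; target_avg is an illustrative mid-grey level for 8-bit data.
    """
    ev = math.log2(f_number ** 2 / exposure_s)          # EV of the current settings
    ev -= math.log2(iso / 100.0)                        # normalise to ISO 100
    ev += math.log2(max(stat_avg, 1e-3) / target_avg)   # scene brighter/darker than metered
    return ev
```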

 また、CPU23は、推定した明るさに基づいて、新しいセンサ設定値を各種センサ(Ambient Light Sensor60等)に設定する(S304)。これにより、実際の明るさの程度に合わせでセンサの感度が調整可能である。 The CPU 23 also sets new sensor setting values for various sensors (such as the Ambient Light Sensor 60) based on the estimated brightness (S304). This allows the sensor sensitivity to be adjusted to match the actual level of brightness.

 また、CPU23は、I2Cインタフェース22を介してAmbient Light Sensor60から環境光の明るさ(Lux)を取得する(S305)。 The CPU 23 also acquires the ambient light brightness (Lux) from the Ambient Light Sensor 60 via the I2C interface 22 (S305).

 そして、CPU23は、公知のEV/Lux変換表に基づいて、LuxをEV値に変換する(S306)。図19は、一般的なEV/Lux変換表の一例を示す図である。EV値はカメラの露出補正値を決定する際に用いるための数値で、一般的な明るさを表す単位である照度(Lux値)とは、おおよそ図19の表のような対応関係があることが知られている。EV値が高いほどシーンが明るく、EV値が低いほどシーンが暗い。CPU23は、図19に示す例のように、EV値“-4”以上“0”未満を「暗所」、EV値“0”以上“2”未満を「薄明り」、EV値“15”以上を「明所」と判定してもよい。なお、図19に示す「暗所」、「薄明り」、「明所」の基準は一例であり、これに限定されるものではない。また、図19ではEV値“2”以上“15”未満を「日中」として「明所」と区別しているが、「日中」も「明所」に含まれてもよい。 Then, the CPU 23 converts Lux to an EV value based on a known EV/Lux conversion table (S306). Figure 19 is a diagram showing an example of a common EV/Lux conversion table. The EV value is a numerical value used when determining the exposure compensation value of a camera, and it is known that there is a correspondence relationship between the EV value and illuminance (Lux value), a common unit of brightness, roughly as shown in the table in Figure 19. The higher the EV value, the brighter the scene, and the lower the EV value, the darker the scene. As in the example shown in Figure 19, the CPU 23 may determine that an EV value of "-4" or greater and less than "0" is a "dark place," an EV value of "0" or greater and less than "2" is a "twilight" place, and an EV value of "15" or greater is a "bright place." Note that the criteria for "dark place," "twilight," and "bright place" shown in Figure 19 are merely examples and are not limited to these. Also, in Figure 19, EV values of "2" or greater and less than "15" are considered "daytime" and distinguished from "bright places," but "daytime" may also be included in "bright places."
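A sketch of this conversion and of the example classification is given below. It relies on the common approximation lux ≈ 2.5 × 2^EV; the EV/Lux table and the category boundaries of FIG. 19 are examples, so the values used here are illustrative.

```python
import math

def lux_to_ev(lux):
    """Convert an ambient-light-sensor reading (lux) to an EV value, using the
    common approximation lux = 2.5 x 2^EV (ISO 100, incident-light constant
    C = 250). The EV/Lux table of FIG. 19 may differ slightly."""
    return math.log2(max(lux, 1e-3) / 2.5)

def classify_ev(ev):
    """Illustrative classification following the example ranges of FIG. 19."""
    if ev < 0:
        return "dark"
    if ev < 2:
        return "twilight"
    if ev >= 15:
        return "bright"
    return "daytime"
```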

 図17に戻り、CPU23は、Ambient Light Sensor60からの環境光の明るさの計測結果が取得可能な場合はAmbient Light Sensor60の検出結果(ALS明るさ)、取得不可の場合はAEの情報(AE明るさ)を使用する(S307)。Ambient Light Sensor60による計測結果を使用する場合、AEの情報を使用する場合よりも更新間隔は長くなるが、シーンの明るさの推定処理のための消費電力を低減することができる。また、撮影画像データは暗所が明るく撮影できるようセンサ設定値を設定しているため明所は飽和しており、撮影画像データに基づいたAEの情報は明所の明るさを正確に測定できない場合がある。明所の明るさの度合いを含むALSの検出結果を用いる方がより正確な周辺環境の明るさ情報が取得可能となる。CPU23は、ALS明るさまたはAE明るさのいずれかに基づくEV値をDisplay Controller19に出力する。なお、CPU23は、シーンの明るさの推定にALS明るさとAE明るさの両方を使用してもよい。 Returning to FIG. 17, if the ambient light brightness measurement results from the Ambient Light Sensor 60 are available, the CPU 23 uses the detection results (ALS brightness) of the Ambient Light Sensor 60; if the measurement results are unavailable, the CPU 23 uses the AE information (AE brightness) (S307). When the measurement results from the Ambient Light Sensor 60 are used, the update interval is longer than when AE information is used, but power consumption for scene brightness estimation processing can be reduced. Furthermore, since the sensor settings are set so that dark areas can be captured brightly, bright areas are saturated, and AE information based on the captured image data may not accurately measure the brightness of bright areas. Using the ALS detection results, which include the degree of brightness in bright areas, makes it possible to obtain more accurate information about the brightness of the surrounding environment. The CPU 23 outputs an EV value based on either the ALS brightness or the AE brightness to the Display Controller 19. The CPU 23 may use both the ALS brightness and the AE brightness to estimate the brightness of a scene.

 CPU23は、シーンの明るさの推定にAEの情報を用いる場合、明所の明るさを取得するために、ヒストグラムの最大値とSTAT平均値に基づいてEV値を補正する。具体的には、CPU23は、AEの情報に基づくEV値を下記の式(1)で補正する。 When using AE information to estimate the brightness of a scene, the CPU 23 corrects the EV value based on the maximum value of the histogram and the average STAT value to obtain the brightness of a bright place. Specifically, the CPU 23 corrects the EV value based on the AE information using the following formula (1).
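Formula (1) is not reproduced in this text, so the sketch below must not be read as the patent's equation; it only shows one plausible form of such a correction, in which the EV offset is the base-2 logarithm of the ratio between the brightest populated histogram bin and the dark-block average.

```python
import math

def correct_ev_for_bright_area(ev_dark, brightest_bin, stat_avg):
    """Shift an AE-based EV estimate toward the bright-area brightness.

    This is NOT formula (1), which is not reproduced in this text; it is only
    one plausible form, taking the EV offset as log2 of the ratio between the
    brightest populated histogram bin and the dark-block average luminance.
    """
    return ev_dark + math.log2(max(brightest_bin, 1.0) / max(stat_avg, 1.0))
```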

 ここで、図16のS232のレンズ2の透過率の決定処理の詳細について説明する。CPU23は、S231で取得したシーンの明るさを示すEV値に応じて、レンズ2の透過率を決定する。 Here, the process of determining the transmittance of lens 2 in S232 of FIG. 16 will be described in detail. The CPU 23 determines the transmittance of lens 2 according to the EV value indicating the brightness of the scene obtained in S231.

 When the EV value is at or above a certain threshold, the scene is so bright that it becomes visually difficult for the user to see. The CPU 23 therefore mitigates this loss of visibility by controlling the transmittance of the lens 2 according to the degree of brightness. Each increase of 1 in the EV value roughly doubles the brightness (Lux). For example, the CPU 23 may treat an EV value of 15 or higher as a bright place, as shown in FIG. 19, and double the degree of light blocking by the lens 2 for every increase of 1 in the EV value above 15. By controlling the liquid crystal shutter in this way, the brightness of the image transmitted through the lens 2 is kept from reaching an EV value of 14 or higher, which reduces the glare perceived by the user.

 図20は、第1の実施形態に係るEV値とレンズ2の透過率との関係の一例を示す図である。一例として、図20のグラフのように透過率は推移する。通常においてレンズ2の最大透過率は100%にすることはできないため、図20では例として最大透過率を80%とした。図20に示す例では、EV値が1増える(明るさが2倍になる)ごとに透過率が半分になる。CPU23は、決定した透過率に基づいて液晶シャッタの遮光度合を設定する。 FIG. 20 is a diagram showing an example of the relationship between the EV value and the transmittance of the lens 2 according to the first embodiment. As an example, the transmittance changes as shown in the graph in FIG. 20. Normally, the maximum transmittance of the lens 2 cannot be set to 100%, so in FIG. 20, the maximum transmittance is set to 80% as an example. In the example shown in FIG. 20, the transmittance is halved every time the EV value increases by 1 (brightness doubles). The CPU 23 sets the degree of light blocking of the liquid crystal shutter based on the determined transmittance.
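The example of FIG. 20 can be written directly as a small control function, shown below; the EV threshold of 15 and the 80% ceiling are the example values given above.

```python
def lens_transmittance(ev, bright_ev=15.0, max_transmittance=0.8):
    """Transmittance control following the example of FIG. 20.

    The transmittance stays at the 80% ceiling up to the bright-place threshold
    (EV 15) and is halved for every further increase of 1 EV; both numbers are
    the example values given in the description.
    """
    if ev <= bright_ev:
        return max_transmittance
    return max_transmittance * (0.5 ** (ev - bright_ev))
```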

 ここで、図16のS233明所・暗所領域の抽出処理の詳細について説明する。図21は、第1の実施形態に係るビデオシースルー画像データ923の矩形ブロック単位のEV値の推定処理の一例を示す図である。 Here, the details of the bright and dark area extraction process S233 in Figure 16 will be explained. Figure 21 is a diagram showing an example of the EV value estimation process for rectangular blocks of video see-through image data 923 according to the first embodiment.

 CPU23は、視覚補正が必要な箇所を判定するために、ビデオシースルー画像データ923上の場所ごとの明るさを推定する。当該明るさの推定処理は画素単位で行うことも可能だが、STAT21から取得されるブロックごとの平均値を用いてブロック単位で明るさを推定することにより演算量が削減可能となる。なお、図21に示す各種数値は一例であり、これに限定されるものではない。 The CPU 23 estimates the brightness for each location on the video see-through image data 923 to determine areas where visual correction is required. This brightness estimation process can be performed on a pixel-by-pixel basis, but the amount of calculation can be reduced by estimating the brightness on a block-by-block basis using the average value for each block obtained from STAT 21. Note that the various numerical values shown in Figure 21 are merely examples and are not limited to these.

 CPU23は、ブロックごとの平均値をEV値に変換する。 The CPU 23 converts the average value for each block into an EV value.

 具体的には、CPU23は、下記の式(2)を用いてブロックごとの明るさ(EV値)の推定処理を行う。 Specifically, the CPU 23 estimates the brightness (EV value) for each block using the following equation (2):

 また、ビデオシースルー画像データ923はレンズ2を介さずに撮影された画像であるが、実際にユーザの目8で見えるシーンの明るさはレンズ2を透過した明るさである。このため、CPU23は、下記の式(3)を用いて、処理時点のレンズ2の透過率に基づいて各ブロックのEV値を補正する。CPU23は、当該補正により、実際の透過像の明るさを示すEV値を推定する。レンズ2の透過率による補正後のEV値を、視覚EV値ともいう。 Furthermore, although the video see-through image data 923 is an image captured without using the lens 2, the brightness of the scene actually seen by the user's eye 8 is the brightness transmitted through the lens 2. For this reason, the CPU 23 corrects the EV value of each block based on the transmittance of the lens 2 at the time of processing, using the following equation (3). Through this correction, the CPU 23 estimates the EV value indicating the brightness of the actual transmitted image. The EV value after correction based on the transmittance of the lens 2 is also called the visual EV value.
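Equations (2) and (3) are not reproduced in this text. The sketch below shows one plausible form of each step: the block EV is taken as the scene EV offset by the base-2 logarithm of the luminance ratio, and the visual EV is obtained by adding the base-2 logarithm of the current lens transmittance, since the light reaching the eye is attenuated by that factor.

```python
import math

def block_ev(block_avg, ref_avg, ref_ev):
    """Estimate the EV of one block from its average luminance.

    Equation (2) is not reproduced in this text; here the block EV is taken as
    the reference (scene) EV offset by log2 of the luminance ratio, which is
    one plausible form of such an estimate.
    """
    return ref_ev + math.log2(max(block_avg, 1.0) / max(ref_avg, 1.0))

def visual_ev(ev, transmittance):
    """Correct a block EV for the current lens transmittance (cf. equation (3),
    which is likewise not reproduced here). The light reaching the eye is
    attenuated by the transmittance, so the perceived EV drops accordingly."""
    return ev + math.log2(max(transmittance, 1e-3))
```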

 例えば、CPU23は、明るさ(視覚EV値)に応じて、ユーザにとって視覚的に認識可能な領域を「明所」、認識が困難な領域を「暗所」として特定して領域分割する。また、CPU23は、暗所よりも明るいが、やや認識が困難な領域を「薄明り」として特定して分割してもよい。 For example, the CPU 23 may divide the area into "bright areas" that are visually recognizable to the user and "dark areas" that are difficult to recognize, depending on the brightness (visual EV value). The CPU 23 may also divide the area into "twilight areas" that are brighter than dark areas but somewhat difficult to recognize.

 There are two patterns in which darkness or brightness makes visual recognition difficult: in the first pattern, the area is simply too dark to recognize; in the second pattern, recognition is difficult because the luminance difference is large. In the first pattern, the CPU 23 can identify the relevant areas from the absolute EV value. Regarding the second pattern, the dynamic range of human vision is said to be around 80 to 120 dB, and when the luminance difference exceeds this range, dark areas become difficult to see. To cover both patterns, the CPU 23 therefore judges areas whose absolute EV value is at or below a certain threshold, or whose difference from the maximum EV value is at or below a certain value, to be difficult to recognize.

 Specifically, in the case of the first pattern, for example, the CPU 23 treats an EV value in the range of "0" to "1" as "twilight"; because visual recognition is fairly difficult in this range, the CPU 23 gradually begins visual assistance using the video see-through image data. If the EV value falls below "0," the CPU 23 determines that the area is a "dark place" and provides visual assistance using the video see-through image data.

 図22は、第1の実施形態に係るビデオシースルー表示画像の明るさと、シーンにおけるEV値との関係の一例を示す図である。図22の縦軸はビデオシースルー表示画像の明るさ(輝度)、横軸はEV値である。「視覚補助を徐々に開始する」とは、例えば、図22に示すグラフのように、EV値が小さくなるほど、透過ディスプレイ3に表示されるビデオシースルー表示画像の明るさを、徐々に明るくすることをいう。 FIG. 22 is a diagram showing an example of the relationship between the brightness of the video see-through display image and the EV value in a scene according to the first embodiment. The vertical axis of FIG. 22 is the brightness (luminance) of the video see-through display image, and the horizontal axis is the EV value. "Gradually starting visual assistance" means, for example, as shown in the graph in FIG. 22, that the brightness of the video see-through display image displayed on the transmissive display 3 gradually increases as the EV value decreases.

 また、例えば2つ目のパターンの場合、CPU23は、ブロックのEV値と、ビデオシースルー画像データ923全体におけるEV値の最大値との差分に基づいてビデオシースルー表示画像の明るさを決定する。例えば、CPU23は、“EV値-EV値の最大値”が“-4”を下回ったら「薄明り」と判定する。「薄明り」の場合、視覚的な認識がやや困難になるため、CPU23は、ビデオシースルー画像データによる視覚補助を徐々に開始する。また、CPU23は、“EV値-EV値の最大値”が“-7”を下回ったら「暗所」と判定する。CPU23は、「暗所」と判定した場合、ビデオシースルー画像データによる視覚補助を実施する。 Furthermore, in the case of the second pattern, for example, the CPU 23 determines the brightness of the video see-through display image based on the difference between the EV value of the block and the maximum EV value for the entire video see-through image data 923. For example, the CPU 23 determines that it is "twilight" if "EV value - maximum EV value" falls below "-4." In the case of "twilight," visual recognition becomes somewhat difficult, so the CPU 23 gradually begins providing visual assistance using the video see-through image data. Furthermore, the CPU 23 determines that it is a "dark place" if "EV value - maximum EV value" falls below "-7." If the CPU 23 determines that it is a "dark place," it provides visual assistance using the video see-through image data.

 FIG. 23 is a diagram showing an example of the relationship between the brightness of the video see-through display image and the EV-value luminance difference in the scene according to the first embodiment. The vertical axis of FIG. 23 is the brightness (luminance) of the video see-through display image, and the horizontal axis is the luminance difference, that is, the difference between the EV value of a block and the maximum EV value in the entire video see-through image data 923. For example, as shown in the graph of FIG. 23, the CPU 23 gradually increases the brightness of the video see-through display image displayed on the transmissive display 3 as this luminance difference becomes larger.

 By converting the brightness of the video see-through image data 923 based on the two graphs shown in FIGS. 22 and 23, the CPU 23 can determine the display brightness for each of the areas classified as "bright place," "dark place," or "twilight" on the basis of the EV value.
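A sketch of combining the two curves is given below. The exact curve shapes of FIGS. 22 and 23 and the way they are merged are not specified, so linear ramps over the stated ranges and a simple maximum are used as illustrative choices.

```python
def ramp(x, start, end):
    """Linear ramp from 0 at 'start' to 1 at 'end' (clamped), for decreasing x."""
    if start == end:
        return 1.0 if x <= end else 0.0
    return min(max((start - x) / (start - end), 0.0), 1.0)

def display_brightness(block_visual_ev, max_visual_ev):
    """Display brightness weight (0..1) for one block, combining the curves of
    FIGS. 22 and 23.

    Linear ramps over the stated ranges (EV 1 down to 0 for the absolute
    pattern, a difference of -4 down to -7 for the relative pattern) and a
    simple maximum are illustrative choices; the actual curves are not given.
    """
    absolute = ramp(block_visual_ev, start=1.0, end=0.0)
    relative = ramp(block_visual_ev - max_visual_ev, start=-4.0, end=-7.0)
    return max(absolute, relative)
```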

 As a third pattern, the video see-through image data 923 may contain locally bright areas, such as blown-out highlights, in addition to the overall brightness of the scene. This is because, when an ordinary image sensor is used in the video see-through camera 41, its narrow dynamic range causes blown-out highlights (saturated pixel values) in areas of relatively high luminance. An area of relatively high luminance is, for example, an area lit by strong backlight. If the video see-through display image displayed on the transmissive display 3 includes such blown-out areas, it actually impairs vision, which is undesirable. The CPU 23 therefore also classifies such relatively high-luminance areas as bright places. Portions of the video see-through display image that the CPU 23 classifies as bright places are not displayed on the transmissive display 3.

 FIG. 24 is a diagram showing an example of the relationship between the brightness of the video see-through display image and the average luminance of each block according to the first embodiment. The CPU 23 applies a conversion such as the graph shown in FIG. 24 to the per-block average luminance output from the STAT 21.
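
A possible form of the FIG. 24 conversion is sketched below; the knee and saturation thresholds are illustrative assumptions, not values taken from the document.

```python
import numpy as np

def highlight_mask(block_mean_y: np.ndarray,
                   knee: float = 200.0, sat: float = 240.0) -> np.ndarray:
    """Per-block weight that fades the video see-through image out of
    near-saturated (blown-out) blocks (FIG. 24 sketch).

    knee/sat are assumed 8-bit luminance thresholds: weight is 1 below the
    knee and falls to 0 at saturation."""
    w = (sat - block_mean_y) / (sat - knee)
    return np.clip(w, 0.0, 1.0)
```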

 In response to the results of this bright/dark area extraction by the CPU 23, the Display Controller 19 adjusts the brightness of each area of the video see-through display image. For example, the Display Controller 19 displays the video see-through display image on the transmissive display 3 only in areas determined to be display targets under all three of the patterns described above in which darkness or brightness makes visual recognition difficult. For example, the Display Controller 19 adjusts the brightness of the video see-through image data 923 by multiplying each pixel value of the video see-through image data 923 by the brightness value determined for the image data.

 FIG. 25 is a diagram showing an example of the brightness adjustment process for the video see-through image data 923 according to the first embodiment. As shown in FIG. 25, the CPU 23 determines the "bright place" and "dark place" areas of the video see-through image data 924 based on the brightness (EV value) of the transmitted image estimated by the CPU 23 and the per-block average luminance of the video see-through image data 924 output from the STAT 21. The CPU 23 also identifies areas of relatively high luminance, such as blown-out highlights, in the video see-through image data 924. The Display Controller 19 adjusts the brightness of the video see-through image data 924 based on the processing results of the CPU 23. For example, the Display Controller 19 identifies the display-target areas of the video see-through image data 924 by multiplying the determination result for the "bright place" and "dark place" areas by the identification result for the relatively high-luminance areas such as blown-out highlights. The Warp 18 sets "0" for areas of the video see-through image data 924 that are determined not to be display targets (for example, areas classified as "bright place"). Therefore, in the brightness-adjusted video see-through image data 924a, areas classified as "bright place" by any of the three pattern determinations are set to "0."
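
The multiplication-based combination described above can be sketched as follows, assuming the per-block weights have already been upsampled to per-pixel maps (the function and parameter names are illustrative assumptions):

```python
import numpy as np

def adjust_brightness(vst_img: np.ndarray,      # HxWx3 video see-through frame
                      w_ev: np.ndarray,         # HxW weight from the FIG. 22 curve
                      w_diff: np.ndarray,       # HxW weight from the FIG. 23 curve
                      w_highlight: np.ndarray   # HxW weight from the FIG. 24 curve
                      ) -> np.ndarray:
    """Multiply the three weights; pixels weighted 0 under any pattern
    (e.g. "bright place") end up 0 and are not shown on the display."""
    weight = w_ev * w_diff * w_highlight              # values in [0, 1]
    out = vst_img.astype(np.float32) * weight[..., None]
    return out.astype(vst_img.dtype)
```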

 The Display Controller 19 begins providing visual assistance using the video see-through image data not only for "dark place" areas but also for "twilight" areas. Unlike "dark place" areas, in "twilight" areas the user can still faintly see the transmitted image. Therefore, to suppress double images, the Warp 18 deforms the video see-through image data using the first positional relationship data at least for the "twilight" areas. This deformation makes the position and size of the subject in the transmitted image match those of the subject in the video see-through display image on the transmissive display 3. In a "dark place," the transmitted image is not visible and no double image occurs, so deformation of the video see-through image data can be omitted; in that case, the amount of computation required for the deformation can be reduced. However, the user may then experience discomfort at the boundary between the "dark place" area and the area outside it, and at the boundary between the inside and outside of the transmissive display 3. For this reason, the Warp 18 preferably deforms both the "dark place" and "twilight" areas of the video see-through image data that are displayed on the transmissive display 3 as the video see-through display image.

 FIG. 26 is a diagram showing an example of the video see-through image data 925 and 925a before and after correction according to the first embodiment. As shown in FIG. 26, the Warp 18 deforms the "twilight" area of the video see-through image data 925. The example shows the "dark place" area, where no double image with the transmitted image occurs, left undeformed, but both the "dark place" and "twilight" areas may be deformed. As also shown in FIG. 26, the Warp 18 sets the value "0" for the "bright place" areas that are excluded from correction.
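
A sketch of this selective deformation, modeling the first positional relationship data as a 3x3 homography applied only to the masked assistance regions; the homography model and the OpenCV-based implementation are assumptions, not the document's stated method.

```python
import cv2
import numpy as np

def warp_assist_regions(vst_img: np.ndarray,
                        H: np.ndarray,            # 3x3 homography modeling the positional relationship
                        region_mask: np.ndarray   # uint8 mask of "twilight" (and "dark place") pixels
                        ) -> np.ndarray:
    """Warp only the regions shown as assistance so that the subject lines up
    with the transmitted image; excluded ("bright place") pixels stay 0."""
    h, w = vst_img.shape[:2]
    masked = cv2.bitwise_and(vst_img, vst_img, mask=region_mask)
    return cv2.warpPerspective(masked, H, (w, h))
```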

 In this way, the SoC 100a of this embodiment extracts and deforms at least a partial area of the video see-through image data based on the first positional relationship data. The SoC 100a also generates display data so that the outline of the subject included in the transmitted image and the outline of the subject included in the partial area of the video see-through image data are continuous on the transmissive display 3, and displays the generated display data on the transmissive display 3. Therefore, the SoC 100a of this embodiment can improve the visibility of the boundary between the transmitted image and the video see-through display image when the video see-through display image is displayed on the transmissive display 3.

 The SoC 100a of this embodiment also calculates feature points of the transmitted image based on the video see-through image data, generates the first positional relationship data based on the feature points of the transmitted image and the feature points of the video see-through image data, and outputs the generated first positional relationship data to the memory. Because the first positional relationship data generated and stored in advance is used while the user is wearing the head-mounted display 1a, the processing speed of the correction can be improved.

 The SoC 100a of this embodiment also extracts first feature points of the transmitted image from the first calibration image data 901 captured by the calibration camera 42 from the front surface 301 side of the transmissive display 3. The SoC 100a further extracts second feature points of the subject from the second calibration image data 902, obtained by capturing, with the calibration camera 42, the transmissive display 3 on which display data based on the video see-through image data captured by the video see-through camera 41 is displayed. The SoC 100a then generates the first positional relationship data based on the positional relationship between the first feature points and the second feature points. Therefore, the SoC 100a of this embodiment can correct the misalignment between the video see-through image data and the transmitted image while taking actual lens distortion, display-system distortion, and the like into account.
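
One way to realize this calibration is sketched below, assuming grayscale calibration images and modeling the first positional relationship data as a RANSAC-fitted homography between matched ORB feature points; the feature detector and the homography model are assumptions, not requirements stated in the document.

```python
import cv2
import numpy as np

def estimate_first_positional_relationship(calib_img1: np.ndarray,
                                           calib_img2: np.ndarray) -> np.ndarray:
    """Match feature points between the first calibration image (the scene seen
    through the display) and the second calibration image (the displayed
    video see-through frame), both grayscale uint8, and fit a homography."""
    orb = cv2.ORB_create(2000)
    k1, d1 = orb.detectAndCompute(calib_img1, None)
    k2, d2 = orb.detectAndCompute(calib_img2, None)
    matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(d1, d2)
    matches = sorted(matches, key=lambda m: m.distance)[:200]
    src = np.float32([k2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([k1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    return H  # maps the displayed frame onto the transmitted-image view
```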

 As described above, the head-mounted display 1a of this embodiment provides visual assistance in dark places and visual assistance that suppresses the loss of visibility caused by glare in bright places. For example, when the user wears the head-mounted display 1a while driving a vehicle, dark areas such as shadows behind objects are assisted by displaying the video see-through display image, and bright areas such as backlight are assisted by light blocking with the liquid crystal shutter. The head-mounted display 1a of this embodiment can also be used as a visual aid when working in dark places such as at night or underground.

(Modification 1 of the First Embodiment)
 In the first embodiment described above, the SoC 100a reduces the occurrence of double images by deforming the video see-through image data to match the transmitted image in, for example, the "dark place" and "twilight" areas on the transmissive display 3.

 Another way to make double images less noticeable is to increase the light-blocking level of the liquid crystal shutter to reduce the influence of external light and then display the video see-through image data. With this method, the transmitted image becomes invisible due to the light blocking, so double images can be suppressed.

 FIG. 27 is a diagram showing an example of the light-blocking target area 201 according to Modification 1 of the first embodiment. In general, the transmissive display 3 has a narrower angle of view than the lens 2 and can therefore often cover only part of the user's field of view. For this reason, the Display Controller 19 of this modification does not block light over the user's entire field of view (the entire lens 2) but sets only the range occupied by the transmissive display 3 as the light-blocking target area 201. By limiting the light-blocking target area 201 to this range, the Display Controller 19 can suppress double images without obstructing the field of view outside the range of the transmissive display 3.
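
A minimal sketch of this region-limited dimming, assuming the shutter accepts a per-pixel transmittance map, an axis-aligned display rectangle, and a dimmed transmittance of 0.2 (all three are assumptions):

```python
import numpy as np

def shutter_transmittance_map(lens_shape, display_rect, dimmed=0.2):
    """Transmittance map for the liquid crystal shutter: dim only the
    rectangle covered by the transmissive display, leave the rest clear."""
    t = np.ones(lens_shape, dtype=np.float32)   # 1.0 = fully transparent
    x, y, w, h = display_rect                   # (x, y, width, height) in shutter pixels
    t[y:y + h, x:x + w] = dimmed
    return t
```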

(Modification 2 of the First Embodiment)
 In Modification 1 of the first embodiment described above, the SoC 100a sets only the range of the lens 2 occupied by the transmissive display 3 as the light-blocking target area 201. With such a light-blocking range, if there is an area with a high relative or absolute EV value, such as strong backlight, glare cannot be reduced outside the range of the transmissive display 3.

 FIG. 28 is a diagram showing an example of a case in which a very bright area exists, according to Modification 2 of the first embodiment. In such a case, even if the SoC 100a uses the liquid crystal shutter to block light only over the range occupied by the transmissive display 3, the user still feels glare. In the example shown in FIG. 28, a dark place also exists within the field of view of the lens 2. Even in a dark place, areas outside the range of the transmissive display 3 cannot receive visual assistance from the video see-through display image. Therefore, if the SoC 100a blocks light over the entire surface of the lens 2 with the liquid crystal shutter, the areas outside the range of the transmissive display 3 remain dark, resulting in poor visibility.

 In such a case, the SoC 100a of this modification controls the liquid crystal shutter so as to block light only in areas that are dazzling due to strong backlight or the like.

 FIG. 29 is a diagram showing an example of the light-blocking target area 202 according to Modification 2 of the first embodiment. As shown in FIG. 29, the light-blocking target area 202 also includes areas outside the range of the transmissive display 3. In this modification, the CPU 23 of the SoC 100a, for example, calculates the transmittance for each area of the lens 2 from the average luminance of the rectangular blocks and controls the transmittance of the liquid crystal shutter for each area. The liquid crystal shutter of the lens 2 in this modification has a liquid crystal panel whose transmittance can be controlled partially.
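
A sketch of the per-block transmittance control, assuming a linear mapping from block average luminance to transmittance; the thresholds and minimum transmittance are illustrative values, not taken from the document.

```python
import numpy as np

def per_block_transmittance(block_mean_y: np.ndarray,
                            y_lo: float = 128.0, y_hi: float = 240.0,
                            t_min: float = 0.1) -> np.ndarray:
    """Per-block shutter transmittance from per-block average luminance:
    blocks at or below y_lo stay clear (1.0), blocks at y_hi are dimmed to t_min."""
    ratio = np.clip((block_mean_y - y_lo) / (y_hi - y_lo), 0.0, 1.0)
    return 1.0 - ratio * (1.0 - t_min)
```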

 By controlling the light-blocking range in this way, the head-mounted display 1a of this modification can reduce the loss of visibility when a bright place with relatively or absolutely high luminance exists outside the range of the transmissive display 3 within the field of view of the lens 2. The head-mounted display 1a of this modification can also reduce the loss of visibility when bright places and dark places coexist within the field of view of the lens 2.

(Second Embodiment)
 In the first embodiment described above, the SoC 100a corrects the misalignment between the video see-through display image and the transmitted image caused by the difference between the position of the eye 8 of the user wearing the head-mounted display 1a and the mounting position of the video see-through camera 41. In the calibration process of the first embodiment, the SoC 100a determines the first positional relationship data on the assumption that the user's eye 8 is located near the center of the lens 2. However, the position of the eye 8 differs between individuals, and even for the same person, the relative position between the eye 8 and the video see-through camera 41 is expected to change constantly depending on how the display is worn. For this reason, the second embodiment further corrects the misalignment that occurs when the display is actually worn.

 FIG. 30 is a diagram showing an example of the overall configuration of the head-mounted display 1b according to the second embodiment. The head-mounted display 1b of this embodiment includes the eyeglass body 10, the lenses 2a and 2b, the transmissive displays 3a and 3b, the video see-through cameras 41a and 41b, the display projectors 5a and 5b, the Ambient Light Sensor 60, the Head Tracking cameras 63a and 63b, and the SoC 100b. The head-mounted display 1b of this embodiment further includes the Eye Tracking cameras 43a and 43b.

 The eyeglass body 10, the lenses 2a and 2b, the transmissive displays 3a and 3b, the video see-through cameras 41a and 41b, the display projectors 5a and 5b, the Ambient Light Sensor 60, and the Head Tracking cameras 63a and 63b have the same functions as in the first embodiment. The head-mounted display 1b also includes the IMU 61, the ToF sensor 62, the Flash memory 31, and the DRAM 32, as in the first embodiment.

 The Eye Tracking cameras 43a and 43b capture images in the direction in which the front surface of the transmissive display 3 faces.

 FIG. 31 is a diagram showing an example of the positional relationship between the Eye Tracking cameras 43a and 43b and the user's eyes 8a and 8b according to the second embodiment. As shown in FIG. 31, the Eye Tracking cameras 43a and 43b capture the eyes 8a and 8b of the user wearing the head-mounted display 1b.

 The number of Eye Tracking cameras is not limited to two. FIG. 32 is a diagram showing another example of the positional relationship between the Eye Tracking cameras 43a to 43d and the user's eyes 8a and 8b according to the second embodiment. As shown in FIG. 32, when the head-mounted display 1b includes four Eye Tracking cameras 43a to 43d, each eye 8 can be captured by two cameras. With this configuration, the distance from the eyes 8a and 8b to the lenses 2a and 2b can also be estimated by the principle of triangulation.
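
The triangulation mentioned here reduces to the standard stereo depth relation; a sketch under the assumption of rectified cameras with a known baseline and focal length:

```python
def eye_distance_from_stereo(disparity_px: float,
                             baseline_mm: float,
                             focal_px: float) -> float:
    """Two eye-tracking cameras with a known baseline see the pupil at slightly
    different image positions; depth follows from Z = f * B / d.
    e.g. f = 800 px, B = 60 mm, d = 1600 px -> Z = 30 mm."""
    return focal_px * baseline_mm / disparity_px
```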

 Hereinafter, when there is no need to distinguish the individual Eye Tracking cameras 43a to 43d, they are simply referred to as the Eye Tracking camera 43. The Eye Tracking camera 43 is an example of the third camera in this embodiment. The image data captured by the Eye Tracking camera 43 is an example of the fourth image data in this embodiment.

 FIG. 33 is a diagram showing an example of the configuration of the SoC 100b according to the second embodiment. As in the first embodiment, the SoC 100b includes the I2C interfaces 11 and 22, the Mono ISP 12, the DSP&AI Accelerator 13, the SRAMs 14a to 14c, the GPU 15, the Color ISP 16, the Time Warp 17, the Warp 18, the Display Controller 19, the STAT 21, the CPU 23, and the DRAM Controller 24.

 In addition to the functions of the first embodiment, the Mono ISP 12 of this embodiment acquires captured image data from the Eye Tracking camera 43 and corrects its brightness and the like.

 In addition to the functions of the first embodiment, the DSP&AI Accelerator 13 of this embodiment executes Eye Tracking processing that detects the position of the pupil of the user's eye 8 based on the image data of the Eye Tracking camera 43 acquired and corrected by the Mono ISP 12. The DSP&AI Accelerator 13 is an example of the pupil detection circuit in this embodiment. Within the DSP&AI Accelerator 13, the DSP 131 in particular generates second positional relationship data based on the position of the user's pupil and the feature points of the video see-through image data, and outputs it to a memory such as the SRAM 14a. The DSP 131 is an example of the image processing circuit in this embodiment.

 In addition to the functions of the first embodiment, the Warp 18 of this embodiment corrects the position of the subject in the video see-through image data based on the second positional relationship data. The second positional relationship data represents the positional relationship between the position of the user's pupil and the video see-through camera 41.

 The CPU 23 executes the calibration process before the head-mounted display 1b is used in the same manner as in the first embodiment and stores the first positional relationship data in the memory.

 Next, the flow of the process for acquiring the correction amount for pupil position misalignment in this embodiment will be described. FIG. 34 is a flowchart showing an example of the flow of the process for acquiring the correction amount for pupil position misalignment according to the second embodiment.

 First, the Mono ISP 12 acquires, from the Eye Tracking camera 43, image data of the pupils of the user (wearer) wearing the head-mounted display 1b (S31). The image data captured by the Eye Tracking camera 43 is referred to as Eye Tracking image data.

 The DSP&AI Accelerator 13 then detects the position of the user's pupil from the Eye Tracking image data (S32). The data indicating the position of the user's pupil is an example of pupil position data.

 The DSP in the DSP&AI Accelerator 13 then acquires the relative position between the user's eye 8 and the video see-through camera 41 based on the detection result of the position of the user's pupil (S33). The DSP 131 outputs second positional relationship data indicating the acquired relative position between the user's eye 8 and the video see-through camera 41 to a memory such as the SRAM 14a. The DSP 131 may also generate the second positional relationship data further based on feature points extracted from the video see-through image data. The DSP 131 may further store pupil position data indicating the position of the user's pupil in a memory such as the SRAM 14a. The processing of the flowchart in FIG. 34 then ends.

 The method of the Eye Tracking processing is not particularly limited, and a known method can be used. For example, data indicating the positional relationship between the video see-through camera 41 and the Eye Tracking camera 43 may be stored in advance in a memory such as the SRAM 14a. In this case, the DSP 131 may convert the relative relationship between the user's eye 8 and the Eye Tracking camera 43 into the relative position between the user's eye 8 and the video see-through camera 41 based on the data indicating this positional relationship. The processing of the flowchart in FIG. 34 is executed repeatedly while the head-mounted display 1b is being used by the user.
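
A sketch of this conversion, assuming the stored camera-to-camera positional relationship is kept as a 4x4 rigid transform (the storage format and names are assumptions):

```python
import numpy as np

def eye_to_vst_camera_offset(eye_pos_in_et: np.ndarray,
                             et_to_vst: np.ndarray) -> np.ndarray:
    """Convert the pupil position measured in the Eye Tracking camera frame
    into the video see-through camera frame using the fixed extrinsic
    transform between the two cameras.

    eye_pos_in_et : (3,) pupil position in the Eye Tracking camera frame
    et_to_vst     : (4, 4) rigid transform, Eye Tracking frame -> VST camera frame
    """
    p = np.append(eye_pos_in_et, 1.0)   # homogeneous coordinates
    return (et_to_vst @ p)[:3]          # relative position used as second positional relationship data
```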

 FIG. 35 is a flowchart showing an example of the flow of the display data generation process during use of the head-mounted display 1b according to the second embodiment. The image data acquisition process in S21 and the bright/dark area extraction process in S23 are the same as the processes of the first embodiment described with reference to FIG. 12.

 In the process of acquiring the positional relationship data of the subject in the image data in S22, the Warp 18 acquires the first positional relationship data and the second positional relationship data from a memory such as the SRAM 14a.

 The Warp 18 then corrects the positional relationship of the subject in the video see-through image data based on the second positional relationship data (S41). The processes from the deformation of the video see-through image data in S24 to the output of the display data and the transmittance data in S27 are the same as in the first embodiment described with reference to FIG. 12. The processing of this flowchart then ends.

 In this way, the SoC 100b of this embodiment detects the position of the user's pupil based on the Eye Tracking image data and generates the second positional relationship data representing the positional relationship between the pupil position and the video see-through camera 41. Therefore, the SoC 100b of this embodiment can reflect the pupil position shift that occurs when the display is actually worn, as well as individual differences between users, in the video see-through display image displayed on the transmissive display 3.

(Modification 1 of the Second Embodiment)
 In the second embodiment described above, the DSP&AI Accelerator 13 detects the position of the user's pupil from the Eye Tracking image data. In this modification, the DSP&AI Accelerator 13 extracts, from the Eye Tracking image data, pupil reflection image data reflected in the user's pupil. The DSP&AI Accelerator 13 is an example of the pupil reflection image extraction circuit in this modification.

 The DSP 131 of this modification generates second positional relationship data based on the pupil reflection image data and the feature points of the video see-through image data, and outputs it to a memory such as the SRAM 14a. The DSP 131 is an example of the image processing circuit in this modification.

 The SoC 100b of this modification extracts the pupil reflection image data reflected in the user's pupil from the Eye Tracking image data captured by the Eye Tracking camera 43 and generates the second positional relationship data based on the pupil reflection image data. Therefore, the SoC 100b of this modification can correct the video see-through image data while taking into account the image that the user is actually seeing.

(Other Modifications)
 In each of the embodiments described above, the SoCs 100a and 100b are mounted on the head-mounted displays 1a and 1b, but the SoCs 100a and 100b may be applied to other types of display devices. For example, the SoCs 100a and 100b may be used in a head-up display provided on the windshield of a vehicle.

 In each of the embodiments described above, the head-mounted displays 1a and 1b may be goggle-type devices having the two lenses 2a and 2b, or goggle-type devices having one large lens 2 for both eyes. The transmissive display 3 may also be provided over the entire surface of the lens 2 rather than over only part of it.

 The various processes executed by the SoCs 100a and 100b in the embodiments and modifications described above may be stored, for example, as a program on a non-volatile storage medium. The various processes described above may then be executed by, for example, the CPU 23 of the SoC 100a or 100b reading the program.

 Although the embodiments and modifications have been described above, the image processing device, image processing method, and image processing program disclosed in the present application are not limited to the embodiments described above as they are, and the components can be modified and embodied at the implementation stage without departing from the gist of the invention. Various inventions can also be formed by appropriately combining the plurality of components disclosed in the embodiments described above. For example, some components may be deleted from all the components shown in the embodiments.

1a, 1b Head-mounted display
2, 2a, 2b Lens
3, 3a, 3b Transmissive display
5, 5a, 5b Display projector
8, 8a, 8b Eye
9a, 9b Subject
10 Eyeglass body
11, 22 I2C interface
12 Mono ISP
13 DSP&AI Accelerator
14a to 14c SRAM
15 GPU
16 Color ISP
17 Time Warp
18 Warp
19 Display Controller
21 STAT
23 CPU
24 DRAM Controller
31 Flash memory
32 DRAM
41, 41a, 41b Video see-through camera
42, 42a, 42b Calibration camera
43, 43a, 43b Eye Tracking camera
60 Ambient Light Sensor
61 IMU
62 ToF sensor
63, 63a, 63b Head Tracking camera
71 Dark place
72 Bright place
90 Feature point
100a, 100b SoC
131 DSP
132 AI Accelerator
191 EN block
192 Blend block
201, 202 Light-blocking target area
301 Front surface
302 Back surface
310 Area
320 Area
901 First calibration image data
902 Second calibration image data
911 Pseudo-transmitted image data
921 to 925, 921a, 922a, 924a, 925a Video see-through image data
931 to 934, 931a Video see-through display image
940, 940a Pseudo-see-through display image data
941 Transmitted image

Claims (12)

1. A semiconductor device comprising:
 a memory that stores first positional relationship data indicating a positional relationship between a subject included in a transmission image transmitted through a transmissive display and the subject included in first image data captured by a first camera that captures an image in a direction in which a back surface of the transmissive display faces;
 a display data generation circuit that extracts and deforms at least a partial area of the first image data based on the first positional relationship data, and generates display data so that a contour of the subject included in the transmission image and a contour of the subject included in the area of the first image data are continuous on the transmissive display; and
 a display control circuit that displays the display data on the transmissive display.

2. The semiconductor device according to claim 1, further comprising an image processing circuit that calculates feature points of the transmission image based on the first image data, generates the first positional relationship data based on the feature points of the transmission image and the feature points of the first image data, and outputs the first positional relationship data to the memory.

3. The semiconductor device according to claim 2, wherein the image processing circuit
 extracts first feature points of the transmission image from second image data obtained by photographing, with a second camera from a front side of the transmissive display, the subject located on a back side of the transmissive display through the transmissive display,
 extracts second feature points of the subject from third image data obtained by capturing, with the second camera, the transmissive display on which the display data based on the first image data captured by the first camera is displayed, and
 generates the first positional relationship data based on a positional relationship between the first feature points and the second feature points.
4. The semiconductor device according to claim 1, further comprising an image processing circuit that extracts a bright area of the first image data and calculates a bright area of the transmission image based on the bright area of the first image data and the first positional relationship data,
 wherein the display data generation circuit generates the display data by deforming a portion of the extracted part of the first image data that is in contact with the bright area of the transmission image.

5. The semiconductor device according to claim 1, further comprising an image processing circuit that extracts a dark area of the first image data and calculates a dark area of the transmission image based on the dark area of the first image data and the first positional relationship data,
 wherein the display control circuit extracts a portion of the first image data corresponding to the dark area of the transmission image and adjusts its brightness.

6. The semiconductor device according to claim 1, further comprising an image processing circuit that extracts a bright area of the first image data and calculates a bright area of the transmission image based on the bright area of the first image data and the first positional relationship data,
 wherein the display control circuit controls a transmittance of a region of the transmissive display corresponding to the bright area of the transmission image to a second transmittance that is smaller than a first transmittance at the time when the image processing circuit calculated the bright area of the transmission image.
7. The semiconductor device according to claim 1, further comprising:
 a pupil detection circuit that detects a position of a pupil of an observer of the transmissive display based on fourth image data captured by a third camera that captures an image in a direction in which a front surface of the transmissive display faces; and
 an image processing circuit that generates second positional relationship data representing a positional relationship between the position of the pupil and the first camera based on the position of the pupil and the feature points of the first image data, and outputs the second positional relationship data to the memory.

8. The semiconductor device according to claim 1, further comprising:
 a pupil reflection image extraction circuit that extracts, from fourth image data captured by a third camera that captures an image in a direction in which a front surface of the transmissive display faces, pupil reflection image data reflected in a pupil of an observer of the transmissive display; and
 an image processing circuit that generates second positional relationship data representing a positional relationship between the position of the pupil and the first camera based on the pupil reflection image data and the feature points of the first image data, and outputs the second positional relationship data to the memory.

9. The semiconductor device according to claim 1, wherein the memory further stores pupil position data indicating a position of a pupil of an observer of the transmissive display, and
 the semiconductor device further comprises an image processing circuit that generates second positional relationship data representing a positional relationship between the position of the pupil and the first camera based on the pupil position data and the feature points of the first image data, and outputs the second positional relationship data to the memory.

10. The semiconductor device according to any one of claims 7 to 9, wherein the display data generation circuit further corrects the position of the subject in the first image data based on the second positional relationship data.
11. A method comprising:
 a first positional relationship data storage step of storing first positional relationship data indicating a positional relationship between a subject included in a transmission image transmitted through a transmissive display and the subject included in first image data captured by a first camera that captures an image in a direction in which a back surface of the transmissive display faces;
 a display data generation step of extracting and deforming at least a partial area of the first image data based on the first positional relationship data, and generating display data so that a contour of the subject included in the transmission image and a contour of the subject included in the area of the first image data are continuous on the transmissive display; and
 a display control step of displaying the display data on the transmissive display.

12. A head-mounted display comprising:
 at least one transmissive display;
 a display projector that displays display data on the transmissive display;
 a memory that stores first positional relationship data indicating a positional relationship between a subject included in a transmission image transmitted through the transmissive display and the subject included in first image data captured by a first camera that captures an image in a direction in which a back surface of the transmissive display faces;
 a display data generation circuit that extracts and deforms at least a partial area of the first image data based on the first positional relationship data, and generates the display data so that a contour of the subject included in the transmission image and a contour of the subject included in the area of the first image data are continuous on the transmissive display; and
 a display control circuit that displays the display data on the transmissive display.
PCT/JP2024/018748 2024-05-21 2024-05-21 Semiconductor device, method, and head-mounted display Pending WO2025243410A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/JP2024/018748 WO2025243410A1 (en) 2024-05-21 2024-05-21 Semiconductor device, method, and head-mounted display

Publications (1)

Publication Number Publication Date
WO2025243410A1 true WO2025243410A1 (en) 2025-11-27

Family

ID=97794899

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2024/018748 Pending WO2025243410A1 (en) 2024-05-21 2024-05-21 Semiconductor device, method, and head-mounted display

Country Status (1)

Country Link
WO (1) WO2025243410A1 (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2012172719A1 (en) * 2011-06-16 2012-12-20 パナソニック株式会社 Head-mounted display and misalignment correction method thereof
JP2016122177A (en) * 2014-12-25 2016-07-07 セイコーエプソン株式会社 Display device and control method of display device
JP7246708B2 (en) * 2019-04-18 2023-03-28 ViXion株式会社 head mounted display
CN110782499A (en) * 2019-10-23 2020-02-11 Oppo广东移动通信有限公司 A calibration method, calibration device and terminal device for augmented reality equipment
