JPH1132250A

JPH1132250A - Voice-guided landscape labeling device and system

Info

Publication number: JPH1132250A
Application number: JP9186676A
Authority: JP
Inventors: Takahiro Matsumura; 隆宏松村; Toshiaki Sugimura; 利明杉村; Masaji Katagiri; 雅二片桐; Masaji Takano; 正次高野; Takeshi Ikeda; 武史池田
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: NTT Inc
Priority date: 1997-07-11
Filing date: 1997-07-11
Publication date: 1999-02-02

Abstract

(57) [Summary]

[Problem] To present geographic information held on a computer to the user in association with the corresponding parts of a landscape image of a real scene.

[Solution] An image acquisition unit 1 acquires a landscape image, a position information acquisition unit 2 acquires the camera position at the time of image acquisition, and a camera attribute information acquisition unit 3 acquires the camera angle, focal length, and landscape image size. A map information management unit 5 determines the view space within the map information space from the acquired camera position, camera angle, focal length, and image size, and obtains the structures present in that view space. A guide voice information creation unit 6A creates guide voice information containing the name of each structure or its attribute information, and a guide voice information output unit 7A outputs the guide voice information.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Field of the Invention] The present invention relates to a device that informs a user about an image the user has captured with a landscape image input device such as a camera, by superimposing geographic information about each partial region of the image on an image display device, by voice guidance, or by similar means.

[0002]

[Description of the Related Art] Various navigation systems have conventionally been available as systems that inform users of geographic information about their surroundings.

[0003] FIG. 13 is a block diagram of the navigation device disclosed in Japanese Patent Application Laid-Open No. 8-273000. The device has a position updating unit 71 that, on receiving vehicle position data and movement data, updates the vehicle position by referring to road map data; a display data generation unit 72 that generates road data and background data for display from the map data and other sources; a three-dimensional moving image data creation unit 73 that creates three-dimensional moving image data from this display data; and a storage unit 74. When the user of the navigation device sets a travel route in advance, including the destination and waypoints, the device lets the user set the route while watching a realistic moving image of the roads that actually exist rather than a map screen.

[0004] With this device, a user traveling along an actual route can view a moving image display (for example, FIG. 14) that follows that route.

[0005]

[Problems to be Solved by the Invention] With that device, however, the user must ultimately match the real scene against the geographic information in the computer by eye in order to recognize what the things in the real scene are. That is, to understand what the actual buildings, roads, and mountains in front of them are, users must rely on their own eyes and perform the matching themselves, largely unconsciously, using the symbols and other cues in the map shown as a moving image. On a street corner, for example, the user compares the computer's map with the actual view to work out the bearing or find a landmark, looks in that direction, takes in the features of the building there, and then looks at the map again to work out what the building is.

[0006] The effort of repeatedly comparing the map on the computer with the real scene and matching the two by hand therefore cannot be avoided. The matching is particularly difficult in dim light or at night, when the real scene is hard to see.

[0007] An object of the present invention is to provide a landscape labeling device and system that inform the user of geographic information on a computer in association with each part of an image of the real scene (hereinafter called a landscape image).

[0008]

[Means for Solving the Problems] In the present invention, map data on a computer is prepared in advance as three-dimensional data. When an image (hereinafter called a landscape image, to distinguish it from a CG image) is input, the position, camera angle, focal length, and image size are acquired at the time of capture. The geographic information in the computer graphics (hereinafter CG) image that would be seen in the three-dimensional map space on the computer from the position, camera angle, and focal length used when the real scene was captured is then obtained, a guide voice containing that geographic information is created, and the landscape is described by the guide voice. Geographic information here means the name of a structure or the like in the image, or its attribute information, and attribute information means information about any attribute of the structure (for example, its outline or color). In this specification, the word "structure" covers not only man-made structures but all data in the map DB that has some geographic structure, including natural terrain such as mountains, rivers, and the sea. To obtain the geographic information, the landscape image is determined from the camera position, camera angle, focal length, and image size, and the structures in the images are obtained. The position in the landscape image where each structure should appear (hereinafter called the assignment position) is then determined, and guide voice information containing the name or attribute information of the structure is created.

[0009] To further improve the accuracy of matching structures in the landscape image with structures in the CG image, the structures obtained earlier are associated with each partial region of the landscape image by pattern matching. A CG image is created from the obtained structures, partial regions of the CG image are associated with the partial regions of the landscape image by pattern matching, and the structure from which each associated partial region originated is determined.

[0010] An example of how the CG image is created is as follows. The three-dimensional map DB is accessed using the previously acquired camera position, camera angle, focal length, and image size, and the view space within the three-dimensional map space is determined. The structures in the view space are obtained and, with the camera screen as the projection plane, the three-dimensional data of each structure is projected onto this plane by a three-dimensional projection transformation. Among the line data making up the projected figure of each structure, line data that is hidden behind other structures is then removed by hidden-line elimination using a technique such as the normal vector method. The CG image is divided into regions on the basis of the line data remaining after hidden-line elimination. Because a three-dimensional map DB is used, each region can be associated with the name of the structure from which it originates.

[0011] The structure name of the CG image partial region associated with each partial region of the landscape image by pattern matching is then extracted. The position coordinates in the real-scene image at which the extracted structure name should be superimposed are obtained by projecting the position coordinates of the structure in the three-dimensional map space onto the projection plane described above by a three-dimensional projection transformation. Guide voice information is created from these position coordinates. The guide voice information is output, and the user is guided by voice.
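The projection step in [0010] and [0011] can be illustrated with a minimal sketch. It assumes a pinhole camera described by a viewpoint and three orthonormal vectors (viewing direction, right, up), a camera screen whose width and height are given in the same unit as the focal length, and pixel coordinates with the origin at the top left. The function name project_to_pixel and all parameter names are hypothetical; the specification does not prescribe a particular formula.

```python
def dot(a, b):
    """Dot product of two 3-component vectors."""
    return sum(x * y for x, y in zip(a, b))


def project_to_pixel(point, eye, direction, right, up,
                     focal_length, screen_w, screen_h,
                     image_w_px, image_h_px):
    """Project a 3D map-space point onto the camera screen.

    Returns (px, py) pixel coordinates with the origin at the top left,
    or None when the point lies behind the viewpoint.
    """
    rel = tuple(p - e for p, e in zip(point, eye))
    depth = dot(rel, direction)          # distance along the line of sight
    if depth <= 0:
        return None                      # behind the camera: no projection
    scale = focal_length / depth         # similar triangles
    sx = scale * dot(rel, right)         # position on the screen plane
    sy = scale * dot(rel, up)
    px = (sx / screen_w + 0.5) * image_w_px
    py = (0.5 - sy / screen_h) * image_h_px   # pixel rows grow downwards
    return (px, py)
```

A point on the optical axis maps to the image centre, and a point outside the screen rectangle yields pixel coordinates outside the range 0 to image_w_px or 0 to image_h_px.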

[0012] A voice-guided landscape labeling device of the present invention has: image acquisition means for acquiring an image; position information acquisition means for acquiring the camera position at the time of image acquisition; camera attribute information acquisition means for acquiring the camera angle, focal length, and image size at the time the image was acquired; map information management means for managing map information, determining a view space within the map information space from the acquired camera position, camera angle, focal length, and image size, and obtaining the structures present in that view space; guide voice information creation means for creating guide voice information containing the name of each structure or its attribute information; guide voice information output means for outputting the guide voice information; and control means for controlling each of the above means.

[0013] Another voice-guided landscape labeling device of the present invention has: image acquisition means for acquiring an image; position information acquisition means for acquiring the camera position at the time of image acquisition; camera attribute information acquisition means for acquiring the camera angle, focal length, and image size at the time of image acquisition; image processing means for dividing the acquired image into a plurality of partial regions; map information management means for managing map information, determining a view space within the map information space from the acquired camera position, camera angle, focal length, and image size, and obtaining the structures present in that view space; guide voice information creation means for associating the obtained structures with the partial regions of the image and creating guide voice information containing the name of each associated structure or its attribute information; guide voice information output means for outputting the guide voice information; and control means for controlling each of the above means.

[0014] According to an embodiment of the present invention, the guide voice information creation means creates a CG image from the obtained structures, associates the partial regions of the image with partial regions of the CG image by pattern matching, determines the structure of each associated partial region, and creates guide voice information containing the name or attribute information of that structure.

[0015] According to an embodiment of the present invention, the guide voice information creation means projects the obtained structures onto the camera screen by a three-dimensional projection transformation, removes the structures that cannot be seen from the viewpoint to create a CG image, divides the CG image into partial regions along the outlines of the partial regions in it, associates the partial regions of the image with the partial regions of the CG image by pattern matching, determines the structure of each associated partial region, and creates guide voice information containing the name or attribute information of that structure.

[0016] A voice-guided landscape labeling system of the present invention consists of a landscape labeling terminal and a landscape labeling center.

The landscape labeling terminal has: image acquisition means for acquiring an image; position information acquisition means for acquiring the camera position at the time of image acquisition; camera attribute information acquisition means for acquiring the camera angle, focal length, and image size at the time of image acquisition; image processing means for dividing the acquired image into a plurality of partial regions; communication control means for sending the information on the region division of the image, the camera position, the camera angle, the focal length, and the image size to the landscape labeling center over a communication network and for receiving guide voice information from the landscape labeling center; guide voice information output means for outputting the guide voice information; and terminal control means for controlling each of the above means.

The landscape labeling center has: communication control means for receiving the information on the region division of the image, the camera position, the camera angle, the focal length, and the image size from the landscape labeling terminal over the communication network and for sending the label information to the landscape labeling terminal; map information management means for managing map information, determining a view space within the map information space from the received camera position, camera angle, focal length, and image size, and obtaining the structures present in that view space; guide voice information creation means for associating the obtained structures with the partial regions of the image and creating guide voice information containing the name or attribute information of each associated structure; and center control means for controlling each of the above means.

[0017]

[Embodiments of the Invention] Embodiments of the present invention are described below with reference to the drawings.

[0018] FIG. 1 is a configuration diagram of a voice-guided landscape labeling device according to a first embodiment of the present invention.

[0019] The landscape labeling device of this embodiment consists of: an image acquisition unit 1, for example a digital camera, that acquires an image; a position information acquisition unit 2, for example a GPS receiver, that acquires the camera position when the image is acquired; a camera attribute information acquisition unit 3, for example a three-dimensional electronic compass attached to the digital camera, that acquires the camera angle, focal length, and image size when the image is acquired; a map information management unit 5, for example a map DB management program, that manages map information, determines a view space within the map information space from the acquired position, camera angle, focal length, and image size, and obtains the structures present in that view space; a guide voice information creation unit 6A that creates guide voice information containing the name of each structure or its attribute information; a guide voice information output unit 7A, for example a speaker, that outputs the created guide voice; and a control unit 8A that controls units 1 to 7A.

[0020] FIG. 2 is a configuration diagram of a landscape labeling device according to a second embodiment of the present invention, and FIG. 3 is a flowchart of the processing of the landscape labeling device of FIG. 1.

[0021] The landscape labeling device of this embodiment consists of: an image acquisition unit 1, for example a digital camera, that acquires a landscape image; a position information acquisition unit 2, for example a GPS receiver, that acquires the camera position when the image is acquired; a camera attribute information acquisition unit 3, for example a three-dimensional electronic compass attached to the digital camera, that likewise acquires the camera angle, focal length, and image size when the image is acquired; an image processing unit 4 that divides the acquired image into a plurality of partial regions; a map information management unit 5 that manages map information, determines a view space within the map information space from the acquired camera position, camera angle, focal length, and image size, and obtains the structures present in that view space; a guide voice information creation unit 6B that associates the obtained structures with the partial regions of the image by pattern matching and creates guide voice information containing the name or attribute information of each associated structure; a guide voice information output unit 7B, for example a speaker, that outputs the created guide voice; and a control unit 8B that controls units 1 to 7B.

[0022] The operation of this embodiment is described in detail below.

[0023] When the landscape labeling device is started, the control unit 8B first sends a processing start command to the position information acquisition unit 2, the camera attribute information acquisition unit 3, and the image acquisition unit 1 in order to acquire the information relating to the landscape image. On receiving the command from the control unit 8B, the position information acquisition unit 2 collects position information every second using a GPS receiver or the like and passes it to the control unit 8B (step 21). The interval need not be one second; any interval may be used. On receiving the command from the control unit 8B, the image acquisition unit 1 acquires a landscape image every second and passes it to the control unit 8B (step 22). On receiving the command from the control unit 8B, the camera attribute information acquisition unit 3 acquires the camera angle of the landscape image recording device, such as a camera, at the time of capture as a pair consisting of a horizontal angle and an elevation angle (step 23) and, if the landscape image device has a zoom function, acquires the focal length at the same time (step 24). Because the image size is fixed for each landscape image device, the control unit 8B holds the image size information. The control unit 8B stores the collected information as a landscape image file.

[0024] FIG. 4 shows the file format of the data structure of the landscape image file. A landscape image file has header information and image data. The header information consists of position information, camera angle information, focal length, time information, and the image size, type, and size of the image file.

The position information consists of east longitude, north latitude, and altitude (for example, 137 degrees 55 minutes 10 seconds east, 34 degrees 34 minutes 30 seconds north, altitude 101 m 33 cm).
The camera angle consists of a horizontal angle and an elevation angle (for example, horizontal angle 254 degrees clockwise, elevation angle 15 degrees).
The focal length data is the focal length of the camera lens at the time of capture (for example, 28 mm).
The time information is the time of capture (for example, 15:06:17 on January 31, 1997, Japan time).
The image size of the image file is the pixel size in the horizontal and vertical directions (for example, 640 × 480).
The file type is likewise recorded (TIFE format, 8-bit color, and so on).
The number of bytes in the file is likewise recorded (307.2 KB, and so on).

The image data itself is held, for example, in binary form.
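The header layout in [0024] can be pictured as a small record type. The following is only an illustrative sketch: the class and field names (DmsAngle, LandscapeImageHeader, LandscapeImageFile) are hypothetical and do not come from the specification, and the values are the examples given in the paragraph above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class DmsAngle:
    """Angle given as degrees, minutes, seconds (as in the header example)."""
    degrees: int
    minutes: int
    seconds: float

    def to_decimal_degrees(self) -> float:
        return self.degrees + self.minutes / 60 + self.seconds / 3600


@dataclass
class LandscapeImageHeader:
    east_longitude: DmsAngle
    north_latitude: DmsAngle
    altitude_m: float
    horizontal_angle_deg: float   # measured clockwise
    elevation_angle_deg: float
    focal_length_mm: float
    captured_at: datetime
    image_width_px: int
    image_height_px: int
    file_type: str
    file_size_bytes: int


@dataclass
class LandscapeImageFile:
    header: LandscapeImageHeader
    image_data: bytes             # raw binary image payload


JST = timezone(timedelta(hours=9))  # Japan time

example_header = LandscapeImageHeader(
    east_longitude=DmsAngle(137, 55, 10),
    north_latitude=DmsAngle(34, 34, 30),
    altitude_m=101.33,
    horizontal_angle_deg=254.0,
    elevation_angle_deg=15.0,
    focal_length_mm=28.0,
    captured_at=datetime(1997, 1, 31, 15, 6, 17, tzinfo=JST),
    image_width_px=640,
    image_height_px=480,
    file_type="TIFE format, 8-bit color",  # as written in the text
    file_size_bytes=307_200,               # 307.2 KB
)
```

With 8-bit color, 640 × 480 pixels occupy 307,200 bytes, which matches the 307.2 KB example size.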

[0025] After storing the landscape image file, the control unit 8B instructs the image processing unit 4 to extract outlines from the landscape image and divide the landscape image into a plurality of regions. Broadly speaking, the image processing unit 4 extracts outlines by differentiation based on the density differences within the landscape image (step 25) and divides the image into regions by labeling with those outlines as boundaries (step 26). The term "labeling" used here is the technical term used in image region division and is different from "landscape labeling" in the title of the present invention. As the procedure, the image is first converted into a monochrome grayscale image. Because an outline is a part where the brightness changes sharply, outlines are extracted by differentiating the image and finding the parts where the differential value exceeds a threshold. The outlines should be one pixel wide and connected. Thinning is therefore applied to obtain connected lines one pixel wide. Conventional techniques are sufficient for the differentiation and thinning.
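The differentiation-and-threshold step of [0025] can be sketched as follows. This is a minimal illustration, not the method of the specification: it assumes the image is a list of rows of (r, g, b) tuples, uses a common luminance weighting for the grayscale conversion, approximates the derivative with forward differences, and omits the thinning step entirely. The function names are hypothetical.

```python
def to_grayscale(rgb_image):
    """Convert rows of (r, g, b) tuples to rows of brightness values."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in rgb_image]


def extract_edges(gray, threshold):
    """Mark pixels whose brightness gradient magnitude exceeds threshold."""
    height, width = len(gray), len(gray[0])
    edges = [[False] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Forward differences; pixels on the last row or column have
            # no neighbour in that direction, so that difference is zero.
            dx = gray[y][x + 1] - gray[y][x] if x + 1 < width else 0.0
            dy = gray[y + 1][x] - gray[y][x] if y + 1 < height else 0.0
            if (dx * dx + dy * dy) ** 0.5 > threshold:
                edges[y][x] = True
    return edges
```

For an image whose left half is black and whose right half is white, only the column just left of the boundary is marked as an outline.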

[0026] The outlines obtained are treated as the outlines of regions, and each region bounded by outlines is given a number. The largest of these numbers is the number of regions, and the number of pixels in a region represents its area. FIG. 9 shows an example of a landscape image divided into a plurality of partial regions. A measure of similarity (closeness) between regions may also be introduced, and clustering may be performed to merge several regions with similar properties into one region. Any existing clustering method may be used.
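The numbering operation in [0026] corresponds to connected-component labeling. The sketch below assumes an outline mask given as rows of booleans (True on an outline pixel) and uses a breadth-first flood fill with 4-connectivity; the specification does not name a particular algorithm, and label_regions is a hypothetical name.

```python
from collections import deque


def label_regions(edges):
    """Number the regions separated by outline pixels.

    Returns (labels, areas): labels holds a region number per pixel
    (0 on outline pixels), and areas maps region number to pixel count.
    """
    height, width = len(edges), len(edges[0])
    labels = [[0] * width for _ in range(height)]
    areas = {}
    next_label = 0
    for start_y in range(height):
        for start_x in range(width):
            if edges[start_y][start_x] or labels[start_y][start_x]:
                continue  # outline pixel, or already numbered
            next_label += 1
            labels[start_y][start_x] = next_label
            queue = deque([(start_y, start_x)])
            count = 0
            while queue:
                y, x = queue.popleft()
                count += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and not edges[ny][nx] and labels[ny][nx] == 0):
                        labels[ny][nx] = next_label
                        queue.append((ny, nx))
            areas[next_label] = count
    return labels, areas
```

The largest key in areas is the number of regions, and each value is the area of the corresponding region in pixels.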

[0027] When the region division of the landscape image is complete, the control unit 8B passes the header information of the landscape image file to the map information management unit 5 and issues a request to calculate the view space (step 27). An example of the map information management unit 5 is a map database program. The map information management unit 5 manages three-dimensional map data. Two-dimensional map data may also be used, but because it has no height information, the accuracy of the positions at which labels are assigned to the real scene is lower. When two-dimensional map data is used, the height information is supplemented before processing. For example, for two-dimensional data of a house, if floor-count information indicating how many stories the house has is available, the height of the house is estimated by multiplying the number of floors by a constant, and three-dimensional data is created from the two-dimensional data and the estimated height. Even when no floor-count information is available, the height can be estimated, for example by assigning a fixed height according to the area of the house figure, and three-dimensional data is likewise created from the estimated height. Processing then proceeds with the three-dimensional data created in this way.
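The height supplementation in [0027] might look like the following sketch. Every constant here (3 m per floor, the area-to-height table, the default height) is an assumption made for illustration; the specification only says that a constant is multiplied by the floor count, or that a fixed height is assigned according to the footprint area. The function names are hypothetical.

```python
def estimate_height_m(floors=None, footprint_area_m2=None,
                      metres_per_floor=3.0,
                      area_height_table=((100.0, 6.0), (1000.0, 15.0)),
                      default_height_m=30.0):
    """Estimate a building height for 2D map data that lacks heights."""
    if floors is not None:
        # Floor count known: multiply by a constant height per floor.
        return floors * metres_per_floor
    if footprint_area_m2 is None:
        raise ValueError("need a floor count or a footprint area")
    # No floor count: assign a fixed height by footprint area class.
    for max_area_m2, height_m in area_height_table:
        if footprint_area_m2 <= max_area_m2:
            return height_m
    return default_height_m


def extrude_footprint(footprint_xy, height_m, base_z=0.0):
    """Turn a 2D footprint polygon into the vertices of a 3D prism."""
    bottom = [(x, y, base_z) for x, y in footprint_xy]
    top = [(x, y, base_z + height_m) for x, y in footprint_xy]
    return bottom + top
```

For example, a four-story house with a square footprint becomes a prism with eight vertices and a height of 12 m under these assumed constants.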

[0028] FIG. 5 shows an example of three-dimensional map data. FIG. 5(1) shows the map information space represented in two dimensions, and FIG. 5(2) shows it represented in three dimensions. For this three-dimensional map information space, the map information management unit 5, on receiving the command from the control unit 8B, calculates the view space from the header information of the landscape image file (step 28). FIG. 6 shows an example of the view space calculation. First, the X and Y axes are taken to span the horizontal plane and the Z axis the vertical direction. The position of the viewpoint E is set in the three-dimensional map information space from the position information in the header information of the landscape image file. For example, for 137 degrees 55 minutes 19 seconds east, 34 degrees 34 minutes 30 seconds north, and an altitude of 101 m 33 cm, the corresponding coordinates within the corresponding map mesh number are set. The camera angle direction is likewise set from the horizontal angle and elevation angle in the camera angle information of the header. The focal point F is taken at the point on the straight line representing the camera angle direction that lies one focal length from the viewpoint E. The line-of-sight vector is the unit vector of length 1 that starts at the viewpoint E and lies along that line. From the image size of the landscape image file, the width x of the camera screen along the X axis is set from the horizontal size, and the width y along the Y axis is set from the vertical size. The plane of width x and height y is set so that it is perpendicular to the line-of-sight vector, that is, to the camera angle direction, and contains the focal point F. The straight lines joining the viewpoint E to each of the four corner points of the camera screen are found, and the three-dimensional space bounded by the four half-lines extending from the viewpoint E is taken as the view space. FIG. 7 shows an example of the view space in the three-dimensional map space, with the three-dimensional map space viewed from the XZ plane. The hatched part in FIG. 7 is the cross-section, in the XZ plane, of the space belonging to the view space. In the example of FIG. 7, the view space contains buildings and a mountain.
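The construction of the view space in [0028] can be sketched in code. The sketch assumes that Y points north and X points east, that the horizontal angle is measured clockwise from north (the specification gives "254 degrees clockwise" without stating the reference direction), and that the camera screen width and height are already expressed in the same unit as the focal length. The name view_space and the dictionary keys are hypothetical.

```python
import math


def cross(a, b):
    """Cross product of two 3-component vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def view_space(eye, horizontal_deg, elevation_deg, focal_length,
               screen_w, screen_h):
    """Build the viewpoint E, focal point F, and camera screen corners."""
    az = math.radians(horizontal_deg)   # assumed clockwise from north (+Y)
    el = math.radians(elevation_deg)
    # Unit line-of-sight vector from the horizontal and elevation angles.
    direction = (math.cos(el) * math.sin(az),
                 math.cos(el) * math.cos(az),
                 math.sin(el))
    right = (math.cos(az), -math.sin(az), 0.0)   # horizontal, to the right
    up = cross(right, direction)                 # completes the screen basis
    # Focal point F: one focal length from E along the line of sight.
    focal_point = tuple(e + focal_length * d for e, d in zip(eye, direction))
    corners = []
    for sx, sy in ((-1, -1), (1, -1), (1, 1), (-1, 1)):
        corners.append(tuple(
            f + sx * screen_w / 2 * r + sy * screen_h / 2 * u
            for f, r, u in zip(focal_point, right, up)))
    return {"eye": eye, "direction": direction, "right": right, "up": up,
            "focal_point": focal_point, "corners": corners}
```

The four half-lines from eye through the returned corners bound the view space described above.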

[0029] Further, the map information management unit 5 obtains the structures existing in the visual field space thus obtained. For each structure, it is calculated whether each vertex of the solid representing the structure lies inside the visual field space. A two-dimensional map space is usually divided by a two-dimensional mesh of fixed size. To mesh the three-dimensional map space, the mesh is cut at regular intervals in the height direction as well, in addition to the two-dimensional mesh in the vertical and horizontal directions. The space is thereby divided into rectangular-parallelepiped unit spaces. First, each rectangular-parallelepiped unit space is checked for overlap with the visual field space, and the numbers of the three-dimensional unit map spaces that overlap are obtained. The number of a three-dimensional unit map space here is analogous to a so-called mesh number. Each structure within an overlapping three-dimensional unit map space is then checked for overlap with the visual field space. A straight line connecting the coordinates of a vertex of the structure and the coordinates of the viewpoint is obtained; if that straight line intersects the camera screen of FIG. 8, the vertex is within the visual field space. If even one of the vertices constituting the structure satisfies this condition, the structure is regarded as overlapping the visual field space.
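As a minimal sketch of the vertex test above, the check can be written as four half-space tests against the side planes of the visual field space. This formulation is an assumption chosen for illustration: for points in front of the viewpoint it is equivalent to asking whether the line from the viewpoint to the vertex crosses the camera screen. The function names are hypothetical, and `corners` is assumed to list the four screen corners in order around the rectangle. The coarse pre-filter over three-dimensional unit map spaces is not shown.

```python
def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def vertex_in_view_space(P, E, corners):
    """True if P lies in the pyramid with apex E spanned by the four screen corners."""
    d = _sub(P, E)
    rays = [_sub(c, E) for c in corners]
    centre = tuple(sum(axis) for axis in zip(*rays))   # points into the pyramid
    for i in range(4):
        n = _cross(rays[i], rays[(i + 1) % 4])         # normal of one side plane
        # P must be on the same side of this plane as the centre direction.
        if _dot(n, d) * _dot(n, centre) < 0:
            return False
    return True

def structure_overlaps_view(vertices, E, corners):
    """Rule from the text: one vertex inside is enough for the structure to count."""
    return any(vertex_in_view_space(v, E, corners) for v in vertices)
```

Note that a rule based only on vertices does not detect a structure whose vertices all lie outside the visual field space while one of its faces crosses it.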

[0030] When a structure is contained in the visual field space, in whole or in part, processing begins to three-dimensionally project each structure onto a projection plane, taking the camera screen as the projection plane (step 29). Here, as shown in FIG. 8, the point P is first re-expressed in the coordinate system based on the viewpoint E using equation (1) below, and is then projected onto the camera screen to obtain the intersection Q.

[0031]

[Equation 1]

Here,
Point P = (x, y, z): coordinates of a vertex constituting the structure
Point E = (ex, ey, ez): coordinates of the viewpoint
Vector L = (lx, ly, lz): line-of-sight direction vector (unit vector)
Point P' = (x', y', z'): coordinates of the point P expressed in the coordinate system based on the viewpoint E
r = (lx² + ly²)^(1/2)
Intersection Q = (X, Y): projection of the point P onto the camera screen
t: focal length

In the three-dimensional projection conversion, the faces spanned by the vertices are first obtained for each structure. For example, a structure represented by a rectangular parallelepiped yields six faces. When each face is projected onto the camera screen, the distance between the viewpoint and the corresponding point on the face is calculated for each pixel of the camera screen contained in the projected region, and is stored in memory as a depth value (Z value). For each face of each structure, the depth value (Z value) for each pixel on the camera screen is calculated and stored in memory. Note that z' in equation (1) represents the depth value (Z value) from the viewpoint.
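Equation (1) itself is not reproduced in this text, so the following Python sketch shows only one conventional way to carry out the two steps that the symbol definitions describe: express P in a viewpoint-based coordinate system whose third axis is the line of sight L, then divide by the depth z' and scale by the focal length t to obtain Q. The orientation and sign of the two screen axes are assumptions and may differ from those in the patent's equation; the function names are hypothetical.

```python
import math

def to_viewpoint_coords(P, E, L):
    """Return P' = (x', y', z') for point P, viewpoint E and unit vector L.

    z' is the component of (P - E) along L, i.e. the depth from the viewpoint.
    Assumes L is not vertical, so that r = sqrt(lx^2 + ly^2) is non-zero.
    """
    lx, ly, lz = L
    r = math.sqrt(lx * lx + ly * ly)
    dx, dy, dz = (p - e for p, e in zip(P, E))
    x_c = (ly * dx - lx * dy) / r                          # horizontal screen axis
    y_c = (-lx * lz * dx - ly * lz * dy) / r + r * dz      # vertical screen axis
    z_c = lx * dx + ly * dy + lz * dz                      # depth along L
    return x_c, y_c, z_c

def project_to_screen(P, E, L, t):
    """Return (Q, depth) with Q = (X, Y) on the camera screen, or None if P is not in front."""
    x_c, y_c, z_c = to_viewpoint_coords(P, E, L)
    if z_c <= 0:
        return None
    return (t * x_c / z_c, t * y_c / z_c), z_c
```

A point lying exactly on the line of sight projects to the centre of the screen, Q = (0, 0), and its depth equals its distance from E.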

[0032] Among the structures projected onto the camera screen, some are visible from the viewpoint and some are not. It is necessary to obtain only the structures visible from the viewpoint, identifying the faces on the far side from the viewpoint and the faces blocked by other structures. Hidden surface processing is therefore performed (step 30). There are various hidden surface processing methods; here, for example, the Z-buffer method is used. Other methods, such as the scan-line method or ray tracing, may also be used.

[0033] An arbitrary pixel on the camera screen is taken, and the face with the smallest depth value for that pixel is obtained. As this processing is continued in turn for each face of each structure, the face closest to the viewpoint is left for each pixel on the camera screen. Since the face closest to the viewpoint is determined for each pixel, and the pixels sharing the same closest face generally form a region, the camera screen yields multiple regions, each consisting of pixels that share a common closest face. The regions obtained in this way are the result of three-dimensionally projecting the partial regions of the structures visible from the viewpoint. Faces on the far side from the viewpoint and faces blocked by other structures have been eliminated.
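A minimal Z-buffer sketch of the hidden surface processing follows. It assumes that rasterisation has already produced, for each face, a list of fragments `(x, y, depth, face_id)` covering the pixels of its projected region; that input format and the function names are hypothetical, introduced only for illustration.

```python
def z_buffer(width, height, fragments):
    """Keep, for every pixel, the face with the smallest depth value.

    fragments: iterable of (x, y, depth, face_id) tuples.
    Returns a height x width grid of face ids (None where nothing was drawn).
    """
    inf = float("inf")
    depth = [[inf] * width for _ in range(height)]
    owner = [[None] * width for _ in range(height)]
    for x, y, z, face_id in fragments:
        if 0 <= x < width and 0 <= y < height and z < depth[y][x]:
            depth[y][x] = z
            owner[y][x] = face_id
    return owner

def regions_by_face(owner):
    """Group pixels that share the same closest face into one region per face."""
    regions = {}
    for y, row in enumerate(owner):
        for x, face_id in enumerate(row):
            if face_id is not None:
                regions.setdefault(face_id, set()).add((x, y))
    return regions
```

For example, on a 4 x 1 screen where face "A" covers pixels 0 to 2 at depth 5 and face "B" covers pixels 1 to 3 at depth 3, face "A" keeps only pixel 0 and face "B" owns pixels 1 to 3.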

[0034] The regions thus formed constitute the CG image regions (step 31).

[0035] For the vertex coordinates of the two-dimensional figures constituting the CG image regions, the three-dimensional coordinates before the projection conversion are obtained, and the correspondence between the two is stored in memory as link information. The link information is used, for example, to determine which structure a given two-dimensional region is a projection of.

[0036] The CG image is divided into regions based on the line data remaining after hidden-line removal. Because a three-dimensional map DB is used, each region can be associated with the name of the structure from which it derives. The divided regions of the CG image are numbered in order. FIG. 10 shows an example of a CG image divided into multiple partial regions.

[0037] When the region division of the CG image is complete, the control unit 8B instructs the guide voice information creation unit 6B to associate the divided regions of the CG image with the divided regions of the landscape image. The guide voice information creation unit 6B associates the divided regions of the CG image with the divided regions of the landscape image by template matching (step 32; see FIG. 11).

[0038] The divided regions of the landscape image are associated with the divided regions of the CG image in order, starting from the region with the lowest number (for example, No. 1). Any conventional matching method may be used for the association, but here a simple template matching method is adopted. That is, the two regions to be compared are superimposed, and when the proportion of the overlapping portion is equal to or greater than a fixed ratio set as a threshold, the two are associated as regions relating to the same structure. For example, for the first divided region R1 of the landscape image, let the coordinate values of each pixel within the region be (A, B). The pixel value at coordinates (A, B) is 1, since the point is inside the region. In the first divided region S1 of the CG image, if the coordinates (A, B) lie within S1, the pixel value is 1 and the regions overlap there; if they lie outside S1, the pixel value is 0 and the regions do not overlap. The overlap coefficient K(A, B) at coordinates (A, B) is thus 1 where the regions overlap and 0 where they do not. The coordinates (A, B) are moved throughout the region R1 and the overlap coefficient K(A, B) is obtained at each position. Then, for the number N1 of coordinates (A, B) visited within R1, the number N2 of coordinates at which K(A, B) was 1 is obtained, and if N2/N1 is equal to or greater than the threshold, the divided region R1 of the landscape image and the divided region S1 of the CG image are determined to correspond. This association is carried out from the first divided region of the landscape image to the last. As the matching method, an evaluation function that yields the same value even when there is a slight positional shift in the X and Y directions may also be used.
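The simple template matching can be sketched as follows, taking the overlap ratio as the fraction N2/N1 of the pixels of R1 that also fall inside S1, which is the reading consistent with "the proportion of the overlapping portion". Regions are represented as sets of (A, B) pixel coordinates. The function names, the default threshold of 0.5, and the choice to keep the best-scoring CG region when several exceed the threshold are assumptions made for this sketch.

```python
def overlap_ratio(region_r, region_s):
    """N2 / N1: share of the pixels of region_r whose overlap coefficient K is 1."""
    n1 = len(region_r)
    if n1 == 0:
        return 0.0
    n2 = sum(1 for pixel in region_r if pixel in region_s)   # K(A, B) = 1
    return n2 / n1

def match_regions(landscape_regions, cg_regions, threshold=0.5):
    """Map each landscape region id to a CG region id, or None if nothing matches.

    Both arguments are dicts of region id -> set of (A, B) pixels.
    Landscape regions are processed in ascending order of their number.
    """
    result = {}
    for r_id in sorted(landscape_regions):
        best_id, best_ratio = None, 0.0
        for s_id in sorted(cg_regions):
            ratio = overlap_ratio(landscape_regions[r_id], cg_regions[s_id])
            if ratio >= threshold and ratio > best_ratio:
                best_id, best_ratio = s_id, ratio
        result[r_id] = best_id
    return result
```

With R1 covering four pixels, three of which lie in S1, the ratio is 0.75 and R1 is associated with S1 at a threshold of 0.5.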

[0039] After associating the partial regions of the CG image with the partial regions of the landscape image, the guide voice information creation unit 6B proceeds to create the guide voice information to be output for each partial region of the landscape image (step 34). First, for a partial region of the landscape image, the corresponding partial region of the CG image is retrieved. The retrieved partial region of the CG image was originally obtained by three-dimensionally projecting a face of a three-dimensional structure in the three-dimensional map space onto the camera screen. The face of the three-dimensional structure from which the projection originated is therefore found using the depth value (Z value) of the partial region of the CG image as a key. The link information created earlier, at the time of the three-dimensional projection conversion, may be used as the key instead. Based on the originating face of the structure, the three-dimensional map DB is accessed to obtain the name or attribute information of the structure. Here, attribute information means information accompanying the structure, and may be any information relating to that structure. Guide voice information containing the name or attribute information of the structure is then created.
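The lookup chain described here (landscape region, matched CG region, originating face, map DB record) can be sketched with plain dictionaries. The data layout is hypothetical: `link_info` is assumed to map a CG region number to a `(structure_id, face_index)` pair recorded during projection, and `map_db` is a stand-in for the three-dimensional map DB.

```python
def structure_for_region(cg_region_id, link_info, map_db):
    """Follow the link information from a CG region to its map DB record."""
    link = link_info.get(cg_region_id)
    if link is None:
        return None
    structure_id, _face_index = link
    return map_db.get(structure_id)

def records_for_landscape_regions(matches, link_info, map_db):
    """matches: landscape region id -> CG region id (or None when unmatched)."""
    records = {}
    for landscape_id, cg_id in matches.items():
        if cg_id is None:
            continue
        record = structure_for_region(cg_id, link_info, map_db)
        if record is not None:
            records[landscape_id] = record
    return records
```

Landscape regions without a matched CG region, or whose CG region has no link entry, are simply left out of the result.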

[0040] An example of the guide voice information will now be described.

[0041] When the device has been set in advance to give guidance on the location of roads, information about roads (for example, "National Highway No. 1, assigned coordinates (x1, y1)" and "pedestrian crossing, assigned coordinates (x2, y2)") is extracted from the map information, and based on that information a guide voice such as "National Highway No. 1 crosses in front of you. There is a pedestrian crossing directly ahead." is generated automatically using existing speech synthesis technology. When the device has been set in advance to give guidance on objects occupying a large area in the landscape image, the partial region of the landscape image corresponding to each piece of name information in the map information is obtained, the area of each partial region is calculated, and the several largest are extracted. Based on the name information corresponding to the extracted partial regions and their assigned coordinates, a guide voice such as "Lion Mansion Ginza is 45 degrees to the right, at a distance of 100 m." is likewise generated automatically.
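The area-based guidance can be sketched as follows. The sketch only builds the sentence text; speech synthesis is left to an existing engine, as the description says. It assumes planar map coordinates in metres with X pointing east and Y pointing north, and a camera heading measured counter-clockwise from +X; the function names, the record fields, and the sentence wording are hypothetical.

```python
import math

def bearing_and_distance(camera_xy, heading_deg, target_xy):
    """Signed angle from the heading to the target (positive = to the right) and distance."""
    dx = target_xy[0] - camera_xy[0]
    dy = target_xy[1] - camera_xy[1]
    target_deg = math.degrees(math.atan2(dy, dx))
    angle = (heading_deg - target_deg + 180.0) % 360.0 - 180.0
    return angle, math.hypot(dx, dy)

def guide_sentences(regions, camera_xy, heading_deg, top_n=3):
    """regions: list of dicts with 'name', 'area' (pixel count) and 'xy' (assigned coordinates).

    Picks the top_n regions by area and writes one sentence for each.
    """
    largest = sorted(regions, key=lambda reg: reg["area"], reverse=True)[:top_n]
    sentences = []
    for reg in largest:
        angle, dist = bearing_and_distance(camera_xy, heading_deg, reg["xy"])
        if round(angle) == 0:
            where = "straight ahead"
        else:
            side = "right" if angle > 0 else "left"
            where = f"{abs(angle):.0f} degrees to the {side}"
        sentences.append(f"{reg['name']} is {where}, about {dist:.0f} m away.")
    return sentences
```

With the camera at the origin facing north (heading 90 degrees), a structure 100 m away to the north-east yields "Lion Mansion Ginza is 45 degrees to the right, about 100 m away."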

[0042] When the guide voice information creation unit 6B has finished creating the guide voice information, it passes the guide voice information to the control unit 8B.

[0043] On receiving the guide voice information, the control unit 8B instructs the guide voice information output unit 7B to output it. The guide voice information is output from the guide voice information output unit 7B (step 36).

[0044] When the guide voice information output unit 7B has output the guide voice information, it notifies the control unit 8B that output is complete. On receiving the output completion notification, the control unit 8B executes the series of processing steps described above again if landscape labeling is to be performed continuously.

[0045] FIG. 12 is a configuration diagram of a voice-guided landscape labeling system in which the landscape labeling device of FIG. 2 is applied to a communication system.

[0046] The landscape labeling system consists of a landscape labeling terminal 40, a landscape labeling center 50, and a communication network 60.

[0047] The landscape labeling terminal 40 consists of: an image acquisition unit 41 that acquires an image; a position information acquisition unit 42 that acquires the camera position at the time of image acquisition; a camera attribute information acquisition unit 43 that acquires the camera angle, focal length, and image size at the time of image acquisition; an image processing unit 44 that divides the acquired image into a plurality of partial regions; a communication control unit 45 that transmits the information on the region division of the image, the camera position, the camera angle, the focal length, and the image size to the landscape labeling center 50 via the communication network 60 and receives guide voice information from the landscape labeling center 50; a guide voice information output unit 47 that outputs the guide voice information containing the name or attribute information of the structure in the label information; and a terminal control unit 46 that controls each of the above units.

[0048] The landscape labeling center 50 consists of: a communication control unit 53 that receives the information on the region division of the image, the camera position, the camera angle, the focal length, and the image size from the landscape labeling terminal 40 via the communication network 60 and transmits guide voice information to the landscape labeling terminal 40; a map information management unit 51 that manages map information, obtains the visual field space in the map information space based on the received camera position, camera angle, focal length, and image size, and acquires the structures existing in that visual field space; a guide voice information creation unit 52 that associates the acquired structures with the partial regions of the image by pattern matching and creates guide voice information containing the name or attribute information of the associated structure; and a center control unit 54 that controls each of the above units.
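The division of work between the terminal 40 and the center 50 can be sketched as a request and response pair. This is a hedged illustration only: the class and field names are hypothetical, the transport over the communication network 60 is omitted, and the two callables stand in for the map information management unit 51 and the guide voice information creation unit 52.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set, Tuple

Pixel = Tuple[int, int]

@dataclass
class LabelingRequest:
    """What the terminal sends: region division plus camera metadata."""
    regions: Dict[int, Set[Pixel]]
    camera_position: Tuple[float, float, float]
    camera_angle: Tuple[float, float]          # (horizontal angle, elevation angle)
    focal_length: float
    image_size: Tuple[int, int]

@dataclass
class LabelingResponse:
    """What the center returns: guide voice information, here as text sentences."""
    guide_voice_info: List[str]

class LabelingCenter:
    def __init__(self,
                 find_structures: Callable[..., list],
                 create_guide: Callable[[Dict[int, Set[Pixel]], list], List[str]]):
        self.find_structures = find_structures   # stand-in for unit 51
        self.create_guide = create_guide         # stand-in for unit 52

    def handle(self, request: LabelingRequest) -> LabelingResponse:
        structures = self.find_structures(
            request.camera_position, request.camera_angle,
            request.focal_length, request.image_size)
        return LabelingResponse(self.create_guide(request.regions, structures))
```

Keeping the map DB and the matching on the center side means the terminal needs only image capture, segmentation, and audio output.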

[0049] The guide voice information creation unit 52 can have the same configuration as the guide voice information creation unit 6B in FIG. 2.

[0050]

[Effects of the Invention] As described above, according to the present invention, each part of a landscape image of a real scene is announced to the user by voice, so that the names and the like of what appears in the landscape image can be understood without looking at the screen, and even by a visually impaired person.

[Brief description of the drawings]

FIG. 1 is a configuration diagram of a voice-guided landscape labeling device according to a first embodiment of the present invention.

FIG. 2 is a configuration diagram of a voice-guided landscape labeling device according to a second embodiment of the present invention.

FIG. 3 is a flowchart of the processing of the voice-guided landscape labeling device according to the second embodiment.

FIG. 4 is a diagram showing the data structure of a landscape image file.

FIG. 5 is a diagram showing an example of a two-dimensional map (part (1) of the figure) and its three-dimensional map (part (2) of the figure).

FIG. 6 is a diagram showing the method of calculating the visual field space.

FIG. 7 is a diagram showing an example of the visual field space in the three-dimensional map space.

FIG. 8 is a diagram showing an example of a projection.

FIG. 9 is a diagram showing an example of region division of a landscape image.

FIG. 10 is a diagram showing an example of region division of a CG image.

FIG. 11 is an explanatory diagram of pattern matching between partial regions of the landscape image and partial regions of the CG image.

FIG. 12 is a configuration diagram of a voice-guided landscape labeling system according to one embodiment of the present invention.

FIG. 13 is a configuration diagram of the navigation device disclosed in Japanese Patent Application Laid-Open No. 8-273000.

FIG. 14 is a diagram showing a display example of a moving image.

[Explanation of symbols]

1: landscape image acquisition unit
2: position information acquisition unit
3: camera attribute information acquisition unit
4: image processing unit
5: map information management unit
6A, 6B: guide voice information creation unit
7A, 7B: guide voice information output unit
8A, 8B: control unit
21 to 36: steps
40: landscape labeling terminal
41: landscape image acquisition unit
42: position information acquisition unit
43: camera attribute information acquisition unit
44: image processing unit
45: communication control unit
46: terminal control unit
47: guide voice information output unit
50: landscape labeling center
51: map information management unit
52: guide voice information creation unit
53: communication control unit
54: center control unit
60: communication network

Continuation of front page
(51) Int. Cl.⁶, identification code, FI: G10L 3/00; G06F 15/62 350A; H04N 7/18 380
(72) Inventor: Masaji Takano, c/o Nippon Telegraph and Telephone Corporation, 3-19-2 Nishi-Shinjuku, Shinjuku-ku, Tokyo
(72) Inventor: Takeshi Ikeda, c/o Nippon Telegraph and Telephone Corporation, 3-19-2 Nishi-Shinjuku, Shinjuku-ku, Tokyo

Claims

[Claims]

1. A landscape labeling device comprising: image acquisition means for acquiring an image; position information acquisition means for acquiring the camera position at the time of image acquisition; camera attribute information acquisition means for acquiring the camera angle, focal length, and image size at the time of image acquisition; map information management means for managing map information, obtaining a visual field space in the map information space based on the acquired camera position, camera angle, focal length, and image size, and acquiring the structures existing in the visual field space; guide voice information creating means for creating guide voice information including the name of a structure or its attribute information; guide voice information output means for outputting the guide voice information; and control means for controlling the above means.

2. A landscape labeling device comprising: image acquisition means for acquiring an image; position information acquisition means for acquiring the camera position at the time of image acquisition; camera attribute information acquisition means for acquiring the camera angle, focal length, and image size at the time of image acquisition; image processing means for dividing the acquired image into a plurality of partial regions; map information management means for managing map information, obtaining a visual field space in the map information space based on the acquired camera position, camera angle, focal length, and image size, and acquiring the structures existing in the visual field space; guide voice information creating means for associating the acquired structures with the partial regions of the image and creating guide voice information including the name of the associated structure or its attribute information; and guide voice information output means for outputting the created guide voice information.

3. The device according to claim 2, wherein the guide voice information creating means creates a CG image, which is a computer graphics image, based on the acquired structures, associates each partial region of the image with a partial region in the CG image by pattern matching, obtains the structure of the associated partial region, and creates guide voice information including the name or attribute information of that structure.

4. The device according to claim 3, wherein the guide voice information creating means three-dimensionally projects the acquired structures onto the camera screen, creates a CG image by erasing the structures that cannot be seen from the viewpoint, divides the CG image into partial regions by the contour lines of the partial regions in the CG image, associates the partial regions of the image with the partial regions of the CG image by pattern matching, obtains the structure of the associated partial region, and creates guide voice information including the name or attribute information of that structure.

5. A landscape labeling system comprising a landscape labeling terminal and a landscape labeling center, wherein the landscape labeling terminal has: image acquisition means for acquiring an image; position information acquisition means for acquiring the camera position at the time of image acquisition; camera attribute information acquisition means for acquiring the camera angle, focal length, and image size at the time of image acquisition; image processing means for dividing the acquired image into a plurality of partial regions; communication control means for transmitting information on the region division of the image, the camera position, the camera angle, the focal length, and the image size to the landscape labeling center via a communication network and receiving guide voice information from the landscape labeling center; guide voice information output means for outputting the guide voice information; and terminal control means for controlling each of the above means; and wherein the landscape labeling center has: communication control means for receiving, via the communication network, the information on the region division of the image, the camera position, the camera angle, the focal length, and the image size from the landscape labeling terminal and transmitting the guide voice information to the landscape labeling terminal; map information management means for managing map information, obtaining a visual field space in the map information space based on the received camera position, camera angle, focal length, and image size, and acquiring the structures existing in the visual field space; guide voice information creating means for associating the acquired structures with the partial regions of the image and creating the guide voice information including the name or attribute information of the associated structure; and center control means for controlling each of the above means.

6. The system according to claim 5, wherein the guide voice information creating means creates a CG image, which is a computer graphics image, based on the acquired structures, associates each partial region of the image with a partial region in the CG image by pattern matching, obtains the structure of the associated partial region, and creates guide voice information including the name or attribute information of that structure.

7. The system according to claim 5, wherein the guide voice information creating means three-dimensionally projects the acquired structures onto the camera screen, creates a CG image by erasing the structures that cannot be seen from the viewpoint, divides the CG image into partial regions by the contour lines of the partial regions in the CG image, associates the partial regions of the image with the partial regions of the CG image by pattern matching, obtains the structure of the associated partial region, and creates guide voice information including the name or attribute information of that structure and its assigned position.