JP3533754B2

JP3533754B2 - Image synthesis device

Info

Publication number: JP3533754B2
Application number: JP11147395A
Authority: JP
Inventors: 一生登; 郷鴨川
Original assignee: Panasonic Corp; Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Corp; Panasonic Holdings Corp
Priority date: 1995-05-10
Filing date: 1995-05-10
Publication date: 2004-05-31
Anticipated expiration: 2019-05-31
Also published as: JPH08305830A

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to an image synthesizing device that synthesizes a moving image in which an object image in a prerecorded moving image is replaced with another, input object image.

[0002]

[Prior Art] In recent years, image synthesizing devices have been realized that take in an image from an image scanner, a video camera, or a similar source and either combine it with a prerecorded background or foreground image or synthesize an image that blends a plurality of images. Face images in particular are often the target. Techniques that synthesize an image in which all or part of the face in a prerecorded image is replaced with the face image of another person are expected to spread beyond practical uses, such as montage systems for criminal investigation and videophones, into the arts, amusement, and other fields.

[0003] As an example of prior art for realizing such an image synthesizing device, there is a technique for face images that exploits the fact that the relative positions of parts such as the eyes and nose are generally common to human faces. Standard structure information is associated with each face image, and images are synthesized using the correspondence between the structure information associated with the respective face images.

[0004] For example, according to a research report by Harashima et al. (Japanese Society for Artificial Intelligence, study group material SIG-HICG-8901-3), each of two face images is put into correspondence with a reference face model that approximates the three-dimensional structure of a face by a polyhedron. By interpolating between the textures of the two face images and between the reference face models fitted to each of them, a so-called in-between image can be synthesized.

[0005] With this technique, a reference face model is associated in advance with a prerecorded image containing a face. By obtaining the correspondence between an input face image and the reference face model, the face portion of the prerecorded image can be replaced with the input face image.

[0006]

[Problems to Be Solved by the Invention] In the conventional method described above, however, the shape of the input face image generally differs from the shape of the prerecorded face image. Because of this difference, regions with no background image can appear at the boundary between the face and the background, so a natural image cannot be synthesized. For example, suppose the face in a prerecorded image that includes background is wide and it is replaced with an input face image that is tall and narrow. A natural composite image then requires the background at the left and right edges of the recorded face, but that information does not exist, so a natural composite image is difficult to obtain.

[0007] This problem can be solved by recording extra background image information in advance. When the conventional technique is applied to a moving image, a moving image can be synthesized by repeating the same method for every frame of the prerecorded moving image. Recording a moving image already requires a very large capacity, however, so recording extra image information for every frame increases the required storage capacity even further.

[0008] The present invention was conceived by noting the following: a moving image that is recorded in advance can be processed beforehand and recorded in layered (hierarchical) form; layering allows the moving image to be encoded efficiently; and layering makes it possible to obtain a background image that cannot be obtained from any single frame image.

[0009] In view of the above problems, an object of the present invention is therefore to provide an image synthesizing device that, when replacing an object image in a prerecorded moving image with another input object image, generates a natural moving image free of missing background and can record the moving image with a smaller storage capacity.

[0010]

[Means for Solving the Problems] To achieve the above object, the present invention comprises: a database unit that records structure information time-series data, which has structure information indicating the correspondence between the object image in each frame of a moving image and the structure of the object, and encoded moving image data, obtained by separating the moving image according to spatial front-to-back order to form hierarchical images and encoding them; a hierarchical image decoding unit that reads the encoded moving image data from the database unit and decodes each of the layered images to generate hierarchical images; a structure information selection unit that selects and outputs the corresponding structure information from the structure information time-series data read from the database unit; an object image synthesizing unit that receives from outside an input image of another object having a structure similar to that of the object in the moving image and input structure information, which is the structure information corresponding to the input image, processes the structure information using the structure information output by the structure information selection unit and the input structure information, and synthesizes a composite object image using the input image; and a hierarchical image synthesizing unit that superimposes the decoded hierarchical images and the composite object image and outputs a composite image.

[0011]

[Operation] With the above configuration, the present invention uses encoded moving image data obtained by separating a moving image according to front-to-back order to form hierarchical images and encoding them, which makes it possible to record the moving image with a small storage capacity. In addition, even when the shape of the synthesized object image differs from that of the recorded object image, superimposing it on the background image and other images decoded as separate hierarchical images makes it possible to generate a natural moving image with no missing background.

[0012]

[Embodiment] An image synthesizing device according to one embodiment of the present invention is described below with reference to the drawings. FIG. 1 is a block diagram of the image synthesizing device in one embodiment of the present invention.

[0013] In FIG. 1, the database unit 101 records in advance the structure information time-series data 112, which indicates the correspondence between the object image in each frame of the moving image and the structure of the object, and the encoded moving image data 111, obtained by separating the moving image according to front-to-back order to form hierarchical images and encoding them. The hierarchical image decoding unit 102 reads the encoded moving image data 111 from the database unit 101, writes the decoded hierarchical images 117 to the image memory unit 105, and at the same time outputs frame information 115 indicating the position of the frame being decoded.

[0014] The structure information selection unit 103 reads the structure information time-series data 112 recorded in the database unit 101 and selects and outputs the structure information 116 that corresponds to the frame information 115. The object image synthesizing unit 104 takes the externally supplied input image 113 and its corresponding input structure information 114, deforms the input structure information 114 according to the correspondence between the structure information 116 output from the structure information selection unit 103 and the input structure information 114, generates the composite object image 118 using the input image 113, and writes it to the image memory unit 105. The image memory unit 105 temporarily stores the hierarchical images 117 and the composite object image 118. The hierarchical image synthesizing unit 106 reads the hierarchical images 117 and the composite object image 118 from the image memory unit 105, superimposes them, and outputs them as a single composite image.
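The data flow among units 101 to 106 described above can be sketched as a per-frame loop. This is only an illustrative outline: the function names (`select_structure`, `run_pipeline`) and the three callables passed in are hypothetical stand-ins, and the frame information 115 is assumed to be a simple frame index.

```python
def select_structure(structure_series, frame_index):
    """Unit 103: pick the structure information matching the frame information."""
    return structure_series[frame_index]


def run_pipeline(encoded_frames, structure_series, input_image, input_structure,
                 decode, synthesize_object, composite):
    """Yield one composite image per frame of the recorded moving image.

    decode, synthesize_object and composite are caller-supplied callables that
    stand in for units 102, 104 and 106 (hypothetical interfaces).
    """
    for frame_index, encoded in enumerate(encoded_frames):
        layers = decode(encoded)                               # unit 102
        structure = select_structure(structure_series, frame_index)
        obj = synthesize_object(input_image, input_structure, structure)  # unit 104
        memory = {"layers": layers, "object": obj}             # unit 105 (temporary store)
        yield composite(memory["layers"], memory["object"])    # unit 106
```

Passing the three stages in as callables keeps the sketch independent of any particular codec or warping method, neither of which this description fixes.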

[0015] The operation of the image synthesizing device configured as above is described below with reference to FIGS. 1, 2, and 3. FIG. 2 shows a method of encoding and decoding the encoded moving image data 111 recorded in the database unit 101 of FIG. 1. In the figure, 201 is the source moving image. Regions that are described by the same motion parameters, for example, are each extracted from this moving image and accumulated to generate the hierarchical images 202. Encoding each of the hierarchical images 202 yields the encoded moving image data 203 to be recorded in the database unit 101, and this data is recorded in advance.

[0016] The scheme for layering a moving image for encoding and decoding, and the detailed configuration of devices that realize it, are disclosed in Japanese Unexamined Patent Publication No. H6-253400 and in the research reports of Wang et al. (J. Wang and E. Adelson: "Layered Representation for Image Sequence Coding", Proc. IEEE Int. Conf. Acoustics, Speech, and Signal Processing '93, pp. V221-224, 1993; and J. Wang and E. Adelson: "Layered Representation for Motion Analysis", Proc. Computer Vision and Pattern Recognition, pp. 361-366, 1993), among others.

[0017] FIG. 3(a) illustrates the relationship between part of the structure information time-series data 112 recorded in the database unit 101 of FIG. 1, the input structure information 116, and the corresponding images. FIG. 3(b) is a concrete example of structure information. Each piece of structure information contains at least information on the structure of the object and on its position in the image. In the example of FIG. 3, the structure of the object is described as a polyhedron made up of a set of polygons, and each polygon vertex is associated with coordinate values in the two-dimensional coordinate space of the image. Objects with similar structures, even if they are different objects, can be given structure information that has the same relative structure. For example, parts that are structurally common, such as the eyes and nose in a face image, are each assigned to vertices or polygons at specific positions in the structure information.
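As a minimal sketch of the structure information just described, the following Python type stores the vertex coordinates and the polygons that index into them. The class name, field names, and the four-vertex "face patch" are all hypothetical and chosen only for illustration; the patent does not prescribe a storage format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StructureInfo:
    """One frame's structure information: a polygon mesh laid over the image."""
    vertices: tuple   # ((x, y), ...) 2-D image coordinates, one pair per vertex
    polygons: tuple   # ((i, j, k), ...) each polygon lists vertex indices

    def same_topology(self, other):
        # Two objects share a structure when the vertex count and the polygon
        # connectivity match; only the coordinates differ.
        return (len(self.vertices) == len(other.vertices)
                and self.polygons == other.polygons)


# Hypothetical layout: vertices 0 and 1 stand for the eyes, 2 for the nose
# tip, 3 for the chin. Both meshes reuse the same connectivity.
TOPOLOGY = ((0, 1, 2), (0, 2, 3), (1, 3, 2))
recorded = StructureInfo(((40.0, 50.0), (80.0, 50.0), (60.0, 70.0), (60.0, 100.0)), TOPOLOGY)
user_input = StructureInfo(((10.0, 12.0), (30.0, 12.0), (20.0, 25.0), (20.0, 45.0)), TOPOLOGY)
```

Because vertex index 2 means "nose tip" in both meshes, polygon `(0, 1, 2)` in the recorded frame corresponds directly to polygon `(0, 1, 2)` in the input image.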

[0018] Such images and their corresponding structure information can be supplied manually in advance. A method of generating structure information automatically from reference structure information is also disclosed, for example, in the research report of Kaku et al. (角義恭 et al.: "Three-dimensional face image system capable of changing expression and age", IEICE Technical Report, HC-91-43, 1992).

[0019] The encoded moving image data 111 and the structure information time-series data 112 described above are recorded in the database unit 101 in advance. After the input image 113 and the input structure information 114 have been input, operation starts in response to, for example, a user instruction, and the following operations are repeated once per frame period of the moving image.

[0020] The hierarchical image decoding unit 102 reads the encoded moving image data 111 from the database unit 101 and decodes it sequentially, one frame at a time. The plurality of hierarchical images 117 obtained by decoding are written to the image memory unit 105. The details of the decoding method are given in the references cited above and are omitted here. Along with decoding, the hierarchical image decoding unit 102 outputs the frame information 115 indicating the position of the frame being decoded.

[0021] Next, the structure information selection unit 103 receives the frame information 115 output from the hierarchical image decoding unit 102, reads the structure information time-series data 112 recorded in the database unit 101, and selects and outputs the structure information 116 corresponding to the frame information 115. The object image synthesizing unit 104 takes the externally supplied input image 113 and its corresponding input structure information 114, deforms the input structure information 114 according to the correspondence between the structure information 116 output from the structure information selection unit 103 and the input structure information 114, generates the composite object image 118 using the input image 113, and writes it to the image memory unit 105.

[0022] Specifically, for example, the average of the image position coordinates in the structure information 116 output from the structure information selection unit 103 is computed, and the input structure information 114 is processed so that the average of its position coordinates matches that value. Each polygon of the structure information obtained in this way is then replaced with the corresponding polygon of the input image 113 and the input structure information 114, which generates the composite object image 118; this image is written to the image memory unit 105. The composite object image 118 obtained by this method lies at almost the same position in the two-dimensional image space as the object indicated by the structure information 116. In other words, a composite object image 118 containing a different object can be synthesized at roughly the same relative position in the two-dimensional image space as the hierarchical image 117 of the object decoded by the hierarchical image decoding unit 102. Besides matching the average of the image position coordinates in the structure information, other processing is possible: matching the coordinates of a specific vertex produces a composite object image 118 in which a specific point is at the same position in the image, and matching the relative positional relationship between two specific vertices produces a composite object image whose size and rotation angle also match.
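The two alignment options above (matching the coordinate averages, and matching two specific vertices) can be sketched in a few lines of Python. The function names are hypothetical, vertices are assumed to be plain `(x, y)` tuples, and the two-vertex case uses complex arithmetic as a compact way to express a similarity transform; the patent does not specify how the transform is computed.

```python
def centroid(points):
    """Mean of a list of (x, y) vertex coordinates."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def align_by_centroid(input_pts, target_pts):
    """Translate input_pts so that its centroid coincides with target_pts' centroid."""
    (cx, cy), (tx, ty) = centroid(input_pts), centroid(target_pts)
    dx, dy = tx - cx, ty - cy
    return [(x + dx, y + dy) for x, y in input_pts]


def align_by_two_vertices(input_pts, target_pts, i, j):
    """Scale, rotate and translate input_pts so that vertices i and j land on
    the same-numbered vertices of target_pts (requires input vertices i != j)."""
    z = [complex(x, y) for x, y in input_pts]
    a, b = z[i], z[j]
    ta, tb = complex(*target_pts[i]), complex(*target_pts[j])
    s = (tb - ta) / (b - a)      # |s| is the scale factor, arg(s) the rotation
    moved = [ta + s * (p - a) for p in z]
    return [(w.real, w.imag) for w in moved]
```

Centroid alignment fixes position only, while the two-vertex variant also fixes size and rotation, which mirrors the distinction drawn in the paragraph above.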

[0023] The image memory unit 105 temporarily stores the plurality of hierarchical images 117 and the composite object image 118 obtained by the above procedure. The hierarchical image synthesizing unit 106 reads the hierarchical images 117 and the composite object image 118 from the image memory unit 105, superimposes them, and outputs them as a single composite image. As a concrete way of superimposing the images, the front-to-back order of the hierarchical images within the moving image, and which hierarchical image contains the object image, are given in advance, and the hierarchical image of the object is replaced with the composite object image 118. The hierarchical images are then read out simultaneously, with priority given to the hierarchical image nearer the front, which yields the synthesized moving image.
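The front-priority read-out described above can be sketched as follows. This is a simplified illustration under stated assumptions: each layer is a 2-D grid of pixels in which `None` marks a position where the layer has no image, the layers are supplied front to back, and the function name `composite_layers` is hypothetical.

```python
def composite_layers(layers_front_to_back, object_index, composite_object):
    """Overlay layers; at each pixel the frontmost layer that has a value wins.

    The layer at object_index (the recorded object) is replaced by the
    synthesized composite object image before overlaying.
    """
    layers = list(layers_front_to_back)
    layers[object_index] = composite_object
    height, width = len(layers[0]), len(layers[0][0])
    out = [[None] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            for layer in layers:              # front layers are visited first
                if layer[y][x] is not None:
                    out[y][x] = layer[y][x]
                    break
    return out
```

Where the new object is narrower than the recorded one, the loop simply falls through to the background layer, which is what prevents holes in the output.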

[0024] The background hierarchical image decoded by the hierarchical image decoding unit 102 is obtained, as in the example of FIG. 2, by adding together the background images throughout the moving image. A background region can therefore be decoded as part of the hierarchical image even if it is hidden in a particular frame, as long as it appears somewhere in the original moving image. Consequently, even for a region where the background is hidden by the object in the corresponding frame of the original moving image and where the composite object image has no pixels, superimposing the background hierarchical image and the composite object image generates an image with no unnatural portions of missing background.
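The idea that a background layer accumulated over many frames contains pixels hidden in any single frame can be illustrated with a deliberately simplified sketch. It assumes a static camera and an object mask already known for each frame; the layered coding in the cited references instead estimates motion per region, so this is not that method, and the function name is hypothetical.

```python
def accumulate_background(frames, object_masks):
    """Build one background layer from several frames.

    frames: list of 2-D pixel grids of equal size.
    object_masks: grids of the same size, True where the object hides the background.
    A pixel stays None only if it is hidden in every frame.
    """
    height, width = len(frames[0]), len(frames[0][0])
    background = [[None] * width for _ in range(height)]
    for frame, mask in zip(frames, object_masks):
        for y in range(height):
            for x in range(width):
                if background[y][x] is None and not mask[y][x]:
                    background[y][x] = frame[y][x]
    return background
```

In the test case below, the object covers a different pixel in each of two frames, and the accumulated layer recovers both.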

[0025] Furthermore, as also noted in the research reports of Wang et al. cited above, the method of layering a moving image for encoding and decoding can encode each region that is describable by the same motion parameters as a unit, so the moving image can be transmitted and recorded with a small amount of code.

[0026] As described above, according to this embodiment, the moving image data recorded in the database unit is encoded moving image data obtained by separating the moving image according to front-to-back order to form hierarchical images and encoding them, so the moving image can be recorded in a database unit with a small storage capacity. Furthermore, when the object image in the image is replaced with another input object image, the composite object image, synthesized using the structure information of the object in the moving image, the input image, and the input structure information, is superimposed on the background information contained in the decoded hierarchical images, so a natural image with no missing background can be synthesized.

[0027] In this embodiment the structure information time-series data and the input structure information hold two-dimensional position coordinates in the corresponding image, but they may instead hold three-dimensional structure information. For example, three-dimensional coordinates can be attached to each polygon vertex in the structure information, together with the angle between any two vertices relative to the viewing direction of the image, that is, information on the orientation of the object in the image. This makes it possible to obtain a composite object image whose orientation differs from that of the object in the input image. Because a moving image close to the motion of the object in the original moving image can then be synthesized, a more natural moving image is obtained.
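One way to picture how three-dimensional vertex coordinates allow a different orientation is to rotate the vertices and project them back to the image plane. The sketch below assumes rotation about the vertical (y) axis only and an orthographic projection; both are simplifying assumptions made here, not details given in this description, and the function name is hypothetical.

```python
import math


def rotate_yaw_and_project(vertices_3d, yaw_degrees):
    """Rotate (x, y, z) vertices about the y axis, then drop z to get image (x, y)."""
    t = math.radians(yaw_degrees)
    c, s = math.cos(t), math.sin(t)
    # Rotation about y: x' = x*cos(t) + z*sin(t); y is unchanged.
    return [(c * x + s * z, y) for x, y, z in vertices_3d]
```

The resulting two-dimensional vertices could then be used in place of the original two-dimensional structure information when placing the input object.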

[0028] In the object image synthesizing unit 104 of this embodiment, the composite object image is synthesized from the input image, the input structure information, and the structure information output from the structure information selection unit. In addition to these inputs, the hierarchical image of the object that is generated by the hierarchical image decoding unit and written to the image memory unit may also be used to synthesize the composite object image. When the object in the moving image is replaced with the input image, this makes it possible to synthesize an image that reflects not only the structural relationship of the object but also information on its color as an image. For example, if the color of the illumination on the object in the moving image changes over time, a composite object image synthesized from structural relationships alone looks unnatural in the synthesized moving image. By using the information on the color change of the object in the hierarchical image to vary the color of the composite object image, a more natural moving image can be synthesized.
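As one hedged illustration of using the decoded object layer's color, the sketch below scales each RGB channel of the composite object so that its mean color matches that of the recorded object in the current frame. Per-channel gain matching is an assumption chosen for simplicity; the description does not state which color adjustment is used, and the function names are hypothetical.

```python
def mean_color(pixels):
    """Average (r, g, b) over a list of RGB tuples."""
    n = len(pixels)
    return tuple(sum(p[c] for p in pixels) / n for c in range(3))


def match_color(composite_pixels, reference_pixels):
    """Scale each channel of composite_pixels so its mean equals the reference mean.

    reference_pixels would come from the decoded hierarchical image of the
    object in the current frame. Results are rounded and clipped to 255.
    """
    src, ref = mean_color(composite_pixels), mean_color(reference_pixels)
    gains = [r / s if s else 1.0 for s, r in zip(src, ref)]
    return [tuple(min(255, round(p[c] * gains[c])) for c in range(3))
            for p in composite_pixels]
```

Applying this once per frame lets the inserted object follow lighting changes that are present in the recorded sequence.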

[0029]

[Effects of the Invention] As described above, as a technique for synthesizing an image in which an object in a moving image is replaced with the image of another input object, the present invention records the moving image as encoded moving image data obtained by separating it according to front-to-back order to form hierarchical images and encoding them, so the moving image can be recorded with a small storage capacity. Furthermore, when the object image in the image is replaced with another input object image, the background information contained in the decoded hierarchical images is used, so a natural image with no missing background can be synthesized. The invention can thus provide an image synthesizing device having these effects.

[Brief description of drawings]

[FIG. 1] A block diagram showing the configuration of the image synthesizing device in one embodiment of the present invention.

[FIG. 2] A diagram showing the method of generating moving image data in the same embodiment.

[FIG. 3] (a) and (b) are diagrams showing concrete examples of structure information in the same embodiment.

[Explanation of symbols]
101 Database unit
102 Hierarchical image decoding unit
103 Structure information selection unit
104 Object image synthesizing unit
105 Image memory unit
106 Hierarchical image synthesizing unit
111 Encoded moving image data
112 Structure information time-series data
113 Input image
114 Input structure information
115 Frame information
116 Structure information
117 Hierarchical image
118 Composite object image
201 Moving image
202 Hierarchical image
203 Encoded moving image data

Continuation of the front page

(56) References: JP-A-1-274193 (JP, A); JP-A-2-127886 (JP, A); JP-A-4-199474 (JP, A); JP-A-5-249953 (JP, A); JP-A-6-28449 (JP, A); JP-A-6-350984 (JP, A); JP-A-7-46534 (JP, A); JP-A-8-272998 (JP, A)

(58) Fields investigated (Int.Cl.⁷, DB name): G06T 11/80; G06T 3/00; G06T 13/00; H04N 5/272; CSDB (Japan Patent Office)

Claims

(57) [Claims]

1. An image synthesizing device comprising: a database unit that records structure information time-series data, which has structure information indicating the correspondence between the object image in each frame of a moving image and the structure of the object, and encoded moving image data, obtained by separating the moving image according to spatial front-to-back order to form hierarchical images and encoding them; a hierarchical image decoding unit that reads the encoded moving image data from the database unit and decodes each of the layered images to generate hierarchical images; a structure information selection unit that selects and outputs the corresponding structure information from the structure information time-series data read from the database unit; an object image synthesizing unit that receives from outside an input image of another object having a structure similar to that of the object in the moving image and input structure information, which is the structure information corresponding to the input image, processes the structure information using the structure information output by the structure information selection unit and the input structure information, and synthesizes a composite object image using the input image; and a hierarchical image synthesizing unit that superimposes the decoded hierarchical images and the composite object image and outputs a composite image.

2. The image synthesizing device according to claim 1, wherein the structure information includes information on the three-dimensional position of the object, and the object image synthesizing unit synthesizes the composite object image using the three-dimensional information in the structure information output from the structure information selection unit and in the input structure information, together with the input image.

3. The image synthesizing device according to claim 1 or 2, wherein the object image synthesizing unit synthesizes the composite object image using the hierarchical image of the object in addition to the input image, the input structure information, and the structure information output from the structure information selection unit.

4. An image synthesizing device comprising: a database unit that records structure information time-series data, which indicates the correspondence between the object image in each frame of a moving image and the structure of the object, and encoded moving image data, obtained by separating the moving image according to spatial front-to-back order to form hierarchical images and encoding them; an image memory unit that temporarily stores images; a hierarchical image decoding unit that reads the encoded moving image data from the database unit, decodes each of the layered images to generate hierarchical images, writes them to the image memory unit, and at the same time outputs frame information indicating the position of the frame being decoded in the encoded moving image data; a structure information selection unit that reads the structure information time-series data from the database unit and selects and outputs the structure information corresponding to the frame information; an object image synthesizing unit that receives from outside an input image of another object having a structure similar to that of the object in the moving image and input structure information, which is the structure information corresponding to the input image, processes the structure information using the structure information output from the structure information selection unit and the input structure information, synthesizes a composite object image using the input image, and writes it to the image memory unit; and a hierarchical image synthesizing unit that superimposes the hierarchical images and the composite object image written in the image memory unit and outputs a composite image.