WO2012172659A1

WO2012172659A1 - Image encoding method, image decoding method, image encoding apparatus, and image decoding apparatus

Info

Publication number: WO2012172659A1
Application number: PCT/JP2011/063720
Authority: WO
Inventors: 中條　健; 山影　朋夫
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 2011-06-15
Filing date: 2011-06-15
Publication date: 2012-12-20
Anticipated expiration: 2013-12-15

Abstract

The purpose of the present invention is to reduce the amount of calculation necessary in a bidirectional prediction method. An image encoding apparatus of an embodiment is provided with a motion compensation prediction unit, an interpolation image generating unit, and an encoding unit. The motion compensation prediction unit generates a motion compensation prediction image that is predicted from a reference image on the basis of a motion vector. The interpolation image generating unit generates a bidirectional prediction image that is the result of interpolating a plurality of motion compensation prediction images that are generated for a plurality of reference images. The encoding unit generates encoded data that is the result of encoding the difference between an input image and the bidirectional prediction image. When a first set, which indicates a combination of pixel positions of each of the plurality of motion compensation prediction images for obtaining the pixel value of the first pixel position of the bidirectional prediction image, is included among particular sets that are sets of predetermined pixel positions, the interpolation image generating unit adds the pixel values at each of the pixel positions contained in the first set and interpolates the added pixel values to calculate the pixel value of the first pixel position.

Description

Image encoding method, image decoding method, image encoding device, and image decoding device

Embodiments described herein relate generally to an image encoding method, an image decoding method, an image encoding device, and an image decoding device.

In moving image coding, prediction efficiency is commonly improved by bidirectional prediction, in which interpolated images with an accuracy finer than integer pixel accuracy are created and the predicted value is calculated as the average or weighted average of two interpolated images. In the bidirectional prediction method, two unidirectional interpolated images are created and their average is taken to form the bidirectional prediction image. The rounding error incurred when computing this average is therefore known to be a problem. There is a technique that reduces rounding error in intermediate calculations: when generating a bidirectionally predicted interpolated image, each unidirectional interpolation result is held at a bit length longer than the input pixel bit length before the average is taken.

K. Ugur, J. Lainema, A. Hallapuro, "High precision bi-directional averaging," JCTVC-D321, 4th JCT-VC Meeting, Daegu, KR, 20-28 January, 2011

However, because the conventional technique generates a plurality of unidirectional interpolated images and then averages them, it requires twice the usual interpolated image generation processing, which increases the amount of calculation.

An image encoding device according to an embodiment includes a motion compensation prediction unit, an interpolation image generation unit, and an encoding unit. The motion compensation prediction unit generates a motion compensated prediction image that is predicted from a reference image on the basis of a motion vector. The interpolation image generation unit generates a bidirectional prediction image by interpolating a plurality of motion compensated prediction images generated for a plurality of reference images. The encoding unit generates encoded data by encoding the difference between an input image and the bidirectional prediction image. In addition, when a first set, which represents a combination of pixel positions in each of the plurality of motion compensated prediction images used to obtain the pixel value at a first pixel position of the bidirectional prediction image, is included in specific sets, which are predetermined sets of pixel positions, the interpolation image generation unit adds the pixel values at the pixel positions included in the first set and interpolates the added pixel values to calculate the pixel value at the first pixel position.

FIG. 1 is a block diagram of an image encoding device according to a first embodiment.
FIG. 2 is a block diagram of an image decoding device according to the first embodiment.
FIG. 3 is a block diagram of a predicted image generation unit.
FIG. 4 is a block diagram of a bidirectional prediction unit according to the first embodiment.
FIG. 5 is a block diagram of a bidirectional interpolation image generation unit according to the first embodiment.
FIG. 6 is a diagram illustrating an example of fractional pixel positions.
FIG. 7 is a diagram illustrating an example of fractional accuracy information corresponding to the fractional pixel positions.
FIG. 8 is a diagram illustrating an example of a motion compensated image with integer pixel accuracy.
FIG. 9 is a table illustrating an example of switching conditions of a switch unit.
FIG. 10 is a block diagram of a first image generation unit.
FIG. 11 is a block diagram of a second image generation unit.
FIG. 12 is a flowchart of interpolation image generation processing in the first embodiment.
FIG. 13 is a diagram showing the number of multiplications per pixel in the first embodiment.
FIG. 14 is a diagram showing the number of multiplications per pixel in the conventional method.
FIG. 15 is a diagram showing the number of multiplications per pixel in a second embodiment.
FIG. 16 is a diagram showing the number of multiplications per pixel in a third embodiment.
FIG. 17 is a diagram showing the number of multiplications per pixel in a fourth embodiment.
FIG. 18 is a hardware configuration diagram of the devices according to the first to fourth embodiments.

Exemplary embodiments of an image encoding method, an image decoding method, an image encoding device, and an image decoding device according to the present invention will be described below in detail with reference to the accompanying drawings.
(First embodiment)
When the set of pixel positions in the two reference images used for bidirectional prediction corresponds to a predetermined set (a specific set), the image encoding device and the image decoding device according to the first embodiment first add the pixel values at those pixel positions and only then multiply by the filter coefficients of the filter used for interpolation. This reduces the number of operations needed to generate the interpolated image.

FIG. 1 is a block diagram illustrating a configuration example of an image encoding device 600 according to the first embodiment. The image encoding device 600 of the first embodiment includes a subtraction unit 602, a transform/quantization unit 603, an inverse quantization/inverse transform unit 604, an entropy encoding unit 605, an addition unit 606, a frame memory unit 608, a predicted image generation unit 610, a motion vector search unit 612, and an encoding control unit 614. The image encoding device 600 generates encoded data 615 from an input moving image signal 601.

The input moving image signal 601 is input to the image encoding device 600, for example, in units of frames. The input moving image signal 601 is divided into blocks such as macroblocks. The subtraction unit 602 outputs a prediction error signal, which is the difference between the predicted image signal 611 generated by the predicted image generation unit 610 and the input moving image signal 601.

The transform/quantization unit 603 orthogonally transforms the prediction error signal by, for example, a discrete cosine transform (DCT), then performs quantization to generate quantized transform coefficient information. The quantized transform coefficient information is split into two branches: one is input to the entropy encoding unit 605, and the other is input to the inverse quantization/inverse transform unit 604. The inverse quantization/inverse transform unit 604 applies inverse quantization and inverse transform to the quantized transform coefficient information, reversing the processing of the transform/quantization unit 603, and reproduces the prediction error signal. The addition unit 606 adds the prediction error signal and the predicted image signal, thereby generating a decoded image signal 607.

The decoded image signal 607 is input to the frame memory unit 608. The frame memory unit 608 applies filtering or the like to the decoded image signal 607 and then determines, based on the prediction control information 507, whether to store the decoded image signal 607 so that it can serve as the reference image signal 609 input to the predicted image generation unit 610.

The reference image signal 609 is input to the predicted image generation unit 610 and also to the motion vector search unit 612. The motion vector search unit 612 generates motion vector information 613 from the input moving image signal 601 and the reference image signal 609. The motion vector information 613 is input to the predicted image generation unit 610 and is also sent to the entropy encoding unit 605. The predicted image generation unit 610 generates the predicted image signal 611 from the reference image signal 609, the prediction control information 507, and the motion vector information 613.

The encoding control unit 614 controls the operations of the transform/quantization unit 603, the predicted image generation unit 610, the frame memory unit 608, and the like. The prediction control information 507 generated by the encoding control unit 614 is input to the predicted image generation unit 610 and the frame memory unit 608 and is also sent to the entropy encoding unit 605. The entropy encoding unit 605 entropy-encodes various pieces of encoding information, such as the quantized transform coefficient information from the transform/quantization unit 603, the prediction control information 507 from the encoding control unit 614, and the motion vector information 613 from the motion vector search unit 612, and generates encoded data according to a predetermined syntax.

FIG. 2 is a block diagram illustrating a configuration example of an image decoding device 700 corresponding to the image encoding device 600. The image decoding device 700 includes an entropy decoding unit 702, an inverse quantization/inverse transform unit 703, an addition unit 704, a frame memory unit 706, and the predicted image generation unit 610. The image decoding device 700 generates a reproduced moving image signal 707 from encoded data 701.

The entropy decoding unit 702 performs entropy decoding of the encoded data 701 according to a predetermined syntax. The entropy decoding unit 702 decodes the encoded data 701 to obtain quantized transform coefficient information, prediction control information 711, and motion vector information 712. The decoded quantized transform coefficient information is input to the inverse quantization/inverse transform unit 703. The decoded prediction control information 711 and motion vector information 712 are input to the predicted image generation unit 610.

The inverse quantization/inverse transform unit 703 performs inverse quantization and inverse orthogonal transform on the quantized transform coefficient information to reproduce the prediction error signal. The addition unit 704 adds the prediction error signal and the predicted image signal 710 to generate a decoded image signal 705.

The decoded image signal 705 is input to the frame memory unit 706. The frame memory unit 706 filters the decoded image signal 705 and outputs it as the reproduced moving image signal 707. The frame memory unit 706 determines, based on the prediction control information 711, whether to store the filtered decoded image signal 705. The stored decoded image signal 705 is input to the predicted image generation unit 610 as a reference image signal 708.

The predicted image generation unit 610 generates the predicted image signal 710 using the reference image signal 708, the prediction control information 711, and the motion vector information 712.

FIG. 3 is a block diagram illustrating a configuration example of the predicted image generation unit 610 included in the image encoding device 600 and the image decoding device 700. The predicted image generation unit 610 includes a switch 503, a bidirectional prediction unit 504, a unidirectional prediction unit 505, and an intra prediction unit 506. The predicted image generation unit 610 generates a predicted image signal 509 from a reference image signal 502, the prediction control information 507, and motion vector information 508.

The prediction control information 507 includes, for example, information specifying which of the bidirectional prediction unit 504, the unidirectional prediction unit 505, and the intra prediction unit 506 is to be used. The switch 503 refers to this information and selects one of the bidirectional prediction unit 504, the unidirectional prediction unit 505, and the intra prediction unit 506.

The reference image signal 502 is input to whichever of the bidirectional prediction unit 504, the unidirectional prediction unit 505, and the intra prediction unit 506 has been selected by the switch 503. When the bidirectional prediction unit 504 is selected, it generates motion compensated image signals using the reference image signals 502 from a plurality of reference frames and the motion vector information 508, and generates the predicted image signal 509 by bidirectional prediction. When the unidirectional prediction unit 505 is selected, it generates a motion compensated image signal using the reference image signal 502 from a single reference frame and the motion vector information 508, and generates the predicted image signal 509. When the intra prediction unit 506 is selected, it generates the predicted image signal 509 using the reference image signal 502 from within the same picture.

FIG. 4 is a block diagram illustrating a configuration example of the bidirectional prediction unit 504 according to the first embodiment. The bidirectional prediction unit 504 includes a motion compensation prediction unit 107, an accuracy calculation unit 104, and a bidirectional interpolation image generation unit 109. The bidirectional prediction unit 504 receives a reference image signal 102 and motion vector information 103 as inputs and outputs a bidirectional prediction image signal 110.

The motion compensation prediction unit 107 refers to the reference image signal with integer pixel accuracy based on integer accuracy information 105 from the accuracy calculation unit 104. Here, following H.264/AVC, the two reference images are distinguished by calling them the list 0 reference image and the list 1 reference image. In the present embodiment, when the motion vector 103 has 1/4 pixel accuracy, the accuracy calculation unit 104 calculates the integer accuracy information (xInt, yInt) and the fractional accuracy information (xFrac, yFrac) from the motion vector (mvLX[0], mvLX[1]) of list L (L is 0 or 1) by the following equations.
xInt = xP + (mvLX[0] >> 2) + x
yInt = yP + (mvLX[1] >> 2) + y
xFrac = mvLX[0] & 3
yFrac = mvLX[1] & 3

The fractional accuracy of the motion vector is not limited to 1/4 pixel accuracy; other fractional accuracies such as 1/8 pixel accuracy or 1/16 pixel accuracy may be used, and the same discussion applies. Here, (xP, yP) denotes the upper-left reference pixel of the motion compensation block, and (x, y) denotes a pixel position within the block.
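The split of a quarter-pel motion vector into integer and fractional parts can be sketched as follows. This is a minimal illustration of the four equations above; the function name `split_motion_vector` and the tuple-based interface are assumptions made for this sketch, not part of the specification. Python's `>>` on negative integers is an arithmetic (flooring) shift, so the integer part and the fractional part always recombine into the original vector component.

```python
def split_motion_vector(mv, block_origin, offset_in_block):
    """Split a 1/4-pel motion vector into integer and fractional parts.

    mv              -- (mvLX[0], mvLX[1]) in quarter-pixel units
    block_origin    -- (xP, yP), upper-left pixel of the block
    offset_in_block -- (x, y), pixel position inside the block
    """
    mv_x, mv_y = mv
    xP, yP = block_origin
    x, y = offset_in_block
    # Integer accuracy information: which full-pel sample to read.
    x_int = xP + (mv_x >> 2) + x
    y_int = yP + (mv_y >> 2) + y
    # Fractional accuracy information: quarter-pel phase in 0..3.
    x_frac = mv_x & 3
    y_frac = mv_y & 3
    return (x_int, y_int), (x_frac, y_frac)
```

For example, a vertical component of -3 quarter-pels splits into an integer part of -1 and a fractional part of 1, since -1 * 4 + 1 = -3.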

The integer accuracy information (xInt, yInt) of each of list 0 and list 1 is sent to the motion compensation prediction unit 107. The motion compensation prediction unit 107 reads pixel data in the necessary range from the reference image signals of list 0 and list 1 and sends it to the bidirectional interpolation image generation unit 109.

The bidirectional interpolation image generation unit 109 generates and outputs the bidirectional prediction image signal 110 based on the fractional accuracy information (xFrac, yFrac) of each of list 0 and list 1 and on the motion compensated prediction image 108 with integer pixel accuracy from the motion compensation prediction unit 107.

FIG. 5 is a block diagram illustrating a configuration example of the bidirectional interpolation image generation unit 109 of the present embodiment. The bidirectional interpolation image generation unit 109 includes a switch unit 201, a first image generation unit 202, and a second image generation unit 203. The bidirectional interpolation image generation unit 109 generates the bidirectional prediction image signal 110 from the motion compensated prediction image 108 and the fractional accuracy information 106, using whichever of the first image generation unit 202 and the second image generation unit 203 is selected by the switch unit 201.

A specific example of the fractional accuracy information of the bidirectional prediction unit 504 for 1/4 pixel accuracy is described next. FIG. 6 is a diagram illustrating an example of fractional pixel positions. FIG. 7 is a diagram illustrating an example of fractional accuracy information corresponding to the fractional pixel positions.

As shown in FIG. 6, a, b, c, d, e, f, g, h, i, j, k, n, p, q, and r represent fractional pixel positions. As shown in FIG. 7, xFrac represents the horizontal fractional pixel accuracy information of list L, and yFrac represents the vertical fractional pixel accuracy information of list L. Here, A, which corresponds to xFrac = 0 and yFrac = 0, represents an integer pixel position of list L.

FIG. 8 is a diagram illustrating an example of a motion compensated image with integer pixel accuracy that is referred to by the motion compensation prediction unit 107 in the case of 1/4 pixel accuracy. A_{L,x,y} denotes the pixel value at integer pixel position (x, y) of list L. The bidirectional prediction unit 504 generates interpolated images between integer pixels, taking integer pixel position (0, 0) as the reference.

In the present embodiment, a two-dimensional interpolated image with 1/4 pixel accuracy is created. Letting F denote the filter coefficients of the FIR filter, the relationship given in Equation 1 is assumed to hold. The same discussion applies to 1/8 pixel accuracy and 1/16 pixel accuracy. The method is applicable not only to the luminance signal of the moving image signal but also to the color difference signals. For example, in the 4:2:0 format, if the luminance signal has 1/4 pixel accuracy, the color difference signals have 1/8 pixel accuracy, and the method of the present embodiment can be applied to them independently or simultaneously. In the 4:2:2 format, if the luminance signal has 1/4 pixel accuracy, the color difference signals have 1/4 pixel accuracy in the vertical direction and 1/8 pixel accuracy in the horizontal direction, and the same discussion applies.

Here, the values of the filter coefficients for 1/2 pixel accuracy are symmetric about the interpolation target pixel position. Consequently, pixels of the motion compensated prediction image that lie at positions symmetric about the interpolation target pixel position can be added together and multiplied by a coefficient once. In addition, the filter coefficients for 3/4 pixel accuracy are the filter coefficients for 1/4 pixel accuracy in reverse order. Consequently, when the pixel accuracy is 3/4, mirroring the pixel positions of the input motion compensated prediction image about the position with 1/2 pixel accuracy reduces the processing to the same processing as for pixel accuracy 1/4.
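Both symmetry properties can be checked numerically. The sketch below is illustrative only: the coefficient values are the 8-tap luma interpolation coefficients of the final HEVC standard, used here as an assumption because the coefficient values of this document are not reproduced in the text, and the names `fir` and `fir_half_folded` are hypothetical helpers.

```python
# Assumed 8-tap coefficients (HEVC luma filters), each set summing to 64.
HALF = [-1, 4, -11, 40, 40, -11, 4, -1]      # 1/2-pel: symmetric
QUARTER = [-1, 4, -10, 58, 17, -5, 1, 0]     # 1/4-pel
THREE_QUARTER = QUARTER[::-1]                # 3/4-pel: reversed 1/4-pel

def fir(taps, pixels):
    """Plain FIR sum: one multiplication per tap."""
    return sum(c * p for c, p in zip(taps, pixels))

def fir_half_folded(pixels):
    """1/2-pel filter with mirrored pixel pairs added first.

    Uses 4 multiplications instead of 8 because HALF is symmetric.
    """
    n = len(HALF) // 2
    return sum(HALF[i] * (pixels[i] + pixels[-1 - i]) for i in range(n))

row = [10, 20, 30, 40, 50, 60, 70, 80]
folded_matches = fir(HALF, row) == fir_half_folded(row)
mirrored_matches = fir(THREE_QUARTER, row) == fir(QUARTER, row[::-1])
```

Both comparisons hold for any integer input row, because the sums are merely regrouped or re-indexed; rounding stages are left out of this sketch.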

Specifically, Equation 2 gives filter coefficients as an example for N = 4, that is, a so-called 8-tap filter. The variable shift0, which represents the precision of the filter coefficients, is 6.

Let T_{L,Z} be the interpolated image at pixel position Z of list L (Z is any one of A, a, b, c, d, e, f, g, h, i, j, k, n, p, q, and r). Let bipredSample_{predSampleL0,predSampleL1} be the bidirectional prediction image signal 110 for the combination of pixel position predSampleL0 of list 0 and pixel position predSampleL1 of list 1. BitDepth represents the pixel bit length of the input moving image signal 601. Here, "<<" and ">>" are operators representing a left arithmetic shift and a right arithmetic shift, respectively. A shift by a negative amount is taken to be an arithmetic shift by the corresponding positive amount in the opposite direction; for example, "x >> (-2)" means "x << 2". The ternary operator used in the C language and similar languages is also used. The following variables are then defined.
shift1 = BitDepth - 8 - S
shift2 = BitDepth - 2 - S
shift3 = 14 - BitDepth + S
offset0 = (shift0 > 0) ? (1 << (shift0 - 1)) : 0;
offset1 = (shift1 > 0) ? (1 << (shift1 - 1)) : 0;
offset2 = (shift2 > 0) ? (1 << (shift2 - 1)) : 0;
offset3 = (shift3 > 0) ? (1 << (shift3 - 1)) : 0;

shift1, shift2, and shift3 are variables for controlling calculation precision, and S is a parameter. When S is 0, all interpolation processing fits within 16-bit precision. When S is BitDepth - 2, no precision is lost to intermediate rounding. offset0, offset1, offset2, and offset3 are offset values for rounding. The value of S may differ between the first image generation unit 202 and the second image generation unit 203.
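The variable definitions above can be evaluated for concrete bit depths with a short sketch. The helper names `shift_right`, `rounding_offset`, and `precision_params` are assumptions introduced for illustration; only the formulas themselves come from the text.

```python
def shift_right(value, amount):
    """Right arithmetic shift; a negative amount shifts left instead,
    following the convention that x >> (-2) means x << 2."""
    return value >> amount if amount >= 0 else value << -amount

def rounding_offset(shift):
    """offsetN = (shiftN > 0) ? (1 << (shiftN - 1)) : 0"""
    return (1 << (shift - 1)) if shift > 0 else 0

def precision_params(bit_depth, s=0, shift0=6):
    """Return ([shift0..shift3], [offset0..offset3]) for a bit depth and S."""
    shifts = [
        shift0,                # precision of the filter coefficients
        bit_depth - 8 - s,     # shift1
        bit_depth - 2 - s,     # shift2
        14 - bit_depth + s,    # shift3
    ]
    offsets = [rounding_offset(v) for v in shifts]
    return shifts, offsets
```

With 8-bit input and S = 0 this gives shifts of 6, 0, 6, 6; with S = BitDepth - 2 = 6 it gives shift1 = -6 and shift2 = 0, so no intermediate stage shifts bits out, which is consistent with the statement that no precision is lost in that case.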

The bidirectional prediction image signal predSample is finally obtained by the following equation (1).
predSample = clip((bipredSample_{predSampleL0,predSampleL1} + (offset3 + 1)) >> (shift3 + 1))   (1)

Here, the function clip limits a value to the range from 0 to (1 << BitDepth) - 1 inclusive.
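Equation (1) and the clip function can be sketched as below. The sketch follows the equation literally as written, including the rounding term (offset3 + 1); the function names and the assumption that shift3 + 1 is non-negative (true for BitDepth up to 14 with S >= 0) are choices made for this illustration.

```python
def clip(value, bit_depth):
    """Limit value to the range 0 .. (1 << bit_depth) - 1."""
    return max(0, min(value, (1 << bit_depth) - 1))

def final_pred_sample(bipred_sample, bit_depth, s=0):
    """Equation (1): scale the summed intermediate value back to pixel range."""
    shift3 = 14 - bit_depth + s
    offset3 = (1 << (shift3 - 1)) if shift3 > 0 else 0
    # Assumes shift3 + 1 >= 0, so a plain right shift is sufficient.
    return clip((bipred_sample + (offset3 + 1)) >> (shift3 + 1), bit_depth)
```

As a usage example with 8-bit input and S = 0, two integer-position samples 100 and 101 contribute (100 << 6) + (101 << 6) = 12864, and `final_pred_sample(12864, 8)` evaluates (12864 + 33) >> 7.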

FIG. 9 is a table showing an example of the switching conditions of the switch unit 201. For each combination of the pixel positions (A, a, b, c, d, e, f, g, h, i, j, k, n, p, q, r) indicated by xFrac and yFrac in list 0 and in list 1, the table indicates whether the first image generation unit 202 or the second image generation unit 203 is selected. Because this is an example with 1/4 pixel accuracy, there are 256 combinations of pixel positions in list 0 and list 1. The first row represents the pixel position of list 0, and the first column represents the pixel position of list 1. In the example of FIG. 9, a value of 0 causes the switch unit 201 to select the first image generation unit 202, and a value of 1 causes the switch unit 201 to select the second image generation unit 203.
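The table lookup performed by the switch unit can be sketched as follows. Two things here are assumptions and not taken from the text: the mapping from (xFrac, yFrac) to position letters, which is laid out row by row in the order the letters are listed, and the table contents, which are a purely hypothetical placeholder that marks a combination as a specific set when both lists use the same position. The actual entries are those of FIG. 9.

```python
# Position letters in the order listed in the text; index = yFrac * 4 + xFrac
# is an ASSUMED layout, since FIG. 7 is not reproduced here.
POSITIONS = ["A", "a", "b", "c", "d", "e", "f", "g",
             "h", "i", "j", "k", "n", "p", "q", "r"]

def position_name(x_frac, y_frac):
    """Map quarter-pel fractional accuracy information to a position letter."""
    return POSITIONS[y_frac * 4 + x_frac]

# HYPOTHETICAL table: 1 (second generator) only when both lists share a position.
SWITCH_TABLE = {(p0, p1): int(p0 == p1) for p0 in POSITIONS for p1 in POSITIONS}

def select_generator(frac_l0, frac_l1):
    """Return which image generation unit the switch would select."""
    key = (position_name(*frac_l0), position_name(*frac_l1))
    return "second" if SWITCH_TABLE[key] == 1 else "first"
```

The table has 16 x 16 = 256 entries, matching the number of combinations stated above for 1/4 pixel accuracy.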

FIG. 10 is a block diagram illustrating a configuration example of the first image generation unit 202. The first image generation unit 202 includes a list 0 interpolation unit 301, a list 1 interpolation unit 302, and an addition unit 303. Using the motion compensated prediction image 108 and the fractional accuracy information 106, the first image generation unit 202 creates interpolated images for list 0 and list 1 separately, and the addition unit 303 then calculates their average to create the bidirectional prediction image signal 110.

The list 0 interpolation unit 301 generates an interpolated image by interpolating the list 0 reference image. The list 1 interpolation unit 302 generates an interpolated image by interpolating the list 1 reference image. For example, the list 0 interpolation unit 301 and the list 1 interpolation unit 302 generate the interpolated images by calculating interpolated values from the pixel values of the pixels of the list 0 reference image and the list 1 reference image, respectively. The addition unit 303 adds the list 0 interpolated image and the list 1 interpolated image.

The following describes, using equations, how the first image generation unit 202 obtains bipredSample_{predSampleL0, predSampleL1}. Let T_{0, predSampleL0} be the interpolated image of list 0 and T_{1, predSampleL1} be the interpolated image of list 1. bipredSample_{predSampleL0, predSampleL1} is calculated as shown in the following equations.
bipredSample_{predSampleL0, predSampleL1} =
T_{0, predSampleL0} + T_{1, predSampleL1}
T_{L, A} = A_{L, 0, 0} << shift3

Note that the right-hand side of the expression for T_{L, j} may exceed the range of a signed 15-bit integer when S = 0, so the result may be limited to a certain range of values. For example, the result is limited to a range with a lower limit of −2¹⁵ and an upper limit of 2¹⁵ − 1.
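The integer-position case above can be sketched as follows. This is a minimal illustration, not the specified implementation: the value shift3 = 6, the function names, and the choice to apply the optional range limit to every T value are assumptions made for the example.

```python
INT_MIN = -(1 << 15)      # lower limit -2^15 given in the text
INT_MAX = (1 << 15) - 1   # upper limit 2^15 - 1 given in the text


def clamp(value, lo=INT_MIN, hi=INT_MAX):
    """Limit an intermediate value to the range [lo, hi]."""
    return max(lo, min(hi, value))


def t_integer(a, shift3):
    """T_{L,A} = A_{L,0,0} << shift3, followed by the optional range limit."""
    return clamp(a << shift3)


def bipred_first_unit(a0, a1, shift3=6):
    """bipredSample = T_0 + T_1 (shift3 = 6 is an assumed example value)."""
    return t_integer(a0, shift3) + t_integer(a1, shift3)
```

For fractional positions, T_{L, x} would come from the interpolation filters, whose equations are not reproduced in this passage.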

FIG. 11 is a block diagram illustrating a configuration example of the second image generation unit 203. The second image generation unit 203 includes a replacement unit 401, an addition unit 402, and an interpolation unit 403. From the motion compensated prediction image 108 and the fractional accuracy information 106, the second image generation unit 203 generates the bidirectional prediction image signal 110 by replacing pixel positions of the motion compensated prediction image 108 as necessary (replacement unit 401), performing addition (addition unit 402), and performing interpolation (interpolation unit 403).

When the combination of pixel positions corresponds to a set predetermined as a combination whose pixel positions are to be replaced, the replacement unit 401 replaces the pixel positions included in that set with predetermined pixel positions. For example, for a combination of pixel positions with pixel accuracies of 1/4 and 3/4, the replacement unit 401 inverts the 3/4 pixel position about the 1/2 pixel position. In a configuration that does not replace pixel positions, the replacement unit 401 may be omitted. For example, when only combinations of pixel positions with identical filter coefficients are processed by the second image generation unit 203, no inversion of pixel positions is needed, so the replacement unit 401 is unnecessary.
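The inversion can be illustrated in one dimension. The sketch below assumes that the 3/4-position filter is the mirror image of the 1/4-position filter; the coefficients are hypothetical and are not taken from this document. Under that assumption, reversing the list 1 sample window lets both lists share the 1/4 coefficients, so the samples can be added before filtering.

```python
# Hypothetical 8-tap 1/4-position filter (coefficients sum to 64); not from the source.
F_QUARTER = [-1, 4, -10, 58, 17, -5, 1, 0]
# Assumed mirror relation: the 3/4 filter is the 1/4 filter reversed.
F_THREE_QUARTER = F_QUARTER[::-1]


def fir(coeffs, window):
    """Apply filter coefficients to a window of integer samples."""
    return sum(c * s for c, s in zip(coeffs, window))


def separate(win0, win1):
    """Filter list 0 at 1/4 and list 1 at 3/4, then add the results."""
    return fir(F_QUARTER, win0) + fir(F_THREE_QUARTER, win1)


def flipped_then_added(win0, win1):
    """Invert the list 1 window, add it to list 0, and filter once at 1/4."""
    summed = [s0 + s1 for s0, s1 in zip(win0, win1[::-1])]
    return fir(F_QUARTER, summed)
```

Both functions return the same value for any pair of windows, but the second uses half as many multiplications.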

The addition unit 402 adds the pixel values at the pixel positions included in the combination of pixel positions, after any replacement has been applied. The interpolation unit 403 creates the bidirectional prediction image signal 110 by applying filter-based interpolation to the image obtained by the addition.

Note that the configuration including the replacement unit 401, the addition unit 402, and the interpolation unit 403 is only an example; any other configuration may be used as long as it can calculate the bidirectional prediction image signal 110 according to Equations 6 to 33 described later.

Unlike the first image generation unit 202, the second image generation unit 203 first adds the motion compensated prediction images 108 together and then performs the interpolation processing, thereby reducing the number of operations.
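The saving follows from the linearity of the interpolation filter: when both lists use the same coefficients, filtering the sum gives the same result as summing the filtered values, provided no intermediate rounding or clipping is applied. A minimal one-dimensional sketch, using hypothetical 1/2-position coefficients and illustrative helper names that also count multiplications:

```python
# Hypothetical 8-tap 1/2-position filter (coefficients sum to 64); not from the source.
F_HALF = [-1, 4, -11, 40, 40, -11, 4, -1]


def fir_counted(coeffs, window):
    """Return (filtered value, number of multiplications used)."""
    value = sum(c * s for c, s in zip(coeffs, window))
    return value, len(coeffs)


def filter_then_add(win0, win1):
    """Order used by the first image generation unit: filter each list, then add."""
    v0, m0 = fir_counted(F_HALF, win0)
    v1, m1 = fir_counted(F_HALF, win1)
    return v0 + v1, m0 + m1


def add_then_filter(win0, win1):
    """Order used by the second image generation unit: add samples, then filter once."""
    summed = [s0 + s1 for s0, s1 in zip(win0, win1)]
    return fir_counted(F_HALF, summed)
```

In this one-dimensional case the add-then-filter order needs 8 multiplications instead of 16 for the same output value.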

The following shows a specific method by which the second image generation unit 203 calculates bipredSample_{predSampleL0, predSampleL1} based on the fractional accuracy information.

As described above, the second image generation unit 203 calculates bipredSample_{predSampleL0, predSampleL1} for, for example, the combinations of pixel positions whose value is "1" in the table of FIG. 9. Equations 6 to 33 below are examples of the formulas for calculating bipredSample_{predSampleL0, predSampleL1} for each combination of pixel positions whose value is "1" in the table of FIG. 9.

In these examples, when filtering is performed in both the horizontal and vertical directions and the number of operations is the same either way, the vertical direction is processed first and the horizontal direction second; the reverse order can be implemented in the same way.

When weighted prediction is performed, the first image generation unit 202 creates, as in H.264/AVC, a weighted prediction image from the interpolated images using the weight coefficient W_L, the offset coefficient O_L, and the scaling LS of list L, as follows:
predSample = clip((W_0 × T_{0, predSampleL0} + W_1 × T_{1, predSampleL1} + (offset3 + LS + 2)) >> (shift3 + LS + 2) + ((O_0 + O_1 + 1) >> 1))
Alternatively, a weighted interpolated image TW_{L, predSampleLx} given by equation (2) below is created, and equation (3) below is then substituted into equation (1) to create the weighted prediction image.
TW_{L, predSampleLx} =
(W_L × T_{L, predSampleLx} + (1 << (LS − 1))) >> LS + O_L ... (2)
bipredSample_{predSampleL0, predSampleL1} = (TW_{0, predSampleL0} + TW_{1, predSampleL1}) ... (3)
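Equations (2) and (3) can be sketched as below. The grouping of the shift before the offset is added, that is ((W × T + rounding) >> LS) + O, is an assumption based on the usual H.264/AVC reading of such expressions, and the function names are illustrative. LS is assumed to be at least 1 so that 1 << (LS − 1) is defined.

```python
def weighted_interp(t, w, o, ls):
    """Equation (2), read as ((W_L * T + (1 << (LS - 1))) >> LS) + O_L."""
    return ((w * t + (1 << (ls - 1))) >> ls) + o


def bipred_weighted(t0, t1, w0, w1, o0, o1, ls):
    """Equation (3): sum of the weighted interpolated values of list 0 and list 1."""
    return weighted_interp(t0, w0, o0, ls) + weighted_interp(t1, w1, o1, ls)
```

With W = 32 and LS = 5 the weight is effectively 1, so the function returns the input value plus the offset.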

In the case of the second image generation unit 203, the integer pixels A_{x, y} are multiplied by the weight coefficient in advance,
AW_{L, x, y} = W_L × A_{L, x, y}
to generate weighted integer pixel values AW_{L, x, y}, and the weighted prediction image is then created from the image bipredSample_{predSampleL0, predSampleL1} generated by the second image generation unit 203 as follows:
predSample = clip((bipredSample_{predSampleL0, predSampleL1} + (offset3 + LS + 2)) >> (shift3 + LS + 2) + ((O_0 + O_1 + 1) >> 1))
Alternatively, weighted integer pixel values AW_{L, x, y} are created in advance from the integer pixels A as
AW_{L, x, y} = (W_L × A_{L, x, y} + (1 << (LS − 1))) >> LS + O_L
and the image bipredSample_{predSampleL0, predSampleL1} generated by the second image generation unit 203 is substituted into equation (1) to create the weighted prediction image.
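Weighting the integer pixels before the addition works for the same reason as adding before filtering: the filter is linear. The sketch below covers only the first variant, AW = W × A with no rounding, and uses hypothetical filter coefficients and helper names.

```python
# Hypothetical 8-tap 1/2-position filter (coefficients sum to 64); not from the source.
F_HALF = [-1, 4, -11, 40, 40, -11, 4, -1]


def fir(coeffs, window):
    """Apply filter coefficients to a window of integer samples."""
    return sum(c * s for c, s in zip(coeffs, window))


def weight_after_filter(win0, win1, w0, w1):
    """Filter each list separately, then apply the weights and add."""
    return w0 * fir(F_HALF, win0) + w1 * fir(F_HALF, win1)


def weight_before_filter(win0, win1, w0, w1):
    """Form AW = W * A for each list, add the weighted samples, then filter once."""
    aw0 = [w0 * a for a in win0]
    aw1 = [w1 * a for a in win1]
    return fir(F_HALF, [x + y for x, y in zip(aw0, aw1)])
```

The two functions agree exactly here; the variant that rounds and adds the offset per integer pixel would generally differ slightly from per-sample weighting of interpolated values.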

Next, the interpolated image generation processing performed by the image encoding device 600 according to the first embodiment will be described with reference to FIG. 12. FIG. 12 is a flowchart showing the overall flow of the interpolated image generation processing in the first embodiment.

The motion compensation prediction unit 107 reads image signals with integer pixel accuracy from the reference image signals of list 0 and list 1 according to the integer accuracy information calculated by the accuracy calculation unit 104 (step S101). The motion compensation prediction unit 107 outputs the read image signals to the bidirectional interpolation image generation unit 109 as the motion compensated prediction image 108 with integer pixel accuracy.

The bidirectional interpolation image generation unit 109 receives the fractional accuracy information (xFrac, yFrac) of list 0 and of list 1 calculated by the accuracy calculation unit 104 (step S102). Based on the combination of pixel positions corresponding to the fractional accuracy information of list 0 and list 1, the bidirectional interpolation image generation unit 109 determines whether to use the second image generation unit 203 (step S103).

For example, the bidirectional interpolation image generation unit 109 obtains the pixel position corresponding to the fractional accuracy information for each of list 0 and list 1. When the fractional accuracy information is (xFrac, yFrac) = (1, 1), the bidirectional interpolation image generation unit 109 can identify the pixel position as "e", as shown in FIG. 7. Next, the bidirectional interpolation image generation unit 109 determines whether the combination of the pixel position obtained for list 0 and the pixel position obtained for list 1 is a predetermined combination (a specific set). For example, it makes this determination by treating the combinations of pixel positions whose value is "1" in the table of FIG. 9 as specific sets. If the combination of the two pixel positions is a specific set, the bidirectional interpolation image generation unit 109 determines that the second image generation unit 203 is to be used. Otherwise, it determines that the first image generation unit 202 is to be used.
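The determination in step S103 can be sketched as a table lookup. The 4 × 4 layout below is an assumption about FIG. 7, chosen to agree with the two positions this description states explicitly ((1, 1) is "e", and "f" has xFrac = 1/2, yFrac = 1/4). The specific-set test is a hypothetical stand-in for the "1" entries of FIG. 9: it accepts only identical positions in both lists, whereas the real table also contains other combinations.

```python
# Assumed layout of FIG. 7: rows indexed by yFrac, columns by xFrac, in 1/4-pixel units.
POSITIONS = [
    ["A", "a", "b", "c"],
    ["d", "e", "f", "g"],
    ["h", "i", "j", "k"],
    ["n", "p", "q", "r"],
]


def pixel_position(x_frac, y_frac):
    """Map fractional accuracy information (xFrac, yFrac) to a position label."""
    return POSITIONS[y_frac][x_frac]


def is_specific_set(pos0, pos1):
    """Hypothetical stand-in for FIG. 9: same fractional position in both lists."""
    return pos0 == pos1


def select_unit(frac0, frac1):
    """Return 203 (second image generation unit) for a specific set, else 202."""
    pos0 = pixel_position(*frac0)
    pos1 = pixel_position(*frac1)
    return 203 if is_specific_set(pos0, pos1) else 202
```

For example, `select_unit((1, 1), (1, 1))` selects unit 203 under this stand-in rule, while `select_unit((2, 1), (1, 1))`, the pair f and e, falls back to unit 202.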

When the first image generation unit 202 is used (step S103: No), the list 0 interpolation unit 301 of the first image generation unit 202 creates the interpolated image of list 0, and the list 1 interpolation unit 302 of the first image generation unit 202 creates the interpolated image of list 1 (step S104). Next, the addition unit 303 adds the interpolated image of list 0 and the interpolated image of list 1 to create and output the bidirectional prediction image signal 110 (step S105).

When the second image generation unit 203 is used (step S103: Yes), the second image generation unit 203 creates the bidirectional prediction image signal 110 according to Equations 6 to 33 (steps S106 and S107).

First, when the combination of pixel positions corresponds to a set predetermined as a combination whose pixel positions are to be replaced, the replacement unit 401 replaces the pixel positions included in that set with predetermined pixel positions. The addition unit 402 adds the pixel values of the motion compensated prediction image 108 for list 0 and the pixel values of the motion compensated prediction image 108 for list 1 (step S106). The interpolation unit 403 applies filter-based interpolation to the image obtained by the addition, and creates and outputs the bidirectional prediction image signal 110 (step S107).

As described above, in this embodiment, when the combination of the pixel positions of the two reference images satisfies a specific condition, the pixel values can be added before the filter processing is executed. Compared with the conventional method, which creates the interpolated images of list 0 and list 1 independently and then averages them, this embodiment therefore reduces the number of multiplications in the filter processing and the number of other operations, such as the clipping used to limit calculation precision. That is, the bidirectional interpolation image generation processing can be performed efficiently.

FIG. 13 is a diagram showing the number of multiplications per pixel according to the first embodiment when N = 4. FIG. 14 is a diagram showing the number of multiplications per pixel for the conventional method (in which all interpolated pixels are calculated by the first image generation unit 202) when N = 4.

As shown in FIG. 13, when this embodiment is applied, the maximum number of multiplications is 104 and the average is 51.92. In contrast, as shown in FIG. 14, when the conventional method is applied, the maximum number of multiplications is 144 and the average is 62.5.
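As a quick check of what these figures imply, the relative reduction can be computed directly from the stated counts. The helper name is illustrative; only the four numbers come from the description.

```python
def reduction_percent(before, after):
    """Relative reduction from `before` to `after`, in percent."""
    return 100.0 * (before - after) / before


# Figures stated for N = 4: conventional method (FIG. 14) vs. first embodiment (FIG. 13).
worst_case = reduction_percent(144, 104)
average = reduction_percent(62.5, 51.92)
```

The worst case drops by roughly 27.8% and the average by roughly 16.9%.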

As described above, in the image encoding device according to the first embodiment, when the set of pixel positions of the two reference images used for bidirectional prediction corresponds to a specific set, the pixel values at those pixel positions are added first and the multiplication by the filter coefficients is performed afterward. This reduces the amount of calculation required for bidirectional prediction.

Note that the second image generation unit 203 may be applied to only some of the combinations of fractional accuracy information described above. That is, for example, the configuration may use only some of Equations 6 to 33. Even in this case, the amount of calculation can be reduced at least for the combinations to which the second image generation unit 203 is applied.

(Second Embodiment)
The image encoding device and the image decoding device according to the second embodiment further reduce the amount of calculation performed by the first image generation unit. The first image generation unit of the second embodiment uses a filter with a shorter tap length for pixel positions that require a large amount of calculation.

For example, in the first image generation unit 202 of the first embodiment, the 32 cases in which the fractional accuracy information of list 0 and list 1 is a combination of one of f, i, k, q with one of e, g, p, r give the worst-case number of multiplications (144; see FIG. 14). The first image generation unit of the second embodiment therefore uses a filter with a short tap length, N′ < N, when calculating the interpolated images at the pixel positions e, g, p, and r. Equation 34 below gives the formulas for calculating the interpolated images T′_{L, e}, T′_{L, g}, T′_{L, p}, and T′_{L, r} at the pixel positions e, g, p, and r of list L.

FIG. 15 is a diagram showing the number of multiplications per pixel according to the second embodiment. When N = 4 and N′ = 3, the portion that required a worst case of 104 multiplications now requires 78. The average over the 256 combinations is 48.67, a further reduction of about 3 multiplications from the first embodiment. In this embodiment, the shorter filter is applied to the 32 combinations of f, i, k, q with e, g, p, r, but it may be applied to only some of these or to other combinations. For example, the eight combinations of j with e, g, p, r, which require 92 multiplications in FIG. 15, may also be included. Alternatively, N′ may be applied to all or some of the combinations of e, f, g, i, k, p, q, r in list 0 and list 1, all of which require many operations.
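The selection rule of this embodiment can be sketched as follows: the shorter value N′ is used for a position in {e, g, p, r} only when the other list's position is in {f, i, k, q}. The function and set names are illustrative, and the default values N = 4 and N′ = 3 are the example values from the description.

```python
PAIRED_WITH = {"f", "i", "k", "q"}   # positions whose partner gets the short filter
SHORT_TAP = {"e", "g", "p", "r"}     # positions interpolated with N' instead of N


def filter_lengths(pos0, pos1, n=4, n_short=3):
    """Return the (list 0, list 1) filter parameters N or N' for a position pair."""
    def pick(own, other):
        return n_short if own in SHORT_TAP and other in PAIRED_WITH else n

    return pick(pos0, pos1), pick(pos1, pos0)
```

Counting the ordered position pairs for which one list receives N′ gives 32, matching the number of combinations named in the text.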

(Third Embodiment)
For pixel positions that require a large amount of calculation, the image encoding device and the image decoding device according to the third embodiment approximate list 0 and list 1 so that they have the same fractional accuracy, which allows the second image generation unit 203 to generate the bidirectional prediction image.

As described above, the 32 cases in which the fractional accuracy information of list 0 and list 1 is a combination of one of f, i, k, q with one of e, g, p, r give the worst-case number of multiplications. In the third embodiment, therefore, when, for example, list 0 is at position f (xFrac = 1/2, yFrac = 1/4) and list 1 is at position e (xFrac = 1/4, yFrac = 1/4), both are approximated to xFrac = 3/8, yFrac = 1/4. In this case, a filter given by the following equation is newly introduced to generate an interpolated image with 3/8-pixel accuracy.
F_{3/8, x} = F_{5/8, −x+1} (x = −N+1, ..., N)
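The mirror relation for the 3/8 filter and the choice of the common position can be sketched as below. The 5/8 coefficients and the small value N = 2 are hypothetical, chosen only to keep the example short. Treating the common position as the midpoint of the two fractional positions is an interpretation of the single example given above, not a rule stated in the source.

```python
from fractions import Fraction

N = 2  # assumed small N for brevity; the filter index x runs from -N+1 to N
# Hypothetical 5/8-position coefficients indexed by x (sum = 64); not from the source.
F_5_8 = {-1: -3, 0: 25, 1: 47, 2: -5}


def mirror_filter(f_src, n):
    """Build F_{3/8,x} = F_{5/8,-x+1} for x = -n+1, ..., n."""
    return {x: f_src[-x + 1] for x in range(-n + 1, n + 1)}


F_3_8 = mirror_filter(F_5_8, N)


def common_position(frac0, frac1):
    """Midpoint of two fractional positions (assumed reading of the example)."""
    return (frac0 + frac1) / 2
```

With these inputs, `common_position(Fraction(1, 2), Fraction(1, 4))` gives 3/8 for xFrac, and the shared yFrac of 1/4 is left unchanged.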

Specifically, the prediction formulas shown in Equations 35 to 37 below are introduced for the 32 combinations of f, i, k, q with e, g, p, r.

FIG. 16 is a diagram showing the number of multiplications per pixel according to the third embodiment. With the above prediction formulas, when N = 4, the portion that previously required a worst case of 104 multiplications now requires 72, 40, or 36 multiplications. The average over the 256 combinations is 45.80, a further reduction of about 6 multiplications from the first embodiment. In this embodiment, the approximation is applied to the 32 combinations of f, i, k, q with e, g, p, r, but it may be applied to only some of these or to other combinations. For example, the eight combinations of j with e, g, p, r, which require 92 multiplications in FIG. 16, may also be included.

(Fourth Embodiment)
For pixel positions that require a large amount of calculation, the image encoding device and the image decoding device according to the fourth embodiment replace the pixel positions of list 0 and list 1 with pixel positions that have the same fractional accuracy, which allows the second image generation unit 203 to generate the bidirectional prediction image.

Specifically, the following 16 combinations, which would otherwise be calculated by the first image generation unit 202,
bipredSample_{e, k}
bipredSample_{k, e}
bipredSample_{g, i}
bipredSample_{i, g}
bipredSample_{e, q}
bipredSample_{q, e}
bipredSample_{f, p}
bipredSample_{p, f}
bipredSample_{f, r}
bipredSample_{r, f}
bipredSample_{g, q}
bipredSample_{q, g}
bipredSample_{i, r}
bipredSample_{r, i}
bipredSample_{k, p}
bipredSample_{p, k}
are replaced with bipredSample_{j, j} as calculated by the second image generation unit 203.

In addition, the following four combinations, which would otherwise be calculated by the first image generation unit 202,
bipredSample_{e, f}
bipredSample_{f, e}
bipredSample_{e, i}
bipredSample_{i, e}
are replaced with bipredSample_{e, e} as calculated by the second image generation unit 203.

In addition, the following four combinations, which would otherwise be calculated by the first image generation unit 202,
bipredSample_{f, g}
bipredSample_{g, f}
bipredSample_{g, k}
bipredSample_{k, g}
are replaced with bipredSample_{g, g} as calculated by the second image generation unit 203.

In addition, the following four combinations, which would otherwise be calculated by the first image generation unit 202,
bipredSample_{i, p}
bipredSample_{p, i}
bipredSample_{p, q}
bipredSample_{q, p}
are replaced with bipredSample_{p, p} as calculated by the second image generation unit 203.

In addition, the following four combinations, which would otherwise be calculated by the first image generation unit 202,
bipredSample_{k, r}
bipredSample_{r, k}
bipredSample_{q, r}
bipredSample_{r, q}
are replaced with bipredSample_{r, r} as calculated by the second image generation unit 203.
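The replacements listed above amount to a lookup table from an ordered pair of positions to the pair that is actually computed. A minimal sketch that builds this table from the listed combinations; the data structure and function names are illustrative.

```python
# Unordered pairs from the lists above, grouped by the position that replaces them.
GROUPS = {
    "j": [("e", "k"), ("g", "i"), ("e", "q"), ("f", "p"),
          ("f", "r"), ("g", "q"), ("i", "r"), ("k", "p")],
    "e": [("e", "f"), ("e", "i")],
    "g": [("f", "g"), ("g", "k")],
    "p": [("i", "p"), ("p", "q")],
    "r": [("k", "r"), ("q", "r")],
}

# Expand to ordered (list 0, list 1) pairs, since both orders are listed in the text.
REPLACEMENTS = {}
for target, pairs in GROUPS.items():
    for a, b in pairs:
        REPLACEMENTS[(a, b)] = (target, target)
        REPLACEMENTS[(b, a)] = (target, target)


def replace_combination(pos0, pos1):
    """Return the replaced pair, or the original pair if no replacement is listed."""
    return REPLACEMENTS.get((pos0, pos1), (pos0, pos1))
```

The table holds 32 ordered pairs (16 + 4 + 4 + 4 + 4), and each pair combines one of f, i, k, q with one of e, g, p, r.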

FIG. 17 is a diagram showing the number of multiplications per pixel according to the fourth embodiment. In this embodiment, when N = 4, the portion that previously required a worst case of 96 multiplications now requires 16 or 64 multiplications. The average over the 256 combinations is about 44.80, a further reduction of about 7 multiplications from the first embodiment.

In this embodiment, the replacement is applied to the 32 combinations of f, i, k, q with e, g, p, r, but it may be applied to only some of these or to other combinations. For example, the eight combinations of j with e, g, p, r, which require 92 multiplications in FIG. 17, may also be included, with all eight replaced with bipredSample_{j, j}. Alternatively, of these eight, the combinations including e, g, p, and r may be replaced with bipredSample_{e, e}, bipredSample_{g, g}, bipredSample_{p, p}, and bipredSample_{r, r}, respectively.

As described above, the first to fourth embodiments reduce the number of operations required to generate an interpolated image.

Next, the hardware configuration of the devices (the image encoding device and the image decoding device) according to the first to fourth embodiments will be described with reference to FIG. 18. FIG. 18 is an explanatory diagram showing the hardware configuration of the devices according to the first to fourth embodiments.

The devices according to the first to fourth embodiments include a control device such as a CPU (Central Processing Unit) 51, storage devices such as a ROM (Read Only Memory) 52 and a RAM (Random Access Memory) 53, a communication I/F 54 that connects to a network and performs communication, and a bus 61 that connects these units.

The program executed by the devices according to the first to fourth embodiments is provided preinstalled in the ROM 52 or the like.

The program executed by the devices according to the first to fourth embodiments may instead be recorded, as a file in an installable or executable format, on a computer-readable recording medium such as a CD-ROM (Compact Disk Read Only Memory), a flexible disk (FD), a CD-R (Compact Disk Recordable), or a DVD (Digital Versatile Disk), and provided as a computer program product.

Furthermore, the program executed by the devices according to the first to fourth embodiments may be stored on a computer connected to a network such as the Internet and provided by being downloaded via the network. The program may also be provided or distributed via a network such as the Internet.

The program executed by the devices according to the first to fourth embodiments can cause a computer to function as the units of the devices described above (the motion compensation prediction unit, the bidirectional interpolation image generation unit, the accuracy calculation unit, and so on). In this computer, the CPU 51 can read the program from a computer-readable storage medium onto the main storage device and execute it.

Although several embodiments of the present invention have been described, these embodiments are presented as examples and are not intended to limit the scope of the invention. These novel embodiments can be implemented in various other forms, and various omissions, replacements, and changes can be made without departing from the gist of the invention. These embodiments and their modifications are included in the scope and gist of the invention, and in the invention described in the claims and its equivalents.

104 Accuracy calculation unit
106 Fractional accuracy information
107 Motion compensation prediction unit
108 Motion compensated prediction image
109 Bidirectional interpolation image generation unit
110 Bidirectional prediction image signal
201 Switch unit
202 First image generation unit
203 Second image generation unit
301 List 0 interpolation unit
302 List 1 interpolation unit
303 Addition unit
401 Replacement unit
402 Addition unit
403 Interpolation unit
502 Reference image signal
503 Switch
504 Bidirectional prediction unit
505 Unidirectional prediction unit
506 Intra prediction unit
507 Prediction control information
508 Motion vector information
509 Prediction image signal
600 Image encoding device
601 Input video signal
602 Subtraction unit
603 Transform/quantization unit
604 Inverse quantization/inverse transform unit
605 Entropy encoding unit
606 Addition unit
607 Decoded image signal
608 Frame memory unit
609 Reference image signal
610 Prediction image generation unit
611 Prediction image signal
612 Motion vector search unit
613 Motion vector information
614 Encoding control unit
615 Encoded data
700 Image decoding device
701 Encoded data
702 Entropy decoding unit
703 Inverse quantization/inverse transform unit
704 Addition unit
705 Decoded image signal
706 Frame memory unit
707 Reproduced video signal
708 Reference image signal
710 Prediction image signal
711 Prediction control information
712 Motion vector information

Claims

A motion compensated prediction step for generating a motion compensated prediction image predicted based on a motion vector from a reference image;
An interpolated image generation step of generating a bi-directional prediction image obtained by interpolating the plurality of motion compensation prediction images generated for the plurality of reference images;
An encoding step for generating encoded data obtained by encoding the difference between the input image and the bidirectional prediction image,
In the interpolation image generation step, when a first set, which represents a combination of pixel positions of the plurality of motion compensated prediction images used to obtain the pixel value of a first pixel position of the bidirectional prediction image, is included in a specific set, which is a set of predetermined pixel positions, the pixel values of the pixel positions included in the first set are added, and the pixel value of the first pixel position is calculated by interpolating the added pixel values,
An image encoding method characterized by the above.

The image encoding method according to claim 1, wherein, when the first set is not included in the specific set, the interpolation image generation step calculates, for each pixel position included in the first set, an interpolation value by interpolating the pixel values at that pixel position, and calculates the pixel value at the first pixel position by adding the interpolation values.

The image encoding method according to claim 2, wherein, in the interpolation image generation step, when the first set is not included in the specific set, the tap length of at least some of the filters used to calculate the interpolation values is made shorter than that of the other filters.

The image encoding method according to claim 2, wherein, in the interpolation image generation step, when the first set is not included in the specific set, the first set is replaced with one of the specific sets, determined in advance according to the first set, and the pixel value at the first pixel position is calculated by adding the pixel values at the pixel positions included in that specific set and interpolating the added pixel values.

The image encoding method according to claim 1, wherein the specific set is a set of pixel positions having the same fractional accuracy.

The image encoding method according to claim 1, wherein the specific set is a set of pixel positions whose fractional accuracies are symmetric with respect to 1/2, and
in the interpolation image generation step, when the first set is included in the specific set, the pixel values at one of the pixel positions included in the first set are inverted with respect to the first pixel position, the inverted pixel values are added to the pixel values at the other pixel position included in the first set, and the pixel value at the first pixel position is calculated by interpolating the added pixel values.

The image encoding method according to claim 1, wherein the specific set is a set of pixel positions whose fractional accuracy is 1/2, and
in the interpolation image generation step, when the first set is included in the specific set, the pixel value at the first pixel position is calculated by interpolating the pixel value at each pixel position included in the first set together with the pixel value at a pixel position that uses the same filter coefficient as that pixel position.

The image encoding method according to claim 1, wherein the specific set is a set of pixel positions with different fractional accuracies, and
in the interpolation image generation step, when the first set is included in the specific set, the fractional accuracies of the pixel positions included in the first set are changed, by interpolation, to the same value, the pixel values at the pixel positions having the changed fractional accuracy are added, and the pixel value at the first pixel position is calculated by interpolating the added pixel values using a filter whose filter coefficients are determined according to the changed fractional accuracy.
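
The difference between the add-then-interpolate path of claim 1 and the interpolate-then-add path of claim 2 can be illustrated in one dimension. The following Python sketch is not taken from this document. It assumes that a "specific set" means two pixel positions with the same fractional accuracy (as in claim 5), and the `FILTERS` table, the six-tap coefficients (borrowed from the H.264 half-sample filter) and all function names are hypothetical. Because an FIR filter is linear, both paths give the same unnormalised sum, but the shortcut filters only once.

```python
# Illustrative filters keyed by fractional position. The "1/2" entry uses the
# H.264 six-tap half-sample taps; both entries are assumptions, not values
# taken from this document. Each filter sums to FILTER_GAIN.
FILTERS = {
    "0": (0, 0, 32, 0, 0, 0),
    "1/2": (1, -5, 20, 20, -5, 1),
}
FILTER_GAIN = 32


def fir(samples, taps, pos):
    """Unnormalised filter output over samples[pos : pos + len(taps)]."""
    return sum(t * samples[pos + k] for k, t in enumerate(taps))


def interpolate_then_add(ref0, ref1, frac0, frac1, pos):
    """General path: one filtering pass per reference, then add the results."""
    return fir(ref0, FILTERS[frac0], pos) + fir(ref1, FILTERS[frac1], pos)


def add_then_interpolate(ref0, ref1, frac, pos):
    """Shortcut path: add co-located samples first, then filter once."""
    summed = [a + b for a, b in zip(ref0, ref1)]
    return fir(summed, FILTERS[frac], pos)


def bidirectional_sample(ref0, ref1, frac0, frac1, pos):
    """Average of the two predictions, rounded to the nearest integer."""
    if frac0 == frac1:
        # Hypothetical "specific set": equal fractional positions.
        acc = add_then_interpolate(ref0, ref1, frac0, pos)
    else:
        acc = interpolate_then_add(ref0, ref1, frac0, frac1, pos)
    denom = 2 * FILTER_GAIN  # two predictions, each scaled by FILTER_GAIN
    return (acc + denom // 2) // denom
```

The equality of the two paths is exact here only because normalisation and rounding are applied once, after the sum. A codec that rounds each prediction before averaging would not, in general, produce bit-identical results on the two paths.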

An image decoding method comprising:
a motion compensated prediction step of generating a motion compensated prediction image predicted from a reference image based on a motion vector;
an interpolation image generation step of generating a bidirectional prediction image by interpolating a plurality of the motion compensated prediction images generated for a plurality of the reference images; and
a decoding step of decoding, from encoded data obtained by encoding a difference between an input image and the bidirectional prediction image, the difference, and generating the input image based on the decoded difference and the bidirectional prediction image,
wherein, in the interpolation image generation step, when a first set, which represents a combination of pixel positions in the respective motion compensated prediction images used to obtain a pixel value at a first pixel position of the bidirectional prediction image, is included in a specific set, which is a predetermined set of pixel positions, the pixel values at the pixel positions included in the first set are added, and the pixel value at the first pixel position is calculated by interpolating the added pixel values.

The image decoding method according to claim 9, wherein, when the first set is not included in the specific set, the interpolation image generation step calculates, for each pixel position included in the first set, an interpolation value by interpolating the pixel values at that pixel position, and calculates the pixel value at the first pixel position by adding the interpolation values.

The image decoding method according to claim 10, wherein, in the interpolation image generation step, when the first set is not included in the specific set, the tap length of at least some of the filters used to calculate the interpolation values is made shorter than that of the other filters.

The image decoding method according to claim 10, wherein, in the interpolation image generation step, when the first set is not included in the specific set, the first set is replaced with one of the specific sets, determined in advance according to the first set, and the pixel value at the first pixel position is calculated by adding the pixel values at the pixel positions included in that specific set and interpolating the added pixel values.

The image decoding method according to claim 9, wherein the specific set is a set of pixel positions having the same fractional accuracy.

The image decoding method according to claim 9, wherein the specific set is a set of pixel positions whose fractional accuracies are symmetric with respect to 1/2, and
in the interpolation image generation step, when the first set is included in the specific set, the pixel values at one of the pixel positions included in the first set are inverted with respect to the first pixel position, the inverted pixel values are added to the pixel values at the other pixel position included in the first set, and the pixel value at the first pixel position is calculated by interpolating the added pixel values.

The image decoding method according to claim 9, wherein the specific set is a set of pixel positions whose fractional accuracy is 1/2, and
in the interpolation image generation step, when the first set is included in the specific set, the pixel value at the first pixel position is calculated by interpolating the pixel value at each pixel position included in the first set together with the pixel value at a pixel position that uses the same filter coefficient as that pixel position.

The image decoding method according to claim 9, wherein the specific set is a set of pixel positions with different fractional accuracies, and
in the interpolation image generation step, when the first set is included in the specific set, the fractional accuracies of the pixel positions included in the first set are changed, by interpolation, to the same value, the pixel values at the pixel positions having the changed fractional accuracy are added, and the pixel value at the first pixel position is calculated by interpolating the added pixel values using a filter whose filter coefficients are determined according to the changed fractional accuracy.

An image encoding apparatus comprising:
a motion compensated prediction unit that generates a motion compensated prediction image predicted from a reference image based on a motion vector;
an interpolation image generation unit that generates a bidirectional prediction image by interpolating a plurality of the motion compensated prediction images generated for a plurality of the reference images; and
an encoding unit that generates encoded data by encoding a difference between an input image and the bidirectional prediction image,
wherein, when a first set, which represents a combination of pixel positions in the respective motion compensated prediction images used to obtain a pixel value at a first pixel position of the bidirectional prediction image, is included in a specific set, which is a predetermined set of pixel positions, the interpolation image generation unit adds the pixel values at the pixel positions included in the first set and calculates the pixel value at the first pixel position by interpolating the added pixel values.

An image decoding apparatus comprising:
a motion compensated prediction unit that generates a motion compensated prediction image predicted from a reference image based on a motion vector;
an interpolation image generation unit that generates a bidirectional prediction image by interpolating a plurality of the motion compensated prediction images generated for a plurality of the reference images; and
a decoding unit that decodes, from encoded data obtained by encoding a difference between an input image and the bidirectional prediction image, the difference, and generates the input image based on the decoded difference and the bidirectional prediction image,
wherein, when a first set, which represents a combination of pixel positions in the respective motion compensated prediction images used to obtain a pixel value at a first pixel position of the bidirectional prediction image, is included in a specific set, which is a predetermined set of pixel positions, the interpolation image generation unit adds the pixel values at the pixel positions included in the first set and calculates the pixel value at the first pixel position by interpolating the added pixel values.
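
The case in which the fractional accuracies are symmetric with respect to 1/2 (for example 1/4 and 3/4) can also be sketched in one dimension. The Python fragment below is one possible reading of the "inverted" pixel values, not the method as the document defines it. It assumes that the 3/4 filter is the mirror image of the 1/4 filter, so that reversing one window of reference samples lets both references share the 1/4 filter after addition. The tap values are those of the HEVC quarter-sample luma filter, chosen only as a convenient mirrored pair, and all names are hypothetical.

```python
# Illustrative mirrored filter pair (an assumption, not taken from this
# document): the 3/4 taps are the 1/4 taps in reverse order.
QUARTER = (-1, 4, -10, 58, 17, -5, 1, 0)
THREE_QUARTER = tuple(reversed(QUARTER))


def fir(window, taps):
    """Unnormalised filter output for one window of reference samples."""
    return sum(t * s for t, s in zip(taps, window))


def separate_filters(win0, win1):
    """General path: filter each reference with its own taps, then add."""
    return fir(win0, QUARTER) + fir(win1, THREE_QUARTER)


def mirrored_sum(win0, win1):
    """Shortcut path: mirror the second window, add, then filter once."""
    flipped = win1[::-1]  # inversion about the centre of the window
    summed = [a + b for a, b in zip(win0, flipped)]
    return fir(summed, QUARTER)
```

Both functions return the same unnormalised value, because applying reversed taps to a window is the same as applying the original taps to the reversed window. The shortcut therefore needs one filtering pass instead of two.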