WO2012121052A1 - Image processing device, image processing method, and program - Google Patents
Image processing device, image processing method, and program
- Publication number
- WO2012121052A1 (PCT/JP2012/054856)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- image
- picture
- parallax
- unit
- viewpoint
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T9/00—Image coding
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/40—Scaling of whole images or parts thereof, e.g. expanding or contracting
- G06T3/4007—Scaling of whole images or parts thereof, e.g. expanding or contracting based on interpolation, e.g. bilinear interpolation
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N13/00—Stereoscopic video systems; Multi-view video systems; Details thereof
- H04N2013/0074—Stereoscopic image analysis
- H04N2013/0081—Depth or disparity estimation from stereoscopic image signals
Definitions
- the present technology relates to an image processing device, an image processing method, and a program, and more particularly to an image processing device, an image processing method, and a program that can improve the image quality of a decoded image for a plurality of viewpoint images.
- As an encoding method for encoding images of a plurality of viewpoints, such as 3D (three-dimensional) images, there is, for example, the MVC (Multiview Video Coding) method, which is an extension of the AVC (Advanced Video Coding) (H.264/AVC) method.
- In the MVC method, an image to be encoded is a color image having, as pixel values, values corresponding to the light from a subject, and each of the color images of a plurality of viewpoints is encoded with reference not only to the color image of its own viewpoint but also to the color images of other viewpoints as necessary.
- That is, in the MVC method, the color image of one viewpoint among the color images of a plurality of viewpoints is treated as the base view image, and the color images of the other viewpoints are treated as dependent view images. The base view image (color image) is encoded with reference only to the base view image itself, while a dependent view image (color image) is encoded with reference not only to that dependent view image but also to other view images as necessary.
- In recent years, as images of a plurality of viewpoints, in addition to the color image of each viewpoint, a parallax information image (depth image) having, as pixel values, parallax information related to the parallax of each pixel of the color image of that viewpoint has been adopted, and standards such as the MPEG 3DV method are being formulated as encoding methods for encoding the color image and the parallax information image of each viewpoint.
- In the MPEG 3DV method, the color image and the parallax information image of each viewpoint are in principle encoded in the same manner as in the MVC method, but various encoding methods that improve the encoding efficiency of the parallax information image have been proposed (see, for example, Non-Patent Document 1).
- There is a demand for an encoding method (and decoding method) that, for color images of a plurality of viewpoints and parallax information images of a plurality of viewpoints as images of a plurality of viewpoints, considers not only improvement of encoding efficiency but also improvement of the image quality of the decoded image.
- The present technology has been made in view of such a situation, and makes it possible to improve the image quality of a decoded image for images of a plurality of viewpoints.
- The image processing device or the program according to the first aspect of the present technology is an image processing device comprising: a warping unit that, out of a first viewpoint image and a second viewpoint image of a second viewpoint different from the first viewpoint, warps a picture of the first viewpoint image to generate a picture of a warped image obtained by converting the picture of the first viewpoint image into an image obtained at the second viewpoint; and a reference picture selection unit that obtains a reference index representing a reference picture to be referred to for generating a predicted image of a target block to be decoded of a picture of the second viewpoint image, and, based on the reference index, selects the reference picture from among reference picture candidates including at least the picture of the warped image; or a program for causing a computer to function as such an image processing device.
- The image processing method according to the first aspect of the present technology is an image processing method comprising the steps of: warping, out of a first viewpoint image and a second viewpoint image of a second viewpoint different from the first viewpoint, a picture of the first viewpoint image to generate a picture of a warped image obtained by converting the picture of the first viewpoint image into an image obtained at the second viewpoint; and obtaining a reference index representing a reference picture to be referred to for generating a predicted image of a target block to be decoded of a picture of the second viewpoint image, and selecting the reference picture from among reference picture candidates including at least the picture of the warped image based on the reference index.
- In the first aspect as described above, a picture of the warped image obtained by converting the picture of the first viewpoint image into an image obtained at the second viewpoint is generated by warping the picture of the first viewpoint image, out of the first viewpoint image and the second viewpoint image of the second viewpoint different from the first viewpoint. Then, a reference index representing a reference picture to be referred to for generating a predicted image of a target block to be decoded of the picture of the second viewpoint image is obtained, and, based on the reference index, the reference picture is selected from among reference picture candidates including at least the picture of the warped image.
- The image processing device or the program according to the second aspect of the present technology is an image processing device comprising: a warping unit that, out of a first viewpoint image and a second viewpoint image of a second viewpoint different from the first viewpoint, warps a picture of the first viewpoint image to generate a picture of a warped image obtained by converting the picture of the first viewpoint image into an image obtained at the second viewpoint; a cost calculation unit that calculates, for each reference picture candidate that is a candidate for the reference picture to be referred to in generating a predicted image of a target block to be encoded of a picture of the second viewpoint image, the candidates including at least the picture of the warped image, the encoding cost required for encoding the target block; and a selection unit that, based on the encoding costs, selects and outputs, from among the reference indexes assigned to the respective reference picture candidates, the reference index assigned to the reference picture candidate used for encoding the target block; or a program for causing a computer to function as such an image processing device.
- The image processing method according to the second aspect of the present technology is an image processing method comprising the steps of: warping, out of a first viewpoint image and a second viewpoint image of a second viewpoint different from the first viewpoint, a picture of the first viewpoint image to generate a picture of a warped image obtained by converting the picture of the first viewpoint image into an image obtained at the second viewpoint; calculating, for each reference picture candidate that is a candidate for the reference picture to be referred to in generating a predicted image of a target block to be encoded of a picture of the second viewpoint image, the candidates including at least the picture of the warped image, the encoding cost required for encoding the target block; and, based on the encoding costs, selecting and outputting, from among the reference indexes assigned to the respective reference picture candidates, the reference index assigned to the reference picture candidate used for encoding the target block.
- the image processing apparatus may be an independent apparatus or an internal block constituting one apparatus.
- the program can be provided by being transmitted through a transmission medium or by being recorded on a recording medium.
- the image quality of the decoded image can be improved.
- A block diagram illustrating a configuration example of the encoder 11.
- A diagram explaining the macroblock types of the MVC (AVC) method.
- A diagram explaining the prediction vector (PMV) of the MVC (AVC) method.
- A diagram explaining the prediction vector of a skip macroblock of the MVC (AVC) method.
- A block diagram illustrating a configuration example of the encoder 22.
- A diagram illustrating a decoded parallax image stored in the DPB 31 and a warped parallax image stored in the warped picture buffer 232.
- A block diagram illustrating a configuration example of the disparity prediction unit 234.
- A flowchart describing the encoding process for encoding the parallax image D#2 of view #2.
- A flowchart explaining the parallax prediction process.
- A block diagram showing a configuration example of an embodiment of the multi-view image decoder to which the present technology is applied.
- A block diagram illustrating a configuration example of the decoder 311.
- A block diagram illustrating a configuration example of the decoder 322.
- A block diagram showing a configuration example of the parallax prediction unit.
- A flowchart describing the decoding process for decoding the encoded data of the parallax image D#2 of view #2.
- A flowchart explaining the parallax prediction process.
- A diagram explaining the warped reference allocation scheme applied to color images.
- A block diagram showing a configuration example of the encoder 12, which encodes the color image C#2 with the warped reference allocation scheme.
- A block diagram illustrating a configuration example of the disparity prediction unit 534.
- A flowchart describing the encoding process for encoding the color image C#2 of view #2.
- A flowchart explaining the parallax prediction process.
- A block diagram showing a configuration example of the decoder 312, which decodes the color image C#2 with the warped reference allocation scheme.
- A block diagram showing a configuration example of the parallax prediction unit 663.
- A flowchart describing the decoding process for decoding the encoded data of the color image C#2 of view #2.
- A flowchart explaining the parallax prediction process.
- A diagram explaining the warped reference allocation scheme using candidate pictures that include a picture used for temporal prediction.
- A block diagram illustrating a configuration example of the reference index allocation unit 701.
- A diagram explaining the method of allocating a reference index to a candidate picture based on prediction accuracy.
- A block diagram showing a configuration example of the decoder 322, which decodes the encoded data of the parallax image D#2 with the warped reference allocation scheme using candidate pictures that include a picture used for temporal prediction.
- A diagram explaining parallax and depth.
- A block diagram illustrating a configuration example of an embodiment of a computer to which the present technology is applied.
- A diagram showing a schematic configuration example of a television apparatus to which the present technology is applied.
- A diagram showing a schematic configuration example of a mobile telephone to which the present technology is applied.
- A diagram showing a schematic configuration example of a recording/reproducing apparatus to which the present technology is applied.
- A diagram showing a schematic configuration example of an imaging device to which the present technology is applied.
- FIG. 38 is a diagram illustrating parallax and depth.
- The depth Z of the subject M from the camera c1 (camera c2), that is, the distance from the camera in the depth direction, is defined by the following equation (a): Z = (L / d) × f … (a)
- Here, L is the horizontal distance between the position C1 and the position C2 (hereinafter referred to as the inter-camera distance).
- d is the parallax of the subject M, that is, the value obtained by subtracting the horizontal distance u2 of the position of the subject M on the color image photographed by the camera c2 from the center of that color image, from the horizontal distance u1 of the position of the subject M on the color image photographed by the camera c1 from the center of that color image.
- f is the focal length of the camera c1; in equation (a), the focal lengths of the cameras c1 and c2 are assumed to be the same.
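As an illustration, the conversion in equation (a) can be sketched in Python. These are hypothetical helper functions (not part of the patent), assuming the standard pinhole-stereo relation d = L·f / Z with the symbols defined above:

```python
def depth_from_disparity(d, L, f):
    """Depth Z from parallax d, inter-camera distance L, and focal length f,
    per equation (a): Z = (L / d) * f (same focal length assumed for c1, c2)."""
    return L * f / d

def disparity_from_depth(Z, L, f):
    """Inverse mapping, d = L * f / Z, showing that d and Z are
    uniquely convertible into each other."""
    return L * f / Z

# Round-trip check with illustrative values (L in metres, f in pixels):
Z = depth_from_disparity(d=50.0, L=0.05, f=1000.0)   # -> 1.0
d = disparity_from_depth(Z, L=0.05, f=1000.0)        # -> 50.0
```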
- Since the parallax d and the depth Z can be uniquely converted into each other according to equation (a), in this specification, the image representing the parallax d and the image representing the depth Z of the two-viewpoint color images captured by the cameras c1 and c2 are collectively referred to as depth images (parallax information images).
- The depth image may be an image representing the parallax d or an image representing the depth Z; as its pixel value, not the parallax d or the depth Z itself but a value obtained by normalizing the parallax d, a value obtained by normalizing the reciprocal 1/Z of the depth Z, or the like can be employed.
- The value I obtained by normalizing the parallax d to 8 bits (0 to 255) can be obtained by the following equation (b): I = 255 × (d − D_min) / (D_max − D_min) … (b). Note that the number of bits for normalizing the parallax d is not limited to 8; other bit numbers such as 10 or 12 bits may be used.
- In equation (b), D_max is the maximum value of the parallax d, and D_min is the minimum value of the parallax d.
- the maximum value D max and the minimum value D min may be set in units of one screen, or may be set in units of a plurality of screens.
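A minimal Python sketch of equation (b); the choice of rounding to the nearest integer is our assumption, not stated in the text:

```python
def normalize_disparity(d, d_min, d_max):
    """8-bit normalized value I of parallax d per equation (b):
    I = 255 * (d - D_min) / (D_max - D_min); rounding is an assumption."""
    return round(255 * (d - d_min) / (d_max - d_min))

# D_min maps to 0 and D_max to 255; intermediate values scale linearly:
print(normalize_disparity(10.0, 10.0, 138.0))    # -> 0
print(normalize_disparity(138.0, 10.0, 138.0))   # -> 255
print(normalize_disparity(42.0, 10.0, 138.0))    # -> 64
```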
- The value y obtained by normalizing the reciprocal 1/Z of the depth Z to 8 bits (0 to 255) can be obtained by the following equation (c): y = 255 × (1/Z − 1/Z_far) / (1/Z_near − 1/Z_far) … (c)
- Note that the number of bits for normalizing the reciprocal 1/Z of the depth Z is not limited to 8; other bit numbers such as 10 or 12 bits may be used.
- In equation (c), Z_far is the maximum value of the depth Z, and Z_near is the minimum value of the depth Z.
- the maximum value Z far and the minimum value Z near may be set in units of one screen or may be set in units of a plurality of screens.
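Equation (c) can likewise be sketched in Python (an illustrative helper, with rounding again assumed):

```python
def normalize_inverse_depth(Z, z_near, z_far):
    """8-bit value y from depth Z per equation (c):
    y = 255 * (1/Z - 1/Z_far) / (1/Z_near - 1/Z_far)."""
    return round(255 * (1.0 / Z - 1.0 / z_far) / (1.0 / z_near - 1.0 / z_far))

# The nearest depth maps to 255 and the farthest to 0, so nearby objects
# get finer quantization of 1/Z than distant ones:
print(normalize_inverse_depth(1.0, z_near=1.0, z_far=100.0))    # -> 255
print(normalize_inverse_depth(100.0, z_near=1.0, z_far=100.0))  # -> 0
```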
- In this specification, an image having as pixel values the value I obtained by normalizing the parallax d, and an image having as pixel values the value y obtained by normalizing the reciprocal 1/Z of the depth Z, are collectively referred to as depth images (parallax information images).
- Here, the color format of the depth image is assumed to be YUV420 or YUV400, but other color formats may also be used.
- When attention is paid to the value I or the value y as information on the depth image rather than as a pixel value, the value I or the value y is set as the depth information (parallax information). Further, a map of the values I or y is referred to as a depth map.
- FIG. 1 is a diagram for explaining an example of the encoding method for images of a plurality of viewpoints proposed in Non-Patent Document 1.
- Hereinafter, this already-proposed encoding method is also referred to as the proposed method.
- the viewpoint is also referred to as a view below.
- In FIG. 1, it is assumed that there are a color image C#1 of view #1 and a color image C#2 of view #2, which are the color images of two (different) viewpoints (views), and a parallax information image D#1 of view #1 and a parallax information image D#2 of view #2, which are the parallax information images of those two viewpoints.
- The color images C#1 and C#2 are encoded by, for example, the MVC method. Since the pictures of the color images C#1 and C#2 are predictively encoded with reference to other pictures as necessary, they are locally decoded after encoding in order to generate the predicted images used for the predictive encoding.
- Now, suppose that a block (macroblock) of the t-th picture of the disparity information image D#2 of view #2 is the target block to be encoded.
- The picture having the target block, that is, the picture to be encoded, is also referred to as the target picture.
- In the proposed method, a picture of a warped color image C′#1 is generated by warping the t-th picture of the color image C#1, converting it into an image obtained at viewpoint #2. In the warping, each pixel (value) of the color image C#1 is moved by an amount corresponding to the parallax between viewpoints #1 and #2 at that pixel, whereby the warped color image C′#1 is generated.
- When each pixel of the color image C#1 is simply moved by an amount corresponding to the parallax between viewpoints #1 and #2 at that pixel, occlusion occurs: a portion that is visible in the color image C#2 but not in the color image C#1 becomes a so-called occlusion portion, a hole with no pixel value.
- the hatched portion indicates an occlusion portion.
- That is, the background portion that is visible from viewpoint #2 but, seen from viewpoint #1, hidden behind the foreground due to parallax becomes the occlusion portion.
- The pixels in the occlusion portion are interpolated using the pixel values of surrounding pixels, for example the pixel closest to the occlusion portion in the direction opposite to the moving direction in the warping.
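The warping and hole-filling just described can be sketched as a simplified one-dimensional model. This is our illustrative code, not the document's implementation; real warping operates on full pictures and resolves overlapping pixels by depth order:

```python
def warp_row(pixels, disparities):
    """Forward-warp one scanline: move each pixel left by its disparity
    (viewpoint #1 -> viewpoint #2).  Returns the warped row with holes (None)
    where no source pixel landed -- the occlusion portions.  In this toy model,
    a later source pixel simply overwrites an earlier one on collision."""
    warped = [None] * len(pixels)
    for x, (v, d) in enumerate(zip(pixels, disparities)):
        x2 = x - d
        if 0 <= x2 < len(pixels):
            warped[x2] = v
    return warped

def fill_holes(warped):
    """Interpolate each occlusion hole with the nearest pixel in the direction
    opposite to the warping motion (pixels moved left, so fill from the right,
    i.e. from the revealed background side)."""
    filled = list(warped)
    for x in range(len(filled) - 2, -1, -1):
        if filled[x] is None and filled[x + 1] is not None:
            filled[x] = filled[x + 1]
    return filled

# Foreground (value 99, disparity 2) over a flat background (value 10):
row       = [10, 10, 99, 99, 10, 10]
disparity = [0, 0, 2, 2, 0, 0]
warped = warp_row(row, disparity)   # holes open where the foreground moved away
print(fill_holes(warped))           # holes take the background value 10
```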
- In the proposed method, the block MBC#21 of the t-th picture of the color image C#2 of view #2, at the same position (and of the same size) as the block MBD#21 that is the target block of the t-th picture of the disparity information image D#2 of view #2, is detected. Further, the block MBC′#11 of the picture (the t-th picture) of the warped color image C′#1 at the same position as the block MBC#21 is detected, and the SAD (Sum of Absolute Differences) between the block MBC#21 of the color image C#2 and the block MBC′#11 of the warped color image C′#1 is obtained.
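The SAD used here is simply the sum of absolute pixel differences between two equally sized blocks; a small sketch with invented 2×2 blocks:

```python
def sad(block_a, block_b):
    """Sum of Absolute Differences between two equally sized blocks
    (lists of rows of pixel values)."""
    return sum(abs(a - b) for ra, rb in zip(block_a, block_b)
                          for a, b in zip(ra, rb))

mbc21  = [[120, 121], [119, 120]]   # block of color image C#2 (illustrative)
mbc11w = [[120, 120], [118, 121]]   # co-located block of warped color image C'#1
print(sad(mbc21, mbc11w))  # -> 3
```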
- When the SAD is equal to or less than a predetermined threshold value, the block MBD#21 that is the target block of the disparity information image D#2 is encoded as a skip macroblock with respect to the warped disparity information image D′#1.
- On the other hand, when the SAD is larger than the predetermined threshold, the block MBD#21 that is the target block of the disparity information image D#2 is encoded using (the picture of) the disparity information image D#1 as a reference picture. That is, a shift vector (disparity vector), representing the positional shift of the block in the disparity information image D#1 serving as the reference picture that minimizes the SAD with the target block MBD#21 (hereinafter also referred to as the corresponding block), is detected by ME (Motion Estimation). Then, a predicted image is generated by performing MC (Motion Compensation) based on the shift vector: in the disparity information image D#1 serving as the reference picture, the block at the position shifted by the shift vector from the position of the target block MBD#21, that is, the corresponding block, is acquired as the predicted image, and the target block MBD#21 is encoded using the predicted image.
- That is, the residual of the target block MBD#21 with respect to the predicted image is obtained, and the residual is encoded together with the shift vector (the vector detected by the ME) of the target block MBD#21.
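The ME step above amounts to a search for the shift vector minimizing the SAD with the target block. A small exhaustive-search sketch (hypothetical helper names; real encoders use faster search strategies):

```python
def block_at(picture, y, x, size):
    """Extract the size-by-size block whose top-left corner is (y, x)."""
    return [row[x:x + size] for row in picture[y:y + size]]

def motion_estimation(target, ref_picture, y0, x0, size, search):
    """Full-search ME: find the shift vector (dy, dx) within +/-search whose
    reference block minimizes the SAD with the target block at (y0, x0)."""
    def sad(a, b):
        return sum(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = y0 + dy, x0 + dx
            if 0 <= y <= len(ref_picture) - size and 0 <= x <= len(ref_picture[0]) - size:
                cost = sad(target, block_at(ref_picture, y, x, size))
                if best is None or cost < best[0]:
                    best = (cost, (dy, dx))
    return best  # (minimum SAD, shift vector)

# The target block's content sits at offset (1, 1) in the reference picture:
ref = [[0, 0, 0, 0],
       [0, 5, 6, 0],
       [0, 7, 8, 0],
       [0, 0, 0, 0]]
target = [[5, 6], [7, 8]]
print(motion_estimation(target, ref, 0, 0, 2, 2))  # -> (0, (1, 1))
```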
- Hereinafter, generating a predicted image based on a shift vector is also referred to as shift prediction (disparity prediction, motion prediction) or shift compensation (disparity compensation, motion compensation); shift prediction includes the detection of a shift vector as necessary.
- In FIG. 1, the SAD between the block MBC#21 of the color image C#2 and the block MBC′#11 of the warped color image C′#1 is equal to or less than the predetermined threshold, so the target block MBD#21 is encoded as a skip macroblock with respect to the block MBD′#11 of the warped disparity information image D′#1 at the same position.
- On the other hand, the block MBD#22 of the disparity information image D#2 is predictively encoded using the disparity information image D#1 as a reference picture, with the block MBD#12 of the reference picture, which is the corresponding block corresponding to the block MBD#22, as the predicted image.
- FIG. 2 is a diagram for further explaining the proposed method.
- As described with reference to FIG. 1, in the proposed method, in the encoding of the disparity information image D#2 of view #2, the warped color image C′#1 is generated by warping the color image C#1 of view #1, and the warped disparity information image D′#1 is generated by warping the disparity information image D#1 of view #1.
- In the warping of an image, occlusion may occur, and the occlusion portion in which the occlusion occurs is interpolated with pixel values such as that of the pixel closest to the occlusion portion in the direction opposite to the moving direction in the warping.
- As described with reference to FIG. 1, the portion of the background reflected in the color image of viewpoint #2 that is hidden behind the foreground and invisible from viewpoint #1 becomes the occlusion portion in the warped color image C′#1 obtained by warping the color image C#1 of viewpoint #1. Since the occlusion portion is background, the pixel closest to the occlusion portion in the direction opposite to the moving direction in the warping is also a background (pixel).
- The parallax information values of two nearby pixels in which the background is captured are (almost) the same unless the distance of the background in the depth direction changes sharply. Therefore, in the warped disparity information image D′#1 obtained by warping the disparity information image D#1 of viewpoint #1, when the occlusion portion is interpolated with surrounding pixels, for example the pixel closest to the occlusion portion in the direction opposite to the moving direction in the warping, the correlation between the occlusion portion and the portion of the disparity information image D#2 at the same position is often high.
- On the other hand, the colors of two nearby pixels in which the background is reflected may differ greatly depending on the texture of the background. Therefore, in the warped color image C′#1 obtained by warping the color image C#1 of viewpoint #1, when the occlusion portion is interpolated with surrounding pixels, the correlation between the occlusion portion and the portion of the color image C#2 at the same position is often not high.
- In FIG. 2, the target block of the disparity information image D#2 is the block MBD#22 at the same position as the block MBC′#12 that includes (part of) the occlusion portion of the warped color image C′#1. In this case, the SAD between the block MBC#22 of the color image C#2 and the block MBC′#12 of the warped color image C′#1, which includes the occlusion portion and is at the same position as the block MBD#22, does not fall below the predetermined threshold; as a result, the block MBD#22 that is the target block is predictively encoded using the disparity information image D#1 as a reference picture.
- In the proposed method, a target block at the same position as a block that includes (part of) the occlusion portion of the warped color image C′#1 is thus predictively encoded using the disparity information image D#1 as a reference picture, as with the block MBD#22.
- When the target block of the disparity information image D#2 is predictively encoded in this way, a shift vector (a disparity vector detected by ME, in general a non-zero vector) representing the shift between the target block and the corresponding block of the reference picture is generated, and a code amount is required for that shift vector.
- On the other hand, when the SAD between the block MBC#22 of the color image C#2 and the block MBC′#12 of the warped color image C′#1 is equal to or less than the predetermined threshold, the target block MBD#22 is encoded as a skip macroblock with respect to the warped disparity information image D′#1.
- For the target block MBD#22 encoded as a skip macroblock, the residual is not encoded, so even if there is a margin in the bit rate of the encoded data, the image quality of the decoded image cannot be improved beyond a certain level.
- As described above, in the proposed method, whether the disparity information image D#1 or the warped disparity information image D′#1 is used for encoding the target block of the disparity information image D#2 is determined in the block layer (macroblock layer), using the SAD between the block (macroblock) of the warped color image C′#1 and the block of the color image C#2 at the same position as the target block. Therefore, when encoding the color images C#1 and C#2 and the disparity information images D#1 and D#2 with an existing encoding method such as the MVC method, the macroblock layer must also be changed on the decoder side in order to determine which of the disparity information image D#1 and the warped disparity information image D′#1 is used for decoding the target block; hence a major change to the existing encoding method is required.
- Furthermore, since the warped color image C′#1 generated by warping the decoded color image C#1 is used for the encoding (and decoding) of the disparity information image D#2, it is necessary to store the (locally) decoded color image C#1 used to generate the warped color image C′#1 in the DPB (Decoded Picture Buffer), the buffer for storing (locally) decoded images.
- Therefore, in the present technology, the (picture of the) warped disparity information image D′#1 generated by warping the (locally decoded) disparity information image D#1 is included at least among the reference picture candidates to which reference indexes are assigned, and the target block of the disparity information image D#2 is predictively encoded.
- FIG. 3 is a diagram for explaining the outline of the present technology.
- In the present technology, in the encoding of the disparity information image D#2 of view #2, both the (picture of the) warped disparity information image D′#1 generated by warping the (locally decoded) disparity information image D#1 and the (picture of the) disparity information image D#1 itself are pictures that can become reference pictures. When the warped disparity information image D′#1 is used as the reference picture, the shift vector (disparity vector) is assumed to be the 0 vector, and for the target block MBD#21 of the disparity information image D#2, the block at the position shifted by the shift vector from the position of the target block, that is, the block MBD′#11 at the same position as the target block MBD#21, is acquired as the predicted image by MC.
- Then, the encoding cost COST1′ required for encoding the target block MBD#21 when the warped disparity information image D′#1 is the reference picture is calculated according to equation (1): COST = SAD + λMV … (1). Here, λ is a weight for the value MV corresponding to the code amount of the shift vector, and is set according to the quantization step of the residual.
- That is, since the warped disparity information image D′#1 is an image obtained by converting the disparity information image D#1 of viewpoint #1 into an image viewed from viewpoint #2, it can be estimated that there is no parallax with the disparity information image D#2 of viewpoint #2 (that the parallax has been compensated), so the 0 vector is assumed as the shift vector. For the shift vector that is the 0 vector, 0 (or a small value close to 0) can be adopted as the value MV corresponding to its code amount.
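Equation (1) can be sketched as follows; the numbers are invented for illustration and show why the zero-vector warped candidate can win even when its SAD is slightly larger:

```python
def coding_cost(sad, mv, lam):
    """Encoding cost per equation (1): COST = SAD + lambda * MV, where MV is a
    value corresponding to the code amount of the shift vector."""
    return sad + lam * mv

# Warped picture D'#1: zero vector assumed, so MV ~ 0.  Plain picture D#1:
# a detected disparity vector costs some bits even if its SAD is lower.
cost_warped = coding_cost(sad=210, mv=0, lam=16)   # -> 210
cost_plain  = coding_cost(sad=200, mv=4, lam=16)   # -> 264
print(cost_warped, cost_plain)
```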
- Here, the 0 vector is adopted as the shift vector for the warped disparity information image D′#1, but a shift vector may instead be detected by performing ME between the target block MBD#21 and the warped disparity information image D′#1.
- On the other hand, when the disparity information image D#1 is used as the reference picture, a shift vector (disparity vector) is detected by performing ME between the target block MBD#21 and the disparity information image D#1. Then, the block (corresponding block) MBD#11 at the position shifted by the shift vector from the position of the target block MBD#21 is acquired as the predicted image by MC, and the encoding cost COST1 required for encoding the target block MBD#21 (the encoding cost for the disparity information image D#1) is calculated according to equation (1).
- Then, of the warped disparity information image D′#1 and the disparity information image D#1, the one with the smaller encoding cost is selected as the reference picture used for encoding the target block MBD#21.
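The selection then reduces to taking the candidate with the smallest encoding cost; a minimal sketch (tie-breaking toward the smaller, cheaper-to-code index is our assumption):

```python
def select_reference(costs):
    """Pick the reference index whose candidate picture has the smallest
    encoding cost; ties go to the smaller index."""
    return min(costs, key=lambda ref_idx: (costs[ref_idx], ref_idx))

# ref_idx 0 -> warped disparity image D'#1, ref_idx 1 -> disparity image D#1
# (costs reuse the illustrative numbers from equation (1) above):
print(select_reference({0: 210, 1: 264}))  # -> 0
print(select_reference({0: 300, 1: 264}))  # -> 1
```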
- In MVC (AVC), a reference index ref_idx for distinguishing each picture is assigned to each of the one or more pictures (reference picture candidates) that can become reference pictures in the encoding of the target block. In FIG. 3, images of a plurality of viewpoints are encoded with an encoding method in which, as in MVC, a reference index is assigned to each reference picture candidate; the reference picture candidates (hereinafter also referred to as candidate pictures) are the (pictures of the) warped disparity information image D′#1 and the disparity information image D#1, and a reference index ref_idx is assigned to each of them. Specifically, the reference index ref_idx of value 0 (a first value) is assigned to the warped disparity information image D′#1, and the reference index ref_idx of value 1 (a second value) is assigned to the disparity information image D#1.
- Then, the residual (residual image) of the target block MBD#21 with respect to the predicted image generated using the reference picture, shift vector information on the shift vector (disparity vector) used to determine the predicted image, and the reference index ref_idx assigned to the reference picture used to obtain the predicted image are encoded.
- In MVC (AVC), the reference index assigned to the reference picture of a skip macroblock is always 0, and for a skip macroblock the residual of the target block is not encoded. In other words, when a target block is encoded using the reference picture to which the reference index of value 0 is assigned, the target block can become a skip macroblock; in FIG. 3, this is the case when the warped disparity information image D′#1 is selected as the reference picture.
- According to the present technology, which of the warped disparity information image D′#1 and the disparity information image D#1 is used for decoding the target block can be determined on the decoder side from the reference index ref_idx, so there is no need to change the macroblock layer (or below) as in the proposed method. Images of a plurality of viewpoints can therefore be encoded using an existing encoding method such as MVC without greatly changing it.
- Further, since a color image is not used in selecting the reference picture to be referred to in encoding the target block of the disparity information image D#2, unlike in the proposed method, the locally decoded color image need not be stored in the DPB for the encoding of that target block, and a buffer with a smaller storage capacity than in the proposed method can be adopted as the DPB.
- the candidate picture includes the warped parallax information image D ′ # 1 and the warped parallax information image D ′ # 1 is assigned a reference index ref_idx having a value of 0, which has been proposed. Compared with the method, encoding efficiency can be improved.
- the code amount of the reference index ref_idx having a value of 0 is smaller than the code amount of the reference index ref_idx having other values.
- a reference index ref_idx having a value of 0 is assigned to a candidate picture that can be more easily selected as a reference picture among candidate pictures.
- the code amount can be reduced and the encoding efficiency can be improved.
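The bit-cost asymmetry described above can be made concrete. In AVC/MVC bitstreams, many syntax elements are binarized with Exp-Golomb-style codes, in which the value 0 receives the shortest codeword; the exact binarization of ref_idx depends on the entropy coder in use, so the sketch below illustrates the principle rather than the normative coding:

```python
def ue_golomb_bits(v: int) -> int:
    """Length in bits of the unsigned Exp-Golomb codeword for v:
    (info - 1) leading zeros, a '1', then (info - 1) info bits."""
    info = (v + 1).bit_length()
    return 2 * info - 1

# ref_idx 0 costs 1 bit; ref_idx 1 and 2 cost 3 bits; ref_idx 3 costs 5 bits.
for ref_idx in range(4):
    print(ref_idx, ue_golomb_bits(ref_idx))
```

Assigning the value 0 to the candidate picture that is selected most often therefore minimizes the expected index cost.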
- the candidate pictures are the warped parallax information image D'#1 and the parallax information image D#1, and the reference index ref_idx having a value of 0 is assigned to the warped parallax information image D'#1 rather than to the parallax information image D#1.
- of the warped parallax information image D'#1 and the parallax information image D#1 that are the candidate pictures, the warped parallax information image D'#1 is easily selected as a reference picture, and since the reference index ref_idx having a value of 0 is assigned to such a warped parallax information image D'#1, the encoding efficiency can be improved.
- FIG. 4 is a diagram for explaining that, of the warped parallax information image D'#1 and the parallax information image D#1, the warped parallax information image D'#1 is easily selected as a reference picture.
- the warped parallax information image D'#1 is, as described above, a parallax information image obtainable at viewpoint #2, generated by warping the (locally decoded) parallax information image D#1.
- the SAD between the target block of the parallax information image D # 2 at the viewpoint # 2 and the block of the warped parallax information image D ′ # 1 at the same position as the target block is often a small value.
- that is, by MC assuming that the shift vector is a zero vector, the block (corresponding block) at the position shifted by that shift vector from the position of the target block of the disparity information image D#2, that is, the block at the same position as the target block, is acquired as the predicted image, and the SAD between the target block and the predicted image is often a small value.
- therefore, the warped parallax information image D'#1 is more easily selected as a reference picture than the parallax information image D#1.
- the coding efficiency can be improved by assigning the reference index ref_idx having a value of 0 to the warped disparity information image D ′ # 1 that is easily selected as the reference picture.
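The selection tendency can be illustrated with a toy sketch (the block contents are hypothetical, not taken from the document): with a zero shift vector, the co-located block serves as the predicted image, and the candidate picture yielding the smaller SAD wins the reference-picture selection:

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b) for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))

# Hypothetical co-located 2x2 blocks (pixel values = disparity values).
target    = [[10, 10], [12, 12]]   # target block of D#2
warped_d1 = [[10, 11], [12, 12]]   # co-located block of warped D'#1
plain_d1  = [[18, 18], [20, 20]]   # co-located block of D#1 (not warped)

# Zero shift vector -> compare against the co-located block directly.
costs = {"D'#1 (ref_idx 0)": sad(target, warped_d1),
         "D#1 (ref_idx 1)": sad(target, plain_d1)}
best = min(costs, key=costs.get)
print(best)  # the warped image tends to win the selection
```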
- the parallax information image D # 1 is warped, and an occlusion portion is generated by the warping.
- in the warped parallax information image D'#1, the occlusion portion is interpolated using pixels around the occlusion portion as described with reference to FIG. 2, and the correlation between this occlusion portion and the portion of the parallax information image D#2 at the same position is often high.
- when the target block of the disparity information image D#2 is the block MBD#22 at the same position as the block MBD'#22 including (part of) the occlusion portion of the warped disparity information image D'#1, the SAD between the target block MBD#22 and the predicted image obtained when the warped disparity information image D'#1 is used as a reference picture, that is, the SAD between the target block MBD#22 and the block MBD'#22 of the warped parallax information image D'#1 at the same position as the target block MBD#22, tends to be small.
- therefore, since the warped parallax information image D'#1 is easily selected as a reference picture, encoding efficiency can be improved by assigning the reference index having a value of 0 to such a warped parallax information image D'#1.
- moreover, in this case the shift vector is a zero vector, and thus a non-zero shift vector does not occur.
- on the other hand, for a warped color image, the correlation between the occlusion portion of the warped color image and the portion at the same position of the color image does not become high in many cases.
- in the proposed method, the determination of which of the parallax information image D#1 and the warped parallax information image D'#1 is used to encode the target block of the parallax information image D#2 is performed using the warped color image C'#1. That is, when the target block MBD#22 of the disparity information image D#2 is at the same position as the block MBC'#12 including (part of) the occlusion portion of the warped color image C'#1, the SAD between the block MBC#22 of the color image C#2 at the same position as the target block MBD#22 and the block MBC'#12 including the occlusion portion of the warped color image C'#1 does not fall below a predetermined threshold, and as a result, the block MBD#22 that is the target block is predictively encoded using the disparity information image D#1 as a reference picture.
- that is, in the proposed method, for the target block MBD#22 of the parallax information image D#2 at the same position as the block MBD'#22 including the occlusion portion of the warped parallax information image D'#1, even if the SAD between the target block MBD#22 and the block MBD'#22 of the warped parallax information image D'#1 at the same position as the target block MBD#22 is small, the SAD calculated using the color images is not small (does not fall below the predetermined threshold), so predictive encoding is likely to be performed using the parallax information image D#1 as a reference picture.
- a shift vector (in many cases not a zero vector) is generated by the ME performed using the target block MBD # 22 and the disparity information image D # 1.
- on the other hand, in the present method, for the target block MBD#22 at the same position as the block MBD'#22 including the occlusion portion of the warped parallax information image D'#1, if the SAD between the target block MBD#22 and the block MBD'#22 of the warped parallax information image D'#1 at the same position as the target block MBD#22 is small, the (non-zero) shift vector generated by the ME in the proposed method does not occur.
- the method for encoding a parallax information image described in FIG. 3 can also be applied to a color image.
- FIG. 5 is a block diagram illustrating a configuration example of an embodiment of a multi-view image encoder to which the present technology is applied.
- the multi-viewpoint image encoder in FIG. 5 is an encoder that encodes images of a plurality of viewpoints using, for example, the MVC method, and description of processing similar to that in the MVC method will be omitted as appropriate.
- multi-viewpoint image encoder is not limited to an encoder that uses the MVC method.
- it is assumed that the color image C#1 of view #1 and the color image C#2 of view #2, which are the color images of the two viewpoints #1 and #2, and the parallax information image D#1 of view #1 and the parallax information image D#2 of view #2, which are the parallax information images of the two viewpoints #1 and #2, are adopted as the images to be encoded.
- the color image C#1 and the parallax information image D#1 of view #1 are set as base view images, and the color image C#2 and the parallax information image D#2 of the remaining view #2 are treated as dependent view images.
- note that color images and parallax information images of three or more viewpoints can also be adopted; in that case, among the color images and parallax information images of the three or more viewpoints, the color image and parallax information image of any one viewpoint can be treated as the base view images, and the color images and parallax information images of the remaining viewpoints can be treated as dependent view images.
- the multi-view image encoder includes encoders 11, 12, 21, 22, DPB 31, and a multiplexing unit 32.
- the encoder 11 is supplied with the color image C # 1 of view # 1 and parallax related information (depth related information).
- the parallax related information is metadata of the parallax information (depth information), and details thereof will be described later.
- the encoder 11 encodes the color image C#1 of view #1 using the disparity related information as necessary, and supplies the encoded data of the color image C#1 of view #1 obtained as a result to the multiplexing unit 32.
- the encoder 12 is supplied with the color image C # 2 of view # 2 and parallax related information.
- the encoder 12 encodes the color image C#2 of view #2 using the disparity related information as necessary, and supplies the encoded data of the color image C#2 of view #2 obtained as a result to the multiplexing unit 32.
- the encoder 21 is supplied with the parallax information image D # 1 of view # 1 and parallax-related information.
- the encoder 21 encodes the disparity information image D # 1 of the view # 1 using the disparity related information as necessary, and the encoded data of the disparity information image D # 1 of the view # 1 obtained as a result is The data is supplied to the multiplexing unit 32.
- the encoder 22 is supplied with the parallax information image D # 2 of view # 2 and parallax-related information.
- the encoder 22 encodes the disparity information image D#2 of view #2 using the disparity related information as necessary, and supplies the encoded data of the disparity information image D#2 of view #2 obtained as a result to the multiplexing unit 32.
- the DPB 31 temporarily stores, as reference picture candidates to be referred to when generating a predicted image, the locally decoded images (decoded images) obtained by encoding and locally decoding the images to be encoded in each of the encoders 11, 12, 21, and 22.
- the encoders 11, 12, 21, and 22 perform predictive encoding on the encoding target image. Therefore, the encoders 11, 12, 21, and 22 generate a predicted image to be used for predictive encoding, encode an encoding target image, perform local decoding, and obtain a decoded image.
- the DPB 31 is a shared buffer that temporarily stores decoded images obtained by the encoders 11, 12, 21, and 22.
- each of the encoders 11, 12, 21, and 22 selects, from the decoded images stored in the DPB 31, a reference picture to be referred to for encoding the image to be encoded.
- Each of the encoders 11, 12, 21, and 22 generates a prediction image using the reference picture, and performs image encoding (prediction encoding) using the prediction image.
- since the DPB 31 is shared, each of the encoders 11, 12, 21, and 22 can refer not only to the decoded images obtained by itself but also to the decoded images obtained by the other encoders.
- the multiplexing unit 32 is supplied with the parallax-related information in addition to the encoded data from each of the encoders 11, 12, 21, and 22.
- the multiplexing unit 32 multiplexes the encoded data from each of the encoders 11, 12, 21, and 22, and further disparity related information supplied thereto, and outputs multiplexed data obtained as a result.
- the multiplexed data output from the multiplexing unit 32 is recorded on a recording medium (not shown) or transmitted via a transmission medium (not shown).
- FIG. 6 is a block diagram illustrating a configuration example of a multi-view image generation apparatus that generates images of a plurality of viewpoints to be encoded in the multi-view image encoder of FIG.
- two cameras 41 and 42 are installed at positions from which images of two viewpoints, as the plurality of viewpoints, can be photographed.
- here, it is assumed that the cameras 41 and 42 are arranged at different positions on a straight line on a horizontal plane, with their optical axes directed in a direction perpendicular to that straight line.
- the camera 41 shoots a subject at a position where the camera 41 is arranged, and outputs a color image C # 1 that is a moving image.
- further, the camera 41 outputs, for each pixel of the color image C#1, a disparity vector d1 representing the disparity with respect to a reference viewpoint, with the position of the other camera 42 taken as the reference viewpoint.
- the camera 42 shoots a subject at a position where the camera 42 is arranged, and outputs a color image C # 2 that is a moving image.
- further, the camera 42 outputs, for each pixel of the color image C#2, a disparity vector d2 representing the disparity with respect to a reference viewpoint, with the position of the other camera 41 taken as the reference viewpoint.
- the disparity vectors d1 and d2 are vectors whose y component is 0 and whose x component takes a value corresponding to, for example, the positional relationship of the cameras 41 and 42 in the horizontal direction.
- parallax vectors (parallax) d1 and d2 output from the cameras 41 and 42 are also referred to as shooting parallax vectors d1 and d2.
- the color image C#1 and the shooting parallax vector d1 output from the camera 41, and the color image C#2 and the shooting parallax vector d2 output from the camera 42, are supplied to the multi-viewpoint image information generation unit 43.
- the multi-viewpoint image information generation unit 43 outputs the color image C # 1 from the cameras 41 and 42 as it is.
- the multi-viewpoint image information generation unit 43 obtains, from the shooting parallax vector d1 from the camera 41, parallax information regarding the parallax for each pixel of the color image #1, and generates and outputs the parallax information image D#1 having that parallax information as a pixel value.
- the multi-viewpoint image information generation unit 43 obtains, from the shooting parallax vector d2 from the camera 42, parallax information regarding the parallax for each pixel of the color image #2, and generates and outputs the parallax information image D#2 having that parallax information as a pixel value.
- as the parallax information, there are, for example, the parallax value ν, which is a value corresponding to the shooting parallax vector, and the value y after normalization of the depth Z, which represents the distance (depth) to the subject.
- the pixel value of the parallax information image takes an integer value of 0 to 255 represented by 8 bits, for example.
- let the shooting parallax vector (its x component) be represented by d, and let the maximum value and the minimum value of the shooting parallax vector (its x component) (for example, within a picture, or within a moving image as one content) be dmax and dmin, respectively.
- the parallax value ν is obtained according to equation (2), using, for example, the shooting parallax vector (its x component) d, its maximum value dmax (Dmax), and its minimum value dmin (Dmin).
- the parallax value ν in equation (2) can be converted back into the shooting parallax vector (its x component) d according to equation (3).
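The bodies of equations (2) and (3) appear only in the figures, but the text (8-bit pixel values of 0 to 255, normalization of d between dmin and dmax) implies a mapping of the following form; this sketch is written under that assumption:

```python
def disparity_to_value(d, dmin, dmax):
    """Presumed form of eq. (2): normalize the shooting disparity d
    into an 8-bit parallax value nu in [0, 255]."""
    return round(255.0 * (d - dmin) / (dmax - dmin))

def value_to_disparity(nu, dmin, dmax):
    """Presumed form of eq. (3): map the parallax value nu back to
    a shooting disparity (up to quantization error)."""
    return nu / 255.0 * (dmax - dmin) + dmin

nu = disparity_to_value(4.0, dmin=2.0, dmax=10.0)   # -> 64
d  = value_to_disparity(nu, dmin=2.0, dmax=10.0)    # -> roughly 4.0
```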
- the depth Z represents the distance from the straight line where the cameras 41 and 42 are arranged to the subject.
- let L be the baseline length, which is the distance between the camera 41 and the camera 42 arranged on the straight line (the distance from the reference viewpoint), and let f be the focal length of the camera 41.
- the depth Z can be obtained according to equation (4), using the shooting parallax vector (its x component) d (here, d1).
- the parallax value ν and the depth Z, which are both parallax information, are equivalent information because they can be converted into each other according to equations (2) through (4).
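Equation (4) is likewise given in the figures; from the definitions of L, f, and d it presumably takes the standard triangulation form Z = L·f/d. A sketch under that assumption, with hypothetical camera parameters:

```python
def depth_from_disparity(d, L, f):
    """Presumed form of eq. (4): depth of the subject from the camera
    line, given shooting disparity d, baseline L, and focal length f."""
    if d == 0:
        return float('inf')   # zero disparity -> subject at infinity
    return L * f / d

# Hypothetical rig: 5 cm baseline, 700-pixel focal length.
Z = depth_from_disparity(d=35.0, L=0.05, f=700.0)  # -> 1.0 (metre)
```

Because d maps to ν via equations (2) and (3) and to Z via equation (4), the parallax value and the depth carry the same information, as the text states.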
- in the following, a parallax information image having the parallax value ν as a pixel value is also referred to as a parallax image, and an image having the value y after normalization of the depth Z as a pixel value is also referred to as a depth image.
- in the following, of parallax images and depth images, a parallax image is used as the parallax information image; however, a depth image can also be used as the parallax information image.
- the multi-viewpoint image information generation unit 43 outputs parallax related information in addition to the color images # 1 and # 2 and the parallax images D # 1 and # 2.
- the multi-viewpoint image information generation unit 43 is supplied from the outside with the baseline length L, which is the distance between the cameras 41 and 42 (the distance from the reference viewpoint), and with the focal length f.
- the multi-viewpoint image information generation unit 43 detects the maximum value dmax and the minimum value dmin of the shooting parallax vector (its x component) d for each of the shooting parallax vector d1 from the camera 41 and the shooting parallax vector d2 from the camera 42.
- the multi-viewpoint image information generation unit 43 outputs the maximum value dmax and the minimum value dmin of the shooting parallax vector d, the baseline length L, and the focal length f as parallax related information.
- the color images C#1 and C#2, the parallax images D#1 and D#2, and the parallax related information output from the multi-viewpoint image information generation unit 43 are supplied to the multi-view image encoder in FIG. 5.
- note that, in FIG. 6, the cameras 41 and 42 are arranged on a straight line on the same plane orthogonal to the color image plane, and the shooting parallax vector d (d1 and d2) is therefore a vector whose y component is 0.
- the cameras 41 and 42 can be arranged on different planes orthogonal to the color image plane.
- the shooting parallax vector d is a vector that can be a value other than 0 for both the x component and the y component.
- FIG. 7 is a diagram for explaining a picture to be referred to when a predicted image is generated in MVC predictive coding.
- the pictures of the view #1 image, which is the base view image, are represented in (display) time order as p11, p12, p13, ..., and the pictures of the view #2 image, which is the dependent view image, are represented as p21, p22, p23, ....
- a base view picture, for example the picture p12, is predictively encoded by referring to other base view pictures, for example the pictures p11 and p13, as necessary. That is, for the base view, prediction (generation of a predicted image) can be performed with reference only to pictures at other times of the base view, such as the pictures p11 and p13.
- a dependent view picture, for example the picture p22, is predictively encoded by referring to dependent view pictures, for example the pictures p21 and p23, and further to the base view picture p12, which is a picture of another view. That is, the dependent view picture p22 can be predicted by referring to the pictures p21 and p23, which are pictures at other times of the dependent view, and also to the base view picture p12, which is a picture of another view.
- here, prediction performed with reference to a picture of the same view as the picture to be encoded is also referred to as temporal prediction, and prediction performed with reference to a picture of a view different from that of the picture to be encoded is also referred to as disparity prediction.
- temporal prediction and disparity prediction can be performed for a dependent view picture.
- a picture of a view different from the encoding target picture that is referred to in the disparity prediction must be a picture at the same time as the encoding target picture.
- the encoders 11, 12, 21, and 22 constituting the multi-view image encoder in FIG. 5 perform prediction (prediction image generation) according to the MVC method.
- FIG. 8 is a diagram for explaining the encoding (and decoding) order of pictures in the MVC system.
- the pictures of the view #1 image, which is the base view image, are represented in (display) time order as p11, p12, p13, ..., and the pictures of the view #2 image, which is the dependent view image, are represented as p21, p22, p23, ....
- the base view picture and the dependent view picture are encoded in the same order.
- FIG. 9 is a block diagram illustrating a configuration example of the encoder 11 of FIG.
- encoders 12 and 21 in FIG. 5 are also configured in the same manner as the encoder 11 and encode an image in accordance with, for example, the MVC method.
- the encoder 11 includes an A/D (Analog/Digital) conversion unit 111, a screen rearrangement buffer 112, a calculation unit 113, an orthogonal transform unit 114, a quantization unit 115, a variable length encoding unit 116, an accumulation buffer 117, an inverse quantization unit 118, an inverse orthogonal transform unit 119, a calculation unit 120, a deblocking filter 121, an intra prediction unit 122, an inter prediction unit 123, and a predicted image selection unit 124.
- when the picture supplied to the A/D conversion unit 111 is an analog signal, the A/D conversion unit 111 performs A/D conversion on the analog signal and supplies the result to the screen rearrangement buffer 112.
- the screen rearrangement buffer 112 temporarily stores the pictures from the A/D conversion unit 111 and reads them out according to a predetermined GOP (Group of Pictures) structure, thereby rearranging the pictures from display order into encoding order (decoding order).
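As an illustration of this rearrangement, consider a hypothetical GOP in which each B picture references the nearest following I/P picture; the B pictures must then be moved after that anchor in the coded stream (the actual order depends on the GOP structure in use):

```python
# Display order of a hypothetical GOP: I B B P B B P
display = ["I0", "B1", "B2", "P3", "B4", "B5", "P6"]

def decoding_order(pictures):
    """Reorder so each B picture comes after the forward reference
    (the next non-B picture) it depends on."""
    out, pending_b = [], []
    for pic in pictures:
        if pic.startswith("B"):
            pending_b.append(pic)      # hold B until its anchor is coded
        else:
            out.append(pic)            # I/P anchor is coded first
            out.extend(pending_b)
            pending_b = []
    return out + pending_b

print(decoding_order(display))  # ['I0', 'P3', 'B1', 'B2', 'P6', 'B4', 'B5']
```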
- the picture read from the screen rearrangement buffer 112 is supplied to the calculation unit 113, the intra prediction unit 122, and the inter prediction unit 123.
- the calculation unit 113 is supplied with a picture from the screen rearrangement buffer 112 and a prediction image generated by the intra prediction unit 122 or the inter prediction unit 123 from the prediction image selection unit 124.
- the calculation unit 113 sets the picture read from the screen rearrangement buffer 112 as a target picture to be encoded, and sequentially sets macroblocks constituting the target picture as a target block to be encoded.
- the calculation unit 113 calculates a subtraction value obtained by subtracting the pixel value of the prediction image supplied from the prediction image selection unit 124 from the pixel value of the target block as necessary, and supplies the calculated value to the orthogonal transformation unit 114.
- the orthogonal transform unit 114 performs orthogonal transform such as discrete cosine transform and Karhunen-Loeve transform on the target block (the pixel value or the residual obtained by subtracting the predicted image) from the computation unit 113, and The transform coefficient obtained as a result is supplied to the quantization unit 115.
- the quantization unit 115 quantizes the transform coefficient supplied from the orthogonal transform unit 114, and supplies the quantized value obtained as a result to the variable length coding unit 116.
- the variable length coding unit 116 performs variable length coding (for example, CAVLC (Context-Adaptive Variable Length Coding)) or arithmetic coding (for example, CABAC (Context-Adaptive Binary Arithmetic Coding)) on the quantized value from the quantization unit 115, and supplies the encoded data obtained as a result to the accumulation buffer 117.
- the variable length coding unit 116 is also supplied with header information to be included in the header of the encoded data from the intra prediction unit 122 and the inter prediction unit 123.
- variable length encoding unit 116 encodes the header information from the intra prediction unit 122 or the inter prediction unit 123 and includes it in the header of the encoded data.
- the accumulation buffer 117 temporarily stores the encoded data from the variable length encoding unit 116 and outputs it at a predetermined data rate.
- the encoded data output from the accumulation buffer 117 is supplied to the multiplexing unit 32 (FIG. 5).
- the quantized value obtained by the quantization unit 115 is supplied to the variable length coding unit 116 and also to the inverse quantization unit 118, and local decoding is performed in the inverse quantization unit 118, the inverse orthogonal transform unit 119, and the calculation unit 120.
- the inverse quantization unit 118 inversely quantizes the quantized value from the quantization unit 115 into a transform coefficient and supplies the transform coefficient to the inverse orthogonal transform unit 119.
- the inverse orthogonal transform unit 119 performs inverse orthogonal transform on the transform coefficient from the inverse quantization unit 118 and supplies it to the arithmetic unit 120.
- the calculation unit 120 decodes the target block by adding, as necessary, the pixel value of the predicted image supplied from the predicted image selection unit 124 to the data supplied from the inverse orthogonal transform unit 119, obtains a decoded image, and supplies it to the deblocking filter 121.
- the deblocking filter 121 removes (reduces) block distortion generated in the decoded image by filtering the decoded image from the arithmetic unit 120, and supplies it to the DPB 31 (FIG. 5).
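The local decoding path just described mirrors the decoder, so that the encoder predicts from exactly the pictures the decoder will reconstruct. A heavily simplified scalar sketch (real MVC uses block transforms, clipping, and the deblocking filter; the quantization step here is hypothetical):

```python
QSTEP = 8  # hypothetical quantization step

def encode_block(block, prediction):
    """Forward path: residual -> (identity 'transform') -> quantize."""
    return [(pix - pred) // QSTEP for pix, pred in zip(block, prediction)]

def local_decode(qvals, prediction):
    """Local decoding: inverse quantize and add the prediction back,
    reproducing what the decoder will reconstruct."""
    return [q * QSTEP + pred for q, pred in zip(qvals, prediction)]

block = [100, 104, 96, 90]
pred  = [98, 98, 98, 98]
q     = encode_block(block, pred)
recon = local_decode(q, pred)   # close to, but not equal to, the input
print(recon)
```

Storing `recon` (rather than `block`) as the reference keeps encoder and decoder predictions identical despite the quantization loss.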
- the DPB 31 stores the decoded image from the deblocking filter 121, that is, the locally decoded picture of the color image C#1 encoded by the encoder 11, as a reference picture (candidate) to be referred to when generating a predicted image to be used in subsequent predictive encoding.
- since the DPB 31 is shared by the encoders 11, 12, 21, and 22, in addition to the picture of the color image C#1 encoded and locally decoded by the encoder 11, the DPB 31 also stores the picture of the color image C#2 encoded and locally decoded by the encoder 12, the picture of the parallax image D#1 encoded and locally decoded by the encoder 21, and the picture of the parallax image D#2 encoded and locally decoded by the encoder 22.
- the DPB 31 stores decoded images of an I picture, a P picture, and a Bs picture.
- the intra prediction unit 122 reads from the DPB 31 the portion of the target picture that has already been locally decoded (the decoded image). Then, the intra prediction unit 122 sets a part of the decoded image of the target picture read from the DPB 31 as the predicted image of the target block of the target picture supplied from the screen rearrangement buffer 112.
- the intra prediction unit 122 obtains the encoding cost required to encode the target block using the predicted image, that is, the encoding cost required to encode the residual of the target block with respect to the predicted image, and supplies it to the predicted image selection unit 124 together with the predicted image.
- the inter prediction unit 123 reads out from the DPB 31, as candidate pictures (reference picture candidates), one or more pictures encoded and locally decoded before the target picture.
- the inter prediction unit 123 performs ME using the target block of the target picture from the screen rearrangement buffer 112 and a candidate picture, and detects a shift vector representing the shift (parallax or motion) between the target block and the corresponding block of the candidate picture, that is, the block that minimizes the SAD with the target block.
- when the candidate picture is a picture of the same view as the target picture (at another time), the shift vector detected by the ME using the target block and the candidate picture is a motion vector representing a motion (temporal) shift between the target block and the candidate picture; when the candidate picture is a picture of a view different from that of the target picture, the shift vector detected by the ME using the target block and the candidate picture is a disparity vector representing a disparity (spatial) shift between the target block and the candidate picture.
- the parallax vector obtained by the ME is also referred to as a calculated parallax vector in order to distinguish it from the shooting parallax vector described in FIG.
- note that the shooting disparity vector is a vector whose y component is 0, but the calculated disparity vector detected by the ME represents the shift (positional relationship) between the target block and the block (corresponding block) of the candidate picture that minimizes the SAD with the target block, so its y component is not necessarily 0.
- the inter prediction unit 123 generates a predicted image by performing shift compensation on a candidate picture from the DPB 31 (motion compensation that compensates for a motion shift, or disparity compensation that compensates for a disparity shift) in accordance with the shift vector of the target block.
- the inter prediction unit 123 acquires, as a predicted image, a corresponding block that is a block (region) at a position shifted (shifted) from the position of the target block of the candidate picture according to the shift vector of the target block.
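The ME/MC pair described above amounts to searching for the shift that minimizes the SAD, then fetching the block at the shifted position as the predicted image. A toy full-search sketch on one-dimensional "pictures" (real ME operates on 2-D blocks, often with sub-pel refinement; the signal values are hypothetical):

```python
def sad(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def motion_estimation(target, candidate, pos, block, search):
    """Full search: find the shift minimizing the SAD between the
    target block and the candidate-picture block at pos + shift."""
    tgt = target[pos:pos + block]
    best_shift, best_cost = 0, float('inf')
    for shift in range(-search, search + 1):
        p = pos + shift
        if 0 <= p and p + block <= len(candidate):
            cost = sad(tgt, candidate[p:p + block])
            if cost < best_cost:
                best_shift, best_cost = shift, cost
    return best_shift

def motion_compensation(candidate, pos, block, shift):
    """Fetch the corresponding block as the predicted image."""
    return candidate[pos + shift:pos + shift + block]

cand  = [0, 0, 5, 9, 5, 0, 0, 0]
targ  = [0, 0, 0, 0, 5, 9, 5, 0]   # same pattern, shifted right by 2
shift = motion_estimation(targ, cand, pos=4, block=3, search=3)
pred  = motion_compensation(cand, pos=4, block=3, shift=shift)
print(shift, pred)   # shift -2, predicted block [5, 9, 5]
```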
- the inter prediction unit 123 obtains, for each inter prediction mode that differs in the candidate picture used for generating the predicted image, in the macroblock type described later, and the like, the encoding cost required to encode the target block using the predicted image.
- the inter prediction unit 123 sets the inter prediction mode with the minimum encoding cost as the optimal inter prediction mode, and supplies the predicted image and the encoding cost obtained in the optimal inter prediction mode to the predicted image selection unit 124.
- the predicted image selection unit 124 selects a predicted image having a lower encoding cost from the predicted images from the intra-screen prediction unit 122 and the inter prediction unit 123, and supplies the selected one to the calculation units 113 and 120.
- the intra prediction unit 122 supplies information related to intra prediction to the variable length encoding unit 116 as header information, and the inter prediction unit 123 supplies information related to inter prediction (shift vector information, the reference index, etc.) to the variable length encoding unit 116 as header information.
- the variable length encoding unit 116 selects, of the header information from the intra prediction unit 122 and the inter prediction unit 123, the header information from the unit whose predicted image had the lower encoding cost, and includes it in the header of the encoded data.
- FIG. 10 is a diagram for explaining a macroblock type of the MVC (AVC) system.
- in MVC, a macroblock serving as a target block is a block of 16 × 16 pixels (horizontal × vertical), but ME (and predicted image generation) can be performed for each partition by dividing the macroblock into partitions.
- that is, in MVC, the macroblock can be divided into partitions of 16 × 16, 16 × 8, 8 × 16, or 8 × 8 pixels, and ME can be performed for each partition to detect a shift vector (motion vector or calculated disparity vector).
- furthermore, an 8 × 8 pixel partition can be divided into sub-partitions of 8 × 8, 8 × 4, 4 × 8, or 4 × 4 pixels, and ME can be performed for each sub-partition to detect a shift vector (motion vector or calculated disparity vector).
- the macro block type represents what partition (and sub-partition) the macro block is divided into.
- in the inter prediction unit 123 (FIG. 9), the encoding cost of each macroblock type is calculated as the encoding cost of each inter prediction mode, and the inter prediction mode (macroblock type) with the minimum encoding cost is selected as the optimal inter prediction mode.
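The macroblock-type decision can be pictured as enumerating the allowed partitionings and keeping the cheapest. A schematic sketch with a purely illustrative cost function (real encoders use rate-distortion costs derived from the actual residual and vector bits):

```python
# Allowed MVC/AVC macroblock partitionings of a 16x16 block, as (w, h) lists.
PARTITIONS = {
    "16x16": [(16, 16)],
    "16x8":  [(16, 8)] * 2,
    "8x16":  [(8, 16)] * 2,
    "8x8":   [(8, 8)] * 4,   # each 8x8 may split further into sub-partitions
}

def mode_cost(parts, cost_per_partition):
    """Total cost of a macroblock type: sum of per-partition costs
    (each partition carries its own shift vector)."""
    return sum(cost_per_partition(w, h) for (w, h) in parts)

# Toy cost model: larger partitions pay residual, each partition pays
# a fixed per-vector overhead. Purely illustrative numbers.
toy = lambda w, h: 300 - w * h + 40

best = min(PARTITIONS, key=lambda m: mode_cost(PARTITIONS[m], toy))
print(best)
```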
- FIG. 11 is a diagram for explaining a prediction vector (PMV) of the MVC (AVC) method.
- in MVC, a shift vector (motion vector or calculated disparity vector) is detected, and a predicted image is generated using the shift vector. Since the shift vector is necessary for decoding the image on the decoding side, the shift vector information needs to be encoded and included in the encoded data; however, if the shift vector is encoded as it is, the amount of code increases and the coding efficiency may deteriorate.
- for example, the macroblock may be divided into 8 × 8 pixel partitions, and each of the 8 × 8 pixel partitions may be further divided into 4 × 4 pixel sub-partitions; in that case, one macroblock has 16 shift vectors. Therefore, in the MVC (AVC) method, a prediction vector is generated for the shift vector, and the residual of the shift vector with respect to the prediction vector is encoded as the shift vector information (disparity vector information or motion vector information).
- a certain macro block X is a target block to be encoded.
- the target block X is divided into a 16 × 16 pixel partition (the target block X is used as a partition as it is).
- the prediction vector PMVX of the shift vector mvX of the target block X is calculated, according to the equation (5): PMVX = med (mvA, mvB, mvC), from the shift vectors of macroblocks that have already been encoded (in raster scan order) when the target block X is encoded, namely, the shift vector mvA of the macroblock A adjacent above the target block X, the shift vector mvB of the macroblock B adjacent to the left, and the shift vector mvC of the macroblock C adjacent to the upper right.
- med () represents the median of the values in its parentheses.
- when the target block X is a macroblock at the right end of the picture or the like and the shift vector mvC of the macroblock C is not available (unavailable), the prediction vector PMVX is calculated using the shift vector mvD of the macroblock D adjacent to the upper left of the target block X in place of the shift vector mvC.
- the calculation of the prediction vector PMVX according to the equation (5) is performed independently for each of the x component and the y component.
- the residual mvX − PMVX between the shift vector mvX of the target block X and the prediction vector PMVX is included in the header information as the shift vector information of the target block X.
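The median prediction of equation (5) described above can be sketched as follows. This is an illustrative sketch, not part of the disclosure; all function names are assumptions, and vectors are represented as (x, y) tuples with the median taken independently per component, as the text states.

```python
# Sketch of the MVC (AVC) prediction vector of equation (5):
# PMVX = med(mvA, mvB, mvC), computed independently for x and y components.

def median3(a, b, c):
    """Median of three scalar values."""
    return sorted((a, b, c))[1]

def prediction_vector(mvA, mvB, mvC, mvD=None):
    """Compute PMVX from the neighbouring shift vectors.

    mvA: shift vector of the macroblock A adjacent above the target block X
    mvB: shift vector of the macroblock B adjacent to the left
    mvC: shift vector of the macroblock C adjacent to the upper right
    mvD: shift vector of the macroblock D adjacent to the upper left,
         used in place of mvC when mvC is unavailable (e.g. at the
         right end of the picture)
    """
    if mvC is None:  # mvC unavailable: substitute mvD
        mvC = mvD
    return (median3(mvA[0], mvB[0], mvC[0]),
            median3(mvA[1], mvB[1], mvC[1]))

def residual_vector(mvX, pmvX):
    """Residual mvX - PMVX, the shift vector information that is encoded."""
    return (mvX[0] - pmvX[0], mvX[1] - pmvX[1])
```

Only the residual with respect to PMVX is placed in the header information, which is why a good prediction vector reduces the code amount of the shift vector.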
- FIG. 12 is a diagram for explaining a prediction vector of a skip macroblock of the MVC (AVC) method.
- when a target block is encoded using a reference picture to which a reference index ref_idx having a value of 0 is assigned, the target block can be set as a skip macroblock.
- the method of generating the prediction vector of the shift vector of the target block differs depending on the reference index assigned to the reference picture used for generating the predicted images of the macroblocks around the target block (hereinafter also referred to as the prediction reference index).
- a plurality of pictures can be used as candidate pictures when generating a predicted image.
- the candidate pictures are stored, after being decoded (locally decoded), in a buffer called DPB (Decoded Picture Buffer).
- pictures stored in the DPB are marked as short-time reference pictures (used for short-term reference), long-time reference pictures (used for long-term reference), or non-reference pictures (unused for reference).
- the DPB is managed by the FIFO (First In First Out) method, and the pictures stored in the DPB are released (become non-reference pictures) in order from the picture with the smallest frame_num.
- the I (Intra) picture, the P (Predictive) picture, and the Bs picture, which is a reference B (Bi-directional Predictive) picture, are stored in the DPB as short-time reference pictures.
- the earliest (oldest) short-time reference picture among the short-time reference pictures stored in the DPB is released.
- the moving window memory management method does not affect the long-time reference pictures stored in the DPB; that is, in the moving window memory management method, only the short-time reference pictures among the reference pictures are managed by the FIFO method.
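The moving window (FIFO) management described above can be illustrated with the following minimal sketch. It is not part of the disclosure; the class name, the fixed capacity, and the use of frame_num as the stored key are assumptions, and for simplicity pictures are stored in frame_num order so that the earliest stored picture is also the one with the smallest frame_num.

```python
# Sketch of moving-window DPB management: short-time reference pictures
# are released FIFO; long-time reference pictures are not affected.
from collections import deque

class MovingWindowDPB:
    def __init__(self, capacity):
        self.capacity = capacity
        self.short_term = deque()  # FIFO of frame_num of short-time refs
        self.long_term = []        # long-time refs: untouched by the FIFO

    def store_short_term(self, frame_num):
        """Store a short-time reference picture; return the released
        frame_num (the earliest stored picture) if the window was full."""
        released = None
        if len(self.short_term) == self.capacity:
            released = self.short_term.popleft()  # becomes non-reference
        self.short_term.append(frame_num)
        return released
```

Under the adaptive memory control described next, MMCO commands would instead move entries explicitly between `short_term`, `long_term`, and the non-reference state.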
- pictures stored in the DPB are managed using a command called MMCO (Memory management control operation).
- by the MMCO command, it is possible, for example, to set a short-time reference picture as a non-reference picture, to assign a long-term frame index (a reference index for managing long-time reference pictures) to a short-time reference picture so as to set it as a long-time reference picture, to set a maximum long-term frame index, to set all reference pictures as non-reference pictures, and so on.
- inter prediction for generating a predicted image is performed by performing motion compensation of a reference picture stored in the DPB.
- in inter prediction of a B picture (including a Bs picture), a maximum of two reference pictures can be used.
- the inter predictions using these two reference pictures are referred to as L0 (List 0) prediction and L1 (List 1) prediction, respectively.
- for a B picture (including a Bs picture), L0 prediction, L1 prediction, or both L0 prediction and L1 prediction are used as the inter prediction.
- for a P picture, only L0 prediction is used as the inter prediction.
- a reference picture that is referred to when generating a predicted image is managed by a reference list (Reference Picture List).
- a reference index, which is an index for designating the reference picture to be referred to for generating a predicted image, is assigned to the reference pictures (candidate pictures) stored in the DPB.
- for a P picture, the reference index is assigned only for the L0 prediction.
- for a B picture, since both the L0 prediction and the L1 prediction may be used as the inter prediction, the reference index is assigned for both the L0 prediction and the L1 prediction.
- the reference index for L0 prediction is also referred to as L0 index
- the reference index for L1 prediction is also referred to as L1 index.
- when the target picture is a P picture, by default, a reference index (L0 index) having a smaller value is assigned to a reference picture stored in the DPB that is later in decoding order.
- the reference index is an integer value of 0 or more, and the minimum value is 0. Therefore, when the target picture is a P picture, 0 is assigned to the reference picture decoded immediately before the target picture as the L0 index.
- when the target picture is a B picture (including a Bs picture), by default, the reference indexes (L0 index and L1 index) are assigned to the reference pictures stored in the DPB in POC (Picture Order Count) order, that is, in display order.
- for the L0 prediction, an L0 index having a smaller value is first assigned to the reference pictures that are temporally previous to the target picture in display order, in order of closeness to the target picture, and then an L0 index having a smaller value is assigned to the reference pictures that are temporally subsequent, in order of closeness to the target picture.
- for the L1 prediction, an L1 index having a smaller value is first assigned to the reference pictures that are temporally subsequent to the target picture in display order, in order of closeness to the target picture, and then an L1 index having a smaller value is assigned to the reference pictures that are temporally previous, in order of closeness to the target picture.
- the above default assignment of the reference indexes (L0 index and L1 index) in the AVC method is performed for short-time reference pictures.
- the assignment of the reference index to the long-time reference picture is performed after the reference index is assigned to the short-time reference picture.
- a reference index having a larger value than that of the short-time reference picture is assigned to the long-time reference picture.
- the reference index can be assigned by the default method as described above, or by using a command called Reference Picture List Reordering (hereinafter also referred to as the RPLR command).
- the reference index is assigned to the reference picture by a default method.
- the prediction vector PMVX of the shift vector mvX of the target block X is generated in a different manner depending on the prediction reference indexes of the macroblock A adjacent above the target block X, the macroblock B adjacent to the left, and the macroblock C adjacent to the upper right, that is, the reference indexes assigned to the reference pictures used to generate the predicted images of the macroblocks A, B, and C.
- when, of the three macroblocks A to C adjacent to the target block X, there is one macroblock whose prediction reference index ref_idx is 0, the shift vector of that one macroblock (the macroblock whose prediction reference index ref_idx is 0) is set as the prediction vector PMVX of the shift vector mvX of the target block X.
- for example, when only the macroblock A among the three macroblocks A to C adjacent to the target block X is a macroblock whose prediction reference index ref_idx is 0, the shift vector mvA of the macroblock A is set as the prediction vector PMVX of the target block X (of its shift vector mvX).
- when all of the three macroblocks A to C adjacent to the target block X are macroblocks whose prediction reference index ref_idx is 0, the median med (mvA, mvB, mvC) of the shift vector mvA of the macroblock A, the shift vector mvB of the macroblock B, and the shift vector mvC of the macroblock C is set as the prediction vector PMVX of the target block X.
- when none of the three macroblocks A to C adjacent to the target block X is a macroblock whose prediction reference index ref_idx is 0, a 0 vector is set as the prediction vector PMVX of the target block X.
- in a skip macroblock, the prediction vector is used as it is as the shift vector, and a copy of the block (corresponding block) of the reference picture at the position shifted by the shift vector from the position of the skip macroblock becomes the decoding result of the skip macroblock.
- whether or not the target block is made a skip macroblock depends on the specifications of the encoder, and is determined based on, for example, the amount of encoded data, the encoding cost of the target block, and the like.
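The skip-macroblock prediction vector rule sketched in the text can be illustrated as follows. This is a sketch, not the specification: function names are assumptions, vectors are (x, y) tuples, and cases not spelled out in the surrounding text (e.g. exactly two neighbours with prediction reference index 0) are handled here, as an assumption, with a 0 vector.

```python
# Sketch of the skip-macroblock prediction vector: the result depends on
# how many of the neighbouring macroblocks A, B, C have prediction
# reference index 0.

def skip_prediction_vector(neighbours):
    """neighbours: list of (ref_idx, shift_vector) for macroblocks A, B, C."""
    with_ref0 = [mv for ref_idx, mv in neighbours if ref_idx == 0]
    if len(with_ref0) == 1:
        # exactly one neighbour uses the reference picture with index 0:
        # its shift vector becomes the prediction vector
        return with_ref0[0]
    if len(with_ref0) == 3:
        # all three neighbours use index 0: component-wise median
        xs = sorted(mv[0] for _, mv in neighbours)
        ys = sorted(mv[1] for _, mv in neighbours)
        return (xs[1], ys[1])
    # remaining cases (assumed here): 0 vector
    return (0, 0)
```

Since a skip macroblock transmits neither a shift vector nor a residual, its decoding result is simply the block of the reference picture displaced by this prediction vector.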
- FIG. 13 is a block diagram illustrating a configuration example of the encoder 22 of FIG.
- the encoder 22 encodes the parallax image D # 2 of the view # 2, which is the image to be encoded, in accordance with the MVC method, that is, as described in FIG.
- the encoder 22 includes an A / D conversion unit 211, a screen rearrangement buffer 212, a calculation unit 213, an orthogonal transformation unit 214, a quantization unit 215, a variable length coding unit 216, an accumulation buffer 217, an inverse quantization unit 218, an inverse orthogonal transform unit 219, a calculation unit 220, a deblocking filter 221, an intra-screen prediction unit 222, a predicted image selection unit 224, a warping unit 231, a warped picture buffer 232, a reference index allocation unit 233, and a disparity prediction unit 234.
- the A / D conversion unit 211 to the intra prediction unit 222 and the predicted image selection unit 224 are configured in the same manner as the A / D conversion unit 111 to the intra prediction unit 122 and the predicted image selection unit 124 of the encoder 11 in FIG. 9, respectively, and therefore description thereof will be omitted as appropriate.
- the DPB 31 is supplied with a decoded image, that is, a picture of a parallax image (hereinafter also referred to as a decoded parallax image) D # 2 encoded by the encoder 22 and locally decoded from the deblocking filter 221. It is stored as a candidate picture that can be a reference picture.
- the DPB 31 is also supplied with, and stores, a picture of the color image C # 1 encoded by the encoder 11 and locally decoded, a picture of the color image C # 2 encoded by the encoder 12 and locally decoded, and the parallax image (decoded parallax image) D # 1 encoded by the encoder 21 and locally decoded.
- the decoded parallax image D # 1 obtained by the encoder 21 is used for encoding the parallax image D # 2 to be encoded. Therefore, in FIG. 13, an arrow indicating that the decoded parallax image D # 1 obtained by the encoder 21 is supplied to the DPB 31 is illustrated.
- the warping unit 231 is supplied with, as the parallax related information (FIG. 5), the maximum value dmax and the minimum value dmin of the shooting parallax vector d (the shooting parallax vector d1 of the viewpoint # 1), the baseline length L, and the focal length f.
- the warping unit 231 acquires (reads out), from among the decoded parallax images D # 1 and D # 2 stored in the DPB 31, the picture of the decoded parallax image D # 1 (the picture at the same time as the target picture).
- the warping unit 231 warps the picture of the decoded parallax image D # 1 acquired from the DPB 31 using the parallax related information as necessary, thereby generating a picture of a warped parallax image D ′ # 1, which is the picture of the decoded parallax image D # 1 converted into a parallax image obtainable at the viewpoint # 2.
- the warping unit 231 converts the pixel value of each pixel of the picture of the decoded parallax image D # 1 into the shooting parallax vector d of that pixel, using the maximum value dmax and the minimum value dmin of the shooting parallax vector d.
- when a depth image is used as the parallax image, the baseline length L and the focal length f are used, and, according to the equation (4), the depth Z, which is the value before normalization into the value y serving as the pixel value of the depth image, is converted into the shooting parallax vector d.
- the warping unit 231 generates a picture of the warped parallax image D ′ # 1 by performing warping that moves each pixel of the picture of the decoded parallax image D # 1 according to the shooting parallax vector d of the pixel.
- a picture of the warped parallax image D ′ # 1 may have an occlusion portion, that is, a hole with no pixel value; the pixels of the occlusion portion are interpolated using the pixel value (parallax value) of a surrounding pixel, for example, the pixel closest to the occlusion portion in the direction opposite to the moving direction in the warping.
- the pixel closest to the occlusion portion in the direction opposite to the moving direction in the warping is a pixel whose pixel value is a parallax value indicating the parallax of the background on the far side (background parallax value); therefore, the occlusion portion (its pixels) is interpolated with the background parallax value.
- when the warping unit 231 generates a picture of the warped parallax image D ′ # 1 by warping the picture of the decoded parallax image D # 1, it supplies the picture of the warped parallax image D ′ # 1 to the warped picture buffer 232.
- the warped picture buffer 232 temporarily stores the picture of the warped parallax image D ′ # 1 from the warping unit 231.
- the warped picture buffer 232 that stores the picture of the warped parallax image D ′ # 1 is provided separately from the DPB 31, but it is also possible to use a single buffer as both the DPB 31 and the warped picture buffer 232.
- the reference index allocation unit 233 assigns a reference index to each of the picture of the decoded parallax image D # 1 stored in the DPB 31 and the picture of the warped parallax image D ′ # 1 stored in the warped picture buffer 232, as candidate pictures that can be reference pictures.
- the reference index assigning unit 233 supplies the reference index assigned to the candidate picture to the disparity prediction unit 234.
- of the decoded parallax image D # 1 and the warped parallax image D ′ # 1, the reference index assigning unit 233 assigns a reference index having a value of 1 to the decoded parallax image D # 1, and assigns a reference index having a value of 0 to the warped parallax image D ′ # 1.
- the code amount of a reference index having a value of 0 is smaller than the code amount of a reference index having a value of 1.
- further, the picture of the warped parallax image D ′ # 1 is more likely than the picture of the decoded parallax image D # 1 to give a small encoding cost for the target block, and is therefore more likely to be selected as the reference picture.
- the encoder 22 performs processing according to the MVC (AVC) method similarly to the encoder 11 (and the encoders 12 and 21), except that the warped parallax image D ′ # 1, generated by warping the parallax image (decoded parallax image) D # 1 of the viewpoint # 1 different from the viewpoint # 2 of the parallax image D # 2 to be encoded, is included among the candidate pictures and a reference index is assigned to it.
- the target block can be a skip macroblock.
- the disparity prediction unit 234 performs the disparity prediction (predicted image generation) of the target block using, as reference pictures, the candidate pictures to which the reference indexes are assigned by the reference index assigning unit 233, that is, the picture of the decoded disparity image D # 1 stored in the DPB 31 and the picture of the warped disparity image D ′ # 1 stored in the warped picture buffer 232.
- for each of the picture of the decoded disparity image D # 1 and the picture of the warped disparity image D ′ # 1, which are candidate pictures, the disparity prediction unit 234 calculates the encoding cost required for encoding (predictively encoding) the target block using the predicted image obtained by the disparity prediction from that candidate picture.
- the disparity prediction unit 234 then selects, from among the reference indexes assigned to the picture of the decoded disparity image D # 1 and the picture of the warped disparity image D ′ # 1, which are candidate pictures, the reference index assigned to the candidate picture used for encoding the target block as the prediction reference index of the target block, and outputs it to the variable length encoding unit 216 as one piece of header information.
- furthermore, the disparity prediction unit 234 performs disparity prediction using, as the reference picture, the candidate picture (the picture of the decoded parallax image D # 1 or the picture of the warped parallax image D ′ # 1) to which the prediction reference index of the target block is assigned, and supplies the predicted image generated by the disparity prediction to the predicted image selection unit 224.
- in FIG. 13, the encoder 22 is provided with the parallax prediction unit 234, which performs the parallax prediction of inter prediction; however, the encoder 22 can also perform temporal prediction in addition to the parallax prediction, similarly to the inter prediction unit 123 of the encoder 11 in FIG. 9.
- when the encoder 22 performs temporal prediction in addition to the parallax prediction, the reference index allocation unit 233 assigns reference indexes to the pictures of the warped parallax image D ′ # 1 and the decoded parallax image D # 1, which are candidate pictures that can be referred to in the parallax prediction, and also to pictures of the decoded parallax image D # 2, which are candidate pictures that can be referred to in the temporal prediction (pictures at times different from that of the target picture).
- FIG. 14 is a diagram for explaining the decoded parallax image stored in the DPB 31 of FIG. 13 and the warped parallax image stored in the warped picture buffer 232.
- in the encoder 21, when the t-1th picture D1 (t-1) of the parallax image D # 1 of the view # 1 is encoded and locally decoded, the picture D1 (t-1) of the decoded parallax image D # 1 obtained by the local decoding is supplied to and stored in the DPB 31.
- in the encoder 22, the warping unit 231 warps the picture D1 (t-1) of the decoded parallax image D # 1 stored in the DPB 31, thereby generating a picture D1 ′ (t-1) of the warped parallax image D ′ # 1, which is supplied to and stored in the warped picture buffer 232.
- the reference index assigning unit 233 assigns a reference index having a value of 0 to the picture D1 ′ (t-1) of the warped parallax image D ′ # 1 stored in the warped picture buffer 232, and assigns a reference index having a value of 1 to the picture D1 (t-1) of the decoded parallax image D # 1 stored in the DPB 31.
- the picture D1 ′ (t-1) of the warped parallax image D ′ # 1 or the picture D1 (t-1) of the decoded parallax image D # 1, to which the reference indexes are assigned, is used as a reference picture as necessary.
- the t-1th picture D2 (t-1) of the parallax image D # 2 of the view # 2 is encoded and locally decoded.
- the picture D2 (t ⁇ 1) of the decoded parallax image D # 2 obtained by the local decoding is supplied to the DPB 31 and stored.
- the picture D2 (t-1) of the decoded parallax image D # 2 and the picture D1 (t-1) of the decoded parallax image D # 1 are stored in the DPB 31.
- the t-th picture D1 (t) of the parallax image D # 1 of the view # 1 is encoded and locally decoded.
- the picture D1 (t) of the decoded parallax image D # 1 obtained by the local decoding is supplied to the DPB 31 and stored.
- as a result, the picture D1 (t) of the decoded parallax image D # 1, the picture D2 (t-1) of the decoded parallax image D # 2, and the picture D1 (t-1) of the decoded parallax image D # 1 are stored in the DPB 31.
- the warping unit 231 of the encoder 22 warps the picture D1 (t) of the decoded parallax image D # 1 stored in the DPB 31, thereby generating a picture D1 ′ (t) of the warped parallax image D ′ # 1, which is supplied to and stored in the warped picture buffer 232.
- the warped picture buffer 232 stores the pictures D1 ′ (t) and D1 ′ (t ⁇ 1) of the warped parallax image D ′ # 1 as shown in FIG.
- the reference index assigning unit 233 assigns a reference index having a value of 0 to the picture D1 ′ (t) of the warped parallax image D ′ # 1 stored in the warped picture buffer 232, and assigns a reference index having a value of 1 to the picture D1 (t) of the decoded parallax image D # 1 stored in the DPB 31.
- the picture D1 ′ (t) of the warped parallax image D ′ # 1 to which the reference index is assigned or the picture D1 (t) of the decoded parallax image D # 1 is referred to as a reference picture as necessary.
- the t-th picture D2 (t) of the parallax image D # 2 of the view # 2 is encoded and locally decoded.
- FIG. 15 is a block diagram illustrating a configuration example of the disparity prediction unit 234 in FIG.
- the parallax prediction unit 234 includes a parallax detection unit 241, parallax compensation units 242 and 243, a cost function calculation unit 244, a mode selection unit 245, and a prediction vector generation unit 246.
- the picture of the decoded parallax image D # 1, which is a candidate picture stored in the DPB 31, is supplied to the parallax detection unit 241. Further, the parallax detection unit 241 is supplied with the reference index idx (here, 1) assigned to the picture of the decoded parallax image D # 1, which is a candidate picture, from the reference index assignment unit 233, and with the target block of the picture of the parallax image D # 2 to be encoded from the screen rearrangement buffer 212.
- the parallax detection unit 241 performs ME using the target block and the picture of the decoded parallax image D # 1, which is a candidate picture, thereby calculating a calculated disparity vector mv that represents the disparity of the target block with respect to the viewpoint # 1, and supplies it to the disparity compensation unit 242.
- the parallax compensation unit 242 is supplied with the calculated parallax vector mv as a shift vector from the parallax detection unit 241, and with the picture of the decoded parallax image D # 1, which is a candidate picture stored in the DPB 31. Further, the reference index idx assigned to the picture of the decoded parallax image D # 1, which is a candidate picture, is supplied from the reference index assigning unit 233 to the parallax compensation unit 242.
- the parallax compensation unit 242 generates the predicted image pp of the target block by using the picture of the decoded parallax image D # 1, which is a candidate picture, as the reference picture and performing displacement compensation (parallax compensation) of the reference picture using the calculated parallax vector mv from the parallax detection unit 241, in the same manner as in the MVC method.
- the parallax compensation unit 242 acquires, as the predicted image pp, a corresponding block that is a block at a position shifted by the calculated parallax vector mv from the position of the target block in the picture of the decoded parallax image D # 1.
- the parallax compensation unit 242 supplies the predicted image pp, together with the calculated parallax vector mv from the parallax detection unit 241 and the reference index idx assigned to the picture of the decoded parallax image D # 1 from the reference index assignment unit 233, to the cost function calculation unit 244.
- the disparity compensation unit 243 generates the predicted image pp ′ of the target block by using the picture of the warped disparity image D ′ # 1, which is a candidate picture, as the reference picture and performing displacement compensation (disparity compensation) of the reference picture in the same manner as in the MVC method, assuming that the calculated disparity vector mv ′ serving as the displacement vector is a 0 vector.
- since the warped parallax image D ′ # 1 is an image converted, by warping the parallax image D # 1, into a parallax image obtained at the viewpoint # 2, it can be regarded as having no disparity with respect to the target block of the parallax image D # 2 at the viewpoint # 2, and therefore a 0 vector is adopted as the calculated parallax vector mv ′.
- the parallax compensation unit 243 supplies the predicted image pp ′, together with the calculated parallax vector mv ′ and the reference index idx ′ assigned to the picture of the warped parallax image D ′ # 1 from the reference index assignment unit 233, to the cost function calculation unit 244.
- here, the calculated disparity vector mv ′ of the target block for the picture of the warped parallax image D ′ # 1 is assumed to be a 0 vector; however, it is also possible to perform ME using the picture of the warped parallax image D ′ # 1 and the target block, and to adopt the shift vector obtained by the ME as the calculated disparity vector mv ′.
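The two predicted images can be contrasted with the following sketch, which is illustrative and not part of the disclosure (function names, block representation, and coordinate conventions are assumptions): pp is obtained by displacing the block by the ME result mv in the decoded parallax image D#1, whereas pp′ is simply the co-located block of the warped parallax image D′#1, since mv′ is taken to be the 0 vector.

```python
# Sketch of predicted image generation from the two candidate pictures.
# Pictures are lists of pixel rows; blocks are w x h sub-arrays.

def block_at(picture, x, y, w, h):
    """Extract a w x h block whose top-left corner is (x, y)."""
    return [row[x:x + w] for row in picture[y:y + h]]

def predict_pp(decoded_d1, block_x, block_y, w, h, mv):
    # displacement compensation with the ME result mv = (mvx, mvy)
    return block_at(decoded_d1, block_x + mv[0], block_y + mv[1], w, h)

def predict_pp_warped(warped_d1, block_x, block_y, w, h):
    # mv' is assumed to be the 0 vector: copy the co-located block
    return block_at(warped_d1, block_x, block_y, w, h)
```

Because pp′ needs no ME and its (zero) vector is cheap to encode, the warped candidate picture often yields the lower encoding cost, which motivates assigning it the reference index 0.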
- the cost function calculation unit 244 is supplied with the predicted image pp, the calculated parallax vector mv, and the reference index idx from the parallax compensation unit 242, with the predicted image pp ′, the calculated parallax vector mv ′, and the reference index idx ′ from the parallax compensation unit 243, with the prediction vector from the prediction vector generation unit 246, and with the target block from the screen rearrangement buffer 212.
- for the reference index idx (the picture of the decoded parallax image D # 1 to which that index is assigned), the cost function calculation unit 244 obtains the encoding cost required for encoding the target block for each macroblock type (FIG. 10) according to a cost function, for example, the equation (1).
- that is, for the reference index idx, the cost function calculation unit 244 obtains the residual vector of the calculated disparity vector mv with respect to its prediction vector, and obtains a value MV corresponding to the code amount of that residual vector.
- the cost function calculation unit 244 calculates, for the reference index idx, SAD that is a value corresponding to the residual of the target block with respect to the predicted image pp generated from the decoded parallax image D # 1 to which the reference index idx is assigned. Ask.
- the cost function calculation unit 244 obtains an encoding cost for each macroblock type for the reference index idx according to the equation (1).
- for the reference index idx ′, the cost function calculation unit 244 similarly obtains the encoding cost required for encoding the target block for each macroblock type.
- note that the cost function for obtaining the encoding cost is not limited to the equation (1); that is, the encoding cost can also be obtained, for example, with λ1 and λ2 as weights, by adding SAD, a value obtained by multiplying a value corresponding to the code amount of the residual vector by the weight λ1, and a value obtained by multiplying a value corresponding to the code amount of the reference index by the weight λ2.
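The generalized cost function just described, and the subsequent selection of the (reference index, macroblock type) pair with the minimum cost, can be sketched as follows. This is an illustrative sketch: the bit-count model for the residual vector and all names are assumptions, not the values used by the equation (1).

```python
# Sketch of: cost = SAD + lam1 * (code amount of residual vector)
#                 + lam2 * (code amount of reference index),
# evaluated per (reference index, macroblock type) candidate; the pair
# with the minimum cost becomes the optimal inter prediction mode.

def vector_bits(v):
    """Crude stand-in for the code amount of a residual vector."""
    return sum(abs(c).bit_length() + 1 for c in v)

def encoding_cost(sad, residual_vec, ref_idx_bits, lam1=1.0, lam2=1.0):
    return sad + lam1 * vector_bits(residual_vec) + lam2 * ref_idx_bits

def select_mode(candidates, lam1=1.0, lam2=1.0):
    """candidates: list of (ref_idx, mb_type, sad, residual_vec, ref_idx_bits).
    Returns the (ref_idx, mb_type) pair with the minimum encoding cost."""
    best = min(candidates,
               key=lambda c: encoding_cost(c[2], c[3], c[4], lam1, lam2))
    return best[0], best[1]
```

In this sketch, a candidate referencing the warped picture typically carries a zero residual vector and the cheaper index 0, so it can win even with a slightly larger SAD.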
- when the cost function calculation unit 244 obtains the encoding cost (cost function value) for each macroblock type for each of the reference indexes idx and idx ′, it supplies the encoding costs, together with the reference indexes, the predicted images, and the residual vectors (disparity vector information), to the mode selection unit 245.
- the mode selection unit 245 detects the minimum cost, which is the minimum value, from the encoding cost for each macroblock type for each of the reference indexes idx and idx ′ from the cost function calculation unit 244.
- the mode selection unit 245 selects the reference index and macroblock type for which the minimum cost is obtained as the optimal inter prediction mode.
- alternatively, the reference index with the lower encoding cost can first be selected from the reference indexes idx and idx ′, and the macroblock type with the minimum cost for that reference index can then be selected, so that the reference index and macroblock type are selected as the optimal inter prediction mode.
- the mode selection unit 245 uses, as header information, mode-related information indicating the optimal inter prediction mode, a reference index of the optimal inter prediction mode (reference index for prediction), disparity vector information of the optimal inter prediction mode, and the like. This is supplied to the variable length coding unit 216.
- the mode selection unit 245 supplies the prediction image and the encoding cost (minimum cost) in the optimal inter prediction mode to the prediction image selection unit 224.
- further, the mode selection unit 245 determines (judges) whether or not to encode the target block as a skip macroblock, based on, for example, the minimum cost and the like.
- when the target block is to be encoded as a skip macroblock, the optimal inter prediction mode is set to the skip mode, in which the target block is encoded as a skip macroblock.
- the prediction vector generation unit 246 generates a prediction vector by, for example, the MVC (AVC) method and supplies the prediction vector to the cost function calculation unit 244 as described with reference to FIG.
- FIG. 16 is a flowchart for explaining an encoding process for encoding the parallax image D # 2 of the view # 2 performed by the encoder 22 of FIG.
- in step S11, the A / D conversion unit 211 performs A / D conversion on the analog signal of the picture of the parallax image D # 2 of the view # 2 supplied thereto, supplies the resulting picture to the screen rearrangement buffer 212, and the process proceeds to step S12.
- in step S12, the screen rearrangement buffer 212 temporarily stores the picture of the parallax image D # 2 from the A / D conversion unit 211 and reads out pictures according to a predetermined GOP structure, thereby rearranging the pictures from display order into encoding order (decoding order).
- the picture read from the screen rearrangement buffer 212 is supplied to the calculation unit 213, the intra-screen prediction unit 222, and the parallax prediction unit 234, and the process proceeds from step S12 to step S13.
- in step S13, the calculation unit 213 sets the picture of the parallax image D # 2 from the screen rearrangement buffer 212 as the target picture to be encoded, and further sequentially selects the macroblocks constituting the target picture as the target block to be encoded.
- the calculation unit 213 calculates a difference (residual) between the pixel value of the target block and the pixel value of the predicted image supplied from the predicted image selection unit 224 as necessary, and supplies the difference to the orthogonal transform unit 214. Then, the process proceeds from step S13 to step S14.
- step S14 the orthogonal transform unit 214 performs orthogonal transform on the target block from the calculation unit 213, supplies the transform coefficient obtained as a result to the quantization unit 215, and the process proceeds to step S15.
- in step S15, the quantization unit 215 quantizes the transform coefficient supplied from the orthogonal transform unit 214, supplies the resulting quantized value to the inverse quantization unit 218 and the variable length coding unit 216, and the process proceeds to step S16.
- In step S16, the inverse quantization unit 218 inversely quantizes the quantized values from the quantization unit 215 into transform coefficients, supplies them to the inverse orthogonal transform unit 219, and the process proceeds to step S17.
- In step S17, the inverse orthogonal transform unit 219 performs inverse orthogonal transform on the transform coefficients from the inverse quantization unit 218, supplies the result to the calculation unit 220, and the process proceeds to step S18.
- In step S18, the calculation unit 220 adds, as necessary, the pixel values of the predicted image supplied from the predicted image selection unit 224 to the data supplied from the inverse orthogonal transform unit 219, thereby obtaining a decoded parallax image D # 2 in which the target block is decoded (locally decoded).
- The calculation unit 220 supplies the decoded parallax image D # 2 of the locally decoded target block to the deblocking filter 221, and the process proceeds from step S18 to step S19.
- In step S19, the deblocking filter 221 filters the decoded parallax image D # 2 from the calculation unit 220 and supplies it to the DPB 31 (FIG. 5), and the process proceeds to step S20.
- In step S20, the DPB 31 waits for the decoded parallax image D # 1, obtained by encoding the parallax image D # 1 and performing local decoding, to be supplied from the encoder 21 that encodes the parallax image D # 1, then stores the decoded parallax image D # 1, and the process proceeds to step S21.
- In step S21, the DPB 31 stores the decoded parallax image D # 2 from the deblocking filter 221, and the process proceeds to step S22.
- In step S22, the warping unit 231 generates a picture of the warped parallax image D ′ # 1 by warping the picture of the decoded parallax image D # 1 stored in the DPB 31, supplies the generated picture to the warped picture buffer 232, and the process proceeds to step S23.
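The warping of step S22 can be sketched as a forward warp of one row of a disparity map: each pixel is shifted by its own disparity, and where two pixels land on the same target position the larger (nearer) disparity wins. The sign convention and the per-row processing are assumptions for illustration; the actual warping unit 231 also uses the parallax related information to convert stored pixel values into shooting disparities.

```python
def warp_disparity_row(row):
    """Forward-warp one row of a disparity map from viewpoint #1 toward
    viewpoint #2 (assumed convention: a pixel with disparity d at
    position x appears at x - d in the other view). Positions that no
    pixel reaches remain None: occlusion holes to be filled later."""
    warped = [None] * len(row)
    for x, d in enumerate(row):
        x2 = x - d
        if 0 <= x2 < len(row):
            if warped[x2] is None or d > warped[x2]:
                warped[x2] = d  # keep the nearer (larger-disparity) pixel
    return warped

print(warp_disparity_row([0, 0, 2, 2, 0]))
# [2, 2, None, None, 0]
```

The None entries show why a warped picture contains holes at disoccluded regions, which is one reason the warped picture is only a candidate reference rather than a replacement for the decoded picture.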
- In step S23, the warped picture buffer 232 stores the picture of the warped parallax image D ′ # 1 from the warping unit 231, and the process proceeds to step S24.
- In step S24, the reference index assigning unit 233 assigns a reference index to each of the picture of the decoded parallax image D # 1 stored in the DPB 31 and the picture of the warped parallax image D ′ # 1 stored in the warped picture buffer 232.
- the reference index assigning unit 233 supplies the reference index assigned to each of the picture of the decoded parallax image D # 1 and the picture of the warped parallax image D ′ # 1 to the disparity prediction unit 234. The process proceeds from step S24 to step S25.
- In step S25, the intra-screen prediction unit 222 performs an intra prediction process on the next target block, that is, the next macroblock to be encoded.
- That is, for the next target block, the intra-screen prediction unit 222 performs intra prediction (intra-screen prediction) to generate a predicted image (intra-prediction image) from the picture of the decoded parallax image D # 2 stored in the DPB 31.
- The intra-screen prediction unit 222 then obtains the encoding cost required to encode the target block using the intra-prediction image and supplies it, together with the intra-prediction image, to the predicted image selection unit 224, and the process proceeds from step S25 to step S26.
- In step S26, the disparity prediction unit 234 performs a disparity prediction process on the next target block, using the picture of the decoded parallax image D # 1 and the picture of the warped parallax image D ′ # 1 as candidate pictures.
- That is, for the next target block, using the picture of the decoded parallax image D # 1 stored in the DPB 31 and the picture of the warped parallax image D ′ # 1 stored in the warped picture buffer 232, to each of which a reference index has been assigned by the reference index assignment unit 233, the disparity prediction unit 234 calculates a predicted image and an encoding cost for each inter prediction mode that differs in macroblock type and the like.
- the disparity prediction unit 234 sets the inter prediction mode with the minimum encoding cost as the optimal inter prediction mode, and supplies the prediction image of the optimal inter prediction mode together with the encoding cost to the prediction image selection unit 224. The process proceeds from step S26 to step S27.
- In step S27, the predicted image selection unit 224 selects, from the predicted image from the intra-screen prediction unit 222 (intra-prediction image) and the predicted image from the parallax prediction unit 234 (inter-prediction image), the predicted image with the smaller encoding cost, supplies it to the calculation units 213 and 220, and the process proceeds to step S28.
- the predicted image selected by the predicted image selection unit 224 in step S27 is used in the processing of steps S13 and S18 performed in the encoding of the next target block.
- In addition, the intra-screen prediction unit 222 supplies information related to the intra prediction obtained in the intra prediction process of step S25 to the variable length encoding unit 216 as header information, and the disparity prediction unit 234 supplies information related to the disparity prediction (inter prediction) obtained in the disparity prediction process of step S26 to the variable length encoding unit 216 as header information.
- In step S28, the variable length encoding unit 216 performs variable length encoding on the quantized values from the quantization unit 215 to obtain encoded data.
- At that time, the variable length coding unit 216 selects, from the header information from the intra-screen prediction unit 222 and the header information from the parallax prediction unit 234, the header information from the unit whose predicted image had the smaller encoding cost, and includes it in the header of the encoded data.
- variable length encoding unit 216 supplies the encoded data to the accumulation buffer 217, and the process proceeds from step S28 to step S29.
- step S29 the accumulation buffer 217 temporarily stores the encoded data from the variable length encoding unit 216 and outputs it at a predetermined data rate.
- the encoded data output from the accumulation buffer 217 is supplied to the multiplexing unit 32 (FIG. 5).
- FIG. 17 is a flowchart illustrating the disparity prediction process performed by the disparity prediction unit 234 in FIG. 15 in step S26 in FIG.
- In step S41, the parallax prediction unit 234 acquires the picture of the decoded parallax image D # 1, which is a candidate picture, from the DPB 31, supplies it to the parallax detection unit 241 and the parallax compensation unit 242, and the process proceeds to step S42.
- In step S42, the parallax prediction unit 234 acquires the reference index idx assigned to the picture of the decoded parallax image D # 1 from the reference index allocation unit 233, supplies it to the parallax detection unit 241 and the parallax compensation unit 242, and the process proceeds to step S43.
- In step S43, the parallax detection unit 241 detects, by ME, the calculated disparity vector mv representing the disparity of the (next) target block of the parallax image D # 2, which is the original image supplied from the screen rearrangement buffer 212, with respect to the picture of the decoded parallax image D # 1 to which the reference index idx is assigned by the reference index allocation unit 233.
- the parallax detection unit 241 supplies the calculated parallax vector mv to the parallax compensation unit 242, and the process proceeds from step S43 to step S44.
- In step S44, the parallax compensation unit 242 generates a predicted image pp of the target block by using the picture of the decoded parallax image D # 1 as a reference picture and performing displacement compensation (parallax compensation) of the reference picture using the calculated parallax vector mv from the parallax detection unit 241.
- the parallax compensation unit 242 supplies the predicted image pp together with the calculated parallax vector mv and the reference index idx to the cost function calculation unit 244, and the process proceeds from step S44 to step S45.
- In step S45, the parallax prediction unit 234 acquires the picture of the warped parallax image D ′ # 1, which is a candidate picture, from the warped picture buffer 232, supplies it to the parallax compensation unit 243, and the process proceeds to step S46.
- In step S46, the disparity prediction unit 234 acquires the reference index idx ′ assigned to the picture of the warped parallax image D ′ # 1 from the reference index assignment unit 233, supplies it to the disparity compensation unit 243, and the process proceeds to step S47.
- In step S47, the parallax compensation unit 243 sets the calculated parallax vector mv ′ of the (next) target block with respect to the picture of the warped parallax image D ′ # 1 to the 0 vector, and the process proceeds to step S48.
- In step S48, the parallax compensation unit 243 generates a predicted image pp ′ of the target block by using the picture of the warped parallax image D ′ # 1 as a reference picture and performing displacement compensation (parallax compensation) of the reference picture using the calculated parallax vector mv ′ set to the 0 vector.
- The parallax compensation unit 243 supplies the predicted image pp ′ together with the calculated parallax vector mv ′ and the reference index idx ′ to the cost function calculation unit 244, and the process proceeds from step S48 to step S49.
- In step S49, the prediction vector generation unit 246 generates prediction vectors for the calculated disparity vectors mv and mv ′, supplies them to the cost function calculation unit 244, and the process proceeds to step S50.
- In step S50, based on the (next) target block supplied from the screen rearrangement buffer 212, the predicted image pp supplied from the parallax compensation unit 242, the calculated parallax vector mv, the reference index idx, and the prediction vector supplied from the prediction vector generation unit 246, the cost function calculation unit 244 obtains the parameters required to compute the cost function, such as the residual vector between the calculated parallax vector mv and the prediction vector, and the SAD between the target block and the predicted image pp.
- The cost function calculation unit 244 then computes the cost function using those parameters, thereby calculating, for each macroblock type, the encoding cost of the reference index idx (of the picture of the decoded parallax image D # 1 to which that reference index is assigned), and the process proceeds to step S51.
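The cost function of steps S50 and S51 is typically a rate-distortion cost of the form J = D + λ·R, with the SAD as the distortion term and the bits of the residual vector contributing to the rate term. The sketch below assumes that form and a toy bit estimate; the actual cost function of the cost function calculation unit 244 is not specified here.

```python
def sad(block, pred):
    # distortion: sum of absolute differences between target block and prediction
    return sum(abs(a - b) for a, b in zip(block, pred))

def residual_vector_bits(residual):
    # toy stand-in for the rate of coding the residual vector components
    return sum(abs(v).bit_length() + 1 for v in residual)

def rd_cost(block, pred, mv, pv, lam=4.0):
    # J = SAD + lambda * (bits of residual vector mv - pv)
    residual = (mv[0] - pv[0], mv[1] - pv[1])
    return sad(block, pred) + lam * residual_vector_bits(residual)
```

Under this form, a candidate whose calculated vector is close to the prediction vector (such as the warped picture with its 0 vector when the prediction vector is also near 0) pays a small rate penalty, which is exactly what makes it competitive in the mode decision.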
- In step S51, similarly, based on the (next) target block supplied from the screen rearrangement buffer 212, the predicted image pp ′ supplied from the disparity compensation unit 243, the calculated disparity vector mv ′, the reference index idx ′, and the prediction vector supplied from the prediction vector generation unit 246, the cost function calculation unit 244 obtains the parameters required to compute the cost function, such as the residual vector between the calculated disparity vector mv ′ and the prediction vector, and the SAD between the target block and the predicted image pp ′.
- The cost function calculation unit 244 then computes the cost function using those parameters, thereby calculating, for each macroblock type, the encoding cost of the reference index idx ′ (of the picture of the warped parallax image D ′ # 1 to which that reference index is assigned).
- The cost function calculation unit 244 supplies the encoding cost (cost function value) for each macroblock type for each of the reference indexes idx and idx ′, together with the corresponding reference index, predicted image, and residual vector (disparity vector information), to the mode selection unit 245, and the process proceeds from step S51 to step S52.
- In step S52, the mode selection unit 245 detects the minimum cost, that is, the minimum value, among the encoding costs for each macroblock type for each of the reference indexes idx and idx ′ from the cost function calculation unit 244.
- The mode selection unit 245 then selects the reference index and macroblock type for which the minimum cost is obtained as the optimal inter prediction mode, and the process proceeds from step S52 to step S53.
- In step S53, the mode selection unit 245 supplies the predicted image of the optimal inter prediction mode and its encoding cost (minimum cost) to the predicted image selection unit 224, and the process proceeds to step S54.
- In addition, the mode selection unit 245 supplies, as header information, mode-related information representing the optimal inter prediction mode, the reference index of the optimal inter prediction mode (reference index for prediction), the disparity vector information of the optimal inter prediction mode, and the like to the variable length coding unit 216, and the process returns.
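The selection of the optimal inter prediction mode in step S52 is a minimum search over the per-mode costs; a minimal sketch (the cost values and mode labels below are made up for illustration):

```python
# Encoding cost per (reference index, macroblock type); values are illustrative.
costs = {
    ("idx",  "16x16"): 1280,
    ("idx",  "8x8"):   1150,
    ("idx'", "16x16"):  940,
    ("idx'", "8x8"):   1010,
}

# The optimal inter prediction mode is the (reference index, macroblock
# type) pair with the minimum encoding cost.
best_ref, best_mb_type = min(costs, key=costs.get)
print(best_ref, best_mb_type)  # idx' 16x16
```

Note that the reference index is part of the searched mode: the warped picture (idx ′ here) wins only for blocks where its zero-vector prediction is genuinely cheaper.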
- FIG. 18 is a block diagram illustrating a configuration example of an embodiment of a multi-view image decoder to which the present technology is applied.
- the multi-view image decoder in FIG. 18 is a decoder that decodes data obtained by encoding images of a plurality of viewpoints using, for example, the MVC method.
- In the following, description of processing similar to that of the MVC method is omitted as appropriate.
- However, the multi-view image decoder is not limited to a decoder that uses the MVC method.
- In the multi-view image decoder of FIG. 18, the multiplexed data output from the multi-view image encoder of FIG. 5 is decoded into the color image C # 1 of the view # 1 and the color image C # 2 of the view # 2, which are the color images of the two viewpoints # 1 and # 2, and the parallax image D # 1 of the view # 1 and the parallax image D # 2 of the view # 2, which are the parallax information images of the two viewpoints # 1 and # 2.
- the multi-viewpoint image decoder includes a separation unit 301, decoders 311, 312, 321, 322, and DPB 331.
- the multiplexed data output from the multi-viewpoint image encoder in FIG. 5 is supplied to the separation unit 301 via a recording medium and a transmission medium (not shown).
- The separation unit 301 separates, from the multiplexed data supplied thereto, the encoded data of the color image C # 1, the encoded data of the color image C # 2, the encoded data of the parallax image D # 1, the encoded data of the parallax image D # 2, and the disparity related information.
- The separation unit 301 then supplies the encoded data of the color image C # 1 to the decoder 311, the encoded data of the color image C # 2 to the decoder 312, the encoded data of the parallax image D # 1 to the decoder 321, the encoded data of the parallax image D # 2 to the decoder 322, and the disparity related information to the decoders 311, 312, 321, and 322.
- The decoder 311 decodes the encoded data of the color image C # 1 from the separation unit 301, using the disparity related information from the separation unit 301 as necessary, and outputs the resulting color image C # 1.
- The decoder 312 decodes the encoded data of the color image C # 2 from the separation unit 301, using the disparity related information from the separation unit 301 as necessary, and outputs the resulting color image C # 2.
- The decoder 321 decodes the encoded data of the parallax image D # 1 from the separation unit 301, using the parallax related information from the separation unit 301 as necessary, and outputs the resulting parallax image D # 1.
- The decoder 322 decodes the encoded data of the parallax image D # 2 from the separation unit 301, using the parallax related information from the separation unit 301 as necessary, and outputs the resulting parallax image D # 2.
- The DPB 331 temporarily stores the decoded images obtained by decoding the decoding target images in the decoders 311, 312, 321, and 322, as candidates for the reference picture to be referred to when a predicted image is generated.
- The decoders 311, 312, 321, and 322 decode the images that were predictively encoded by the encoders 11, 12, 21, and 22 of FIG. 5, respectively.
- To decode a predictively encoded image, the predicted image used in that predictive encoding is needed; therefore, the decoded images used for generating predicted images are temporarily stored in the DPB 331.
- the DPB 331 is a shared buffer that temporarily stores decoded images (decoded images) obtained by the decoders 311, 312, 321, and 322.
- The decoders 311, 312, 321, and 322 each select, from the decoded images stored in the DPB 331, a reference picture to be referred to for decoding the image to be decoded, and generate a predicted image using that reference picture.
- Since the DPB 331 is shared by the decoders 311, 312, 321, and 322, each of the decoders 311, 312, 321, and 322 can refer not only to the decoded images obtained by itself but also to the decoded images obtained by the other decoders.
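The sharing of the DPB 331 among the four decoders can be pictured as a single map keyed by viewpoint and picture; the class below is an illustrative toy, not the MVC decoded picture buffer:

```python
class SharedDPB:
    """Toy shared decoded-picture buffer: any decoder can store its
    decoded pictures and look up pictures decoded by the others."""

    def __init__(self):
        self._pics = {}

    def store(self, view, pic_id, picture):
        self._pics[(view, pic_id)] = picture

    def lookup(self, view, pic_id):
        # returns None when no decoder has stored that picture yet
        return self._pics.get((view, pic_id))

dpb = SharedDPB()
dpb.store(1, 0, "decoded parallax image D#1, picture 0")  # stored by decoder 321
ref = dpb.lookup(1, 0)                                    # referenced by decoder 322
```

This single keyed store is what lets decoder 322 use the decoded parallax image D # 1 produced by decoder 321 as a reference (or as warping input) without any extra copying.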
- FIG. 19 is a block diagram illustrating a configuration example of the decoder 311 in FIG.
- The decoders 312 and 321 in FIG. 18 are also configured in the same manner as the decoder 311 and decode images in accordance with, for example, the MVC method.
- the decoder 311 includes an accumulation buffer 341, a variable length decoding unit 342, an inverse quantization unit 343, an inverse orthogonal transform unit 344, an operation unit 345, a deblocking filter 346, a screen rearrangement buffer 347, and a D / A conversion unit. 348, the intra prediction unit 349, the inter prediction unit 350, and the predicted image selection unit 351.
- the encoded data of the color image C # 1 is supplied to the accumulation buffer 341 from the separation unit 301 (FIG. 18).
- the accumulation buffer 341 temporarily stores the encoded data supplied thereto and supplies the encoded data to the variable length decoding unit 342.
- variable length decoding unit 342 restores the quantization value and header information by variable length decoding the encoded data from the accumulation buffer 341. Then, the variable length decoding unit 342 supplies the quantization value to the inverse quantization unit 343 and supplies the header information to the intra-screen prediction unit 349 and the inter prediction unit 350.
- the inverse quantization unit 343 inversely quantizes the quantized value from the variable length decoding unit 342 into a transform coefficient and supplies the transform coefficient to the inverse orthogonal transform unit 344.
- the inverse orthogonal transform unit 344 performs inverse orthogonal transform on the transform coefficient from the inverse quantization unit 343, and supplies the transform coefficient to the arithmetic unit 345 in units of macroblocks.
- the calculation unit 345 sets the macroblock supplied from the inverse orthogonal transform unit 344 as a target block to be decoded, and adds the predicted image supplied from the predicted image selection unit 351 to the target block as necessary. Thus, a decoded image is obtained and supplied to the deblocking filter 346.
- the deblocking filter 346 performs, for example, the same filtering as the deblocking filter 121 in FIG. 9 on the decoded image from the arithmetic unit 345, and supplies the filtered decoded image to the screen rearrangement buffer 347.
- the screen rearrangement buffer 347 temporarily stores and reads out the picture of the decoded image from the deblocking filter 346, thereby rearranging the picture arrangement to the original arrangement (display order), and D / A (Digital / Analog) This is supplied to the conversion unit 348.
- the D / A conversion unit 348 performs D / A conversion on the picture when it is necessary to output the picture from the screen rearrangement buffer 347 as an analog signal and outputs the picture.
- the deblocking filter 346 supplies, to the DPB 331, decoded images of I pictures, P pictures, and Bs pictures, which are referenceable pictures, among the decoded images after filtering.
- The DPB 331 stores the picture of the decoded image from the deblocking filter 346, that is, the picture of the color image C # 1, as a candidate for a reference picture (candidate picture) to be referred to when generating a predicted image used for decoding performed later in time.
- Since the DPB 331 is shared by the decoders 311, 312, 321, and 322, it also stores, in addition to the picture of the color image C # 1 decoded by the decoder 311, the picture of the color image C # 2 decoded by the decoder 312, the picture of the parallax image D # 1 decoded by the decoder 321, and the picture of the parallax image D # 2 decoded by the decoder 322.
- the intra prediction unit 349 recognizes whether the target block is encoded using a prediction image generated by intra prediction (intra prediction) based on the header information from the variable length decoding unit 342.
- When the target block is encoded using a predicted image generated by intra prediction, the intra-screen prediction unit 349, like the intra-screen prediction unit 122, reads out from the DPB 331 the already-decoded portion (decoded image) of the picture including the target block (target picture). Then, the intra-screen prediction unit 349 supplies the part of the decoded image of the target picture read from the DPB 331 to the predicted image selection unit 351 as the predicted image of the target block.
- the inter prediction unit 350 recognizes whether the target block is encoded using a prediction image generated by the inter prediction based on the header information from the variable length decoding unit 342.
- When the target block is encoded using a predicted image generated by inter prediction, the inter prediction unit 350 recognizes, based on the header information from the variable length decoding unit 342, the reference index for prediction, that is, the reference index assigned to the reference picture used to generate the predicted image of the target block.
- the inter prediction unit 350 reads a candidate picture to which a reference index for prediction is assigned from the candidate pictures stored in the DPB 331 as a reference picture.
- Furthermore, the inter prediction unit 350 recognizes, based on the header information from the variable length decoding unit 342, the shift vector (disparity vector or motion vector) used for generating the predicted image of the target block, and, similarly to the inter prediction unit 123, generates a predicted image by performing shift compensation of the reference picture (motion compensation that compensates for a shift corresponding to motion, or parallax compensation that compensates for a shift corresponding to parallax) according to that shift vector.
- the inter prediction unit 350 acquires, as a predicted image, a block (corresponding block) at a position moved (shifted) from the position of the target block of the candidate picture according to the shift vector of the target block.
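Fetching the corresponding block at the position shifted by the shift vector can be sketched as a plain slice, ignoring sub-pel interpolation and picture-boundary padding, which the real inter prediction unit 350 must handle:

```python
def fetch_corresponding_block(picture, x, y, w, h, shift_vector):
    """Return the w-by-h block at (x, y) moved by shift_vector (dx, dy).
    picture is a list of pixel rows; in-bounds access is assumed."""
    dx, dy = shift_vector
    return [row[x + dx : x + dx + w] for row in picture[y + dy : y + dy + h]]

# 4x4 test picture whose pixel value encodes its position (10*row + col)
picture = [[10 * r + c for c in range(4)] for r in range(4)]
print(fetch_corresponding_block(picture, 0, 0, 2, 2, (1, 1)))
# [[11, 12], [21, 22]]
```

The same fetch serves both motion compensation (shift within one view over time) and parallax compensation (shift between views); only the interpretation of the vector differs.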
- the inter prediction unit 350 supplies the predicted image to the predicted image selection unit 351.
- The predicted image selection unit 351 selects the predicted image from the intra-screen prediction unit 349 when it is supplied, selects the predicted image from the inter prediction unit 350 when that is supplied, and supplies the selected predicted image to the calculation unit 345.
- FIG. 20 is a block diagram illustrating a configuration example of the decoder 322 in FIG.
- The decoder 322 decodes the encoded data of the parallax image D # 2 of the view # 2 using the MVC scheme, that is, in the same manner as the local decoding performed by the encoder 22 of FIG. 13.
- the decoder 322 includes an accumulation buffer 441, a variable length decoding unit 442, an inverse quantization unit 443, an inverse orthogonal transform unit 444, a calculation unit 445, a deblocking filter 446, a screen rearrangement buffer 447, and a D / A conversion unit. 448, an intra-screen prediction unit 449, a predicted image selection unit 451, a warping unit 461, a warped picture buffer 462, and a parallax prediction unit 463.
- The accumulation buffer 441 through the intra-screen prediction unit 449 and the predicted image selection unit 451 are configured in the same manner as the accumulation buffer 341 through the intra-screen prediction unit 349 and the predicted image selection unit 351 of FIG. 19, respectively, and their description is omitted as appropriate.
- The DPB 331 is supplied, from the deblocking filter 446, with the decoded image, that is, the picture of the decoded parallax image D # 2, which is the parallax image decoded by the decoder 322, and stores it as a candidate picture that can become a reference picture.
- The DPB 331 is also supplied with, and stores, the picture of the color image C # 1 decoded by the decoder 311, the picture of the color image C # 2 decoded by the decoder 312, and the picture of the decoded parallax image D # 1 decoded by the decoder 321.
- In the decoder 322, the decoded parallax image D # 1 obtained by the decoder 321 is used for decoding the parallax image D # 2 to be decoded; therefore, FIG. 20 shows an arrow indicating that the decoded parallax image D # 1 obtained by the decoder 321 is supplied to the DPB 331.
- The warping unit 461 is supplied, as the parallax related information (FIG. 18), with the maximum value dmax and the minimum value dmin of the shooting parallax vector d (the shooting parallax vector d1 of the viewpoint # 1), the base length L, and the focal length f.
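This parallax related information (dmax, dmin, base length L, focal length f) is what lets a stored parallax value be turned back into a geometric disparity and a depth. A sketch, assuming the common linear 8-bit quantization of disparity and the pinhole-stereo relation Z = L·f / d; the actual mapping used by the warping unit 461 is not restated here:

```python
def decode_disparity(v, dmax, dmin):
    # assumed linear mapping of an 8-bit value v in 0..255 back to
    # a shooting disparity in pixels
    return dmin + (v / 255.0) * (dmax - dmin)

def depth_from_disparity(d, L, f):
    # pinhole stereo: depth Z = base length * focal length / disparity
    return L * f / d

print(depth_from_disparity(decode_disparity(255, 10.0, 2.0), 0.1, 1000.0))
# disparity 10.0 px, base 0.1, focal 1000 px -> depth 10.0
```

Because the decoder receives dmax, dmin, L, and f in the disparity related information, it can recompute exactly the same disparities as the encoder, which is required for the warped pictures on both sides to match.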
- the warping unit 461 acquires (reads out) the picture of the decoded parallax image D # 1 among the pictures of the decoded parallax images D # 1 and D # 2 stored in the DPB 331.
- Similarly to the warping unit 231, the warping unit 461 warps the picture of the decoded parallax image D # 1 acquired from the DPB 331, using the parallax related information as necessary, thereby generating a picture of the warped parallax image D ′ # 1, which is a warped image obtained by converting the picture of the decoded parallax image D # 1 into an image (parallax image) obtained at the viewpoint # 2.
- When the warping unit 461 generates the picture of the warped parallax image D ′ # 1 by warping the picture of the decoded parallax image D # 1, it supplies the picture of the warped parallax image D ′ # 1 to the warped picture buffer 462.
- the warped picture buffer 462 temporarily stores the picture of the warped parallax image D ′ # 1 from the warping unit 461.
- In FIG. 20, the warped picture buffer 462 that stores the picture of the warped parallax image D ′ # 1 is provided separately from the DPB 331, but it is also possible to implement the DPB 331 and the warped picture buffer 462 as a single buffer.
- The parallax prediction unit 463 recognizes, based on the header information from the variable length decoding unit 442, whether or not the target block is encoded using a predicted image generated by disparity prediction (inter prediction).
- When the target block is encoded using a predicted image generated by disparity prediction, the disparity prediction unit 463 recognizes (acquires), based on the header information from the variable length decoding unit 442, the reference index for prediction, that is, the reference index assigned to the reference picture used to generate the predicted image of the target block.
- Then, from among the picture of the decoded parallax image D # 1 stored as a candidate picture in the DPB 331 and the picture of the warped parallax image D ′ # 1 stored as a candidate picture in the warped picture buffer 462, the disparity prediction unit 463 selects the candidate picture to which the reference index for prediction is assigned as the reference picture.
- Furthermore, the disparity prediction unit 463 recognizes, based on the header information from the variable length decoding unit 442, the calculated disparity vector used as the shift vector for generating the predicted image of the target block, and, similarly to the disparity prediction unit 234 of FIG. 15, generates a predicted image by performing disparity compensation according to that calculated disparity vector.
- the disparity prediction unit 463 acquires, as a predicted image, a block (corresponding block) at a position moved (shifted) from the position of the target block of the candidate picture according to the calculated disparity vector of the target block.
- the parallax prediction unit 463 supplies the predicted image to the predicted image selection unit 451.
- In FIG. 20, the disparity prediction unit 463, which performs disparity prediction among the inter predictions, is provided in the decoder 322, as in the case of the encoder 22 in FIG. 13.
- That is, the decoder 322 performs predicted image generation (parallax prediction, and temporal prediction) by the same methods as the encoder 22.
- FIG. 21 is a block diagram illustrating a configuration example of the disparity prediction unit 463 in FIG.
- the parallax prediction unit 463 includes a reference picture selection unit 471, a prediction vector generation unit 472, and a parallax compensation unit 473.
- the reference picture selection unit 471 is supplied with the picture of the decoded parallax image D # 1 stored in the DPB 331 and the warped parallax image D ′ # 1 picture stored in the warped picture buffer 462.
- the reference picture selection unit 471 is supplied with the reference index for prediction of the target block, which is included in the header information, from the variable length decoding unit 442.
- The reference picture selection unit 471 treats the picture of the decoded parallax image D # 1 stored in the DPB 331 and the picture of the warped parallax image D ′ # 1 stored in the warped picture buffer 462 as reference picture candidates (candidate pictures), selects from them, as the reference picture, the picture to which the reference index for prediction from the variable length decoding unit 442 is assigned, and supplies it, together with the reference index for prediction from the variable length decoding unit 442, to the parallax compensation unit 473.
- The prediction vector generation unit 472 generates a prediction vector in the same manner as the prediction vector generation unit 246 of FIG. 15 and supplies it to the parallax compensation unit 473.
- The disparity compensation unit 473 is supplied with the reference picture to which the reference index for prediction is assigned from the reference picture selection unit 471, with the prediction vector from the prediction vector generation unit 472, and with the mode-related information and disparity vector information included in the header information from the variable length decoding unit 442.
- The disparity compensation unit 473 decodes the shift vector of the target block as the calculated disparity vector by adding the residual vector, which is the disparity vector information from the variable length decoding unit 442, to the prediction vector from the prediction vector generation unit 472.
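The vector decoding performed by the disparity compensation unit 473 is a simple componentwise addition of the transmitted residual vector and the locally derived prediction vector; a minimal sketch:

```python
def decode_shift_vector(residual, prediction):
    # calculated disparity vector = residual vector (transmitted)
    #                             + prediction vector (derived locally)
    return (residual[0] + prediction[0], residual[1] + prediction[1])

print(decode_shift_vector((1, -2), (4, 2)))
# (5, 0)
```

Only the residual is transmitted; because encoder and decoder derive the same prediction vector, the decoder recovers exactly the vector the encoder used.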
- the disparity compensation unit 473 performs reference picture shift compensation (disparity compensation) from the reference picture selection unit 471 according to mode-related information (optimum inter prediction mode) using the calculated disparity vector of the target block and the MVC scheme. By performing in the same manner, a predicted image of the target block is generated.
- the disparity compensation unit 473 acquires, for example, a corresponding block that is a block at a position shifted by a calculated disparity vector from the position of the target block in the reference picture as a predicted image.
- the parallax compensation unit 473 supplies the predicted image to the predicted image selection unit 451.
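The compensation described above can be condensed into a short sketch: the shift vector is decoded by adding the residual vector to the prediction vector, and the predicted image is read out of the reference picture at the shifted position. The function name, the array-based picture model, and the vector layout are illustrative assumptions, not the patent's implementation.

```python
# Sketch of decoder-side disparity compensation: rebuild the calculated
# disparity (shift) vector from residual + prediction vectors, then fetch
# the corresponding block from the reference picture. Names are assumptions.
import numpy as np

def disparity_compensate(reference, block_pos, block_size, residual_vec, prediction_vec):
    dy = residual_vec[0] + prediction_vec[0]   # decode the shift vector
    dx = residual_vec[1] + prediction_vec[1]
    y, x = block_pos
    h, w = block_size
    # The corresponding block is the target position shifted by the vector.
    return reference[y + dy : y + dy + h, x + dx : x + dx + w]

reference = np.arange(64).reshape(8, 8)
predicted = disparity_compensate(reference, (2, 2), (2, 2), (0, 1), (0, 1))
```

With a residual of (0, 1) and a prediction of (0, 1), the decoded shift vector is (0, 2), so the predicted block is read two columns to the right of the target position.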
- FIG. 22 is a flowchart illustrating a decoding process performed by the decoder 322 of FIG. 20 to decode the encoded data of the parallax image D # 2 of the view # 2.
- step S111 the accumulation buffer 441 stores the encoded data of the parallax image D # 2 of view # 2 supplied thereto, and the process proceeds to step S112.
- step S112 the variable length decoding unit 442 restores the quantized values and header information by reading the encoded data stored in the accumulation buffer 441 and performing variable length decoding. The variable length decoding unit 442 then supplies the quantized values to the inverse quantization unit 443 and the header information to the intra-screen prediction unit 449 and the disparity prediction unit 450, and the process proceeds to step S113.
- step S113 the inverse quantization unit 443 inversely quantizes the quantized value from the variable length decoding unit 442 into a transform coefficient, supplies the transform coefficient to the inverse orthogonal transform unit 444, and the process proceeds to step S114.
- step S114 the inverse orthogonal transform unit 444 performs inverse orthogonal transform on the transform coefficients from the inverse quantization unit 443 and supplies the result to the calculation unit 445 in units of macroblocks, and the process proceeds to step S115.
- step S115 the computing unit 445 takes the macroblock from the inverse orthogonal transform unit 444 as the target block (residual image) to be decoded and, as necessary, adds to it the predicted image supplied from the predicted image selection unit 451, thereby obtaining a decoded image.
- the arithmetic unit 445 supplies the decoded image to the deblocking filter 446, and the process proceeds from step S115 to step S116.
- step S116 the deblocking filter 446 filters the decoded image from the calculation unit 445 and supplies the filtered decoded image (decoded parallax image D # 2) to the DPB 331 and the screen rearrangement buffer 447, and the process proceeds to step S117.
- step S117 the DPB 331 waits for the decoded parallax image D # 1 to be supplied from the decoder 321 that decodes the parallax image D # 1, stores it, and the process proceeds to step S118.
- step S118 the DPB 331 stores the decoded parallax image D # 2 from the deblocking filter 446, and the process proceeds to step S119.
- step S119 the warping unit 461 generates a picture of the warped parallax image D ′ # 1 by warping the picture of the decoded parallax image D # 1 stored in the DPB 331, and supplies the generated picture to the warped picture buffer 462. The process proceeds to step S120.
- step S120 the warped picture buffer 462 stores the picture of the warped parallax image D ′ # 1 from the warping unit 461, and the process proceeds to step S121.
- step S121 the intra-screen prediction unit 449 and the disparity prediction unit 463 recognize, based on the header information supplied from the variable length decoding unit 442, whether the next target block (the macroblock to be decoded next) is encoded using a predicted image generated by intra prediction or by disparity prediction.
- when the next target block is encoded using a predicted image of intra prediction, the intra-screen prediction unit 449 performs an intra prediction process (intra-screen prediction process): it performs intra prediction (intra-screen prediction) to generate a predicted image (predicted image of intra prediction) for the next target block from the picture of the decoded parallax image D # 2 stored in the DPB 331, supplies the predicted image to the predicted image selection unit 451, and the process proceeds from step S121 to step S122.
- when the next target block is encoded using a predicted image of disparity prediction, the disparity prediction unit 463 performs a disparity prediction process (inter prediction process).
- that is, of the picture of the decoded parallax image D # 1 stored in the DPB 331 and the picture of the warped parallax image D ′ # 1 stored in the warped picture buffer 462, the disparity prediction unit 463 selects, as the reference picture, the picture to which the reference index for prediction of the next target block, included in the header information from the variable length decoding unit 442, is assigned.
- further, the disparity prediction unit 463 generates a predicted image by performing disparity prediction (disparity compensation) using the mode-related information and disparity vector information included in the header information from the variable length decoding unit 442, supplies the predicted image to the predicted image selection unit 451, and the process proceeds from step S121 to step S122.
- step S122 the predicted image selection unit 451 selects the predicted image from whichever of the intra-screen prediction unit 449 and the parallax prediction unit 463 supplied one, supplies it to the calculation unit 445, and the process proceeds to step S123.
- the predicted image selected by the predicted image selection unit 451 in step S122 is used in the process of step S115 performed in the decoding of the next target block.
- step S123 the screen rearrangement buffer 447 temporarily stores the picture of the decoded parallax image D # 2 from the deblocking filter 446 and reads it out, thereby rearranging the order of the pictures, supplies it to the D / A conversion unit 448, and the process proceeds to step S124.
- step S124 when it is necessary to output the picture from the screen rearrangement buffer 447 as an analog signal, the D / A conversion unit 448 performs D / A conversion on the picture and outputs it.
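Steps S113 to S115 amount to inverse quantization, inverse transform, and adding the prediction. A toy sketch follows, with an identity transform and a single quantization step standing in for the real integer orthogonal transform (both assumptions for illustration).

```python
# Toy sketch of steps S113-S115: dequantize, inverse-transform (identity
# stand-in here), and add the predicted image to recover the decoded block.
import numpy as np

QUANT_STEP = 2.5  # illustrative quantization step size

def decode_block(quantized, predicted):
    coefficients = quantized * QUANT_STEP   # S113: inverse quantization
    residual = coefficients                 # S114: inverse transform (identity stand-in)
    return residual + predicted             # S115: add the predicted image

quantized = np.array([[2.0, 0.0], [0.0, -2.0]])
predicted = np.full((2, 2), 10.0)
decoded = decode_block(quantized, predicted)
```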
- FIG. 23 is a flowchart illustrating the parallax prediction processing performed by the parallax prediction unit 463 in FIG. 21 in step S121 in FIG.
- step S131 the reference picture selection unit 471 of the disparity prediction unit 463 obtains the reference index for prediction of the (next) target block included in the header information from the variable length decoding unit 442, and the process proceeds to step S132.
- step S132 the reference picture selection unit 471 determines the value of the reference index for prediction.
- if it is determined in step S132 that the reference index for prediction is 0, the process proceeds to step S133, and the reference picture selection unit 471 acquires from the warped picture buffer 462 the picture of the warped parallax image D ′ # 1, to which the reference index of value 0 is assigned, of the candidate pictures, namely the picture of the decoded parallax image D # 1 and the picture of the warped parallax image D ′ # 1.
- the reference picture selection unit 471 supplies the picture of the warped parallax image D ′ # 1 to the parallax compensation unit 473 as the reference picture, and the process proceeds from step S133 to step S135.
- if it is determined in step S132 that the reference index for prediction is 1, the process proceeds to step S134, and the reference picture selection unit 471 acquires from the DPB 331 the picture of the decoded parallax image D # 1, to which the reference index of value 1 is assigned, of the candidate pictures, namely the picture of the decoded parallax image D # 1 and the picture of the warped parallax image D ′ # 1.
- the reference picture selection unit 471 supplies the picture of the decoded parallax image D # 1 to the parallax compensation unit 473 as the reference picture, and the process proceeds from step S134 to step S135.
- step S135 the parallax compensation unit 473 acquires the mode related information and the parallax vector information (residual vector) included in the header information from the variable length decoding unit 442, and the process proceeds to step S136.
- step S136 the prediction vector generation unit 472 generates a prediction vector and supplies it to the parallax compensation unit 473, and the process proceeds to step S137.
- step S137 the disparity compensation unit 473 decodes the calculated disparity vector of the target block by adding the residual vector, which is the disparity vector information from the variable length decoding unit 442, and the prediction vector from the prediction vector generation unit 472, and the process proceeds from step S137 to step S138.
- step S138 the disparity compensation unit 473 generates a predicted image of the target block by performing shift compensation (disparity compensation) of the reference picture from the reference picture selection unit 471, using the calculated disparity vector of the target block, in accordance with the mode-related information, and the process proceeds to step S139.
- step S139 the parallax compensation unit 473 supplies the predicted image to the predicted image selection unit 451, and the process returns.
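The index-driven choice in steps S132 to S134 reduces to a two-way branch. A minimal sketch, with dictionary-style buffers and key names as illustrative assumptions:

```python
# Sketch of steps S132-S134: reference index 0 selects the warped picture
# from the warped picture buffer, reference index 1 selects the decoded
# picture of view #1 from the DPB. Buffer layout and keys are assumptions.
def select_reference_picture(ref_idx, warped_picture_buffer, dpb):
    if ref_idx == 0:
        return warped_picture_buffer["D'#1"]   # picture of warped parallax image
    return dpb["D#1"]                          # picture of decoded parallax image

warped_buffer = {"D'#1": "warped picture"}
dpb = {"D#1": "decoded picture"}
```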
- the encoding / decoding described above, in which a warped image is included in the candidate pictures and assigned a reference index, is also referred to as the warped reference allocation scheme.
- the warped reference allocation method can be applied to color image encoding and decoding in addition to parallax image encoding and decoding.
- FIG. 24 is a diagram for explaining a warped reference assignment method for a color image.
- in encoding the color image C # 2 by the warped reference allocation method, the picture of the warped color image C ′ # 1 generated by warping the color image C # 1 (after local decoding), and the picture of the color image C # 1 itself, are pictures (candidate pictures) that can be reference pictures.
- when the picture of the warped color image C ′ # 1 is used as the reference picture, MC is performed on the block MBC # 21 that is the target block of the color image C # 2 on the assumption that the shift vector is a 0 vector: the block at the position shifted from the position of the target block MBC # 21 by the shift vector, that is, the block MBC ′ # 11 at the same position as the target block MBC # 21, is acquired as the predicted image.
- the encoding cost COST1 ′ required for encoding the target block MBC # 21 when the picture of the warped color image C ′ # 1 is the reference picture is then calculated according to the above equation (1).
- that is, the warped color image C ′ # 1 is an image obtained by converting the color image C # 1 of the viewpoint # 1 into an image viewed from the viewpoint # 2, and it can be estimated that there is no parallax between it and the color image C # 2 of the viewpoint # 2 (the parallax has been compensated); therefore, as in the warped reference allocation method for parallax images, a 0 vector is assumed as the shift vector when the warped color image C ′ # 1 is used as the reference picture.
- accordingly, 0 is adopted as the value MV corresponding to the code amount of the shift vector in the calculation of the encoding cost COST of equation (1).
- when the picture of the color image C # 1 is used as the reference picture, on the other hand, a shift vector (calculated parallax vector) is detected by performing ME between the target block MBC # 21 and the color image C # 1, as in the warped reference allocation scheme for parallax images.
- MC then acquires, as the predicted image, the block (corresponding block) MBC # 11 at the position shifted by the shift vector from the position of the target block MBC # 21 in the color image C # 1.
- the encoding cost COST1 required for encoding the target block MBC # 21 in this case is calculated according to equation (1).
- in this way, the encoding cost COST1 ′ for the warped color image C ′ # 1 and the encoding cost COST1 for the color image C # 1, each required for encoding the target block MBC # 21, are calculated, and of the warped color image C ′ # 1 and the color image C # 1, the one with the lower encoding cost is selected as the reference picture used for encoding the target block MBC # 21.
- a reference index ref_idx having the value 0 (a first value) is assigned to the warped color image C ′ # 1, and a reference index ref_idx having the value 1 (a second value) is assigned to the color image C # 1.
- the same effects as the warped reference assignment method targeting parallax images can be achieved.
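The choice between the two candidates can be sketched with the conventional cost shape D + λ·MV. Equation (1) itself is not reproduced in this excerpt, so the SAD distortion term, the multiplier λ, and the bit counts below are assumptions; the key point is that the warped picture's assumed 0 vector makes its MV term vanish.

```python
# Sketch of the cost-based reference choice: the warped picture carries no
# vector cost (0 vector assumed), the ME-based prediction pays for its
# vector. Cost form, lambda, and bit counts are illustrative assumptions.
import numpy as np

LAMBDA = 4.0  # illustrative Lagrange multiplier

def encoding_cost(target, predicted, vector_code_amount):
    # Residual term plus the term MV for the code amount of the shift vector.
    sad = float(np.abs(target - predicted).sum())
    return sad + LAMBDA * vector_code_amount

def choose_reference(target, warped_prediction, me_prediction, me_vector_bits):
    cost_warped = encoding_cost(target, warped_prediction, 0)   # MV = 0
    cost_decoded = encoding_cost(target, me_prediction, me_vector_bits)
    return "C'#1" if cost_warped <= cost_decoded else "C#1"

target = np.array([[10.0, 10.0]])
warped_prediction = np.array([[10.0, 11.0]])   # small residual, no vector cost
me_prediction = np.array([[10.0, 10.0]])       # perfect match, but a vector to code
choice = choose_reference(target, warped_prediction, me_prediction, 2)
```

Here the warped candidate wins: its SAD of 1 beats a perfect match that must spend λ·2 = 8 on coding its vector.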
- FIG. 25 is a block diagram illustrating a configuration example of the encoder 12 of FIG. 5 that encodes the color image C # 2 by the warped reference assignment method.
- the encoder 12 includes an A / D conversion unit 511, a screen rearrangement buffer 512, a calculation unit 513, an orthogonal transformation unit 514, a quantization unit 515, a variable length coding unit 516, an accumulation buffer 517, an inverse quantization unit 518, an inverse orthogonal transform unit 519, a calculation unit 520, a deblocking filter 521, an intra prediction unit 522, a predicted image selection unit 524, a warping unit 531, a warped picture buffer 532, a reference index allocation unit 533, and a disparity prediction unit 534.
- the A / D conversion unit 511 through the intra-screen prediction unit 522, the predicted image selection unit 524, and the warping unit 531 through the parallax prediction unit 534 perform processing similar to the A / D conversion unit 211 through the intra-screen prediction unit 222, the predicted image selection unit 224, and the warping unit 231 through the parallax prediction unit 234 of FIG. 13, respectively.
- the DPB 31 is supplied from the deblocking filter 521 with a decoded image, that is, a picture of the color image C # 2 encoded by the encoder 12 and locally decoded (hereinafter also referred to as a decoded color image), and stores it as a candidate picture that can be a reference picture.
- the DPB 31 is also supplied with, and stores, the picture of the color image (decoded color image) C # 1 encoded by the encoder 11 and locally decoded, the picture of the parallax image (decoded parallax image) D # 1 encoded by the encoder 21 and locally decoded, and the picture of the parallax image (decoded parallax image) D # 2 encoded by the encoder 22 and locally decoded.
- since the decoded color image C # 1 obtained by the encoder 11 and the decoded parallax image D # 1 obtained by the encoder 21 are used for encoding the color image C # 2 to be encoded, FIG. 25 shows arrows indicating that the decoded color image C # 1 obtained by the encoder 11 and the decoded parallax image D # 1 obtained by the encoder 21 are supplied to the DPB 31.
- the decoded parallax image D # 1 stored in the DPB 31 is used when the warping unit 531 warps the decoded color image C # 1 stored in the DPB 31 to generate a picture of the warped color image C ′ # 1, which is a warped image obtained by converting the decoded color image C # 1 into an image (color image) obtained at the viewpoint # 2.
- as with the warping unit 231 of FIG. 13, the warping unit 531 converts the parallax value ν, which is the pixel value of each pixel of the picture of the decoded parallax image D # 1, into a shooting parallax vector d.
- the warping unit 531 generates a picture of the warped color image C ′ # 1 by performing warping by moving each pixel of the picture of the decoded color image C # 1 according to the shooting parallax vector d of the pixel.
- when the target block of the color image C # 2 includes a portion at the same position as an occlusion portion of the warped color image C ′ # 1, the encoding cost of using the picture of the warped color image C ′ # 1 as the reference picture becomes large, so the picture of the color image C # 1, which is the other candidate picture, is selected as the reference picture.
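A minimal forward-warping sketch in the spirit of the warping just described: each pixel of the view #1 picture is moved horizontally by a disparity derived from the co-located parallax value, producing the warped picture as seen from view #2; positions no pixel lands on stay marked as holes (the occlusion portions mentioned above). The linear mapping from parallax value to shooting disparity is an illustrative assumption.

```python
# Forward warping of a view #1 picture toward view #2 using per-pixel
# parallax values; the value-to-disparity mapping is an assumption.
import numpy as np

HOLE = -1.0  # marker for occlusion portions (no source pixel maps here)

def warp_to_other_view(color, parallax_values, max_disparity=2):
    height, width = color.shape
    warped = np.full((height, width), HOLE)
    for y in range(height):
        for x in range(width):
            # Convert the pixel value (parallax value) into a disparity d.
            d = int(round(parallax_values[y, x] / 255.0 * max_disparity))
            if 0 <= x + d < width:
                warped[y, x + d] = color[y, x]
    return warped

color = np.array([[1.0, 2.0, 3.0]])
parallax = np.array([[255, 255, 255]])     # every pixel shifts by max_disparity
warped = warp_to_other_view(color, parallax)
```

In this tiny example only the first pixel lands inside the picture; the two leftmost positions of the warped row remain holes.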
- FIG. 26 is a block diagram illustrating a configuration example of the disparity prediction unit 534 in FIG.
- the parallax prediction unit 534 includes a parallax detection unit 541, parallax compensation units 542 and 543, a cost function calculation unit 544, a mode selection unit 545, and a prediction vector generation unit 546.
- the parallax detection unit 541 through the prediction vector generation unit 546 perform the same processing as the parallax detection unit 241 through the prediction vector generation unit 246 in FIG. 15, except that processing is performed on a color image instead of a parallax image.
- FIG. 27 is a flowchart for explaining an encoding process for encoding the color image C # 2 of the view # 2 performed by the encoder 12 of FIG.
- in steps S201 to S209, the same processing as in steps S11 to S19 of FIG. 16 is performed on the color image instead of the parallax image; the decoded color image C # 2 obtained by the filtering in the deblocking filter 521 is supplied to the DPB 31 (FIG. 5), and the process proceeds to step S210.
- step S210 the DPB 31 waits for the decoded color image C # 1, obtained by encoding the color image C # 1 and performing local decoding, to be supplied from the encoder 11 that encodes the color image C # 1, then stores it, and the process proceeds to step S211.
- step S211 the DPB 31 waits for the decoded parallax image D # 1, obtained by encoding the parallax image D # 1 and performing local decoding, to be supplied from the encoder 21 that encodes the parallax image D # 1, then stores it, and the process proceeds to step S212.
- step S212 the DPB 31 stores the decoded color image C # 2 from the deblocking filter 521, and the process proceeds to step S213.
- step S213 the warping unit 531 generates a picture of the warped color image C ′ # 1 by warping the picture of the decoded color image C # 1 stored in the DPB 31 using the picture of the decoded parallax image D # 1 stored in the DPB 31, supplies it to the warped picture buffer 532, and the process proceeds to step S214.
- steps S214 to S220 the encoder 12 performs the same processing as steps S23 to S29 in FIG. 16 for color images instead of parallax images.
- FIG. 28 is a flowchart for explaining the parallax prediction processing performed by the parallax prediction unit 534 in FIG. 26 (in step S217 in FIG. 27).
- steps S241 to S254 the same processing as that in steps S41 to S54 of FIG. 17 is performed for color images instead of parallax images.
- FIG. 29 shows a configuration example of the decoder 312 of FIG. 18 when the encoder 12 is configured as shown in FIG. 25, that is, a configuration example of the decoder 312 that decodes the color image C # 2 by the warped reference allocation method.
- the decoder 312 includes an accumulation buffer 641, a variable length decoding unit 642, an inverse quantization unit 643, an inverse orthogonal transform unit 644, an operation unit 645, a deblocking filter 646, a screen rearrangement buffer 647, and a D / A conversion unit. 648, an intra-screen prediction unit 649, a predicted image selection unit 651, a warping unit 661, a warped picture buffer 662, and a parallax prediction unit 663.
- the accumulation buffer 641 through the intra-screen prediction unit 649, the predicted image selection unit 651, and the warping unit 661 through the parallax prediction unit 663 perform the same processing as the accumulation buffer 441 through the intra-screen prediction unit 449, the predicted image selection unit 451, and the warping unit 461 through the parallax prediction unit 463, respectively.
- the DPB 331 is supplied from the deblocking filter 646 with a decoded image, that is, the picture of the decoded color image C # 2, which is the color image decoded by the decoder 312, and stores it as a candidate picture that can be a reference picture.
- the DPB 331 is also supplied with, and stores, the picture of the color image (decoded color image) C # 1 decoded by the decoder 311, the picture of the parallax image (decoded parallax image) D # 1 decoded by the decoder 321, and the picture of the parallax image (decoded parallax image) D # 2 decoded by the decoder 322.
- since the decoded color image C # 1 obtained by the decoder 311 and the decoded parallax image D # 1 obtained by the decoder 321 are used for decoding the color image C # 2, FIG. 29 shows arrows indicating that they are supplied to the DPB 331.
- the decoded parallax image D # 1 stored in the DPB 331 is used when the warping unit 661, in the same manner as the warping unit 531 in FIG. 25, warps the picture of the decoded color image C # 1 stored in the DPB 331 to generate a picture of the warped color image C ′ # 1, which is a warped image obtained by converting the picture of the decoded color image C # 1 into an image (color image) obtained at the viewpoint # 2.
- FIG. 30 is a block diagram illustrating a configuration example of the disparity prediction unit 663 in FIG.
- the parallax prediction unit 663 includes a reference picture selection unit 671, a prediction vector generation unit 672, and a parallax compensation unit 673.
- the reference picture selection unit 671 through the parallax compensation unit 673 perform the same processing as the reference picture selection unit 471 through the parallax compensation unit 473 of the parallax prediction unit 463 in FIG. 21, respectively, except that processing is performed on a color image instead of a parallax image.
- FIG. 31 is a flowchart illustrating a decoding process performed by the decoder 312 of FIG. 29 to decode the encoded data of the color image C # 2 of the view # 2.
- in steps S311 to S316, the same processing as in steps S111 to S116 of FIG. 22 is performed on the color image instead of the parallax image; the decoded color image C # 2 obtained by the filtering in the deblocking filter 646 is supplied to the DPB 331, and the process proceeds to step S317.
- step S317 the DPB 331 waits for the decoded color image C # 1 to be supplied from the decoder 311 that decodes the color image C # 1, stores it, and the process proceeds to step S318.
- step S318 the DPB 331 waits for the decoded parallax image D # 1 to be supplied from the decoder 321 that decodes the parallax image D # 1, stores it, and the process proceeds to step S319.
- step S319 the DPB 331 stores the decoded color image C # 2 from the deblocking filter 646, and the process proceeds to step S320.
- step S320 the warping unit 661 generates a picture of the warped color image C ′ # 1 by warping the picture of the decoded color image C # 1 stored in the DPB 331 using the picture of the decoded parallax image D # 1 stored in the DPB 331, supplies it to the warped picture buffer 662, and the process proceeds to step S321.
- in the subsequent steps, the decoder 312 performs the same processing as steps S120 to S124 of FIG. 22 on the color image instead of the parallax image.
- FIG. 32 is a flowchart for explaining the parallax prediction processing performed by the parallax prediction unit 663 in FIG. 30 (in step S322 in FIG. 31).
- steps S331 to S339 the same processing as that in steps S131 to S139 in FIG. 23 is performed for the color image instead of the parallax image.
- FIG. 33 is a diagram for explaining a warped reference allocation method using candidate pictures including pictures used for temporal prediction.
- the encoder 22 (FIG. 5) can perform both parallax prediction and temporal prediction.
- in this case, in addition to the picture of the warped parallax image D ′ # 1 or the picture of the decoded parallax image D # 1 that can be referred to in the parallax prediction, a picture of the decoded parallax image D # 2 that can be referred to in the temporal prediction becomes a candidate picture and is assigned a reference index.
- in FIG. 33, when both the parallax prediction and the temporal prediction are performed in the encoder 22 that encodes the parallax image D # 2, the picture of the warped parallax image D ′ # 1 referred to in the parallax prediction and the picture of the decoded parallax image D # 2 referred to in the temporal prediction are employed as candidate pictures.
- that is, the t-th picture of the warped parallax image D ′ # 1 and the t ′-th picture of the parallax image D # 2 to be encoded are pictures (candidate pictures) that can be reference pictures.
- the t ′-th picture of the parallax image D # 2 that is a candidate picture is decoded (locally decoded) before the t-th picture of the parallax image D # 2 that is the picture of the target block, and is a picture of the decoded parallax image D # 2 stored in the DPB 31 (and the DPB 331).
- as the t ′-th picture of the parallax image D # 2 that is a candidate picture, for example, the picture decoded (and encoded) one picture before the t-th picture of the parallax image D # 2 that is the picture of the target block can be adopted.
- when the picture of the warped parallax image D ′ # 1 is used as the reference picture, MC is performed on the block MBD # 21 that is the target block of the t-th picture of the parallax image D # 2 on the assumption that the shift vector is a 0 vector: the block at the position shifted by the shift vector from the position of the target block MBD # 21, that is, the block MBD ′ # 11 at the same position as the target block MBD # 21, is acquired as the predicted image.
- the encoding cost COST1 ′ required for encoding the target block MBD # 21 when the picture of the warped parallax image D ′ # 1 is the reference picture is then calculated according to the above equation (1).
- when the t ′-th picture of the parallax image D # 2 is used as the reference picture, MC acquires, as the predicted image, the block (corresponding block) MBD # 21 ′ at the position shifted, by the shift vector that is the motion vector, from the position of the target block MBD # 21 in the t ′-th picture of the parallax image D # 2.
- in this way, the encoding cost COST1 ′ required for encoding the target block MBD # 21 when the picture of the warped parallax image D ′ # 1 (the picture at the same time t as the picture of the target block MBD # 21) is used as the reference picture (the encoding cost for the picture of the warped parallax image D ′ # 1), and the encoding cost COST1 required for encoding the target block MBD # 21 when the picture of the parallax image D # 2 at a time t ′ different from that of the picture of the target block MBD # 21 is used as the reference picture (the encoding cost for the picture of the parallax image D # 2), are calculated.
- based on the encoding costs COST1 ′ and COST1, of the t-th picture of the warped parallax image D ′ # 1 and the t ′-th picture of the parallax image D # 2, the one with the smaller encoding cost of the target block MBD # 21 is selected as the reference picture.
- when encoding the target block of the parallax image D # 2 with the picture of the warped parallax image D ′ # 1 and the other time picture of the parallax image D # 2 adopted as candidate pictures as described above, similarly to FIG. 24, a reference index ref_idx having the value 0 can be assigned to the picture of the warped parallax image D ′ # 1, and a reference index ref_idx having the value 1 can be assigned to the other time picture of the parallax image D # 2.
- however, always assigning the reference index ref_idx of value 0 to the picture of the warped parallax image D ′ # 1 and the reference index ref_idx of value 1 to the other time picture of the parallax image D # 2 may not be appropriate.
- in the picture (t-th picture) of the warped parallax image D ′ # 1, there may be portions that, owing to the influence of parallax, do not show what appears in the picture (t-th picture) of the parallax image D # 2 to be encoded.
- likewise, in the other time picture (t ′-th picture) of the parallax image D # 2, there may be portions that, owing to the influence of motion, do not show what appears in the picture (t-th picture) of the parallax image D # 2 to be encoded.
- when at least a part of the target block of the parallax image D # 2 to be encoded is not reflected in the predicted image generated using the picture of the warped parallax image D ′ # 1 as the reference picture, the residual between the target block and the predicted image becomes large, and the encoding cost for the picture of the warped parallax image D ′ # 1 becomes large.
- similarly, when at least a part of the target block of the parallax image D # 2 to be encoded is not reflected in the predicted image generated using the other time picture of the parallax image D # 2 as the reference picture, the residual between the target block and the predicted image becomes large, and the encoding cost for the other time picture of the parallax image D # 2 becomes large.
- for example, when there is a scene change between the picture of the parallax image D # 2 to be encoded and the other time picture of the parallax image D # 2 that is a candidate picture, the encoding cost for the other time picture of the parallax image D # 2 becomes larger than the encoding cost for the picture of the warped parallax image D ′ # 1.
- conversely, there are also cases where the encoding cost for the other time picture of the parallax image D # 2 is smaller than the encoding cost for the picture of the warped parallax image D ′ # 1.
- when both the picture of the warped parallax image D ′ # 1 and the other time picture of the parallax image D # 2 are included in the candidate pictures, the one with the lower encoding cost is used as the reference picture for encoding the target block.
- therefore, for the encoding of the target picture, a feature amount (hereinafter also referred to as a prediction determination feature amount) is obtained for determining which is used more often: the case where the picture of the warped parallax image D ′ # 1, which is a candidate picture, is used as the reference picture, that is, the parallax prediction, or the case where the other time picture of the parallax image D # 2 is used as the reference picture, that is, the temporal prediction; based on the prediction determination feature amount, the reference index ref_idx can be assigned to each of the picture of the warped parallax image D ′ # 1 and the other time picture of the parallax image D # 2, which are the candidate pictures.
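This excerpt leaves the concrete prediction determination feature amount open. One plausible realization, sketched purely as an assumption, estimates block by block whether parallax prediction (warped picture) or temporal prediction (other time picture) gives the smaller residual, and hands the zero-valued, cheaper-to-code reference index to the picture expected to win more often:

```python
# Hypothetical prediction-determination feature: count, per block, which
# candidate picture gives the smaller SAD, and assign ref_idx 0 (small code
# amount) to the likelier reference. Entirely an illustrative assumption.
import numpy as np

def assign_reference_indices(target, warped, other_time, block=2):
    h, w = target.shape
    warped_wins = 0
    total = 0
    for y in range(0, h, block):
        for x in range(0, w, block):
            t = target[y:y+block, x:x+block]
            sad_warped = np.abs(t - warped[y:y+block, x:x+block]).sum()
            sad_other = np.abs(t - other_time[y:y+block, x:x+block]).sum()
            warped_wins += sad_warped <= sad_other
            total += 1
    # ref_idx 0 goes to the picture expected to be chosen more often.
    if warped_wins * 2 >= total:
        return {"warped": 0, "other_time": 1}
    return {"warped": 1, "other_time": 0}

target = np.zeros((4, 4))
warped = np.zeros((4, 4))        # matches the target everywhere
other_time = np.ones((4, 4))     # differs everywhere (e.g. a scene change)
indices = assign_reference_indices(target, warped, other_time)
```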
- FIG. 34 is a block diagram showing a configuration example of the encoder 22 (FIG. 5) that encodes the parallax image # 2 by the warped reference allocation method using candidate pictures including pictures used for temporal prediction.
- the encoder 22 of FIG. 34 is common to the case of FIG. 13 in that it includes the A / D conversion unit 211 through the intra-screen prediction unit 222, the predicted image selection unit 224, the warping unit 231, and the warped picture buffer 232.
- the encoder 22 of FIG. 34 differs from the case of FIG. 13 in that it includes a reference index allocation unit 701 and an inter prediction unit 702 instead of the reference index allocation unit 233 and the parallax prediction unit 234, respectively.
- the reference index assigning unit 701 takes as candidate pictures, which are candidates for the reference picture, the other time picture of the decoded parallax image D # 2 stored in the DPB 31 (a previously encoded and locally decoded picture different from the picture of the target block) and the picture of the warped parallax image D ′ # 1 stored in the warped picture buffer 232, and assigns a reference index to each candidate picture.
- specifically, the reference index assigning unit 701 obtains the prediction determination feature amount, uses it to estimate which of the warped parallax image D ′ # 1 and the other time picture of the parallax image D # 2 will be selected as the reference picture in the encoding of the picture (target picture) of the parallax image D # 2 to be encoded, assigns the reference index of value 0, which has a small code amount, to the picture estimated to be selected, and assigns the reference index of value 1 to the other picture.
- the reference index allocation unit 701 supplies the reference index allocated to the candidate picture to the inter prediction unit 702.
- the inter prediction unit 702 performs inter prediction (temporal prediction and parallax prediction) of the target block using, as reference pictures, the candidate pictures to which reference indexes have been assigned by the reference index assignment unit 701, that is, the other time picture of the decoded parallax image D # 2 stored in the DPB 31 and the picture of the warped parallax image D ′ # 1 stored in the warped picture buffer 232, and calculates the encoding cost.
- That is, like the parallax prediction unit 234 of FIG. 13, the inter prediction unit 702 performs parallax prediction as inter prediction using the picture of the warped parallax image D′#1 as a reference picture, generating a prediction image of the parallax prediction on the assumption that the shift vector is zero.
- Then, the inter prediction unit 702 encodes (predictively encodes) the target block using the prediction image of the parallax prediction, and calculates the coding cost for the picture of the warped parallax image D′#1.
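Because the warped parallax image D′#1 has already been aligned to the target viewpoint, the shift vector for this parallax prediction is assumed to be zero, so the prediction image is simply the block of the warped picture at the same position as the target block. A minimal sketch of this idea (the helper name and toy picture are illustrative assumptions, not part of the embodiment):

```python
def predict_from_warped(warped_picture, bx, by, bsize):
    """Parallax prediction with a warped reference picture: the shift
    (disparity) vector is assumed to be zero, so the prediction image
    is the collocated block of the warped picture, i.e. the block at
    the same (bx, by) position as the target block."""
    return [row[bx:bx + bsize] for row in warped_picture[by:by + bsize]]

# Toy 4x4 "picture" whose pixel value encodes its position (row*10 + col).
warped = [[r * 10 + c for c in range(4)] for r in range(4)]
pred = predict_from_warped(warped, bx=2, by=1, bsize=2)
```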
- Further, the inter prediction unit 702 performs temporal prediction (motion prediction) as inter prediction using the other-time picture of the decoded parallax image D#2 as a reference picture, and generates a prediction image of the temporal prediction.
- That is, the inter prediction unit 702 detects a motion vector as the shift vector representing the shift between the target block and the other-time picture of the decoded parallax image D#2. The inter prediction unit 702 then generates a prediction image of the temporal prediction by performing motion compensation of that other-time picture using the motion vector (it acquires, as the prediction image, the block (corresponding block) at the position shifted from the target block by the motion vector as the shift vector in the other-time picture of the decoded parallax image D#2).
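The motion compensation step above, acquiring as the prediction image the corresponding block displaced from the target block by the detected motion vector, can be sketched as follows (function name and toy values are illustrative assumptions):

```python
def motion_compensate(ref_picture, bx, by, bsize, mvx, mvy):
    """Temporal prediction: the prediction image is the corresponding
    block of the reference picture, i.e. the block displaced from the
    target block position (bx, by) by the motion vector (mvx, mvy)."""
    x, y = bx + mvx, by + mvy
    return [row[x:x + bsize] for row in ref_picture[y:y + bsize]]

# Toy 6x6 reference picture (pixel value = row*10 + col).
ref = [[r * 10 + c for c in range(6)] for r in range(6)]
pred_mc = motion_compensate(ref, bx=1, by=1, bsize=2, mvx=2, mvy=1)
```

Note that with a zero motion vector this degenerates to the collocated block, which is exactly the assumption made for the warped reference picture.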
- Then, the inter prediction unit 702 encodes (predictively encodes) the target block using the prediction image of the temporal prediction, and calculates the coding cost for the other-time picture of the decoded parallax image D#2.
- The inter prediction unit 702 then selects, as the reference picture, whichever of the candidate pictures (the other-time picture of the decoded parallax image D#2 and the picture of the warped parallax image D′#1) has the lower coding cost.
- Further, from among the reference indexes supplied by the reference index allocation unit 701, the inter prediction unit 702 selects the reference index assigned to the selected picture (the other-time picture of the decoded parallax image D#2 or the picture of the warped parallax image D′#1) as the reference index for prediction of the target block, and outputs it to the variable length coding unit 216 as one piece of header information.
- In addition, the inter prediction unit 702 supplies, to the prediction image selection unit 224, the prediction image generated by inter prediction using, as the reference picture, the candidate picture to which the reference index for prediction of the target block is assigned (the other-time picture of the decoded parallax image D#2 or the picture of the warped parallax image D′#1).
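The selection performed by the inter prediction unit 702, choosing whichever candidate yields the lower coding cost and emitting the reference index already assigned to it, can be sketched as follows (candidate names and cost values are hypothetical):

```python
def select_reference(candidates):
    """candidates: list of (picture_name, reference_index, coding_cost)
    tuples, one per candidate picture.  Returns the name and the
    reference index of the candidate with the lowest coding cost; that
    index is what would be output as header information."""
    name, idx, _cost = min(candidates, key=lambda c: c[2])
    return name, idx

# Hypothetical coding costs for the two candidates of the target block.
chosen = select_reference([
    ("decoded_D2_other_time", 1, 480.0),  # temporal prediction cost
    ("warped_D1",             0, 350.0),  # parallax prediction cost
])
```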
- Otherwise, the inter prediction unit 702 performs processing similar to that of the parallax prediction unit 234 of FIG. 13, except that the other-time picture of the decoded parallax image D#2 is used as one of the candidate pictures instead of the picture of the decoded parallax image D#1.
- FIG. 35 is a block diagram illustrating a configuration example of the reference index allocation unit 701 in FIG.
- the reference index allocation unit 701 includes a feature amount generation unit 721 and an allocation unit 722.
- the feature quantity generation unit 721 generates a prediction determination feature quantity for the picture of the target block (target picture) and supplies the prediction determination feature quantity to the allocation unit 722.
- Based on the prediction determination feature quantity from the feature quantity generation unit 721, the assigning unit 722 assigns one of the values 0 and 1 as the reference index idx′ of the picture of the warped parallax image D′#1 and the other value as the reference index idx of the other-time picture of the parallax image D#2, and supplies the assignments to the inter prediction unit 702 (FIG. 34).
- That is, based on the prediction determination feature quantity, the allocating unit 722 determines which of the candidate pictures, the picture of the warped parallax image D′#1 or the other-time picture of the parallax image D#2, is more likely to be selected as the reference picture for the target picture.
- allocating section 722 assigns a reference index having a value of 0 to a candidate picture that is more likely to be selected as a reference picture out of a picture of warped parallax image D ′ # 1 and another time picture of parallax image D # 2. And a reference index having a value of 1 is assigned to the other candidate picture.
- non-default reference index assignment of MVC (AVC) can be performed by the RPLR command as described with reference to FIG.
- As the prediction determination feature quantity, the average value and the variance, over all macroblocks of the target picture, of the magnitude of the shift vector (calculated disparity vector or motion vector) of the target block when inter prediction is performed using a candidate picture as a reference picture can be employed.
- In this case, the feature quantity generation unit 721 obtains the average value and the variance, over all macroblocks of the target picture, of the magnitude of the calculated disparity vector as the shift vector of the target block when inter prediction (parallax prediction) is performed using the picture of the warped parallax image D′#1 as a reference picture. Similarly, it obtains the average value and the variance, over all macroblocks of the target picture, of the magnitude of the motion vector as the shift vector of the target block when inter prediction (temporal prediction) is performed using the other-time picture of the decoded parallax image D#2 as a reference picture.
- Then, the allocation unit 722 assigns the reference index of value 0 to whichever of the picture of the warped parallax image D′#1 and the other-time picture of the parallax image D#2 has the smaller average shift-vector magnitude (or the smaller variance), and assigns the reference index of value 1 to the other candidate picture.
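This shift-vector criterion can be sketched as follows; the per-macroblock magnitudes and the candidate labels are toy assumptions, and only the mean is compared here (the variance could be used the same way):

```python
def assign_indices_by_shift_stats(shift_mags_warped, shift_mags_temporal):
    """Assign reference index 0 to the candidate picture whose
    per-macroblock shift-vector magnitudes have the smaller average,
    and reference index 1 to the other candidate."""
    mean = lambda xs: sum(xs) / len(xs)
    if mean(shift_mags_warped) <= mean(shift_mags_temporal):
        return {"warped_D1": 0, "decoded_D2_other_time": 1}
    return {"warped_D1": 1, "decoded_D2_other_time": 0}

# Toy data: disparity vectors vs. the warped picture are near zero,
# while motion vectors vs. the other-time picture are larger.
idx = assign_indices_by_shift_stats([0.2, 0.1, 0.3], [2.0, 1.5, 2.5])
```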
- Alternatively, as the prediction determination feature quantity, the sum and the average, over all macroblocks of the target picture, of the absolute values of the residuals between the target block and the corresponding block of the reference picture when inter prediction is performed using a candidate picture as a reference picture can be employed.
- In this case, the feature quantity generation unit 721 obtains the sum and the average value, over all macroblocks of the target picture, of the absolute values of the residuals between the target block and the corresponding block when inter prediction (parallax prediction) is performed using the picture of the warped parallax image D′#1 as a reference picture, and likewise when inter prediction (temporal prediction) is performed using the other-time picture of the decoded parallax image D#2 as a reference picture.
- Then, the allocation unit 722 assigns the reference index of value 0 to whichever of the picture of the warped parallax image D′#1 and the other-time picture of the parallax image D#2 has the smaller sum (or average value) of the absolute residuals, and assigns the reference index of value 1 to the other candidate picture.
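The residual criterion amounts to comparing sums of absolute differences (SAD) between the target block and each candidate's corresponding block; a sketch with toy 2x2 blocks (all values illustrative):

```python
def sad(target_block, pred_block):
    """Sum of absolute differences (absolute residuals) between a
    target block and a predicted (corresponding) block."""
    return sum(abs(t - p)
               for trow, prow in zip(target_block, pred_block)
               for t, p in zip(trow, prow))

target        = [[10, 12], [14, 16]]
pred_warped   = [[10, 13], [13, 16]]  # SAD = 0 + 1 + 1 + 0 = 2
pred_temporal = [[8, 12], [14, 20]]   # SAD = 2 + 0 + 0 + 4 = 6

# Reference index 0 goes to the candidate with the smaller SAD.
idx0_picture = ("warped_D1"
                if sad(target, pred_warped) <= sad(target, pred_temporal)
                else "decoded_D2_other_time")
```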
- As the prediction determination feature quantity, the sum or the average value, over all macroblocks of the target picture, of the coding cost of the target block when inter prediction is performed using a candidate picture as a reference picture can also be employed.
- In this case, the feature quantity generation unit 721 obtains the sum and the average value, over all macroblocks of the target picture, of the coding cost of the target block when inter prediction (parallax prediction) is performed using the picture of the warped parallax image D′#1 as a reference picture, and likewise when inter prediction (temporal prediction) is performed using the other-time picture of the decoded parallax image D#2 as a reference picture.
- Then, the allocation unit 722 assigns the reference index of value 0 to whichever of the picture of the warped parallax image D′#1 and the other-time picture of the parallax image D#2 has the smaller sum (or average value) of the coding costs, and assigns the reference index of value 1 to the other candidate picture.
- Further, as the prediction determination feature quantity, the ratio of the reference indexes for prediction in the immediately preceding picture (the picture encoded immediately before the target picture), that is, the ratio between the number of reference indexes of value 0 and the number of reference indexes of value 1, can be adopted.
- In this case, when the reference index of value 0 was selected in the immediately preceding picture at least as often as the reference index of value 1, the allocating unit 722 assigns, in encoding the target picture, a reference index to each of the candidate pictures, the picture of the warped parallax image D′#1 and the other-time picture of the parallax image D#2, in the same manner as in the encoding of the immediately preceding picture.
- Conversely, when the reference index of value 1 was selected more often in the immediately preceding picture, the allocating unit 722 assigns, in encoding the target picture, a reference index to each of the candidate pictures, the picture of the warped parallax image D′#1 and the other-time picture of the parallax image D#2, contrary to the assignment of the reference indexes at the time of encoding the immediately preceding picture.
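The preceding-picture heuristic can be sketched as below; the 0.5 threshold on the proportion of value-0 indexes and the candidate labels are assumptions made for illustration, not values from the embodiment:

```python
def allocate_like_previous(prev_zero_ratio, prev_alloc, threshold=0.5):
    """If reference index 0 was chosen for at least `threshold` of the
    macroblocks of the immediately preceding picture, keep the previous
    picture's index-to-candidate allocation; otherwise swap 0 and 1.
    (`threshold` is an illustrative assumption.)"""
    if prev_zero_ratio >= threshold:
        return dict(prev_alloc)
    # Swap: whatever had index 0 now gets 1, and vice versa.
    return {pic: 1 - idx for pic, idx in prev_alloc.items()}

prev = {"warped_D1": 0, "decoded_D2_other_time": 1}
alloc = allocate_like_previous(0.3, prev)  # index 0 was rarely used
```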
- In addition, as the prediction determination feature quantity, the prediction accuracy of the prediction image of the target picture can be adopted, and a reference index can be assigned to each of the candidate pictures, the picture of the warped parallax image D′#1 and the other-time picture of the parallax image D#2, based on that prediction accuracy.
- FIG. 36 is a diagram for explaining a method of adopting the prediction accuracy of the prediction image of the target picture as the prediction determination feature quantity and allocating the reference index to the candidate picture based on the prediction accuracy.
- In FIG. 36, an I picture I#11, a B picture B#12, a P picture P#13, and a B picture B#14 are shown as pictures of the parallax image D#1 in (display) time order, and a P picture P#21, a B picture B#22, a P picture P#23, and a B picture B#24 are shown as pictures of the parallax image D#2 in time order.
- The I picture I#11 of the parallax image D#1 and the P picture P#21 of the parallax image D#2 are pictures at the same time; the B picture B#12 and the B picture B#22, the P picture P#13 and the P picture P#23, and the B picture B#14 and the B picture B#24 are likewise pictures at the same time.
- Now, suppose that the P picture P#23 of the parallax image D#2 to be encoded is the target picture, and that in encoding the target picture P#23, the P picture P#13 of the parallax image D#1 and the P picture P#21 of the parallax image D#2 are candidate pictures.
- Strictly speaking, a warped picture obtained by warping the P picture P#13 of the parallax image D#1 should be the candidate picture, but here, to simplify the description, the P picture P#13 of the parallax image D#1 itself is adopted as the candidate picture in place of the warped picture.
- It is assumed that both the P picture P#13 of the parallax image D#1, which is a candidate picture, and the P picture P#21 of the parallax image D#2 have been predictively encoded with the I picture I#11 of the parallax image D#1 as the reference picture.
- It is assumed that the P picture P#21 of the parallax image D#2 has been encoded by generating a prediction image through parallax prediction preP′ as inter prediction with the I picture I#11 of the parallax image D#1 as the reference picture, and encoding the residual between that prediction image and the P picture P#21.
- Here, letting S′ be the code amount of (the residual of) the P picture P#21 and Q′ be the average value of the quantization step used to quantize it, the prediction accuracy X#21 expressed by the equation X#21 = S′ × Q′ becomes smaller as the accuracy of the parallax prediction preP′ (of the prediction image obtained by it) is higher.
- That is, the higher the accuracy of the parallax prediction preP′ performed with the I picture I#11 as the reference picture, the smaller the residual between the P picture P#21 and the prediction image, and hence the smaller the code amount S′ and the average quantization step Q′ for quantizing the P picture P#21 (its residual). The prediction accuracy X#13 of the temporal prediction preT′ by which the P picture P#13 was encoded is defined in the same way, as X#13 = S × Q, from the code amount S and the average quantization step Q of the P picture P#13.
- When the target picture P#23 is encoded, a prediction image is generated by performing the parallax prediction preP with the P picture P#13 as the reference picture, and the residual between the target picture P#23 and that prediction image is encoded.
- Alternatively, a prediction image is generated by performing the temporal prediction preT with the P picture P#21 as the reference picture, and the residual between the target picture P#23 and that prediction image is encoded.
- Here, the prediction accuracy of the parallax prediction preP (of the prediction image generated with the P picture P#13 as the reference picture) when the target picture P#23 is encoded is estimated to be approximately the same as the prediction accuracy X#21 of the parallax prediction preP′ performed with the I picture I#11 as the reference picture.
- Similarly, the prediction accuracy of the temporal prediction preT with the P picture P#21 as the reference picture when encoding the target picture P#23 is estimated to be comparable to the prediction accuracy X#13 of the temporal prediction preT′ performed with the I picture I#11 as the reference picture.
- Therefore, the feature amount generation unit 721 obtains the prediction accuracy X#21 of the parallax prediction preP′ performed when the P picture P#21, which serves as the reference picture for the temporal prediction preT of the target picture P#23, was encoded, as the prediction accuracy of the parallax prediction preP using the P picture P#13 as the reference picture.
- Further, the feature amount generation unit 721 obtains the prediction accuracy X#13 of the temporal prediction preT′ performed when the P picture P#13, which is the reference picture of the parallax prediction preP of the target picture P#23, was encoded, as the prediction accuracy of the temporal prediction preT using the P picture P#21 as the reference picture.
- Then, if the prediction accuracy of the parallax prediction preP (the prediction accuracy X#21 of the parallax prediction preP′) is better (smaller) than the prediction accuracy of the temporal prediction preT (the prediction accuracy X#13 of the temporal prediction preT′), the allocation unit 722 assigns the reference index of value 0 to the P picture P#13, which is the reference picture of the parallax prediction preP, and assigns the reference index of value 1 to the P picture P#21, which is the reference picture of the temporal prediction preT.
- Conversely, if the prediction accuracy of the temporal prediction preT is better, the reference index of value 0 is assigned to the P picture P#21, which is the reference picture of the temporal prediction preT, and the reference index of value 1 is assigned to the P picture P#13, which is the reference picture of the parallax prediction preP.
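The accuracy-based allocation above can be sketched with the X = S × Q measure, where a smaller X means better prediction; the numeric code amounts and quantization steps are toy assumptions:

```python
def assign_by_prediction_accuracy(s_disp, q_disp, s_temp, q_temp):
    """Prediction accuracy X = S * Q (code amount times average
    quantization step); smaller X means better prediction.  Reference
    index 0 goes to the reference picture of the mode estimated to
    predict better."""
    x_disp = s_disp * q_disp  # X#21: estimates parallax prediction preP
    x_temp = s_temp * q_temp  # X#13: estimates temporal prediction preT
    if x_disp < x_temp:
        return {"P13_disparity_ref": 0, "P21_temporal_ref": 1}
    return {"P13_disparity_ref": 1, "P21_temporal_ref": 0}

# Toy values: parallax prediction has the smaller X, so P#13 gets index 0.
alloc = assign_by_prediction_accuracy(s_disp=1000, q_disp=20,
                                      s_temp=1500, q_temp=24)
```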
- In any case, as described above, the reference index of value 0, which has a small code amount, can be assigned to the candidate picture that is likely to be selected as the reference picture, and as a result, the encoding efficiency can be improved.
- FIG. 37 is a block diagram illustrating a configuration example of the decoder 322 (FIG. 18) that decodes the encoded data of the parallax image # 2 by the warped reference allocation method using candidate pictures including pictures used for temporal prediction.
- The decoder 322 of FIG. 37 is common to the case of FIG. 20 in that it includes the accumulation buffer 441 through the intra-screen prediction unit 449, the predicted image selection unit 451, the warping unit 461, and the warped picture buffer 462.
- the decoder 322 in FIG. 37 is different from the case in FIG. 20 in that an inter prediction unit 801 is provided instead of the parallax prediction unit 463.
- the inter prediction unit 801 recognizes whether or not the target block is encoded using the prediction image generated by the inter prediction based on the header information from the variable length decoding unit 342.
- When the target block has been encoded using a prediction image generated by inter prediction, the inter prediction unit 801 recognizes (acquires), based on the header information from the variable length decoding unit 342, the reference index for prediction, that is, the reference index assigned to the reference picture used to generate the prediction image of the target block.
- Then, the inter prediction unit 801 selects, as the reference picture, the candidate picture to which the reference index for prediction is assigned, from among the picture (other-time picture) of the decoded parallax image D#2 stored in the DPB 331 and the picture of the warped parallax image D′#1 stored in the warped picture buffer 462, both of which are candidate pictures.
- Further, the inter prediction unit 801 recognizes, based on the header information from the variable length decoding unit 342, the shift vector (calculated disparity vector or motion vector) used to generate the prediction image of the target block, and generates a prediction image by performing shift compensation (parallax compensation or motion compensation) according to that shift vector.
- That is, the inter prediction unit 801 acquires, as the prediction image, the block (corresponding block) of the candidate picture at the position shifted from the position of the target block according to the shift vector of the target block.
- the inter prediction unit 801 supplies the predicted image to the predicted image selection unit 451.
- Otherwise, the inter prediction unit 801 performs processing similar to that of the parallax prediction unit 463, except that the other-time picture of the decoded parallax image D#2 is used as one of the candidate pictures instead of the picture of the decoded parallax image D#1.
- The warped reference allocation method using candidate pictures including pictures used for temporal prediction is applicable not only to the encoder 22 (FIG. 5) that encodes the parallax image #2 and the decoder 322 that decodes the encoded data of the parallax image #2, but also to the encoder 12 (FIG. 5) that encodes the color image #2 and the decoder 312 (FIG. 18) that decodes the encoded data of the color image #2.
- FIG. 39 shows a configuration example of an embodiment of a computer in which a program for executing the series of processes described above is installed.
- the program can be recorded in advance in a hard disk 805 or ROM 803 as a recording medium built in the computer.
- the program can be stored (recorded) in a removable recording medium 811.
- a removable recording medium 811 can be provided as so-called package software.
- examples of the removable recording medium 811 include a flexible disk, a CD-ROM (Compact Disc Read Only Memory), an MO (Magneto Optical) disc, a DVD (Digital Versatile Disc), a magnetic disc, and a semiconductor memory.
- The program can be installed in the computer from the removable recording medium 811 as described above, or can be downloaded to the computer via a communication network or a broadcast network and installed in the built-in hard disk 805. That is, the program can be transferred wirelessly from a download site to the computer via an artificial satellite for digital satellite broadcasting, or transferred to the computer by wire via a network such as a LAN (Local Area Network) or the Internet.
- the computer includes a CPU (Central Processing Unit) 802, and an input / output interface 810 is connected to the CPU 802 via a bus 801.
- When a command is input by the user operating the input unit 807 or the like via the input / output interface 810, the CPU 802 executes a program stored in the ROM (Read Only Memory) 803 accordingly.
- the CPU 802 loads a program stored in the hard disk 805 to a RAM (Random Access Memory) 804 and executes it.
- The CPU 802 thereby performs the processing according to the flowcharts described above or the processing performed by the configurations of the block diagrams described above. Then, the CPU 802 outputs the processing result as necessary, for example, from the output unit 806 or transmits it from the communication unit 808 via the input / output interface 810, and further records it on the hard disk 805.
- the input unit 807 includes a keyboard, a mouse, a microphone, and the like.
- the output unit 806 includes an LCD (Liquid Crystal Display), a speaker, and the like.
- The processing performed by the computer according to the program does not necessarily have to be performed in chronological order along the order described in the flowcharts. That is, the processing performed by the computer according to the program also includes processing executed in parallel or individually (for example, parallel processing or object-based processing).
- The program may be processed by a single computer (processor), or may be processed in a distributed manner by a plurality of computers. Furthermore, the program may be transferred to a remote computer and executed there.
- the present technology is not limited to encoding and decoding using MVC. That is, the present technology can be applied to a case where a reference index is assigned to a candidate picture, a predicted image is generated, and images of a plurality of viewpoints are encoded and decoded using the predicted image.
- FIG. 40 illustrates a schematic configuration of a television apparatus to which the present technology is applied.
- the television apparatus 900 includes an antenna 901, a tuner 902, a demultiplexer 903, a decoder 904, a video signal processing unit 905, a display unit 906, an audio signal processing unit 907, a speaker 908, and an external interface unit 909. Furthermore, the television apparatus 900 includes a control unit 910, a user interface unit 911, and the like.
- the tuner 902 selects a desired channel from the broadcast wave signal received by the antenna 901, demodulates it, and outputs the obtained encoded bit stream to the demultiplexer 903.
- the demultiplexer 903 extracts video and audio packets of the program to be viewed from the encoded bit stream, and outputs the extracted packet data to the decoder 904.
- the demultiplexer 903 supplies a packet of data such as EPG (Electronic Program Guide) to the control unit 910. If scrambling is being performed, descrambling is performed by a demultiplexer or the like.
- the decoder 904 performs packet decoding processing, and outputs video data generated by the decoding processing to the video signal processing unit 905 and audio data to the audio signal processing unit 907.
- the video signal processing unit 905 performs noise removal, video processing according to user settings, and the like on the video data.
- the video signal processing unit 905 generates video data of a program to be displayed on the display unit 906, image data by processing based on an application supplied via a network, and the like.
- the video signal processing unit 905 generates video data for displaying a menu screen for selecting an item and the like, and superimposes the video data on the video data of the program.
- the video signal processing unit 905 generates a drive signal based on the video data generated in this way, and drives the display unit 906.
- the display unit 906 drives a display device (for example, a liquid crystal display element or the like) based on a drive signal from the video signal processing unit 905 to display a program video or the like.
- the audio signal processing unit 907 performs predetermined processing such as noise removal on the audio data, performs D / A conversion processing and amplification processing on the processed audio data, and outputs the audio data to the speaker 908.
- the external interface unit 909 is an interface for connecting to an external device or a network, and transmits and receives data such as video data and audio data.
- a user interface unit 911 is connected to the control unit 910.
- the user interface unit 911 includes an operation switch, a remote control signal receiving unit, and the like, and supplies an operation signal corresponding to a user operation to the control unit 910.
- the control unit 910 is configured using a CPU (Central Processing Unit), a memory, and the like.
- the memory stores a program executed by the CPU, various data necessary for the CPU to perform processing, EPG data, data acquired via a network, and the like.
- the program stored in the memory is read and executed by the CPU at a predetermined timing such as when the television device 900 is activated.
- the CPU executes the program to control each unit so that the television apparatus 900 performs an operation according to the user operation.
- the television device 900 is provided with a bus 912 for connecting the tuner 902, the demultiplexer 903, the video signal processing unit 905, the audio signal processing unit 907, the external interface unit 909, and the control unit 910.
- In the television device configured in this way, the decoder 904 is provided with the function of the image processing apparatus (image processing method) of the present application. For this reason, the image quality of a decoded image can be improved for images of a plurality of viewpoints.
- FIG. 41 illustrates a schematic configuration of a mobile phone to which the present technology is applied.
- the cellular phone 920 includes a communication unit 922, an audio codec 923, a camera unit 926, an image processing unit 927, a demultiplexing unit 928, a recording / reproducing unit 929, a display unit 930, and a control unit 931. These are connected to each other via a bus 933.
- an antenna 921 is connected to the communication unit 922, and a speaker 924 and a microphone 925 are connected to the audio codec 923. Further, an operation unit 932 is connected to the control unit 931.
- the mobile phone 920 performs various operations such as transmission / reception of voice signals, transmission / reception of e-mail and image data, image shooting, and data recording in various modes such as a voice call mode and a data communication mode.
- In the voice call mode, for example, the voice signal generated by the microphone 925 is converted into voice data and compressed by the voice codec 923, and supplied to the communication unit 922.
- the communication unit 922 performs audio data modulation processing, frequency conversion processing, and the like to generate a transmission signal.
- the communication unit 922 supplies a transmission signal to the antenna 921 and transmits it to a base station (not shown).
- the communication unit 922 performs amplification, frequency conversion processing, demodulation processing, and the like of the reception signal received by the antenna 921, and supplies the obtained audio data to the audio codec 923.
- the audio codec 923 performs data expansion of the audio data and conversion to an analog audio signal and outputs the result to the speaker 924.
- the control unit 931 receives character data input by operating the operation unit 932 and displays the input characters on the display unit 930.
- the control unit 931 generates mail data based on a user instruction or the like in the operation unit 932 and supplies the mail data to the communication unit 922.
- the communication unit 922 performs mail data modulation processing, frequency conversion processing, and the like, and transmits the obtained transmission signal from the antenna 921.
- the communication unit 922 performs amplification, frequency conversion processing, demodulation processing, and the like of the reception signal received by the antenna 921, and restores mail data. This mail data is supplied to the display unit 930 to display the mail contents.
- the mobile phone 920 can also store the received mail data in a storage medium by the recording / playback unit 929.
- the storage medium is any rewritable storage medium.
- The storage medium is, for example, a semiconductor memory such as a RAM or a built-in flash memory, or a removable medium such as a hard disk, a magnetic disk, a magneto-optical disk, an optical disk, a USB memory, or a memory card.
- the image data generated by the camera unit 926 is supplied to the image processing unit 927.
- the image processing unit 927 performs encoding processing of image data and generates encoded data.
- the demultiplexing unit 928 multiplexes the encoded data generated by the image processing unit 927 and the audio data supplied from the audio codec 923 by a predetermined method, and supplies the multiplexed data to the communication unit 922.
- the communication unit 922 performs modulation processing and frequency conversion processing of multiplexed data, and transmits the obtained transmission signal from the antenna 921.
- the communication unit 922 performs amplification, frequency conversion processing, demodulation processing, and the like of the reception signal received by the antenna 921, and restores multiplexed data. This multiplexed data is supplied to the demultiplexing unit 928.
- the demultiplexing unit 928 performs demultiplexing of the multiplexed data, and supplies the encoded data to the image processing unit 927 and the audio data to the audio codec 923.
- the image processing unit 927 performs a decoding process on the encoded data to generate image data.
- the image data is supplied to the display unit 930 and the received image is displayed.
- the audio codec 923 converts the audio data into an analog audio signal, supplies the analog audio signal to the speaker 924, and outputs the received audio.
- the image processing unit 927 is provided with the function of the image processing device (image processing method) of the present application. For this reason, the image quality of the decoded image can be improved with respect to the images of a plurality of viewpoints.
- FIG. 42 illustrates a schematic configuration of a recording / reproducing apparatus to which the present technology is applied.
- the recording / reproducing apparatus 940 records, for example, audio data and video data of a received broadcast program on a recording medium, and provides the recorded data to the user at a timing according to a user instruction.
- the recording / reproducing device 940 can also acquire audio data and video data from another device, for example, and record them on a recording medium. Further, the recording / reproducing apparatus 940 decodes and outputs the audio data and video data recorded on the recording medium, thereby enabling image display and audio output on the monitor apparatus or the like.
- the recording / reproducing apparatus 940 includes a tuner 941, an external interface unit 942, an encoder 943, an HDD (Hard Disk Drive) unit 944, a disk drive 945, a selector 946, a decoder 947, an OSD (On-Screen Display) unit 948, a control unit 949, A user interface unit 950 is included.
- Tuner 941 selects a desired channel from a broadcast signal received by an antenna (not shown).
- the tuner 941 outputs an encoded bit stream obtained by demodulating the received signal of a desired channel to the selector 946.
- the external interface unit 942 includes at least one of an IEEE 1394 interface, a network interface unit, a USB interface, a flash memory interface, and the like.
- the external interface unit 942 is an interface for connecting to an external device, a network, a memory card, and the like, and receives data such as video data and audio data to be recorded.
- the encoder 943 performs encoding by a predetermined method when the video data and audio data supplied from the external interface unit 942 are not encoded, and outputs an encoded bit stream to the selector 946.
- the HDD unit 944 records content data such as video and audio, various programs, and other data on a built-in hard disk, and reads them from the hard disk during playback.
- The disk drive 945 records and reproduces signals with respect to the mounted optical disk. The optical disk is, for example, a DVD disc (DVD-Video, DVD-RAM, DVD-R, DVD-RW, DVD+R, DVD+RW, etc.), a Blu-ray disc, or the like.
- the selector 946 selects one of the encoded bit streams from the tuner 941 or the encoder 943 and supplies it to either the HDD unit 944 or the disk drive 945 when recording video or audio. Further, the selector 946 supplies the encoded bit stream output from the HDD unit 944 or the disk drive 945 to the decoder 947 at the time of reproduction of video and audio.
- the decoder 947 performs a decoding process on the encoded bit stream.
- the decoder 947 supplies the video data generated by performing the decoding process to the OSD unit 948.
- the decoder 947 outputs audio data generated by performing the decoding process.
- the OSD unit 948 generates video data for displaying a menu screen for selecting an item and the like, and superimposes it on the video data output from the decoder 947 and outputs the video data.
- a user interface unit 950 is connected to the control unit 949.
- the user interface unit 950 includes an operation switch, a remote control signal receiving unit, and the like, and supplies an operation signal corresponding to a user operation to the control unit 949.
- the control unit 949 is configured using a CPU, a memory, and the like.
- the memory stores programs executed by the CPU and various data necessary for the CPU to perform processing.
- the program stored in the memory is read and executed by the CPU at a predetermined timing such as when the recording / reproducing apparatus 940 is activated.
- the CPU executes the program to control each unit so that the recording / reproducing device 940 operates according to the user operation.
- the decoder 947 is provided with the function of the image processing apparatus (image processing method) of the present application. Therefore, the image quality of decoded images can be improved for images of a plurality of viewpoints.
- FIG. 43 illustrates a schematic configuration of an imaging apparatus to which the present technology is applied.
- the imaging device 960 images a subject, displays an image of the subject on a display unit, and records it on a recording medium as image data.
- the imaging device 960 includes an optical block 961, an imaging unit 962, a camera signal processing unit 963, an image data processing unit 964, a display unit 965, an external interface unit 966, a memory unit 967, a media drive 968, an OSD unit 969, and a control unit 970. In addition, a user interface unit 971 is connected to the control unit 970. Furthermore, the image data processing unit 964, the external interface unit 966, the memory unit 967, the media drive 968, the OSD unit 969, the control unit 970, and the like are connected via a bus 972.
- the optical block 961 is configured using a focus lens, a diaphragm mechanism, and the like.
- the optical block 961 forms an optical image of the subject on the imaging surface of the imaging unit 962.
- the imaging unit 962 is configured using a CCD or CMOS image sensor, generates an electrical signal corresponding to the optical image by photoelectric conversion, and supplies the electrical signal to the camera signal processing unit 963.
- the camera signal processing unit 963 performs various camera signal processing such as knee correction, gamma correction, and color correction on the electrical signal supplied from the imaging unit 962.
- the camera signal processing unit 963 supplies the image data after the camera signal processing to the image data processing unit 964.
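The correction chain named above (knee, gamma, color) can be sketched per pixel. The functions and their default parameters below are illustrative assumptions for exposition; the patent names the corrections but specifies none of these values.

```python
# Hypothetical per-channel sketch of the camera signal corrections named
# in the text. The knee point, gamma value, and per-channel gains are
# illustrative defaults, not values specified by the patent.

def knee_correction(v, knee=0.8, slope=0.2):
    """Compress highlights: values above the knee point get a gentler slope."""
    return v if v <= knee else knee + (v - knee) * slope

def gamma_correction(v, gamma=2.2):
    """Power-law gamma encoding of a linear value in [0, 1]."""
    return v ** (1.0 / gamma)

def color_correction(rgb, gains=(1.0, 1.0, 1.0)):
    """Per-channel gain, a simple stand-in for a full color-correction matrix."""
    return tuple(min(1.0, c * g) for c, g in zip(rgb, gains))

def process_pixel(rgb):
    """Apply knee, then gamma, then color correction to one RGB pixel."""
    return color_correction(tuple(gamma_correction(knee_correction(c)) for c in rgb))
```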
- the image data processing unit 964 performs an encoding process on the image data supplied from the camera signal processing unit 963.
- the image data processing unit 964 supplies the encoded data generated by performing the encoding process to the external interface unit 966 and the media drive 968. Further, the image data processing unit 964 performs a decoding process on the encoded data supplied from the external interface unit 966 and the media drive 968.
- the image data processing unit 964 supplies the image data generated by performing the decoding process to the display unit 965. It also supplies the image data from the camera signal processing unit 963 to the display unit 965, superimposing the display data acquired from the OSD unit 969 on that image data before supplying it.
- the OSD unit 969 generates display data such as a menu screen and icons made up of symbols, characters, or figures and outputs them to the image data processing unit 964.
- the external interface unit 966 includes, for example, a USB input / output terminal, and is connected to a printer when printing an image.
- a drive is connected to the external interface unit 966 as necessary, a removable medium such as a magnetic disk or an optical disk is appropriately mounted, and a computer program read from them is installed as necessary.
- the external interface unit 966 has a network interface connected to a predetermined network such as a LAN or the Internet.
- the control unit 970 can read encoded data from the memory unit 967 in accordance with an instruction from the user interface unit 971 and supply it from the external interface unit 966 to another device connected via the network.
- the control unit 970 can also acquire encoded data and image data supplied from another device via the network through the external interface unit 966 and supply them to the image data processing unit 964.
- as a recording medium driven by the media drive 968, any readable/writable removable medium such as a magnetic disk, a magneto-optical disk, an optical disk, or a semiconductor memory is used.
- the recording medium may be any type of removable medium: a tape device, a disk, or a memory card. Of course, a non-contact IC card or the like may also be used.
- the media drive 968 and the recording medium may be integrated and configured by a non-portable storage medium such as a built-in hard disk drive or an SSD (Solid State Drive).
- the control unit 970 is configured using a CPU, a memory, and the like.
- the memory stores programs executed by the CPU, various data necessary for the CPU to perform processing, and the like.
- the program stored in the memory is read and executed by the CPU at a predetermined timing such as when the imaging device 960 is activated.
- the CPU executes the program to control each unit so that the imaging device 960 operates according to the user operation.
- the image data processing unit 964 is provided with the function of the image processing apparatus (image processing method) of the present application. Therefore, the image quality of decoded images can be improved for images of a plurality of viewpoints.
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Multimedia (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
- Testing, Inspecting, Measuring Of Stereoscopic Televisions And Televisions (AREA)
Abstract
Priority Applications (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US14/002,185 US9105076B2 (en) | 2011-03-08 | 2012-02-28 | Image processing apparatus, image processing method, and program |
| CN2012800110104A CN103404154A (zh) | 2011-03-08 | 2012-02-28 | Image processing device, image processing method, and program |
| JP2013503459A JPWO2012121052A1 (ja) | 2011-03-08 | 2012-02-28 | Image processing apparatus, image processing method, and program |
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2011050583 | 2011-03-08 | ||
| JP2011-050583 | 2011-03-08 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2012121052A1 true WO2012121052A1 (fr) | 2012-09-13 |
Family
ID=46798020
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/JP2012/054856 Ceased WO2012121052A1 (fr) | 2012-02-28 | Image processing device, image processing method, and program |
Country Status (4)
| Country | Link |
|---|---|
| US (1) | US9105076B2 (fr) |
| JP (1) | JPWO2012121052A1 (fr) |
| CN (1) | CN103404154A (fr) |
| WO (1) | WO2012121052A1 (fr) |
Cited By (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP2014072801A (ja) * | 2012-09-28 | 2014-04-21 | Sharp Corp | Multi-viewpoint image generation device, image generation method, display device, program, and recording medium |
| CN105794207A (zh) * | 2013-12-02 | 2016-07-20 | Qualcomm Incorporated | Reference picture selection |
| US9462251B2 (en) | 2014-01-02 | 2016-10-04 | Industrial Technology Research Institute | Depth map aligning method and system |
Families Citing this family (12)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| EP2757527B1 (fr) * | 2013-01-16 | 2018-12-12 | Honda Research Institute Europe GmbH | System and method for distorted camera image correction |
| US10911779B2 (en) * | 2013-10-17 | 2021-02-02 | Nippon Telegraph And Telephone Corporation | Moving image encoding and decoding method, and non-transitory computer-readable media that code moving image for each of prediction regions that are obtained by dividing coding target region while performing prediction between different views |
| US20150195564A1 (en) * | 2014-01-03 | 2015-07-09 | Qualcomm Incorporated | Method for coding a reference picture set (rps) in multi-layer coding |
| EP2933943A1 (fr) * | 2014-04-14 | 2015-10-21 | Alcatel Lucent | Efficacité de stockage et récupération d'informations privées sécurisées inconditionnelles |
| US9589362B2 (en) | 2014-07-01 | 2017-03-07 | Qualcomm Incorporated | System and method of three-dimensional model generation |
| US9607388B2 (en) * | 2014-09-19 | 2017-03-28 | Qualcomm Incorporated | System and method of pose estimation |
| US9911242B2 (en) | 2015-05-14 | 2018-03-06 | Qualcomm Incorporated | Three-dimensional model generation |
| US10304203B2 (en) | 2015-05-14 | 2019-05-28 | Qualcomm Incorporated | Three-dimensional model generation |
| US10373366B2 (en) | 2015-05-14 | 2019-08-06 | Qualcomm Incorporated | Three-dimensional model generation |
| US10341568B2 (en) | 2016-10-10 | 2019-07-02 | Qualcomm Incorporated | User interface to assist three dimensional scanning of objects |
| KR102608466B1 (ko) * | 2016-11-22 | 2023-12-01 | Samsung Electronics Co., Ltd. | Image processing method and image processing apparatus |
| WO2021230157A1 (fr) * | 2020-05-15 | 2021-11-18 | Sony Group Corporation | Information processing device, information processing method, and information processing program |
Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2009091383A2 (fr) * | 2008-01-11 | 2009-07-23 | Thomson Licensing | Video and depth coding |
| JP2010520697A (ja) * | 2007-03-02 | 2010-06-10 | LG Electronics Inc. | Method and apparatus for decoding/encoding a video signal |
| JP2010157822A (ja) * | 2008-12-26 | 2010-07-15 | Victor Co Of Japan Ltd | Image decoding device, image encoding/decoding method, and program therefor |
| JP2010524381A (ja) * | 2007-04-11 | 2010-07-15 | Samsung Electronics Co., Ltd. | Method and apparatus for encoding and decoding multi-view video |
Family Cites Families (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN101415115B (zh) * | 2007-10-15 | 2011-02-02 | Huawei Technologies Co., Ltd. | Video encoding and decoding method based on motion skip mode, and codec therefor |
| WO2010050728A2 (fr) * | 2008-10-27 | 2010-05-06 | LG Electronics Inc. | Method and apparatus for synthesizing virtual view images |
-
2012
- 2012-02-28 WO PCT/JP2012/054856 patent/WO2012121052A1/fr not_active Ceased
- 2012-02-28 CN CN2012800110104A patent/CN103404154A/zh active Pending
- 2012-02-28 US US14/002,185 patent/US9105076B2/en not_active Expired - Fee Related
- 2012-02-28 JP JP2013503459A patent/JPWO2012121052A1/ja active Pending
Patent Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP2010520697A (ja) * | 2007-03-02 | 2010-06-10 | LG Electronics Inc. | Method and apparatus for decoding/encoding a video signal |
| JP2010524381A (ja) * | 2007-04-11 | 2010-07-15 | Samsung Electronics Co., Ltd. | Method and apparatus for encoding and decoding multi-view video |
| WO2009091383A2 (fr) * | 2008-01-11 | 2009-07-23 | Thomson Licensing | Video and depth coding |
| JP2010157822A (ja) * | 2008-12-26 | 2010-07-15 | Victor Co Of Japan Ltd | Image decoding device, image encoding/decoding method, and program therefor |
Cited By (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP2014072801A (ja) * | 2012-09-28 | 2014-04-21 | Sharp Corp | Multi-viewpoint image generation device, image generation method, display device, program, and recording medium |
| CN105794207A (zh) * | 2013-12-02 | 2016-07-20 | Qualcomm Incorporated | Reference picture selection |
| CN105794207B (zh) * | 2013-12-02 | 2019-02-22 | Qualcomm Incorporated | Reference picture selection |
| US9462251B2 (en) | 2014-01-02 | 2016-10-04 | Industrial Technology Research Institute | Depth map aligning method and system |
Also Published As
| Publication number | Publication date |
|---|---|
| US9105076B2 (en) | 2015-08-11 |
| JPWO2012121052A1 (ja) | 2014-07-17 |
| US20130336589A1 (en) | 2013-12-19 |
| CN103404154A (zh) | 2013-11-20 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| JP6061150B2 (ja) | Image processing apparatus, image processing method, and program | |
| WO2012121052A1 (fr) | Image processing device, image processing method, and program | |
| US9445092B2 (en) | Image processing apparatus, image processing method, and program | |
| US9350972B2 (en) | Encoding device and encoding method, and decoding device and decoding method | |
| KR102092822B1 (ko) | Decoding device and decoding method, and encoding device and encoding method | |
| US20140036033A1 (en) | Image processing device and image processing method | |
| US20140085418A1 (en) | Image processing device and image processing method | |
| KR102706378B1 (ko) | Inter-prediction-based image coding method and apparatus | |
| JPWO2016104179A1 (ja) | Image processing device and image processing method | |
| JPWO2012128241A1 (ja) | Image processing device, image processing method, and program | |
| KR20220110284A (ko) | Image encoding/decoding method and apparatus using a sequence parameter set including maximum merge candidate count information, and method of transmitting a bitstream | |
| JP2024125405A (ja) | Image encoding/decoding method and apparatus performing BDOF, and method of transmitting a bitstream | |
| US20150071350A1 (en) | Image processing device and image processing method | |
| WO2013157439A1 (fr) | Decoding device and method, and encoding device and method | |
| HK1216809B (zh) | Method and device for advanced residual prediction (ARP) for texture coding |
| HK1223757B (zh) | Block-based advanced residual prediction for 3D video coding |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 12755731 Country of ref document: EP Kind code of ref document: A1 |
|
| ENP | Entry into the national phase |
Ref document number: 2013503459 Country of ref document: JP Kind code of ref document: A |
|
| WWE | Wipo information: entry into national phase |
Ref document number: 14002185 Country of ref document: US |
|
| NENP | Non-entry into the national phase |
Ref country code: DE |
|
| 122 | Ep: pct application non-entry in european phase |
Ref document number: 12755731 Country of ref document: EP Kind code of ref document: A1 |