WO2005104560A1

WO2005104560A1 - Method of processing decoded pictures.

Info

Publication number: WO2005104560A1
Application number: PCT/IB2005/051326
Authority: WO
Inventors: Arnaud Bourge; Joël JUNG; Luis Escobar
Original assignee: Koninklijke Philips Electronics NV
Current assignee: Koninklijke Philips NV
Priority date: 2004-04-27
Filing date: 2005-04-22
Publication date: 2005-11-03
Anticipated expiration: 2006-10-27

Abstract

The present invention relates to a method of and a device for processing decoded pictures having a predetermined resolution. Said device comprises: a filter (DSF) for down-sampling data values of the decoded pictures so as to deliver down-sampled data values; an embedded compression unit (eENC) for encoding the down-sampled values so as to deliver compressed data values; a memory (MEM) for storing the compressed data values; an embedded decompression unit (eDEC) for decoding the stored compressed values so as to deliver uncompressed data values; and a filter (USF) for up-sampling the uncompressed data values so as to deliver up-sampled data values at the predetermined resolution.

Description

Method of processing decoded pictures

FIELD OF THE INVENTION

The present invention relates to a method of and a device for processing decoded pictures having a predetermined resolution. This invention may be used in, for example, video decoders, video encoders or portable apparatuses, such as personal digital assistants or mobile phones, said apparatuses being adapted to decode or to encode pictures.

BACKGROUND OF THE INVENTION

Low power consumption is a key feature of mobile devices. Mobile devices now provide video encoding and decoding capabilities that are known to dissipate a lot of energy. So-called low-power video algorithms are thus needed. As a matter of fact, accesses to an external memory such as SDRAM are a bottleneck for video devices. This is due both to power consumption, as memories are known to be the most power-consuming part of a system, and to speed limitations, due to the bandwidth of the exchanges between a central processing unit CPU and the memory. In conventional video decoders, the motion compensation module needs many such accesses because it constantly points to blocks of pixels in so-called reference frames.

To overcome this problem, so-called "embedded compression" has been proposed. Embedded compression was originally developed to decrease the memory size at the expense of a quality decrease, due to lossy compression of the reference frame(s). An example of embedded compression is described in "Low-power H.264 video decoder with graceful degradation", by A. Bourge and J. Jung, Proc. of VCIP, Electronic Imaging, San Jose, California, USA, January 2004.

Such an embedded compression is shown in Figure 1 applied to an H.264 decoder. Said decoding device comprises:
- a variable length decoding block VLD suitable for decoding an encoded input data stream BS and for delivering decoded data, on the one hand, and decoded motion vectors MV to an image memory, on the other hand,
- an inverse quantizing block IQ suitable for producing dequantized data from the decoded data,
- an inverse frequency transform block IT, for example an inverse discrete cosine transform block IDCT, for producing inversely transformed data representing a residual error e from the dequantized data.

The decoding device further includes an adder for adding motion-compensated data to the residual error, data block by data block. The motion-compensated data are produced by a modified motion compensation unit MMC comprising in series an embedded compression unit eENC, an image memory MEM, an embedded decompression unit eDEC and a motion compensation unit MC. The output of the adder is a data block of the decoded output image OF, which is then delivered to a display (not represented) and which is also delivered to the embedded compression unit eENC. The decoding device optionally comprises a deblocking filter FIL, said filter being for example the one proposed in the H.264 standard.

The embedded compression unit eENC comprises, for example, a transform block, a quantization block, a variable-length coding block and a buffer in series. It further comprises a regulation unit connected between the buffer and the quantization block so as to achieve a given compression ratio. The embedded decompression unit comprises, for example, a variable-length decoding block, an inverse quantization block and an inverse transform block in series.

However, it has since been shown that these embedded compression techniques greatly impair the visual quality for high compression factors (i.e. compression factors higher than 3).
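
The regulation principle described above can be illustrated with a minimal Python sketch. It is not the embedded compression of the cited paper: uniform quantization followed by zlib is assumed here as a stand-in for the transform, quantization and variable-length coding blocks, and the names embedded_encode and embedded_decode are hypothetical. The loop only shows how a regulation unit might coarsen the quantizer until a given compression ratio is reached.

```python
import zlib


def embedded_encode(block: bytes, target_ratio: float) -> tuple[int, bytes]:
    """Return (quantization step, payload) such that the payload fits the target.

    Assumption: uniform quantization plus zlib stands in for the transform,
    quantization and variable-length coding blocks of the description.
    """
    budget = len(block) / target_ratio  # maximum payload size in bytes
    step = 1
    while True:
        quantized = bytes(value // step for value in block)
        payload = zlib.compress(quantized)
        # Regulation: coarsen the quantizer until the payload fits the budget.
        if len(payload) <= budget or step >= 256:
            return step, payload
        step *= 2


def embedded_decode(step: int, payload: bytes) -> bytes:
    """Invert embedded_encode up to the quantization error."""
    quantized = zlib.decompress(payload)
    # Reconstruct each value at the middle of its quantization interval.
    return bytes(min(255, value * step + step // 2) for value in quantized)
```

A higher target ratio forces a larger quantization step, which is the quality loss the background attributes to high compression factors.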

SUMMARY OF THE INVENTION

It is an object of the invention to propose a method of processing decoded pictures which provides better visual quality than that of the prior art for high compression factors. To this end, the method in accordance with the invention is characterized in that it comprises the steps of:
- down-sampling data values of the decoded pictures for delivering down-sampled data values;
- encoding the down-sampled data values for delivering compressed data values;
- storing the compressed data values;
- decoding the stored compressed data values for delivering uncompressed data values; and
- up-sampling the uncompressed data values for delivering up-sampled data values at the predetermined resolution.

As will be explained in more detail hereinafter, the visual quality resulting from a combination of an encoding step at a first compression factor with a down-sampling step at a given down-scaling factor is better than the visual quality resulting from an encoding step at a second compression factor equal to the product of the first compression factor and the down-scaling factor. Beneficially, the step of down-sampling is adapted to down-sample the data values of the decoded pictures in a horizontal direction.

The present invention also relates to a processing device implementing such a processing method.

It also relates to a video decoder comprising a decoding unit for providing a residual error, said processing device in series with a motion compensation unit adapted to deliver motion compensated data values, and an adder for adding the residual error to the motion compensated data values, the output of said adder being provided to the input of the processing device.

It further relates to a video encoder for encoding input data values, said encoder comprising an encoding unit for providing encoded data values, a partial decoding unit for providing partially decoded data values, the processing device in series with a motion compensation unit adapted to deliver motion compensated data values, an adder for adding the motion compensated data values to the partially decoded data values, the output of said adder being provided to the input of the processing device, and a subtracter for subtracting the motion compensated data values from the input data values.

The invention also relates to a portable apparatus comprising the processing device. The invention finally relates to a computer program product comprising program instructions for implementing said processing method.

These and other aspects of the invention will be apparent from and will be elucidated with reference to the embodiments described hereinafter.
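
The relation between the two factors stated in the summary can be checked with a short Python sketch. The function name stored_size_bytes, the CIF frame size and the 1.5 bytes per pixel (4:2:0 sampling) are illustrative assumptions, not values from the description. The sketch only shows that both configurations occupy the same amount of memory, so that the comparison in the summary is a comparison of visual quality at equal storage cost.

```python
def stored_size_bytes(width, height, bytes_per_pixel,
                      compression_factor, downscaling_factor=1):
    """Size of one stored reference frame after resizing and compression."""
    raw_size = width * height * bytes_per_pixel
    # The overall reduction is the product of the two factors.
    return raw_size / (compression_factor * downscaling_factor)


# Hypothetical example: one CIF frame (352x288) at 1.5 bytes per pixel.
compression_only = stored_size_bytes(352, 288, 1.5, compression_factor=6)
combined = stored_size_bytes(352, 288, 1.5, compression_factor=3,
                             downscaling_factor=2)
```

Both calls return 25344.0 bytes for this example frame.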

BRIEF DESCRIPTION OF THE DRAWINGS

The present invention will now be described in more detail, by way of example, with reference to the accompanying drawings, wherein:
- Figure 1 shows a decoding device in accordance with the prior art;
- Figure 2 shows a block diagram of an embodiment of a decoding device in accordance with the invention;
- Figures 3A and 3B show the results of a combination of an embedded compression and an embedded resizing, and of an embedded compression alone, respectively, for a same compression factor; and
- Figure 4 shows a block diagram of an embodiment of an encoding device in accordance with the invention.

DETAILED DESCRIPTION OF THE INVENTION

The present invention relates to a method of processing decoded data values included in a sequence of pictures. These data values are, for example, the luminance or the chrominance of pixels. Said processing method can be applied to a video decoder or to a video encoder.

The present invention can be applied to any video encoding or decoding device where sequences have to be stored in a memory. It is particularly interesting for reducing the size of the reference image memory while keeping a sufficient overall image quality of the decoded output image. For clarity, we focus on the case of a conventional video decoder (for example MPEG-2, MPEG-4, H.264, or the like). In such a decoder, a decoded frame generally needs to be stored in the memory so that it can be later retrieved to predict the next frame(s) through motion compensation.

Figure 2 shows a block diagram of an example of a decoding device according to the invention. Said decoding device comprises:
- a variable length decoding block VLD suitable for decoding an encoded input data stream BS and for delivering decoded data, on the one hand, and decoded motion vectors MV to an image memory, on the other hand,
- an inverse quantizing block IQ suitable for producing dequantized data from the decoded data,
- an inverse frequency transform block IT, for example an inverse discrete cosine transform IDCT, for producing inversely transformed data representing a residual error e from the dequantized data.

The decoding device further includes an adder for adding motion-compensated data to the residual error, data block by data block, in order to deliver the output frame OF. The motion-compensated data are produced by a modified motion compensation unit MMC comprising in series a down-sampling unit DSF, an embedded compression unit eENC as described in the prior art, an image memory MEM, an embedded decompression unit eDEC, an up-sampling unit USF and a motion compensation unit MC so as to reconstruct the reference frames data block by data block. As in the prior art, the embedded compression unit eENC comprises a transform block, a quantization block, a variable-length coding block and a buffer in series.
It further comprises a regulation unit connected between the buffer and the quantization block so as to achieve a given compression ratio. The embedded decompression unit comprises a variable-length decoding block, an inverse quantization block and an inverse transform block in series. It will be apparent to a person skilled in the art that the embedded compression and decompression can be realized using other means than quantizing and variable length coding means. It can be, for example, based on bit plane coding, as described by R.J. van der Vleuten, in "Low-complexity lossless and fine-granularity scalable near-lossless compression of color images", Proceedings of the Data Compression Conference, p. 477, April 2002.

Any linear or non-linear filter can be used for the down-sampling and up-sampling operations. In the preferred embodiment of the invention, the best trade-off between visual quality and computational complexity is, for down-sampling, the use of a 7-tap FIR (Finite Impulse Response) filter with the following weights: (-1/32, 0, 9/32, 16/32, 9/32, 0, -1/32); and for up-sampling, the use of a 6-tap FIR filter with the following weights: (1/32, -5/32, 5/8, 5/8, -5/32, 1/32), said filters being the ones used for sub-pixel motion compensation in the H.264 standard, as described in ITU-T Rec. H.264 / ISO/IEC 14496-10, "Advanced Video Coding", Final Committee Draft, Document JVT-F100, December 2002.

The output of the adder is a decoded data block of the decoded output image OF, which is then delivered to a display (not represented) and which is also delivered to the down-sampling unit DSF. The decoding device optionally comprises a deblocking filter FIL, said filter being for example the one proposed in the H.264 standard.

The size of the reference frame memory is then reduced by using a combination of the so-called embedded compression and the so-called embedded resizing, said embedded resizing comprising the down-sampling and up-sampling described above. With this approach, the complexity is also reduced, thanks to the down-scaling. Beneficially, the reference frames are only down-sampled and up-sampled horizontally by a factor of 2. In this case, after down-sampling, the image size is reduced by half. Therefore, the complexity of the embedded compression technique used is also reduced by half. The horizontal direction is preferred to the vertical direction, as the extraction from the memory is facilitated.
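
The horizontal resizing by a factor of 2 with the two sets of weights given above can be sketched in Python as follows. The description gives only the weights; the edge replication at the row borders, the choice of keeping the stored samples at even output positions, and the names downsample_row and upsample_row are assumptions made for illustration.

```python
DOWN_TAPS = (-1/32, 0, 9/32, 16/32, 9/32, 0, -1/32)  # 7-tap weights from the text
UP_TAPS = (1/32, -5/32, 5/8, 5/8, -5/32, 1/32)       # 6-tap weights from the text


def _sample(row, index):
    """Read row[index], replicating the edge samples (an assumption)."""
    return row[min(max(index, 0), len(row) - 1)]


def downsample_row(row):
    """Low-pass filter a row with the 7-tap filter, keeping every second sample."""
    out = []
    for center in range(0, len(row), 2):
        out.append(sum(w * _sample(row, center + k - 3)
                       for k, w in enumerate(DOWN_TAPS)))
    return out


def upsample_row(row):
    """Double the row length: copy stored samples, interpolate those in between."""
    out = []
    for i in range(len(row)):
        out.append(row[i])  # even output position: stored sample
        # Odd output position: half-sample between row[i] and row[i + 1].
        out.append(sum(w * _sample(row, i + k - 2)
                       for k, w in enumerate(UP_TAPS)))
    return out
```

Applying downsample_row to every row of a picture halves its width before embedded compression, and upsample_row restores the original width after decompression. Both sets of weights sum to 1, so flat areas keep their value.
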
Alternatively, the down-sampling method can be applied in the horizontal direction and in the vertical direction. The down-sampling factor can also be different from 2. Moreover, for high compression factors, e.g. 6, which provide the largest gains in terms of memory size and accesses, the visual quality resulting from a combination of the embedded compression at a compression factor of 3 with a horizontal down-sampling is better than the visual quality resulting from the embedded compression at a compression factor of 6. This point is illustrated in Figures 3A and 3B, where Figure 3A shows the result of a combination of an embedded compression with a compression factor of 3 with an embedded resizing in the horizontal direction (down-scaling factor of 2), and where Figure 3B shows the result of an embedded compression with a compression factor of 6.

Figure 4 shows an example of a video encoding device. Such an encoding device comprises a direct frequency transform block T, for example a direct discrete cosine transform DCT, suitable for transforming input video data IN into transformed data; a quantizing block Q suitable for producing quantized data from the transformed data; and a variable length coding block VLC suitable for producing coded data ES from the quantized data.
It also comprises a prediction circuit comprising in series an inverse quantizing block IQ; an inverse frequency transform block IT, for example an inverse discrete cosine transform block IDCT; an adder for adding the data block coming from the inverse transform block IDCT and from a motion compensation unit MC; the down-sampling unit DSF in accordance with the invention; an image memory MEM suitable for storing the images used by the motion compensation unit MC and the motion vectors resulting from a motion estimation unit ME; an up-sampling unit USF; and a subtracter suitable for subtracting the data coming from the motion compensation unit MC from the input video data IN, the result of this subtracter being delivered to the transform block DCT.

The proposed invention can be applied to any video encoding or decoding device where accesses to an external memory represent a bottleneck, either because of limited bandwidth or because of high power consumption. The latter reason is especially crucial in mobile devices, where extended battery lifetime is a key feature. The proposed encoder or decoder is suitable in situations that require large memory resources and/or power savings and that tolerate some degradation in video quality. The mobile device may comprise a switch that starts using embedded compression techniques with a compression factor of 3 when the battery is full, and then switches to the proposed invention when the battery runs low, which would be a practical case of power scalability.

Several embodiments of the present invention have been described above by way of examples only, and it will be apparent to a person skilled in the art that modifications and variations can be made to the described embodiments without departing from the scope of the invention as defined by the appended claims. Further, in the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The term "comprising" does not exclude the presence of elements or steps other than those listed in a claim. The term "a" or "an" does not exclude a plurality. The invention can be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In a device claim enumerating several means, several of these means can be embodied by one and the same item of hardware. The mere fact that measures are recited in mutually different independent claims does not indicate that a combination of these measures cannot be used to advantage.

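The power-scalability switch mentioned in the description could be sketched in Python as follows. The description states only that the device starts with embedded compression at a factor of 3 and switches to the combined scheme when the battery runs low; the threshold value, the battery level scale from 0 to 1, and the names MemoryMode and select_mode are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemoryMode:
    compression_factor: int
    horizontal_downscaling: int

    @property
    def overall_factor(self) -> int:
        """Overall reduction of the reference frame memory."""
        return self.compression_factor * self.horizontal_downscaling


def select_mode(battery_level: float,
                low_battery_threshold: float = 0.3) -> MemoryMode:
    """Pick the reference-frame storage mode from the battery level (0.0 to 1.0).

    The threshold of 0.3 is an arbitrary illustrative value.
    """
    if battery_level > low_battery_threshold:
        # Enough energy: embedded compression alone, factor 3.
        return MemoryMode(compression_factor=3, horizontal_downscaling=1)
    # Low battery: embedded compression combined with horizontal resizing.
    return MemoryMode(compression_factor=3, horizontal_downscaling=2)
```

With these assumed settings, the overall memory reduction goes from a factor of 3 to a factor of 6 once the battery level drops to the threshold or below.
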
Claims

1. A method of processing decoded pictures having a predetermined resolution, said method comprising the steps of: down-sampling data values of the decoded pictures for delivering down-sampled data values; encoding the down-sampled data values for delivering compressed data values; storing the compressed data values; decoding the stored compressed data values for delivering uncompressed data values; and up-sampling the uncompressed data values for delivering up-sampled data values at the predetermined resolution.

2. A method as claimed in claim 1, wherein the step of down-sampling is adapted to down-sample the data values of the decoded pictures in a horizontal direction.

3. A device for processing decoded pictures having a predetermined resolution, said device comprising: a filter (DSF) for down-sampling data values of the decoded pictures so as to deliver down-sampled data values; an embedded compression unit (eENC) for encoding the down-sampled values so as to deliver compressed data values; a memory (MEM) for storing the compressed data values; an embedded decompression unit (eDEC) for decoding the stored compressed values so as to deliver uncompressed data values; and a filter (USF) for up-sampling the uncompressed data values so as to deliver up-sampled data values at the predetermined resolution.

4. A video decoder comprising a decoding unit (VLD,IQ,IT) for providing a residual error, a processing device as claimed in claim 3 in series with a motion compensation unit (MC) adapted to deliver motion compensated data values, and an adder for adding the residual error to the motion compensated data values, the output of said adder being provided to the input of the processing device.

5. A video encoder for encoding input data values (IN), said encoder comprising an encoding unit (DCT,Q,VLC) for providing encoded data values (BS), a partial decoding unit (IQ,IDCT) for providing partially decoded data values, a processing device as claimed in claim 3 in series with a motion compensation unit (MC) adapted to deliver motion compensated data values, an adder for adding the motion compensated data values to the partially decoded data values, the output of said adder being provided to the input of the processing device, and a subtracter for subtracting the motion compensated data values from the input data values.

6. A portable apparatus comprising a device for processing data values as claimed in claim 3.

7. A computer program product comprising program instructions for implementing, when said program is executed by a processor, a method as claimed in claim 1.