[go: up one dir, main page]

HK1180161B - Managing predicted motion vector candidates - Google Patents

Managing predicted motion vector candidates Download PDF

Info

Publication number
HK1180161B
HK1180161B HK13107329.2A HK13107329A HK1180161B HK 1180161 B HK1180161 B HK 1180161B HK 13107329 A HK13107329 A HK 13107329A HK 1180161 B HK1180161 B HK 1180161B
Authority
HK
Hong Kong
Prior art keywords
motion vector
pmv
candidates
candidate
list
Prior art date
Application number
HK13107329.2A
Other languages
Chinese (zh)
Other versions
HK1180161A (en
Inventor
Thomas Rusert
Kenneth Andersson
Per Wennersten
Jacob STRÖM
Rickard Sjöberg
Original Assignee
Telefonaktiebolaget Lm Ericsson (Publ)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Telefonaktiebolaget Lm Ericsson (Publ) filed Critical Telefonaktiebolaget Lm Ericsson (Publ)
Publication of HK1180161A publication Critical patent/HK1180161A/en
Publication of HK1180161B publication Critical patent/HK1180161B/en

Links

Description

Managing predicted motion vector candidates
Technical Field
The application relates to a method of managing PMV candidates, a video encoding apparatus, a video decoding apparatus, and a computer readable medium.
Background
Recent video coding standards are based on hybrid coding principles, which include motion compensated temporal prediction of video frames and coding of frame residual signals. For efficient motion compensated temporal prediction, a block-based motion model is used to describe the motion of blocks of pixels across a frame. Each motion compensation block is assigned one motion vector (for uni-predictive temporal prediction, e.g., in P frames) or two motion vectors (for bi-predictive temporal prediction, e.g., in B frames). These motion vectors as well as the frame residual signal are encoded in the video bitstream.
At high compression ratios (or equivalently, at low video bit rates), motion vector coding takes a large portion of the total bit volume, especially in recent video coding standards such as h.264/AVC that use small motion compensation block sizes. In general, lossless predictive coding using motion vectors, i.e. the coding of motion vectors MV consists of: firstly, constructing a motion vector predicted value PMV for a vector to be coded; then, the difference DMV between the motion vector and the motion vector predictor is transmitted, where DMV = MV-PMV.
In h.264/AVC, the PMV is derived as the median of the motion vectors of three spatial neighboring blocks. Other methods also consider temporally neighboring blocks (i.e., co-located in neighboring frames) for motion vector prediction. Instead of using fixed rules to construct the PMV, the latest methods are proposed that explicitly signal the PMV to be used in the PMV candidate set PMV _ CANDS. Although this requires extra bits to signal one candidate in the set, it can save bits for motion vector coding in general, since DMV coding can be more efficient. That is, identifying the PMV and signaling the DMV can take fewer bits than signaling the MV independently.
The efficiency of a motion vector coding scheme using PMV candidate signaling depends on the suitability of available candidates in PMV _ CANDS. That is, the construction of the candidate list has a major impact on the coding performance. Existing methods typically use motion vectors from spatially surrounding blocks or temporally neighboring blocks (blocks co-located in neighboring frames). This construction of PMV _ CANDS, i.e. considering only a few surrounding blocks as a source of motion vector predictors, can be sub-optimal.
The number of candidates in PMV _ CANDS, i.e. the "length" of PMV _ CANDS, also has a major impact on coding efficiency. The reason is that the larger the number of candidates, the larger the number of bits required to signal one of the candidates, which in turn causes additional overhead and thus reduces compression efficiency. Existing approaches assume that a fixed number of candidates (e.g., spatially neighboring motion vectors) in PMV _ CANDS are valid for encoding of a video frame or sequence, and that the number of candidates can be reduced only if some are the same.
Motion vector encoding can require a significant proportion of the available bit rate in video encoding. Improving the number of PMV candidates may reduce the size of the difference value (DMV) that has to be signaled, but requires more signaling to identify a particular PMV. Therefore, in order to improve video coding efficiency, improved methods and apparatus are required to manage PMV candidates.
"An effective motion vector coding scheme based on minimum bit rate prediction", published by Sun Deuk Kim and Jong Beom Ra (IEEE Trans. Image Proc., Vol.8, No. 8, 1999, 8, p. 1117-. The predicted motion vector is selected from the three causal neighbouring motion vectors such that it results in the minimum bit rate in motion vector difference encoding. Then, the prediction error or motion vector difference and mode information for determining the predicted motion vector at the decoder are encoded and transmitted in order.
"compatibility-Based Scheme for motion vector Selection and Coding" (Video Coding Experts Group (VCEG) of ITU-Telecommunications Standardization Sector, reference number VCEG-AC06, Klagenfut, Austria, 7.2006) by Joel Jung and Guillame Laroche describes a method for reducing the cost of motion information in Video Coding. Two modifications to the selection of motion vector predictors are disclosed: improvements in the prediction of motion vectors that need to be transmitted; and a skip mode for increasing the number of macroblocks that do not require any motion information to be sent.
US 2009/0129464 to Jung et al relates to adaptive encoding and decoding. This document describes a method for transmitting image portions, whereby, in an encoding phase in which an encoding context is analyzed, parameters of a group of prediction functions that can be used for encoding are adapted. A first prediction descriptor is formed using the selected prediction function. A remainder between the first prediction descriptor and the current descriptor is determined and communicated.
Disclosure of Invention
A method of managing PMV candidates is provided. The method includes selecting the set of PMV candidates to be a subset of previously encoded motion vectors. The method also includes assigning a code value to each PMV candidate in the set of PMV candidates. The code values vary in length and are assigned to the PMV candidates in order of expected usage such that the PMV candidate with the highest expected usage has one of the shortest code values.
The above method allows less bits to be used to signal more frequently used PMV candidates. This provides an increase in coding efficiency.
The code value assigned to each PMV candidate may comprise any code system that produces code words of varying lengths. For example, code values may be assigned according to at least one of: arithmetic coding; variable length coding; and context-based adaptive arithmetic coding.
Any unnecessary PMV candidates may be removed from the set of PMV candidates. This ensures that the length of the list is not unnecessarily long, which would reduce coding efficiency. The PMV candidate may be determined to be unnecessary if at least one of the following conditions is satisfied: the PMV candidate is a duplicate of another PMV candidate in the set; the PMV candidate is determined to be within a threshold distance of an existing PMV candidate; and the PMV candidate will never be used since at least one candidate PMV candidate will allow the use of fewer bits to encode the motion vector.
Furthermore, a set of PMV candidates may be determined to be unnecessary if removing the set from the PMV candidate list would result in requiring a maximum of N extra bits to encode any single motion vector, where N is a predetermined threshold. Removing the set of PMV candidates from the PMV candidate list allows the remaining PMV candidates to be signaled using shorter codes and thus using fewer bits. However, removing the PMV candidate set from the PMV candidate list will result in some motion vectors having larger difference vectors, which will require more bits to encode. The savings of signaling which PMV candidate to use can exceed the maximum N extra bits required to signal some motion vectors over the average of several motion vectors.
Before assigning code values, the PMV candidates in the set of PMV candidates may be ordered according to the expected usage of the PMV. Subsequently, a code value may be assigned to the PMV candidate in the ordered list. Code values may be assigned in the order of the sorted entries such that the length of the assigned code values increases with the position of the PMV candidate in the list rather than decreasing.
PMV candidates may be identified by a scan pattern applied to a set of previously encoded blocks. The scan pattern may identify a particular block. The scan patterns may be arranged in the order of intended use. The scan patterns may be arranged in order of increasing distance of each identification block from the current block.
Each PMV candidate may correspond to a motion vector used to encode a previous block, which is a distance away from the current block. The code values may be sequentially assigned to the PMV candidates according to distances of respective previous blocks of the PMV candidates from the current block.
The distance may be measured in euclidean distance or chebyshev distance. The euclidean distance can be obtained by taking the square root of the sum of the squares of the differences between the x and y positions between the current block and the previous block. Sorting according to euclidean distance can be performed by sorting by squared euclidean distance, with a square root function not affecting the sorting. The manhattan distance is given by the sum of the absolute values of the following differences: the difference in x-coordinates of the current block and the previous block; and the difference in y-coordinates of the current block and the previous block. The chebyshev distance is defined as the larger of two values, the first value being the absolute value of the difference in x-coordinates of the current block and the previous block, and the second value being the absolute value of the difference in y-coordinates of the current block and the previous block.
First, PMV candidates may be ranked according to chebyshev distance; the PMV candidates with the same chebyshev distance may then be further ranked according to euclidean distance. This allows efficient ordering of PMV candidates.
The method may be employed for video encoding or video decoding, wherein the current block is a block encoded or decoded accordingly.
There is also provided a video encoding device comprising a processor arranged to select the set of PMV candidates to be a subset of previously encoded motion vectors. The processor is further arranged to assign a code value to each PMV candidate in the set of PMV candidates, wherein the code values vary in length and are assigned to the PMV candidates in order of expected usage such that the PMV candidate with the highest expected usage has one of the shortest code values.
There is also provided a video decoding device comprising a processor arranged to select the set of PMV candidates to be a subset of previously encoded motion vectors. The processor is further arranged to assign a code value to each PMV candidate in the set of PMV candidates, wherein the code values vary in length and are assigned to the PMV candidates in order of expected usage such that the PMV candidate with the highest expected usage has one of the shortest code values.
A computer-readable medium carrying instructions which, when executed by computer logic, cause the computer logic to carry out any of the methods disclosed herein is also provided.
Also provided herein are various methods for constructing a PMV candidate list PMV _ CANDS. The candidates in PMV _ CANDS are sorted and the use of one of the candidates in PMV _ CANDS is signaled from the encoder to the decoder. The signaling is arranged such that the first candidate in the list is assigned the shortest codeword of the candidates and subsequent candidates in the list are assigned codewords of non-reduced length (obviously any other equivalent mapping of candidates on codewords is equally applicable). Then, using a combination of several methods, the PMV _ CANDS set is constructed such that the most beneficial candidates for prediction are placed towards the beginning of the list. Also, using these methods, the candidates in the list are selected such that they allow efficient coding of motion vectors, and if only a few such candidates are available, the size of the list is reduced such that the length of the used codeword for signaling the candidates can be reduced. Finally, a method for efficiently signaling the use of PMV candidates is introduced.
Further, there is provided a method for encoding a motion vector, the method comprising: identifying a PMV candidate set; determining that a particular PMV candidate has coordinate values that enable candidate PMV candidates in the set of PMV candidates to encode a motion vector with an x-coordinate or a y-coordinate that is less than the coordinate values of the particular PMV candidate using fewer bits; determining that a motion vector is to be encoded using the particular PMV candidate; calculating a difference vector as a difference between the motion vector and the particular PMV; and encoding the difference vector without encoding the sign bit.
In addition, there is provided a method for decoding a motion vector, the method including: receiving an identity of a PMV candidate; receiving the difference without the sign bit; determining a plurality of potential motion vectors based on the possible sign values of the difference values; determining a lowest bit cost solution for encoding the identified potential motion vectors using available PMV candidates; and selecting a motion vector using the identified PMV candidate. In the case of receiving a DMV containing a difference value without a sign bit, two potential motion vectors are found. In the case of receiving a DMV containing two difference values without sign bits, four potential motion vectors are found.
By applying this method, some situations are identified, whereby the sign of the difference component (xdiff or ydiff) can only take one value. This can be done because the system selects the PMV candidate that minimizes bit cost. If the sign of the difference component is unknown, there are two possible motion vectors. In some cases, the system can identify that one of the motion vectors is to be encoded with the minimum bit cost using the indicated PMV, while the other motion vector is to be encoded with the minimum bit cost using a different PMV. Thus, the system can identify the sign of the difference as the sign assuming that the motion vector would be encoded with the minimum bit cost using the indicated PMV. Therefore, for some PMV candidates, it is not necessary to transmit a sign bit regarding at least one of the difference components, thereby reducing bit cost and improving coding efficiency.
Drawings
An improved method and apparatus for managing PMV candidates will now be described, by way of example only, with reference to the accompanying drawings, in which:
fig. 1 illustrates a video encoding and transmission system;
fig. 2a and 2b illustrate the use of PMV candidate lists during encoding and decoding, respectively;
FIG. 3 shows examples of two PMV candidates;
FIG. 4 illustrates the bit cost of encoding a motion vector using the example motion vector of FIG. 3; and
fig. 5 illustrates a method disclosed herein.
Detailed Description
Fig. 1 shows a video encoding system in which a video signal from a source 110 is ultimately delivered to a device 160. The video signal from source 110 passes through an encoder 120 that includes a processor 125. Encoder 120 applies an encoding process to the video signal to create an encoded video stream. The encoded video stream is sent to a transmitter 130 where it may receive further processing, such as packetization, prior to transmission. The receiver 140 receives the transmitted encoded video stream and passes it to the decoder 150. Decoder 150 includes a processor 155. processor 155 is employed in decoding the encoded video stream. Decoder 150 outputs the decoded video stream to device 160.
The methods disclosed herein are performed in an encoder during encoding and also in a decoder during decoding. This can be done even if the generation of the signaling bits is done in the encoder. During decoding, the decoder parses the bits and emulates the encoder in order to achieve encoder/decoder synchronization. Since the encoder and decoder follow the same rules to create and modify the PMV candidate set, the respective PMV candidate lists stored in the encoder and decoder remain synchronized. Explicit signaling of the PMV candidate list may still be performed in some circumstances.
The described method assumes that the Motion Vector (MV) 210 is coded using predictive coding techniques, where the MV 210 is predicted using a Predictive Motion Vector (PMV) 220 and the prediction error or Difference (DMV) is found 230 from DMV = MV-PMV. DMV 230 is signaled from encoder 120 to decoder 150. In addition, a code "index" 250 is sent to select a particular PMV candidate, in this case 242 from the PMV candidate list (i.e., PMV _ CANDS 240 as shown in fig. 2 a). The index 250 may be transmitted once with each transmitted motion vector MV 210, i.e., each sub-block (e.g., 8 x 8 pixel block). Also, an index may be sent for a motion vector group (e.g., each macroblock (16 × 16 block)).
The PMV candidate list PMV _ CANDS (240) has N elements, i.e., PMV _1 (241), PMV _2 (242), PMV _3 (243), etc. The list PMV _ CANDS 240 is equally available at the encoder 120 and the decoder 150. Using the transmitted index, the decoder 150 can determine the PMV 220 used in the encoder, as shown in fig. 2b, and can thus reconstruct MV = DMV + PMV.
There are two main operations for constructing the PMV candidate set 240, namely initialization and update.
"initialization" means that some predefined state of the list is established. The PMV _ CANDS list may be initialized, for example, to an empty list (zero entries) or a list with one or more predefined entries, such as a zero vector (0, 0). "update" means adding one or more motion vectors to the existing PMV _ CANDS list. The PMV _ CANDS list can be updated to include previously coded motion vectors MV. At the encoder, when encoding a current block, PMV _ CANDS may include, in addition to any predefined initialization vectors, motion vectors associated with previously encoded blocks in the video. By constraining the possible candidates in PMV _ CANDS to a predefined vector and a previously encoded vector, the decoder can derive the list PMV _ CANDS in the same way as the encoder.
Alternatively, one or more motion vector candidates that have not been previously encoded may be added to the PMV _ CANDS list at the encoder, and those motion vectors will then be explicitly signaled to the decoder for use with PMV _ CANDS, enabling PMV _ CANDS to be updated in the same way at both the encoder and decoder.
The PMV _ CANDS list used for encoding the motion vector MV associated with the current motion compensation block can be generated dynamically, in particular for the current motion compensation block, i.e. without taking into account the PMV _ CANDS list used for encoding the motion vector MV associated with the previously encoded motion compensation block. In this case, before a certain block is processed, the PMV _ CANDS list is initialized and then updated with a number of previously coded or predefined motion vectors. Alternatively, the PMV _ CANDS list may be initialized once (e.g., before the video encoder/decoder starts up, or before a certain frame is processed, or after a number of macroblocks are encoded in a certain frame) and then used to encode more than one motion vector, with the advantage that a potentially complex process of deriving the PMV _ CANDS list only needs to process once to encode the set of motion vectors. However, when used to encode more than one motion vector, the PMV _ CANDS list may be updated after encoding one of the motion vectors. For example, a motion vector MV associated with a first motion compensation block may be first encoded using a PMV _ CANDS list, then the PMV _ CANDS may be updated using the vector MV (e.g., adding the MV to the list), then a second motion compensation block may be encoded using PMV _ CANDS. The list is updated according to the sliding window method by subsequently updating PMV _ CANDS using the encoded motion vectors.
During video encoding, one or more PMV _ CANDS lists may be maintained according to a sliding window method, e.g., one per frame type, one per macroblock type, or one per reference frame. When encoding two motion vectors associated with a bi-predictive motion compensated block, a single or two different PMV _ CANDS lists may be used.
Before processing the motion vector associated with the current motion compensation block, the PMV _ CANDS list used to encode the current motion vector may be updated by using the motion vectors associated with the surrounding blocks.
The PMV _ CANDS list may be updated such that motion vectors associated with close motion compensation blocks are inserted towards the beginning of the PMV _ CANDS list (signaled with fewer bits), while motion vectors associated with more distant motion compensation blocks may be inserted towards the end of the PMV _ CANDS list. Possible metrics for determining how far a motion compensated block is from the current block include: euclidean distance (dx)2+dy2Where dx and dy are distances in the x and y directions, respectively); manhattan distance (sum of absolute values | dx | + | dy |); or chebyshev distance (maximum value max in absolute value (| dx |, | dy |),it is also called maximum measure, chessboard distance, frame distance,L Measure (L-infinite measure) orL Reference). To this end, an outward scan may be performed around the current block to obtain a motion vector, thereby updating PMV _ CANDS. The scan may terminate when at least one of the following conditions is met:
-all blocks of the current frame have been scanned,
all blocks of a predefined number of subsequent frames (e.g. the last frame) have been scanned,
as soon as a certain distance is reached,
-once a predefined number of unique PMV candidates are found,
-all blocks have been scanned for a predetermined scanning pattern.
Note that the list may be kept sorted by avoiding sorting the list by distance in the outbound scan by inserting a unique vector at the end of the PMV _ CANDS list, with the spatially closest vector at the head of the list.
The motion vector to be added to the PMV _ CANDS list may comprise spatial or temporal neighbors of the current block, or a combination of spatial and/or temporal neighbors, e.g. h.264/AVC type median predictors derived based on spatial neighboring blocks.
As an alternative to scanning motion vector candidates in view of predefined neighbors, a decoder may be signaled from the encoder (and thus dynamically decided at the encoder) for each motion vector or set of motion vectors (e.g., macroblock), whether the associated motion vector is added to the PMV _ CANDS list.
Among the set of possible mechanisms for determining motion vector candidates, one mechanism or a combination of mechanisms may be dynamically decided at the encoder and then signaled to the decoder.
Limiting and/or reducing the number of candidates in PMV _ CANDS can help reduce the overhead of signaling which PMV to use for motion vector prediction, since the shorter the list, the shorter the required codeword. In addition, constraining the addition of certain candidates can make room to add other more beneficial candidates.
One measure for reducing the number of candidates is to avoid the repeated occurrence of the same motion vector in a given PMV _ CANDS list. This can be done when updating the list by comparing the candidates already in the list with the new vectors that can be added and if a duplicate is found removing the duplicate vector or skipping the new vector. It is preferable to skip the new vector; otherwise, subsequent iterations from far blocks may result in placing higher ordered candidates in the list to the end of the list.
Removing or skipping new motion vectors may also be done for similar but not identical motion vectors, e.g., pairs of motion vectors having a similarity measure less than a predefined threshold, where the similarity measure may be the euclidean distance (x)0-x1)2+(y0-y1)2Or absolute distance | x0-x1|+|y0-y1L where (x)0, y0) And (x)1, y1) Is the motion vector pair under consideration. Instead of a straight-line distance measure, another approach is to focus on the number of bits required to encode the distance between motion vectors using a given coding scheme.
Also, the number of candidates in PMV _ CANDS may be limited to a predefined or dynamically obtained number. It is possible that once this number is reached, additional candidates will be added and then the candidate at the end of the PMV _ CANDS list may be removed. This can be done because the candidate list is ordered such that the PMV candidate that determines the end of the list is the least likely candidate to be used.
Alternatively, the removal of a candidate from the PMV _ CANDS list may be explicitly signaled from the encoder to the decoder (and thus dynamically decided by the encoder), e.g., by sending the code used to remove the motion vector candidate from the list and an identifier (e.g., index) of the motion vector.
How to determine the order of the motion vector candidates in the candidate list will now be addressed. It is assumed that the candidates in PMV _ CANDS are sorted and that one of the candidates in PMV _ CANDS is signaled for prediction such that the first candidate in the list is assigned the shortest codeword of the candidates and subsequent candidates in the list are assigned codewords of non-reduced length. When updating the PMV _ CANDS list, the following method may be used in order to order the candidates in a way that is beneficial to the overall coding efficiency.
●, the motion vector corresponding to a block that is close to the current block (using some distance metric) will get a position closer to the start of the list than the motion vectors belonging to blocks that are far from the current block.
● the motion vector associated with the last coded block is placed at the beginning of the list (shortest codeword). Alternatively, a combination candidate for the current block, such as the h.264/AVC median predictor (or the like), is placed at the beginning of the list. Combining this approach with dynamic adaptation of the PMV _ CANDS list size allows for a guaranteed prediction performance of the median prediction value, e.g. h.264/AVC, since it is possible to set the PMV _ CANDS list size to 1, such that no bits need to be sent for index signaling.
● can order the candidates according to their frequency of occurrence in previously coded blocks (or other candidates having, for example, euclidean or absolute distances below a predefined threshold) such that vectors describing typical motion in a video frame or sequence are assigned short codewords. Alternatively, if a new candidate copy is already in the list, the copy can be removed and a new vector added to the beginning of the list, or as another alternative, an existing motion vector can be moved up the list one or more steps.
● it can also be useful to include weights on the motion compensated partition size so that motion vectors with larger weights are placed further away from the beginning of the PMV _ CANDS list than those with smaller weights. For example, larger partitions can be more reliable than smaller partitions in the sense that, depending on the encoded sequence, the associated motion vector may be more likely to describe typical motion in the sequence. Thus, motion vectors associated with larger partitions may be assigned greater weight. Also, skipped motion vectors may be trusted differently, e.g., assigned less weight, than non-skipped motion vectors.
Alternatively, the reordering of the PMV _ CANDS list can be explicitly signaled from the encoder to the decoder (and thus dynamically decided by the encoder) by, for example, sending a code for reordering of the motion vector candidates from the list, along with an identifier (e.g., index) of the motion vector to be moved and a signal on where to move the candidate.
When a motion vector candidate is added to or obtained from the PMV _ CANDS list (in the latter case, in order to use it for prediction), it may be modified according to a predefined method. Since the modification when adding (during encoding) or obtaining (during decoding) is equivalent, it can be assumed without loss of generality that the vector is modified when obtaining. Such modifications as obtained may include:
● scale the motion vector candidates according to the frame distance of the reference frame predicted using the motion vector candidates. For example, assume that the candidate motion vector MV in PMV _ CANDS has been applied(T-1)= (X, Y) to perform motion compensated prediction from a reference frame representing video at time T-1, where the reference frame is the next frame to the current frame assumed to represent video at time T. Now, if the candidate is obtained from PMV _ CANDS for predicting the motion vector pointing to the reference frame (the next two frames of the current frame) representing the video at time T-2, the motion vector magnitude can be scaled by a factor of 2, i.e., (2X, 2Y). And, if the candidate motion vector (X, Y) in the PMV _ CANDS list refers to a video frame at T-2 to be used to reference a frame at T-1, then the motion vector can be scaled to (X/2, Y/2). For both cases, the result is a duplicate candidate motion vector, in which case it can be removed. Under the assumption of linear motion, candidate motionThe scaling of the motion vector is reasonable.
● similarly, when a motion vector predictor MV is obtained in a B frame representing time T(T-1)= X, Y, and the motion vector has been exploited for motion compensated prediction from the left reference frame (time T-1), and the predictor is now to be used to predict the vector for motion compensated prediction from the right reference frame (time T + 1), then the sign of the motion vector predictor may be opposite, i.e., -X, -Y.
The candidate predictor list size can vary. Limiting and/or reducing the number of candidates in PMV _ CANDS can help reduce the overhead of signaling which PMV to use for motion vector prediction, since the shorter the list, the shorter the required codeword. On the other hand, depending on the video sequence characteristics, it may be beneficial to have a larger number of motion vector prediction candidates, e.g. in order to save bits for DMV coding in case of irregular motion. The following method can be used to adapt the size of the PMV _ CANDS list according to the video sequence characteristics.
● the list size can be defined in the slice/frame/picture header or in the sequence-wide header (e.g., parameter set), i.e., signaled from the encoder to the decoder, and thus dynamically adapted by the encoder.
● can remove from the list candidates that have not been used for prediction during encoding of a number of previously encoded blocks (according to a predefined threshold), thereby reducing the list size.
● the list size may be adapted according to the similarity of the candidates in the list. For example, when updating the list using motion vectors MV, the number of candidates that are similar to MV (according to a distance measure such as euclidean or absolute distance, by a predefined threshold) is counted. A high count indicates a high number of similar candidates, and since it may not be necessary to have many similar candidates, at least one of them may be removed and the list size reduced. On the other hand, a low number of similar candidates may indicate that it may be beneficial to have additional candidates, so that the list size may be increased.
As described above, the candidates in PMV _ CANDS are sorted and one of the candidates in PMV _ CANDS is signaled to be used such that the first candidate in the list is assigned the shortest of the candidates and subsequent candidates in the list are assigned codewords of non-reduced length. Such codewords can be defined according to, for example, a Variable Length Coding (VLC) table. The VLC table used can depend on the maximum number of candidates (list size) in PMV _ CANDS as dynamically adapted, for example, according to the above method. Table 1 below presents some examples of VLC codes for different maximum list sizes. The left column shows the maximum list size, which is again denoted C. In the right column VLC codes are shown, as well as indices for addressing candidates in the PMV _ CANDS list.
Table 1: example VLC codes for different maximum list sizes.
For bi-predictive motion compensated blocks, two motion vectors are encoded, and thus two PMV candidates can be necessary. In this case, the index numbers of the two PMV candidates can be jointly encoded to further reduce the number of bits required for index encoding. Table 2 shows an example for joint index coding, which considers that two motion vectors use the same PMV _ CANDS list, and that it is possible for two motion vectors to use the same predictor in the PMV _ CANDS list. Here, idx0 and idx1 denote indexes of the first and second predicted values, respectively. VLC0(idx, C) denotes VLC according to index "idx" of table 1, considering a maximum list size of C.
Table 2: VLC codes for encoding two candidate indices associated with a bi-prediction block,
c: maximum size of PMV _ CANDS.
Unnecessary PMV candidates are removed from the PMV _ CANDS candidate list. The mechanism for removing PMV candidates is required because some of the candidates in the list are likely never used, since selecting candidates with shorter codewords and encoding their distance will give a bit sequence with shorter or the same length than all possible motion vectors. In this case, they can be removed, making the list shorter and the average bit length of each index shorter. Alternatively, it is possible to insert more candidates instead. In this way, the average bit length remains the same, but the newly inserted candidate has an opportunity to be used.
As an example, suppose we have the following candidates:
and, suppose we encode the difference DMV, (xdiff, ydiff), where xdiff and ydiff are encoded using table 3 below. If we want to encode a motion vector such as MV = (0,2), we can encode it using candidate 3 (i.e., PMV = (0, 2)) plus the difference DMV = (0, 0):
PMV+DMV=MV
(0, 2)+(0, 0)=(0, 2)
the index takes four bits (the code length of index 3 is four bits) and each zero in the difference takes one bit, so the total number of bits required to encode MV = (0,2) is 4+1+1=6 bits.
However, we can also encode the vector using the index 0 (i.e., PMV = (-1, 2)) plus the difference DMV = (1, 0):
(-1, 2)+(1, 0)=(0, 2)
the index takes one bit (the code length of index 1 is one bit), the term xdiff =1 in the difference takes three bits (see table 3 below), and the term ydiff =0 takes one bit. Therefore, we get a total of 1+3+1=5 bits, which is better than using index 3, which requires 6 bits. It is readily seen that given the use of table 3 below to encode the vector difference, it would not be beneficial to use index 3, since using index 0 would always be one bit less or better. Therefore, we can eliminate the candidate vector (0,2) and instead get:
now, index 4 (i.e., vector (3, 4)) has a shorter code; it is now three bits instead of four. Thus, if vector (3, 4) is used, we benefit from the cancellation and nothing is lost. In the above example we have removed the PMV candidate with index 3, but it should be obvious to the person skilled in the art that this is only one example. For example, in some cases, it may also be beneficial to remove candidates 1 and 2.
Since the same analysis is performed in both the encoder and decoder, the same vector is removed from the list in both the encoder and decoder. Thus, after removal, both the encoder and decoder will use the same candidate list.
Sometimes it is not possible to ensure a benefit by removing a single candidate, but it is possible if two or more candidates are eliminated at the same time. Another possibility is that changing the order of candidates or even adding new candidates to the list can allow removing candidates that have now become unnecessary and thus become beneficial regardless of the final motion vector to be encoded.
Table 3: cost of sending the difference
The number of bits required can be further reduced by not sending a sign bit, which may be because, in some cases, a sign bit for the difference is not necessary. For example, suppose we have the following list of PMV candidates:
suppose we want to encode a vector using a PMV candidate with index number 3, i.e. PMV = (11, 3). Since the PMV candidate with index number 4 is to its right (with the coordinates PMV = (12, 3)), if the x-coordinate is 12 or greater, it is advantageous to encode it using candidate 4 instead. As one example, the vector MV = (15, 2) can be encoded using candidate 3 as:
(11, 3)+(4, -1)
the index takes four bits, +4 seven bits, and-1 three bits (see table 3), for a total of 14 bits. However, it can also be encoded using candidate 4 as:
(12, 3)+(3, -1)
this takes four bits for the index, +3 takes five bits, and-1 takes three bits, taking 12 bits total. Since candidate 4 will always be closer to any point in the right half, it will be advantageous to select candidate 4 for this. Similarly, if we are in the left half (as seen from the point between (11, 3) and (12, 3)), then it is better to select candidate 3.
This means that it is not necessary to specify a sign bit for the difference in the x component, since it will always be negative for (11, 3) and positive for (12, 3). In table 3, the sign bit is the last bit, except for 0, which has no sign bit. This means that if candidates 3 or 4 are to be selected, they will be one bit less for encoding.
Of course, the decoder will do the same analysis and if the above occurs, reading the sign bit will be avoided.
It is still possible to avoid sending the sign bit, even if the candidates are not exactly next to each other, or even if they do not have exactly the same cost, at least for one of the candidates. As an example, assume we want to encode a certain value using a PMV candidate with index number 5 (i.e., PMV = (1, 2)). If the x-coordinate of the vector to be encoded is less than or equal to zero, it is always advantageous to switch to candidate 0 because it has the lowest cost. This means that for candidate 5, it is not necessary to send the sign bit for the x coordinate. However, for candidate 0, it is not possible to remove the sign bit of the x component. Since its index value can be encoded with only such a low cost, it may be advantageous to select it even if the vector to be encoded is to the right of candidate 5.
If index 0 has the same cost as index 4, then two candidates would be equally good for encoding a vector with an x coordinate of 0. However, in such a case, we may decide to always use the lowest index and therefore still avoid sending the sign bit when selecting index 4. If the vectors are located in the same row (as described above) or indeed in the same column, it is possible to derive a generic expression when the transmit sign bit is never useful, as described below.
Referring to fig. 3, suppose we have two candidates a = (Ax, Ay) and B = (Bx, By) on the same line (thus, By = Ay), and the distance between them is D, thus Bx = Ax + D. In the following we will assume that D is positive, but the skilled person will understand that it is also valid if we swap positions of a and B. It is also assumed that transmitting the index of candidate a never costs more than transmitting the index of candidate B, i.e. cost (a _ index) < = cost (B _ index), where cost (a _ index) is the cost of transmitting the index associated with candidate a. The cost of transmitting the difference k in the x direction is denoted as cost _ x (k). For example, according to Table 3, cost _ x (-3) is equal to 5.
Now, if cost (a _ index) -cost (B _ index) + cost _ x (D-1) -3< =0, we do not have to send the sign bit of candidate B. As an example, if a = (11, 2), B = (13, 2), and a _ index is 1, and B _ index is 0001, then D =2, and cost (a _ index) -cost (B _ index) + cost _ x (D-1) -3 equals 1-4+3-3= -2, which is less than 0, so we do not need to send the sign bit of B. This example is illustrated in fig. 4, where the x-axis and y-axis show the x-and y-components of the motion vector. Each box in fig. 4 represents a motion vector; the bit cost for encoding the respective motion vector using PMV candidate a is shown on the left side of the box and the bit cost for encoding the respective motion vector using PMV candidate B is shown on the right side of the box. As can be seen from fig. 4, for MVs with x components of 12 or less, the most efficient PMV to be used is a, and for MVs with x components of 13 or more, the most efficient PMV to be used is B.
In one embodiment of the proposed solution, we use a maximum of four candidates in the list. However, in another embodiment, we use seven and in principle there is no maximum limit. If we allow a larger maximum, the list can grow and the chance that a suitable vector can be found increases. On the other hand, the number of bits required to specify a candidate vector also increases. On top of this, the problem is that several candidates can be used to represent multiple vectors, which is not necessary. This redundant representation evolves to add more vectors.
One way to avoid such redundant representations is to constrain the number of vectors that may be encoded with each candidate vector. For example, it is possible to constrain a candidate such that it can only encode motion vectors that are exactly equal to the candidate or that differ by one step in one direction. This can be done by changing the way the difference is encoded. Typically, the difference values are encoded using table 3, where x and y are encoded separately. Alternatively, we can use the following short table:
such constrained difference coding may be used for candidates larger than a certain index. For example, all candidates with indices of 3 or greater may be encoded in this manner.
This has at least two advantages:
1) the coding of the difference becomes very short, which is good because the cost of signaling an index of 3 or more is rather high; and
2) the candidate will not spend bits covering motion vectors that would be better encoded anyway using some other candidate vector. Thereby alleviating the redundancy problem described above.
Fig. 5 illustrates a method according to the present application. At 510, a set of PMV candidates is selected from a set of previously used motion vectors. The previously used motion vectors are those used during encoding of a previous block in the frame. At 520, duplicate PMV candidates may be removed from the set. At 530, the PMV candidates in the set are ranked according to expected usage. The expected usage may be calculated based on the most recently encoded video. The expected usage may be determined from the proximity of the current block to the block using the PMV candidate as a motion vector using some distance metric. At 540, code values are assigned to the PMV candidates, which vary in length. The shortest code value is assigned to the PMV candidate with the highest expected usage. Subsequent code values have non-decreasing lengths.
The methods and apparatus described herein improve coding efficiency using a motion vector prediction scheme that signals motion vector predictors.
Those skilled in the art will appreciate that the exact order and content of the actions performed in the methods described herein may vary depending on the requirements of a particular set of execution parameters. Accordingly, the order in which actions are described and/or claimed should not be construed as a strict limitation on the order in which the actions are performed.
Furthermore, although examples are given in the context of particular coding standards, these examples are not intended as limitations on the coding standards to which the disclosed methods and apparatus may be applied. For example, although specific examples are given in the context of H.264/AVC, the principles disclosed herein are also applicable to the MPEG2 system, other coding standards, and virtually any coding system that uses predictive motion vectors.

Claims (12)

1. A method of managing predicted motion vector candidates, the method comprising:
selecting a set of predicted motion vector candidates as a subset of previously encoded motion vectors;
scaling the prediction motion vector candidate according to a frame distance of a reference frame using the previously encoded motion vector;
each predictive motion vector candidate in the set of predictive motion vector candidates is assigned a code value, wherein the code values vary in length and are assigned to the predictive motion vector candidates in order of expected usage such that the predictive motion vector candidate with the highest expected usage has one of the shortest code values.
2. The method of claim 1, wherein the code value assigned to each predictive motion vector candidate is assigned according to at least one of: arithmetic coding; variable length coding; and context-based adaptive arithmetic coding.
3. The method of claim 1 or 2, further comprising removing unnecessary predicted motion vector candidates from the set of predicted motion vector candidates.
4. The method of claim 1 or 2, the method further comprising ordering the predicted motion vector candidates in the set of predicted motion vector candidates according to expected usage.
5. The method of claim 1 or 2, wherein the expected usage of the predicted motion vector candidate is obtained according to a frequency of usage of the predicted motion vector.
6. The method of claim 1 or 2, wherein each predicted motion vector candidate corresponds to a motion vector used for encoding a previous block, said previous block being at a distance from the current block, and wherein the intended use of the predicted motion vector candidate is obtained from the distance of the respective previous block of the predicted motion vector candidates from the current block.
7. The method of claim 6, wherein the distance is measured in euclidean distance.
8. The method of claim 6, wherein the distance is measured as a Chebyshev distance.
9. The method of claim 8, wherein the code value is assigned to a predicted motion vector candidate having a common chebyshev distance according to a euclidean distance of the predicted motion vector candidate having the common chebyshev distance.
10. The method of claim 6, the method being for video encoding or video decoding, wherein the current block is a respective encoded or decoded block.
11. A video encoding apparatus comprising a processor arranged to:
selecting a set of predicted motion vector candidates as a subset of previously encoded motion vectors;
scaling the prediction motion vector candidate according to a frame distance of a reference frame using the previously encoded motion vector;
each predictive motion vector candidate in the set of predictive motion vector candidates is assigned a code value, wherein the code values vary in length and are assigned to the predictive motion vector candidates in order of expected usage such that the predictive motion vector candidate with the highest expected usage has one of the shortest code values.
12. A video decoding apparatus comprising a processor arranged to:
selecting a set of predicted motion vector candidates as a subset of previously encoded motion vectors;
scaling the prediction motion vector candidate according to a frame distance of a reference frame using the previously encoded motion vector;
each predictive motion vector candidate in the set of predictive motion vector candidates is assigned a code value, wherein the code values vary in length and are assigned to the predictive motion vector candidates in order of expected usage such that the predictive motion vector candidate with the highest expected usage has one of the shortest code values.
HK13107329.2A 2010-02-05 2010-12-23 Managing predicted motion vector candidates HK1180161B (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US61/301649 2010-02-05

Publications (2)

Publication Number Publication Date
HK1180161A HK1180161A (en) 2013-10-11
HK1180161B true HK1180161B (en) 2018-02-15

Family

ID=

Similar Documents

Publication Publication Date Title
CN102860006B (en) Management prognostic motion vector candidate
JP7271768B2 (en) Candidate list sharing method and apparatus using such method
JP7727024B2 (en) Method for deriving predicted motion vector and device using the method
JP6995952B2 (en) Inter-prediction method and its device
KR101366648B1 (en) Decoder For Video Information
KR101873767B1 (en) Method and apparatus for processing a video signal
JP6005762B2 (en) Video encoding / decoding method and apparatus
JP2016007065A (en) Method and device for encoding/decoding motion vector based upon candidate for reduced prediction motion vector
JP5983430B2 (en) Moving picture coding apparatus, moving picture coding method, moving picture decoding apparatus, and moving picture decoding method
CN117426087A (en) Method and apparatus for geometry partitioning mode with motion vector refinement
JP2024524402A (en) Method and device for geometric partition mode with motion vector refinement - Patents.com
HK1180161B (en) Managing predicted motion vector candidates
HK1180161A (en) Managing predicted motion vector candidates
WO2012090397A1 (en) Video encoding device, video encoding method, and video encoding program, and video decoding device, video decoding method, and video decoding program
CN117643054A (en) Geometry partitioning mode with motion vector refinement
JP2019146252A (en) Method and device for encoding/decoding motion vector based on reduced prediction motion vector candidate