WO2025000429A1 - Coding method, decoding method, code stream, coder, decoder, and storage medium - Google Patents
- Publication number
- WO2025000429A1 (application PCT/CN2023/104460)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- current
- grid
- current image
- current layer
- image
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/70—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards
Definitions
- the embodiments of the present application relate to the field of dynamic grid coding and decoding technology, and in particular, to a coding and decoding method, a bit stream, an encoder, a decoder, and a storage medium.
- the encoder can use the basic grid of the reference image to determine the grid information of the current image.
- however, the existing technical solutions are imperfect: they increase the coding bit rate of the shift coefficient, which reduces the grid compression performance.
- the embodiments of the present application provide a coding and decoding method, a bit stream, an encoder, a decoder and a storage medium, which can reduce the coding rate of the shift coefficient while ensuring the quality of grid reconstruction, thereby improving the geometric coding efficiency of the grid.
- an embodiment of the present application provides a decoding method, which is applied to a decoder, and the method includes:
- the geometric position information of the reconstructed grid of the current layer in the current image is determined according to the geometric position information of the initial grid of the current layer in the current image and the shift coefficient of the current layer in the reference image.
- an embodiment of the present application provides an encoding method, which is applied to an encoder, and the method includes:
- according to the geometric position information of the first reconstructed grid and the geometric position information of the original grid, it is determined whether to perform encoding processing on the shift coefficients of the current layer in the current image.
- an embodiment of the present application provides a code stream, which is generated by bit encoding according to information to be encoded; wherein the information to be encoded includes at least one of the following:
- the second syntax identification information is used to indicate whether the shift coefficient of the current sequence enables the first coding mode
- the third syntax identification information is used to indicate whether the shift coefficient of the current image enables the first coding mode
- the first syntax identification information is used to indicate whether the shift coefficient of the current layer in the current image uses the first coding mode
- the fourth syntax identification information is used to indicate whether the basic grid of the current image uses inter-frame processing; the current sequence includes the current image, and the LOD layers into which the current image is divided include the current layer.
- an encoder comprising a first determining unit, a first subdivision unit, a first reconstruction unit and an encoding unit, wherein:
- a first determining unit is configured to determine a base grid of the current image according to an original grid of the current image
- a first subdivision unit is configured to subdivide the base grid and determine geometric position information of an initial grid of a current layer in a current image
- the encoding unit is configured to determine whether to perform encoding processing on the shift coefficients of the current layer in the current image according to the geometric position information of the first reconstructed grid and the geometric position information of the original grid.
- an embodiment of the present application provides an encoder, the encoder comprising a first memory and a first processor; wherein,
- a first memory for storing a computer program that can be run on the first processor
- the first processor is used to execute the method described in the second aspect when running a computer program.
- an embodiment of the present application provides a decoder, the decoder comprising a second determination unit, a second subdivision unit, a decoding unit, and a second reconstruction unit, wherein:
- a second determining unit is configured to determine a base grid of the current image
- a second subdivision unit is configured to subdivide the base grid and determine geometric position information of an initial grid of a current layer in a current image
- a decoding unit configured to decode the code stream, determine the first syntax identification information; and determine the shift coefficient of the current layer in the reference image when the first syntax identification information indicates that the shift coefficient of the current layer in the current image uses the first decoding mode;
- the second reconstruction unit is configured to determine the geometric position information of the reconstructed grid of the current layer in the current image according to the geometric position information of the initial grid of the current layer in the current image and the shift coefficient of the current layer in the reference image.
- an embodiment of the present application provides a decoder, the decoder comprising a second memory and a second processor; wherein:
- the second processor is used to execute the method described in the first aspect when running a computer program.
- an embodiment of the present application provides a computer-readable storage medium, which stores a computer program.
- when executed, the computer program implements the method as described in the first aspect, or implements the method as described in the second aspect.
- the basic grid of the current image is determined; the basic grid is subdivided to determine the geometric position information of the initial grid of the current layer in the current image; the code stream is decoded to determine the first syntax identification information; when the first syntax identification information indicates that the shift coefficient of the current layer in the current image uses the first decoding mode, the shift coefficient of the current layer in the reference image is determined; according to the geometric position information of the initial grid of the current layer in the current image and the shift coefficient of the current layer in the reference image, the geometric position information of the reconstructed grid of the current layer in the current image is determined.
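The decoder-side reconstruction of one layer described above can be sketched as follows. This is a minimal illustration, not the application's implementation: it assumes that a shift coefficient is already a 3D vector in the same coordinate system as the vertex positions (the inverse transform and dequantization are omitted), and the names `reconstruct_layer` and `use_reference_mode` are hypothetical.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def reconstruct_layer(
    initial_positions: Sequence[Vec3],
    use_reference_mode: bool,
    reference_shifts: Sequence[Vec3],
    decoded_shifts: Sequence[Vec3],
) -> List[Vec3]:
    """Reconstruct the current layer by adding one shift vector per vertex.

    use_reference_mode stands in for the first syntax identification
    information: when it is set, the shift coefficients of the same layer in
    the reference image are reused; otherwise the shift coefficients decoded
    for the current image are used (assumed fallback).
    """
    shifts = reference_shifts if use_reference_mode else decoded_shifts
    if len(shifts) != len(initial_positions):
        raise ValueError("one shift coefficient is expected per vertex")
    return [
        (p[0] + s[0], p[1] + s[1], p[2] + s[2])
        for p, s in zip(initial_positions, shifts)
    ]
```

When the flag is set, nothing about the current layer's shift coefficients has to be read from the code stream, which is where the bit saving comes from.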
- the adaptive encoding of the shift coefficient in the current image can be determined; if the encoding of the shift coefficient in the current image is skipped, the shift coefficient in the current image does not need to be transmitted in the bitstream at this time.
- the decoding end can determine the geometric position information of the first reconstructed grid based on the geometric position information of the initial grid and the shift coefficient in the reference image, which not only reduces the bits spent on the shift coefficient, but also ensures the reconstructed geometric quality of the points in the grid, thereby improving the encoding and decoding efficiency.
- FIG1A is a first schematic diagram of a three-dimensional grid image
- FIG1B is a partial enlarged schematic diagram of a three-dimensional grid image
- FIG2 is a schematic diagram of a connection method of a three-dimensional grid
- FIG3A is a second schematic diagram of a three-dimensional grid image
- FIG3B is a schematic diagram of a grid data storage format
- FIG3C is a schematic diagram of properties of a three-dimensional grid image
- FIG4 is a schematic diagram showing the composition of the overall framework of grid coding
- FIG5A is a schematic diagram of preprocessing of a two-dimensional curve
- FIG5B is a schematic diagram of generating a shift coefficient
- FIG6A is a first schematic diagram of quantization processing of grid geometric position information
- FIG6B is a second schematic diagram of quantization processing of grid geometric position information
- FIG7A is a schematic diagram of coding of the connection relationship of triangular facets
- FIG7B is a schematic diagram of encoding geometric position information
- FIG7C is a schematic diagram of encoding of texture coordinates
- FIG8 is a schematic diagram of the basic principle of the shift coefficient
- FIG9 is a schematic diagram of encoding of mapping shift coefficients to a two-dimensional image
- FIG10 is a schematic diagram of encoding geometric position information between frames
- FIG11B is a schematic diagram showing the composition of an inter-frame coding framework
- FIG12A is a schematic diagram showing the composition of an intra-frame decoding framework
- FIG13A is a schematic diagram of iterative subdivision of a basic grid
- FIG13B is a schematic diagram of a LOD space structure
- FIG16 is a schematic diagram of a mesh architecture of a codec provided in an embodiment of the present application.
- FIG17 is a first flow chart of a decoding method provided in an embodiment of the present application.
- FIG18 is a second flow chart of a decoding method provided in an embodiment of the present application.
- FIG20 is a second flow chart of an encoding method provided in an embodiment of the present application.
- FIG21 is a schematic diagram of another basic principle of reconstructing and restoring geometric position information provided by an embodiment of the present application.
- FIG22 is a schematic diagram of the composition structure of an encoder provided in an embodiment of the present application.
- FIG23 is a schematic diagram of a specific hardware structure of an encoder provided in an embodiment of the present application.
- "first/second/third" involved in the embodiments of the present application are only used to distinguish similar objects and do not represent a specific ordering of the objects. It can be understood that "first/second/third" can be interchanged in a specific order or sequence where permitted, so that the embodiments of the present application described here can be implemented in an order other than that illustrated or described here.
- different data format bitstreams can be decoded and synthesized in the same video scene.
- at least image format, point cloud format, and mesh format can be included.
- real-time immersive video interaction services can be provided for multiple data formats (for example, mesh, point cloud, image, etc.) with different sources.
- 3D animation content is represented based on keyframes, that is, each frame is a static mesh. Static meshes at different times have the same topological structure and different geometric structures.
- the amount of data of 3D dynamic meshes represented based on keyframes is extremely large, so how to effectively store, transmit and draw them has become a problem faced by the development of 3D dynamic meshes.
- different user terminals (computers, notebooks, portable devices, mobile phones) require spatial scalability of the mesh, and different network bandwidths (broadband, narrowband, wireless) require quality scalability of the mesh. Therefore, 3D dynamic mesh compression is a very critical issue.
- a 3D grid is a 3D object surface composed of numerous polygons in space.
- a polygon consists of vertices and edges.
- FIG1A shows a 3D grid image
- FIG1B shows a partially enlarged schematic diagram of a 3D grid image. It can be seen from FIG1A and FIG1B that the grid surface is composed of closed polygons.
- a two-dimensional image has information expressed at each pixel point and is distributed regularly, so there is no need to record its position information additionally; however, the distribution of vertices in the mesh in three-dimensional space is random and irregular, and the way polygons are formed requires additional regulations. Therefore, it is necessary to record the position of each vertex in space and the connection information of each polygon in order to fully express a mesh image. As shown in Figure 2, the same number of vertices and vertex positions, due to different connection methods, form completely different surfaces.
- the three-dimensional grid image is usually encoded using an existing two-dimensional image/video encoding method
- the three-dimensional grid needs to be converted from three-dimensional space to a two-dimensional image, and the UV coordinates define this conversion process.
- each position in the acquisition process may have corresponding attribute information, usually RGB color values, which reflect the color of the object; for 3D meshes, in addition to color, the attribute information corresponding to each vertex is also commonly reflectance values, which reflect the surface material of the object.
- the attribute information of 3D meshes is stored in 2D images, and its mapping from 2D to 3D is specified by UV coordinates.
- 3D mesh data usually includes 3D geometric position information (x, y, z), geometric connection relationship, UV coordinates and attribute graph.
- Figure 3A is a 3D mesh image
- Figure 3B is a mesh data storage format, which includes 3D geometric position information, UV coordinates and connection information
- Figure 3C is a corresponding attribute diagram.
- Current 3D dynamic mesh compression methods include space-time prediction methods, which improve compression efficiency by eliminating spatial and temporal correlations; principal component analysis (PCA)-based technology, which projects in the eigenvector space to concentrate energy; and wavelet-based methods, which support spatial scalability and quality scalability.
- FIG4 is a schematic diagram of the overall framework of mesh coding
- FIG5A is a schematic diagram of the preprocessing of a two-dimensional curve
- FIG5B is a schematic diagram of the generation of a shift coefficient.
- the preprocessing of a three-dimensional mesh is analogous to that of the two-dimensional curve. The encoding end is mainly divided into two parts: preprocessing (Pre-processing) and the encoder (Encoder).
- the base mesh and the shift coefficient are first generated by preprocessing.
- the preprocessing process includes: first, the original mesh is downsampled to generate a simplified mesh (Decimated Mesh) with a greatly reduced number of vertices, also called the base mesh (Base mesh). Then the base mesh is subdivided: new vertices generated by the subdivision algorithm are inserted on the edges of the base mesh to obtain a subdivided mesh (Subdivided Mesh). Finally, for each vertex in the subdivided mesh, the nearest vertex is found in the original mesh, and the vector between the vertex in the subdivided mesh and that nearest vertex is the shift coefficient. As long as the subdivision algorithm and the number of subdivision iterations are fixed, the subdivided mesh can be generated automatically at both the encoder and the decoder. Therefore, after preprocessing, the original mesh only needs to be represented as a simple base mesh and a series of shift coefficients. This can greatly reduce the amount of data without affecting the reconstruction at the decoding end.
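The subdivision and shift-coefficient steps of the preprocessing can be sketched as follows. This is a minimal sketch under stated assumptions: the mesh is triangular, each subdivision iteration inserts the midpoint of every edge and splits each triangle into four, and the nearest original vertex is found by brute force. The function names `subdivide_midpoint` and `shift_coefficients` are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Vec3 = Tuple[float, float, float]
Tri = Tuple[int, int, int]


def subdivide_midpoint(
    vertices: Sequence[Vec3], triangles: Sequence[Tri]
) -> Tuple[List[Vec3], List[Tri]]:
    """One subdivision iteration: insert a vertex at the midpoint of each edge."""
    verts = list(vertices)
    midpoint_of: Dict[Tuple[int, int], int] = {}

    def midpoint(a: int, b: int) -> int:
        key = (min(a, b), max(a, b))  # a shared edge gets exactly one new vertex
        if key not in midpoint_of:
            pa, pb = verts[a], verts[b]
            verts.append(
                ((pa[0] + pb[0]) / 2, (pa[1] + pb[1]) / 2, (pa[2] + pb[2]) / 2)
            )
            midpoint_of[key] = len(verts) - 1
        return midpoint_of[key]

    new_tris: List[Tri] = []
    for a, b, c in triangles:
        ab, bc, ca = midpoint(a, b), midpoint(b, c), midpoint(c, a)
        new_tris += [(a, ab, ca), (ab, b, bc), (ca, bc, c), (ab, bc, ca)]
    return verts, new_tris


def shift_coefficients(
    subdivided: Sequence[Vec3], original: Sequence[Vec3]
) -> List[Vec3]:
    """Vector from each subdivided vertex to its nearest original vertex."""
    shifts = []
    for p in subdivided:
        q = min(
            original,
            key=lambda o: (o[0] - p[0]) ** 2 + (o[1] - p[1]) ** 2 + (o[2] - p[2]) ** 2,
        )
        shifts.append((q[0] - p[0], q[1] - p[1], q[2] - p[2]))
    return shifts
```

Because `subdivide_midpoint` depends only on the base mesh, a decoder that runs the same function for the same number of iterations obtains the same subdivided mesh without any extra signalling.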
- Video-based Dynamic Mesh Coding (V-DMC) can be mainly divided into two categories: geometric position information coding and attribute information coding.
- each frame file of the sequence basketball_player includes two files: basketball_player_fr0001_qp12_qt12.obj and basketball_player_fr0002.png.
- basketball_player_fr0001_qp12_qt12.obj contains four types of information: geometric position information (x, y, z), the connection relationship of geometric position information triangles, texture coordinates (u, v) and the connection relationship of texture coordinates.
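The four kinds of information in such an .obj file can be read with a small parser. The sketch below relies only on the standard Wavefront OBJ record types (`v`, `vt`, `f` with 1-based `v/vt` indices); it ignores normals, negative indices and all other records, and the function name `parse_obj` is hypothetical.

```python
from typing import List, Tuple


def parse_obj(text: str):
    """Split OBJ text into positions, texture coordinates and both index lists."""
    positions: List[Tuple[float, ...]] = []   # geometric position (x, y, z)
    texcoords: List[Tuple[float, ...]] = []   # texture coordinates (u, v)
    pos_faces: List[Tuple[int, ...]] = []     # connection of geometric positions
    tex_faces: List[Tuple[int, ...]] = []     # connection of texture coordinates
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        if parts[0] == "v":
            positions.append(tuple(float(x) for x in parts[1:4]))
        elif parts[0] == "vt":
            texcoords.append(tuple(float(x) for x in parts[1:3]))
        elif parts[0] == "f":
            corners = [p.split("/") for p in parts[1:]]
            # OBJ indices are 1-based; convert to 0-based
            pos_faces.append(tuple(int(c[0]) - 1 for c in corners))
            if all(len(c) > 1 and c[1] for c in corners):
                tex_faces.append(tuple(int(c[1]) - 1 for c in corners))
    return positions, texcoords, pos_faces, tex_faces
```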
- basketball_player_fr0002.png represents the texture attribute information of the current image.
- the geometric position information is jointly encoded by DRACO and the video codec, and the texture information is encoded directly by the video codec.
- Video Codec can include H.264/Advanced Video Coding (AVC), H.265/High Efficiency Video Coding (HEVC), H.266/Versatile Video Coding (VVC/VV-enC), etc. Therefore, the geometric information encoding of mesh is introduced in detail below.
- the geometric information can be divided into: the encoding of position information (geometric position information and texture position information) and the encoding of connection relationship (connection relationship of triangle patches of geometric position information and connection relationship of texture position information).
- the current V-DMC encoding is mainly divided into two encoding test conditions: intra-frame encoding and inter-frame encoding (low delay; there is currently no random access (RA) test environment).
- connection relationship of the original mesh contains a large number of points.
- the mesh geometry information is first quantized or simplified, and finally the corresponding simplified mesh (Decimated mesh) is obtained as the basic mesh.
- the quantization processing of the grid is performed based on the coordinates of the triangular patch. According to the connection relationship between the quantization points, the quantization processing can be divided into the following two cases:
- in the whole V-DMC process of mesh quantization based on triangle patch coordinates, the core problem is how to obtain the best vertex from the previous vertex coordinates.
- the current V-DMC selects the best quantization point from the following four candidates. Assuming that the vertices before quantization are V1 and V2, and the vertex after quantization is V', the candidates for V' are: V1, V2, (V1+V2)/2 and Q^{-1}(V1+V2), where Q is the quantization matrix corresponding to the vertex coordinates of V1 and V2. Finally, the distortion measure D before and after quantization is used to select the best quantization point.
- the DRACO encoder is used to encode the geometric information of the Base mesh.
- the information mainly includes: the connection relationship and the geometric position information.
- the entire DRACO encoding process is as follows: first complete the encoding of the connection relationship, then encode the geometric position information of the point based on the connection relationship of the geometric position, and finally encode the texture position information based on the connection relationship and geometric position information.
- DRACO uses the "Edgebreaker Coding" scheme to encode the connection relationship of the mesh. See Figure 7A for details.
- v represents the current vertex.
- the vertices of the mesh are divided into five types: C, L, R, S, and E. The physical meaning of each symbol is as follows:
- the type of each vertex and the processing order of the vertices are encoded in a certain order, and the decoding end restores the geometric connection relationship of the mesh according to the processing order and type of the vertices.
- the texture coordinates are predicted and encoded based on the decoded and reconstructed connection relationship and geometric position information, as shown in Figure 7C.
- the texture coordinates of the left and right vertices are used to predict the texture coordinates of the current vertex C.
- a certain partitioning algorithm is used to partition the base mesh to obtain the initial reconstructed mesh.
- the curve corresponding to the subdivided mesh in Figure 5A is used to obtain the subdivided mesh (also called the "initial mesh") by using simple linear interpolation.
- the coordinates of the newly inserted point are obtained by linear interpolation based on the two vertices v1 and v2 on the current boundary, i.e. the point is placed at the midpoint v = (v1 + v2) / 2.
- the error Delta between the points of the subdivided mesh and the original mesh is calculated.
- the error Delta can be an error of the point in the world coordinate system.
- the Displacement (i.e., displacement coefficient) of each point is calculated using the error Delta between each point and the normal vector Norm of each point, as shown in Figure 8.
- the bold solid line represents the error Delta
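One plausible reading of this step is that the Displacement is the component of the error Delta along the vertex normal. The sketch below shows only that reading; the text does not give the formula, so the projection itself and the name `displacement_along_normal` are assumptions.

```python
import math
from typing import Sequence


def displacement_along_normal(delta: Sequence[float], normal: Sequence[float]) -> float:
    """Signed length of delta projected onto the (not necessarily unit) normal."""
    length = math.sqrt(sum(n * n for n in normal))
    if length == 0.0:
        raise ValueError("the normal vector must be non-zero")
    # dot(delta, normal / |normal|)
    return sum(d * n for d, n in zip(delta, normal)) / length
```

A full local coordinate system would add two tangential components computed the same way against the tangent and bitangent vectors.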
- the lifting transform can be used to transform the spatial domain residual coefficients to the frequency domain to obtain the corresponding frequency domain residual coefficients.
- the coefficient packing algorithm is used to map the frequency domain residual coefficients of each point into the two-dimensional image in a certain order.
- the current V-DMC can be arranged in Morton Code Order, as shown in Figure 9.
- Recoloring is an algorithm on the encoder side. After the reconstruction of the geometric information on the encoder side is completed, the texture attribute information of the reconstructed mesh is recolored using the original geometric information, the original texture attribute information, and the reconstructed mesh geometric information.
- the geometric position information includes the geometric connection relationship and the geometric position information encoding.
- the inter-frame geometric position information encoding only needs to encode the geometric position information (x, y, z) of the current Base mesh, and does not need to encode the connection relationship and texture position information (u, v). The specific reasons are as follows: If the current image can be inter-frame encoded, then the Base mesh of the reference image of the current image will be used at the encoding end to obtain the mesh information of the current image. Therefore, the current image and the reference image have the same connection relationship and uv texture coordinates, but the geometric position information is different.
- the black dot is the point to be encoded.
- the corresponding prediction point (similar to the same-position block in video encoding) is obtained by using the current point in the reference image.
- the motion vector (MV) of the current point is predicted and encoded by using the neighboring points of the current point (the MV of the vertices that have been encoded).
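Predictive coding of a vertex motion vector can be sketched as follows. The text only states that already-encoded neighboring vertices are used for prediction; taking the plain average of their MVs is an assumption made here for illustration, and `predict_mv` and `mv_residual` are hypothetical names.

```python
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def predict_mv(neighbor_mvs: Sequence[Vec3]) -> Vec3:
    """Average of the MVs of already-encoded neighbors (zero if there are none)."""
    if not neighbor_mvs:
        return (0.0, 0.0, 0.0)
    n = len(neighbor_mvs)
    return tuple(sum(mv[i] for mv in neighbor_mvs) / n for i in range(3))


def mv_residual(current: Vec3, reference: Vec3, neighbor_mvs: Sequence[Vec3]) -> Vec3:
    """Residual that would be written to the bitstream for one vertex."""
    # MV of the current point: displacement from its co-located reference point
    mv = tuple(c - r for c, r in zip(current, reference))
    pred = predict_mv(neighbor_mvs)
    return tuple(m - p for m, p in zip(mv, pred))
```

The decoder would form the same prediction from its own reconstructed neighbors and add the residual back to recover the MV.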
- the rate-distortion optimization algorithm is used to obtain the optimal coding mode for each coding group (CG).
- the current V-DMC sets the maximum number of points for each CG to 16.
- the current V-DMC encodes texture attribute information directly using a video codec (Video-Codec), such as AVC, HEVC, VVC or VV-enC.
- FIG11A is a schematic diagram of a framework of an intra-frame encoder.
- a common static mesh encoder (Static Mesh Encoder) can be used to encode the simplified mesh to generate the corresponding bitstream (Compressed base mesh bitstream).
- the displacement coefficients are updated (Update Displacements) using the reconstructed simplified mesh.
- the updated displacement coefficients are subjected to wavelet transform (Wavelet Transform) and quantization (Quantization) to obtain the quantized transform coefficients. These are then packed into images and videos (Image Packing, Video Packing) and encoded using HEVC to generate a bitstream (Compressed displacements bitstream) of the displacement coefficients.
- the attribute map is first transformed (Texture Transfer) according to the difference between the reconstructed geometric information and the original geometric information, then padded (Padding), packed (Video Packing) and encoded using a video encoder to form an attribute bitstream (Compressed attribute bitstream).
- FIG11B is a schematic diagram of an inter-frame encoder. As shown in FIG11B , the inter-frame encoder has a similar process to the intra-frame encoder, but the inter-frame encoder does not directly encode the simplified grid, but encodes the motion vector MV between the simplified grid of the current image and the simplified grid of the reference image, and generates a corresponding motion vector bitstream (Compressed motion bitstream).
- the decoder can also be divided into an intra-frame decoder and an inter-frame decoder according to the type of the frame it acts on, which are used to perform intra-frame decoding and inter-frame decoding respectively.
- FIG12A is a schematic diagram of intra-frame decoding.
- a static mesh decoder (Static Mesh Decoder) can be used to decode the simplified mesh.
- a video decoder (Video Decoder) is used to decode the shift coefficient video, and the shift coefficient is obtained through video unpacking (Video Unpacking) and inverse wavelet transform (Inverse Wavelet Transform).
- the decoded simplified mesh and shift coefficient are used to obtain the decoded mesh geometry information.
- the attribute map is decoded directly by the video decoder.
- FIG12B is a schematic diagram of inter-frame decoding. As shown in FIG12B , for an inter-frame decoder, the process is basically the same as that of an intra-frame decoder, except that the simplified grid is not directly decoded, but the motion vector is decoded, and the simplified grid of the current image is calculated by the simplified grid of the previous frame image (such as the reference image).
- the dynamic mesh coding process is divided into the following steps: at the encoding end, the basic mesh generated by preprocessing is quantized and then encoded using Google's open source DRACO encoder, and the shift coefficient is encoded using HEVC after wavelet transform, quantization, and two-dimensional mapping, and the two-dimensional attribute map is also directly sent to the HEVC encoder for encoding; at the decoding end, the basic mesh code stream is decoded by DRACO to generate a decoded basic mesh, and the shift coefficient is decoded by HEVC decoding, inverse two-dimensional mapping, inverse quantization, and inverse transformation to generate a decoded shift coefficient, and then the decoded basic mesh and the decoded shift coefficient are used together to generate the reconstructed three-dimensional mesh geometry, and the attribute code stream is decoded by HEVC to generate a reconstructed attribute map.
- Condition 1: all intra, geometry lossy, attributes lossy
- Condition 2: random access, geometry lossy, attributes lossy
- the general test sequences may include three categories, namely Cat1-A, Cat1-B and Cat1-C, all of which contain geometric information and color attribute information.
- the Base mesh is first iterated and divided according to a certain algorithm to obtain the corresponding mesh position information.
- the specific division algorithm is consistent with the above content, and the vertices on each boundary are linearly interpolated to obtain the corresponding geometric position information. Assuming that the entire division is iterated N times, the LOD division is performed according to the Displacement coefficients obtained by different iterative divisions, as shown in Figure 13A.
- the Base mesh can obtain the corresponding mesh geometric position information through the linear interpolation algorithm.
- the initial geometric position information and the original mesh are used to calculate the error to obtain the Displacement coefficient of each point.
- the LOD division can be a four-layer structure, specifically: level 0, level 1, level 2 and level 3.
- the specific LOD spatial structure is shown in Figure 13B.
- a lifting wavelet transform is performed based on the LOD spatial structure, which may include two steps: prediction and update.
- the prediction algorithm is as follows:
- v represents the vertex to be predicted
- v1 and v2 represent the two end vertices of the boundary where the vertex to be predicted is located.
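The predict and update steps of a lifting transform over the LOD structure can be sketched as follows. The prediction formula itself is not reproduced in the text, so the weights are assumptions: each vertex v is predicted as the average of its edge end points v1 and v2, and the update step feeds a fraction `update_weight` of the residual back to v1 and v2. The function names are hypothetical, and the coefficients are treated as one scalar per vertex.

```python
from typing import List, Sequence, Tuple

Edge = Tuple[int, int, int]  # (v, v1, v2): v was inserted on the edge v1-v2


def forward_lifting(
    signal: Sequence[float], levels: Sequence[Sequence[Edge]], update_weight: float = 0.125
) -> List[float]:
    """levels[k] lists the vertices added at LOD k+1; processed finest first."""
    s = list(signal)
    for edges in reversed(levels):
        for v, v1, v2 in edges:          # predict: keep only the residual at v
            s[v] -= 0.5 * (s[v1] + s[v2])
        for v, v1, v2 in edges:          # update: smooth the coarser vertices
            s[v1] += update_weight * s[v]
            s[v2] += update_weight * s[v]
    return s


def inverse_lifting(
    coeffs: Sequence[float], levels: Sequence[Sequence[Edge]], update_weight: float = 0.125
) -> List[float]:
    """Undo forward_lifting: coarsest level first, update undone before predict."""
    s = list(coeffs)
    for edges in levels:
        for v, v1, v2 in edges:
            s[v1] -= update_weight * s[v]
            s[v2] -= update_weight * s[v]
        for v, v1, v2 in edges:
            s[v] += 0.5 * (s[v1] + s[v2])
    return s
```

Each lifting step is undone by the same expression with the opposite sign, so the transform is invertible whatever weights are chosen; only quantization introduces loss.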
- the transformed coefficients can be quantized and the quantized coefficients can be reorganized.
- the coefficients can be reorganized based on blocks, for example into Block 0, Block 1, Block 2, and Block 3.
- the current V-DMC can reorganize coefficients based on blocks.
- the size of each block is 16×16.
- the coefficients in each block are arranged according to the Morton code to obtain the corresponding two-dimensional image. After completing a series of operations, the two-dimensional image can be encoded using Video-Codec.
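Arranging the coefficients of one block in Morton order can be sketched as follows. The block size of 16 follows the text; which axis receives the even bits, the row-major `block[y][x]` layout, and the names `morton_decode` and `pack_block` are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def morton_decode(index: int) -> Tuple[int, int]:
    """De-interleave the bits of index: even bits go to x, odd bits to y."""
    x = y = 0
    bit = 0
    while index:
        x |= (index & 1) << bit
        index >>= 1
        y |= (index & 1) << bit
        index >>= 1
        bit += 1
    return x, y


def pack_block(coeffs: Sequence[int], block_size: int = 16) -> List[List[int]]:
    """Place up to block_size * block_size coefficients into one 2D block."""
    if len(coeffs) > block_size * block_size:
        raise ValueError("too many coefficients for one block")
    block = [[0] * block_size for _ in range(block_size)]
    for i, c in enumerate(coeffs):
        x, y = morton_decode(i)
        block[y][x] = c
    return block
```

Morton order keeps coefficients that are close in the 1D sequence close in the 2D image, which suits block-based video codecs.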
- the dynamic grid encoding process can be divided into the following steps:
- step 2: Subdivide the simplified mesh from step 1. For any two connected vertices in step 1, add a new point at the midpoint of the line segment connecting them, and repeat twice.
- step 3: For each vertex in step 2, find the point in the original mesh that is closest to it and calculate the displacement coefficient of the two points.
- step 5: Adjust the shift coefficients in step 3 according to the reconstructed simplified grid obtained in step 4.
- step 6: Perform wavelet transform on the shift coefficients in step 5, and quantize the shift coefficients after wavelet transform to obtain quantized transform coefficients.
- the Video-Codec is used to decode and reconstruct the two-dimensional image to restore the corresponding two-dimensional image.
- the lifting transform coefficient corresponding to each point can be restored.
- the inverse transform of the lifting wavelet transform can be used to restore the displacement coefficient of each point.
- the geometric position information of the Base mesh and the displacement coefficient are used to reconstruct and restore the geometric position information corresponding to the current mesh.
- the geometric position information of the level 0 layer in the Base mesh and the displacement coefficient of the level 0 layer can reconstruct and restore the geometric position information of the level 0 layer in the reconstructed mesh.
- the geometric position information of the level 1 layer in the Base mesh and the displacement coefficient of the level 1 layer can reconstruct and restore the geometric position information of the level 1 layer in the reconstructed mesh, and so on.
- the geometric position information of the level 3 layer in the Base mesh and the displacement coefficient of the level 3 layer can reconstruct and restore the geometric position information of the level 3 layer in the reconstructed mesh.
- the dynamic grid decoding process can be divided into the following steps:
- the basic grid code stream is decoded by a decoder such as DRACO to generate a decoded basic grid.
- the shift coefficient bit stream is decoded using a standard video decoder such as H.265 to obtain a shift coefficient two-dimensional image.
- the decoded base grid and the decoded shift coefficients are combined to generate the reconstructed 3D grid geometry information.
- the existing V-DMC coding always uses the geometric position information of the initial mesh obtained by subdividing the Base Mesh, together with the Displacement coefficients, to reconstruct the geometric position information of the current mesh.
- if the current image can be inter-frame coded, that is, a reference image exists for the current image, then the reconstructed Displacement coefficients of the reference image are available when the geometric information of the mesh of the current image is encoded or reconstructed.
- an inter-frame coding scheme is therefore possible: the geometric information of the initial mesh obtained by subdividing the Base Mesh of the current image and the reconstructed Displacement coefficients of the reference image are used to obtain the corresponding predicted mesh geometric position information, which is then used together with the original mesh geometric position information for predictive coding or skip coding.
- the embodiment of the present application provides a coding and decoding method.
- the geometric information of the initial mesh obtained by dividing the Base Mesh of the current image and the reconstructed Displacement coefficient of the reference image are used to obtain the corresponding predicted mesh geometric position information.
- the new predicted geometric position information and the original mesh position information are used for parameter fitting at the encoding end, and finally the parameter relationship obtained by fitting is used to encode the geometric position information of the current mesh.
- the Displacement coefficients of some LOD layers of the current image can be skipped or the correlation between the Displacement coefficients between different points can be used to reduce the encoding of the Displacement coefficients of some points.
- FIG. 16 is a schematic diagram of a codec network architecture provided by the present application.
- the network architecture includes one or more electronic devices 13 to 1N and a communication network 01, wherein the electronic devices 13 to 1N can perform video interaction through the communication network 01.
- the electronic device may be any of various types of devices with encoding and decoding functions; for example, the electronic device may include a mobile phone, a tablet computer, a personal computer, a personal digital assistant, a navigator, a digital phone, a video phone, a television, a sensor device, a server, etc., which is not specifically limited in the embodiments of the present application.
- the decoder or encoder described in the embodiments of the present application may be the above-mentioned electronic device.
- Referring to FIG. 17, a schematic flow chart of a decoding method provided by an embodiment of the present application is shown. As shown in FIG. 17, the method may include:
- the decoding method of the embodiment of the present application may refer to an inter-frame decoding method, and more specifically, may be an inter-frame decoding method for displacement coefficients in a dynamic grid.
- the decoding method may be applied to a decoder in a V-DMC, but is not limited thereto.
- determining the basic grid of the current image may include: decoding a bitstream to determine the basic grid of the current image.
- the code stream here may refer to a basic grid code stream. Then, by decoding the basic grid code stream with a dynamic grid decoder (e.g., DRACO), the basic grid of the current image may be obtained.
- determining the base grid of the current image may include: determining the base grid of the current image based on the base grid of the reference image.
- the reference image is a decoded image before the current image.
- the reference image may be an image frame before the current image, but this is not specifically limited.
- the base grid of the reference image can also be used as the base grid of the current image.
- S1702 Subdivide the basic grid to determine the geometric position information of the initial grid of the current layer in the current image.
- At least one layer can be determined; wherein, the at least one layer may include the current layer.
- for example, when the geometric position information of the initial mesh is obtained by subdividing the basic mesh three times, the basic mesh is regarded as the 0th iteration and corresponds to the 0th layer (level 0), the vertices newly added in the first iteration constitute the 1st layer (level 1), the vertices newly added in the second iteration constitute the 2nd layer (level 2), and the vertices newly added in the third iteration constitute the 3rd layer (level 3).
- the specific LOD division structure is shown in FIG13B , where the top layer is the basic mesh, and as the iteration proceeds, the number of newly added vertices increases successively during each iteration to form a pyramid structure. In this way, by subdividing the basic mesh, the geometric position information of the initial mesh of the current layer in the current image can be determined.
- subdividing the base grid to determine the geometric position information of the initial grid of the current layer in the current image may include: iteratively dividing the base grid according to a grid subdivision mode to determine the geometric position information of the initial grid of the current layer in the current image.
- the mesh subdivision mode can be understood as upsampling the vertices on each boundary of the base mesh, or can also be understood as interpolating the vertices on each boundary of the base mesh.
- the mesh subdivision mode includes a subdivision algorithm and a number of subdivision iterations.
- the subdivision algorithm can be an interpolation algorithm, for example, the subdivision algorithm can be a linear interpolation algorithm, or can also be a nonlinear interpolation algorithm, which is not specifically limited here.
- the initial mesh (also called "subdivided mesh") can be obtained from the base mesh through the linear interpolation algorithm.
- the coordinates of the newly inserted points are obtained by linear interpolation based on the two vertices on the current boundary:
- pos 1 and pos 2 are the geometric position coordinates of the two end vertices on the current boundary participating in this iteration, and pos new is the geometric position coordinates of the vertex newly added in this iteration.
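The interpolation weights are not stated here. Assuming the common midpoint rule, in which the new vertex is placed halfway along the boundary (an assumption, not a value confirmed by this text), the linear interpolation can be written as:

```latex
\mathrm{pos}_{\mathrm{new}} = \frac{\mathrm{pos}_{1} + \mathrm{pos}_{2}}{2}
```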
- S1703 Decode the code stream and determine the first syntax identification information.
- the first syntax identification information is used to indicate whether the shift coefficient of the current layer in the current image uses the first decoding mode.
- the method may also include:
- if the value of the first syntax identification information is the first value, it is determined that the first syntax identification information indicates that the shift coefficient of the current layer in the current image uses the first decoding mode;
- if the value of the first syntax identification information is the second value, it is determined that the first syntax identification information indicates that the shift coefficient of the current layer in the current image does not use the first decoding mode.
- the first value is different from the second value, and the first value and the second value can be in parameter form or in digital form.
- the first syntax identification information can be a parameter written in the profile or a flag value, which is not specifically limited here.
- the first syntax identification information is used as a syntax element at the LOD layer level to indicate whether the shift coefficient of the LOD layer of the current image uses the first decoding mode.
- the method may further include:
- the code stream is decoded to determine the first syntax identification information.
- the second syntax identification information is a syntax element at the Sequence Parameter Set (SPS) level
- the third syntax identification information is a syntax element at the frame parameter set (FPS) level.
- the second syntax identification information is used to indicate whether the shift coefficient of the current sequence enables the first decoding mode
- the third syntax identification information is used to indicate whether the shift coefficient of the current image enables the first decoding mode.
- the current sequence includes at least the current image
- the LOD layer divided by the current image includes at least the current layer.
- if the value of the second syntax identification information is a first value, it is determined that the second syntax identification information indicates that the shift coefficient of the current sequence enables the first decoding mode; if the value of the second syntax identification information is a second value, it is determined that the second syntax identification information indicates that the shift coefficient of the current sequence does not enable the first decoding mode.
- if the value of the third syntax identification information is a first value, it is determined that the third syntax identification information indicates that the shift coefficient of the current image enables the first decoding mode; if the value of the third syntax identification information is a second value, it is determined that the third syntax identification information indicates that the shift coefficient of the current image does not enable the first decoding mode.
- the first value is different from the second value, and the first value and the second value can be in parameter form or in digital form.
- the second syntax identification information and the third syntax identification information can be parameters written in the profile or the value of a flag, which is not specifically limited here.
- the first value can be set to 1 and the second value can be set to 0; or, the first value can be set to 0 and the second value can be set to 1; or, the first value can be set to true and the second value can be set to false; or, the first value can be set to false and the second value can be set to true.
- the first value is set to 1 and the second value is set to 0, but this is not specifically limited.
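The three flag levels described above can be combined as in the sketch below. It assumes the sequence-level flag gates the frame-level flag, which in turn gates the LOD-level flag; that gating order follows the usual parameter-set hierarchy and is an assumption. The class and field names are hypothetical, and the first value is taken as 1 and the second value as 0.

```python
from dataclasses import dataclass

FIRST_VALUE = 1   # flag set: the first decoding mode is enabled / used
SECOND_VALUE = 0  # flag clear


@dataclass
class SkipFlags:
    sps_enable: int  # second syntax identification information (sequence level)
    fps_enable: int  # third syntax identification information (frame level)
    lod_skip: int    # first syntax identification information (LOD layer level)


def layer_uses_first_mode(flags: SkipFlags) -> bool:
    """Return True only if every level in the hierarchy allows the first decoding mode."""
    if flags.sps_enable != FIRST_VALUE:
        return False  # mode disabled for the whole sequence
    if flags.fps_enable != FIRST_VALUE:
        return False  # mode disabled for this image
    return flags.lod_skip == FIRST_VALUE
```

In a real bitstream the lower-level flags would typically not even be present when a higher-level flag disables the mode; the sketch simply ignores them in that case.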
- S1705 Determine the geometric position information of the reconstructed grid of the current layer in the current image according to the geometric position information of the initial grid of the current layer in the current image and the shift coefficient of the current layer in the reference image.
- the first decoding mode represents skipping the decoding of the shift coefficient of the current layer in the current image, that is, not decoding the shift coefficient of the current layer in the current image, and the shift coefficient of the current layer in the reference image can be used at this time.
- the encoding and decoding efficiency of the geometric position information can be improved on the basis of ensuring the reconstruction quality of the geometric position information of the grid.
- determining the geometric position information of the reconstructed grid of the current layer in the current image based on the geometric position information of the initial grid of the current layer in the current image and the shift coefficient of the current layer in the reference image may include: determining the mapping relationship between the geometric position information of the initial grid of the current layer in the current image, the shift coefficient of the current layer in the reference image, and the geometric position information of the reconstructed grid of the current layer in the current image; determining the geometric position information of the reconstructed grid of the current layer in the current image based on the mapping relationship and the geometric position information of the initial grid of the current layer in the current image and the shift coefficient of the current layer in the reference image.
- the mapping relationship can be a lookup table (Look Up Table, LUT), which can record the correspondence between the geometric position information of the initial grid of the current layer in the current image and the shift coefficient of the current layer in the reference image and the geometric position information of the reconstructed grid of the current layer in the current image; or, the mapping relationship can also be a preset function, which can characterize the correspondence between the geometric position information of the initial grid of the current layer in the current image and the shift coefficient of the current layer in the reference image and the geometric position information of the reconstructed grid of the current layer in the current image.
- the mapping relationship may include at least one of the following: a mapping relationship based on a linear function, a mapping relationship based on a nonlinear function, and a mapping relationship based on a neural network.
- the fitting of such a mapping relationship may include, but is not limited to, linear fitting, curve fitting, or convolution parameter fitting, etc., which is not specifically limited here.
- the mapping relationship may be established by the decoding end according to relevant parameters, or may be determined by decoding the bit stream.
- the method may further include:
- the encoding end can determine the mapping relationship between the geometric position information of the initial grid of the current layer in the current image, the shift coefficient of the current layer in the reference image and the geometric position information of the reconstructed grid of the current layer in the current image, and write the mapping relationship into the bit stream; in this way, the decoding end can determine the mapping relationship by decoding the bit stream, and then determine the geometric position information of the reconstructed grid of the current layer in the current image.
- the mapping indication information includes first indication information; wherein the first indication information is used to indicate fitting parameters of the mapping relationship.
- determining the mapping relationship between the geometric position information of the initial grid of the current layer in the current image, the shift coefficient of the current layer in the reference image, and the geometric position information of the reconstructed grid of the current layer in the current image based on the mapping indication information may include: determining the fitting parameters of the mapping relationship based on the first indication information; determining the mapping relationship between the geometric position information of the initial grid of the current layer in the current image, the shift coefficient of the current layer in the reference image, and the geometric position information of the reconstructed grid of the current layer in the current image based on the fitting parameters.
- the code stream is decoded to determine the fitting parameters of the mapping relationship; then, based on the fitting parameters, the mapping relationship between the geometric position information of the initial grid of the current layer in the current image, the shift coefficient of the current layer in the reference image, and the geometric position information of the reconstructed grid of the current layer in the current image can be determined.
- the fitting parameter may include the slope and/or intercept of the linear function.
- the fitting parameter may include at least one constant in the nonlinear function.
- the fitting parameter may include the constant a of the nonlinear function
- the fitting parameter may include the coefficients a 0 , a 1 , a 2 , ... of the polynomial function
- the fitting parameter may include the base a of the nonlinear function.
- the mapping indication information further includes second indication information; wherein the second indication information is used to indicate the type of the mapping relationship.
- determining the mapping relationship between the geometric position information of the initial grid of the current layer in the current image, the shift coefficient of the current layer in the reference image, and the geometric position information of the reconstructed grid of the current layer in the current image based on the mapping indication information may include: determining the type of the mapping relationship based on the second indication information; determining the mapping relationship between the geometric position information of the initial grid of the current layer in the current image, the shift coefficient of the current layer in the reference image, and the geometric position information of the reconstructed grid of the current layer in the current image based on the type of the mapping relationship and the fitting parameters.
- the type of mapping relationship may include a linear function type, an exponential function type, a logarithmic function type, a polynomial function type, etc., which is not specifically limited here.
- the code stream is decoded to determine the type of mapping relationship and fitting parameters; then, based on the type of mapping relationship and fitting parameters, the mapping relationship between the geometric position information of the initial grid of the current layer in the current image, the shift coefficient of the current layer in the reference image, and the geometric position information of the reconstructed grid of the current layer in the current image can be determined.
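A decoder that receives a mapping type and its fitting parameters could build the mapping as sketched below. The type names mirror the list above (linear, exponential, logarithmic, polynomial); the string identifiers, the parameter ordering, and the function `build_mapping` are hypothetical, and the mapping is applied to one scalar coordinate at a time as a simplifying assumption.

```python
import math
from typing import Callable, Sequence


def build_mapping(kind: str, params: Sequence[float]) -> Callable[[float], float]:
    """Return a scalar mapping built from a decoded type and its fitting parameters."""
    if kind == "linear":
        k, b = params  # slope and intercept
        return lambda x: k * x + b
    if kind == "polynomial":
        coeffs = list(params)  # a0, a1, a2, ...
        return lambda x: sum(a * x ** i for i, a in enumerate(coeffs))
    if kind == "exponential":
        (base,) = params  # base a of a**x
        return lambda x: base ** x
    if kind == "logarithmic":
        (base,) = params  # base a of log_a(x)
        return lambda x: math.log(x, base)
    raise ValueError(f"unknown mapping type: {kind}")
```

For example, `build_mapping("linear", [2.0, 1.0])` yields a function that maps a coordinate `x` to `2.0 * x + 1.0`.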
- lvl represents different LOD layers
- reconMesh represents the geometric position information of the initial mesh
- refDisp represents the reconstructed Displacement coefficient of the reference image
- predict mesh represents the geometric position information of the reconstructed mesh obtained by using a certain functional relationship.
- $\mathrm{predictMesh} = k * (\mathrm{reconMesh} + \mathrm{refDisp},\ \mathrm{lvl}) + b \qquad (9)$
- * and + are vector multiplication and addition; k and b represent fitting parameters.
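A minimal sketch of formula (9) follows, together with one possible way an encoder might fit `k` and `b`. It assumes scalar `k` and `b` shared by all coordinates of one LOD layer (the text describes vector operations, so this is a simplification), it takes the coordinates of that layer as a flat list of floats, and it uses ordinary least squares as the fitting method, which is an assumption. The function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def predict_level(recon_mesh: Sequence[float], ref_disp: Sequence[float],
                  k: float, b: float) -> List[float]:
    """Apply predictMesh = k * (reconMesh + refDisp) + b to one layer's coordinates."""
    return [k * (p + d) + b for p, d in zip(recon_mesh, ref_disp)]


def fit_linear(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y ~ k * x + b.

    Here x would be reconMesh + refDisp and y the original mesh coordinates.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    if var_x == 0:
        # Degenerate input: fall back to unit slope and a pure offset.
        return 1.0, mean_y - mean_x
    k = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / var_x
    b = mean_y - k * mean_x
    return k, b
```

The encoder would call `fit_linear` once per layer and signal `k` and `b`; the decoder would then call `predict_level` with the decoded parameters.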
- the mapping indication information includes third indication information; wherein the third indication information is used to indicate the index number of the mapping relationship.
- determining the mapping relationship between the geometric position information of the initial grid of the current layer in the current image, the shift coefficient of the current layer in the reference image, and the geometric position information of the reconstructed grid of the current layer in the current image based on the mapping indication information may include: determining the index number of the mapping relationship based on the third indication information; determining the mapping relationship between the geometric position information of the initial grid of the current layer in the current image, the shift coefficient of the current layer in the reference image, and the geometric position information of the reconstructed grid of the current layer in the current image based on the index number of the mapping relationship.
- the encoding end and the decoding end are both pre-set with several mapping relationships, and at this time the corresponding mapping relationship can be determined according to the index number.
- the code stream is decoded to determine the index number of the mapping relationship; then, based on the index number of the mapping relationship, the mapping relationship between the geometric position information of the initial grid of the current layer in the current image, the shift coefficient of the current layer in the reference image, and the geometric position information of the reconstructed grid of the current layer in the current image can be determined.
- the method may further include:
- S1802 Determine geometric position information of a reconstructed grid of the current layer in the current image according to geometric position information of an initial grid of the current layer in the current image and a shift coefficient of the current layer in the current image.
- the first decoding mode is different from the second decoding mode.
- the first decoding mode may represent skipping the decoding of the shift coefficient of the current layer in the current image, that is, there is no need to decode the shift coefficient of the current layer in the current image;
- the second decoding mode represents decoding the shift coefficient of the current layer in the current image, that is, it is necessary to decode the shift coefficient of the current layer in the current image.
- decoding the code stream and determining the shift coefficient of the current layer in the current image may include: decoding the code stream to determine the two-dimensional image of the current layer in the current image; performing coefficient reorganization processing on the two-dimensional image to determine the lifting transform coefficient of the current layer; and performing inverse transform processing on the lifting transform coefficient of the current layer to determine the shift coefficient of the current layer.
- performing coefficient reorganization processing on a two-dimensional image to determine the lifting transform coefficients of the current layer may include: performing coefficient reorganization processing on the two-dimensional image to determine the quantization coefficients of the current layer; and performing inverse quantization processing on the quantization coefficients of the current layer to determine the lifting transform coefficients of the current layer.
- the code stream here can refer to the shift coefficient code stream.
- a video decoder such as Video-Codec
- the quantization coefficient corresponding to each point in the current layer is restored; then, the lifting transform coefficient corresponding to each point in the current layer is restored by inverse quantization; finally, the shift coefficient corresponding to each point in the current layer is restored by the inverse transform of the lifting wavelet transform.
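The recovery chain above (quantization coefficients, then lifting transform coefficients, then shift coefficients) can be illustrated with the sketch below. It is deliberately simplified: it assumes uniform inverse quantization with a single step size, and a predict-only lifting scheme in which each vertex added at a finer level is predicted as the average of its two parent vertices. The actual lifting wavelet transform may include further steps (for example an update step), and all names here are hypothetical. One scalar component per vertex is handled.

```python
from typing import Dict, List, Tuple


def dequantize(quantized: List[int], step: float) -> List[float]:
    """Uniform inverse quantization (assumed): multiply each level by the step size."""
    return [q * step for q in quantized]


def inverse_lifting(
    coeffs: List[float],
    parents: Dict[int, Tuple[int, int]],
    levels: List[List[int]],
) -> List[float]:
    """Undo a predict-only lifting: value[v] = coeff[v] + mean(value[parents of v]).

    levels[0] holds base-mesh vertex indices, whose values are stored directly;
    levels[1:] hold the vertices added by each subdivision iteration.
    """
    signal = list(coeffs)
    for level in levels[1:]:  # coarse to fine, so parents are already restored
        for v in level:
            a, b = parents[v]
            signal[v] = coeffs[v] + 0.5 * (signal[a] + signal[b])
    return signal
```

Processing the levels from coarse to fine matters: a vertex's parents always belong to coarser levels, so their restored values are available when the vertex itself is processed.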
- the geometric position information of the basic grid and the decoded shift coefficient can be used to reconstruct and restore the geometric position information of the reconstructed grid.
- an identification information may be set to determine the decoding method of the basic grid of the current image.
- the method may further include: decoding the code stream to determine the fourth syntax identification information; when the fourth syntax identification information indicates that the basic grid of the current image uses the inter-frame processing method, performing the step of decoding the code stream to determine the first syntax identification information.
- the method may further include: decoding the code stream to determine the reference image index of the current image; and determining the reference image according to the reference image index of the current image.
- identification information can first be set to determine the decoding method of the basic grid of the current image. If the basic grid of the current image can be inter-frame decoded, the reference image index of the current image needs to be transmitted. On this basis, the embodiment of the present application first determines whether the basic grid of the current image can be inter-frame decoded. If so, it then determines whether the shift coefficient of the current layer in the current image uses skip decoding; otherwise, the shift coefficient of the current layer in the current image uses the second decoding mode by default.
- the method may further include: when the fourth syntax identification information indicates that the basic grid of the current image does not use inter-frame processing, decoding the code stream and determining the shift coefficient of the current layer in the current image; determining the geometric position information of the reconstructed grid of the current layer in the current image based on the geometric position information of the initial grid of the current layer in the current image and the shift coefficient of the current layer in the current image.
- the shift coefficient of the current layer in the current image can use the second decoding mode, that is, the shift coefficient of the current layer in the current image is determined by decoding the code stream, and then the geometric position information of the reconstructed grid of the current layer in the current image is determined according to the geometric position information of the initial grid of the current layer in the current image and the shift coefficient of the current layer in the current image.
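The decision flow described above can be sketched as follows. The function name, the returned labels, and the use of a callback to read the layer-level flag are all hypothetical; the callback only models the point that the first syntax identification information is parsed solely when the basic grid of the current image uses inter-frame processing.

```python
from typing import Callable


def select_shift_coefficient_source(
    base_mesh_is_inter: bool,
    read_skip_flag: Callable[[], bool],
) -> str:
    """Choose where the current layer's shift coefficients come from.

    base_mesh_is_inter models the fourth syntax identification information;
    read_skip_flag models parsing the first syntax identification information.
    """
    if not base_mesh_is_inter:
        # No inter-frame processing: second decoding mode by default.
        return "decode_current"
    # Inter-frame base grid: the layer-level flag decides.
    return "reuse_reference" if read_skip_flag() else "decode_current"
```

With `"reuse_reference"` the decoder takes the shift coefficient of the current layer in the reference image (first decoding mode); with `"decode_current"` it decodes the shift coefficient of the current layer from the code stream (second decoding mode).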
- an identification information may be first set to determine the decoding method of the base grid of the current image. If the base grid of the current image can be decoded by inter-frame, then the reference image index of the current image needs to be transmitted.
- Alternatively, in an embodiment of the present application, identification information is transmitted to indicate whether the shift coefficient of the current layer in the current image adopts skip decoding, regardless of whether the base grid of the current image can be inter-frame decoded.
- that is, the geometric position information of the initial mesh obtained by subdividing the basic mesh of the current image is combined with the shift coefficient of the reference image to reconstruct the geometric position information of the mesh points.
- This scheme reduces the code stream of the shift coefficient at the cost of a limited reduction in the accuracy of the reconstructed mesh point position information, thereby improving the coding efficiency of the mesh.
- parameter fitting can also be used to establish the mapping relationship between the predicted mesh point geometric position information and the original mesh point geometric position information, so as to ensure the quality of the reconstructed mesh point geometric position information while reducing the code stream size of the shift coefficient, further improving the coding efficiency of the mesh.
- This embodiment provides a decoding method, which determines the basic grid of the current image; subdivides the basic grid to determine the geometric position information of the initial grid of the current layer in the current image; decodes the code stream to determine the first syntax identification information; when the first syntax identification information indicates that the shift coefficient of the current layer in the current image uses the first decoding mode, determines the shift coefficient of the current layer in the reference image; and determines the geometric position information of the reconstructed grid of the current layer in the current image based on the geometric position information of the initial grid of the current layer in the current image and the shift coefficient of the current layer in the reference image. In this way, the first syntax identification information can indicate whether decoding of the shift coefficient of the current layer in the current image is skipped.
- when decoding is skipped, the geometric position information of the reconstructed grid is determined based on the geometric position information of the initial grid of the current layer in the current image and the shift coefficient of the current layer in the reference image. This not only reduces the bit rate of the shift coefficient, but also ensures the reconstructed geometric quality of the points in the grid, thereby improving the geometric information quality of the points in the grid and further improving the encoding and decoding efficiency.
- Referring to FIG. 19, a schematic flow chart of an encoding method provided in an embodiment of the present application is shown. As shown in FIG. 19, the method may include:
- S1901 Determine a basic grid of the current image according to the original grid of the current image.
- the encoding method in the embodiment of the present application may refer to an inter-frame encoding method, and more specifically, may be an inter-frame encoding method for displacement coefficients in a dynamic grid.
- the encoding method may be applied to an encoder in a V-DMC, but is not limited thereto.
- determining the basic grid of the current image based on the original grid of the current image can include: performing downsampling processing on the original grid of the current image to determine the basic grid of the current image.
- the original mesh of the current image may be downsampled to generate a base mesh with a significantly reduced number of vertices.
- the base mesh may be encoded and the obtained encoding bits may be written into the bitstream.
- the code stream here may refer to a basic grid code stream.
- a dynamic grid encoder such as DRACO
- DRACO may be used to encode the geometric information of the basic grid, and the obtained coded bits may be written into the basic grid code stream.
- the geometric information mainly includes the connection relationship and the geometric position information.
- the encoding process can be: first complete the encoding of the connection relationship, then encode the geometric position information of the points based on the connection relationship, and finally encode the texture position information based on the connection relationship and the geometric position information.
- S1902 Subdivide the basic grid to determine the geometric position information of the initial grid of the current layer in the current image.
- At least one layer can be determined; wherein, the at least one layer can include the current layer.
- after the basic mesh is obtained, it can also be subdivided, and new vertices can be inserted on the boundaries of the basic mesh to generate the geometric position information of the initial mesh.
- the basic mesh is regarded as the 0th iteration and corresponds to the 0th layer (level 0), the vertices newly added in the first iteration constitute the 1st layer (level 1), the vertices newly added in the second iteration constitute the 2nd layer (level 2), and the vertices newly added in the third iteration constitute the 3rd layer (level 3).
- the specific LOD division structure is shown in FIG13B, where the top layer is the basic mesh. As the iteration proceeds, the number of newly added vertices increases successively during each iteration to form a pyramid structure.
- subdividing the base grid to determine the geometric position information of the initial grid of the current layer in the current image may include: iteratively dividing the base grid according to a grid subdivision mode to determine the geometric position information of the initial grid of the current layer in the current image.
- the mesh subdivision mode can be understood as upsampling the vertices on each boundary of the base mesh, or can also be understood as interpolating the vertices on each boundary of the base mesh.
- the mesh subdivision mode includes a subdivision algorithm and a number of subdivision iterations.
- the subdivision algorithm can be an interpolation algorithm, for example, the subdivision algorithm can be a linear interpolation algorithm, or can also be a nonlinear interpolation algorithm, which is not specifically limited here.
- the base mesh can be used to obtain the initial mesh (also called "subdivided mesh") through a linear interpolation algorithm.
- the coordinates of the newly inserted points are obtained by linear interpolation based on the two vertices on the current boundary:
- pos 1 and pos 2 are the geometric position coordinates of the two end vertices on the current boundary participating in this iteration, and pos new is the geometric position coordinates of the vertex newly added in this iteration.
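The iterative subdivision and the resulting LOD layers can be sketched as below. The sketch assumes a triangle mesh, midpoint insertion on every boundary (edge) of each triangle, and a one-to-four triangle split per iteration; these choices and the function name `subdivide` are assumptions for illustration, since the grid subdivision mode may use other interpolation algorithms.

```python
from typing import Dict, List, Tuple

Vec3 = Tuple[float, float, float]
Tri = Tuple[int, int, int]


def subdivide(vertices: List[Vec3], triangles: List[Tri], iterations: int):
    """Midpoint-subdivide a triangle mesh and record which vertices each iteration adds.

    Returns (vertices, triangles, levels); levels[0] holds the base-mesh vertex
    indices and levels[i] the indices newly added in iteration i.
    """
    verts = list(vertices)
    tris = list(triangles)
    levels: List[List[int]] = [list(range(len(verts)))]
    for _ in range(iterations):
        midpoint_of: Dict[Tuple[int, int], int] = {}
        new_level: List[int] = []

        def midpoint(i: int, j: int) -> int:
            key = (min(i, j), max(i, j))  # shared edges get one vertex only
            if key not in midpoint_of:
                a, b = verts[i], verts[j]
                verts.append(((a[0] + b[0]) / 2, (a[1] + b[1]) / 2, (a[2] + b[2]) / 2))
                midpoint_of[key] = len(verts) - 1
                new_level.append(len(verts) - 1)
            return midpoint_of[key]

        next_tris: List[Tri] = []
        for a, b, c in tris:
            ab, bc, ca = midpoint(a, b), midpoint(b, c), midpoint(c, a)
            next_tris += [(a, ab, ca), (ab, b, bc), (ca, bc, c), (ab, bc, ca)]
        tris = next_tris
        levels.append(new_level)
    return verts, tris, levels
```

For a single triangle, one iteration adds 3 vertices and the second adds 9, which reflects the pyramid structure in which each LOD layer contains more newly added vertices than the previous one.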
- S1903 Determine a shift coefficient of the current layer in the reference image, and determine the geometric position information of the first reconstructed grid of the current layer in the current image based on the geometric position information of the initial grid of the current layer in the current image and the shift coefficient of the current layer in the reference image.
- the reference image is an encoded image before the current image.
- the reference image may be an image frame before the current image, but this is not specifically limited.
- determining the geometric position information of the first reconstructed grid of the current layer in the current image according to the geometric position information of the initial grid of the current layer in the current image and the shift coefficient of the current layer in the reference image may include: determining a mapping relationship between the geometric position information of the initial grid of the current layer in the current image, the shift coefficient of the current layer in the reference image, and the geometric position information of the first reconstructed grid of the current layer in the current image; and determining the geometric position information of the first reconstructed grid of the current layer in the current image according to the mapping relationship, the geometric position information of the initial grid of the current layer in the current image, and the shift coefficient of the current layer in the reference image.
- the shift coefficient of the current layer in the reference image is used to determine geometric position information of a first reconstruction grid of the current layer in the current image.
- the mapping relationship can be a lookup table (Look Up Table, LUT), which can record the correspondence between the geometric position information of the initial grid of the current layer in the current image and the shift coefficient of the current layer in the reference image and the geometric position information of the first reconstructed grid; or, the mapping relationship can also be a preset function, which can characterize the correspondence between the geometric position information of the initial grid of the current layer in the current image and the shift coefficient of the current layer in the reference image and the geometric position information of the first reconstructed grid.
- the mapping relationship may include at least one of the following: a mapping relationship based on a linear function, a mapping relationship based on a nonlinear function, and a mapping relationship based on a neural network.
- the fitting of such a mapping relationship may include, but is not limited to, linear fitting, curve fitting, or convolution parameter fitting, etc., which is not specifically limited here.
- after determining the mapping relationship, the encoding end can write it into the bitstream.
- the method may also include: determining the mapping indication information of the current layer in the current image based on the mapping relationship; encoding the mapping indication information, and writing the obtained encoding bits into the bitstream.
- the encoding end can determine the mapping relationship between the geometric position information of the initial grid of the current layer in the current image, the shift coefficient of the current layer in the reference image and the geometric position information of the first reconstructed grid, and write the mapping relationship into the bit stream; in this way, the decoding end can determine the mapping relationship by decoding the bit stream, and then restore the geometric position information of the first reconstructed grid.
- the mapping indication information includes first indication information; wherein the first indication information is used to indicate the fitting parameters of the mapping relationship.
- the method may also include: determining the fitting parameters of the mapping relationship; encoding the fitting parameters of the mapping relationship, and writing the obtained encoding bits into the bitstream.
- the mapping indication information further includes second indication information; wherein the second indication information is used to indicate the type of the mapping relationship.
- the method may further include: determining the type and fitting parameters of the mapping relationship; encoding the type and fitting parameters of the mapping relationship, and writing the obtained encoding bits into the bitstream.
- when the mapping relationship is based on a linear function, the fitting parameters may include the slope and/or intercept of the linear function.
- when the mapping relationship is based on a nonlinear function, the fitting parameters may include at least one constant in the nonlinear function.
- when the nonlinear function is an exponential function, the fitting parameters may include the constant a of the nonlinear function; when the nonlinear function is a polynomial function, the fitting parameters may include the coefficients a 0 , a 1 , a 2 , ... of the polynomial function; when the nonlinear function is a logarithmic function, the fitting parameters may include the base a of the nonlinear function.
- the type of mapping relationship may include a linear function type, an exponential function type, a logarithmic function type, a polynomial function type, etc., which is not specifically limited here.
- lvl represents different LOD layers
- reconMesh represents the geometric position information of the initial mesh
- refDisp represents the reconstructed displacement coefficient of the reference image
- predict mesh represents the geometric position information of the first reconstructed mesh obtained by using a certain functional relationship.
- * and + are vector multiplication and addition; k and b represent fitting parameters.
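- The symbols defined above can be tied together in a short sketch. The exact functional form is not reproduced in this text, so the form used here, predictMesh[lvl] = k * (reconMesh[lvl] + refDisp[lvl]) + b, is an assumption chosen so that k = 1 and b = 0 reduce to simply adding the reference displacement to the initial mesh; the function name `predict_mesh` is likewise hypothetical.

```python
import numpy as np

def predict_mesh(recon_mesh, ref_disp, k=1.0, b=0.0):
    """Predict the first reconstructed mesh of one LOD layer.

    Assumed linear form (not confirmed by the source text):
        predictMesh = k * (reconMesh + refDisp) + b
    k and b may be scalars or per-axis vectors (NumPy broadcasting applies).
    """
    recon_mesh = np.asarray(recon_mesh, dtype=float)
    ref_disp = np.asarray(ref_disp, dtype=float)
    return k * (recon_mesh + ref_disp) + b

# Two vertices of one layer and the displacement taken from the reference image.
recon_lvl = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
ref_disp_lvl = np.array([[0.0, 0.0, 0.5], [0.0, 0.0, -0.5]])

pred_identity = predict_mesh(recon_lvl, ref_disp_lvl)             # k = 1, b = 0
pred_fitted = predict_mesh(recon_lvl, ref_disp_lvl, k=2.0, b=1.0)  # example fitted parameters
```

- Under this assumed form, only k and b (the fitting parameters) would need to be signalled for the decoding end to rebuild the same prediction.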
- the mapping indication information includes third indication information; wherein the third indication information is used to indicate the index number of the mapping relationship.
- the method may also include: determining the index number of the mapping relationship; encoding the index number of the mapping relationship, and writing the obtained encoding bits into the bit stream.
- the encoding end and the decoding end are both pre-set with several mapping relationships, and the corresponding mapping relationship can be determined according to the index number.
- after determining the index number of the mapping relationship, the encoding end can write the index number into the code stream; subsequently, the decoding end can obtain the index number of the mapping relationship by decoding the code stream, and then determine the corresponding mapping relationship according to the index number.
- S1904 Determine whether to perform encoding processing on the shift coefficients of the current layer in the current image according to the geometric position information of the first reconstructed grid and the geometric position information of the original grid.
- determining whether to encode the shift coefficient of the current layer in the current image based on the geometric position information of the first reconstructed grid and the geometric position information of the original grid may include: performing error calculation on the geometric position information of the first reconstructed grid and the geometric position information of the original grid to determine a first error result; based on the first error result, determining the encoding mode of the shift coefficient of the current layer in the current image, wherein the encoding mode is used to characterize whether to encode the shift coefficient of the current layer in the current image.
- the method may further include: determining the geometric position information of a second reconstructed grid of the current layer in the current image based on the geometric position information of the initial grid of the current layer in the current image and the shift coefficient of the current layer in the current image; performing error calculation on the geometric position information of the second reconstructed grid and the geometric position information of the original grid to determine a second error result.
- the method may also include: determining the shift coefficient of the current layer in the current image based on the geometric position information of the initial grid of the current layer in the current image and the geometric position information of the original grid of the current layer in the current image.
- determining the shift coefficient of the current layer in the current image based on the geometric position information of the initial mesh of the current layer in the current image and the geometric position information of the original mesh of the current layer in the current image can include: determining the error value of the first vertex between the initial mesh and the original mesh based on the first vertex in the initial mesh, and determining the normal vector of the first vertex; calculating the shift coefficient of the first vertex based on the error value of the first vertex and the normal vector of the first vertex; wherein the first vertex is any vertex in the initial mesh.
- the error value Delta between the initial mesh and the original mesh at the first vertex can be calculated first; Delta can be the error of a point in the world coordinate system. Then, the error Delta of the first vertex and the normal vector Norm of the first vertex are used to calculate the displacement coefficient of the first vertex, as shown in Figure 8.
- Displacement is the displacement coefficient.
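- A minimal sketch of this per-vertex calculation follows. The formula itself is not reproduced in this text, so the sketch assumes that the displacement coefficient is the projection of Delta onto the unit normal (a dot product) and that Delta is taken as original minus initial; both choices, and the function name, are assumptions.

```python
import numpy as np

def displacement_coefficient(initial_vertex, original_vertex, normal):
    """Assumed form: Displacement = Delta . Norm, with Norm normalised to unit length."""
    delta = np.asarray(original_vertex, dtype=float) - np.asarray(initial_vertex, dtype=float)
    norm = np.asarray(normal, dtype=float)
    norm = norm / np.linalg.norm(norm)   # make the normal a unit vector
    return float(np.dot(delta, norm))

# The original vertex lies 0.3 above the initial vertex along the +z normal.
disp = displacement_coefficient(
    initial_vertex=(0.0, 0.0, 0.0),
    original_vertex=(0.1, 0.0, 0.3),
    normal=(0.0, 0.0, 2.0),
)
```

- With this projection only the component of the error along the normal is kept; a codec that also keeps tangential components would store a three-component displacement instead.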
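- determining the coding mode of the shift coefficient of the current layer in the current image based on the first error result may include: determining an error ratio between the first error result and the second error result; if the error ratio is less than a preset threshold, determining that the shift coefficient of the current layer in the current image uses the first coding mode;
- if the error ratio is greater than the preset threshold, determining that the shift coefficient of the current layer in the current image uses the second coding mode.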
- the first coding mode is different from the second coding mode.
- the first coding mode represents skipping the encoding of the shift coefficient of the current layer in the current image;
- the second coding mode represents encoding the shift coefficient of the current layer in the current image.
- when the error ratio is equal to the preset threshold, it can be determined that the shift coefficient of the current layer in the current image uses the first coding mode, or it can be determined that the shift coefficient of the current layer in the current image uses the second coding mode.
- the method may include:
- S2001 Determine geometric position information of a first reconstructed grid of a current layer in a current image according to geometric position information of an initial grid of a current layer in a current image and a shift coefficient of the current layer in a reference image.
- S2003 Determine geometric position information of a second reconstructed grid of the current layer in the current image according to the geometric position information of the initial grid of the current layer in the current image and the shift coefficient of the current layer in the current image.
- S2005 Determine an error ratio between the first error result and the second error result.
- the error calculation may use the mean square error (MSE).
- the method of the embodiment of the present application uses the geometric position information of the initial mesh obtained by subdividing the Base mesh and the Displacement coefficient of the reference image to obtain the first reconstructed mesh; the error between the geometric position information of the first reconstructed mesh and that of the original mesh is the first error result, denoted Dist_skip. Correspondingly, the error between the geometric position information of the second reconstructed mesh, obtained with the Displacement coefficient of the current image, and that of the original mesh is the second error result, denoted Dist_org.
- the error ratio can be expressed as Dist_skip/Dist_org. If Dist_skip/Dist_org is less than the preset threshold, it can be determined that the Displacement coefficient of the lvl layer in the current image uses the first encoding mode, that is, encoding is skipped and the Displacement coefficient of the lvl layer in the current image is not encoded; if Dist_skip/Dist_org is greater than the preset threshold, it can be determined that the Displacement coefficient of the lvl layer in the current image uses the second encoding mode, that is, the Displacement coefficient of the lvl layer in the current image needs to be encoded.
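- The threshold test can be sketched as follows. The sketch assumes that the error is the mean over vertices of the squared Euclidean distance, and it resolves the tie case (ratio equal to the threshold) in favour of encoding, which the text leaves open; the function names and the guard for a zero Dist_org are illustrative assumptions.

```python
import numpy as np

def mse(reference, candidate):
    """Mean over vertices of the squared Euclidean distance (assumed error measure)."""
    reference = np.asarray(reference, dtype=float)
    candidate = np.asarray(candidate, dtype=float)
    return float(np.mean(np.sum((reference - candidate) ** 2, axis=1)))

def choose_mode_by_error_ratio(original, recon_skip, recon_org, threshold):
    """Return 'skip' (first coding mode) or 'encode' (second coding mode)."""
    dist_skip = mse(original, recon_skip)   # first error result
    dist_org = mse(original, recon_org)     # second error result
    if dist_org == 0.0:
        # Guard added for the sketch: skip only if the prediction is also exact.
        return "skip" if dist_skip == 0.0 else "encode"
    return "skip" if dist_skip / dist_org < threshold else "encode"

original = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
recon_org = [[0.0, 0.0, 0.1], [1.0, 0.0, 0.0]]    # rebuilt with the current image's coefficients
recon_skip = [[0.0, 0.0, 0.1], [1.0, 0.0, 0.1]]   # rebuilt with the reference image's coefficients

mode_strict = choose_mode_by_error_ratio(original, recon_skip, recon_org, threshold=1.5)
mode_loose = choose_mode_by_error_ratio(original, recon_skip, recon_org, threshold=3.0)
```

- In this example Dist_skip is twice Dist_org, so a threshold of 1.5 keeps the coefficients while a threshold of 3.0 allows them to be skipped.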
- the method may further include: performing rate-distortion cost calculation on the shift coefficient of the current layer in the current image according to the first coding mode to determine a first rate-distortion result; and performing rate-distortion cost calculation on the shift coefficient of the current layer in the current image according to the second coding mode to determine a second rate-distortion result;
- a coding mode of a shift coefficient of a current layer in a current image is determined according to the first rate-distortion result and the second rate-distortion result.
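- determining the coding mode of the shift coefficient of the current layer in the current image according to the first rate-distortion result and the second rate-distortion result may include: if the first rate-distortion result is smaller than the second rate-distortion result, determining that the shift coefficient of the current layer in the current image uses the first coding mode; if the second rate-distortion result is smaller than the first rate-distortion result, determining that the shift coefficient of the current layer in the current image uses the second coding mode;
- if the first rate-distortion result is equal to the second rate-distortion result, it can be determined that the shift coefficient of the current layer in the current image uses the first coding mode, or it can be determined that the shift coefficient of the current layer in the current image uses the second coding mode.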
- in order to measure more accurately whether the first coding mode or the second coding mode improves the coding performance, the embodiment of the present application performs a rate-distortion trade-off on both coding modes.
- the rate-distortion cost method can be used to calculate a rate-distortion result that jointly accounts for the quality improvement and the increase in bit rate.
- the first rate-distortion result and the second rate-distortion result respectively represent the rate-distortion cost of the reconstructed grid obtained by the first coding mode relative to the original grid and that of the reconstructed grid obtained by the second coding mode relative to the original grid; they are used to compare the compression efficiency of skipping the shift coefficient with that of encoding it.
- the rate-distortion cost can be written as $J = D + \lambda R$, where:
- J is the rate-distortion result
- D is the distortion between the original grid and the reconstructed grid obtained by the first coding mode or the second coding mode, such as the sum of squares of corresponding point errors
- λ is a quantity related to the quantization parameter QP
- R is the total geometric bitstream size divided by the number of frames.
- if the first rate-distortion result is smaller than the second rate-distortion result, the first coding mode can be determined as the optimal coding mode.
- in that case, the Displacement coefficient of the lvl layer in the current image uses the first coding mode, that is, encoding is skipped and the Displacement coefficient of the lvl layer in the current image is not encoded; if the second rate-distortion result is smaller than the first rate-distortion result, the second coding mode can be determined as the optimal coding mode.
- in that case, the Displacement coefficient of the lvl layer in the current image uses the second coding mode, that is, the Displacement coefficient of the lvl layer in the current image needs to be encoded.
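- The rate-distortion comparison can be sketched as below, using the usual Lagrangian cost J = D + λ·R. The numeric values, the function names, and the choice of the second coding mode on a tie are assumptions for illustration; how λ is derived from QP is not specified here and is simply passed in.

```python
def rd_cost(distortion, rate, lam):
    """Lagrangian rate-distortion cost J = D + lambda * R."""
    return distortion + lam * rate

def choose_mode_by_rd(d_skip, r_skip, d_enc, r_enc, lam):
    """Compare the two coding modes and return (mode, cost) of the cheaper one."""
    j_skip = rd_cost(d_skip, r_skip, lam)   # first rate-distortion result
    j_enc = rd_cost(d_enc, r_enc, lam)      # second rate-distortion result
    if j_skip < j_enc:
        return "skip", j_skip
    return "encode", j_enc                  # ties go to 'encode' in this sketch

# Skipping costs almost no bits but leaves more distortion.
high_lambda = choose_mode_by_rd(d_skip=10.0, r_skip=1.0, d_enc=4.0, r_enc=100.0, lam=0.1)
low_lambda = choose_mode_by_rd(d_skip=10.0, r_skip=1.0, d_enc=4.0, r_enc=100.0, lam=0.01)
```

- A larger λ penalises rate more heavily, so the skip mode wins at λ = 0.1 and the encode mode wins at λ = 0.01 in this example.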
- the method may further include: when the shift coefficient of the current layer in the current image uses the second coding mode, encoding the shift coefficient of the current layer in the current image, and writing the obtained coding bits into the bitstream.
- encoding processing is performed on the shift coefficients of the current layer in the current image, and the obtained coded bits are written into the bitstream, which can specifically include: performing a lifting transform on the shift coefficients of the current layer in the current image to determine the lifting transform coefficients; performing quantization processing on the lifting transform coefficients to determine the quantization coefficients; performing coefficient reorganization processing on the quantization coefficients to determine a two-dimensional image; encoding the two-dimensional image, and writing the obtained coded bits into the bitstream.
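- The lifting transform at the start of this pipeline can be illustrated with a small sketch. It assumes a prediction weight of 0.5 and an update weight of 0.125, that each inserted vertex has exactly two parent vertices (the ends of the boundary it was inserted on), and that layers are processed from finest to coarsest; none of these details is given in this text, and the function names are hypothetical.

```python
import numpy as np

def forward_lifting(disp, parents_by_level, pred_w=0.5, upd_w=0.125):
    """disp: (N, 3) displacements; parents_by_level: coarse-to-fine list of {v: (p1, p2)}."""
    coeff = np.array(disp, dtype=float)
    for level in reversed(parents_by_level):      # finest layer first
        for v, (p1, p2) in level.items():         # prediction step
            coeff[v] -= pred_w * (coeff[p1] + coeff[p2])
        for v, (p1, p2) in level.items():         # update step
            coeff[p1] += upd_w * coeff[v]
            coeff[p2] += upd_w * coeff[v]
    return coeff

def inverse_lifting(coeff, parents_by_level, pred_w=0.5, upd_w=0.125):
    """Exact inverse of forward_lifting (up to floating-point rounding)."""
    disp = np.array(coeff, dtype=float)
    for level in parents_by_level:                # coarsest layer first
        for v, (p1, p2) in level.items():         # undo the update step
            disp[p1] -= upd_w * disp[v]
            disp[p2] -= upd_w * disp[v]
        for v, (p1, p2) in level.items():         # undo the prediction step
            disp[v] += pred_w * (disp[p1] + disp[p2])
    return disp

# Vertices 0 and 1 belong to level 0; vertex 2 was inserted on the boundary (0, 1).
disp = np.array([[1.0, 0.0, 0.0], [3.0, 0.0, 0.0], [2.5, 0.0, 0.0]])
parents = [{2: (0, 1)}]
coeff = forward_lifting(disp, parents)
restored = inverse_lifting(coeff, parents)
```

- The detail coefficient of vertex 2 shrinks from 2.5 to 0.5 because most of its value is predicted from its parents, which is what makes the transformed coefficients cheaper to quantize and encode.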
- the implementation steps of the shift coefficient encoding method provided in the embodiment of the present application may include:
- the Base mesh is iterated and divided according to a certain algorithm to obtain the corresponding mesh position information.
- the specific division algorithm is consistent with the above content.
- the vertices on each boundary are used for linear interpolation to obtain the corresponding geometric position information.
- LOD division is performed according to the Displacement obtained by different iterative divisions to obtain the corresponding mesh geometric position information.
- the initial geometric position information is used to calculate the error with the original mesh to obtain the Displacement of each point.
- the shift coefficients are subjected to a lifting wavelet transform, which includes two steps: prediction and update.
- the prediction step is as follows:
- the transformed coefficients are quantized and reorganized.
- the current V-DMC reorganizes coefficients based on blocks.
- the size of each block is 16 × 16.
- the coefficients in each block are arranged according to the Morton code to obtain the corresponding two-dimensional image.
- the two-dimensional image is finally encoded using Video-Codec.
- the shift coefficients may also be directly encoded using an entropy encoder to obtain a shift coefficient code stream.
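- The block-based coefficient reorganization mentioned in the steps above can be sketched as follows. The sketch places the i-th coefficient of a block at the pixel whose Morton (Z-order) code is i; assigning the even bits to the column and the odd bits to the row is an assumption, as are the function names.

```python
import numpy as np

def morton_decode_2d(index):
    """De-interleave the bits of a Morton code into (x, y); even bits -> x (assumed)."""
    x = y = 0
    bit = 0
    while index:
        x |= (index & 1) << bit
        index >>= 1
        y |= (index & 1) << bit
        index >>= 1
        bit += 1
    return x, y

def pack_block(coeffs, block_size=16):
    """Arrange up to block_size * block_size quantized coefficients in Morton order."""
    block = np.zeros((block_size, block_size), dtype=np.int32)
    for i, c in enumerate(coeffs[: block_size * block_size]):
        x, y = morton_decode_2d(i)
        block[y, x] = c   # row = y, column = x
    return block

block = pack_block(list(range(256)))
```

- Morton ordering keeps coefficients that are adjacent in the one-dimensional list close together in the two-dimensional image, which tends to suit block-based video codecs.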
- the method may also include: determining a value of second syntax identification information, wherein the second syntax identification information is used to indicate whether the shift coefficient of the current sequence enables the first coding mode; encoding the value of the second syntax identification information, and writing the obtained coded bits into the bitstream.
- the method may also include: when the second syntax identification information indicates that the shift coefficient of the current sequence enables the first coding mode, determining the value of third syntax identification information, wherein the third syntax identification information is used to indicate whether the shift coefficient of the current image enables the first coding mode; encoding the value of the third syntax identification information, and writing the obtained coded bits into the bitstream.
- the method may further include: when the third syntax identification information indicates that the shift coefficient of the current image enables the first coding mode, determining the value of first syntax identification information, wherein the first syntax identification information is used to indicate whether the shift coefficient of the current layer in the current image uses the first coding mode; encoding the value of the first syntax identification information, and writing the obtained coded bits into the bitstream.
- the second syntax identification information is a syntax element at the Sequence Parameter Set (SPS) level
- the third syntax identification information is a syntax element at the frame parameter set (FPS) level
- the first syntax identification information is a syntax element at the LOD layer level.
- the second syntax identification information is used to indicate whether the shift coefficient of the current sequence enables the first coding mode
- the third syntax identification information is used to indicate whether the shift coefficient of the current image enables the first coding mode
- the first syntax identification information is used to indicate whether the shift coefficient of the current layer uses the first coding mode.
- the current sequence includes at least the current image
- the LOD layer divided by the current image includes at least the current layer.
- for the value of the first syntax identification information: if the shift coefficient of the current layer in the current image uses the first coding mode, that is, encoding of the shift coefficient of the current layer in the current image is skipped, the value of the first syntax identification information is determined to be the first value; if the shift coefficient of the current layer in the current image does not use the first coding mode, that is, the shift coefficient of the current layer in the current image needs to be encoded, the value of the first syntax identification information is determined to be the second value.
- for the value of the second syntax identification information: if the shift coefficient of the current sequence enables the first coding mode, the value of the second syntax identification information is determined to be the first value; if the shift coefficient of the current sequence does not enable the first coding mode, the value of the second syntax identification information is determined to be the second value.
- for the value of the third syntax identification information: if the shift coefficient of the current image enables the first coding mode, the value of the third syntax identification information is determined to be the first value; if the shift coefficient of the current image does not enable the first coding mode, the value of the third syntax identification information is determined to be the second value.
- the first value is different from the second value, and the first value and the second value can be in parameter form or in digital form.
- the first syntax identification information, the second syntax identification information and the third syntax identification information can be parameters written in the profile or the value of a flag, which is not specifically limited here.
- the first value can be set to 1 and the second value can be set to 0; or, the first value can be set to 0 and the second value can be set to 1; or, the first value can be set to true and the second value can be set to false; or, the first value can be set to false and the second value can be set to true.
- the first value is set to 1 and the second value is set to 0, but this is not specifically limited.
- the encoding method of the shift coefficient of the current layer is determined at each LOD level, specifically, whether to use the first encoding mode or the second encoding mode to encode the shift coefficient of the current layer.
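- The conditional presence of the three flags can be sketched as follows, with the first value taken as 1 and the second value as 0. The sketch assumes one bit per flag and collects all flags in a single list, whereas in a real bitstream the sequence-level, frame-level and layer-level flags would sit in different syntax structures; inferring absent flags as "not skipped" and the function names are also assumptions.

```python
def write_skip_flags(seq_enabled, frame_enabled, layer_skip):
    """Return the flag bits in signalling order (one bit per flag, assumed)."""
    bits = [int(seq_enabled)]                  # second syntax flag, sequence (SPS) level
    if seq_enabled:
        bits.append(int(frame_enabled))        # third syntax flag, frame (FPS) level
        if frame_enabled:
            bits.extend(int(s) for s in layer_skip)   # first syntax flag, one per LOD layer
    return bits

def parse_skip_flags(bits, num_layers):
    """Inverse of write_skip_flags; absent flags are inferred as 'not skipped'."""
    it = iter(bits)
    seq_enabled = bool(next(it))
    frame_enabled = bool(next(it)) if seq_enabled else False
    if frame_enabled:
        layer_skip = [bool(next(it)) for _ in range(num_layers)]
    else:
        layer_skip = [False] * num_layers
    return seq_enabled, frame_enabled, layer_skip

bits_full = write_skip_flags(True, True, [True, False, True])
bits_off = write_skip_flags(False, True, [True, False, True])
parsed_full = parse_skip_flags(bits_full, num_layers=3)
parsed_off = parse_skip_flags(bits_off, num_layers=3)
```

- When the sequence-level flag is off, a single bit is spent and no per-layer decision is signalled, so every layer falls back to the second coding mode.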
- the method may also include: determining a value of fourth syntax identification information, wherein the fourth syntax identification information is used to indicate whether the basic grid of the current image uses an inter-frame processing method; encoding the value of the fourth syntax identification information, and writing the obtained coded bits into the bitstream.
- for the value of the fourth syntax identification information: if the basic grid of the current image uses the inter-frame processing method, the value of the fourth syntax identification information is determined to be the first value; if the basic grid of the current image does not use the inter-frame processing method, the value of the fourth syntax identification information is determined to be the second value.
- the first value is different from the second value, and the first value and the second value can be in parameter form or in digital form.
- the fourth syntax identification information can be a parameter written in the profile or a flag value, which is not specifically limited here.
- the first value is set to 1 and the second value is set to 0, but this is not specifically limited either.
- the method may further include: when the base grid of the current image uses an inter-frame processing method, executing a step of determining whether to perform encoding processing on the shift coefficients of the current layer in the current image.
- when the basic grid of the current image uses an inter-frame processing method, the method also includes: determining a reference image index of the current image based on the reference image; encoding the reference image index of the current image, and writing the obtained encoding bits into the bitstream.
- identification information can first be set to indicate the encoding method of the basic grid of the current image. If the basic grid of the current image can be inter-frame encoded, the reference image index of the current image needs to be transmitted. On this basis, the embodiment of the present application first determines whether the basic grid of the current image can be inter-frame encoded. If so, it is then determined whether the shift coefficient of the current layer in the current image uses skip encoding; otherwise, the shift coefficient of the current layer in the current image uses the second encoding mode by default.
- the method may also include: determining the shift coefficient of the current layer in the current image when the fourth syntax identification information indicates that the basic grid of the current image does not use inter-frame processing; determining the geometric position information of the reconstructed grid of the current layer in the current image based on the geometric position information of the initial grid of the current layer in the current image and the shift coefficient of the current layer in the current image.
- that is, when the basic grid of the current image does not use inter-frame processing, the shift coefficient of the current layer can use the second encoding mode by default: after determining the shift coefficient of the current layer in the current image, the geometric position information of the reconstructed grid of the current layer in the current image is determined according to the geometric position information of the initial grid of the current layer in the current image and the shift coefficient of the current layer in the current image.
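- The gating on inter-frame processing described above can be summarised in a short sketch. The dictionary keys (`base_mesh_inter_flag`, `ref_index`, `layer_modes`) and the function name are hypothetical and serve only to show which elements are present in each case.

```python
def frame_syntax(base_mesh_is_inter, ref_index, layer_skip_decisions):
    """Collect the per-frame elements implied by the inter-frame decision (illustrative)."""
    syntax = {"base_mesh_inter_flag": int(base_mesh_is_inter)}   # fourth syntax flag
    if base_mesh_is_inter:
        # Inter-coded base mesh: the reference image index is transmitted and
        # each LOD layer may choose between skipping and encoding.
        syntax["ref_index"] = ref_index
        syntax["layer_modes"] = ["skip" if s else "encode" for s in layer_skip_decisions]
    else:
        # Otherwise every layer uses the second coding mode by default.
        syntax["layer_modes"] = ["encode"] * len(layer_skip_decisions)
    return syntax

inter_frame = frame_syntax(True, ref_index=0, layer_skip_decisions=[False, True, True])
intra_frame = frame_syntax(False, ref_index=0, layer_skip_decisions=[False, True, True])
```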
- an identification information may be first set to determine the encoding method of the base grid of the current image. If the base grid of the current image can be inter-coded, then the reference image index of the current image needs to be transmitted.
- In another embodiment of the present application, regardless of whether the base grid of the current image can be inter-coded, identification information is transmitted to indicate whether the shift coefficient of the current layer in the current image adopts skip coding.
- the geometric position information of the initial mesh obtained by subdividing the basic mesh of the current image and the shift coefficient of the reference image are used to reconstruct the geometric position information of the mesh points.
- This scheme reduces the code stream of the shift coefficient while ensuring the quality of the reconstructed mesh point position information, thereby improving the coding efficiency of the mesh.
- the mapping relationship between the predicted mesh point geometric position information and the original mesh point geometric position information can also be obtained by parameter fitting, so as to ensure the quality of the reconstructed mesh point geometric position information, reduce the code stream size of the shift coefficient, and further improve the coding efficiency of the mesh.
- the embodiment of the present application also provides a code stream, which is generated by bit encoding according to the information to be encoded; wherein the information to be encoded may include at least one of the following:
- the second syntax identification information is used to indicate whether the shift coefficient of the current sequence enables the first coding mode
- the third syntax identification information is used to indicate whether the shift coefficient of the current image enables the first coding mode
- the first syntax identification information is used to indicate whether the shift coefficient of the current layer in the current image uses the first coding mode
- the fourth syntax identification information is used to indicate whether the basic grid of the current image uses inter-frame processing; and the current sequence includes the current image, and the LOD layers divided by the current image include the current layer.
- This embodiment provides a coding method, which determines the basic grid of the current image according to the original grid of the current image; subdivides the basic grid to determine the geometric position information of the initial grid of the current layer in the current image; determines the shift coefficient of the current layer in the reference image, and determines the geometric position information of the first reconstructed grid of the current layer in the current image according to the geometric position information of the initial grid of the current layer in the current image and the shift coefficient of the current layer in the reference image; determines whether to perform coding processing on the shift coefficient of the current layer in the current image according to the geometric position information of the first reconstructed grid and the geometric position information of the original grid.
- the geometric position information of the first reconstructed grid can be determined according to the geometric position information of the initial grid of the current layer in the current image and the shift coefficient of the current layer in the reference image, which not only reduces the code stream of the shift coefficient but also ensures the reconstructed geometric quality of the points in the grid, thereby improving the coding efficiency.
- the embodiment of the present application is based on the existing V-DMC coding foundation, and obtains the corresponding geometric position information of the reconstructed mesh (also referred to as "the geometric position information of the predicted mesh") by using the geometric position information after the Base Mesh division of the current image and the shift coefficient reconstructed by the reference image.
- the shift coefficient can be adaptively encoded, so that the bit rate of the shift coefficient encoding can be reduced on the basis of ensuring the quality of the mesh geometric position information reconstruction, thereby improving the encoding efficiency of the mesh geometric position information.
- Figure 21 is a schematic diagram of the basic principle of determining the geometric position information of the prediction grid provided by an embodiment of the present application.
- the geometric position information of the prediction grid can be determined based on the base grid and the shift coefficients of the reference image; then, based on the parameter correlation between the geometric position information of the points in the prediction grid and the geometric position information of the points in the original grid, it can be determined whether to encode the shift coefficient of the current image.
- lvl represents different LOD layers
- reconMesh represents the geometric position information of the initial mesh
- refDisp represents the displacement coefficient of the reference image
- predict mesh represents the geometric position information of the corresponding predicted mesh obtained by using a certain functional relationship.
- * and + are vector multiplication and addition.
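- As a minimal sketch of how reconMesh and refDisp might be combined into the predicted mesh for one LOD layer, the following assumes that each displacement is a scalar applied along a per-vertex unit normal. The function name `predict_mesh`, the `normals` argument, and the scalar-along-normal form are illustrative assumptions; the source only states that a certain functional relationship with vector multiplication and addition is used.

```python
# Illustrative sketch only. The scalar-along-normal form and all names
# here are assumptions; the source does not fix the exact relationship.

def predict_mesh(recon_mesh, ref_disp, normals):
    """Return predicted vertex positions for one LOD layer.

    recon_mesh: list of (x, y, z) initial positions after base-mesh subdivision
    ref_disp:   list of scalar displacements reused from the reference image
    normals:    list of (x, y, z) unit directions, one per vertex (assumed)
    """
    if not (len(recon_mesh) == len(ref_disp) == len(normals)):
        raise ValueError("per-vertex inputs must have the same length")
    return [
        tuple(p + d * n for p, n in zip(pos, nrm))
        for pos, d, nrm in zip(recon_mesh, ref_disp, normals)
    ]


pred = predict_mesh(
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [0.5, -0.25],
    [(0.0, 0.0, 1.0), (0.0, 1.0, 0.0)],
)
```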
- the MSE algorithm can be used to obtain the error of the mesh geometric position information corresponding to different LOD layers, as shown below:
- first, the error between the point geometric position information of the reconstructed grid obtained by the original encoding scheme and the point geometric position information of the original grid is computed as Dist_org; second, using the method in the embodiment of the present application, the error between the point geometric position information of the predicted grid (obtained from the initial grid produced by the Base Mesh division and the shift coefficient in the reference image) and the point geometric position information of the original grid is computed as Dist_skip.
- if Dist_skip/Dist_org is less than a certain threshold (Th), encoding of the shift coefficient of the current layer in the current image is skipped. Specifically, the shift coefficient of the current layer in the current image is not encoded, and the geometric position information obtained after the Base Mesh division is used directly, together with the shift coefficient in the reference image, to reconstruct the geometric position information of the current reconstructed grid.
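- The threshold test on Dist_skip/Dist_org can be sketched as follows, using a per-vertex mean squared error. The function names and the handling of the case Dist_org = 0 are assumptions made for illustration and are not specified in the source.

```python
# Illustrative sketch of the MSE-ratio skip decision for one LOD layer.

def mse(points_a, points_b):
    """Mean squared distance between two equally sized lists of (x, y, z)."""
    if len(points_a) != len(points_b) or not points_a:
        raise ValueError("point lists must be non-empty and of equal length")
    total = sum(
        (a - b) ** 2
        for pa, pb in zip(points_a, points_b)
        for a, b in zip(pa, pb)
    )
    return total / len(points_a)


def use_skip_mode(original, recon_org, predicted, th):
    """Return True when Dist_skip / Dist_org is below the threshold Th."""
    dist_org = mse(recon_org, original)
    dist_skip = mse(predicted, original)
    if dist_org == 0.0:
        # Assumption: with a lossless baseline, skip only if prediction is exact.
        return dist_skip == 0.0
    return dist_skip / dist_org < th


original = [(0.0, 0.0, 0.0)]
recon_org = [(0.0, 0.0, 1.0)]
skip_close = use_skip_mode(original, recon_org, [(0.0, 0.0, 1.02)], th=1.06)
skip_far = use_skip_mode(original, recon_org, [(0.0, 0.0, 1.1)], th=1.06)
```

- A larger Th makes the skip mode more likely to be chosen, trading reconstruction quality for a smaller bitstream.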
- the basic grid of the current image is obtained by parsing and reconstructing, and the basic grid of the current image is divided by a division algorithm to obtain geometric position information of the initial grid. Then, the decoding mode of the shift coefficient of the current image is determined according to the encoding mode of the shift coefficient of the current image.
- the initial geometric position information obtained by dividing the Base mesh of the current image and the Displacement coefficient are used to reconstruct and restore the geometric position information of the reconstructed mesh;
- the initial geometric position information obtained by dividing the Base mesh of the current image and the Displacement coefficient in the reference image are directly used to reconstruct and restore the geometric position information of the reconstructed mesh.
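- The two decoder-side reconstruction paths can be sketched as a single function that selects the displacement source according to a skip flag. The names are hypothetical, and applying a scalar displacement along a per-vertex normal is an assumption made to keep the example small.

```python
# Illustrative decoder-side sketch: choose which displacements to apply.

def reconstruct_layer(initial, normals, skip_flag, ref_disp, decoded_disp=None):
    """Reconstruct one LOD layer of the current image.

    skip_flag=True  -> reuse the reference image's displacements
    skip_flag=False -> use displacements decoded for the current image
    """
    if skip_flag:
        disp = ref_disp
    else:
        if decoded_disp is None:
            raise ValueError("decoded displacements are required when not skipping")
        disp = decoded_disp
    if not (len(initial) == len(normals) == len(disp)):
        raise ValueError("per-vertex inputs must have the same length")
    return [
        tuple(p + d * n for p, n in zip(pos, nrm))
        for pos, d, nrm in zip(initial, disp, normals)
    ]


initial = [(0.0, 0.0, 0.0)]
normals = [(0.0, 0.0, 1.0)]
from_ref = reconstruct_layer(initial, normals, True, ref_disp=[2.0])
from_cur = reconstruct_layer(initial, normals, False, ref_disp=[2.0], decoded_disp=[5.0])
```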
- the displacement coefficient coding mode of each frame is modified.
- the embodiment of the present application first determines whether the Base Mesh of the current image can be inter-frame coded. If inter-frame coding can be used, then it will determine whether the Displacement coefficient of the current image uses skip coding, otherwise the Displacement coefficient of the current image is defaulted to use the original coding scheme.
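- The conditional signalling described here can be sketched as follows. The syntax element names `base_mesh_inter_flag` and `disp_skip_flag` are hypothetical labels, and the list of (name, value) pairs merely stands in for a real bitstream writer.

```python
# Illustrative sketch: the skip flag is only signalled for inter-coded base meshes.

def write_frame_flags(base_mesh_inter, disp_skip):
    """Return the ordered (name, value) syntax elements for one frame."""
    syntax = [("base_mesh_inter_flag", int(base_mesh_inter))]
    if base_mesh_inter:
        # Skip coding is only considered when a reference image is available.
        syntax.append(("disp_skip_flag", int(disp_skip)))
    # Otherwise the displacement coefficients use the original coding scheme.
    return syntax


inter_frame = write_frame_flags(True, True)
intra_frame = write_frame_flags(False, True)
```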
- the displacement coefficient coding mode of each frame is modified.
- for inter-frame coding, identification information that indicates the coding method of the Base Mesh of the current image is signalled first. If the Base Mesh of the current image is inter-coded, the reference image index of the current image must also be transmitted. In the embodiment of the present application, regardless of whether the Base Mesh of the current image is inter-coded, identification information indicating whether the Displacement coefficient of the current image adopts skip coding must be transmitted.
- the displacement coefficient coding mode for each frame is modified.
- the geometric position information of the initial mesh obtained by using the Base Mesh division of the current image and the Displacement coefficient of the reference image are used as a way to reconstruct the geometric position information of a mesh point.
- This coding scheme reduces the code stream of the encoded Displacement coefficient at the cost of reducing, to a certain extent, the quality of the geometric position information of the reconstructed mesh points, thereby improving the coding efficiency of the mesh.
- the relationship between the predicted mesh point geometric position information and the original mesh geometric position information can be used by using parameter fitting to ensure the quality of the reconstructed mesh point geometric position information, thereby reducing the code stream size of the Displacement coefficient and further improving the coding efficiency of the mesh.
- the Displacement coefficient encoding mode in the current image is determined by using the geometric position information of different LOD layer points after the Base Mesh division in the current image and the Displacement coefficients reconstructed at different LOD layers in the reference image.
- the geometric position information of the corresponding predicted mesh is obtained by using the geometric position information of different LOD layer points after the Base Mesh division in the current image and the Displacement coefficients reconstructed at different LOD layers in the reference image;
- the error Dist_org between the geometric position information of the reconstructed mesh obtained by the original coding scheme and that of the original mesh is calculated, together with the error Dist_skip between the geometric position information of the predicted mesh and that of the original mesh; the relationship between the two errors is used to adaptively determine the encoding mode of the Displacement coefficient in the current image.
- this solution can also first fit the relationship between the geometric position information of the predicted mesh and the geometric position information of the original mesh, and then perform skip encoding using the fitted position information, which can further improve the quality of the reconstructed geometric information of the mesh.
- the fitting method is not restricted; linear fitting, curve fitting or convolution parameter fitting may be used.
- This solution is more about making use of the relationship between the position information of the points after the Base mesh division of the current image, the Displacement coefficient in the reference image and the geometric position information of the mesh of the current image as much as possible.
- the encoding bitstream size of the Displacement coefficient can be reduced, and at the same time, the quality of the point reconstruction geometric information of the mesh can be guaranteed, thereby further improving the quality of the point geometric information of the mesh.
- the Displacement coefficient encoding mode in the current image is determined by using the point geometric position information of different LOD layers after the Base mesh division in the current image and the Displacement coefficients reconstructed at different LOD layers in the reference image. Specifically, at the encoding end, the point geometric position information of different LOD layers after the Base mesh division in the current image and the Displacement coefficients reconstructed at different LOD layers in the reference image are first used to obtain the corresponding predicted mesh geometric position information.
- the error Dist_org between the geometric position information of the reconstructed mesh obtained by the original encoding scheme and that of the original mesh is calculated, together with the error Dist_skip between the geometric position information of the predicted mesh and that of the original mesh, and the relationship between the two errors is used to adaptively determine the encoding mode of the Displacement coefficient in the current image. If encoding of the Displacement coefficient in the current image can be skipped while the quality of the reconstructed mesh position information remains similar to that of the original encoding scheme, the encoding efficiency of the mesh geometric information can be further improved.
- the embodiment of the present application can also firstly fit the relationship between the geometric position information of the predicted mesh and the geometric position information of the original mesh, and then use the fitted geometric position information for skip encoding, which can also further improve the quality of the reconstructed geometric information of the mesh.
- linear fitting, curve fitting or convolution parameter fitting can be used for the fitting method, which is not specifically limited here.
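- As one possible instance of the linear fitting mentioned above, the following sketch fits a gain a and offset b by ordinary least squares so that a * predicted + b approximates the original coordinate values. Treating all coordinates as one flat list and the function names are assumptions; the source does not fix the form of the fit or how its parameters would be conveyed.

```python
# Illustrative sketch: least-squares linear fit from predicted to original values.

def fit_linear(predicted, original):
    """Return (a, b) minimising sum((a * p + b - o) ** 2)."""
    if len(predicted) != len(original) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_o = sum(original) / n
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_p == 0.0:
        # Degenerate case: only an offset can be estimated.
        return 1.0, mean_o - mean_p
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, original))
    a = cov / var_p
    return a, mean_o - a * mean_p


def apply_linear(predicted, a, b):
    """Apply the fitted mapping to predicted coordinate values."""
    return [a * p + b for p in predicted]


a, b = fit_linear([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
fitted = apply_linear([0.0, 1.0, 2.0, 3.0], a, b)
```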
- the embodiments of the present application are more about making use of the relationship between the geometric position information of the points after the Base mesh division of the current image, the Displacement coefficient in the reference image, and the geometric position information of the mesh of the current image as much as possible.
- BD-Rate is a performance indicator for measuring lossy compression efficiency.
- when BD-Rate is less than 0, the coding efficiency is improved relative to the existing coding scheme.
- when the threshold Th is set to 1.06, 1.07 and 1.08, the bitstream can be reduced by about 12%, 30% and 38%, respectively, while D1 decreases by about 0.25 dB, 0.731 dB and 0.96 dB and D2 decreases by about 0.279 dB, 0.8 dB and 1.10 dB, respectively; thus, the bit rate can be reduced and the encoding and decoding efficiency can be improved.
- the encoder 220 may include a first determination unit 2201, a first subdivision unit 2202, a first reconstruction unit 2203 and an encoding unit 2204, wherein:
- the first determining unit 2201 is configured to determine a basic grid of the current image according to an original grid of the current image
- a first subdivision unit 2202 is configured to subdivide the basic grid and determine geometric position information of an initial grid of a current layer in a current image
- the first reconstruction unit 2203 is configured to determine a shift coefficient of the current layer in the reference image, and determine geometric position information of a first reconstructed grid of the current layer in the current image according to the geometric position information of the initial grid of the current layer in the current image and the shift coefficient of the current layer in the reference image;
- the encoding unit 2204 is configured to determine whether to perform encoding processing on the shift coefficients of the current layer in the current image according to the geometric position information of the first reconstructed grid and the geometric position information of the original grid.
- the reference picture is an encoded picture preceding the current picture.
- the first determining unit 2201 is further configured to perform downsampling processing on the original grid of the current image to determine the basic grid of the current image.
- the first determination unit 2201 is further configured to determine the shift coefficient of the current layer in the current image based on the geometric position information of the initial grid of the current layer in the current image and the geometric position information of the original grid of the current layer in the current image.
- the first determination unit 2201 is also configured to determine the error value of the first vertex between the initial mesh and the original mesh based on the first vertex in the initial mesh, and determine the normal vector of the first vertex; and calculate the shift coefficient of the first vertex based on the error value of the first vertex and the normal vector of the first vertex; wherein the first vertex is any vertex in the initial mesh.
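- One way to read this step is as a projection of the per-vertex error vector onto the vertex normal. The sketch below assumes the corresponding point on the original mesh is already known and that the shift coefficient is the signed length of that projection; both are assumptions made for illustration.

```python
# Illustrative sketch: shift coefficient as the error projected onto the normal.
import math


def shift_coefficient(initial_vertex, original_vertex, normal):
    """Signed displacement of a vertex along its (not necessarily unit) normal."""
    length = math.sqrt(sum(n * n for n in normal))
    if length == 0.0:
        raise ValueError("normal vector must be non-zero")
    error = [o - i for i, o in zip(initial_vertex, original_vertex)]
    # Dot product of the error with the normalised normal.
    return sum(e * n for e, n in zip(error, normal)) / length


coeff = shift_coefficient((0.0, 0.0, 0.0), (1.0, 2.0, 3.0), (0.0, 0.0, 2.0))
```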
- the first determination unit 2201 is further configured to perform error calculation on the geometric position information of the first reconstructed grid and the geometric position information of the original grid to determine a first error result; and based on the first error result, determine the encoding mode of the shift coefficient of the current layer in the current image, wherein the encoding mode is used to characterize whether to perform encoding processing on the shift coefficient of the current layer in the current image.
- the first determination unit 2201 is also configured to determine the geometric position information of the second reconstructed grid of the current layer in the current image based on the geometric position information of the initial grid of the current layer in the current image and the shift coefficient of the current layer in the current image; and perform error calculation on the geometric position information of the second reconstructed grid and the geometric position information of the original grid to determine a second error result.
- the first determining unit 2201 is further configured to determine an error ratio between the first error result and the second error result; if the error ratio is less than a preset threshold, determine that the shift coefficient of the current layer in the current image uses the first coding mode; if the error ratio is greater than the preset threshold, determine that the shift coefficient of the current layer in the current image uses the second coding mode; wherein the first coding mode is different from the second coding mode.
- the first determination unit 2201 is further configured to perform rate-distortion cost calculation on the shift coefficient of the current layer in the current image according to the first coding mode to determine a first rate-distortion result; and perform rate-distortion cost calculation on the shift coefficient of the current layer in the current image according to the second coding mode to determine a second rate-distortion result; and determine the coding mode of the shift coefficient of the current layer in the current image based on the first rate-distortion result and the second rate-distortion result.
- the first determination unit 2201 is further configured to determine that the shift coefficient of the current layer in the current image uses the first encoding mode if the first rate-distortion result is smaller than the second rate-distortion result; and to determine that the shift coefficient of the current layer in the current image uses the second encoding mode if the second rate-distortion result is smaller than the first rate-distortion result.
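- The rate-distortion comparison can be sketched with the common Lagrangian cost J = D + lambda * R. The cost form, the lambda parameter, and the tie-breaking toward the second mode are assumptions; the source only states that the mode with the smaller rate-distortion result is chosen.

```python
# Illustrative sketch of a rate-distortion mode decision for one layer.

def rd_cost(distortion, bits, lam):
    """Lagrangian rate-distortion cost (assumed form)."""
    return distortion + lam * bits


def choose_mode(dist_skip, bits_skip, dist_coded, bits_coded, lam):
    """Return 'first' (skip) or 'second' (encode the shift coefficients)."""
    cost_first = rd_cost(dist_skip, bits_skip, lam)
    cost_second = rd_cost(dist_coded, bits_coded, lam)
    # Assumption: ties fall back to the second (explicit coding) mode.
    return "first" if cost_first < cost_second else "second"


mode_a = choose_mode(dist_skip=1.05, bits_skip=1, dist_coded=1.0, bits_coded=400, lam=0.01)
mode_b = choose_mode(dist_skip=5.0, bits_skip=1, dist_coded=1.0, bits_coded=400, lam=0.001)
```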
- the first encoding mode represents skipping of encoding the shift coefficients of the current layer in the current image; and the second encoding mode represents encoding the shift coefficients of the current layer in the current image.
- the encoding unit 2204 is further configured to encode the shift coefficient of the current layer in the current image when the shift coefficient of the current layer in the current image uses the second encoding mode, and write the obtained encoding bits into the bitstream.
- the first determining unit 2201 is further configured to determine a value of second syntax identification information, wherein the second syntax identification information is used to indicate whether the shift coefficient of the current sequence enables the first coding mode;
- the encoding unit 2204 is further configured to perform encoding processing on the value of the second syntax identification information, and write the obtained encoding bits into the bit stream.
- the first determining unit 2201 is further configured to determine a value of third syntax identification information when the second syntax identification information indicates that the shift coefficient of the current sequence enables the first coding mode, wherein the third syntax identification information is used to indicate whether the shift coefficient of the current image enables the first coding mode;
- the encoding unit 2204 is further configured to perform encoding processing on the value of the third syntax identification information, and write the obtained encoding bits into the bit stream.
- the first determining unit 2201 is further configured to determine a value of the first syntax identification information when the third syntax identification information indicates that the shift coefficient of the current image enables the first coding mode, wherein the first syntax identification information is used to indicate whether the shift coefficient of the current layer in the current image uses the first coding mode;
- the encoding unit 2204 is further configured to perform encoding processing on the value of the first syntax identification information, and write the obtained encoding bits into the bit stream.
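- The nesting of the second (sequence-level), third (image-level) and first (layer-level) syntax identification information can be sketched as below. One bit per flag and the flat list used as the bitstream are simplifying assumptions, and the function name is hypothetical.

```python
# Illustrative sketch: hierarchical signalling of the skip-mode flags.

def write_skip_flags(seq_enabled, image_enabled, layer_uses_skip):
    """Return the flag bits for one image.

    seq_enabled:     second syntax identification information (sequence level)
    image_enabled:   third syntax identification information (image level)
    layer_uses_skip: first syntax identification information, one per LOD layer
    """
    bits = [int(seq_enabled)]
    if seq_enabled:
        bits.append(int(image_enabled))
        if image_enabled:
            bits.extend(int(flag) for flag in layer_uses_skip)
    return bits


bits_full = write_skip_flags(True, True, [True, False, True])
bits_image_off = write_skip_flags(True, False, [True, False, True])
bits_seq_off = write_skip_flags(False, True, [True, False, True])
```

- In a real codec the sequence-level flag would be written once per sequence rather than per image; it is repeated here only to keep the example self-contained.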
- the first determining unit 2201 is further configured to determine a value of fourth syntax identification information, wherein the fourth syntax identification information is used to indicate whether the basic grid of the current image uses an inter-frame processing method;
- the encoding unit 2204 is further configured to perform encoding processing on the value of the fourth syntax identification information, and write the obtained encoding bits into the bitstream.
- the first determination unit 2201 is further configured to execute a step of determining whether to perform encoding processing on the shift coefficients of the current layer in the current image when the base grid of the current image uses an inter-frame processing method.
- the first determining unit 2201 is further configured to determine a reference image index of the current image according to the reference image when the base grid of the current image uses an inter-frame processing method;
- the encoding unit 2204 is further configured to perform encoding processing on the reference image index of the current image, and write the obtained encoding bits into the bitstream.
- the first determining unit 2201 is further configured to perform a lifting transformation on the shift coefficients of the current layer in the current image to determine the lifting transformation coefficients; perform a quantization process on the lifting transformation coefficients to determine the quantization coefficients; and perform a coefficient reorganization process on the quantization coefficients to determine the two-dimensional image;
- the encoding unit 2204 is further configured to perform encoding processing on the two-dimensional image and write the obtained encoding bits into the bit stream.
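- The quantization and coefficient reorganization stages can be sketched as follows. Uniform scalar quantization, raster-scan packing, zero padding, and the function names are all assumptions for illustration, and the lifting transformation itself is omitted.

```python
# Illustrative sketch: quantize coefficients and pack them into a 2D image.

def quantize(coeffs, step):
    """Uniform scalar quantization (assumed) of transform coefficients."""
    if step <= 0:
        raise ValueError("step must be positive")
    return [int(round(c / step)) for c in coeffs]


def pack_to_image(values, width):
    """Arrange values row by row into a 2D image, zero-padding the last row."""
    if width <= 0:
        raise ValueError("width must be positive")
    rows = []
    for start in range(0, len(values), width):
        row = list(values[start:start + width])
        rows.append(row + [0] * (width - len(row)))
    return rows


levels = quantize([0.9, -2.1, 4.2, 0.1, 3.0], step=1.0)
image = pack_to_image(levels, width=2)
```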
- the first reconstruction unit 2203 is also configured to determine a mapping relationship between geometric position information of an initial grid of a current layer in a current image, a shift coefficient of a current layer in a reference image, and geometric position information of a first reconstructed grid of a current layer in a current image; and determine geometric position information of a first reconstructed grid of a current layer in a current image based on the mapping relationship and the geometric position information of an initial grid of a current layer in a current image and a shift coefficient of a current layer in a reference image.
- the mapping relationship includes at least one of the following: a mapping relationship based on a linear function, a mapping relationship based on a nonlinear function, and a mapping relationship based on a neural network.
- the first determining unit 2201 is further configured to determine mapping indication information of a current layer in the current image based on the mapping relationship;
- the encoding unit 2204 is further configured to perform encoding processing on the mapping indication information and write the obtained encoding bits into the bit stream.
- the first subdivision unit 2202 is further configured to iteratively divide the basic grid according to the grid subdivision mode to determine the geometric position information of the initial grid of the current layer in the current image.
- the mesh subdivision mode includes: a subdivision algorithm and a number of subdivision iterations.
- the subdivision algorithm is a linear interpolation algorithm.
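- A subdivision based on linear interpolation can be sketched as midpoint subdivision of a triangle mesh, repeated for the configured number of iterations. The triangle-mesh representation and the function names are assumptions; the source only states that the mode consists of a subdivision algorithm and an iteration count.

```python
# Illustrative sketch: iterative midpoint (linear interpolation) subdivision.

def subdivide_once(vertices, faces):
    """Split every triangle into four by inserting edge midpoints."""
    vertices = list(vertices)
    midpoint_index = {}

    def midpoint(a, b):
        key = (min(a, b), max(a, b))  # shared edges reuse the same new vertex
        if key not in midpoint_index:
            va, vb = vertices[a], vertices[b]
            vertices.append(tuple((x + y) / 2.0 for x, y in zip(va, vb)))
            midpoint_index[key] = len(vertices) - 1
        return midpoint_index[key]

    new_faces = []
    for a, b, c in faces:
        ab, bc, ca = midpoint(a, b), midpoint(b, c), midpoint(c, a)
        new_faces += [(a, ab, ca), (ab, b, bc), (ca, bc, c), (ab, bc, ca)]
    return vertices, new_faces


def subdivide(vertices, faces, iterations):
    """Apply midpoint subdivision the given number of times."""
    for _ in range(iterations):
        vertices, faces = subdivide_once(vertices, faces)
    return vertices, faces


base_vertices = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
base_faces = [(0, 1, 2)]
v1, f1 = subdivide(base_vertices, base_faces, 1)
v2, f2 = subdivide(base_vertices, base_faces, 2)
```

- Vertices added in iteration k can be treated as belonging to LOD layer k, which is one way the per-layer processing described elsewhere in this text could be organised.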
- a "unit" may be a part of a circuit, a part of a processor, a part of a program or software, etc.; of course, it may be a module, or it may be non-modular.
- the components in the present embodiment may be integrated into a processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
- the above-mentioned integrated unit may be implemented in the form of hardware or in the form of a software functional module.
- if the integrated unit is implemented in the form of a software function module and is not sold or used as an independent product, it can be stored in a computer-readable storage medium.
- the technical solution of this embodiment, in essence, or the part of it that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product.
- the computer software product is stored in a storage medium and includes several instructions that cause a computer device (which can be a personal computer, a server, a network device, etc.) or a processor to perform all or part of the steps of the method described in this embodiment.
- the aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a mobile hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disk.
- an embodiment of the present application provides a computer-readable storage medium, which is applied to the encoder 220.
- the computer-readable storage medium stores a computer program, and when the computer program is executed by the first processor, the method described in any one of the aforementioned embodiments is implemented.
- the encoder 220 may include: a first communication interface 2301, a first memory 2302 and a first processor 2303; the components are coupled together through a first bus system 2304. It can be understood that the first bus system 2304 is used to achieve connection and communication between these components. In addition to the data bus, the first bus system 2304 also includes a power bus, a control bus and a status signal bus. However, for the sake of clarity, the various buses are all labeled as the first bus system 2304 in Figure 23, wherein:
- the first communication interface 2301 is used for receiving and sending signals during the process of sending and receiving information with other external network elements;
- a first memory 2302 used to store a computer program that can be run on the first processor 2303;
- the first processor 2303 is configured to, when running the computer program, execute:
- according to the geometric position information of the first reconstructed grid and the geometric position information of the original grid, determining whether to perform encoding processing on the shift coefficients of the current layer in the current image.
- the first memory 2302 in the embodiment of the present application can be a volatile memory or a non-volatile memory, or can include both volatile and non-volatile memories.
- the non-volatile memory can be a read-only memory (ROM), a programmable read-only memory (PROM), an erasable programmable read-only memory (EPROM), an electrically erasable programmable read-only memory (EEPROM), or a flash memory.
- the volatile memory can be a random access memory (RAM), which is used as an external cache.
- many forms of RAM are available, for example static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate synchronous DRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchronous link DRAM (SLDRAM) and direct Rambus RAM (DRRAM).
- the first processor 2303 may be an integrated circuit chip with signal processing capabilities. In the implementation process, each step of the above method can be completed by the hardware integrated logic circuit or software instructions in the first processor 2303.
- the above-mentioned first processor 2303 can be a general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field programmable gate array (Field Programmable Gate Array, FPGA) or other programmable logic devices, discrete gates or transistor logic devices, discrete hardware components.
- the methods, steps and logic block diagrams disclosed in the embodiments of the present application can be implemented or executed.
- the general-purpose processor can be a microprocessor or the processor can also be any conventional processor, etc.
- the steps of the method disclosed in the embodiments of the present application can be directly executed by a hardware decoding processor, or executed by a combination of hardware and software modules in the decoding processor.
- the software module can be located in a mature storage medium in the field such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory or an electrically erasable programmable memory, a register, etc.
- the storage medium is located in the first memory 2302, and the first processor 2303 reads the information in the first memory 2302 and completes the steps of the above method in combination with its hardware.
- the embodiments described in this application can be implemented by hardware, software, firmware, middleware, microcode or a combination thereof.
- the processing unit can be implemented in one or more application-specific integrated circuits (ASIC), digital signal processors (DSP), digital signal processing devices (DSPD), programmable logic devices (PLD), field-programmable gate arrays (FPGA), general-purpose processors, controllers, microcontrollers, microprocessors, other electronic units for performing the functions described in the present application, or combinations thereof.
- the technology described in the present application can be implemented by modules (such as procedures, functions, etc.) that perform the functions described in the present application.
- the software code can be stored in a memory and executed by a processor.
- the memory can be implemented in the processor or outside the processor.
- the first processor 2303 is further configured to execute the method described in any one of the aforementioned embodiments when running the computer program.
- the present embodiment provides an encoder that determines whether to perform encoding processing on the shift coefficient of the current layer in the current image. If encoding of the shift coefficient of the current layer in the current image is skipped, the geometric position information of the first reconstructed grid can be determined based on the geometric position information of the initial grid of the current layer in the current image and the shift coefficient of the current layer in the reference image. This reduces the code rate of the shift coefficient while preserving the reconstructed geometric quality of the points in the grid, thereby improving the encoding efficiency.
- the decoder 240 may include a second determination unit 2401, a second subdivision unit 2402, a decoding unit 2403, and a second reconstruction unit 2404, wherein:
- the second determining unit 2401 is configured to determine a basic grid of the current image
- a second subdivision unit 2402 is configured to subdivide the basic grid and determine geometric position information of an initial grid of a current layer in a current image
- the decoding unit 2403 is configured to decode the code stream, determine the first syntax identification information; and determine the shift coefficient of the current layer in the reference image when the first syntax identification information indicates that the shift coefficient of the current layer in the current image uses the first decoding mode;
- the second reconstruction unit 2404 is configured to determine the geometric position information of the reconstructed grid of the current layer in the current image according to the geometric position information of the initial grid of the current layer in the current image and the shift coefficient of the current layer in the reference image.
- the reference picture is a decoded picture preceding the current picture.
- the decoding unit 2403 is further configured to decode the bitstream and determine the second syntax identification information; when the second syntax identification information indicates that the shift coefficient of the current sequence enables the first decoding mode, decode the bitstream and determine the third syntax identification information; when the third syntax identification information indicates that the shift coefficient of the current image enables the first decoding mode, decode the bitstream and determine the first syntax identification information; wherein the first syntax identification information is used to indicate whether the shift coefficient of the current layer uses the first decoding mode, the current sequence includes the current image, and the LOD layers into which the current image is divided include the current layer.
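- The nested parsing order on the decoder side can be sketched as below. One bit per flag, the list used as the bitstream, and inferring "not skipped" for layers whose flag is absent are assumptions made for illustration.

```python
# Illustrative sketch: parse the sequence-, image- and layer-level flags.

def parse_skip_flags(bits, num_layers):
    """Return one boolean per LOD layer: True if the first decoding mode is used."""
    reader = iter(bits)
    seq_enabled = next(reader) == 1      # second syntax identification information
    if not seq_enabled:
        return [False] * num_layers      # assumed inference when flags are absent
    image_enabled = next(reader) == 1    # third syntax identification information
    if not image_enabled:
        return [False] * num_layers
    # First syntax identification information, one flag per LOD layer.
    return [next(reader) == 1 for _ in range(num_layers)]


layers_full = parse_skip_flags([1, 1, 1, 0, 1], num_layers=3)
layers_image_off = parse_skip_flags([1, 0], num_layers=3)
layers_seq_off = parse_skip_flags([0], num_layers=3)
```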
- the decoding unit 2403 is further configured to decode the code stream and determine the shift coefficient of the current layer in the current image when the first syntax identification information indicates that the shift coefficient of the current layer in the current image uses the second decoding mode;
- the second determining unit 2401 is further configured to determine the geometric position information of the reconstructed grid of the current layer in the current image according to the geometric position information of the initial grid of the current layer in the current image and the shift coefficient of the current layer in the current image.
- the first decoding mode is different from the second decoding mode; wherein: the first decoding mode represents skipping of decoding the shift coefficients of the current layer in the current image; and the second decoding mode represents decoding the shift coefficients of the current layer in the current image.
- the decoding unit 2403 is further configured to decode the code stream and determine fourth syntax identification information; when the fourth syntax identification information indicates that the basic grid of the current image uses an inter-frame processing method, execute the step of decoding the code stream and determining the first syntax identification information.
- the decoding unit 2403 is further configured to decode the code stream and determine the reference image index of the current image when the fourth syntax identification information indicates that the base grid of the current image uses the inter-frame processing mode;
- the second determining unit 2401 is further configured to determine a reference image according to a reference image index of a current image.
- the decoding unit 2403 is further configured to decode the code stream and determine the shift coefficient of the current layer in the current image when the fourth syntax identification information indicates that the base grid of the current image does not use the inter-frame processing mode;
- the second determining unit 2401 is further configured to determine the geometric position information of the reconstructed grid of the current layer in the current image according to the geometric position information of the initial grid of the current layer in the current image and the shift coefficient of the current layer in the current image.
- the decoding unit 2403 is further configured to decode the code stream to determine a two-dimensional image of a current layer in the current image;
- the second determination unit 2401 is further configured to perform coefficient reorganization processing on the two-dimensional image to determine the lifting transformation coefficients of the current layer; and perform inverse transformation processing on the lifting transformation coefficients of the current layer to determine the shift coefficients of the current layer.
- the second determination unit 2401 is further configured to perform coefficient reorganization processing on the two-dimensional image to determine the quantization coefficients of the current layer; and perform inverse quantization processing on the quantization coefficients of the current layer to determine the lifting transformation coefficients of the current layer.
- the second reconstruction unit 2404 is further configured to determine a mapping relationship between geometric position information of an initial grid of a current layer in a current image, a shift coefficient of a current layer in a reference image, and geometric position information of a reconstructed grid of a current layer in a current image; and to determine the geometric position information of a reconstructed grid of a current layer in a current image based on the mapping relationship and the geometric position information of an initial grid of a current layer in a current image and the shift coefficient of a current layer in a reference image.
- the mapping relationship includes at least one of the following: a mapping relationship based on a linear function, a mapping relationship based on a nonlinear function, and a mapping relationship based on a neural network.
- the decoding unit 2403 is further configured to decode the code stream to determine mapping indication information of the current layer in the current image;
- the second determination unit 2401 is further configured to determine the mapping relationship between the geometric position information of the initial grid of the current layer in the current image, the shift coefficient of the current layer in the reference image and the geometric position information of the reconstructed grid of the current layer in the current image according to the mapping indication information.
- the mapping indication information includes first indication information; the second determination unit 2401 is also configured to determine the fitting parameters of the mapping relationship based on the first indication information; and determine the mapping relationship between the geometric position information of the initial grid of the current layer in the current image, the shift coefficient of the current layer in the reference image, and the geometric position information of the reconstructed grid of the current layer in the current image based on the fitting parameters.
- the mapping indication information also includes second indication information; the second determination unit 2401 is further configured to determine the type of mapping relationship based on the second indication information; and determine the mapping relationship between the geometric position information of the initial grid of the current layer in the current image, the shift coefficient of the current layer in the reference image, and the geometric position information of the reconstructed grid of the current layer in the current image based on the type of mapping relationship and the fitting parameters.
- the mapping indication information includes third indication information; the second determination unit 2401 is also configured to determine the index number of the mapping relationship based on the third indication information; and determine the mapping relationship between the geometric position information of the initial grid of the current layer in the current image, the shift coefficient of the current layer in the reference image, and the geometric position information of the reconstructed grid of the current layer in the current image based on the index number of the mapping relationship.
- the decoding unit 2403 is further configured to decode the code stream to determine a base grid of the current image.
- the second subdivision unit 2402 is further configured to iteratively divide the basic grid according to the grid subdivision mode to determine the geometric position information of the initial grid of the current layer in the current image.
- the mesh subdivision mode includes: a subdivision algorithm and a number of subdivision iterations.
- the subdivision algorithm is a linear interpolation algorithm.
- a "unit" can be a part of a circuit, a part of a processor, a part of a program or software, etc., and of course it can also be a module, or it can be non-modular.
- the components in this embodiment can be integrated into a processing unit, or each unit can exist physically separately, or two or more units can be integrated into one unit.
- the above-mentioned integrated unit can be implemented in the form of hardware or in the form of a software functional module.
- if the integrated unit is implemented in the form of a software functional module and is not sold or used as an independent product, it can be stored in a computer-readable storage medium.
- this embodiment provides a computer-readable storage medium, which is applied to the decoder 240, and the computer-readable storage medium stores a computer program. When the computer program is executed by the second processor, the method described in any one of the above embodiments is implemented.
- the decoder 240 may include: a second communication interface 2501, a second memory 2502 and a second processor 2503; each component is coupled together through a second bus system 2504. It can be understood that the second bus system 2504 is used to achieve connection and communication between these components.
- the second bus system 2504 also includes a power bus, a control bus and a status signal bus. However, for the sake of clarity, the various buses are all labeled as the second bus system 2504 in Figure 25.
- the second communication interface 2501 is used for receiving and sending signals during the process of sending and receiving information with other external network elements;
- the second memory 2502 is used to store a computer program that can be run on the second processor 2503;
- the second processor 2503 is configured to, when running the computer program, execute:
- the geometric position information of the reconstructed grid of the current layer in the current image is determined according to the geometric position information of the initial grid of the current layer in the current image and the shift coefficient of the current layer in the reference image.
- the second processor 2503 is further configured to execute any one of the methods described in the foregoing embodiments when running the computer program.
- This embodiment provides a decoder in which the first syntax identification information indicates whether decoding of the shift coefficient of the current layer in the current image is skipped. When decoding is skipped, the geometric position information of the initial grid of the current layer and the shift coefficient of the current layer in the reference image are used to determine the geometric position information of the reconstructed grid, which reduces the bit rate of the shift coefficients while preserving the reconstructed geometric quality of the points in the grid, thereby improving encoding and decoding efficiency.
- a schematic diagram of the composition structure of a coding and decoding system provided in an embodiment of the present application is shown.
- a coding and decoding system 260 may include an encoder 2601 and a decoder 2602 .
- the encoder 2601 may be the encoder described in any one of the aforementioned embodiments
- the decoder 2602 may be the decoder described in any one of the aforementioned embodiments.
- the base grid of the current image is determined; the base grid is subdivided to determine the geometric position information of the initial grid of the current layer in the current image; the shift coefficient of the current layer in the reference image is determined, and based on the geometric position information of the initial grid of the current layer in the current image and the shift coefficient of the current layer in the reference image, the geometric position information of the first reconstructed grid of the current layer in the current image is determined; based on the geometric position information of the first reconstructed grid and the geometric position information of the original grid, it is determined whether to encode the shift coefficient of the current layer in the current image.
- the base grid of the current image is determined; the base grid is subdivided to determine the geometric position information of the initial grid of the current layer in the current image; the code stream is decoded to determine the first syntax identification information; when the first syntax identification information indicates that the shift coefficient of the current layer in the current image uses the first decoding mode, the shift coefficient of the current layer in the reference image is determined; based on the geometric position information of the initial grid of the current layer in the current image and the shift coefficient of the current layer in the reference image, the geometric position information of the reconstructed grid of the current layer in the current image is determined.
- the adaptive encoding of the shift coefficient in the current image can be determined; if the encoding of the shift coefficient in the current image is skipped, the shift coefficient in the current image does not need to be transmitted in the bitstream at this time.
- the decoding end can determine the geometric position information of the first reconstructed grid based on the geometric position information of the initial grid and the shift coefficient in the reference image, which not only reduces the bits spent on the shift coefficients, but also preserves the reconstructed geometric quality of the points in the grid, thereby improving the encoding and decoding efficiency.
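The iterative subdivision of the base grid mentioned above (a subdivision algorithm based on linear interpolation, applied for a given number of iterations) can be illustrated with a minimal sketch. It assumes triangular faces and midpoint insertion on every edge; the function names `subdivide_once` and `subdivide` are hypothetical and are not taken from the application or from any reference software.

```python
from typing import Dict, List, Tuple

Vec3 = Tuple[float, float, float]
Face = Tuple[int, int, int]


def subdivide_once(vertices: List[Vec3], faces: List[Face]) -> Tuple[List[Vec3], List[Face]]:
    """Split every triangle into four by inserting one midpoint per edge."""
    new_vertices = list(vertices)  # original vertices keep their indices
    midpoint_index: Dict[Tuple[int, int], int] = {}

    def midpoint(i: int, j: int) -> int:
        # Edges shared by two triangles must yield a single new vertex.
        key = (min(i, j), max(i, j))
        if key not in midpoint_index:
            a, b = vertices[i], vertices[j]
            # Linear interpolation at the middle of the edge.
            new_vertices.append(((a[0] + b[0]) / 2, (a[1] + b[1]) / 2, (a[2] + b[2]) / 2))
            midpoint_index[key] = len(new_vertices) - 1
        return midpoint_index[key]

    new_faces: List[Face] = []
    for a, b, c in faces:
        ab, bc, ca = midpoint(a, b), midpoint(b, c), midpoint(c, a)
        new_faces += [(a, ab, ca), (ab, b, bc), (ca, bc, c), (ab, bc, ca)]
    return new_vertices, new_faces


def subdivide(vertices: List[Vec3], faces: List[Face], iterations: int) -> Tuple[List[Vec3], List[Face]]:
    """Apply midpoint subdivision the requested number of times."""
    for _ in range(iterations):
        vertices, faces = subdivide_once(vertices, faces)
    return vertices, faces
```

In this sketch the vertices added by each iteration are appended after the existing ones, so each iteration naturally corresponds to one additional layer of detail on top of the base grid.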
Abstract
Description
The embodiments of the present application relate to the field of dynamic grid (mesh) coding and decoding technology, and in particular to a coding and decoding method, a code stream, an encoder, a decoder, and a storage medium.
In the standard reference software for Dynamic Mesh Coding provided by the Moving Picture Experts Group (MPEG), the geometric position information of the current grid is in most cases reconstructed from the geometric information of the initial grid, obtained by subdividing the base grid, together with the shift coefficients.
If the current image can use an inter-frame coding scheme, that is, a reference image exists for the current image, the encoder can use the base grid of the reference image to determine the grid information of the current image. However, existing technical solutions are imperfect: they increase the coding bit rate of the shift coefficients and thereby reduce grid compression performance.
Summary of the invention
The embodiments of the present application provide a coding and decoding method, a code stream, an encoder, a decoder and a storage medium, which can reduce the coding bit rate of the shift coefficients while ensuring the quality of grid reconstruction, thereby improving the geometric coding efficiency of the grid.
The technical solutions of the embodiments of the present application can be implemented as follows:
In a first aspect, an embodiment of the present application provides a decoding method, which is applied to a decoder, and the method includes:
determining a base grid of the current image;
subdividing the base grid to determine the geometric position information of the initial grid of the current layer in the current image;
decoding the code stream to determine first syntax identification information;
when the first syntax identification information indicates that the shift coefficient of the current layer in the current image uses the first decoding mode, determining the shift coefficient of the current layer in the reference image;
determining the geometric position information of the reconstructed grid of the current layer in the current image according to the geometric position information of the initial grid of the current layer in the current image and the shift coefficient of the current layer in the reference image.
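The last two steps of the decoding method can be sketched as follows. This is only an illustration under stated assumptions: the reconstruction is modeled as a plain per-vertex addition of a shift vector to the initial position, and the function name `reconstruct_layer` and its parameters are hypothetical. The application itself only states that the reconstructed positions are determined "according to" the two inputs.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def reconstruct_layer(
    initial_positions: Sequence[Vec3],
    skip_flag: bool,
    reference_shifts: Sequence[Vec3],
    decoded_shifts: Sequence[Vec3] = (),
) -> List[Vec3]:
    """Rebuild the vertex positions of one layer.

    skip_flag=True models the first decoding mode: no shift coefficients are
    parsed for this layer and those of the reference image are reused.
    skip_flag=False models the second decoding mode: the shift coefficients
    decoded for the current image are used.
    Assumption: reconstruction is a per-vertex addition.
    """
    shifts = reference_shifts if skip_flag else decoded_shifts
    if len(shifts) != len(initial_positions):
        raise ValueError("one shift vector is expected per vertex")
    return [
        (p[0] + d[0], p[1] + d[1], p[2] + d[2])
        for p, d in zip(initial_positions, shifts)
    ]
```

With the first decoding mode, `decoded_shifts` can be left empty because nothing is read from the code stream for that layer.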
In a second aspect, an embodiment of the present application provides an encoding method, which is applied to an encoder, and the method includes:
determining a base grid of the current image according to the original grid of the current image;
subdividing the base grid to determine the geometric position information of the initial grid of the current layer in the current image;
determining the shift coefficient of the current layer in the reference image, and determining the geometric position information of the first reconstructed grid of the current layer in the current image according to the geometric position information of the initial grid of the current layer in the current image and the shift coefficient of the current layer in the reference image;
determining, according to the geometric position information of the first reconstructed grid and the geometric position information of the original grid, whether to encode the shift coefficient of the current layer in the current image.
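The final step of the encoding method, deciding whether the shift coefficients of the current layer need to be encoded, can be illustrated with a small sketch. The criterion used here is an assumption made for illustration only: the mean squared error between the first reconstructed grid and per-vertex target positions on the original grid is compared with a threshold. The names `mean_squared_error`, `should_encode_shifts` and `threshold` are hypothetical, and the sketch assumes that a one-to-one vertex correspondence with the original grid has already been established.

```python
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def mean_squared_error(a: Sequence[Vec3], b: Sequence[Vec3]) -> float:
    """Average squared Euclidean distance between corresponding vertices."""
    if len(a) != len(b) or not a:
        raise ValueError("both grids need the same, non-zero number of vertices")
    total = sum((x - y) ** 2 for p, q in zip(a, b) for x, y in zip(p, q))
    return total / len(a)


def should_encode_shifts(
    first_reconstructed: Sequence[Vec3],
    original_targets: Sequence[Vec3],
    threshold: float,
) -> bool:
    """Return True when reusing the reference shifts is not accurate enough.

    Assumption: the decision is a simple distortion threshold; an actual
    encoder could instead use a rate-distortion cost.
    """
    return mean_squared_error(first_reconstructed, original_targets) > threshold
```

When the function returns `False`, the encoder in this sketch would skip the shift coefficients of the layer and signal that choice in the code stream.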
In a third aspect, an embodiment of the present application provides a code stream, which is generated by bit encoding according to information to be encoded; wherein the information to be encoded includes at least one of the following:
the value of the first syntax identification information, the value of the second syntax identification information, the value of the third syntax identification information, the value of the fourth syntax identification information, the reference image index of the current image, the shift coefficient of the current layer in the current image, and the mapping indication information of the current layer in the current image;
wherein the second syntax identification information is used to indicate whether the first coding mode is enabled for the shift coefficients of the current sequence, the third syntax identification information is used to indicate whether the first coding mode is enabled for the shift coefficients of the current image, the first syntax identification information is used to indicate whether the shift coefficient of the current layer in the current image uses the first coding mode, and the fourth syntax identification information is used to indicate whether the base grid of the current image uses the inter-frame processing mode; the current sequence includes the current image, and the level-of-detail (LOD) layers into which the current image is divided include the current layer.
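The relationship between the four pieces of syntax identification information can be sketched as a simple gating check. The strict nesting shown here (sequence level, then image level, then layer level, all conditional on inter-frame processing of the base grid) is an assumption for illustration; the class `SkipFlags` and its field names are hypothetical and do not correspond to actual syntax element names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SkipFlags:
    sequence_enabled: bool    # second syntax identification information
    picture_enabled: bool     # third syntax identification information
    layer_uses_skip: bool     # first syntax identification information
    base_grid_is_inter: bool  # fourth syntax identification information


def layer_skips_shift_coding(flags: SkipFlags) -> bool:
    """Return True when the layer reuses the reference image's shift coefficients.

    Assumption: each lower-level flag is only meaningful when every
    higher-level flag enables the first coding mode, and a reference image
    is available only when the base grid is inter-processed.
    """
    return (
        flags.base_grid_is_inter
        and flags.sequence_enabled
        and flags.picture_enabled
        and flags.layer_uses_skip
    )
```

Under this assumption, an intra-processed base grid always leads to explicitly coded shift coefficients, whatever the other flags say.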
In a fourth aspect, an embodiment of the present application provides an encoder, the encoder comprising a first determining unit, a first subdivision unit, a first reconstruction unit and an encoding unit, wherein:
the first determining unit is configured to determine a base grid of the current image according to the original grid of the current image;
the first subdivision unit is configured to subdivide the base grid and determine the geometric position information of the initial grid of the current layer in the current image;
the first reconstruction unit is configured to determine the shift coefficient of the current layer in the reference image, and to determine the geometric position information of the first reconstructed grid of the current layer in the current image according to the geometric position information of the initial grid of the current layer in the current image and the shift coefficient of the current layer in the reference image;
the encoding unit is configured to determine, according to the geometric position information of the first reconstructed grid and the geometric position information of the original grid, whether to encode the shift coefficient of the current layer in the current image.
In a fifth aspect, an embodiment of the present application provides an encoder, the encoder comprising a first memory and a first processor; wherein:
the first memory is used to store a computer program that can be run on the first processor;
the first processor is used to execute the method described in the second aspect when running the computer program.
In a sixth aspect, an embodiment of the present application provides a decoder, the decoder comprising a second determination unit, a second subdivision unit, a decoding unit, and a second reconstruction unit, wherein:
the second determination unit is configured to determine a base grid of the current image;
the second subdivision unit is configured to subdivide the base grid and determine the geometric position information of the initial grid of the current layer in the current image;
the decoding unit is configured to decode the code stream and determine the first syntax identification information; and, when the first syntax identification information indicates that the shift coefficient of the current layer in the current image uses the first decoding mode, to determine the shift coefficient of the current layer in the reference image;
the second reconstruction unit is configured to determine the geometric position information of the reconstructed grid of the current layer in the current image according to the geometric position information of the initial grid of the current layer in the current image and the shift coefficient of the current layer in the reference image.
In a seventh aspect, an embodiment of the present application provides a decoder, the decoder comprising a second memory and a second processor; wherein:
the second memory is used to store a computer program that can be run on the second processor;
the second processor is used to execute the method described in the first aspect when running the computer program.
In an eighth aspect, an embodiment of the present application provides a computer-readable storage medium, which stores a computer program. When the computer program is executed, it implements the method described in the first aspect or the method described in the second aspect.
The embodiments of the present application provide a coding and decoding method, a code stream, an encoder, a decoder and a storage medium. At the encoding end, the base grid of the current image is determined according to the original grid of the current image; the base grid is subdivided to determine the geometric position information of the initial grid of the current layer in the current image; the shift coefficient of the current layer in the reference image is determined, and the geometric position information of the first reconstructed grid of the current layer in the current image is determined according to the geometric position information of the initial grid of the current layer in the current image and the shift coefficient of the current layer in the reference image; and, according to the geometric position information of the first reconstructed grid and the geometric position information of the original grid, it is determined whether to encode the shift coefficient of the current layer in the current image.
At the decoding end, the base grid of the current image is determined; the base grid is subdivided to determine the geometric position information of the initial grid of the current layer in the current image; the code stream is decoded to determine the first syntax identification information; when the first syntax identification information indicates that the shift coefficient of the current layer in the current image uses the first decoding mode, the shift coefficient of the current layer in the reference image is determined; and the geometric position information of the reconstructed grid of the current layer in the current image is determined according to the geometric position information of the initial grid of the current layer in the current image and the shift coefficient of the current layer in the reference image. In this way, the geometric position information of the first reconstructed grid, obtained from the initial grid of the subdivided base grid of the current image and the shift coefficients in the reference image, makes it possible to encode the shift coefficients in the current image adaptively. If encoding of the shift coefficients in the current image is skipped, they do not need to be transmitted in the code stream, and the decoding end can determine the geometric position information of the first reconstructed grid from the geometric position information of the initial grid and the shift coefficients in the reference image. This reduces the bits spent on the shift coefficients while preserving the reconstructed geometric quality of the points in the grid, thereby improving the encoding and decoding efficiency.
FIG. 1A is a first schematic diagram of a three-dimensional grid image;
FIG. 1B is a partially enlarged schematic diagram of a three-dimensional grid image;
FIG. 2 is a schematic diagram of connection methods of a three-dimensional grid;
FIG. 3A is a second schematic diagram of a three-dimensional grid image;
FIG. 3B is a schematic diagram of a grid data storage format;
FIG. 3C is a schematic diagram of the attributes of a three-dimensional grid image;
FIG. 4 is a schematic diagram of the composition of the overall grid coding framework;
FIG. 5A is a schematic diagram of the preprocessing of a two-dimensional curve;
FIG. 5B is a schematic diagram of the generation of shift coefficients;
FIG. 6A is a first schematic diagram of the quantization processing of grid geometric position information;
FIG. 6B is a second schematic diagram of the quantization processing of grid geometric position information;
FIG. 7A is a schematic diagram of the encoding of the connection relationships of triangular facets;
FIG. 7B is a schematic diagram of the encoding of geometric position information;
FIG. 7C is a schematic diagram of the encoding of texture coordinates;
FIG. 8 is a schematic diagram of the basic principle of shift coefficients;
FIG. 9 is a schematic diagram of encoding in which shift coefficients are mapped to a two-dimensional image;
FIG. 10 is a schematic diagram of the encoding of inter-frame geometric position information;
FIG. 11A is a schematic diagram of the composition of an intra-frame coding framework;
FIG. 11B is a schematic diagram of the composition of an inter-frame coding framework;
FIG. 12A is a schematic diagram of the composition of an intra-frame decoding framework;
FIG. 12B is a schematic diagram of the composition of an inter-frame decoding framework;
FIG. 13A is a schematic diagram of the iterative subdivision of a base grid;
FIG. 13B is a schematic diagram of an LOD spatial structure;
FIG. 14 is a schematic diagram of coefficient reorganization of quantized coefficients;
FIG. 15 is a schematic diagram of a basic principle of reconstructing geometric position information;
FIG. 16 is a schematic diagram of a grid architecture of a codec provided in an embodiment of the present application;
FIG. 17 is a first schematic flow chart of a decoding method provided in an embodiment of the present application;
FIG. 18 is a second schematic flow chart of a decoding method provided in an embodiment of the present application;
FIG. 19 is a first schematic flow chart of an encoding method provided in an embodiment of the present application;
FIG. 20 is a second schematic flow chart of an encoding method provided in an embodiment of the present application;
FIG. 21 is a schematic diagram of another basic principle of reconstructing geometric position information provided in an embodiment of the present application;
FIG. 22 is a schematic diagram of the composition structure of an encoder provided in an embodiment of the present application;
FIG. 23 is a schematic diagram of a specific hardware structure of an encoder provided in an embodiment of the present application;
FIG. 24 is a schematic diagram of the composition structure of a decoder provided in an embodiment of the present application;
FIG. 25 is a schematic diagram of a specific hardware structure of a decoder provided in an embodiment of the present application;
FIG. 26 is a schematic diagram of the composition structure of a coding and decoding system provided in an embodiment of the present application.
In order to provide a more detailed understanding of the features and technical contents of the embodiments of the present application, the implementation of the embodiments is described in detail below in conjunction with the accompanying drawings. The drawings are for reference and illustration only and are not intended to limit the embodiments of the present application.
Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by those skilled in the technical field to which this application belongs. The terms used herein are only for the purpose of describing the embodiments of this application and are not intended to limit this application.
In the following description, reference is made to "some embodiments", which describe a subset of all possible embodiments. It should be understood that "some embodiments" may be the same subset or different subsets of all possible embodiments, and that they may be combined with each other where no conflict arises.
It should also be pointed out that the terms "first\second\third" in the embodiments of the present application are only used to distinguish similar objects and do not imply a specific ordering of those objects. Where permitted, "first\second\third" may be interchanged in order or sequence, so that the embodiments described here can be implemented in an order other than that illustrated or described here.
It should be noted that bitstreams of different data formats may be decoded and composited in the same video scene. These formats include at least an image format, a point cloud format, and a mesh format. In this way, real-time immersive video interaction services can be provided for multiple data formats (for example, mesh, point cloud, image, and so on) from different sources.
In the embodiments of the present application, the data-format-based method allows independent processing at the bitstream level of each data format. That is, like tiles or slices in video encoding, the different data formats in the scene can be encoded independently, so that encoding and decoding can be performed independently for each data format.
Generally speaking, three-dimensional animation content uses a keyframe-based representation, in which each frame is a static mesh. Static meshes at different times have the same topology but different geometry. However, the data volume of a keyframe-based three-dimensional dynamic mesh is extremely large, so efficient storage, transmission and rendering have become problems for the development of three-dimensional dynamic meshes. In addition, different user terminals (desktop computers, laptops, portable devices, mobile phones) require the mesh to support spatial scalability, and different network bandwidths (broadband, narrowband, wireless) require the mesh to support quality scalability. Three-dimensional dynamic mesh compression is therefore a critical problem.
A three-dimensional mesh is the surface of a three-dimensional object composed of a large number of polygons in space; each polygon consists of vertices and edges. FIG. 1A shows a 3D mesh image, and FIG. 1B shows a partially enlarged view of a 3D mesh image. As FIG. 1A and FIG. 1B show, the mesh surface is made up of closed polygons.
A two-dimensional image carries information at every pixel, and the pixels are regularly distributed, so their positions need not be recorded separately. The vertices of a mesh, however, are distributed randomly and irregularly in three-dimensional space, and the way the polygons are formed must be specified additionally. The position of every vertex in space and the connectivity of every polygon must therefore be recorded in order to represent a mesh image completely. As shown in FIG. 2, the same number of vertices at the same positions yields completely different surfaces under different connectivity.
In addition to the above information, because a 3D mesh image is usually coded with an existing 2D image/video coding method, the 3D mesh must be converted from three-dimensional space to a two-dimensional image; the UV coordinates define this conversion.
As with 2D images, each position may have corresponding attribute information obtained during acquisition, usually RGB color values, which reflect the color of the object. For a 3D mesh, besides color, a common attribute of each vertex is the reflectance value, which reflects the surface material of the object. The attribute information of a 3D mesh is stored in a 2D image, and the mapping from 2D to 3D is specified by the UV coordinates.
3D mesh data therefore usually includes 3D geometric position information (x, y, z), geometric connectivity, UV coordinates, and an attribute map. FIG. 3A is a 3D mesh image; FIG. 3B shows the mesh data storage format, which includes the 3D geometric position information, the UV coordinates, and the connectivity information; FIG. 3C is the corresponding attribute map.
Current 3D dynamic mesh compression methods include: spatio-temporal prediction methods, which improve compression efficiency by removing spatial and temporal correlation; techniques based on principal component analysis (Principal Components Analysis, PCA), which project onto the eigenvector space to concentrate energy; and wavelet-based methods, which support spatial scalability and quality scalability.
It should be noted that, for the dynamic mesh coding provided by the Moving Picture Experts Group (MPEG), FIG. 4 is a schematic diagram of the overall mesh coding framework, FIG. 5A is a schematic diagram of the preprocessing of a two-dimensional curve, and FIG. 5B is a schematic diagram of the generation of shift coefficients. The preprocessing of a three-dimensional mesh is analogous. The encoding end consists mainly of two parts: preprocessing (Pre-processing) and the encoder (Encoder). Preprocessing first generates the base mesh and the shift coefficients, as follows. First, the original mesh (Original mesh) is downsampled to generate a simplified mesh (Decimated mesh), also called the base mesh (Base mesh), with far fewer vertices. Next, the base mesh is subdivided: newly generated vertices are inserted algorithmically on the edges of the base mesh, yielding the subdivided mesh (Subdivided mesh). Finally, for each vertex of the subdivided mesh, the nearest vertex in the original mesh is found; the vector between the subdivided-mesh vertex and that nearest original-mesh vertex is the shift coefficient. Once the subdivision algorithm and the number of subdivision iterations are fixed, the subdivided mesh can be generated automatically at both the encoding and decoding ends. After preprocessing, the original mesh therefore only needs to be represented by a simple base mesh and a series of shift coefficients, which greatly reduces the data volume without affecting reconstruction at the decoding end.
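As a non-limiting illustration, the three preprocessing steps can be sketched on a two-dimensional curve, in the spirit of FIG. 5A and FIG. 5B. The function names, the keep-every-n-th-vertex decimation, and the sample curve are assumptions made for this sketch only; they are not the decimation or subdivision algorithms of any particular V-DMC implementation.

```python
import math

def decimate(curve, step):
    """Keep every `step`-th vertex (and always the last one) as the base curve."""
    base = curve[::step]
    if base[-1] != curve[-1]:
        base.append(curve[-1])
    return base

def subdivide(curve, iterations):
    """Insert the midpoint of every segment, repeated `iterations` times."""
    for _ in range(iterations):
        refined = []
        for (x0, y0), (x1, y1) in zip(curve, curve[1:]):
            refined.append((x0, y0))
            refined.append(((x0 + x1) / 2, (y0 + y1) / 2))
        refined.append(curve[-1])
        curve = refined
    return curve

def displacements(subdivided, original):
    """Vector from each subdivided vertex to its nearest original vertex."""
    result = []
    for sx, sy in subdivided:
        nx, ny = min(original, key=lambda p: math.dist(p, (sx, sy)))
        result.append((nx - sx, ny - sy))
    return result

# Hypothetical input: 9 samples of half a sine wave.
original = [(i / 8, math.sin(math.pi * i / 8)) for i in range(9)]
base = decimate(original, 4)             # 3 vertices remain
subdivided = subdivide(base, 2)          # 9 vertices after two iterations
disp = displacements(subdivided, original)
# The decoder repeats the subdivision and adds the displacements back.
reconstructed = [(x + dx, y + dy) for (x, y), (dx, dy) in zip(subdivided, disp)]
```

Only `base` and `disp` would need to be transmitted in this sketch, since `subdivided` can be regenerated from `base` once the iteration count is known.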
Video-based dynamic mesh coding (Video Dynamic Mesh Coding, V-DMC) mainly comprises two parts: geometric position information coding and attribute information coding. As shown in FIG. 5, each frame of the sequence basketball_player consists of two files: basketball_player_fr0001_qp12_qt12.obj and basketball_player_fr0002.png. The file basketball_player_fr0001_qp12_qt12.obj contains four kinds of information: the geometric position information (x, y, z), the triangle connectivity of the geometric positions, the texture coordinates (u, v), and the connectivity of the texture coordinates. The file basketball_player_fr0002.png holds the texture attribute information of the current image. In the current V-DMC encoder, the geometric position information is coded jointly by a dynamic mesh encoder (Dynamic Range Arithmetic Coding, DRACO) and a video codec (Video Codec), while the texture information is coded directly with the Video Codec. The Video Codec may be H.264/Advanced Video Coding (AVC), H.265/High Efficiency Video Coding (HEVC), H.266/Versatile Video Coding (VVC/VV-enC), or the like. The coding of the mesh geometry information is described in detail below.
The geometry information coding can be divided into coding of position information (geometric position information and texture position information) and coding of connectivity (the triangle connectivity of the geometric positions and the connectivity of the texture positions). Current V-DMC coding has two main test conditions: intra-frame coding and inter-frame coding (low delay; there is currently no RA test environment).
(I) Intra-frame geometry information coding (Intra coding).
1. Mesh preprocessing.
a) As shown in FIG. 5A and FIG. 5B, take two-dimensional connectivity as an example. The connectivity of the original mesh involves a large number of points. Before the mesh geometry information is encoded, it is first quantized or simplified, yielding the corresponding simplified mesh (Decimated mesh), which serves as the base mesh.
b) As shown in FIG. 6A and FIG. 6B, the mesh is quantized based on the triangle-face coordinates. Depending on the connectivity between the points being quantized, the quantization falls into the following two cases:
If the two vertices share an edge, that is, they are the two endpoints of one edge before quantization, then after quantization all triangle faces connected to the two vertices must be joined together; as shown in FIG. 6A, this causes some of the previous triangle faces to disappear;
Otherwise, if the two vertices do not share an edge, only the boundaries of the two vertices need to be merged after quantization; as shown in FIG. 6B, the number of triangle faces is not affected.
c) Throughout the triangle-coordinate-based quantization of the mesh, the core problem is how to obtain the best vertex from the previous vertex coordinates. Current V-DMC selects the best quantized point from four candidates. Let the vertices before quantization be V1 and V2, and let the vertex after quantization be V'. The candidates are V1, V2, (V1+V2)/2, and Q⁻¹(V1+V2), where Q is the quantization matrix corresponding to the vertex coordinates of V1 and V2. The best quantized point is finally selected based on the distortion measure D before and after quantization.
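The candidate selection in c) can be sketched as follows. This is only an illustration under stated assumptions: the distortion measure D is taken here to be the sum of squared distances to a set of hypothetical reference points, and the fourth candidate Q⁻¹(V1+V2) is omitted because the text does not specify how the matrix Q is built.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def best_merged_vertex(v1, v2, reference_points):
    """Pick the merge candidate with the smallest (assumed) distortion D."""
    midpoint = tuple((a + b) / 2 for a, b in zip(v1, v2))
    candidates = {"V1": v1, "V2": v2, "midpoint": midpoint}

    def distortion(candidate):
        # Hypothetical D: total squared distance to the reference points.
        return sum(squared_distance(candidate, r) for r in reference_points)

    name = min(candidates, key=lambda k: distortion(candidates[k]))
    return name, candidates[name]

choice = best_merged_vertex((0.0, 0.0, 0.0), (2.0, 0.0, 0.0),
                            [(1.0, 0.0, 0.0), (1.0, 1.0, 0.0)])
```

With these sample points the midpoint has the lowest distortion, so `choice` names the midpoint candidate.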
2. Base mesh coding.
a) After the Base mesh is obtained, the DRACO encoder is used to encode its geometry information, which mainly includes the connectivity and the connectivity of the geometric position information. The overall DRACO coding flow is as follows: first, the connectivity is coded; next, the geometric position information of the points is coded based on the connectivity of the geometric positions; finally, the texture position information is coded based on the connectivity and the geometric position information.
b) Coding of connectivity. DRACO codes the mesh connectivity with the "Edgebreaker Coding" scheme; see FIG. 7A, in which v denotes the current vertex. Before the mesh connectivity is coded, the mesh vertices are classified into five types, C, L, R, S, and E, with the following meanings:
C: none of the triangle faces connected to the current vertex has been coded;
L: the triangle face to the left of the current vertex has been coded;
R: the triangle face to the right of the current vertex has been coded;
S: neither the left nor the right triangle face connected to the current vertex has been coded;
E: both the left and the right triangle faces connected to the current vertex have been coded.
Finally, the type of each vertex and the vertex processing order are coded in a defined order; the decoding end restores the geometric connectivity of the mesh from the processing order and the vertex types.
c) Coding of geometric position information. After the vertex connectivity has been coded, the geometric position of each vertex is predictively coded based on that connectivity. The prediction follows the "Parallelograms algorithm", as shown in FIG. 7B: a simple linear fit is made from the three vertices adjacent to the current point to be coded, namely the left vertex, the right vertex, and the opposite vertex:
$$\mathrm{pred}_{pos} = (\mathrm{left} + \mathrm{right}) - \mathrm{opposite} \quad (1)$$
d) After the point connectivity and the geometric position information have been coded, the texture coordinates are predictively coded on the basis of their decoded reconstructions, as shown in FIG. 7C. Likewise, let the current vertex be C; its left and right vertices are obtained from the connectivity, and the texture coordinates of the left and right vertices are then used to predictively code the texture coordinates of C.
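The parallelogram prediction of equation (1) can be illustrated with a few lines of Python. The helper names and the sample coordinates are assumptions for this sketch; only the prediction rule itself comes from the text above.

```python
def parallelogram_predict(left, right, opposite):
    """Equation (1): complete the parallelogram spanned by three known vertices."""
    return tuple(l + r - o for l, r, o in zip(left, right, opposite))

def residual(actual, predicted):
    """Prediction residual that would be entropy coded instead of the position."""
    return tuple(a - p for a, p in zip(actual, predicted))

left, right, opposite = (0, 1, 0), (1, 0, 0), (0, 0, 0)
pred = parallelogram_predict(left, right, opposite)
res = residual((1, 1, 1), pred)

# The decoder forms the same prediction and adds the decoded residual.
decoded = tuple(p + r for p, r in zip(pred, res))
```

When neighbouring triangles are nearly coplanar, the residual is small, which is what makes the prediction worthwhile.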
3. Shift (Displacement) coefficient coding.
a) First, after the Base mesh has been coded and reconstructed, it is subdivided with a given subdivision algorithm to obtain the initial reconstructed mesh. As illustrated by the subdivided-mesh curve in FIG. 5A, simple linear interpolation yields the subdivided mesh (also called the "initial mesh"). The coordinates of each newly inserted point are obtained by linear interpolation between the two vertices of the current edge:
b) Next, the per-point error Delta between the subdivided mesh and the original mesh is computed; Delta may be a point error in the world coordinate system. The Displacement (i.e., the shift coefficient) of each point is then computed from its error Delta and its normal vector Norm; see FIG. 8, in which the bold solid line denotes the error Delta, and N and T denote the normal vector Norm. The calculation is:
$$\mathrm{Displacement} = \mathrm{Delta} \times \mathrm{Norm} \quad (3)$$
c) After the Displacement of each point has been computed, a lifting transform (Lifting Transform) can be applied to transform the spatial-domain residual coefficients into the frequency domain, giving the corresponding frequency-domain residual coefficients.
d) A coefficient packing (Packing) algorithm then maps the frequency-domain residual coefficients of the points into a two-dimensional image in a defined order; current V-DMC can arrange them in Morton code order (Morton Code Order), see FIG. 9.
e) Finally, a conventional Video Codec is used to encode the two-dimensional image.
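Steps a) and b) above can be sketched as follows, under two assumptions that the text does not confirm. First, the interpolated point is taken to be the edge midpoint (v1 + v2) / 2, consistent with the midpoint insertion described in the step summary of this section. Second, the product in equation (3) is read as a projection of Delta onto a local frame built from the normal N and a tangent T; the bitangent axis and all function names are hypothetical.

```python
def subdivide_once(vertices, triangles):
    """Split every triangle into four by inserting edge midpoints (assumed rule)."""
    vertices = list(vertices)
    midpoint_index = {}  # undirected edge -> index of its inserted vertex

    def midpoint(i, j):
        key = (min(i, j), max(i, j))
        if key not in midpoint_index:
            vertices.append(tuple((p + q) / 2 for p, q in zip(vertices[i], vertices[j])))
            midpoint_index[key] = len(vertices) - 1
        return midpoint_index[key]

    refined = []
    for a, b, c in triangles:
        ab, bc, ca = midpoint(a, b), midpoint(b, c), midpoint(c, a)
        refined += [(a, ab, ca), (ab, b, bc), (ca, bc, c), (ab, bc, ca)]
    return vertices, refined

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def to_local(delta, normal, tangent):
    """Express a world-space error Delta in the (normal, tangent, bitangent) frame."""
    bitangent = cross(normal, tangent)
    return (dot(delta, normal), dot(delta, tangent), dot(delta, bitangent))

def to_world(displacement, normal, tangent):
    """Inverse of to_local for an orthonormal frame."""
    bitangent = cross(normal, tangent)
    return tuple(displacement[0] * n + displacement[1] * t + displacement[2] * b
                 for n, t, b in zip(normal, tangent, bitangent))

verts, tris = subdivide_once([(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 2.0, 0.0)],
                             [(0, 1, 2)])
local = to_local((0.5, -0.25, 2.0), (0.0, 0.0, 1.0), (1.0, 0.0, 0.0))
world = to_world(local, (0.0, 0.0, 1.0), (1.0, 0.0, 0.0))
```

Sharing the midpoint between the two triangles on either side of an edge, as the dictionary does here, keeps the subdivided mesh watertight.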
4. Recoloring.
Recoloring is an encoder-side algorithm. After the geometry information has been reconstructed at the encoding end, the original geometry information, the original texture attribute information, and the reconstructed mesh geometry information are used to recolor the texture attribute information of the reconstructed mesh.
(II) Inter-frame geometry information coding (Inter coding).
a) As in the coding above, the geometry information covers the geometric connectivity and the geometric positions. Note, however, that inter-frame coding only needs to code the geometric position information (x, y, z) of the current Base mesh; the connectivity and the texture position information (u, v) need not be coded. The reason is as follows: if the current image can be inter coded, the encoding end uses the Base mesh of the current image's reference image to obtain the mesh information of the current image, so the current image and the reference image have the same connectivity and the same UV texture coordinates and differ only in geometric position information.
b) It follows from a) that only the geometric position information differs between the current image and the reference image, so current V-DMC predictively codes the geometric position information of the current image.
Specifically, as shown in FIG. 10, the black point is the point to be coded. The current point is used to locate the corresponding predicted point in the reference image (similar to the co-located block in video coding), and the motion vector (Motion Vector, MV) of the current point is then predictively coded using its neighboring points (the MVs of vertices that have already been coded). Let the coordinates of the current point be Pos and the coordinates of the corresponding co-located point be Pred_pos; the MV of the current point is then:
$$MV = \mathrm{Pos} - \mathrm{Pred}_{pos} \quad (4)$$
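Equation (4) and its inverse at the decoding end can be sketched as follows. The function names and sample vertices are assumptions; the sketch relies only on the statement above that the current and reference base meshes share connectivity, so vertices correspond by index.

```python
def motion_vectors(current, colocated):
    """Equation (4) per vertex: MV = Pos - Pred_pos."""
    return [tuple(c - p for c, p in zip(cur, ref))
            for cur, ref in zip(current, colocated)]

def apply_motion(colocated, mvs):
    """Decoder side: rebuild the current base mesh positions from the reference."""
    return [tuple(p + m for p, m in zip(ref, mv))
            for ref, mv in zip(colocated, mvs)]

reference_base = [(0, 0, 0), (4, 0, 0), (0, 4, 0)]
current_base = [(1, 0, 0), (5, 1, 0), (0, 4, 2)]
mvs = motion_vectors(current_base, reference_base)
rebuilt = apply_motion(reference_base, mvs)
```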
Current V-DMC has two predictive coding modes:
i. the MV of the current point is coded directly;
ii. the MV of the current point is predictively coded from its neighborhood.
At the encoding end, a rate-distortion optimization algorithm selects the best coding mode for each coding group (Coding Group, CG); current V-DMC limits each CG to at most 16 points.
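The per-CG mode decision can be sketched as below. Several simplifications are assumed and should not be read as the V-DMC algorithm: both modes are lossless here, so only a rate term is compared; the rate is approximated by the sum of absolute component values; and mode ii predicts each MV from the previously coded MV in the group rather than from mesh neighbours.

```python
CG_SIZE = 16  # maximum number of points per coding group, as stated above

def bit_cost(values):
    """Crude rate proxy (hypothetical): sum of absolute component values."""
    return sum(abs(c) for v in values for c in v)

def predicted_residuals(mvs):
    """Stand-in for mode ii: predict each MV from the previous MV in the group."""
    residuals, previous = [], (0, 0, 0)
    for mv in mvs:
        residuals.append(tuple(a - b for a, b in zip(mv, previous)))
        previous = mv
    return residuals

def choose_modes(mvs):
    """Return 'direct' (mode i) or 'predicted' (mode ii) for each coding group."""
    modes = []
    for start in range(0, len(mvs), CG_SIZE):
        group = mvs[start:start + CG_SIZE]
        direct = bit_cost(group)
        predicted = bit_cost(predicted_residuals(group))
        modes.append("direct" if direct <= predicted else "predicted")
    return modes

# A smooth group favours prediction; an oscillating group favours direct coding.
smooth = [(5, 5, 5)] * 16
noisy = [(1, -1, 1), (-1, 1, -1)] * 8
modes = choose_modes(smooth + noisy)
```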
Coding of texture attribute information: current V-DMC codes the texture attribute information directly with a video codec (Video-Codec), for example AVC, HEVC, VVC, or VV-enC.
FIG. 11A is a schematic diagram of the framework of an intra-frame encoder. As shown in FIG. 11A, the intra-frame encoder may use a common static mesh encoder (Static Mesh Encoder) to encode the simplified mesh and generate the corresponding bitstream (Compressed base mesh bitstream). Next, the shift coefficients are updated (Update Displacements) using the reconstructed simplified mesh. The updated shift coefficients undergo a wavelet transform (Wavelet Transform) and quantization (Quantization), are packed into images and video (Image Packing, Video Packing), and are then encoded with HEVC to generate the shift-coefficient bitstream (Compressed displacements bitstream). For attribute map (Attribute Map) coding, the attribute map is first transformed (Texture Transfer) according to the difference between the reconstructed geometry information and the original geometry information, then padded (Padding) and packed (Video Packing), and finally encoded by a video encoder to form the attribute bitstream (Compressed attribute bitstream).
FIG. 11B is a schematic diagram of the framework of an inter-frame encoder. As shown in FIG. 11B, the inter-frame encoder follows largely the same flow as the intra-frame encoder, but instead of encoding the simplified mesh directly, it encodes the motion vectors (MV) between the simplified mesh of the current image and the simplified mesh of the reference image and generates the corresponding motion-vector bitstream (Compressed motion bitstream).
Correspondingly, in decoding, decoders can likewise be divided by the type of frame they operate on into intra-frame decoders and inter-frame decoders, which perform intra-frame decoding and inter-frame decoding respectively.
FIG. 12A is a schematic diagram of intra-frame decoding. As shown in FIG. 12A, the intra-frame decoder may use a static mesh decoder (Static Mesh Decoder) to decode the simplified mesh. A video decoder (Video Decoder) decodes the shift-coefficient video, and the shift coefficients are recovered through video unpacking (Video Unpacking) and the inverse wavelet transform (Inverse Wavelet Transform). The decoded mesh geometry information is obtained from the decoded simplified mesh and the decoded shift coefficients. The attribute map is decoded directly by the video decoder.
FIG. 12B is a schematic diagram of inter-frame decoding. As shown in FIG. 12B, the inter-frame decoder follows essentially the same flow as the intra-frame decoder, except that it does not decode the simplified mesh directly; instead, it decodes the motion vectors and computes the simplified mesh of the current image from the simplified mesh of the previous frame (for example, the reference image).
In summary, the dynamic mesh coding (Dynamic Mesh Coding) currently provided by MPEG proceeds as follows. At the encoding end, the base mesh generated by preprocessing is quantized and encoded with Google's open-source DRACO encoder; the shift coefficients are wavelet transformed, quantized, and mapped to two dimensions, then encoded with HEVC; and the two-dimensional attribute map is fed directly to the HEVC encoder. At the decoding end, the base-mesh bitstream is decoded by DRACO to produce the decoded base mesh; the shift coefficients are HEVC decoded, inverse mapped from two dimensions, dequantized, and inverse transformed to produce the decoded shift coefficients; the decoded base mesh and the decoded shift coefficients together yield the reconstructed 3D mesh geometry; and the attribute bitstream is HEVC decoded to produce the reconstructed attribute map.
(III) Common test conditions for MPEG DMC.
a. There are two test conditions:
Condition 1: all intra, lossy geometry, lossy attributes;
Condition 2: random access, lossy geometry, lossy attributes;
b. The common test sequences may include Cat1-A, Cat1-B, and Cat1-C, five categories in total, all of which contain geometry information and color attribute information.
The Displacement coefficient coding of V-DMC is described in detail below.
In one specific implementation, at the encoding end the Base mesh is first subdivided iteratively with a given algorithm to obtain the corresponding mesh position information. The subdivision algorithm is the same as described above: linear interpolation between the vertices of each edge gives the corresponding geometric position information. Assuming the subdivision is iterated N times, the Displacement coefficients are partitioned into levels of detail (LOD) according to the iteration that produced them; see FIG. 13A.
As shown in FIG. 13A, the Base mesh yields the corresponding mesh geometric position information through the linear interpolation algorithm; the Displacement coefficient of each point is obtained by computing the error between this initial geometric position information and the original mesh. The LOD partition may have four levels: level 0, level 1, level 2, and level 3. The LOD spatial structure is shown in FIG. 13B.
Next, a lifting wavelet transform (Lifting Transform) is applied on the LOD spatial structure. It may comprise two steps, prediction and update. The prediction algorithm is as follows:
Here, v denotes the vertex to be predicted, and v1 and v2 denote the two end vertices of the edge on which v lies.
The update step is as follows:
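The prediction and update formulas themselves are not reproduced in this text. As a hedged sketch of what a predict/update lifting step looks like, the code below assumes a prediction of the form Signal(v) − ½·(Signal(v1) + Signal(v2)) and an update that adds ⅛ of the neighbouring detail coefficients. Both weights, the one-dimensional layout (even indices are coarse vertices, odd indices are inserted midpoints), and the function names are assumptions, not values taken from this document.

```python
PREDICT_WEIGHT = 0.5    # assumed weight on the two edge endpoints v1, v2
UPDATE_WEIGHT = 0.125   # assumed weight on neighbouring detail coefficients

def forward_lifting(signal):
    """One lifting level on an odd-length 1D signal: predict, then update."""
    c = list(signal)
    for i in range(1, len(c), 2):   # predict inserted vertices from v1 and v2
        c[i] -= PREDICT_WEIGHT * (c[i - 1] + c[i + 1])
    for i in range(0, len(c), 2):   # update coarse vertices with the details
        details = [c[j] for j in (i - 1, i + 1) if 0 <= j < len(c)]
        c[i] += UPDATE_WEIGHT * sum(details)
    return c

def inverse_lifting(coefficients):
    """Undo the update first, then undo the prediction."""
    s = list(coefficients)
    for i in range(0, len(s), 2):
        details = [s[j] for j in (i - 1, i + 1) if 0 <= j < len(s)]
        s[i] -= UPDATE_WEIGHT * sum(details)
    for i in range(1, len(s), 2):
        s[i] += PREDICT_WEIGHT * (s[i - 1] + s[i + 1])
    return s

linear = forward_lifting([0.0, 1.0, 2.0, 3.0, 4.0])
rough = forward_lifting([1.0, 4.0, 2.0, 8.0, 3.0])
restored = inverse_lifting(rough)
```

A signal that is already linear along each edge produces zero detail coefficients, which is why the transform concentrates energy in the coarse levels. The inverse applies the same two steps in reverse order with opposite signs, so the transform is exactly invertible before quantization.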
Finally, the transformed coefficients can be quantized, and the quantized coefficients packed. As shown in FIG. 14, packing is block (Block) based; here the coefficients are packed into Block 0, Block 1, Block 2, and Block 3. That is, current V-DMC can pack coefficients per Block, each Block being 16×16 in size, with the coefficients inside each Block arranged in Morton code order to form the corresponding two-dimensional image. After these operations, the two-dimensional image can be encoded with a Video-Codec.
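Block-based packing in Morton order can be sketched as follows. The 16×16 block size comes from the text; the assignment of even bits to x and odd bits to y, the raster ordering of blocks, and the zero padding of unused pixels are assumptions of this sketch.

```python
BLOCK = 16  # block width and height in pixels, as stated above

def morton_decode(index):
    """De-interleave the bits of `index`: even bits give x, odd bits give y."""
    x = y = bit = 0
    while index:
        x |= (index & 1) << bit
        y |= ((index >> 1) & 1) << bit
        index >>= 2
        bit += 1
    return x, y

def pack(coefficients, blocks_per_row):
    """Place coefficients into a 2D image, block by block, Morton order inside."""
    per_block = BLOCK * BLOCK
    block_count = -(-len(coefficients) // per_block)        # ceiling division
    height = -(-block_count // blocks_per_row) * BLOCK
    image = [[0] * (blocks_per_row * BLOCK) for _ in range(height)]
    for i, value in enumerate(coefficients):
        block, offset = divmod(i, per_block)
        bx, by = block % blocks_per_row, block // blocks_per_row
        x, y = morton_decode(offset)
        image[by * BLOCK + y][bx * BLOCK + x] = value
    return image

image = pack(list(range(1, 301)), blocks_per_row=2)
```

Here 300 coefficients fill one complete block and part of a second, so `image` is 16 rows by 32 columns. The decoder reverses the mapping by visiting the pixels in the same order.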
In summary, the dynamic mesh encoding process can be divided into the following steps:
1. Preprocess the original mesh by reducing the number of vertices in the mesh and simplifying the connectivity.
2. Subdivide the simplified mesh from step 1: for any two connected vertices from step 1, insert a new point at the midpoint of the segment joining them, and repeat this twice.
3. For each vertex from step 2, find the nearest point in the original mesh and compute the shift coefficient (Displacement coefficient) between the two points.
4. Quantize the simplified mesh from step 1 and encode it with an encoder such as Draco.
5. Adjust the shift coefficients from step 3 according to the reconstructed simplified mesh obtained in step 4.
6. Apply a wavelet transform to the shift coefficients from step 5 and quantize the transformed shift coefficients to obtain the quantized transform coefficients.
7. Map the quantized transform coefficients from three-dimensional space to a two-dimensional image (also called "image packing") to generate the shift-coefficient two-dimensional image.
8. Encode the shift-coefficient two-dimensional image from step 7 with a standard video encoder such as H.265.
In another specific implementation, at the decoding end the Video-Codec first decodes and reconstructs the corresponding two-dimensional image. Next, the lifting transform coefficient of each point is recovered by reversing the coefficient packing. The inverse lifting wavelet transform then recovers the Displacement coefficient of each point, after which the geometric position information of the current mesh is reconstructed from the geometric position information of the Base mesh and the Displacement coefficients. Specifically, as shown in FIG. 15, the level 0 geometric position information of the base mesh (Base mesh) together with the level 0 Displacement coefficients reconstructs the level 0 geometric position information of the reconstructed mesh (Reconstruct mesh); the level 1 geometric position information of the Base mesh together with the level 1 Displacement coefficients reconstructs the level 1 geometric position information of the Reconstruct mesh; and so on, until the level 3 geometric position information of the Base mesh together with the level 3 Displacement coefficients reconstructs the level 3 geometric position information of the Reconstruct mesh.
In summary, the dynamic grid decoding process can be divided into the following steps:
1. The basic grid bit stream is decoded by a decoder such as Draco to generate the decoded basic grid.
2. The shift coefficient bit stream is decoded by a standard video codec such as H.265 to obtain a two-dimensional image of the shift coefficients.
3. The two-dimensional image of the shift coefficients is mapped from the two-dimensional image to three-dimensional space (also called "image unpacking") to obtain the quantized transform coefficients.
4. The quantized transform coefficients are dequantized and inverse wavelet transformed to obtain the decoded shift coefficients.
5. The decoded basic grid and the decoded shift coefficients are combined to generate the reconstructed three-dimensional grid geometry information.
6. The attribute bit stream is decoded by HEVC to generate the reconstructed attribute map.
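Step 5 above, applied layer by layer as in FIG. 15, can be illustrated with a minimal Python sketch. It assumes that each decoded shift coefficient is a plain Cartesian offset added to the corresponding vertex of the subdivided basic grid; the function names `reconstruct_level` and `reconstruct_mesh` are hypothetical and are not taken from the present application.

```python
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


def reconstruct_level(initial: List[Vec3], disp: List[Vec3]) -> List[Vec3]:
    """Add one shift coefficient to each vertex of one LOD layer."""
    if len(initial) != len(disp):
        raise ValueError("one shift coefficient is expected per vertex")
    return [(p[0] + d[0], p[1] + d[1], p[2] + d[2])
            for p, d in zip(initial, disp)]


def reconstruct_mesh(initial_levels: List[List[Vec3]],
                     disp_levels: List[List[Vec3]]) -> List[List[Vec3]]:
    """Reconstruct level 0, level 1, ... independently, as in FIG. 15."""
    if len(initial_levels) != len(disp_levels):
        raise ValueError("one coefficient list is expected per LOD layer")
    return [reconstruct_level(i, d)
            for i, d in zip(initial_levels, disp_levels)]
```

The sketch covers only the final combination step; image unpacking, dequantization, and the inverse wavelet transform of steps 3 and 4 are assumed to have already produced `disp_levels`.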
In the related art, existing V-DMC coding always reconstructs the geometric position information of the current mesh from the initial reconstructed geometric position information of the mesh, obtained by subdividing the Base Mesh, and the Displacement coefficients. For inter-frame coding of a mesh, however, if the current image can be inter-frame coded, that is, if a reference image exists for the current image, then the reconstructed Displacement coefficients of the reference image are available when the geometric information of the mesh of the current image is encoded or reconstructed. An inter-frame coding scheme can be introduced on this basis: the geometric information of the initial mesh obtained by subdividing the Base Mesh of the current image and the reconstructed Displacement coefficients of the reference image are used to obtain the geometric position information of the corresponding predicted mesh, and this predicted geometric position information is then used together with the original geometric position information of the mesh for predictive coding or skip coding.
On this basis, the embodiments of the present application provide a coding and decoding method. For inter-frame predictive coding of a mesh, the geometric information of the initial mesh obtained by subdividing the Base Mesh of the current image and the reconstructed Displacement coefficients of the reference image are used to obtain the geometric position information of the corresponding predicted mesh. The encoding end then fits parameters between the new predicted geometric position information and the original position information of the mesh, and finally uses the fitted parameter relationship to encode the geometric position information of the current mesh. For example, the Displacement coefficients of some LOD layers of the current image can be skip coded, or the correlation between the Displacement coefficients of different points can be used to reduce the number of points whose Displacement coefficients are coded. This reduces the bit stream size of the Displacement coefficients while maintaining the quality of the reconstructed mesh, which further improves the geometric coding efficiency of the mesh. The Displacement coefficients of the entire current image can even be skip coded, which saves bit rate and improves coding and decoding performance.
The embodiments of the present application also provide a network architecture of a codec system that includes the decoding method and the encoding method. FIG. 16 is a schematic diagram of a codec network architecture provided by an embodiment of the present application. As shown in FIG. 16, the network architecture includes one or more electronic devices 13 to 1N and a communication network 01, and the electronic devices 13 to 1N can exchange video through the communication network 01. In implementation, the electronic device may be any type of device with encoding and decoding functions. For example, the electronic device may include a mobile phone, a tablet computer, a personal computer, a personal digital assistant, a navigator, a digital phone, a video phone, a television, a sensor device, a server, and the like, which is not specifically limited in the embodiments of the present application. Here, the decoder or encoder described in the embodiments of the present application may be such an electronic device.
The embodiments of the present application are described in detail below with reference to the accompanying drawings.
In an embodiment of the present application, FIG. 17 shows a schematic flow chart of a decoding method provided by an embodiment of the present application. As shown in FIG. 17, the method may include:
S1701: Determine the basic grid of the current image.
It should be noted that the decoding method of the embodiments of the present application may be an inter-frame decoding method, and more specifically, an inter-frame decoding method for the shift coefficients (Displacement coefficients) in a dynamic grid. The decoding method can be applied to a decoder in V-DMC, but is not limited thereto.
It should also be noted that, in the embodiments of the present application, the basic grid may also be called a "simplified grid". In some embodiments, determining the basic grid of the current image may include: decoding the bit stream to determine the basic grid of the current image.
For example, the bit stream here may be the basic grid bit stream. Decoding the basic grid bit stream with a dynamic grid decoder (for example, Draco) then yields the basic grid of the current image.
In some embodiments, determining the basic grid of the current image may include: determining the basic grid of the current image based on the basic grid of the reference image.
It should also be noted that, in the embodiments of the present application, the reference image is a decoded image that precedes the current image. For example, the reference image may be the frame immediately preceding the current image, but this is not specifically limited.
In this way, if the current image uses inter-frame decoding and its basic grid is not written into the bit stream (that is, decoding of the basic grid of the current image is skipped), the basic grid of the reference image can be used as the basic grid of the current image.
S1702: Subdivide the basic grid to determine the geometric position information of the initial grid of the current layer in the current image.
It should be noted that, in the embodiments of the present application, at least one layer can be determined by dividing the current image into levels of detail (LOD), and the at least one layer includes the current layer.
For example, FIG. 13A shows the geometric position information of the initial grid obtained by subdividing the basic grid over three iterations. The basic grid is regarded as the 0th iteration and corresponds to layer 0 (level 0); the vertices added in the first iteration form layer 1 (level 1); the vertices added in the second iteration form layer 2 (level 2); and the vertices added in the third iteration form layer 3 (level 3). The resulting LOD structure is shown in FIG. 13B: the top layer is the basic grid, and the number of vertices added increases with each iteration, forming a pyramid structure. In this way, subdividing the basic grid determines the geometric position information of the initial grid of the current layer in the current image.
In some embodiments, subdividing the basic grid to determine the geometric position information of the initial grid of the current layer in the current image may include: iteratively subdividing the basic grid according to a grid subdivision mode to determine the geometric position information of the initial grid of the current layer in the current image.
Specifically, in the embodiments of the present application, the grid subdivision mode can be understood as upsampling the vertices on each edge of the basic grid, or equivalently as interpolating between the vertices on each edge of the basic grid. For example, the grid subdivision mode includes a subdivision algorithm and a number of subdivision iterations. In some embodiments, the subdivision algorithm may be an interpolation algorithm, either linear or nonlinear, which is not specifically limited here.
Here, after the Base mesh has been decoded and reconstructed, a subdivision algorithm is applied to it to obtain the initial grid. For example, the initial grid (also called the "subdivided grid") can be obtained from the basic grid by a linear interpolation algorithm: in each iteration, the geometric position coordinates $pos_{new}$ of each newly added vertex are obtained by linear interpolation of the geometric position coordinates $pos_1$ and $pos_2$ of the two end vertices of the current edge taking part in that iteration.
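One iteration of such a subdivision can be illustrated with the following Python sketch. It assumes that the interpolation weight is 1/2, so that each new vertex is the midpoint of pos1 and pos2; the text above states only that the interpolation is linear. The function name `subdivide_once` and the indexed-triangle representation are hypothetical choices made for illustration.

```python
from typing import Dict, List, Tuple

Vec3 = Tuple[float, float, float]
Tri = Tuple[int, int, int]


def subdivide_once(vertices: List[Vec3],
                   triangles: List[Tri]) -> Tuple[List[Vec3], List[Tri]]:
    """Insert one vertex per edge and split each triangle into four."""
    new_vertices = list(vertices)            # existing vertices keep their indices
    midpoint_of: Dict[Tuple[int, int], int] = {}

    def midpoint(a: int, b: int) -> int:
        key = (min(a, b), max(a, b))         # an edge shared by two triangles is split once
        if key not in midpoint_of:
            pos1, pos2 = vertices[a], vertices[b]
            # Assumed weight 1/2: pos_new = (pos1 + pos2) / 2
            pos_new = tuple((c1 + c2) / 2.0 for c1, c2 in zip(pos1, pos2))
            new_vertices.append(pos_new)
            midpoint_of[key] = len(new_vertices) - 1
        return midpoint_of[key]

    new_triangles: List[Tri] = []
    for a, b, c in triangles:
        ab, bc, ca = midpoint(a, b), midpoint(b, c), midpoint(c, a)
        new_triangles += [(a, ab, ca), (ab, b, bc), (ca, bc, c), (ab, bc, ca)]
    return new_vertices, new_triangles
```

The vertices appended by the n-th call form layer n of the LOD structure, which is why the number of new vertices grows with each iteration.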
S1703: Decode the bit stream to determine the first syntax identification information.
It should be noted that, in the embodiments of the present application, the first syntax identification information indicates whether the shift coefficients of the current layer in the current image use the first decoding mode. In some embodiments, the method may further include:
if the value of the first syntax identification information is the first value, determining that the first syntax identification information indicates that the shift coefficients of the current layer in the current image use the first decoding mode;
if the value of the first syntax identification information is the second value, determining that the first syntax identification information indicates that the shift coefficients of the current layer in the current image do not use the first decoding mode.
In the embodiments of the present application, the first value differs from the second value, and each may take the form of a parameter or a number. Specifically, the first syntax identification information may be a parameter written in the profile, or the value of a flag, which is not specifically limited here.
In the embodiments of the present application, the first syntax identification information is a syntax element at the LOD layer level and indicates whether the shift coefficients of an LOD layer of the current image use the first decoding mode.
In some embodiments, the method may further include:
decoding the bit stream to determine the second syntax identification information;
when the second syntax identification information indicates that the first decoding mode is enabled for the shift coefficients of the current sequence, decoding the bit stream to determine the third syntax identification information;
when the third syntax identification information indicates that the first decoding mode is enabled for the shift coefficients of the current image, decoding the bit stream to determine the first syntax identification information.
In the embodiments of the present application, the second syntax identification information is a syntax element at the Sequence Parameter Set (SPS) level, and the third syntax identification information is a syntax element at the Frame Parameter Set (FPS) level. The second syntax identification information indicates whether the first decoding mode is enabled for the shift coefficients of the current sequence, and the third syntax identification information indicates whether the first decoding mode is enabled for the shift coefficients of the current image. Here, the current sequence includes at least the current image, and the LOD layers into which the current image is divided include at least the current layer.
In some embodiments, if the value of the second syntax identification information is the first value, it is determined that the second syntax identification information indicates that the first decoding mode is enabled for the shift coefficients of the current sequence; if the value of the second syntax identification information is the second value, it is determined that the second syntax identification information indicates that the first decoding mode is not enabled for the shift coefficients of the current sequence.
In some embodiments, if the value of the third syntax identification information is the first value, it is determined that the third syntax identification information indicates that the first decoding mode is enabled for the shift coefficients of the current image; if the value of the third syntax identification information is the second value, it is determined that the third syntax identification information indicates that the first decoding mode is not enabled for the shift coefficients of the current image.
In the embodiments of the present application, the first value differs from the second value, and each may take the form of a parameter or a number. Specifically, the second syntax identification information and the third syntax identification information may be parameters written in the profile, or the values of flags, which is not specifically limited here.
For example, the first value may be set to 1 and the second value to 0; or the first value may be set to 0 and the second value to 1; or the first value may be set to true and the second value to false; or the first value may be set to false and the second value to true. In the embodiments of the present application, the first value is set to 1 and the second value is set to 0, but this is not specifically limited.
In other words, for the high-level syntax elements, whether the decoding method of the embodiments of the present application is enabled is determined first in the sequence parameter set and then in the frame parameter set. Finally, at each LOD layer, it is determined how the shift coefficients of the current layer are decoded, that is, whether the first decoding mode or the second decoding mode is used.
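The three-level signalling described above can be illustrated with a minimal Python sketch. It assumes one-bit flags with the first value equal to 1, and, purely for illustration, it reads the SPS-level, FPS-level, and LOD-level flags from a single flat sequence of bits; in a real bit stream these flags would sit in separate syntax structures. The function name `parse_first_mode_flags` is hypothetical.

```python
from typing import Iterator, List


def parse_first_mode_flags(bits: Iterator[int], num_lod_layers: int) -> List[bool]:
    """Return, for each LOD layer, whether the first decoding mode is used."""
    # Second syntax identification information (sequence level).
    if next(bits) != 1:
        return [False] * num_lod_layers
    # Third syntax identification information (image level), read only
    # when the sequence-level flag enables the first decoding mode.
    if next(bits) != 1:
        return [False] * num_lod_layers
    # First syntax identification information, one flag per LOD layer.
    return [next(bits) == 1 for _ in range(num_lod_layers)]
```

For example, `parse_first_mode_flags(iter([1, 1, 1, 0, 0, 1]), 4)` reports that layers 0 and 3 use the first decoding mode, while a leading 0 disables it for every layer without reading any further flags.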
S1704: When the first syntax identification information indicates that the shift coefficients of the current layer in the current image use the first decoding mode, determine the shift coefficients of the current layer in the reference image.
S1705: Determine the geometric position information of the reconstructed grid of the current layer in the current image according to the geometric position information of the initial grid of the current layer in the current image and the shift coefficients of the current layer in the reference image.
It should be noted that, in the embodiments of the present application, the first decoding mode means that decoding of the shift coefficients of the current layer in the current image is skipped, that is, these shift coefficients are not decoded, and the shift coefficients of the current layer in the reference image are used instead. Because the shift coefficients of the current layer in the current image need not be decoded, the coding and decoding efficiency of the geometric position information is improved while the reconstruction quality of the geometric position information of the grid is maintained.
In some embodiments, determining the geometric position information of the reconstructed grid of the current layer in the current image according to the geometric position information of the initial grid of the current layer in the current image and the shift coefficients of the current layer in the reference image may include: determining the mapping relationship between the geometric position information of the initial grid of the current layer in the current image, the shift coefficients of the current layer in the reference image, and the geometric position information of the reconstructed grid of the current layer in the current image; and determining the geometric position information of the reconstructed grid of the current layer in the current image according to the mapping relationship, the geometric position information of the initial grid of the current layer in the current image, and the shift coefficients of the current layer in the reference image.
It should also be noted that, in the embodiments of the present application, the mapping relationship may be a lookup table (Look Up Table, LUT) that records the correspondence between, on the one hand, the geometric position information of the initial grid of the current layer in the current image and the shift coefficients of the current layer in the reference image and, on the other hand, the geometric position information of the reconstructed grid of the current layer in the current image. Alternatively, the mapping relationship may be a preset function that characterizes the same correspondence.
It should also be noted that, in the embodiments of the present application, the mapping relationship may include at least one of the following: a mapping relationship based on a linear function, a mapping relationship based on a nonlinear function, and a mapping relationship based on a neural network. For example, fitting such a mapping relationship may include, but is not limited to, linear fitting, curve fitting, or convolution parameter fitting, which is not specifically limited here.
It should also be noted that, in the embodiments of the present application, the mapping relationship may be established by the decoding end according to relevant parameters, or may be determined by decoding the bit stream. In some embodiments, the method may further include:
decoding the bit stream to determine mapping indication information of the current layer in the current image; and determining, according to the mapping indication information, the mapping relationship between the geometric position information of the initial grid of the current layer in the current image, the shift coefficients of the current layer in the reference image, and the geometric position information of the reconstructed grid of the current layer in the current image.
That is to say, in the embodiments of the present application, the encoding end can determine the mapping relationship between the geometric position information of the initial grid of the current layer in the current image, the shift coefficients of the current layer in the reference image, and the geometric position information of the reconstructed grid of the current layer in the current image, and write this mapping relationship into the bit stream. The decoding end can then determine the mapping relationship by decoding the bit stream, and from it the geometric position information of the reconstructed grid of the current layer in the current image.
In a specific implementation, the mapping indication information includes first indication information, which indicates the fitting parameters of the mapping relationship. In some embodiments, determining the mapping relationship between the geometric position information of the initial grid of the current layer in the current image, the shift coefficients of the current layer in the reference image, and the geometric position information of the reconstructed grid of the current layer in the current image according to the mapping indication information may include: determining the fitting parameters of the mapping relationship according to the first indication information; and determining the mapping relationship according to the fitting parameters.
It should be noted that, in the embodiments of the present application, the bit stream is decoded to determine the fitting parameters of the mapping relationship; the mapping relationship between the geometric position information of the initial grid of the current layer in the current image, the shift coefficients of the current layer in the reference image, and the geometric position information of the reconstructed grid of the current layer in the current image can then be determined from the fitting parameters.
For example, when the mapping relationship is based on a linear function, the fitting parameters may include the slope and/or the intercept of the linear function.
For example, when the mapping relationship is based on a nonlinear function, the fitting parameters may include at least one constant of the nonlinear function. If the nonlinear function is an exponential function, the fitting parameters may include its constant a; if it is a polynomial function, the fitting parameters may include its coefficients $a_0, a_1, a_2, \ldots$; if it is a logarithmic function, the fitting parameters may include its base a.
In another specific implementation, the mapping indication information further includes second indication information, which indicates the type of the mapping relationship. In some embodiments, determining the mapping relationship between the geometric position information of the initial grid of the current layer in the current image, the shift coefficients of the current layer in the reference image, and the geometric position information of the reconstructed grid of the current layer in the current image according to the mapping indication information may include: determining the type of the mapping relationship according to the second indication information; and determining the mapping relationship according to the type of the mapping relationship and the fitting parameters.
In the embodiments of the present application, the type of the mapping relationship may include a linear function type, an exponential function type, a logarithmic function type, a polynomial function type, and the like, which is not specifically limited here.
It should also be noted that, in the embodiments of the present application, the bit stream is decoded to determine the type and the fitting parameters of the mapping relationship; the mapping relationship between the geometric position information of the initial grid of the current layer in the current image, the shift coefficients of the current layer in the reference image, and the geometric position information of the reconstructed grid of the current layer in the current image can then be determined from the type and the fitting parameters.
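How a decoding end might turn a decoded type and decoded fitting parameters into a usable mapping can be illustrated with the following Python sketch. The type names, the function `build_mapping`, and the concrete forms chosen for the exponential function (a to the power x) and the logarithmic function (logarithm of x to base a) are assumptions made for illustration; the present application names the function types and their parameters but does not fix these forms.

```python
import math
from typing import Callable, Sequence


def build_mapping(kind: str, params: Sequence[float]) -> Callable[[float], float]:
    """Build a per-coordinate mapping from a decoded type and fitting parameters."""
    if kind == "linear":
        k, b = params                        # slope and intercept
        return lambda x: k * x + b
    if kind == "polynomial":
        coeffs = list(params)                # a0, a1, a2, ...
        return lambda x: sum(a * x ** i for i, a in enumerate(coeffs))
    if kind == "exponential":
        (a,) = params                        # assumed form: a ** x
        return lambda x: a ** x
    if kind == "logarithmic":
        (a,) = params                        # assumed form: log of x to base a
        return lambda x: math.log(x, a)
    raise ValueError(f"unknown mapping type: {kind}")
```

The returned function would be applied to each coordinate of the sum of the initial grid position and the reference shift coefficient for the layer in question.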
For example, the mapping relationship can be expressed as:
$$\mathrm{predict}_{\mathrm{mesh}} = f(\mathrm{reconMesh} + \mathrm{refDisp},\ \mathrm{lvl}) \tag{8}$$
Here, lvl denotes the LOD layer, reconMesh denotes the geometric position information of the initial mesh, refDisp denotes the reconstructed Displacement coefficients of the reference image, and $\mathrm{predict}_{\mathrm{mesh}}$ denotes the geometric position information of the reconstructed mesh obtained through the functional relationship f.
In a specific implementation, a simple linear function is used to relate the geometric position information of the reconstructed mesh to the geometric position information of the initial mesh, as follows:
$$\mathrm{predict}_{\mathrm{mesh}} = k * (\mathrm{reconMesh} + \mathrm{refDisp},\ \mathrm{lvl}) + b \tag{9}$$
Here, * and + denote scalar multiplication and addition of vectors, and k and b are the fitting parameters.
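The parameter fitting at the encoding end and the use of formula (9) at the decoding end can be illustrated with the following Python sketch. It assumes scalar k and b shared by all coordinates of one LOD layer and an ordinary least-squares fit; the present application does not specify the fitting criterion, and the function names `fit_linear` and `apply_linear` are hypothetical. Coordinates are passed as flat lists for brevity.

```python
from typing import List, Sequence, Tuple


def fit_linear(pred: Sequence[float], orig: Sequence[float]) -> Tuple[float, float]:
    """Least-squares k and b such that orig is approximately k * pred + b."""
    n = len(pred)
    if n == 0 or n != len(orig):
        raise ValueError("pred and orig must be non-empty and of equal length")
    mean_x = sum(pred) / n
    mean_y = sum(orig) / n
    var_x = sum((x - mean_x) ** 2 for x in pred)
    if var_x == 0.0:
        # Degenerate case: all inputs equal, so fall back to a pure offset.
        return 1.0, mean_y - mean_x
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(pred, orig))
    k = cov / var_x
    return k, mean_y - k * mean_x


def apply_linear(recon_mesh: Sequence[float], ref_disp: Sequence[float],
                 k: float, b: float) -> List[float]:
    """Formula (9) for one layer: k * (reconMesh + refDisp) + b."""
    return [k * (p + d) + b for p, d in zip(recon_mesh, ref_disp)]
```

In this sketch the encoding end would call `fit_linear` with the sums reconMesh + refDisp and the original coordinates of a layer and signal the resulting k and b, and the decoding end would call `apply_linear` with the decoded k and b.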
In yet another specific implementation, the mapping indication information includes third indication information, which indicates the index number of the mapping relationship. In some embodiments, determining the mapping relationship between the geometric position information of the initial grid of the current layer in the current image, the shift coefficients of the current layer in the reference image, and the geometric position information of the reconstructed grid of the current layer in the current image according to the mapping indication information may include: determining the index number of the mapping relationship according to the third indication information; and determining the mapping relationship according to the index number.
In the embodiments of the present application, several mapping relationships are preset at both the encoding end and the decoding end, so the corresponding mapping relationship can be determined from the index number.
It should also be noted that, in the embodiments of the present application, the bit stream is decoded to determine the index number of the mapping relationship; the mapping relationship between the geometric position information of the initial grid of the current layer in the current image, the shift coefficients of the current layer in the reference image, and the geometric position information of the reconstructed grid of the current layer in the current image can then be determined from the index number.
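Index-based selection can be illustrated with the following Python sketch, in which both ends are assumed to hold the same preset table of linear parameters. The table contents and the names `PRESET_LINEAR_PARAMS` and `mapping_from_index` are hypothetical; the present application states only that several mapping relationships are preset at both ends.

```python
from typing import Callable, Tuple

# Hypothetical preset (k, b) pairs shared by the encoding end and the decoding end.
PRESET_LINEAR_PARAMS: Tuple[Tuple[float, float], ...] = (
    (1.0, 0.0),   # index 0: reuse reconMesh + refDisp unchanged
    (0.5, 0.0),   # index 1: halve the predicted position
    (1.0, 0.5),   # index 2: add a constant offset
)


def mapping_from_index(index: int) -> Callable[[float], float]:
    """Return the preset mapping selected by the decoded index number."""
    if not 0 <= index < len(PRESET_LINEAR_PARAMS):
        raise ValueError(f"index {index} is not a preset mapping")
    k, b = PRESET_LINEAR_PARAMS[index]
    return lambda x: k * x + b
```

Signalling only an index costs fewer bits than signalling the fitting parameters themselves, at the price of restricting the mapping to the preset choices.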
进一步地,在一些实施例中,在步骤S1703之后,参见图18,该方法还可以包括: Further, in some embodiments, after step S1703, referring to FIG. 18 , the method may further include:
S1801:在第一语法标识信息指示当前图像中的当前层的移位系数使用第二解码模式时,解码码流,确定当前图像中的当前层的移位系数。S1801: When the first syntax identification information indicates that the shift coefficient of the current layer in the current image uses the second decoding mode, decode the code stream to determine the shift coefficient of the current layer in the current image.
S1802:根据当前图像中的当前层的初始网格的几何位置信息与当前图像中的当前层的移位系数,确定当前图像中的当前层的重建网格的几何位置信息。S1802: Determine geometric position information of a reconstructed grid of the current layer in the current image according to geometric position information of an initial grid of the current layer in the current image and a shift coefficient of the current layer in the current image.
在本申请实施例中,第一解码模式与第二解码模式不同。其中,第一解码模式可以表征跳过解码当前图像中的当前层的移位系数,即不需要对当前图像中的当前层的移位系数进行解码;第二解码模式表征解码当前图像中的当前层的移位系数,即需要对当前图像中的当前层的移位系数进行解码。In an embodiment of the present application, the first decoding mode is different from the second decoding mode. The first decoding mode may represent skipping the decoding of the shift coefficient of the current layer in the current image, that is, there is no need to decode the shift coefficient of the current layer in the current image; the second decoding mode represents decoding the shift coefficient of the current layer in the current image, that is, it is necessary to decode the shift coefficient of the current layer in the current image.
在本申请实施例中,如果需要解码当前图像中的当前层的移位系数,那么在一些实施例中,解码码流,确定当前图像中的当前层的移位系数,可以包括:解码码流,确定当前图像中的当前层的二维图像;对二维图像进行系数重组处理,确定当前层的提升变换系数;对当前层的提升变换系数进行逆变换处理,确定当前层的移位系数。In an embodiment of the present application, if it is necessary to decode the shift coefficient of the current layer in the current image, then in some embodiments, decoding the code stream and determining the shift coefficient of the current layer in the current image may include: decoding the code stream to determine the two-dimensional image of the current layer in the current image; performing coefficient reorganization processing on the two-dimensional image to determine the lifting transform coefficient of the current layer; and performing inverse transform processing on the lifting transform coefficient of the current layer to determine the shift coefficient of the current layer.
在一种具体的实施例中,对二维图像进行系数重组处理,确定当前层的提升变换系数,可以包括:对二维图像进行系数重组处理,确定当前层的量化系数;对当前层的量化系数进行反量化处理,确定当前层的提升变换系数。In a specific embodiment, performing coefficient reorganization processing on a two-dimensional image to determine the lifting transform coefficients of the current layer may include: performing coefficient reorganization processing on the two-dimensional image to determine the quantization coefficients of the current layer; and performing inverse quantization processing on the quantization coefficients of the current layer to determine the lifting transform coefficients of the current layer.
具体来说,这里的码流可以是指移位系数码流。那么通过视频解码器(例如Video-Codec)解码移位系数码流,可以重建恢复出对应的二维图像;然后按照系数重组的方式,恢复得到当前层中每个点对应的量化系数;再利用反量化恢复得到当前层中每个点对应的提升变换系数;最后利用提升小波变换的逆变换恢复得到当前层中每个点对应的移位系数。Specifically, the code stream here can refer to the shift coefficient code stream. Then, by decoding the shift coefficient code stream through a video decoder (such as Video-Codec), the corresponding two-dimensional image can be reconstructed and restored; then, according to the coefficient reorganization method, the quantization coefficient corresponding to each point in the current layer is restored; then, the lifting transform coefficient corresponding to each point in the current layer is restored by inverse quantization; finally, the shift coefficient corresponding to each point in the current layer is restored by the inverse transform of the lifting wavelet transform.
这样,在得到每个点的移位系数之后,利用基础网格的几何位置信息与解码得到的移位系数能够重建恢复得到重构网格的几何位置信息。In this way, after obtaining the shift coefficient of each point, the geometric position information of the basic grid and the decoded shift coefficient can be used to reconstruct and restore the geometric position information of the reconstructed grid.
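The decoding chain for the second decoding mode can be sketched as follows. This is a simplified illustration under stated assumptions: uniform scalar dequantization with a single step size, an inverse transform supplied by the caller (the inverse lifting wavelet transform itself is not reproduced here), and a reconstruction that simply adds each shift coefficient to the corresponding position. All function and parameter names are hypothetical.

```python
def dequantize(quantized, step):
    """Uniform scalar dequantization (an assumed quantizer design)."""
    return [q * step for q in quantized]


def decode_shift_coefficients(quantized, step, inverse_transform):
    """Quantization coefficients -> lifting transform coefficients -> shifts.

    `inverse_transform` stands in for the inverse lifting wavelet transform.
    """
    lifting_coeffs = dequantize(quantized, step)
    return inverse_transform(lifting_coeffs)


def reconstruct_positions(initial, shifts):
    """Add each decoded shift coefficient to the matching position."""
    return [p + d for p, d in zip(initial, shifts)]
```

For example, with an identity function passed as `inverse_transform`, quantized values `[2, -1]` and a step of `0.5` give shifts `[1.0, -0.5]`, which are then added to the positions of the current layer.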
Furthermore, in the embodiments of the present application, identification information may also be set to determine the decoding method of the basic grid of the current image. Specifically, in some embodiments, the method may further include: decoding the code stream to determine fourth syntax identification information; and, when the fourth syntax identification information indicates that the basic grid of the current image uses the inter-frame processing method, performing the step of decoding the code stream to determine the first syntax identification information.
在一些实施例中,在第四语法标识信息指示当前图像的基础网格使用帧间处理方式时,该方法还可以包括:解码码流,确定当前图像的参考图像索引;根据当前图像的参考图像索引,确定参考图像。In some embodiments, when the fourth syntax identification information indicates that the basic grid of the current image uses inter-frame processing, the method may further include: decoding the code stream to determine the reference image index of the current image; and determining the reference image according to the reference image index of the current image.
It should be noted that, in the inter-frame prediction process, identification information may first be set to determine the decoding method of the basic grid of the current image. If the basic grid of the current image uses inter-frame decoding, the reference image index of the current image needs to be transmitted. Based on this algorithm, the embodiment of the present application first determines whether the basic grid of the current image can be inter-frame decoded. Only if it can is it then determined whether the shift coefficient of the current layer in the current image uses skip decoding; otherwise, the shift coefficient of the current layer in the current image uses the second decoding mode by default.
在一些实施例中,该方法还可以包括:在第四语法标识信息指示当前图像的基础网格不使用帧间处理方式时,解码码流,确定当前图像中的当前层的移位系数;根据当前图像中的当前层的初始网格的几何位置信息与当前图像中的当前层的移位系数,确定当前图像中的当前层的重建网格的几何位置信息。In some embodiments, the method may further include: when the fourth syntax identification information indicates that the basic grid of the current image does not use inter-frame processing, decoding the code stream and determining the shift coefficient of the current layer in the current image; determining the geometric position information of the reconstructed grid of the current layer in the current image based on the geometric position information of the initial grid of the current layer in the current image and the shift coefficient of the current layer in the current image.
也就是说,在本申请实施例中,如果当前图像的基础网格不采用帧间解码,这时候当前图像中的当前层的移位系数可以使用第二解码模式,即通过解码码流来确定当前图像中的当前层的移位系数,然后根据当前图像中的当前层的初始网格的几何位置信息与当前图像中的当前层的移位系数,确定当前图像中的当前层的重建网格的几何位置信息。That is to say, in an embodiment of the present application, if the basic grid of the current image does not adopt inter-frame decoding, the shift coefficient of the current layer in the current image can use the second decoding mode, that is, the shift coefficient of the current layer in the current image is determined by decoding the code stream, and then the geometric position information of the reconstructed grid of the current layer in the current image is determined according to the geometric position information of the initial grid of the current layer in the current image and the shift coefficient of the current layer in the current image.
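The parsing order described above (fourth syntax identification information first, first syntax identification information only for inter-coded basic grids) can be sketched as follows. This is an illustration under assumptions: the flag value 1 is taken to mean "inter-frame" and "skip" respectively, `read_flag` is a hypothetical bit reader, and parsing of the reference image index is omitted.

```python
def parse_shift_mode(read_flag):
    """Return "skip" (first decoding mode) or "decode" (second decoding mode).

    `read_flag` is a hypothetical callable returning the next flag (0 or 1).
    """
    base_grid_is_inter = read_flag()  # stands in for the fourth syntax flag
    if not base_grid_is_inter:
        # Intra-coded basic grid: the skip flag is not parsed at all and the
        # shift coefficients are decoded from the code stream by default.
        return "decode"
    skip = read_flag()  # stands in for the first syntax flag
    return "skip" if skip else "decode"
```

With this ordering, a frame whose basic grid is not inter-coded spends no bits on the skip flag.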
Furthermore, in the inter-frame prediction process, identification information may first be set to determine the decoding method of the basic grid of the current image; if the basic grid of the current image uses inter-frame decoding, the reference image index of the current image needs to be transmitted. Alternatively, in the embodiments of the present application, identification information indicating whether the shift coefficient of the current layer in the current image uses skip decoding may be transmitted regardless of whether the basic grid of the current image uses inter-frame decoding.
Furthermore, in the embodiments of the present application, the geometric position information of the initial mesh obtained by subdividing the basic mesh of the current image is combined with the shift coefficient of the reference image to reconstruct the geometric position information of the mesh points. This scheme reduces the code stream of the shift coefficient at the cost of a certain loss of accuracy in the reconstructed mesh point positions, thereby improving the coding efficiency of the mesh. In the embodiments of the present application, parameter fitting may also be used to establish a mapping relationship between the predicted mesh point geometric position information and the original mesh point geometric position information, which helps preserve the quality of the reconstructed mesh point geometric position information while reducing the code stream size of the shift coefficient, further improving the coding efficiency of the mesh.
This embodiment provides a decoding method, including: determining the basic grid of the current image; subdividing the basic grid to determine the geometric position information of the initial grid of the current layer in the current image; decoding the code stream to determine the first syntax identification information; when the first syntax identification information indicates that the shift coefficient of the current layer in the current image uses the first decoding mode, determining the shift coefficient of the current layer in the reference image; and determining the geometric position information of the reconstructed grid of the current layer in the current image based on the geometric position information of the initial grid of the current layer in the current image and the shift coefficient of the current layer in the reference image. In this way, the first syntax identification information can indicate whether decoding of the shift coefficient of the current layer in the current image is skipped. When it is skipped, the geometric position information of the reconstructed grid is determined from the geometric position information of the initial grid of the current layer in the current image and the shift coefficient of the current layer in the reference image. This not only reduces the code stream of the shift coefficient but also maintains the reconstructed geometric quality of the points in the grid, thereby improving the quality of the geometric information of the points in the grid and, in turn, the encoding and decoding efficiency.
在本申请的另一实施例中,参见图19,其示出了本申请实施例提供的一种编码方法的流程示意图。如图19所示,该方法可以包括:In another embodiment of the present application, referring to FIG19, a schematic diagram of a flow chart of an encoding method provided in an embodiment of the present application is shown. As shown in FIG19, the method may include:
S1901:根据当前图像的原始网格,确定当前图像的基础网格。S1901: Determine a basic grid of the current image according to the original grid of the current image.
需要说明的是,本申请实施例的编码方法可以是指帧间编码方法,更具体地,可以是一种关于动态网格中的移位系数(Displacement系数)的帧间编码方法。其中,该编码方法可以应用于V-DMC中的编码器,但是并不局限于此。It should be noted that the encoding method in the embodiment of the present application may refer to an inter-frame encoding method, and more specifically, may be an inter-frame encoding method for displacement coefficients in a dynamic grid. The encoding method may be applied to an encoder in a V-DMC, but is not limited thereto.
还需要说明的是,在本申请实施例中,基础网格又可以称为“简化网格”。在一些实施例中,根据当前图像的原始网格,确定当前图像的基础网格,可以包括:对当前图像的原始网格进行下采样处理,确定当前图像的基础网格。It should also be noted that in the embodiments of the present application, the basic grid can also be referred to as a "simplified grid". In some embodiments, determining the basic grid of the current image based on the original grid of the current image can include: performing downsampling processing on the original grid of the current image to determine the basic grid of the current image.
示例性地,首先可以对当前图像的原始网格进行下采样处理,生成顶点数量大幅减少的基础网格。另外,在得到基础网格之后,还可以对基础网格进行编码,将所得到的编码比特写入码流。For example, the original mesh of the current image may be downsampled to generate a base mesh with a significantly reduced number of vertices. In addition, after the base mesh is obtained, the base mesh may be encoded and the obtained encoding bits may be written into the bitstream.
Exemplarily, the code stream here may refer to the basic grid code stream. Specifically, a dynamic grid encoder (for example, DRACO) may be used to encode the geometric information of the basic grid, and the resulting coded bits are written into the basic grid code stream. The geometric information mainly includes the connection relationship and the geometric position information.
Exemplarily, the geometric information of the basic grid may be encoded as follows: first, the connection relationship is encoded; second, the geometric position information of the points is encoded based on the connection relationship; finally, the texture position information is encoded based on the connection relationship and the geometric position information.
S1902:对基础网格进行细分,确定当前图像中的当前层的初始网格的几何位置信息。S1902: Subdivide the basic grid to determine the geometric position information of the initial grid of the current layer in the current image.
需要说明的是,在本申请实施例中,通过对当前图像进行LOD划分,可以确定至少一层;其中,至少一层可以包括当前层。It should be noted that, in the embodiment of the present application, by performing LOD division on the current image, at least one layer can be determined; wherein, the at least one layer can include the current layer.
It should also be noted that, in the embodiments of the present application, after the basic grid is obtained, it can be subdivided by inserting new vertices on the boundaries of the basic grid to generate the geometric position information of the initial grid. Exemplarily, FIG. 13A shows the geometric position information of the initial grid obtained after three subdivision iterations of the basic grid: the basic grid is regarded as iteration 0 and corresponds to layer 0 (level 0); the vertices added in the first iteration form layer 1 (level 1); those added in the second iteration form layer 2 (level 2); and those added in the third iteration form layer 3 (level 3). The specific LOD division structure is shown in FIG. 13B: the top layer is the basic grid, and as the iterations proceed, the number of vertices added in each iteration increases, forming a pyramid structure.
在一些实施例中,对基础网格进行细分,确定当前图像中的当前层的初始网格的几何位置信息,可以包括:根据网格细分模式对基础网格进行迭代划分,确定当前图像中的当前层的初始网格的几何位置信息。In some embodiments, subdividing the base grid to determine the geometric position information of the initial grid of the current layer in the current image may include: iteratively dividing the base grid according to a grid subdivision mode to determine the geometric position information of the initial grid of the current layer in the current image.
具体来说,在本申请实施例中,网格细分模式可以理解为对基础网格每个边界上的顶点进行上采样处理,或者也可以理解为对基础网格每个边界上的顶点进行插值处理。示例性地,网格细分模式包括细分算法和细分迭代次数。在一些实施例中,细分算法可以为插值算法,例如细分算法可以为线性插值算法,或者也可以为非线性插值算法,在此不作具体限定。Specifically, in the embodiment of the present application, the mesh subdivision mode can be understood as upsampling the vertices on each boundary of the base mesh, or can also be understood as interpolating the vertices on each boundary of the base mesh. Exemplarily, the mesh subdivision mode includes a subdivision algorithm and a number of subdivision iterations. In some embodiments, the subdivision algorithm can be an interpolation algorithm, for example, the subdivision algorithm can be a linear interpolation algorithm, or can also be a nonlinear interpolation algorithm, which is not specifically limited here.
Here, after the base mesh of the current image is obtained, a subdivision algorithm is used to subdivide the base mesh into the initial mesh. Exemplarily, the initial mesh (also called the "subdivided mesh") can be obtained from the base mesh through a linear interpolation algorithm. In each iteration, the coordinates of a newly inserted point are obtained by linear interpolation between the two vertices on the current boundary:
$$pos_{new} = \frac{pos_1 + pos_2}{2}$$
where pos_1 and pos_2 are the geometric position coordinates of the two end vertices of the current boundary participating in this iteration, and pos_new is the geometric position coordinates of the vertex newly added in this iteration.
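One subdivision iteration can be sketched as below. This is a minimal illustration that assumes the linear interpolation places each new vertex at the midpoint of its edge (the "boundary" in the text) and that each triangle is split into four. The function names and the triangle-list data layout are hypothetical.

```python
def midpoint(pos1, pos2):
    """Linear interpolation at the middle of an edge (assumed weight 1/2)."""
    return tuple((a + b) / 2.0 for a, b in zip(pos1, pos2))


def subdivide_once(vertices, triangles):
    """Insert one new vertex per edge and split each triangle into four."""
    vertices = list(vertices)
    edge_to_new = {}  # shared edges must reuse the same inserted vertex

    def new_vertex(i, j):
        key = (min(i, j), max(i, j))
        if key not in edge_to_new:
            vertices.append(midpoint(vertices[i], vertices[j]))
            edge_to_new[key] = len(vertices) - 1
        return edge_to_new[key]

    new_triangles = []
    for a, b, c in triangles:
        ab, bc, ca = new_vertex(a, b), new_vertex(b, c), new_vertex(c, a)
        new_triangles += [(a, ab, ca), (ab, b, bc), (ca, bc, c), (ab, bc, ca)]
    return vertices, new_triangles
```

Starting from a single triangle, the first iteration adds 3 vertices and the second adds 9, which matches the pyramid-shaped LOD structure in which each level contributes more new vertices than the previous one.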
S1903:确定参考图像中的当前层的移位系数,并根据当前图像中的当前层的初始网格的几何位置信息与参考图像中的当前层的移位系数,确定当前图像中的当前层的第一重建网格的几何位置信息。S1903: Determine a shift coefficient of the current layer in the reference image, and determine the geometric position information of the first reconstructed grid of the current layer in the current image based on the geometric position information of the initial grid of the current layer in the current image and the shift coefficient of the current layer in the reference image.
需要说明的是,在本申请实施例中,参考图像为当前图像之前的已编码图像。示例性地,参考图像可以为当前图像的前一帧图像,但是并不作具体限定。It should be noted that, in the embodiment of the present application, the reference image is an encoded image before the current image. Exemplarily, the reference image may be an image frame before the current image, but this is not specifically limited.
In some embodiments, determining the geometric position information of the first reconstructed grid of the current layer in the current image according to the geometric position information of the initial grid of the current layer in the current image and the shift coefficient of the current layer in the reference image may include: determining a mapping relationship between the geometric position information of the initial grid of the current layer in the current image, the shift coefficient of the current layer in the reference image, and the geometric position information of the first reconstructed grid of the current layer in the current image; and determining the geometric position information of the first reconstructed grid of the current layer in the current image according to the mapping relationship, the geometric position information of the initial grid of the current layer in the current image, and the shift coefficient of the current layer in the reference image.
还需要说明的是,在本申请实施例中,映射关系可以是一个查找表(Look Up Table,LUT),该查找表可以记录当前图像中的当前层的初始网格的几何位置信息和参考图像中的当前层的移位系数与第一重建网格的几何位置信息之间的对应关系;或者,映射关系还可以是一种预设函数,该预设函数可以表征当前图像中的当前层的初始网格的几何位置信息和参考图像中的当前层的移位系数与第一重建网格的几何位置信息之间的对应关系。It should also be noted that, in the embodiments of the present application, the mapping relationship can be a lookup table (Look Up Table, LUT), which can record the correspondence between the geometric position information of the initial grid of the current layer in the current image and the shift coefficient of the current layer in the reference image and the geometric position information of the first reconstructed grid; or, the mapping relationship can also be a preset function, which can characterize the correspondence between the geometric position information of the initial grid of the current layer in the current image and the shift coefficient of the current layer in the reference image and the geometric position information of the first reconstructed grid.
It should also be noted that, in the embodiments of the present application, the mapping relationship may include at least one of the following: a mapping relationship based on a linear function, a mapping relationship based on a nonlinear function, and a mapping relationship based on a neural network. Exemplarily, the fitting of such a mapping relationship may include, but is not limited to, linear fitting, curve fitting, or convolution parameter fitting, which is not specifically limited here.
还需要说明的是,在本申请实施例中,对于映射关系而言,编码端在确定出映射关系之后可以写入码流。在一些实施例中,该方法还可以包括:基于映射关系,确定当前图像中的当前层的映射指示信息;对映射指示信息进行编码处理,将所得到的编码比特写入码流。It should also be noted that in the embodiment of the present application, for the mapping relationship, the encoding end can write the bitstream after determining the mapping relationship. In some embodiments, the method may also include: determining the mapping indication information of the current layer in the current image based on the mapping relationship; encoding the mapping indication information, and writing the obtained encoding bits into the bitstream.
也就是说,在本申请实施例中,编码端可以确定当前图像中的当前层的初始网格的几何位置信息、参考图像中的当前层的移位系数与第一重建网格的几何位置信息之间的映射关系并将该映射关系写入码流;这样,解码端通过解码码流,就可以确定出该映射关系,进而恢复得到第一重建网格的几何位置信息。That is to say, in an embodiment of the present application, the encoding end can determine the mapping relationship between the geometric position information of the initial grid of the current layer in the current image, the shift coefficient of the current layer in the reference image and the geometric position information of the first reconstructed grid, and write the mapping relationship into the bit stream; in this way, the decoding end can determine the mapping relationship by decoding the bit stream, and then restore the geometric position information of the first reconstructed grid.
在一种具体的实现方式中,映射指示信息包括第一指示信息;其中,第一指示信息用于指示该映射关系的拟合参数。在一些实施例中,该方法还可以包括:确定映射关系的拟合参数;对映射关系的拟合参数进行编码处理,将所得到的编码比特写入码流。In a specific implementation, the mapping indication information includes first indication information; wherein the first indication information is used to indicate the fitting parameters of the mapping relationship. In some embodiments, the method may also include: determining the fitting parameters of the mapping relationship; encoding the fitting parameters of the mapping relationship, and writing the obtained encoding bits into the bitstream.
在一种具体的实现方式中,映射指示信息还包括第二指示信息;其中,第二指示信息用于指示该映射关系的类型。在一些实施例中,该方法还可以包括:确定映射关系的类型和拟合参数;对映射关系的类型和拟合参数进行编码处理,将所得到的编码比特写入码流。In a specific implementation, the mapping indication information further includes second indication information; wherein the second indication information is used to indicate the type of the mapping relationship. In some embodiments, the method may further include: determining the type and fitting parameters of the mapping relationship; encoding the type and fitting parameters of the mapping relationship, and writing the obtained encoding bits into the bitstream.
Exemplarily, when the mapping relationship is based on a linear function, the fitting parameters may include the slope and/or intercept of the linear function. When the mapping relationship is based on a nonlinear function, the fitting parameters may include at least one constant of the nonlinear function. For example, if the nonlinear function is an exponential function, the fitting parameters may include the constant a of the function; if it is a polynomial function, the fitting parameters may include the coefficients a0, a1, a2, … of the polynomial; if it is a logarithmic function, the fitting parameters may include the base a of the function.
示例性地,映射关系的类型可以包括线性函数类型、指数函数类型、对数函数类型、多项式函数类型等,这里不作具体限定。Exemplarily, the type of mapping relationship may include a linear function type, an exponential function type, a logarithmic function type, a polynomial function type, etc., which is not specifically limited here.
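For the linear function type, the slope and intercept could be obtained at the encoding end by an ordinary least-squares fit. The sketch below is an assumption about how such a fit might be done, since the application does not prescribe a fitting procedure. Here `predicted` stands for the per-coordinate values derived from the initial grid and the reference shift coefficients, and `original` for the matching values of the original grid. Both names are hypothetical.

```python
def fit_linear(predicted, original):
    """Least-squares fit of original ~ k * predicted + b over scalar samples."""
    n = len(predicted)
    if n == 0 or n != len(original):
        raise ValueError("need two non-empty sequences of equal length")
    mean_x = sum(predicted) / n
    mean_y = sum(original) / n
    sxx = sum((x - mean_x) ** 2 for x in predicted)
    if sxx == 0.0:
        raise ValueError("predicted values are all identical; slope undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(predicted, original))
    k = sxy / sxx
    b = mean_y - k * mean_x
    return k, b
```

The resulting pair (k, b) is the kind of fitting parameter that the first indication information would carry in the code stream.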
In the embodiments of the present application, the specific formula of the mapping relationship is as follows:
$$\mathrm{predict}_{mesh} = f(\mathrm{reconMesh} + \mathrm{refDisp},\ lvl) \tag{11}$$
where lvl denotes the LOD layer, reconMesh denotes the geometric position information of the initial mesh, refDisp denotes the reconstructed shift coefficient (Displacement coefficient) of the reference image, and predict_mesh denotes the geometric position information of the first reconstructed mesh obtained through the functional relationship f.
In a specific implementation, a simple linear function is used to obtain the correspondence between the geometric position information of the first reconstructed grid and the geometric position information of the initial grid, as shown below:
$$\mathrm{predict}_{mesh} = k * (\mathrm{reconMesh} + \mathrm{refDisp},\ lvl) + b \tag{12}$$
where * and + denote scalar multiplication and addition of vectors, and k and b denote the fitting parameters.
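The linear mapping can be sketched as follows. This is an illustration under assumptions: k and b are taken to be scalars, lvl is interpreted as selecting a per-layer pair (k, b), and the vertex data are tuples of coordinates. The function name `predict_mesh_linear` and the `params` dictionary are hypothetical.

```python
def predict_mesh_linear(recon_mesh, ref_disp, params, lvl):
    """Apply k * (reconMesh + refDisp) + b with the (k, b) pair of layer lvl.

    recon_mesh: list of vertex positions of the initial mesh for this layer.
    ref_disp:   list of shift vectors taken from the reference image.
    params:     dict mapping an LOD layer number to its (k, b) pair.
    """
    k, b = params[lvl]
    return [
        tuple(k * (p + d) + b for p, d in zip(pos, disp))
        for pos, disp in zip(recon_mesh, ref_disp)
    ]
```

With k = 1 and b = 0 this reduces to adding the reference shift coefficients directly to the initial mesh positions.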
在又一种具体的实现方式中,映射指示信息包括第三指示信息;其中,第三指示信息用于指示该映射关系的索引序号。在一些实施例中,该方法还可以包括:确定映射关系的索引序号;对映射关系的索引序号进行编码处理,将所得到的编码比特写入码流。In another specific implementation, the mapping indication information includes third indication information; wherein the third indication information is used to indicate the index number of the mapping relationship. In some embodiments, the method may also include: determining the index number of the mapping relationship; encoding the index number of the mapping relationship, and writing the obtained encoding bits into the bit stream.
在本申请实施例中,编码端和解码端均预先设置有几种映射关系,这时候根据索引序号就可以确定出对应的映射关系。这样,编码端在确定出映射关系的索引序号之后,可以将其写入码流;后续在解码端通过解码码流即可获得映射关系的索引序号,然后根据映射关系的索引序号来确定对应的映射关系。In the embodiment of the present application, the encoding end and the decoding end are both pre-set with several mapping relationships, and the corresponding mapping relationship can be determined according to the index number. In this way, after determining the index number of the mapping relationship, the encoding end can write it into the code stream; subsequently, the decoding end can obtain the index number of the mapping relationship by decoding the code stream, and then determine the corresponding mapping relationship according to the index number of the mapping relationship.
S1904:根据第一重建网格的几何位置信息和原始网格的几何位置信息,确定是否对当前图像中的当前层的移位系数进行编码处理。S1904: Determine whether to perform encoding processing on the shift coefficients of the current layer in the current image according to the geometric position information of the first reconstructed grid and the geometric position information of the original grid.
需要说明的是,在一些实施例中,根据第一重建网格的几何位置信息和原始网格的几何位置信息,确定是否对当前图像中的当前层的移位系数进行编码处理,可以包括:对第一重建网格的几何位置信息和原始网格的几何位置信息进行误差计算,确定第一误差结果;基于第一误差结果,确定当前图像中的当前层的移位系数的编码模式,其中,编码模式用于表征是否对当前图像中的当前层的移位系数进行编码处理。It should be noted that, in some embodiments, determining whether to encode the shift coefficient of the current layer in the current image based on the geometric position information of the first reconstructed grid and the geometric position information of the original grid may include: performing error calculation on the geometric position information of the first reconstructed grid and the geometric position information of the original grid to determine a first error result; based on the first error result, determining the encoding mode of the shift coefficient of the current layer in the current image, wherein the encoding mode is used to characterize whether to encode the shift coefficient of the current layer in the current image.
还需要说明的是,在一些实施例中,该方法还可以包括:根据当前图像中的当前层的初始网格的几何位置信息与当前图像中的当前层的移位系数,确定当前图像中的当前层的第二重建网格的几何位置信息;对第二重建网格的几何位置信息和原始网格的几何位置信息进行误差计算,确定第二误差结果。 It should also be noted that, in some embodiments, the method may further include: determining the geometric position information of a second reconstructed grid of the current layer in the current image based on the geometric position information of the initial grid of the current layer in the current image and the shift coefficient of the current layer in the current image; performing error calculation on the geometric position information of the second reconstructed grid and the geometric position information of the original grid to determine a second error result.
在本申请实施例中,对于当前图像中的当前层的移位系数,该方法还可以包括:根据当前图像中的当前层的初始网格的几何位置信息与当前图像中的当前层的原始网格的几何位置信息,确定当前图像中的当前层的移位系数。In an embodiment of the present application, for the shift coefficient of the current layer in the current image, the method may also include: determining the shift coefficient of the current layer in the current image based on the geometric position information of the initial grid of the current layer in the current image and the geometric position information of the original grid of the current layer in the current image.
在一种具体的实施例中,根据当前图像中的当前层的初始网格的几何位置信息与当前图像中的当前层的原始网格的几何位置信息,确定当前图像中的当前层的移位系数,可以包括:基于初始网格中的第一顶点,确定第一顶点在初始网格与原始网格之间的误差值,以及确定第一顶点的法向量;根据第一顶点的误差值与第一顶点的法向量,计算第一顶点的移位系数;其中,第一顶点为初始网格中的任意一个顶点。In a specific embodiment, determining the shift coefficient of the current layer in the current image based on the geometric position information of the initial mesh of the current layer in the current image and the geometric position information of the original mesh of the current layer in the current image can include: determining the error value of the first vertex between the initial mesh and the original mesh based on the first vertex in the initial mesh, and determining the normal vector of the first vertex; calculating the shift coefficient of the first vertex based on the error value of the first vertex and the normal vector of the first vertex; wherein the first vertex is any vertex in the initial mesh.
It should be noted that, in the embodiment of the present application, for the first vertex, the error value Delta between the initial mesh and the original mesh can be calculated first; Delta may be the error of a point in the world coordinate system. The error Delta of the first vertex and the normal vector Norm of the first vertex are then used to calculate the shift coefficient of the first vertex; see Figure 8 for details. Exemplarily, a specific calculation is as follows:
Displacement=Delta×Norm (13)
where Displacement is the shift coefficient.
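As a minimal sketch of equation (13), the following Python reads the product Delta×Norm as a projection of the per-vertex error onto the unit vertex normal, which yields one scalar per vertex. That reading is an assumption made for illustration; the source does not state whether × denotes a dot product or a change to a local coordinate system. The function names are hypothetical.

```python
def normalize(v):
    """Return v scaled to unit length (v is a 3-tuple of floats)."""
    length = sum(c * c for c in v) ** 0.5
    if length == 0.0:
        raise ValueError("normal vector must be non-zero")
    return tuple(c / length for c in v)


def displacement_along_normal(initial, original, normal):
    """Assumed reading of Displacement = Delta x Norm.

    Delta is the world-space error original - initial for one vertex;
    the result is its signed length along the unit vertex normal.
    """
    delta = tuple(o - i for o, i in zip(original, initial))
    unit = normalize(normal)
    return sum(d * n for d, n in zip(delta, unit))


def reconstruct_vertex(initial, normal, displacement):
    """Move the initial vertex along its unit normal by the displacement."""
    unit = normalize(normal)
    return tuple(i + displacement * n for i, n in zip(initial, unit))
```

Under this reading, a vertex at (0, 0, 0) whose original position is (0, 0, 2) and whose normal points along +z gets a displacement of 2, and applying that displacement recovers the original position.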
In this way, after the second error result is obtained, the coding mode of the shift coefficient of the current layer in the current image is determined according to the first error result and the second error result. Specifically, in some embodiments, the method may include:
determining an error ratio of the first error result to the second error result;
if the error ratio is less than a preset threshold, determining that the shift coefficient of the current layer in the current image uses the first coding mode;
if the error ratio is greater than the preset threshold, determining that the shift coefficient of the current layer in the current image uses the second coding mode.
It should be noted that, in the embodiment of the present application, the first coding mode is different from the second coding mode. The first coding mode indicates that encoding of the shift coefficient of the current layer in the current image is skipped; the second coding mode indicates that the shift coefficient of the current layer in the current image is encoded.
It should also be noted that, in the embodiment of the present application, when the error ratio is equal to the preset threshold, either the first coding mode or the second coding mode may be selected for the shift coefficient of the current layer in the current image.
In a specific embodiment, referring to FIG. 20, the method may include:
S2001: Determine the geometric position information of a first reconstructed mesh of the current layer in the current image according to the geometric position information of the initial mesh of the current layer in the current image and the shift coefficient of the current layer in the reference image.
S2002: Perform an error calculation on the geometric position information of the first reconstructed mesh and the geometric position information of the original mesh to determine a first error result.
S2003: Determine the geometric position information of a second reconstructed mesh of the current layer in the current image according to the geometric position information of the initial mesh of the current layer in the current image and the shift coefficient of the current layer in the current image.
S2004: Perform an error calculation on the geometric position information of the second reconstructed mesh and the geometric position information of the original mesh to determine a second error result.
S2005: Determine the error ratio of the first error result to the second error result.
S2006: If the error ratio is less than a preset threshold, skip encoding the shift coefficient of the current layer in the current image.
S2007: If the error ratio is greater than the preset threshold, encode the shift coefficient of the current layer in the current image.
In the embodiment of the present application, the first error result and the second error result can be obtained through a distortion calculation. Here, the mean square error (MSE) algorithm can be used to obtain the error of the geometric position information of the mesh at each LOD layer, as shown below:
That is to say, before the Displacement coefficient of the lvl layer in the current image is encoded, two errors are calculated. First, the error between the original mesh and the reconstructed mesh obtained with the original coding method (the second reconstructed mesh) is calculated; this is the second error result, denoted Dist_org. Second, using the method of the embodiment of the present application, the geometric position information of the initial mesh obtained by subdividing the base mesh is combined with the Displacement coefficient in the reference image to obtain the first reconstructed mesh, and the error between its geometric position information and the original mesh is calculated; this is the first error result, denoted Dist_skip.
In this way, the error ratio can be expressed as Dist_skip/Dist_org. If Dist_skip/Dist_org is less than the preset threshold, the first coding mode is selected for the Displacement coefficient of the lvl layer in the current image, that is, encoding is skipped and the Displacement coefficient of the lvl layer in the current image is not encoded. If Dist_skip/Dist_org is greater than the preset threshold, the second coding mode is selected, that is, the Displacement coefficient of the lvl layer in the current image needs to be encoded.
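The decision in S2001 to S2007 can be sketched as follows. This is an illustrative sketch, not the formula of the source: the MSE here is the textbook definition (sum of squared coordinate differences averaged over vertices), which is assumed because the source equation is not reproduced in the text. The function names, the handling of a zero Dist_org, and the choice of the second coding mode when the ratio equals the threshold are also assumptions; the source allows either mode in the equal case.

```python
def mse(reconstructed, original):
    """Mean squared error between two equally long lists of 3D vertices."""
    if not reconstructed or len(reconstructed) != len(original):
        raise ValueError("vertex lists must be non-empty and equally long")
    total = 0.0
    for rec, org in zip(reconstructed, original):
        total += sum((a - b) ** 2 for a, b in zip(rec, org))
    return total / len(reconstructed)


def choose_mode_by_ratio(dist_skip, dist_org, threshold):
    """Return "skip" (first coding mode) or "code" (second coding mode).

    dist_skip: error of the mesh rebuilt with the reference image's
               shift coefficients (first error result).
    dist_org:  error of the mesh rebuilt with the current image's own
               shift coefficients (second error result).
    """
    if dist_org == 0.0:
        # Assumption: with a perfect regular reconstruction, skip only
        # if the skip reconstruction is perfect as well.
        return "skip" if dist_skip == 0.0 else "code"
    ratio = dist_skip / dist_org
    # Assumption: the equal case falls through to "code".
    return "skip" if ratio < threshold else "code"
```

For example, with Dist_skip = 1.0 and Dist_org = 0.25 the ratio is 4.0, so a threshold of 1.2 selects the second coding mode, while a threshold of 5.0 would allow the layer to be skipped.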
In some embodiments, the method may further include: performing a rate-distortion cost calculation on the shift coefficient of the current layer in the current image according to the first coding mode to determine a first rate-distortion result; performing a rate-distortion cost calculation on the shift coefficient of the current layer in the current image according to the second coding mode to determine a second rate-distortion result;
and determining the coding mode of the shift coefficient of the current layer in the current image according to the first rate-distortion result and the second rate-distortion result.
In a specific embodiment, determining the coding mode of the shift coefficient of the current layer in the current image according to the first rate-distortion result and the second rate-distortion result may include:
if the first rate-distortion result is less than the second rate-distortion result, determining that the shift coefficient of the current layer in the current image uses the first coding mode;
if the second rate-distortion result is less than the first rate-distortion result, determining that the shift coefficient of the current layer in the current image uses the second coding mode.
It should be noted that, in the embodiment of the present application, when the first rate-distortion result is equal to the second rate-distortion result, either the first coding mode or the second coding mode may be selected for the shift coefficient of the current layer in the current image.
It should also be noted that, in the embodiment of the present application, in order to measure more accurately whether the first coding mode or the second coding mode improves coding performance, a rate-distortion trade-off is evaluated for both coding modes. Here, a rate-distortion cost can be used to weigh the quality improvement against the increase in bit rate. The first rate-distortion result and the second rate-distortion result represent the rate-distortion cost, relative to the original mesh, of the reconstructed mesh obtained with the first coding mode and with the second coding mode respectively, and indicate the compression efficiency of skipping or not skipping the encoding of the shift coefficient. The first rate-distortion result and the second rate-distortion result are calculated as follows:
J=D+λ×R (15)
Here, J is the rate-distortion result; D is the distortion between the original mesh and the reconstructed mesh obtained with the first coding mode or the second coding mode, for example the sum of squared errors of corresponding points; λ is a quantity related to the quantization parameter QP; and R is the total geometry bitstream size divided by the number of frames.
Exemplarily, if the first rate-distortion result is less than the second rate-distortion result, the first coding mode is determined to be the best coding mode; the shift coefficient of the lvl layer in the current image then uses the first coding mode, that is, encoding is skipped and the Displacement coefficient of the lvl layer in the current image is not encoded. If the second rate-distortion result is less than the first rate-distortion result, the second coding mode is determined to be the best coding mode; the shift coefficient of the lvl layer in the current image then uses the second coding mode, that is, the Displacement coefficient of the lvl layer in the current image needs to be encoded.
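The comparison based on equation (15) can be sketched in a few lines. The function and parameter names are hypothetical, the numeric values in the usage note are made up purely to exercise the comparison, and the tie is resolved in favour of skipping as an assumption (the source permits either mode when the two costs are equal).

```python
def rd_cost(distortion, rate, lam):
    """Rate-distortion cost J = D + lambda * R from equation (15)."""
    return distortion + lam * rate


def choose_mode_by_rd(d_skip, r_skip, d_code, r_code, lam):
    """Return "skip" (first coding mode) or "code" (second coding mode).

    d_skip, r_skip: distortion and rate when the layer's shift
                    coefficients are not encoded.
    d_code, r_code: distortion and rate when they are encoded.
    """
    j_skip = rd_cost(d_skip, r_skip, lam)
    j_code = rd_cost(d_code, r_code, lam)
    # Assumption: on a tie, prefer skipping because it spends fewer bits.
    return "skip" if j_skip <= j_code else "code"
```

With a distortion of 4.0 at a rate of 1 for skipping and a distortion of 1.0 at a rate of 100 for coding, a large λ such as 0.1 favours skipping, whereas a small λ such as 0.01 favours coding, which matches the usual behaviour of λ growing with QP.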
Furthermore, in some embodiments, the method may further include: when the shift coefficient of the current layer in the current image uses the second coding mode, encoding the shift coefficient of the current layer in the current image and writing the resulting coded bits into the bitstream.
In an embodiment of the present application, encoding the shift coefficient of the current layer in the current image and writing the resulting coded bits into the bitstream may specifically include: performing a lifting transform on the shift coefficient of the current layer in the current image to determine lifting transform coefficients; quantizing the lifting transform coefficients to determine quantized coefficients; reorganizing the quantized coefficients to determine a two-dimensional image; and encoding the two-dimensional image and writing the resulting coded bits into the bitstream.
Exemplarily, the implementation steps of the shift coefficient encoding method provided in the embodiment of the present application may include:
First, the base mesh is iteratively subdivided according to a certain algorithm to obtain the corresponding mesh position information. The subdivision algorithm is consistent with the foregoing description: the vertices on each edge are linearly interpolated to obtain the corresponding geometric position information. Assuming that the subdivision is iterated N times, LOD partitioning is performed on the Displacement according to the vertices obtained in the different subdivision iterations, giving the corresponding mesh geometric position information; an error calculation between the initially obtained geometric position information and the original mesh then gives the Displacement of each point.
Secondly, based on the LOD spatial structure, a lifting wavelet transform (Lifting Transform) is applied to the shift coefficients. It includes two steps, prediction and update. The prediction step is as follows:
The update step is as follows:
Finally, the transformed coefficients are quantized, and the quantized coefficients are reorganized. The current V-DMC reorganizes coefficients block by block; each block is 16×16 in size, and the coefficients inside each block are arranged in Morton order to obtain the corresponding two-dimensional image. After this series of operations, a video codec is used to encode the two-dimensional image.
Exemplarily, an entropy encoder may also be used to encode the shift coefficients directly to obtain a shift coefficient bitstream.
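The coefficient reorganization step can be illustrated with a small sketch that places a run of quantized coefficients into one 16×16 block in Morton (Z-order) sequence. This only illustrates Morton ordering in general; the bit convention (even index bits give the column, odd bits give the row), the zero padding of unused positions, and the function names are assumptions and are not taken from the V-DMC specification.

```python
def morton_decode(index):
    """Split a Morton index into (x, y) by de-interleaving its bits.

    Assumed convention: even bits of the index form x (column),
    odd bits form y (row).
    """
    x = y = 0
    bit = 0
    while index >> (2 * bit):
        x |= ((index >> (2 * bit)) & 1) << bit
        y |= ((index >> (2 * bit + 1)) & 1) << bit
        bit += 1
    return x, y


def pack_block(coeffs, block_size=16):
    """Place up to block_size**2 coefficients into a square block in
    Morton order; unused positions stay 0 (padding is an assumption)."""
    if block_size <= 0 or block_size & (block_size - 1):
        raise ValueError("block_size must be a power of two")
    if len(coeffs) > block_size * block_size:
        raise ValueError("too many coefficients for one block")
    block = [[0] * block_size for _ in range(block_size)]
    for i, value in enumerate(coeffs):
        x, y = morton_decode(i)
        block[y][x] = value
    return block
```

The first four coefficients fill the top-left 2×2 corner of the block, the next four fill the 2×2 cell to its right, and so on, so coefficients that are close in the 1D sequence stay close in the 2D image passed to the video codec.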
Furthermore, in some embodiments, the method may further include: determining the value of second syntax identification information, wherein the second syntax identification information indicates whether the first coding mode is enabled for the shift coefficients of the current sequence; and encoding the value of the second syntax identification information and writing the resulting coded bits into the bitstream.
Furthermore, in some embodiments, the method may further include: when the second syntax identification information indicates that the first coding mode is enabled for the shift coefficients of the current sequence, determining the value of third syntax identification information, wherein the third syntax identification information indicates whether the first coding mode is enabled for the shift coefficients of the current image; and encoding the value of the third syntax identification information and writing the resulting coded bits into the bitstream.
Furthermore, in some embodiments, the method may further include: when the third syntax identification information indicates that the first coding mode is enabled for the shift coefficients of the current image, determining the value of first syntax identification information, wherein the first syntax identification information indicates whether the shift coefficient of the current layer in the current image uses the first coding mode; and encoding the value of the first syntax identification information and writing the resulting coded bits into the bitstream.
In an embodiment of the present application, the second syntax identification information is a syntax element at the Sequence Parameter Set (SPS) level, the third syntax identification information is a syntax element at the Frame Parameter Set (FPS) level, and the first syntax identification information is a syntax element at the LOD layer level. The second syntax identification information indicates whether the first coding mode is enabled for the shift coefficients of the current sequence, the third syntax identification information indicates whether the first coding mode is enabled for the shift coefficients of the current image, and the first syntax identification information indicates whether the shift coefficient of the current layer uses the first coding mode. Here, the current sequence includes at least the current image, and the LOD layers into which the current image is divided include at least the current layer.
In an embodiment of the present application, for the value of the first syntax identification information: if the shift coefficient of the current layer in the current image uses the first coding mode, that is, encoding of the shift coefficient of the current layer in the current image is skipped, the value of the first syntax identification information is determined to be a first value; if the shift coefficient of the current layer in the current image does not use the first coding mode, that is, the shift coefficient of the current layer in the current image needs to be encoded, the value of the first syntax identification information is determined to be a second value.
In an embodiment of the present application, for the value of the second syntax identification information: if the first coding mode is enabled for the shift coefficients of the current sequence, the value of the second syntax identification information is determined to be the first value; otherwise, it is determined to be the second value.
In an embodiment of the present application, for the value of the third syntax identification information: if the first coding mode is enabled for the shift coefficients of the current image, the value of the third syntax identification information is determined to be the first value; otherwise, it is determined to be the second value.
It should be noted that the first value is different from the second value, and that the first value and the second value may be in parameter form or in numeric form. Specifically, the first syntax identification information, the second syntax identification information and the third syntax identification information may each be a parameter written in a profile or the value of a flag, which is not specifically limited here.
Exemplarily, the first value may be set to 1 and the second value to 0; or the first value may be set to 0 and the second value to 1; or the first value may be set to true and the second value to false; or the first value may be set to false and the second value to true. In the embodiment of the present application, the first value is set to 1 and the second value is set to 0, but this is not specifically limited.
That is to say, in the embodiment of the present application, for the high-level syntax elements, it is first determined in the sequence parameter set whether the coding method of the embodiment of the present application is enabled, then determined in the frame parameter set whether it is enabled, and finally determined at each LOD level which coding mode applies to the shift coefficient of the current layer, namely whether the first coding mode or the second coding mode is used.
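The three-level signalling can be sketched as a nested sequence of flags for a single image. This sketch only shows the conditional dependency between the levels; the function name is hypothetical, the flags are collected in one flat list for illustration (in a real bitstream they would sit in separate SPS, FPS and LOD-level syntax structures), and the first value is taken to be 1 and the second value 0 as in the example above.

```python
def write_skip_flags(sequence_enabled, frame_enabled, layer_uses_skip):
    """Return the flag values signalled for one image, in order.

    sequence_enabled: second syntax identification information (SPS level).
    frame_enabled:    third syntax identification information (FPS level).
    layer_uses_skip:  one boolean per LOD layer, i.e. the first syntax
                      identification information of each layer.
    """
    flags = [1 if sequence_enabled else 0]
    if not sequence_enabled:
        return flags  # nothing below the sequence level is signalled
    flags.append(1 if frame_enabled else 0)
    if not frame_enabled:
        return flags  # no per-layer flags for this image
    flags.extend(1 if skip else 0 for skip in layer_uses_skip)
    return flags
```

When the sequence-level flag is 0 only that flag is written, and when the image-level flag is 0 no per-layer flags follow, so every layer of that image falls back to the second coding mode.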
Furthermore, in some embodiments, the method may further include: determining the value of fourth syntax identification information, wherein the fourth syntax identification information indicates whether the base mesh of the current image uses inter-frame processing; and encoding the value of the fourth syntax identification information and writing the resulting coded bits into the bitstream.
It should be noted that, in the embodiment of the present application, for the value of the fourth syntax identification information: if the base mesh of the current image uses inter-frame processing, the value of the fourth syntax identification information is determined to be the first value; if the base mesh of the current image does not use inter-frame processing, the value of the fourth syntax identification information is determined to be the second value.
It should also be noted that, in the embodiment of the present application, the first value is different from the second value, and the first value and the second value may be in parameter form or in numeric form. Specifically, the fourth syntax identification information may be a parameter written in a profile or the value of a flag, which is not specifically limited here. Exemplarily, the first value is set to 1 and the second value is set to 0, but this is not specifically limited either.
In some embodiments, the method may further include: when the base mesh of the current image uses inter-frame processing, executing the step of determining whether to encode the shift coefficient of the current layer in the current image.
In an embodiment of the present application, when the base mesh of the current image uses inter-frame processing, the method further includes: determining a reference image index of the current image according to the reference image; and encoding the reference image index of the current image and writing the resulting coded bits into the bitstream.
It should be noted that, in the inter-frame prediction process, identification information can first be set to determine how the base mesh of the current image is coded. If the base mesh of the current image can be inter-coded, the reference image index of the current image needs to be transmitted. Based on this, the embodiment of the present application first determines whether the base mesh of the current image can be inter-coded. Only if it can is it then determined whether the shift coefficient of the current layer in the current image uses skip coding; otherwise the shift coefficient of the current layer in the current image uses the second coding mode by default.
In some embodiments, the method may further include: when the fourth syntax identification information indicates that the base mesh of the current image does not use inter-frame processing, determining the shift coefficient of the current layer in the current image; and determining the geometric position information of the reconstructed mesh of the current layer in the current image according to the geometric position information of the initial mesh of the current layer in the current image and the shift coefficient of the current layer in the current image.
That is to say, in the embodiment of the present application, if the base mesh of the current image is not inter-coded, the shift coefficient of the current layer in the current image can use the second coding mode by default; that is, after the shift coefficient of the current layer in the current image is determined, the geometric position information of the reconstructed mesh of the current layer in the current image is determined according to the geometric position information of the initial mesh of the current layer in the current image and the shift coefficient of the current layer in the current image.
Furthermore, in the inter-frame prediction process, identification information can first be set to determine how the base mesh of the current image is coded; if the base mesh of the current image can be inter-coded, the reference image index of the current image needs to be transmitted. Alternatively, in an embodiment of the present application, identification information indicating whether the shift coefficient of the current layer in the current image uses skip coding may be transmitted regardless of whether the base mesh of the current image can be inter-coded.
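The two signalling variants described above (skip flags only when the base mesh is inter-coded, or skip flags in every case) can be sketched as a function that lists the syntax elements an encoder would emit for one image. The element names, the function name and the `always_signal_skip` switch are hypothetical and serve only to contrast the two variants; they are not syntax element names from the source.

```python
def plan_frame_syntax(base_mesh_is_inter, reference_index,
                      layer_skip_decisions, always_signal_skip=False):
    """Return (name, value) pairs for one image, in signalling order.

    base_mesh_is_inter:   fourth syntax identification information.
    reference_index:      reference image index, used only for inter.
    layer_skip_decisions: one boolean per LOD layer (True = skip).
    always_signal_skip:   False = skip flags only for inter base meshes;
                          True  = skip flags are always transmitted.
    """
    elements = [("base_mesh_inter_flag", int(base_mesh_is_inter))]
    if base_mesh_is_inter:
        elements.append(("reference_image_index", reference_index))
    if base_mesh_is_inter or always_signal_skip:
        for lod, skip in enumerate(layer_skip_decisions):
            elements.append((f"lod{lod}_skip_flag", int(skip)))
    # Otherwise every layer implicitly uses the second coding mode.
    return elements
```

In the default variant an intra-coded base mesh produces only the base mesh flag, and the shift coefficients of all layers are encoded with the second coding mode without any per-layer flag.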
Furthermore, in the embodiment of the present application, the geometric position information of the initial mesh obtained by subdividing the base mesh of the current image is combined with the shift coefficients of the reference image as a way of reconstructing the geometric position information of the mesh points. This scheme accepts a certain reduction in the quality of the reconstructed mesh point positions in exchange for a smaller shift coefficient bitstream, and can thereby improve the coding efficiency of the mesh. In the embodiment of the present application, parameter fitting can also be used to exploit the mapping relationship between the geometric position information of the predicted mesh points and that of the original mesh points, so that the quality of the reconstructed mesh point geometric position information is ensured, the size of the shift coefficient bitstream is reduced, and the coding efficiency of the mesh is further improved.
Furthermore, the embodiment of the present application also provides a bitstream, which is generated by bit-encoding information to be encoded; wherein the information to be encoded may include at least one of the following:
the value of the first syntax identification information, the value of the second syntax identification information, the value of the third syntax identification information, the value of the fourth syntax identification information, the reference image index of the current image, the shift coefficient of the current layer in the current image, and the mapping indication information of the current layer in the current image;
wherein the second syntax identification information indicates whether the first coding mode is enabled for the shift coefficients of the current sequence, the third syntax identification information indicates whether the first coding mode is enabled for the shift coefficients of the current image, the first syntax identification information indicates whether the shift coefficient of the current layer in the current image uses the first coding mode, and the fourth syntax identification information indicates whether the base mesh of the current image uses inter-frame processing; and the current sequence includes the current image, and the LOD layers into which the current image is divided include the current layer.
This embodiment provides an encoding method: determining the base mesh of the current image according to the original mesh of the current image; subdividing the base mesh to determine the geometric position information of the initial mesh of the current layer in the current image; determining the shift coefficient of the current layer in the reference image, and determining the geometric position information of the first reconstructed mesh of the current layer in the current image according to the geometric position information of the initial mesh of the current layer in the current image and the shift coefficient of the current layer in the reference image; and determining, according to the geometric position information of the first reconstructed mesh and the geometric position information of the original mesh, whether to encode the shift coefficient of the current layer in the current image. In this way, if encoding of the shift coefficient of the current layer in the current image is skipped, the geometric position information of the first reconstructed mesh can be determined according to the geometric position information of the initial mesh of the current layer in the current image and the shift coefficient of the current layer in the reference image. This not only reduces the shift coefficient bitstream but also preserves the reconstructed geometric quality of the points in the mesh, thereby improving coding efficiency.
In another embodiment of the present application, building on the existing V-DMC encoder, the geometric position information of the subdivided Base Mesh of the current image is combined with the reconstructed shift coefficients of the reference image to obtain the geometric position information of a corresponding reconstructed mesh (also called "the geometric position information of the predicted mesh"). By analyzing the correlation between the geometric position information of the points in the predicted mesh and that of the points in the original mesh, the shift coefficients can be encoded adaptively. This lowers the bit rate of shift coefficient coding while maintaining the reconstruction quality of the mesh geometric position information, thereby improving the coding efficiency of the mesh geometric position information.
For example, Figure 21 is a schematic diagram of the basic principle of determining the geometric position information of the predicted mesh provided by an embodiment of the present application. As shown in Figure 21, the geometric position information of the predicted mesh can be determined from the Base Mesh and the shift coefficients of the reference image; then, from the correlation between the geometric position information of the points in the predicted mesh and that of the points in the original mesh, it can be determined whether to encode the shift coefficients of the current image.
In a specific implementation, at the encoding end, taking the current LOD layer as an example, an algorithm is used to fit the relationship between the geometric position information of the predicted mesh and that of the initial mesh. The relationship between the two is assumed to be:
$$\mathrm{predict}_{\mathrm{mesh}} = f(\mathrm{reconMesh} + \mathrm{refDisp},\ \mathrm{lvl}) \tag{18}$$
Here, lvl denotes the LOD layer, reconMesh denotes the geometric position information of the initial mesh, refDisp denotes the shift coefficient of the reference image, and predict_mesh denotes the geometric position information of the corresponding predicted mesh obtained through the functional relationship f.
In the embodiment of the present application, to balance coding complexity against coding efficiency, the relationship between the geometric position information of the predicted mesh and that of the initial mesh can be obtained by a simple linear fit, as follows:
$$\mathrm{predict}_{\mathrm{mesh}} = k * (\mathrm{reconMesh} + \mathrm{refDisp},\ \mathrm{lvl}) + b \tag{19}$$
Here, * and + denote scalar multiplication and addition of vectors, respectively.
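The linear fit of equation (19) can be sketched as follows. This is only an illustration under stated assumptions: the source does not say how k and b are obtained, so the sketch assumes scalar k and b estimated per LOD layer by ordinary least squares against the original mesh coordinates. The names `fit_linear` and `predict` and the toy values are hypothetical.

```python
def fit_linear(x, y):
    """Least-squares k, b such that y ~ k * x + b, for flat lists of floats."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    if var_x == 0.0:
        # Degenerate input (all x equal): fall back to a pure offset (assumption).
        return 1.0, mean_y - mean_x
    cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    k = cov_xy / var_x
    b = mean_y - k * mean_x
    return k, b


def predict(recon_mesh, ref_disp, k, b):
    """Equation (19) applied per coordinate: k * (reconMesh + refDisp) + b."""
    return [k * (r + d) + b for r, d in zip(recon_mesh, ref_disp)]


# Toy data for one LOD layer, one coordinate per vertex (hypothetical values).
recon_mesh = [0.0, 1.0, 2.0, 3.0]   # subdivided Base Mesh coordinates
ref_disp = [0.1, 0.1, 0.1, 0.1]     # shift coefficients of the reference image
original = [0.2, 2.2, 4.2, 6.2]     # original mesh coordinates

x = [r + d for r, d in zip(recon_mesh, ref_disp)]
k, b = fit_linear(x, original)
predicted = predict(recon_mesh, ref_disp, k, b)
```

In this toy case the original coordinates are an exact linear function of reconMesh + refDisp, so the fitted prediction matches them up to rounding; real data would leave a residual that the distortion measure then quantifies.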
In this way, using a distortion measure such as the mean squared error (MSE), the error of the mesh geometric position information can be obtained for each LOD layer.
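The MSE expression referred to here is not reproduced in this text. A conventional per-layer form, given purely as an assumption (the symbols $N_{lvl}$, $\hat{\mathbf{p}}_i$ and $\mathbf{p}_i$ are not from the source), would be:

$$
\mathrm{MSE}_{lvl} = \frac{1}{N_{lvl}} \sum_{i=1}^{N_{lvl}} \left\lVert \hat{\mathbf{p}}_{i} - \mathbf{p}_{i} \right\rVert^{2}
$$

where $N_{lvl}$ is the number of points in LOD layer lvl, $\hat{\mathbf{p}}_i$ is the reconstructed (or predicted) position of point $i$, and $\mathbf{p}_i$ is the corresponding position in the original mesh.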
At the encoding end, before encoding the shift coefficient of the current layer in the current image, first compute the error between the point geometric position information of the mesh reconstructed by the original coding scheme and that of the original mesh; this is Dist_org. Second, apply the approach of this embodiment: reconstruct a predicted mesh from the point geometric position information of the initial mesh (obtained by subdividing the Base Mesh) and the shift coefficients of the reference image, and compute the error between the point geometric position information of this predicted mesh and that of the original mesh; this is Dist_skip. Third, when Dist_skip/Dist_org is less than a threshold (Th), the shift coefficient of the current layer in the current image is skip-coded: it is not encoded, and the geometric position information of the current reconstructed mesh is obtained directly from the geometric position information of the subdivided Base Mesh and the shift coefficients of the reference image.
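The threshold test on Dist_skip/Dist_org can be sketched as follows. This is a minimal illustration, assuming MSE as the distortion measure and point lists that are already in one-to-one correspondence; `mse`, `should_skip` and the handling of a zero Dist_org are hypothetical choices, not taken from the source.

```python
def mse(points_a, points_b):
    """Mean squared Euclidean distance between corresponding 3D points."""
    total = 0.0
    for (ax, ay, az), (bx, by, bz) in zip(points_a, points_b):
        total += (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
    return total / len(points_a)


def should_skip(original, recon_org, recon_skip, th):
    """Return True when Dist_skip / Dist_org < Th, i.e. skip coding this layer."""
    dist_org = mse(recon_org, original)    # error of the original coding scheme
    dist_skip = mse(recon_skip, original)  # error of the predicted mesh
    if dist_org == 0.0:
        # Assumption: with a lossless baseline, skip only if the prediction is exact.
        return dist_skip == 0.0
    return dist_skip / dist_org < th


# Toy layer with two points (hypothetical values).
original = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
recon_org = [(0.0, 0.0, 0.1), (1.0, 0.0, 0.1)]       # Dist_org is about 0.01
recon_skip = [(0.0, 0.0, 0.102), (1.0, 0.0, 0.102)]  # Dist_skip is about 0.0104

skip_at_1_06 = should_skip(original, recon_org, recon_skip, th=1.06)
skip_at_1_00 = should_skip(original, recon_org, recon_skip, th=1.00)
```

With these values the ratio is roughly 1.04, so the layer is skipped for Th = 1.06 but not for Th = 1.00.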
In another specific implementation, at the decoding end, the Base Mesh of the current image is first obtained by parsing and reconstruction, and a subdivision algorithm is applied to it to obtain the geometric position information of the initial mesh. The decoding mode of the shift coefficients of the current image is then determined from their coding mode:
If the Displacement coefficients of the current image need to be decoded, the initial geometric position information obtained by subdividing the Base Mesh of the current image is combined with the decoded Displacement coefficients to reconstruct the geometric position information of the reconstructed mesh;
If decoding of the Displacement coefficients of the current image is skipped, the initial geometric position information obtained by subdividing the Base Mesh of the current image is combined directly with the Displacement coefficients of the reference image to reconstruct the geometric position information of the reconstructed mesh.
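The two decoder branches can be sketched as follows. The sketch assumes the displacements are already available as per-vertex 3D vectors in the same coordinate frame as the positions; inverse quantization, inverse transform and any local-frame conversion are omitted. `reconstruct_layer` and its parameters are hypothetical names.

```python
def reconstruct_layer(initial_points, skip_flag, decoded_disp, ref_disp):
    """Add per-vertex displacement vectors to the subdivided Base Mesh points.

    skip_flag=True  -> reuse the reference image's displacements (nothing decoded).
    skip_flag=False -> use the displacements decoded for the current image.
    """
    disp = ref_disp if skip_flag else decoded_disp
    return [
        tuple(p + d for p, d in zip(point, vec))
        for point, vec in zip(initial_points, disp)
    ]


# Toy layer with two vertices (hypothetical values).
initial = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
ref_disp = [(0.0, 0.0, 0.5), (0.0, 0.0, 0.25)]
cur_disp = [(0.0, 0.0, 0.4), (0.0, 0.0, 0.3)]

skipped = reconstruct_layer(initial, True, None, ref_disp)
decoded = reconstruct_layer(initial, False, cur_disp, ref_disp)
```

In the skipped case the current image carries no displacement data at all, so `decoded_disp` can be `None`.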
Further, in an embodiment of the present application, the shift coefficient coding mode of each frame is modified. Specifically, in the existing V-DMC coding scheme, inter-frame coding first uses identification information (a flag) to decide how the Base Mesh of the current image is coded; if the Base Mesh of the current image can be inter-coded, the reference image index of the current image must be transmitted. Building on this, the embodiment first determines whether the Base Mesh of the current image can be inter-coded. Only if it can is it then determined whether the Displacement coefficients of the current image use skip coding; otherwise the Displacement coefficients of the current image use the original coding scheme by default.
Further, in an embodiment of the present application, the shift coefficient coding mode of each frame is modified. Specifically, in the existing V-DMC coding scheme, inter-frame coding first uses identification information to decide how the Base Mesh of the current image is coded; if the Base Mesh of the current image can be inter-coded, the reference image index of the current image must be transmitted. In this embodiment, regardless of whether the Base Mesh of the current image can be inter-coded, identification information indicating whether the Displacement coefficients of the current image use skip coding must be transmitted.
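The two signalling variants described in the preceding paragraphs (skip flag sent only when the Base Mesh is inter-coded, or skip flag always sent) can be contrasted in a small sketch. The syntax element names `base_mesh_inter_flag`, `ref_index` and `disp_skip_flag` are invented for illustration and are not V-DMC syntax.

```python
def frame_syntax(base_mesh_inter, ref_index, skip_disp, always_signal_skip):
    """Return the (name, value) syntax elements written for one frame.

    always_signal_skip=False: the skip flag is written only for an inter-coded
    Base Mesh; otherwise the original displacement coding is implied.
    always_signal_skip=True: the skip flag is written for every frame.
    """
    elems = [("base_mesh_inter_flag", int(base_mesh_inter))]
    if base_mesh_inter:
        elems.append(("ref_index", ref_index))
    if always_signal_skip or base_mesh_inter:
        elems.append(("disp_skip_flag", int(skip_disp)))
    return elems


inter_frame = frame_syntax(True, 2, True, always_signal_skip=False)
intra_conditional = frame_syntax(False, None, False, always_signal_skip=False)
intra_always = frame_syntax(False, None, False, always_signal_skip=True)
```

The conditional variant saves one flag per non-inter frame, while the unconditional variant keeps the parsing of the skip flag independent of the Base Mesh mode.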
Further, in an embodiment of the present application, the shift coefficient coding mode of each frame is modified. Specifically, the geometric position information of the initial mesh obtained by subdividing the Base Mesh of the current image is combined with the Displacement coefficients of the reference image as one way of reconstructing the geometric position information of the mesh points. This scheme reduces the bitstream spent on Displacement coefficients at the cost of a limited loss in the reconstructed mesh point geometry, and can thereby improve the coding efficiency of the mesh. In addition, parameter fitting can be applied to the relationship between the predicted mesh point geometry and the original mesh geometry to maintain the quality of the reconstructed mesh point geometry while reducing the size of the Displacement coefficient bitstream, further improving the coding efficiency of the mesh.
That is, in the embodiment of the present application, the coding mode of the Displacement coefficients of the current image is determined from the point geometric position information of the different LOD layers obtained by subdividing the Base Mesh of the current image and the Displacement coefficients reconstructed for the different LOD layers of the reference image. Specifically, at the encoding end, the geometric position information of the corresponding predicted mesh is first obtained from these two inputs. The encoder then computes Dist_org, the error between the geometric position information of the mesh reconstructed by the original coding scheme and that of the original mesh, and Dist_skip, the error between the geometric position information of the predicted mesh and that of the original mesh, and uses the two errors to adaptively determine the coding mode of the Displacement coefficients of the current image. If the Displacement coefficients of the current image can be skip-coded while the quality of the reconstructed mesh geometric position information stays close to that of the original coding scheme, the coding efficiency of the mesh geometry is further improved. Note that this solution can also first fit the relationship between the geometric position information of the predicted mesh and that of the original mesh and then skip-code using the fitted position information, which can further improve the quality of the reconstructed mesh geometry. The fitting method is not limited here; linear fitting, curve fitting or convolution parameter fitting may be used. The aim is to exploit, as far as possible, the relationship among the point position information of the subdivided Base Mesh of the current image, the Displacement coefficients of the reference image and the geometric position information of the mesh of the current image: using the correlation among the three reduces the size of the coded Displacement coefficient bitstream while maintaining the quality of the reconstructed mesh point geometry.
In the embodiments of the present application, the above embodiment elaborates the specific implementation of the preceding embodiments. It can be seen that, according to the technical solution of the preceding embodiments, the coding mode of the Displacement coefficients of the current image is determined from the point geometric position information of the different LOD layers obtained by subdividing the Base Mesh of the current image and the Displacement coefficients reconstructed for the different LOD layers of the reference image. Specifically, at the encoding end, the geometric position information of the corresponding predicted mesh is first obtained from these two inputs. The encoder then computes Dist_org, the error between the geometric position information of the mesh reconstructed by the original coding scheme and that of the original mesh, and Dist_skip, the error between the geometric position information of the predicted mesh and that of the original mesh, and uses the two errors to adaptively determine the coding mode of the Displacement coefficients of the current image. If the Displacement coefficients of the current image can be skip-coded while the quality of the reconstructed mesh position information stays close to that of the original coding scheme, the coding efficiency of the mesh geometry is further improved. Note, however, that the embodiment can also first fit the relationship between the geometric position information of the predicted mesh and that of the original mesh and then skip-code using the fitted geometric position information, which can further improve the quality of the reconstructed mesh geometry. Linear fitting, curve fitting, convolution parameter fitting or the like may be used for the fit; no specific limitation is imposed here. In addition, the embodiment aims to exploit, as far as possible, the relationship among the point geometric position information of the subdivided Base Mesh of the current image, the Displacement coefficients of the reference image and the geometric position information of the mesh of the current image: using the correlation among the three reduces the size of the coded Displacement coefficient bitstream while maintaining the quality of the reconstructed mesh point geometry.
Taking the inter-frame test environment as an example, Table 1 reports the results. BD-Rate is a performance metric for lossy compression efficiency; a BD-Rate below 0 indicates that coding efficiency is improved relative to the existing coding scheme.
Table 1
According to Table 1, the larger the threshold Th, the more the Displacement coefficient bitstream shrinks relative to the original coding scheme, at the cost of lower reconstruction quality. Overall, when Th is set to 1.06, 1.07 and 1.08, the bitstream is reduced by about 12%, 30% and 38%, while D1 changes by about -0.25 dB, -0.731 dB and -0.96 dB and D2 by about -0.279 dB, -0.8 dB and -1.10 dB, respectively. The bit rate can thus be lowered, improving encoding and decoding efficiency.
In yet another embodiment of the present application, based on the same inventive concept as the preceding embodiments, refer to Figure 22, which shows a schematic diagram of the structure of an encoder provided by an embodiment of the present application. As shown in Figure 22, the encoder 220 may include a first determining unit 2201, a first subdivision unit 2202, a first reconstruction unit 2203 and an encoding unit 2204, wherein:
The first determining unit 2201 is configured to determine the basic grid of the current image according to the original grid of the current image;
The first subdivision unit 2202 is configured to subdivide the basic grid and determine the geometric position information of the initial grid of the current layer in the current image;
The first reconstruction unit 2203 is configured to determine the shift coefficient of the current layer in the reference image, and to determine the geometric position information of the first reconstructed grid of the current layer in the current image according to the geometric position information of the initial grid of the current layer in the current image and the shift coefficient of the current layer in the reference image;
The encoding unit 2204 is configured to determine, according to the geometric position information of the first reconstructed grid and the geometric position information of the original grid, whether to encode the shift coefficient of the current layer in the current image.
In some embodiments, the reference image is an already-encoded image preceding the current image.
In some embodiments, the first determining unit 2201 is further configured to downsample the original grid of the current image to determine the basic grid of the current image.
In some embodiments, the first determining unit 2201 is further configured to determine the shift coefficient of the current layer in the current image according to the geometric position information of the initial grid of the current layer in the current image and the geometric position information of the original grid of the current layer in the current image.
In some embodiments, the first determining unit 2201 is further configured to determine, for a first vertex in the initial grid, the error value of the first vertex between the initial grid and the original grid and the normal vector of the first vertex; and to calculate the shift coefficient of the first vertex from the error value and the normal vector of the first vertex; wherein the first vertex is any vertex in the initial grid.
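One plausible reading of "calculating the shift coefficient from the error value and the normal vector" is a projection of the position error onto the vertex normal. The sketch below illustrates only that reading; the source does not give the exact formula, and `normal_displacement` and the sample values are hypothetical.

```python
import math


def normal_displacement(initial_vertex, original_vertex, normal):
    """Project the position error of one vertex onto its unit normal.

    Assumption: original_vertex is the point of the original grid already
    matched to initial_vertex; how that match is found is not shown here.
    """
    error = [o - i for o, i in zip(original_vertex, initial_vertex)]
    length = math.sqrt(sum(n * n for n in normal))
    unit = [n / length for n in normal]
    return sum(e * u for e, u in zip(error, unit))


# Error (0.3, 0.0, 0.5) projected on a normal pointing along +z.
d = normal_displacement((0.0, 0.0, 0.0), (0.3, 0.0, 0.5), (0.0, 0.0, 2.0))
```

Only the component of the error along the normal is kept in this scalar; tangential components would need additional coefficients if they are to be represented.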
In some embodiments, the first determining unit 2201 is further configured to perform error calculation on the geometric position information of the first reconstructed grid and that of the original grid to determine a first error result; and, based on the first error result, to determine the coding mode of the shift coefficient of the current layer in the current image, wherein the coding mode indicates whether the shift coefficient of the current layer in the current image is encoded.
In some embodiments, the first determining unit 2201 is further configured to determine the geometric position information of a second reconstructed grid of the current layer in the current image according to the geometric position information of the initial grid of the current layer in the current image and the shift coefficient of the current layer in the current image; and to perform error calculation on the geometric position information of the second reconstructed grid and that of the original grid to determine a second error result.
In some embodiments, the first determining unit 2201 is further configured to determine the error ratio of the first error result to the second error result; if the error ratio is less than a preset threshold, to determine that the shift coefficient of the current layer in the current image uses a first coding mode; and if the error ratio is greater than the preset threshold, to determine that the shift coefficient of the current layer in the current image uses a second coding mode; wherein the first coding mode is different from the second coding mode.
In some embodiments, the first determining unit 2201 is further configured to perform rate-distortion cost calculation on the shift coefficient of the current layer in the current image according to the first coding mode to determine a first rate-distortion result; to perform rate-distortion cost calculation on the shift coefficient of the current layer in the current image according to the second coding mode to determine a second rate-distortion result; and to determine the coding mode of the shift coefficient of the current layer in the current image according to the first and second rate-distortion results.
In some embodiments, the first determining unit 2201 is further configured to determine that the shift coefficient of the current layer in the current image uses the first coding mode if the first rate-distortion result is smaller than the second rate-distortion result, and that it uses the second coding mode if the second rate-distortion result is smaller than the first rate-distortion result.
In some embodiments, the first coding mode indicates that encoding of the shift coefficient of the current layer in the current image is skipped; the second coding mode indicates that the shift coefficient of the current layer in the current image is encoded.
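The rate-distortion comparison between the two coding modes can be sketched as follows. The source only says "rate-distortion cost"; the Lagrangian form J = D + lambda * R, the tie-breaking rule and the function names `rd_cost` and `choose_mode` are assumptions made for illustration.

```python
def rd_cost(distortion, rate_bits, lam):
    """Lagrangian rate-distortion cost J = D + lambda * R (assumed form)."""
    return distortion + lam * rate_bits


def choose_mode(dist_skip, rate_skip, dist_coded, rate_coded, lam):
    """Return 'skip' (first coding mode) or 'code' (second coding mode)."""
    cost_skip = rd_cost(dist_skip, rate_skip, lam)
    cost_coded = rd_cost(dist_coded, rate_coded, lam)
    # Ties go to 'code' here; the source does not specify the tie case.
    return "skip" if cost_skip < cost_coded else "code"


# Hypothetical numbers: skipping costs 1 bit, coding the layer costs 500 bits.
mode_high_lambda = choose_mode(0.0106, 1, 0.0100, 500, lam=1e-5)
mode_low_lambda = choose_mode(0.0106, 1, 0.0100, 500, lam=1e-7)
```

A larger lambda weights rate more heavily and therefore favours skipping, while a small lambda favours the lower distortion of explicitly coded coefficients.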
In some embodiments, the encoding unit 2204 is further configured to encode the shift coefficient of the current layer in the current image when it uses the second coding mode, and to write the resulting coded bits into the bitstream.
In some embodiments, the first determining unit 2201 is further configured to determine the value of second syntax identification information, wherein the second syntax identification information indicates whether the first coding mode is enabled for the shift coefficients of the current sequence;
The encoding unit 2204 is further configured to encode the value of the second syntax identification information and to write the resulting coded bits into the bitstream.
In some embodiments, the first determining unit 2201 is further configured to determine the value of third syntax identification information when the second syntax identification information indicates that the first coding mode is enabled for the shift coefficients of the current sequence, wherein the third syntax identification information indicates whether the first coding mode is enabled for the shift coefficients of the current image;
The encoding unit 2204 is further configured to encode the value of the third syntax identification information and to write the resulting coded bits into the bitstream.
In some embodiments, the first determining unit 2201 is further configured to determine the value of first syntax identification information when the third syntax identification information indicates that the first coding mode is enabled for the shift coefficients of the current image, wherein the first syntax identification information indicates whether the shift coefficient of the current layer in the current image uses the first coding mode;
The encoding unit 2204 is further configured to encode the value of the first syntax identification information and to write the resulting coded bits into the bitstream.
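The three-level signalling described above (sequence level, then image level, then layer level) can be sketched as nested conditions. For brevity the sketch flattens all levels into one list, whereas a real bitstream would carry the sequence-level flag in a sequence header; the element names are invented for illustration.

```python
def syntax_elements(seq_enable, pic_enable, layer_skip_flags):
    """Return the (name, value) flags written for one image.

    seq_enable       -> second syntax identification information (sequence level)
    pic_enable       -> third syntax identification information (image level)
    layer_skip_flags -> first syntax identification information, one per LOD layer
    """
    elems = [("seq_disp_skip_enabled", int(seq_enable))]
    if seq_enable:
        elems.append(("pic_disp_skip_enabled", int(pic_enable)))
        if pic_enable:
            for lvl, flag in enumerate(layer_skip_flags):
                elems.append((f"layer_disp_skip[{lvl}]", int(flag)))
    return elems


disabled = syntax_elements(False, True, [True])
pic_off = syntax_elements(True, False, [True])
full = syntax_elements(True, True, [True, False])
```

Lower-level flags are written only when the level above enables the first coding mode, so a disabled sequence costs a single flag.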
In some embodiments, the first determining unit 2201 is further configured to determine the value of fourth syntax identification information, wherein the fourth syntax identification information indicates whether the basic grid of the current image uses inter-frame processing;
The encoding unit 2204 is further configured to encode the value of the fourth syntax identification information and to write the resulting coded bits into the bitstream.
In some embodiments, the first determining unit 2201 is further configured to perform the step of determining whether to encode the shift coefficient of the current layer in the current image when the basic grid of the current image uses inter-frame processing.
In some embodiments, the first determining unit 2201 is further configured to determine the reference image index of the current image according to the reference image when the basic grid of the current image uses inter-frame processing;
The encoding unit 2204 is further configured to encode the reference image index of the current image and to write the resulting coded bits into the bitstream.
In some embodiments, the first determining unit 2201 is further configured to apply a lifting transform to the shift coefficients of the current layer in the current image to determine lifting transform coefficients; to quantize the lifting transform coefficients to determine quantized coefficients; and to reorganize the quantized coefficients to determine a two-dimensional image;
The encoding unit 2204 is further configured to encode the two-dimensional image and to write the resulting coded bits into the bitstream.
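The quantization and coefficient reorganization steps can be sketched as follows. The lifting transform itself is omitted, uniform scalar quantization is assumed, and the coefficients are packed in plain raster order with zero padding; an actual codec may use a different quantizer and a block-based scan. `quantize` and `pack_to_image` are hypothetical names.

```python
def quantize(coeffs, step):
    """Uniform scalar quantization to integer levels (assumed quantizer)."""
    return [int(round(c / step)) for c in coeffs]


def pack_to_image(values, width, fill=0):
    """Arrange a 1D list of levels into rows of `width` samples, padding the last row."""
    rows = []
    for start in range(0, len(values), width):
        row = values[start:start + width]
        row += [fill] * (width - len(row))
        rows.append(row)
    return rows


# Hypothetical transform coefficients for one layer.
coeffs = [0.26, -0.74, 1.1, 0.0, 0.49]
levels = quantize(coeffs, step=0.25)
image = pack_to_image(levels, width=2)
```

The resulting two-dimensional array is what a video encoder would then compress as an ordinary picture.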
在一些实施例中,第一重建单元2203,还配置为确定当前图像中的当前层的初始网格的几何位置信息、参考图像中的当前层的移位系数与当前图像中的当前层的第一重建网格的几何位置信息之间的映射关系;以及根据映射关系以及当前图像中的当前层的初始网格的几何位置信息和参考图像中的当前层的移位系数,确定当前图像中的当前层的第一重建网格的几何位置信息。In some embodiments, the first reconstruction unit 2203 is also configured to determine a mapping relationship between geometric position information of an initial grid of a current layer in a current image, a shift coefficient of a current layer in a reference image, and geometric position information of a first reconstructed grid of a current layer in a current image; and determine geometric position information of a first reconstructed grid of a current layer in a current image based on the mapping relationship and the geometric position information of an initial grid of a current layer in a current image and a shift coefficient of a current layer in a reference image.
在一些实施例中,映射关系包括以下至少之一:基于线性函数的映射关系、基于非线性函数的映射关系和基于神经网格的映射关系。In some embodiments, the mapping relationship includes at least one of the following: a mapping relationship based on a linear function, a mapping relationship based on a nonlinear function, and a mapping relationship based on a neural grid.
在一些实施例中,第一确定单元2201,还配置为基于映射关系,确定当前图像中的当前层的映射指示信息;In some embodiments, the first determining unit 2201 is further configured to determine mapping indication information of a current layer in the current image based on the mapping relationship;
编码单元2204,还配置为对映射指示信息进行编码处理,将所得到的编码比特写入码流。The encoding unit 2204 is further configured to perform encoding processing on the mapping indication information and write the obtained encoding bits into the bit stream.
In some embodiments, the first subdivision unit 2202 is further configured to iteratively subdivide the basic grid according to a grid subdivision mode to determine the geometric position information of the initial grid of the current layer in the current image.
In some embodiments, the grid subdivision mode includes a subdivision algorithm and a number of subdivision iterations.
In some embodiments, the subdivision algorithm is a linear interpolation algorithm.
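A minimal sketch of iterative subdivision with linear interpolation follows. It assumes a triangle mesh and midpoint subdivision (each new vertex is the average of the two endpoints of an edge, and each triangle is split into four); the passage names only "linear interpolation" and an iteration count, so the splitting pattern and all function names here are assumptions.

```python
def subdivide_once(vertices, triangles):
    """Split every triangle into four by inserting linearly interpolated midpoints."""
    vertices = list(vertices)
    midpoint_index = {}  # maps an undirected edge to the index of its midpoint

    def midpoint(a, b):
        key = (min(a, b), max(a, b))
        if key not in midpoint_index:
            va, vb = vertices[a], vertices[b]
            vertices.append(tuple((x + y) / 2.0 for x, y in zip(va, vb)))
            midpoint_index[key] = len(vertices) - 1
        return midpoint_index[key]

    new_triangles = []
    for a, b, c in triangles:
        ab, bc, ca = midpoint(a, b), midpoint(b, c), midpoint(c, a)
        new_triangles += [(a, ab, ca), (ab, b, bc), (ca, bc, c), (ab, bc, ca)]
    return vertices, new_triangles


def subdivide(vertices, triangles, iterations):
    """Apply midpoint subdivision the requested number of times."""
    for _ in range(iterations):
        vertices, triangles = subdivide_once(vertices, triangles)
    return vertices, triangles


base_vertices = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
base_triangles = [(0, 1, 2)]
v1, t1 = subdivide(base_vertices, base_triangles, iterations=1)
v2, t2 = subdivide(base_vertices, base_triangles, iterations=2)
```

Sharing midpoints through the edge dictionary keeps the subdivided mesh watertight, so the encoder and decoder obtain the same vertex ordering when they run the same procedure on the same basic grid.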
It can be understood that, in the embodiments of the present application, a "unit" may be part of a circuit, part of a processor, part of a program or software, and so on; it may be a module, or it may be non-modular. Moreover, the components in this embodiment may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional module.
If the integrated unit is implemented in the form of a software functional module and is not sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this embodiment, in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to perform all or some of the steps of the method described in this embodiment. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
Therefore, an embodiment of the present application provides a computer-readable storage medium applied to the encoder 220. The computer-readable storage medium stores a computer program which, when executed by the first processor, implements the method described in any one of the foregoing embodiments.
Based on the composition of the encoder 220 and the computer-readable storage medium, refer to FIG. 23, which shows a schematic diagram of a specific hardware structure of the encoder 220 provided in an embodiment of the present application. As shown in FIG. 23, the encoder 220 may include a first communication interface 2301, a first memory 2302, and a first processor 2303; the components are coupled together through a first bus system 2304. It can be understood that the first bus system 2304 is used to implement connection and communication between these components. In addition to a data bus, the first bus system 2304 includes a power bus, a control bus, and a status signal bus. For clarity, however, all of these buses are labeled as the first bus system 2304 in FIG. 23. Specifically:
the first communication interface 2301 is configured to receive and send signals in the process of exchanging information with other external network elements;
the first memory 2302 is configured to store a computer program that can run on the first processor 2303;
the first processor 2303 is configured to, when running the computer program, perform the following:
determining the basic grid of the current image according to the original grid of the current image;
subdividing the basic grid to determine the geometric position information of the initial grid of the current layer in the current image;
determining the shift coefficients of the current layer in the reference image, and determining the geometric position information of the first reconstructed grid of the current layer in the current image according to the geometric position information of the initial grid of the current layer in the current image and the shift coefficients of the current layer in the reference image;
determining, according to the geometric position information of the first reconstructed grid and the geometric position information of the original grid, whether to encode the shift coefficients of the current layer in the current image.
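The final decision step can be sketched as a distortion test. The sketch assumes that the first reconstructed grid and the original grid have corresponding vertices, that the distortion measure is the mean squared error, and that the decision uses a fixed `threshold`; this passage specifies none of these, so they are illustrative assumptions only.

```python
def mean_squared_error(recon, original):
    """Mean squared distance between corresponding vertices (assumed metric)."""
    total = sum(
        (x - y) ** 2
        for p, q in zip(recon, original)
        for x, y in zip(p, q)
    )
    return total / len(recon)


def should_encode_shifts(first_recon, original, threshold):
    """Return True when reusing the reference shift coefficients is too inaccurate.

    True  -> encode the current layer's own shift coefficients.
    False -> skip encoding and reuse the reference image's shift coefficients.
    """
    return mean_squared_error(first_recon, original) > threshold


first_recon = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)]
original = [(0.0, 0.0, 0.0), (1.0, 1.0, 2.0)]
```

A rate-distortion cost could replace the plain threshold; the structure of the decision stays the same.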
It can be understood that the first memory 2302 in the embodiments of the present application may be a volatile memory or a non-volatile memory, or may include both volatile and non-volatile memory. The non-volatile memory may be a read-only memory (ROM), a programmable read-only memory (PROM), an erasable programmable read-only memory (EPROM), an electrically erasable programmable read-only memory (EEPROM), or a flash memory. The volatile memory may be a random access memory (RAM), which is used as an external cache. By way of example and not limitation, many forms of RAM are available, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), and Direct Rambus RAM (DRRAM). The first memory 2302 of the systems and methods described herein is intended to include, without being limited to, these and any other suitable types of memory.
The first processor 2303 may be an integrated circuit chip with signal processing capability. In implementation, each step of the above method may be completed by an integrated logic circuit of hardware in the first processor 2303 or by instructions in the form of software. The first processor 2303 may be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and it can implement or execute the methods, steps, and logic block diagrams disclosed in the embodiments of the present application. The general-purpose processor may be a microprocessor or any conventional processor. The steps of the methods disclosed in the embodiments of the present application may be performed directly by a hardware decoding processor, or by a combination of hardware and software modules in a decoding processor. The software module may be located in a storage medium that is mature in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the first memory 2302; the first processor 2303 reads the information in the first memory 2302 and completes the steps of the above method in combination with its hardware.
It can be understood that the embodiments described in this application may be implemented in hardware, software, firmware, middleware, microcode, or a combination thereof. For a hardware implementation, the processing unit may be implemented in one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), general-purpose processors, controllers, microcontrollers, microprocessors, other electronic units for performing the functions described in this application, or a combination thereof. For a software implementation, the techniques described in this application may be implemented by modules (for example, procedures or functions) that perform the functions described in this application. The software code may be stored in a memory and executed by a processor. The memory may be implemented inside the processor or outside the processor.
Optionally, as another embodiment, the first processor 2303 is further configured to perform the method described in any one of the foregoing embodiments when running the computer program.
This embodiment provides an encoder that determines whether to encode the shift coefficients of the current layer in the current image. If encoding of these shift coefficients is skipped, the geometric position information of the first reconstructed grid can be determined from the geometric position information of the initial grid of the current layer in the current image and the shift coefficients of the current layer in the reference image. This not only reduces the bit rate spent on shift coefficients but also maintains the reconstructed geometric quality of the points in the grid, thereby improving the quality of the geometric information of those points and, in turn, the coding efficiency.
Based on the same inventive concept as the foregoing embodiments, refer to FIG. 24, which shows a schematic diagram of the composition of a decoder provided in an embodiment of the present application. As shown in FIG. 24, the decoder 240 may include a second determining unit 2401, a second subdivision unit 2402, a decoding unit 2403, and a second reconstruction unit 2404, wherein:
the second determining unit 2401 is configured to determine the basic grid of the current image;
the second subdivision unit 2402 is configured to subdivide the basic grid to determine the geometric position information of the initial grid of the current layer in the current image;
the decoding unit 2403 is configured to decode the bitstream to determine first syntax identification information, and, when the first syntax identification information indicates that the shift coefficients of the current layer in the current image use a first decoding mode, determine the shift coefficients of the current layer in the reference image;
the second reconstruction unit 2404 is configured to determine the geometric position information of the reconstructed grid of the current layer in the current image according to the geometric position information of the initial grid of the current layer in the current image and the shift coefficients of the current layer in the reference image.
In some embodiments, the reference image is a decoded image that precedes the current image.
In some embodiments, the decoding unit 2403 is further configured to: decode the bitstream to determine second syntax identification information; when the second syntax identification information indicates that the first decoding mode is enabled for the shift coefficients of the current sequence, decode the bitstream to determine third syntax identification information; and when the third syntax identification information indicates that the first decoding mode is enabled for the shift coefficients of the current image, decode the bitstream to determine the first syntax identification information. The first syntax identification information indicates whether the shift coefficients of the current layer use the first decoding mode; the current sequence includes the current image, and the LOD layers into which the current image is divided include the current layer.
In some embodiments, the decoding unit 2403 is further configured to, when the first syntax identification information indicates that the shift coefficients of the current layer in the current image use a second decoding mode, decode the bitstream to determine the shift coefficients of the current layer in the current image;
the second determining unit 2401 is further configured to determine the geometric position information of the reconstructed grid of the current layer in the current image according to the geometric position information of the initial grid of the current layer in the current image and the shift coefficients of the current layer in the current image.
In some embodiments, the first decoding mode is different from the second decoding mode: the first decoding mode indicates that decoding of the shift coefficients of the current layer in the current image is skipped, and the second decoding mode indicates that the shift coefficients of the current layer in the current image are decoded.
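The nested signalling described above (sequence level, then image level, then layer level) can be sketched as follows. The sketch assumes each piece of syntax identification information is a one-bit flag whose value 1 means "enabled" or "first decoding mode", and that a layer falls back to the second decoding mode when a higher-level flag is off; the function name, constants and `read_flag` callback are hypothetical.

```python
FIRST_MODE = "skip"     # reuse the reference image's shift coefficients
SECOND_MODE = "decode"  # decode the current layer's own shift coefficients


def parse_layer_mode(read_flag):
    """Return the decoding mode for the current layer.

    read_flag is a hypothetical callable returning the next 1-bit flag
    from the bitstream. Lower-level flags are read only when the
    higher-level flag enables the first decoding mode.
    """
    if not read_flag():  # second syntax identification information (sequence)
        return SECOND_MODE
    if not read_flag():  # third syntax identification information (image)
        return SECOND_MODE
    # first syntax identification information (layer)
    return FIRST_MODE if read_flag() else SECOND_MODE


bits = iter([1, 1, 1])
mode = parse_layer_mode(lambda: next(bits))
```

Gating the layer-level flag behind the sequence-level and image-level flags means that streams which never use the skip mode pay only one flag per sequence.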
In some embodiments, the decoding unit 2403 is further configured to: decode the bitstream to determine fourth syntax identification information; and when the fourth syntax identification information indicates that the basic grid of the current image uses an inter-frame processing mode, perform the step of decoding the bitstream to determine the first syntax identification information.
In some embodiments, the decoding unit 2403 is further configured to, when the fourth syntax identification information indicates that the basic grid of the current image uses the inter-frame processing mode, decode the bitstream to determine a reference image index of the current image;
the second determining unit 2401 is further configured to determine the reference image according to the reference image index of the current image.
In some embodiments, the decoding unit 2403 is further configured to, when the fourth syntax identification information indicates that the basic grid of the current image does not use the inter-frame processing mode, decode the bitstream to determine the shift coefficients of the current layer in the current image;
the second determining unit 2401 is further configured to determine the geometric position information of the reconstructed grid of the current layer in the current image according to the geometric position information of the initial grid of the current layer in the current image and the shift coefficients of the current layer in the current image.
In some embodiments, the decoding unit 2403 is further configured to decode the bitstream to determine a two-dimensional image of the current layer in the current image;
the second determining unit 2401 is further configured to perform coefficient reorganization on the two-dimensional image to determine the lifting transform coefficients of the current layer, and to perform an inverse transform on the lifting transform coefficients of the current layer to determine the shift coefficients of the current layer.
In some embodiments, the second determining unit 2401 is further configured to perform coefficient reorganization on the two-dimensional image to determine the quantized coefficients of the current layer, and to perform inverse quantization on the quantized coefficients of the current layer to determine the lifting transform coefficients of the current layer.
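The decoder-side coefficient reorganization and inverse quantization can be sketched in a few lines. The sketch assumes raster-order packing with one three-component coefficient per pixel, uniform inverse quantization with step size `step`, and a coefficient count `count` known from the subdivided grid; all three are illustrative assumptions, and the inverse lifting transform is omitted.

```python
def unpack_image(image, count):
    """Read `count` quantized coefficients back from the image in raster order."""
    flat = [pixel for row in image for pixel in row]
    return flat[:count]


def dequantize(qcoeffs, step):
    """Uniform inverse quantization of each coefficient component (assumed scheme)."""
    return [tuple(q * step for q in v) for v in qcoeffs]


decoded_image = [[(2, -4, 0), (8, 0, -1)], [(2, 2, 2), (0, 0, 0)]]
lifting_coeffs = dequantize(unpack_image(decoded_image, count=3), step=0.5)
```

The padded pixel at the end of the image is discarded because only `count` coefficients are read.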
In some embodiments, the second reconstruction unit 2404 is further configured to: determine a mapping relationship among the geometric position information of the initial grid of the current layer in the current image, the shift coefficients of the current layer in the reference image, and the geometric position information of the reconstructed grid of the current layer in the current image; and determine the geometric position information of the reconstructed grid of the current layer in the current image according to the mapping relationship, the geometric position information of the initial grid of the current layer in the current image, and the shift coefficients of the current layer in the reference image.
In some embodiments, the mapping relationship includes at least one of the following: a mapping relationship based on a linear function, a mapping relationship based on a nonlinear function, and a mapping relationship based on a neural network.
In some embodiments, the decoding unit 2403 is further configured to decode the bitstream to determine mapping indication information of the current layer in the current image;
the second determining unit 2401 is further configured to determine, according to the mapping indication information, the mapping relationship among the geometric position information of the initial grid of the current layer in the current image, the shift coefficients of the current layer in the reference image, and the geometric position information of the reconstructed grid of the current layer in the current image.
In some embodiments, the mapping indication information includes first indication information; the second determining unit 2401 is further configured to determine fitting parameters of the mapping relationship according to the first indication information, and to determine the mapping relationship according to the fitting parameters.
In some embodiments, the mapping indication information further includes second indication information; the second determining unit 2401 is further configured to determine the type of the mapping relationship according to the second indication information, and to determine the mapping relationship according to the type of the mapping relationship and the fitting parameters.
In some embodiments, the mapping indication information includes third indication information; the second determining unit 2401 is further configured to determine an index number of the mapping relationship according to the third indication information, and to determine the mapping relationship according to the index number. In each of these embodiments, the mapping relationship is the one among the geometric position information of the initial grid of the current layer in the current image, the shift coefficients of the current layer in the reference image, and the geometric position information of the reconstructed grid of the current layer in the current image.
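Two of the signalling options above can be sketched side by side: building a mapping from signalled fitting parameters, and selecting a mapping by index number from a table shared by encoder and decoder. The linear form `p + scale * d + offset`, the table contents, and every name below are assumptions for illustration; the passage does not define the parameter set or the table.

```python
def mapping_from_parameters(scale, offset):
    """Build a per-component linear mapping from (assumed) fitting parameters."""
    return lambda p, d: p + scale * d + offset


# Hypothetical table of predefined mappings known to encoder and decoder.
PREDEFINED_MAPPINGS = [
    mapping_from_parameters(1.0, 0.0),  # index 0: add the shift unchanged
    mapping_from_parameters(0.5, 0.0),  # index 1: add half of the shift
]


def mapping_from_index(index):
    """Select a predefined mapping by its signalled index number."""
    return PREDEFINED_MAPPINGS[index]


def apply_mapping(mapping, positions, shifts):
    """Apply a mapping to every component of every vertex."""
    return [
        tuple(mapping(p, d) for p, d in zip(pos, shift))
        for pos, shift in zip(positions, shifts)
    ]


recon = apply_mapping(mapping_from_index(1), [(0.0, 0.0, 0.0)], [(2.0, 4.0, -2.0)])
```

Signalling an index costs fewer bits than signalling parameters, at the price of restricting the decoder to the predefined set.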
In some embodiments, the decoding unit 2403 is further configured to decode the bitstream to determine the basic grid of the current image.
In some embodiments, the second subdivision unit 2402 is further configured to iteratively subdivide the basic grid according to a grid subdivision mode to determine the geometric position information of the initial grid of the current layer in the current image.
In some embodiments, the grid subdivision mode includes a subdivision algorithm and a number of subdivision iterations.
In some embodiments, the subdivision algorithm is a linear interpolation algorithm.
It can be understood that, in this embodiment, a "unit" may be part of a circuit, part of a processor, part of a program or software, and so on; it may be a module, or it may be non-modular. Moreover, the components in this embodiment may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional module.
If the integrated unit is implemented in the form of a software functional module and is not sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, this embodiment provides a computer-readable storage medium applied to the decoder 240. The computer-readable storage medium stores a computer program which, when executed by the second processor, implements the method described in any one of the foregoing embodiments.
Based on the composition of the decoder 240 and the computer-readable storage medium, refer to FIG. 25, which shows a schematic diagram of a specific hardware structure of the decoder 240 provided in an embodiment of the present application. As shown in FIG. 25, the decoder 240 may include a second communication interface 2501, a second memory 2502, and a second processor 2503; the components are coupled together through a second bus system 2504. It can be understood that the second bus system 2504 is used to implement connection and communication between these components. In addition to a data bus, the second bus system 2504 includes a power bus, a control bus, and a status signal bus. For clarity, however, all of these buses are labeled as the second bus system 2504 in FIG. 25. Specifically:
the second communication interface 2501 is configured to receive and send signals in the process of exchanging information with other external network elements;
the second memory 2502 is configured to store a computer program that can run on the second processor 2503;
the second processor 2503 is configured to, when running the computer program, perform the following:
determining the basic grid of the current image;
subdividing the basic grid to determine the geometric position information of the initial grid of the current layer in the current image;
decoding the bitstream to determine the first syntax identification information;
when the first syntax identification information indicates that the shift coefficients of the current layer in the current image use the first decoding mode, determining the shift coefficients of the current layer in the reference image;
determining the geometric position information of the reconstructed grid of the current layer in the current image according to the geometric position information of the initial grid of the current layer in the current image and the shift coefficients of the current layer in the reference image.
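The decoder steps listed above can be summarized in a short sketch of the per-layer reconstruction. It assumes the simplest mapping, in which the shift coefficients are added to the initial grid positions, and it models bitstream decoding with a hypothetical callback `decode_shifts`; the skip branch mirrors the first decoding mode, and the other branch is the second decoding mode described in the embodiments.

```python
def reconstruct_layer(initial_positions, skip_flag, ref_shifts, decode_shifts):
    """Reconstruct one layer's vertex positions.

    skip_flag: True when the first decoding mode is signalled, in which case
        the reference image's shift coefficients are reused.
    decode_shifts: hypothetical callable that decodes the current layer's
        own shift coefficients; called only when skip_flag is False.
    """
    shifts = ref_shifts if skip_flag else decode_shifts()
    # Assumed additive mapping: reconstructed = initial + shift.
    return [
        tuple(p + d for p, d in zip(pos, shift))
        for pos, shift in zip(initial_positions, shifts)
    ]


def must_not_decode():
    """Stand-in decoder that fails if it is called in skip mode."""
    raise AssertionError("shift coefficients must not be decoded in skip mode")


skipped = reconstruct_layer(
    [(0.0, 0.0, 0.0)], True, [(1.0, 2.0, 3.0)], must_not_decode
)
```

In skip mode nothing is read from the bitstream for this layer's shift coefficients, which is where the bit-rate saving comes from.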
Optionally, as another embodiment, the second processor 2503 is further configured to perform the method described in any one of the foregoing embodiments when running the computer program.
It can be understood that the hardware functions of the second memory 2502 are similar to those of the first memory 2302, and the hardware functions of the second processor 2503 are similar to those of the first processor 2303; they are not described in detail here.
This embodiment provides a decoder in which the first syntax identification information indicates whether decoding of the shift coefficients of the current layer in the current image is skipped. When decoding is skipped, the geometric position information of the reconstructed grid is determined from the geometric position information of the initial grid of the current layer in the current image and the shift coefficients of the current layer in the reference image. This not only reduces the bit rate spent on shift coefficients but also maintains the reconstructed geometric quality of the points in the grid, thereby improving the quality of the geometric information of those points and, in turn, the coding and decoding efficiency.
In yet another embodiment of the present application, refer to FIG. 26, which shows a schematic diagram of the composition of a coding and decoding system provided in an embodiment of the present application. As shown in FIG. 26, the coding and decoding system 260 may include an encoder 2601 and a decoder 2602.
In the embodiments of the present application, the encoder 2601 may be the encoder described in any one of the foregoing embodiments, and the decoder 2602 may be the decoder described in any one of the foregoing embodiments.
It should be noted that, in this application, the terms "include", "comprise", and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or apparatus that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or apparatus that includes the element.
The serial numbers of the above embodiments of the present application are for description only and do not indicate the relative merits of the embodiments.
The methods disclosed in the several method embodiments provided in this application may be combined arbitrarily, provided there is no conflict, to obtain new method embodiments.
The features disclosed in the several product embodiments provided in this application may be combined arbitrarily, provided there is no conflict, to obtain new product embodiments.
The features disclosed in the several method or device embodiments provided in this application may be combined arbitrarily, provided there is no conflict, to obtain new method embodiments or device embodiments.
The above are only specific implementations of the present application, and the protection scope of the present application is not limited thereto. Any change or substitution that a person skilled in the art could readily conceive within the technical scope disclosed in the present application shall fall within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
In an embodiment of the present application, at the encoding end, the base grid of the current image is determined based on the original grid of the current image; the base grid is subdivided to determine the geometric position information of the initial grid of the current layer in the current image; the shift coefficient of the current layer in the reference image is determined, and the geometric position information of the first reconstructed grid of the current layer in the current image is determined based on the geometric position information of the initial grid of the current layer in the current image and the shift coefficient of the current layer in the reference image; and whether to encode the shift coefficient of the current layer in the current image is determined based on the geometric position information of the first reconstructed grid and the geometric position information of the original grid.
At the decoding end, the base grid of the current image is determined; the base grid is subdivided to determine the geometric position information of the initial grid of the current layer in the current image; the bitstream is decoded to determine the first syntax identification information; when the first syntax identification information indicates that the shift coefficient of the current layer in the current image uses the first decoding mode, the shift coefficient of the current layer in the reference image is determined; and the geometric position information of the reconstructed grid of the current layer in the current image is determined based on the geometric position information of the initial grid of the current layer in the current image and the shift coefficient of the current layer in the reference image. In this way, the shift coefficient in the current image can be encoded adaptively, based on the geometric position information of the first reconstructed grid, which is obtained from the geometric position information of the initial grid (the subdivided base grid of the current image) and the shift coefficient in the reference image. If encoding of the shift coefficient in the current image is skipped, the shift coefficient in the current image does not need to be transmitted in the bitstream, and the decoding end can determine the geometric position information of the first reconstructed grid from the geometric position information of the initial grid and the shift coefficient in the reference image. This not only reduces the number of bits spent on the shift coefficient but also preserves the reconstructed geometric quality of the points in the grid, thereby improving the encoding and decoding efficiency.
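The skip decision summarized above can be illustrated with a minimal sketch. It is not the method as claimed: it assumes that a shift coefficient is a plain per-vertex 3D displacement, that the original grid has been resampled so its vertices correspond one-to-one with the initial (subdivided) grid, and that the encoder compares a mean squared error against a threshold. The names `encode_layer`, `decode_layer`, `skip_flag`, and `threshold` are hypothetical and do not appear in the application, which states only that the decision is based on the geometric position information of the first reconstructed grid and of the original grid.

```python
from typing import List, Optional, Tuple

Vec3 = Tuple[float, float, float]


def apply_displacements(initial: List[Vec3], disp: List[Vec3]) -> List[Vec3]:
    """Add one displacement vector to each vertex of the initial grid."""
    return [(p[0] + d[0], p[1] + d[1], p[2] + d[2]) for p, d in zip(initial, disp)]


def mean_squared_error(a: List[Vec3], b: List[Vec3]) -> float:
    """Average squared distance between corresponding vertices (assumed metric)."""
    total = sum((x - y) ** 2 for p, q in zip(a, b) for x, y in zip(p, q))
    return total / len(a)


def encode_layer(
    initial: List[Vec3],
    original: List[Vec3],
    ref_disp: List[Vec3],
    threshold: float,
) -> Tuple[bool, Optional[List[Vec3]]]:
    """Return (skip_flag, displacements to write to the bitstream).

    The first reconstructed grid reuses the reference image's displacements.
    If it is close enough to the original grid, nothing is coded for this layer.
    """
    first_recon = apply_displacements(initial, ref_disp)
    if mean_squared_error(first_recon, original) <= threshold:
        return True, None  # decoder will reuse the reference displacements
    own = [(o[0] - p[0], o[1] - p[1], o[2] - p[2]) for p, o in zip(initial, original)]
    return False, own


def decode_layer(
    initial: List[Vec3],
    skip_flag: bool,
    ref_disp: List[Vec3],
    coded: Optional[List[Vec3]],
) -> List[Vec3]:
    """Reconstruct the layer from the flag and whichever displacements apply."""
    disp = ref_disp if skip_flag or coded is None else coded
    return apply_displacements(initial, disp)
```

In this sketch `skip_flag` plays the role of the first syntax identification information: when it is set, the decoder rebuilds the layer from the initial grid and the reference image's displacements alone. A real codec would typically transform and quantize the coefficients before coding them, which this sketch omits.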
Claims (51)
Priority Applications (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202380096894.6A CN120958829A (en) | 2023-06-30 | 2023-06-30 | Encoding/decoding method, code stream, encoder, decoder, and storage medium |
| PCT/CN2023/104460 WO2025000429A1 (en) | 2023-06-30 | 2023-06-30 | Coding method, decoding method, code stream, coder, decoder, and storage medium |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/CN2023/104460 WO2025000429A1 (en) | 2023-06-30 | 2023-06-30 | Coding method, decoding method, code stream, coder, decoder, and storage medium |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2025000429A1 (en) | 2025-01-02 |
Family
ID=93936665
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/CN2023/104460 Pending WO2025000429A1 (en) | 2023-06-30 | 2023-06-30 | Coding method, decoding method, code stream, coder, decoder, and storage medium |
Country Status (2)
| Country | Link |
|---|---|
| CN (1) | CN120958829A (en) |
| WO (1) | WO2025000429A1 (en) |
Citations (8)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN104243958A (en) * | 2014-09-29 | 2014-12-24 | 联想(北京)有限公司 | Coding method, decoding method, coding device and decoding device for three-dimensional grid data |
| US20180253867A1 (en) * | 2017-03-06 | 2018-09-06 | Canon Kabushiki Kaisha | Encoding and decoding of texture mapping data in textured 3d mesh models |
| WO2021170906A1 (en) * | 2020-02-28 | 2021-09-02 | Nokia Technologies Oy | An apparatus, a method and a computer program for volumetric video |
| US20230014820A1 (en) * | 2021-07-19 | 2023-01-19 | Tencent America LLC | Methods and apparatuses for dynamic mesh compression |
| US20230105452A1 (en) * | 2021-10-04 | 2023-04-06 | Tencent America LLC | Method and apparatus of adaptive sampling for mesh compression by decoders |
| CN116132671A (en) * | 2020-06-05 | 2023-05-16 | Oppo广东移动通信有限公司 | Point cloud compression method, encoder, decoder and storage medium |
| US20230177736A1 (en) * | 2021-12-03 | 2023-06-08 | Tencent America LLC | Method and apparatus for chart based mesh compression |
| CN116320352A (en) * | 2023-03-16 | 2023-06-23 | 腾讯科技(深圳)有限公司 | Point cloud processing method and device, computer equipment and storage medium |
2023
- 2023-06-30 WO PCT/CN2023/104460 patent/WO2025000429A1/en active Pending
- 2023-06-30 CN CN202380096894.6A patent/CN120958829A/en active Pending
Also Published As
| Publication number | Publication date |
|---|---|
| CN120958829A (en) | 2025-11-14 |
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| CN108833916B (en) | Video encoding method, video decoding method, video encoding device, video decoding device, storage medium and computer equipment | |
| US20060012719A1 (en) | System and method for motion prediction in scalable video coding | |
| JP2025518459A (en) | Geometric Coordinate Scaling for AI-Based Dynamic Point Cloud Coding | |
| WO2024212981A1 (en) | Three-dimensional mesh sequence encoding method and apparatus, and three-dimensional mesh sequence decoding method and apparatus | |
| JP4350504B2 (en) | Method and apparatus for encoding and decoding images with mesh and program corresponding thereto | |
| WO2025002021A1 (en) | Three-dimensional mesh inter-frame prediction encoding method and apparatus, three-dimensional mesh inter-frame prediction decoding method and apparatus, and electronic device | |
| WO2025000429A1 (en) | Coding method, decoding method, code stream, coder, decoder, and storage medium | |
| WO2025000342A1 (en) | Encoding and decoding method, encoder, decoder, and storage medium | |
| WO2025007270A1 (en) | Encoding method, decoding method, encoder, decoder, code stream, and storage medium | |
| WO2025000523A1 (en) | Coding method, decoding method, coder, decoder, code stream, and storage medium | |
| WO2025148072A1 (en) | Coding method, decoding method, code stream, encoder, decoder, and storage medium | |
| WO2025076795A1 (en) | Coding and decoding methods, bit stream, encoder, decoder, and storage medium | |
| WO2022246809A1 (en) | Encoding method, decoding method, code stream, encoder, decoder and storage medium | |
| WO2024148573A1 (en) | Encoding and decoding method, encoder, decoder, and storage medium | |
| WO2025076656A1 (en) | Encoding method, decoding method, encoder, decoder, and storage medium | |
| CN118678093B (en) | Coding processing method, decoding processing method and related equipment | |
| WO2025145325A1 (en) | Encoding method, decoding method, encoders, decoders and storage medium | |
| WO2025076790A1 (en) | Coding method and apparatus, decoding method and apparatus, coder, decoder, code stream, and storage medium | |
| WO2025213306A1 (en) | Coding method, decoding method, coders, decoders and storage medium | |
| WO2025151992A1 (en) | Coding method, decoding method, code stream, coders, decoders and storage medium | |
| WO2025152015A1 (en) | Coding method, decoding method, code stream, coder, decoder, and storage medium | |
| WO2024255912A1 (en) | Encoding method, decoding method, bitstream, encoder, decoder, medium and program product | |
| WO2025076749A1 (en) | Encoding method, decoding method, code stream, encoder, decoder, and storage medium | |
| CN119182907B (en) | Grid encoding method, grid decoding method and related equipment | |
| WO2024255475A1 (en) | Coding and decoding methods, bitstream, encoder, decoder and storage medium |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 23942936; Country of ref document: EP; Kind code of ref document: A1 |