WO2025227794A1 - Encoding method and apparatus, and decoding method and apparatus - Google Patents
- Publication number
- WO2025227794A1 (PCT/CN2024/143135)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- coordinate system
- information
- coordinate
- blocks
- vertices
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T17/00—Three dimensional [3D] modelling, e.g. data description of 3D objects
- G06T17/20—Finite element generation, e.g. wire-frame surface description, tesselation
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N13/00—Stereoscopic video systems; Multi-view video systems; Details thereof
- H04N13/10—Processing, recording or transmission of stereoscopic or multi-view image signals
- H04N13/106—Processing image signals
- H04N13/161—Encoding, multiplexing or demultiplexing different image signal components
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/134—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
- H04N19/146—Data rate or code amount at the encoder output
- H04N19/147—Data rate or code amount at the encoder output according to rate distortion criteria
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/17—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
- H04N19/176—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/597—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/60—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/70—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/85—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression
Definitions
- This application relates to 3D technology, and more particularly to an encoding/decoding method and apparatus.
- Three-dimensional meshes can be used to represent volumetric video, digital humans, computer graphics (CG) content, etc., characterized by each frame containing one mesh.
- The raw data volume of 3D meshes is large: uncompressed, their bitrate typically exceeds 3 Gbps, which puts enormous pressure on transmission. Efficient encoding and decoding techniques are therefore needed to improve the storage or transmission efficiency of 3D meshes.
- This application provides an encoding/decoding method and apparatus to improve vertex compression efficiency.
- Embodiments of this application provide an encoding method, comprising: acquiring raw data of a three-dimensional mesh, wherein the three-dimensional mesh comprises M blocks, the raw data comprises M sets of raw information, the M sets of raw information correspond one-to-one with the M blocks and each contain the coordinates of vertices in the corresponding block in a first coordinate system, wherein the first coordinate system is the original coordinate system of the M blocks; and encoding coordinate information and coordinate system information into a bitstream corresponding to the three-dimensional mesh based on the raw data, wherein the coordinate information contains the coordinates of some or all vertices in N blocks of the M blocks, and the coordinate system information indicates whether the current coordinate system of the N blocks is the first coordinate system, where 1 ≤ N ≤ M.
- By selecting the optimal coordinate system for each block to be transformed, the vertex information in the block is expressed in that coordinate system for subsequent compression processing, which can improve the compression efficiency of the vertices.
- the 3D mesh comprises M blocks
- the original data comprises M sets of original information.
- Each of the M sets of original information corresponds one-to-one with one of the M blocks and contains the coordinates of the vertices in the corresponding block in a first coordinate system.
- This first coordinate system is the original coordinate system of the M blocks.
- the first coordinate system can be a Cartesian coordinate system, a local normal coordinate system, or a local cylindrical coordinate system.
- a 3D mesh is characterized by one frame containing one mesh, and one mesh can include multiple faces, which can be polygons (e.g., triangles). Each polygon is defined by its vertices in 3D space and information about how the vertices are connected (called connectivity information).
- connectivity information is information about how the vertices are connected.
- the data of a 3D mesh can include multiple vertex coordinates and multiple connectivity relationships (connectivity relationships can also be interpreted as topological structures, and multiple connectivity relationships can be called a connectivity set).
- Vertex coordinates can be represented in three-dimensional coordinates (e.g., (x, y, z)) or in spherical coordinates. Their purpose is to indicate the position of the vertex in 3D space.
- vertex coordinates can be represented in a world coordinate system (e.g., a Cartesian coordinate system) or in a local coordinate system (e.g., a local normal coordinate system or a local cylindrical coordinate system).
- Connections can be represented by vertices associated with the connection.
- For example, (1, 2, 3) represents a connection from vertex 1 to vertex 2, then to vertex 3, and then back to vertex 1.
- Alternatively, connections can be represented by faces (e.g., triangular faces, polygonal faces, etc.).
- a triangular face has a direction: (1, 2, 3) represents 1->2, 2->3, 3->1, or it simply indicates that the triangular face is composed of vertices 1, 2, and 3.
- Connections can also be represented by edges: one identifier represents the edge 1->2, 5 represents the edge 2->3, and 6 represents the edge 3->1.
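- The vertex and connectivity representation described above can be sketched as a small data structure. This is a minimal illustration only; the names `Mesh` and `directed_edges` are hypothetical and do not come from the application.

```python
from dataclasses import dataclass
from typing import List, Tuple

Vertex = Tuple[float, float, float]   # (x, y, z) position in 3D space
Face = Tuple[int, int, int]           # vertex indices, in winding order


@dataclass
class Mesh:
    vertices: List[Vertex]   # vertex coordinates
    faces: List[Face]        # connectivity information (a connectivity set)


def directed_edges(face: Face) -> List[Tuple[int, int]]:
    """Return the directed edges of a face, e.g. (1, 2, 3) -> 1->2, 2->3, 3->1."""
    n = len(face)
    return [(face[i], face[(i + 1) % n]) for i in range(n)]


# One triangle whose vertices are referenced by index.
mesh = Mesh(
    vertices=[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
    faces=[(0, 1, 2)],
)
```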
- a patch can also be referred to as a region, block, spatial block, etc., without specific limitation.
- Any patch (e.g., the first patch) can equally be called a block (e.g., the first block).
- the bitstream contains a first syntax structure corresponding to the three-dimensional mesh
- the first syntax structure contains a second syntax structure corresponding to the M blocks
- the coordinate system information is located in the first syntax structure and outside the second syntax structure, or the coordinate system information is located in the second syntax structure corresponding to the N blocks.
- the bitstream corresponding to the three-dimensional mesh contains two kinds of syntax elements: one is the syntax element corresponding to the three-dimensional mesh (i.e., the mesh syntax element, corresponding to the first syntax structure mentioned above), and the other is the syntax element corresponding to the block (i.e., the block syntax element, corresponding to the second syntax structure mentioned above).
- the coordinate system information in this embodiment can be written into the first syntax structure or the second syntax structure, and there is no specific limitation on this.
- the coordinate system information indicates that the current coordinate system of the N blocks is a second coordinate system, which is a transformed version of the first coordinate system.
- the coordinate information includes the coordinates of some or all vertices in the N blocks under the second coordinate system.
- the second coordinate system can be a Cartesian coordinate system, a local normal coordinate system, or a local cylindrical coordinate system.
- the coordinate system information can also indicate that an inverse coordinate transformation should be performed during decoding.
- the coordinate system information can include information on N sets of target coordinate systems, as well as the correspondence between the information on these N sets of target coordinate systems and N blocks.
- the information on any set of target coordinate systems includes the type of the target coordinate system, the origin information of the target coordinate system, and the coordinate axis information of the target coordinate system.
- N out of the M blocks in the 3D mesh require coordinate system transformation.
- the encoding end can transform the coordinate information of the vertices within any of these N blocks from the first coordinate system to the target coordinate system.
- the encoding end can write coordinate system information into the bitstream.
- This coordinate system information includes the type of the target coordinate system, the origin information, and the coordinate axes information.
- the target coordinate system can be either a first coordinate system or a second coordinate system.
- identifier 1 represents the first coordinate system and identifier 2 represents the second coordinate system; or, identifier 0 represents the first coordinate system and identifier 1 represents the second coordinate system, without any specific limitation.
- the origin information of the target coordinate system includes the coordinate information of the origin in the first coordinate system; or, the origin information of the target coordinate system includes the origin acquisition mode.
- the encoding end can directly write the coordinate information of the origin in the first coordinate system into the bitstream, for example, (0,1,0), or the encoding end can write the origin acquisition mode into the bitstream, for example, identifier 0 represents the origin as the previous vertex, identifier 1 represents the origin as the midpoint of the two previous vertices, without specific limitations.
- the coordinate axis information of the target coordinate system includes vectors of the coordinate axes in the first coordinate system; or, the coordinate axis information of the target coordinate system includes the acquisition mode of the coordinate axes.
- the encoding end can directly write the vectors of the coordinate axes in the first coordinate system into the bitstream, for example, (1,0,0) represents the x-axis, (0,1,0) represents the y-axis, and (0,0,1) represents the z-axis; or, the encoding end can write the acquisition mode of the coordinate axes into the bitstream, for example, identifier 1 represents the default x-axis (1,0,0), y-axis (0,1,0), z-axis (0,0,1), and identifier 2 represents a preset rotation angle.
- the aforementioned preset angle can also be written into the bitstream.
- the origin and coordinate axes of the target coordinate system can also be obtained in a way that is agreed upon in advance by the encoder and decoder.
- the default origin is the previous vertex
- the default x-axis is (1,0,0)
- the default y-axis is (0,1,0)
- the default z-axis is (0,0,1).
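- The target coordinate system information (type, origin information, axis information) and the pre-agreed defaults can be sketched as a record with default values. The class name, field names, and the numeric type identifiers below are assumptions made for illustration; the text leaves the actual identifier values open.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Vec3 = Tuple[float, float, float]

# Hypothetical identifier values; the text allows 0/1 or 1/2.
TYPE_FIRST = 0
TYPE_SECOND = 1


@dataclass
class TargetCoordinateSystemInfo:
    cs_type: int = TYPE_SECOND
    # Explicit origin in the first coordinate system, or None to fall back
    # to the agreed default (the previous vertex).
    origin: Optional[Vec3] = None
    # Axis vectors expressed in the first coordinate system (agreed defaults).
    x_axis: Vec3 = (1.0, 0.0, 0.0)
    y_axis: Vec3 = (0.0, 1.0, 0.0)
    z_axis: Vec3 = (0.0, 0.0, 1.0)


def resolve_origin(info: TargetCoordinateSystemInfo, previous_vertex: Vec3) -> Vec3:
    """Use the signalled origin if present, otherwise the default origin."""
    return info.origin if info.origin is not None else previous_vertex
```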
- the coordinate system information indicates that the current coordinate system of the N blocks is the first coordinate system, and the coordinate information includes the coordinates of some or all of the vertices in the N blocks in the first coordinate system.
- the coordinate system information can contain N block identifiers, which are used to indicate that the N blocks corresponding to the N block identifiers do not undergo coordinate system transformation.
- the encoding end can provide reverse indication, that is, the coordinate system information indicates blocks that do not require coordinate system transformation, while other blocks are assumed to require coordinate system transformation, and the target coordinate system of these blocks is pre-defined by the encoding and decoding ends.
- the decoding end can retain the blocks corresponding to the block identifiers in the coordinate system information, while performing coordinate system transformation on the blocks not indicated in the coordinate system information.
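- The reverse indication can be illustrated with a short sketch: the bitstream lists the identifiers of blocks that stay in the first coordinate system, and every other block is treated as transformed. The function name and the use of integer block identifiers from 0 to M-1 are assumptions.

```python
from typing import Iterable, List


def blocks_to_transform(num_blocks: int, untransformed_ids: Iterable[int]) -> List[int]:
    """Return the blocks NOT listed in the coordinate system information.

    Listed blocks are retained as-is; all remaining blocks are assumed to
    use the pre-agreed target coordinate system and need a transformation.
    """
    keep = set(untransformed_ids)
    return [block_id for block_id in range(num_blocks) if block_id not in keep]
```

- For a mesh with five blocks where blocks 1 and 3 are signalled, `blocks_to_transform(5, [1, 3])` yields blocks 0, 2, and 4.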
- the encoder can transform the coordinates of vertices in N blocks in the first coordinate system to coordinates in the second coordinate system.
- the encoding end determines whether there are transformation blocks among multiple blocks; when there are transformation blocks among multiple blocks, it obtains the target coordinate system of at least one transformation block; based on the target coordinate system of at least one transformation block, it performs coordinate system transformation on the information of vertices in at least one transformation block to obtain at least one transformed block.
- a transformation block is a block among multiple blocks where coordinate system transformations are performed on the information of the vertices within it.
- the encoding end can determine whether there are transformation blocks among multiple blocks by methods such as rate-distortion optimization, vertex distribution before and after coordinate system transformation, triangle attributes of the faces included in the 3D mesh, or pre-setting.
- the encoding end can perform this operation for any block; that is, for the multiple blocks in the 3D mesh, it determines one by one whether each block is a transformation block and, once a block is confirmed as a transformation block, obtains its target coordinate system.
- Each transformation block can obtain a target coordinate system, and the target coordinate systems of different transformation blocks may be the same or different.
- For example, the encoder uses a rate-distortion function to determine whether to perform a coordinate system transformation. Specifically, for any given block, it calculates a weighted value of the bitrate and the distortion loss for each vertex in the block under the first coordinate system and under each of the other coordinate systems (e.g., a local coordinate system). Based on these weighted values, it determines whether a coordinate system transformation is needed for the vertex information in that block and, if so, selects the optimal coordinate system from among the other coordinate systems as the target coordinate system for that block. Another example is analyzing the vertex distribution before and after the transformation.
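- The rate-distortion decision can be sketched as a minimum-cost selection. The sketch assumes the common Lagrangian form J = D + lambda * R as the "weighted value"; the application does not state the exact formula, and the function name, the candidate names, and the way rate and distortion are measured are all hypothetical.

```python
from typing import Dict, Tuple


def select_coordinate_system(
    candidates: Dict[str, Tuple[float, float]],
    lam: float,
    baseline: str = "first",
) -> Tuple[str, bool]:
    """Pick the coordinate system with the lowest cost J = D + lam * R.

    `candidates` maps a coordinate system name to (rate_bits, distortion)
    measured for one block. Returns the chosen name and whether the block
    is therefore a transformation block.
    """
    def cost(name: str) -> float:
        rate, distortion = candidates[name]
        return distortion + lam * rate

    best = min(candidates, key=cost)
    return best, best != baseline


# Illustrative numbers only, not measurements.
choice, is_transformation_block = select_coordinate_system(
    {
        "first": (1200.0, 0.5),
        "local_normal": (900.0, 0.6),
        "local_cylindrical": (950.0, 0.4),
    },
    lam=0.01,
)
```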
- the purpose of determining whether there are transformed blocks among multiple blocks is to find blocks (i.e., transformed blocks) that have better compression ratio and bit rate in the target coordinate system than in the first coordinate system.
- using the vertex information of the target coordinate system can improve the compression efficiency of the vertices of the transformed blocks.
- the encoding end can set coordinate system information for the transformed block.
- This coordinate system information can include a coordinate system identifier to indicate the target coordinate system of the transformed block. For example, a coordinate system identifier of 1 indicates a local normal coordinate system; a coordinate system identifier of 0 indicates a local cylindrical coordinate system.
- the coordinate system information can also include a coordinate system transformation indicator to indicate whether coordinate system transformation is required for the vertex information in the block. For example, a coordinate system transformation indicator of 1 indicates that coordinate system transformation is required for the vertex information in the block; a coordinate system transformation indicator of 0 indicates that coordinate system transformation is not required for the vertex information in the block.
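- The two per-block fields described above can be sketched as a pair of write and read helpers. The bit layout is an assumption: a one-bit transformation indicator, followed by a one-bit coordinate system identifier only when the indicator is 1 (1 for local normal, 0 for local cylindrical, as in the example). A plain list of bits stands in for a real bitstream writer.

```python
from typing import List, Optional, Tuple

CS_LOCAL_CYLINDRICAL = 0
CS_LOCAL_NORMAL = 1


def write_block_cs_info(bits: List[int], transformed: bool, cs_id: Optional[int] = None) -> None:
    """Append the transformation indicator and, if set, the identifier."""
    bits.append(1 if transformed else 0)
    if transformed:
        if cs_id not in (CS_LOCAL_CYLINDRICAL, CS_LOCAL_NORMAL):
            raise ValueError("a transformed block needs a valid identifier")
        bits.append(cs_id)


def read_block_cs_info(bits: List[int], pos: int) -> Tuple[bool, Optional[int], int]:
    """Parse one block's fields starting at `pos`; return the new position."""
    transformed = bits[pos] == 1
    pos += 1
    cs_id = None
    if transformed:
        cs_id = bits[pos]
        pos += 1
    return transformed, cs_id, pos
```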
- the information of vertices in any transformed block is the information in the corresponding target coordinate system. That is, for any transformed block, once its target coordinate system is determined, the information of at least one vertex in the transformed block can be transformed to obtain the transformed information of at least one vertex in the target coordinate system, which is the transformed block.
- the encoding end can first establish a target coordinate system, which can rely on reference vertices or reference triangle faces to determine the origin and coordinate axis directions of the target coordinate system. Then, the information in the first coordinate system is transformed to the target coordinate system. Transformation methods include, but are not limited to, translation, rotation, scaling, and perspective transformations.
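- For the translation-plus-rotation case, the transformation into the target coordinate system can be sketched as follows. The sketch assumes the target axes are orthonormal vectors expressed in the first coordinate system; scaling and perspective transformations are not covered, and the function name is hypothetical.

```python
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def to_target(p: Vec3, origin: Vec3, axes: Sequence[Vec3]) -> Vec3:
    """Express point `p` (first coordinate system) in the target system.

    Translation: subtract the target origin.
    Rotation: project the offset onto each (orthonormal) target axis.
    """
    offset = tuple(pi - oi for pi, oi in zip(p, origin))
    return tuple(sum(d * a for d, a in zip(offset, axis)) for axis in axes)


# Target system: origin at (1, 0, 0), axes rotated 90 degrees about z.
axes = [(0.0, 1.0, 0.0), (-1.0, 0.0, 0.0), (0.0, 0.0, 1.0)]
local = to_target((1.0, 2.0, 3.0), (1.0, 0.0, 0.0), axes)
```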
- the target coordinate system can be a local normal coordinate system, whose origin can be an encoded vertex or obtained from an encoded triangle face.
- the coordinate axes of the local normal coordinate system can be generated from the normal and two tangential components at each vertex.
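- One possible way to obtain a normal and two tangential axes is to derive them from an already encoded reference triangle. This construction is an assumption made for illustration; the application does not specify how the normal and tangents are computed, and the helper names are hypothetical.

```python
import math
from typing import Tuple

Vec3 = Tuple[float, float, float]


def sub(a: Vec3, b: Vec3) -> Vec3:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def cross(a: Vec3, b: Vec3) -> Vec3:
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def normalize(v: Vec3) -> Vec3:
    length = math.sqrt(sum(c * c for c in v))
    if length == 0.0:
        raise ValueError("degenerate reference triangle")
    return (v[0] / length, v[1] / length, v[2] / length)


def local_normal_frame(v0: Vec3, v1: Vec3, v2: Vec3) -> Tuple[Vec3, Vec3, Vec3]:
    """Return (tangent, bitangent, normal) for the triangle (v0, v1, v2)."""
    tangent = normalize(sub(v1, v0))                     # along the first edge
    normal = normalize(cross(sub(v1, v0), sub(v2, v0)))  # face normal
    bitangent = cross(normal, tangent)                   # completes the frame
    return tangent, bitangent, normal
```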
- the target coordinate system can be a local cylindrical coordinate system, whose origin can be an encoded vertex or obtained from an encoded triangle face.
- the coordinate axes of the local cylindrical coordinate system can be obtained based on a reference triangle. Then, the vertex coordinates (x, y, z) in the Cartesian coordinate system are transformed into a dihedral angle, a radius r, and a height z in the local cylindrical coordinate system.
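- The Cartesian-to-cylindrical step can be sketched with the standard conversion formulas. The sketch assumes the point is already expressed relative to the local origin and axes, and that the angle is measured around the local z-axis; how the angle relates to the reference triangle is not specified in the text. The symbol `theta` is an illustrative name.

```python
import math
from typing import Tuple

Vec3 = Tuple[float, float, float]


def cartesian_to_cylindrical(p: Vec3) -> Vec3:
    """Convert local (x, y, z) to (theta, r, z)."""
    x, y, z = p
    theta = math.atan2(y, x)   # angle around the local z-axis, in radians
    r = math.hypot(x, y)       # distance from the local z-axis
    return theta, r, z


def cylindrical_to_cartesian(c: Vec3) -> Vec3:
    """Inverse conversion, as needed at the decoding end."""
    theta, r, z = c
    return r * math.cos(theta), r * math.sin(theta), z
```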
- multiple vertices in the same patch are applicable to the same local coordinate system.
- different local coordinate systems can be created for different vertices. That is, one or more vertices can be transformed to the same local coordinate system, or different vertices can be transformed to different local coordinate systems. No specific limitation is made in this regard.
- the encoding end can perform prediction and entropy coding on at least one transformed block to obtain a bitstream.
- the transformed information of at least one vertex can be predicted to obtain the residual information of at least one vertex; then the residual information of at least one vertex is encoded to obtain the bitstream.
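- The prediction step can be illustrated with the simplest possible predictor, in which each vertex is predicted from the previously coded vertex. This predictor, the integer (already quantized) coordinates, and the function names are assumptions; the application does not fix a prediction scheme, and the entropy coding of the residuals is omitted.

```python
from typing import List, Tuple

IVec3 = Tuple[int, int, int]


def predict_residuals(vertices: List[IVec3]) -> List[IVec3]:
    """Residual = current vertex minus previous vertex (first uses zero)."""
    residuals = []
    prev = (0, 0, 0)
    for v in vertices:
        residuals.append((v[0] - prev[0], v[1] - prev[1], v[2] - prev[2]))
        prev = v
    return residuals


def reconstruct_vertices(residuals: List[IVec3]) -> List[IVec3]:
    """Decoder side: add each residual back onto the previous vertex."""
    vertices = []
    prev = (0, 0, 0)
    for r in residuals:
        prev = (prev[0] + r[0], prev[1] + r[1], prev[2] + r[2])
        vertices.append(prev)
    return vertices
```

- When neighbouring vertices are close together in the chosen coordinate system, the residuals are small, which is what makes the subsequent entropy coding cheaper.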
- coordinate system information can be filled into the patch data unit syntax in the bitstream.
- One method is to add new bytes to the syntax, and another method is to use the reserved fields in the syntax. This application does not specifically limit this approach.
- One possible implementation also includes: obtaining untransformed blocks, which are blocks among multiple blocks for which coordinate system transformation is not performed on the information of vertices; and encoding based on the untransformed blocks to obtain a bitstream.
- the non-transformation block can be a patch in the 3D mesh that is different from the transformed block. Compared to the local coordinate system (corresponding to the target coordinate system), the non-transformation block is more suitable for the world coordinate system (corresponding to the first coordinate system). That is, multiple vertices in the 3D mesh can be applied to different local coordinate systems or to the world coordinate system. Based on the characteristics of multiple vertices in the 3D mesh, multiple vertices can be assigned to different patches in the most suitable way, so that the same coordinate system transformation can be applied to the vertices in the same patch. In this way, the information of vertices in each patch adopts the most suitable coordinate system, which can comprehensively improve the vertex compression rate of the 3D mesh.
- Embodiments of this application provide a decoding method, comprising: receiving a bitstream corresponding to a three-dimensional mesh, the three-dimensional mesh comprising M blocks; decoding the bitstream to obtain coordinate information and coordinate system information, the coordinate information comprising the coordinates of some or all vertices in N blocks of the M blocks, the coordinate system information indicating whether the coordinate system of the N blocks is a first coordinate system, the first coordinate system being the original coordinate system of the M blocks, where 1 ≤ N ≤ M; and obtaining reconstructed data of the three-dimensional mesh based on the coordinate information and the coordinate system information, the reconstructed data comprising M sets of reconstructed information, the M sets of reconstructed information corresponding one-to-one with the M blocks and each containing the coordinates of vertices in the corresponding block in the first coordinate system.
- the coordinate system of the vertex information in a block of a 3D mesh can be determined by decoding the bitstream, and an inverse coordinate system transformation is then performed to obtain the reconstructed information in the world coordinate system. Because each block can be coded in the coordinate system that suits it, this can improve the compression efficiency of the vertices.
- the aforementioned three-dimensional mesh comprises M blocks.
- the coordinate information includes the coordinates of some or all vertices in N blocks out of the M blocks, and the coordinate system information indicates whether the coordinate system of the N blocks is a first coordinate system.
- the first coordinate system is the original coordinate system of the M blocks, where 1 ≤ N ≤ M.
- the first coordinate system can be a Cartesian coordinate system, a local normal coordinate system, or a local cylindrical coordinate system.
- the bitstream contains a first syntax structure corresponding to the three-dimensional mesh
- the first syntax structure contains a second syntax structure corresponding to the M blocks
- the coordinate system information is located in the first syntax structure and outside the second syntax structure, or the coordinate system information is located in the second syntax structure corresponding to the N blocks.
- the bitstream corresponding to the three-dimensional mesh contains two kinds of syntax elements: one is the syntax element corresponding to the three-dimensional mesh (i.e., the mesh syntax element, corresponding to the first syntax structure mentioned above), and the other is the syntax element corresponding to the block (i.e., the block syntax element, corresponding to the second syntax structure mentioned above).
- the coordinate system information in this embodiment can be written into the first syntax structure or the second syntax structure, and there is no specific limitation on this.
- the coordinate system information indicates that the current coordinate system of the N blocks is a second coordinate system, which is a transformed version of the first coordinate system.
- the coordinate information includes the coordinates of some or all vertices in the N blocks under the second coordinate system.
- the second coordinate system can be a Cartesian coordinate system, a local normal coordinate system, or a local cylindrical coordinate system.
- the coordinate system information can also indicate that an inverse coordinate transformation should be performed during decoding.
- the coordinate system information can include information on N sets of target coordinate systems, as well as the correspondence between the information on these N sets of target coordinate systems and N blocks.
- the information on any set of target coordinate systems includes the type of the target coordinate system, the origin information of the target coordinate system, and the coordinate axis information of the target coordinate system.
- the decoding end needs to transform the coordinate information of the vertices in the N blocks back from the second coordinate system to the first coordinate system.
- the decoding end can perform this transformation based on the coordinate system information, where the target coordinate system information includes the type of the target coordinate system, the origin information, and the coordinate axis information.
- the type of the target coordinate system can be the first coordinate system or the second coordinate system.
- identifier 1 represents the first coordinate system and identifier 2 represents the second coordinate system; or, identifier 0 represents the first coordinate system and identifier 1 represents the second coordinate system, without specific limitations.
- the first coordinate system can be a Cartesian coordinate system, a local normal coordinate system, or a local cylindrical coordinate system.
- When the decoding end recognizes that the type of the target coordinate system is 2, it can determine that the coordinate information of the vertices in the block needs to be transformed from the second coordinate system back to the first coordinate system.
- the origin information of the target coordinate system includes the coordinate information of the origin in the first coordinate system; or, the origin information of the target coordinate system includes the origin acquisition mode.
- the encoding end can directly write the coordinate information of the origin in the first coordinate system into the bitstream, for example, (0,1,0), or the encoding end can write the origin acquisition mode into the bitstream, for example, identifier 0 represents the origin as the previous vertex, identifier 1 represents the origin as the midpoint of the two previous vertices, without specific limitations.
- the decoding end can use the vertex at position (0,1,0) in the first coordinate system as the origin of the second coordinate system, or use the previously processed vertex as the origin of the second coordinate system, or calculate the midpoint of the two vertices and then use that midpoint as the origin of the second coordinate system, etc., without making specific limitations.
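- The origin derivation at the decoding end can be sketched as a small dispatch over the signalled information. The mode values follow the example in the text (0 for the previous vertex, 1 for the midpoint of the two previous vertices); the function name and the argument layout are hypothetical.

```python
from typing import Optional, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def derive_origin(
    decoded: Sequence[Vec3],
    explicit: Optional[Vec3] = None,
    mode: Optional[int] = None,
) -> Vec3:
    """Return the origin of the second coordinate system.

    `decoded` holds the already reconstructed vertices, in decoding order,
    expressed in the first coordinate system.
    """
    if explicit is not None:      # origin coordinates written directly
        return explicit
    if mode == 0:                 # previous vertex
        return decoded[-1]
    if mode == 1:                 # midpoint of the two previous vertices
        a, b = decoded[-2], decoded[-1]
        return ((a[0] + b[0]) / 2.0, (a[1] + b[1]) / 2.0, (a[2] + b[2]) / 2.0)
    raise ValueError("unknown origin acquisition mode")
```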
- the coordinate axis information of the target coordinate system includes vectors of the coordinate axes in the first coordinate system; or, the coordinate axis information of the target coordinate system includes the acquisition mode of the coordinate axes.
- the encoding end can directly write the vectors of the coordinate axes in the first coordinate system into the bitstream, for example, (1,0,0) represents the x-axis, (0,1,0) represents the y-axis, and (0,0,1) represents the z-axis; or, the encoding end can write the acquisition mode of the coordinate axes into the bitstream, for example, identifier 1 represents the default x-axis (1,0,0), y-axis (0,1,0), z-axis (0,0,1), and identifier 2 represents a preset rotation angle.
- the aforementioned preset angle can also be written into the bitstream.
- the decoding end can use the vector represented by (1,0,0) in the first coordinate system as the x-axis of the second coordinate system, the vector represented by (0,1,0) in the first coordinate system as the y-axis of the second coordinate system, and the vector represented by (0,0,1) in the first coordinate system as the z-axis of the second coordinate system, or rotate the first coordinate system according to the rotation angle.
- the origin and coordinate axes of the target coordinate system can also be obtained in a way that is agreed upon in advance by the encoder and decoder.
- the default origin is the previous vertex
- the default x-axis is (1,0,0)
- the default y-axis is (0,1,0)
- the default z-axis is (0,0,1).
- the decoding end can construct a second coordinate system corresponding to the vertex, and then perform a transformation from the second coordinate system to the first coordinate system on the coordinate information of the vertex based on the second coordinate system.
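- The transformation from the second coordinate system back to the first can be sketched as the inverse of a translation-plus-rotation. As before, the sketch assumes orthonormal axes given as vectors in the first coordinate system; the function name is hypothetical.

```python
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def to_first(local: Vec3, origin: Vec3, axes: Sequence[Vec3]) -> Vec3:
    """Map a point from the second coordinate system to the first.

    p = origin + local.x * x_axis + local.y * y_axis + local.z * z_axis
    """
    return tuple(
        origin[i] + sum(local[k] * axes[k][i] for k in range(3))
        for i in range(3)
    )


# Second system: origin at (1, 0, 0), axes rotated 90 degrees about z.
axes = [(0.0, 1.0, 0.0), (-1.0, 0.0, 0.0), (0.0, 0.0, 1.0)]
world = to_first((2.0, 0.0, 3.0), (1.0, 0.0, 0.0), axes)
```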
- the coordinate system information indicates that the current coordinate system of the N blocks is the first coordinate system, and the coordinate information includes the coordinates of some or all of the vertices in the N blocks in the first coordinate system.
- the coordinate system information can contain N block identifiers, which are used to indicate that the N blocks corresponding to the N block identifiers do not undergo coordinate system transformation.
- the encoding end can provide reverse indication, that is, the coordinate system information indicates blocks that do not require coordinate system transformation, while other blocks are assumed to require coordinate system transformation, and the target coordinate system of these blocks is pre-defined by the encoding and decoding ends.
- the decoding end can retain the blocks corresponding to the block identifiers in the coordinate system information, while performing coordinate system transformation on the blocks not indicated in the coordinate system information.
- the reconstruction data includes M sets of reconstruction information, which correspond one-to-one with M blocks and each contains the coordinates of the vertices in the corresponding block in the first coordinate system.
- the decoder can obtain the coordinate information of all vertices in the corresponding block based on the bitstream; and perform an inverse coordinate transformation on the coordinate information of all vertices in the corresponding block to obtain the coordinate information of all vertices in the corresponding block in the first coordinate system.
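- Putting the decoder steps together, the reconstruction of all M blocks can be sketched as a loop that applies the inverse transformation only to blocks flagged as transformed. The dictionary layout of a decoded block is an assumption made for this sketch, not a syntax defined by the application.

```python
from typing import Dict, List, Tuple

Vec3 = Tuple[float, float, float]


def reconstruct_mesh(blocks: List[Dict]) -> List[List[Vec3]]:
    """Return, per block, the vertex coordinates in the first coordinate system.

    Each block dict carries 'transformed' (bool) and 'vertices'; transformed
    blocks also carry 'origin' and orthonormal 'axes' of the second system.
    """
    reconstructed = []
    for block in blocks:
        if block["transformed"]:
            o, axes = block["origin"], block["axes"]
            verts = [
                tuple(o[i] + sum(v[k] * axes[k][i] for k in range(3)) for i in range(3))
                for v in block["vertices"]
            ]
        else:
            verts = list(block["vertices"])   # already in the first system
        reconstructed.append(verts)
    return reconstructed
```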
- an encoding device comprising: a transceiver module, configured to acquire raw data of a three-dimensional mesh, the three-dimensional mesh comprising M blocks, the raw data comprising M sets of raw information, the M sets of raw information corresponding one-to-one with the M blocks and each containing the coordinates of vertices in the corresponding block in a first coordinate system, the first coordinate system being the original coordinate system of the M blocks; and an encoding module, configured to encode the coordinate information and coordinate system information into a bitstream corresponding to the three-dimensional mesh based on the raw data, the coordinate information comprising the coordinates of some or all vertices in N blocks of the M blocks, and the coordinate system information indicating whether the current coordinate system of the N blocks is the first coordinate system, where 1 ≤ N ≤ M.
- the coordinate system information indicates that the current coordinate system of the N blocks is a second coordinate system, which is a transformed coordinate system of the first coordinate system.
- the coordinate information includes the coordinates of some or all vertices in the N blocks in the second coordinate system.
- the coordinate system information also indicates that an inverse coordinate transformation should be performed during decoding.
- the encoding module is further configured to transform the coordinates of the vertices in the N blocks in the first coordinate system to coordinates in the second coordinate system.
- the coordinate system information indicates that the current coordinate system of the N blocks is the first coordinate system, and the coordinate information includes the coordinates of some or all vertices in the N blocks in the first coordinate system.
- the bitstream includes a first syntax structure corresponding to the three-dimensional mesh, the first syntax structure including a second syntax structure corresponding to the M blocks, the coordinate system information being located in the first syntax structure and outside the second syntax structure, or the coordinate system information being located in the second syntax structure corresponding to the N blocks.
- the coordinate system information includes information on N sets of second coordinate systems, and the correspondence between the information on the N sets of second coordinate systems and the N blocks.
- Any set of information on the second coordinate system includes the type of the second coordinate system, the origin information of the second coordinate system, and the coordinate axis information of the second coordinate system.
- the origin information of the second coordinate system includes the coordinate information of the origin in the first coordinate system; or, the origin information of the second coordinate system includes the origin acquisition mode.
- the coordinate axis information of the second coordinate system includes the vectors of the coordinate axes in the first coordinate system; or, the coordinate axis information of the second coordinate system includes the acquisition mode of the coordinate axes.
- embodiments of this application provide a decoding device, comprising: a transceiver module for receiving a bitstream corresponding to a three-dimensional mesh, the three-dimensional mesh comprising M blocks; a decoding module for decoding the bitstream to obtain coordinate information and coordinate system information, the coordinate information comprising the coordinates of some or all vertices in N blocks of the M blocks, the coordinate system information indicating whether the coordinate system of the N blocks is a first coordinate system, the first coordinate system being the original coordinate system of the M blocks, 1 ≤ N ≤ M; and obtaining reconstructed data of the three-dimensional mesh based on the coordinate information and the coordinate system information, the reconstructed data comprising M sets of reconstructed information, the M sets of reconstructed information corresponding one-to-one with the M blocks and each containing the coordinates of vertices in the corresponding block in the first coordinate system.
- the coordinate system information indicates that the current coordinate system of the N blocks is a second coordinate system, which is a transformed coordinate system of the first coordinate system.
- the coordinate information includes the coordinates of some or all vertices in the N blocks in the second coordinate system.
- the coordinate system information also indicates that an inverse coordinate transformation should be performed during decoding.
- the decoding module is further configured to transform the coordinates of the vertices in the N blocks in the second coordinate system to coordinates in the first coordinate system.
- the coordinate system information indicates that the current coordinate system of the N blocks is the first coordinate system, and the coordinate information includes the coordinates of some or all vertices in the N blocks in the first coordinate system.
- the bitstream includes a first syntax structure corresponding to the three-dimensional mesh, the first syntax structure including a second syntax structure corresponding to the M blocks, the coordinate system information being located in the first syntax structure and outside the second syntax structure, or the coordinate system information being located in the second syntax structure corresponding to the N blocks.
- the coordinate system information includes information on N sets of second coordinate systems, and the correspondence between the information on the N sets of second coordinate systems and the N blocks.
- Any set of information on the second coordinate system includes the type of the second coordinate system, the origin information of the second coordinate system, and the coordinate axis information of the second coordinate system.
- the origin information of the second coordinate system includes the coordinate information of the origin in the first coordinate system; or, the origin information of the second coordinate system includes the origin acquisition mode.
- the coordinate axis information of the second coordinate system includes the vectors of the coordinate axes in the first coordinate system; or, the coordinate axis information of the second coordinate system includes the acquisition mode of the coordinate axes.
- embodiments of this application provide an encoding apparatus, comprising: one or more processors; a memory for storing one or more programs; and, when the one or more programs are executed by the one or more processors, causing the one or more processors to implement the method as described in any one of the first aspects above.
- embodiments of this application provide a decoding apparatus, comprising: one or more processors; a memory for storing one or more programs; and, when the one or more programs are executed by the one or more processors, causing the one or more processors to implement the method as described in any one of the second aspects above.
- embodiments of this application provide a computer-readable storage medium storing program instructions that, when executed by a device or one or more processors, cause the device to perform the method as described in any one of the first to second aspects above.
- embodiments of this application provide a computer program product comprising computer program code, which, when executed on a device, causes the device to perform the method described in any one of the first to second aspects.
- embodiments of this application provide a chip including a processor and a memory, the memory being used to store a computer program, and the processor being used to call and run the computer program stored in the memory to perform the method as described in any one of the first to second aspects above.
- embodiments of this application provide a bitstream, the bitstream including coordinate information and coordinate system information, the coordinate information including the coordinates of some or all vertices in N blocks out of M blocks of a three-dimensional mesh, and the coordinate system information indicating whether the current coordinate system of the N blocks is the first coordinate system, 1 ≤ N ≤ M.
- the coordinate system information indicates that the current coordinate system of the N blocks is a second coordinate system, which is a transformed coordinate system of the first coordinate system.
- the coordinate information includes the coordinates of some or all vertices in the N blocks in the second coordinate system.
- the coordinate system information also indicates that an inverse coordinate transformation should be performed during decoding.
- the coordinate system information indicates that the current coordinate system of the N blocks is the first coordinate system, and the coordinate information includes the coordinates of some or all vertices in the N blocks in the first coordinate system.
- the bitstream includes a first syntax structure corresponding to the three-dimensional mesh, the first syntax structure including a second syntax structure corresponding to the M blocks, the coordinate system information being located in the first syntax structure and outside the second syntax structure, or the coordinate system information being located in the second syntax structure corresponding to the N blocks.
- the coordinate system information includes information on N sets of second coordinate systems, and the correspondence between the information on the N sets of second coordinate systems and the N blocks.
- Any set of information on the second coordinate system includes the type of the second coordinate system, the origin information of the second coordinate system, and the coordinate axis information of the second coordinate system.
- embodiments of this application provide a computer-readable storage medium storing a bitstream as described in any one of the tenth aspects above.
- embodiments of this application provide a method for transmitting an encoded bitstream of video data, the method comprising: acquiring a bitstream from a storage medium, the bitstream being the bitstream described in any one of the tenth aspects above and stored in the storage medium; and transmitting the bitstream.
- embodiments of this application provide a system for transmitting an encoded bitstream of video data, the system comprising: an acquisition unit for acquiring a bitstream from a storage medium, the bitstream being the bitstream described in any one of the tenth aspects above and stored in the storage medium; and a transmission unit for transmitting the bitstream.
- embodiments of this application provide a method for storing an encoded bitstream of video data, the method comprising: receiving the bitstream as described in any one of the tenth aspects above; and storing the bitstream in a storage medium.
- embodiments of this application provide a system for storing encoded bitstreams of video data, comprising: a receiving unit for receiving the bitstream as described in any one of the tenth aspects; and a storage unit for storing the bitstream.
- Figure 1 is an exemplary block diagram of the encoding/decoding system 10 according to an embodiment of this application;
- Figure 2 is a flowchart of the encoding method 200 provided in an embodiment of this application.
- Figure 3 is a schematic diagram of the Cartesian coordinate system and the local normal coordinate system.
- Figure 4 is a schematic diagram of the Cartesian coordinate system and the local cylindrical coordinate system.
- Figure 5 is a flowchart of the decoding method 500 provided in an embodiment of this application.
- Figure 6 is a schematic diagram of the encoding and decoding framework of an embodiment of this application.
- Figure 7 is a schematic diagram of the encoding and decoding framework of an embodiment of this application.
- Figure 8 is a schematic diagram of the encoding and decoding framework of an embodiment of this application.
- Figure 9 is a schematic diagram of the encoding and decoding framework of an embodiment of this application.
- Figure 10 is a schematic diagram of the structure of the encoding device 1000 of this application.
- Figure 11 is a schematic diagram of the structure of the decoding device 1100 of this application.
- Figure 12 shows the electronic device 1200 provided in this application.
- "At least one (item)" means one or more, and "multiple" means two or more.
- "And/or" is used to describe the relationship between related objects, indicating that three relationships can exist. For example, "A and/or B" can represent three cases: only A exists, only B exists, and both A and B exist simultaneously, where A and B can be singular or plural. The character "/" generally indicates that the preceding and following related objects are in an "or" relationship. "At least one (item) of the following" or similar expressions refer to any combination of these items, including any combination of single or plural items.
- At least one (item) of a, b, or c can represent: a, b, c, "a and b", "a and c", "b and c", or "a and b and c", where a, b, and c can be single or multiple.
- Three-dimensional meshes can be used to represent volumetric videos, digital humans, computer graphics (CG) content, etc., and are characterized by each frame containing one mesh.
- the data of a 3D mesh typically includes: vertex coordinates, connectivity relationships, texture coordinates, and texture maps.
- Vertex coordinates indicate the position of vertices in three-dimensional (3D) space
- connectivity relationships indicate the vertices that make up the faces (e.g., triangular faces) in the 3D mesh.
- the compression of 3D meshes typically includes vertex coordinate encoding and connectivity encoding.
- the connectivity of the reference frame and the frame to be encoded (also known as the current frame)
- it is not necessary to encode the connectivity of the frame to be encoded, thereby saving bits in the bitstream.
- when inter-frame prediction is used for the vertex coordinates of the frame to be encoded, the encoding and decoding efficiency can be further improved.
- the connectivity relationships of the 3D meshes in different frames may be inconsistent, requiring the connectivity relationships to be re-encoded.
- this application provides an encoding/decoding method and apparatus.
- the technical solution of this application is described below through embodiments.
- Figure 1 is an exemplary block diagram of an encoding/decoding system 10 according to an embodiment of this application.
- the compressor 12 and decompressor 16 in the encoding/decoding system 10 represent devices, etc., that can be used to perform various technologies according to the various examples described in the embodiments of this application.
- the encoding and decoding system 10 includes an encoding end and a decoding end.
- the encoding end is used to compress the three-dimensional mesh and provide the compressed bitstream to the decoding end.
- the decoding end decompresses the bitstream to obtain the reconstructed three-dimensional mesh.
- the encoding end includes a compressor 12, and optionally may include a data source 11 and a communication interface 13.
- Data source 11 may include or be any type of 3D mesh acquisition device, such as a device that acquires 3D meshes via volumetric video, or a device that generates 3D meshes using CG technology. Data source 11 may also include or be any type of memory or storage. The data output by data source 11 is a 3D mesh.
- Compressor 12 is used to receive a three-dimensional mesh and compress the three-dimensional mesh to obtain a bitstream.
- Communication interface 13 can be used to receive the bitstream and send the bitstream to the decoding end through communication channel 14.
- the decoding end includes a decompressor 16, and optionally may include a communication interface 15 and a post-processor 17.
- the communication interface 15 is used to receive the bitstream directly from the encoding end or from any other device such as a storage device, and to provide the bitstream to the decompressor 16.
- Communication interfaces 13 and 15 can be used to send or receive bitstreams through a direct communication link between the encoder and decoder, such as a direct wired or wireless connection, or through any type of network, such as a wired network, a wireless network or any combination thereof, any type of private network and public network or any combination thereof.
- Both communication interface 13 and communication interface 15 can be configured as a one-way communication interface, as indicated by the arrow pointing from the encoding end to the corresponding communication channel 14 in Figure 1, or as a two-way communication interface. They can be used to send and receive messages, establish connections, and confirm and exchange any other information related to the communication link.
- Postprocessor 17 is used to post-process the decoded data (also known as the reconstructed 3D mesh) to obtain a post-processed 3D mesh.
- the post-processing performed by postprocessor 17 may include, for example, the reconstruction of a 3D image.
- Figure 1 shows the encoding and decoding ends as independent devices
- device embodiments may also include both encoding and decoding devices or both encoding and decoding functions, i.e., simultaneously including an encoding end or corresponding function and a decoding end or corresponding function.
- the encoding end or corresponding function and the decoding end or corresponding function may be implemented using the same hardware and/or software or by separate hardware and/or software or any combination thereof.
- the encoding and decoding ends can be any type of device, including any type of handheld or fixed device, such as a laptop, smartphone, tablet computer, or desktop computer, and may or may not use an operating system of any type.
- the encoding and decoding ends may be equipped with components for wireless communication. Therefore, the encoding and decoding ends can be wireless communication devices.
- the encoding and decoding system 10 shown in Figure 1 is merely exemplary. In some cases, the encoding end and the decoding end can be applied to the same device or to different devices. The embodiments of this application do not specifically limit the encoding and decoding system for three-dimensional meshes.
- FIG. 2 is a flowchart of process 200 of the encoding method provided in an embodiment of this application.
- Process 200 can be executed by the encoding end described above.
- Process 200 is described as a series of steps or operations. It should be understood that process 200 can be executed in various orders and/or occur simultaneously, and is not limited to the execution order shown in Figure 2.
- Process 200 may include:
- Step 201 Obtain the raw data of the 3D mesh.
- the 3D mesh comprises M blocks
- the original data comprises M sets of original information.
- Each of the M sets of original information corresponds one-to-one with one of the M blocks and contains the coordinates of the vertices in the corresponding block in a first coordinate system.
- This first coordinate system is the original coordinate system of the M blocks.
- the first coordinate system can be a Cartesian coordinate system, a local normal coordinate system, or a local cylindrical coordinate system.
- a 3D mesh is characterized by one frame containing one mesh, and a mesh can include multiple faces, which can be polygons (e.g., triangles). Each polygon is defined by its vertices in 3D space and information about how the vertices are connected (called connectivity information).
- vertex attributes (e.g., color, normals, etc.)
- Vertex connections can be represented using a one-dimensional array (1D array verCoordConnArray), where the dimension corresponds to the vertex connection index, thus arranging all values of all faces in a linear structure, i.e., all vertex connections can be arranged sequentially. Therefore, the mesh data can include multiple vertex coordinates and multiple connectivity relationships (connectivity relationships can also be interpreted as topological structures, and multiple connectivity relationships can be called a connection set), where...
- Vertex coordinates can be represented in three-dimensional coordinates (e.g., (x, y, z)) or in spherical coordinates. Their purpose is to indicate the position of the vertex in 3D space.
- vertex coordinates can be represented in a world coordinate system (e.g., a Cartesian coordinate system) or in a local coordinate system (e.g., a local normal coordinate system or a local cylindrical coordinate system).
- Connections can be represented by vertices associated with the connection.
- (1, 2, 3) represents a connection from vertex 1 to vertex 2, then to vertex 3, and then back to vertex 1.
- they can be represented by faces (e.g., triangular faces, polygonal faces, etc.).
- a triangular face has a direction; 1, 2, 3 represents 1->2, 2->3, 3->1, or it represents that the triangular face is composed of vertices 1, 2, and 3.
- edges represents the edge 1->2, 5 represents the edge 2->3, and 6 represents the edge 3->1.
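- As an illustrative sketch only, the following Python shows how a one-dimensional connectivity array could be grouped into triangular faces and how each face implies its directed edges. The helper names faces_from_flat and directed_edges, and the sample index values, are assumptions made for illustration and do not appear in this application.

```python
from typing import List, Tuple


def faces_from_flat(conn: List[int], verts_per_face: int = 3) -> List[Tuple[int, ...]]:
    """Group a flat connectivity array into faces of verts_per_face vertex indices."""
    if len(conn) % verts_per_face != 0:
        raise ValueError("connectivity array length is not a multiple of the face size")
    return [tuple(conn[i:i + verts_per_face]) for i in range(0, len(conn), verts_per_face)]


def directed_edges(face: Tuple[int, ...]) -> List[Tuple[int, int]]:
    """Return the directed edges of a face, e.g. (1, 2, 3) -> 1->2, 2->3, 3->1."""
    return [(face[i], face[(i + 1) % len(face)]) for i in range(len(face))]


# Hypothetical flat array holding two triangles that share the edge between 2 and 3.
flat = [1, 2, 3, 3, 2, 4]
faces = faces_from_flat(flat)
edges_of_first_face = directed_edges(faces[0])
```

- The same flat array therefore supports both readings described above: a face as an ordered vertex loop, or a face as the set of its edges.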
- a patch can also be referred to as a region, block, spatial block, etc., without specific limitation.
- Any patch e.g., the first patch, also called the first patch
- the first patch includes multiple vertices
- these vertices are governed by the same local coordinate system. Therefore, the information (i.e., coordinate information) of these vertices can be transformed to the same local coordinate system, for example, from Cartesian coordinates to the local normal coordinate system. It should be understood that, to improve coding efficiency, multiple vertices in the same patch can use one or more coordinate systems.
- This includes establishing a local coordinate system for the coordinate information of multiple vertices and transforming all the coordinate information of multiple vertices to this local coordinate system; or establishing separate local coordinate systems for the coordinate information of multiple vertices and transforming the coordinate information of multiple vertices to their respective local coordinate systems; or establishing multiple local coordinate systems for the coordinate information of multiple vertices and transforming the coordinate information of multiple vertices according to their distribution or position to the corresponding local coordinate system.
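- The transformation of vertex coordinates into a shared local coordinate system can be sketched as follows. This is a minimal illustration that assumes the local system is given by an origin and three orthonormal axis vectors expressed in the first (Cartesian) coordinate system; the function names to_local and to_original and the numeric values are hypothetical.

```python
from typing import Tuple

Vec3 = Tuple[float, float, float]
Axes = Tuple[Vec3, Vec3, Vec3]


def dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def to_local(p: Vec3, origin: Vec3, axes: Axes) -> Vec3:
    """Express p in the local system (assumes the three axes are orthonormal)."""
    d = (p[0] - origin[0], p[1] - origin[1], p[2] - origin[2])
    return (dot(d, axes[0]), dot(d, axes[1]), dot(d, axes[2]))


def to_original(q: Vec3, origin: Vec3, axes: Axes) -> Vec3:
    """Inverse of to_local: map local coordinates back to the original system."""
    return tuple(
        origin[i] + q[0] * axes[0][i] + q[1] * axes[1][i] + q[2] * axes[2][i]
        for i in range(3)
    )


# Hypothetical local system: origin (0, 1, 0), axes rotated 90 degrees about z.
origin = (0.0, 1.0, 0.0)
axes = ((0.0, 1.0, 0.0), (-1.0, 0.0, 0.0), (0.0, 0.0, 1.0))
local = to_local((2.0, 3.0, 4.0), origin, axes)
restored = to_original(local, origin, axes)
```

- Applying to_local to every vertex of a patch corresponds to the single-local-system case; the per-vertex and multi-system cases would call it with different origin and axes arguments.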
- the coordinate information may include the following two cases:
- Coordinate information may include position coordinates in the first coordinate system (also known as three-dimensional coordinates, vertex coordinates, etc.).
- the coordinate information may include the position coordinate residual in the first coordinate system, which is obtained based on the position coordinate prediction in the first coordinate system.
- coordinate information may also include other information related to the location coordinates, without any specific limitations.
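- The difference between the two cases can be illustrated with a short sketch. The predictor used here (each vertex is predicted from the previously coded vertex, with the first vertex predicted from zero) is an assumption chosen for simplicity; this application does not fix a particular prediction method.

```python
from typing import List, Tuple

Coord = Tuple[int, int, int]


def encode_residuals(coords: List[Coord]) -> List[Coord]:
    """Case 2: store each position as a residual against the previous vertex."""
    prev = (0, 0, 0)  # assumed prediction for the first vertex
    residuals = []
    for c in coords:
        residuals.append(tuple(a - b for a, b in zip(c, prev)))
        prev = c
    return residuals


def decode_residuals(residuals: List[Coord]) -> List[Coord]:
    """Recover position coordinates (case 1) from the residuals."""
    prev = (0, 0, 0)
    coords = []
    for r in residuals:
        prev = tuple(a + b for a, b in zip(r, prev))
        coords.append(prev)
    return coords


positions = [(10, 20, 30), (11, 20, 29), (13, 22, 29)]
residuals = encode_residuals(positions)
```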
- Step 202 Encode the coordinate information and coordinate system information into the bitstream corresponding to the 3D mesh based on the original data.
- the coordinate information includes the coordinates of some or all vertices in N blocks out of M blocks, and the coordinate system information indicates whether the current coordinate system of the N blocks is the first coordinate system, 1 ≤ N ≤ M.
- the bitstream contains a first syntax structure corresponding to a three-dimensional mesh
- the first syntax structure contains a second syntax structure corresponding to M blocks
- coordinate system information is located in the first syntax structure and outside the second syntax structure, or the coordinate system information is located in the second syntax structure corresponding to N blocks.
- the bitstream corresponding to the three-dimensional mesh contains two kinds of syntax elements: one is the syntax element corresponding to the three-dimensional mesh (i.e., the mesh syntax element, corresponding to the first syntax structure mentioned above), and the other is the syntax element corresponding to the block (i.e., the block syntax element, corresponding to the second syntax structure mentioned above).
- the coordinate system information in this embodiment can be written into the first syntax structure or the second syntax structure, and there is no specific limitation on this.
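- The two possible locations of the coordinate system information can be modelled as below. This is only a sketch: the class names MeshSyntax and BlockSyntax, the field names, and the rule that a block-level value takes precedence over a mesh-level value are assumptions, not syntax defined by this application.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class BlockSyntax:
    """Stand-in for the second syntax structure (one per block)."""
    block_id: int
    coord_sys_info: Optional[int] = None  # present only if signalled per block


@dataclass
class MeshSyntax:
    """Stand-in for the first syntax structure (one per three-dimensional mesh)."""
    coord_sys_info: Optional[int] = None  # present only if signalled at mesh level
    blocks: List[BlockSyntax] = field(default_factory=list)


def coord_sys_info_for(mesh: MeshSyntax, block_id: int) -> Optional[int]:
    """Look up the information for a block; block level wins (assumed rule)."""
    for block in mesh.blocks:
        if block.block_id == block_id:
            if block.coord_sys_info is not None:
                return block.coord_sys_info
            return mesh.coord_sys_info
    raise KeyError(block_id)


mesh = MeshSyntax(coord_sys_info=0, blocks=[BlockSyntax(0), BlockSyntax(1, coord_sys_info=1)])
```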
- the coordinate system information indicates that the current coordinate system of the N blocks is a second coordinate system, which is a transformed version of the first coordinate system.
- the coordinate information includes the coordinates of some or all vertices in the N blocks under the second coordinate system.
- the second coordinate system can be a Cartesian coordinate system, a local normal coordinate system, or a local cylindrical coordinate system.
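- For the cylindrical case, a conversion between Cartesian coordinates and cylindrical coordinates can be sketched as follows. The sketch uses the textbook convention (radius, angle about the z-axis, height) and assumes the point is already expressed relative to the local origin and axes; the exact definition of the local cylindrical coordinate system in this application may differ.

```python
import math
from typing import Tuple

Vec3 = Tuple[float, float, float]


def cartesian_to_cylindrical(p: Vec3) -> Vec3:
    """Map (x, y, z) to (r, theta, z), with theta measured about the z-axis."""
    x, y, z = p
    return (math.hypot(x, y), math.atan2(y, x), z)


def cylindrical_to_cartesian(q: Vec3) -> Vec3:
    """Inverse mapping from (r, theta, z) back to (x, y, z)."""
    r, theta, z = q
    return (r * math.cos(theta), r * math.sin(theta), z)


cyl = cartesian_to_cylindrical((3.0, 4.0, 5.0))
back = cylindrical_to_cartesian(cyl)
```

- Because the inverse mapping involves trigonometric functions, a round trip generally reproduces the input only up to floating-point error.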
- the coordinate system information can also indicate that an inverse coordinate transformation should be performed during decoding.
- the coordinate system information can include information on N sets of target coordinate systems, as well as the correspondence between the information on these N sets of target coordinate systems and N blocks.
- the information on any set of target coordinate systems includes the type of the target coordinate system, the origin information of the target coordinate system, and the coordinate axis information of the target coordinate system.
- N out of the M blocks in the 3D mesh require coordinate system transformation.
- the encoding end can transform the coordinate information of the vertices within any of these N blocks from the first coordinate system to the target coordinate system.
- the encoding end can write coordinate system information into the bitstream.
- This coordinate system information includes the type of the target coordinate system, the origin information, and the coordinate axes information.
- the target coordinate system can be either a first coordinate system or a second coordinate system.
- identifier 1 represents the first coordinate system and identifier 2 represents the second coordinate system; or, identifier 0 represents the first coordinate system and identifier 1 represents the second coordinate system, without any specific limitation.
- the origin information of the target coordinate system includes the coordinate information of the origin in the first coordinate system; or, the origin information of the target coordinate system includes the origin acquisition mode.
- the encoding end can directly write the coordinate information of the origin in the first coordinate system into the bitstream, for example, (0,1,0), or the encoding end can write the origin acquisition mode into the bitstream, for example, identifier 0 represents the origin as the previous vertex, identifier 1 represents the origin as the midpoint of the two previous vertices, without specific limitations.
- the coordinate axis information of the target coordinate system includes vectors of the coordinate axes in the first coordinate system; or, the coordinate axis information of the target coordinate system includes the acquisition mode of the coordinate axes.
- the encoding end can directly write the vectors of the coordinate axes in the first coordinate system into the bitstream, for example, (1,0,0) represents the x-axis, (0,1,0) represents the y-axis, and (0,0,1) represents the z-axis; or, the encoding end can write the acquisition mode of the coordinate axes into the bitstream, for example, identifier 1 represents the default x-axis (1,0,0), y-axis (0,1,0), z-axis (0,0,1), and identifier 2 represents a preset rotation angle.
- the aforementioned preset angle can also be written into the bitstream.
- the origin and coordinate axes of the target coordinate system can also be obtained in a way that is agreed upon in advance by the encoder and decoder.
- the default origin is the previous vertex
- the default x-axis is (1,0,0)
- the default y-axis is (0,1,0)
- the default z-axis is (0,0,1).
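- How a decoder might resolve the origin and axes from either explicit values or an acquisition mode can be sketched as follows. The identifier values follow the examples above (origin mode 0: previous vertex; origin mode 1: midpoint of the two previous vertices; axis mode 1: default axes; axis mode 2: preset rotation angle). Interpreting the preset angle as a rotation about the z-axis, and the function names resolve_origin and resolve_axes, are assumptions.

```python
import math
from typing import Optional, Sequence, Tuple

Vec3 = Tuple[float, float, float]
Axes = Tuple[Vec3, Vec3, Vec3]

DEFAULT_AXES: Axes = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))


def resolve_origin(explicit: Optional[Vec3] = None, mode: Optional[int] = None,
                   decoded: Sequence[Vec3] = ()) -> Vec3:
    """Return the origin from explicit coordinates or from an acquisition mode."""
    if explicit is not None:
        return explicit
    if mode == 0:  # previous vertex
        return decoded[-1]
    if mode == 1:  # midpoint of the two previous vertices
        a, b = decoded[-2], decoded[-1]
        return tuple((x + y) / 2 for x, y in zip(a, b))
    raise ValueError("unknown origin acquisition mode")


def resolve_axes(explicit: Optional[Axes] = None, mode: Optional[int] = None,
                 angle: float = 0.0) -> Axes:
    """Return the axes from explicit vectors or from an acquisition mode."""
    if explicit is not None:
        return explicit
    if mode == 1:  # default axes
        return DEFAULT_AXES
    if mode == 2:  # preset rotation angle, assumed here to be about the z-axis
        c, s = math.cos(angle), math.sin(angle)
        return ((c, s, 0.0), (-s, c, 0.0), (0.0, 0.0, 1.0))
    raise ValueError("unknown axis acquisition mode")


midpoint = resolve_origin(mode=1, decoded=[(0.0, 0.0, 0.0), (2.0, 4.0, 6.0)])
```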
- the coordinate system information indicates that the current coordinate system of the N blocks is the first coordinate system, and the coordinate information includes the coordinates of some or all of the vertices in the N blocks in the first coordinate system.
- the coordinate system information can contain N block identifiers, which are used to indicate that the N blocks corresponding to the N block identifiers do not undergo coordinate system transformation.
- the encoding end can provide reverse indication, that is, the coordinate system information indicates blocks that do not require coordinate system transformation, while other blocks are assumed to require coordinate system transformation, and the target coordinate system of these blocks is pre-defined by the encoding and decoding ends.
- the decoding end can retain the blocks corresponding to the block identifiers in the coordinate system information, while performing coordinate system transformation on the blocks not indicated in the coordinate system information.
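- The reverse indication can be sketched from the decoding end's point of view. In this illustration the pre-defined target coordinate system is assumed to differ from the first coordinate system only by a translation of the origin; the function name reconstruct_blocks and all values are hypothetical.

```python
from typing import Callable, Dict, List, Set, Tuple

Coord = Tuple[int, int, int]


def reconstruct_blocks(blocks: Dict[int, List[Coord]],
                       untransformed_ids: Set[int],
                       inverse_transform: Callable[[Coord], Coord]) -> Dict[int, List[Coord]]:
    """Keep signalled blocks as-is; inverse-transform every other block."""
    return {
        block_id: list(vertices) if block_id in untransformed_ids
        else [inverse_transform(v) for v in vertices]
        for block_id, vertices in blocks.items()
    }


# Assumed pre-defined target system: same axes, origin shifted to (10, 0, 0).
AGREED_ORIGIN = (10, 0, 0)


def inverse(v: Coord) -> Coord:
    return tuple(a + b for a, b in zip(v, AGREED_ORIGIN))


decoded = {0: [(1, 1, 1)], 1: [(2, 2, 2)]}
reconstructed = reconstruct_blocks(decoded, untransformed_ids={0}, inverse_transform=inverse)
```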
- the encoding and decoding ends pre-agree on coordinate transformation modes. Specifically, if the coordinate system information in the bitstream is 0, it indicates that the target coordinate system of patches numbered 0-29 is the second coordinate system, meaning they undergo coordinate transformation; the target coordinate system of patches numbered 30-59 is the first coordinate system, meaning they do not undergo coordinate transformation. Conversely, if the coordinate system information in the bitstream is 1, it indicates that the target coordinate system of patches numbered 0-29 is the first coordinate system, meaning they do not undergo coordinate transformation; the target coordinate system of patches numbered 30-59 is the second coordinate system, meaning they undergo coordinate transformation.
- This method allows a fixed number of patches to have their coordinate system transformation methods specified using an identifier within the syntax elements of the 3D mesh.
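- The single-identifier scheme for a fixed number of patches can be sketched as follows, using the patch ranges 0-29 and 30-59 from the example above. The function name is_transformed is an assumption.

```python
GROUP_A = range(0, 30)   # patches numbered 0-29
GROUP_B = range(30, 60)  # patches numbered 30-59


def is_transformed(coord_sys_info: int, patch_id: int) -> bool:
    """Return True if the patch undergoes coordinate transformation.

    Identifier 0: group A is transformed, group B is not.
    Identifier 1: group A is not transformed, group B is.
    """
    if coord_sys_info not in (0, 1):
        raise ValueError("coordinate system information must be 0 or 1")
    in_group_a = patch_id in GROUP_A
    if not in_group_a and patch_id not in GROUP_B:
        raise ValueError("patch number outside the pre-agreed range")
    return in_group_a if coord_sys_info == 0 else not in_group_a
```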
- the first identifier specifies the number of selected patches (e.g., 30); the second identifier specifies the coordinate system transformation method used for that number of patches (e.g., 0 indicates the first coordinate system, 1 indicates the second coordinate system); the third identifier specifies the method for acquiring the patches, for example, 0 indicates sequential acquisition, 1 indicates acquisition of odd-numbered patches, 2 indicates acquisition of even-numbered patches, and 3 indicates acquisition of boundary patches.
- This method allows multiple identifiers to be used in the syntax elements of a 3D mesh to specify the coordinate system transformation method for a non-fixed number of patches.
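- The way the first and third identifiers might jointly select patches can be sketched as follows. Zero-based patch numbering, the function name select_patches, and passing the boundary patch numbers in explicitly are assumptions made for this illustration.

```python
from typing import Iterable, List


def select_patches(count: int, method: int, total: int,
                   boundary_ids: Iterable[int] = ()) -> List[int]:
    """Return the numbers of the selected patches.

    method 0: sequential, 1: odd-numbered, 2: even-numbered, 3: boundary patches.
    """
    if method == 0:
        candidates = list(range(total))
    elif method == 1:
        candidates = [i for i in range(total) if i % 2 == 1]
    elif method == 2:
        candidates = [i for i in range(total) if i % 2 == 0]
    elif method == 3:
        candidates = sorted(boundary_ids)
    else:
        raise ValueError("unknown patch acquisition method")
    return candidates[:count]
```

- The second identifier would then state which coordinate system applies to the patches this function returns.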
- identifiers corresponding to the number of patches can be used to indicate that patches numbered 1-4 will be transformed in the following ways: no transformation, transformation, no transformation, and no transformation, respectively.
- This method can also be used in the syntax elements of a 3D mesh to specify the coordinate system transformation methods for a non-fixed number of patches using multiple identifiers.
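- The per-patch signalling in the example above (patches numbered 1-4: no transformation, transformation, no transformation, no transformation) can be sketched as one flag per patch. Using the value 1 for "transformation" and the function name per_patch_flags are assumptions.

```python
from typing import Dict, Sequence


def per_patch_flags(first_patch_id: int, flags: Sequence[int]) -> Dict[int, bool]:
    """Map consecutive patch numbers to whether each patch is transformed."""
    return {first_patch_id + i: bool(flag) for i, flag in enumerate(flags)}


decisions = per_patch_flags(1, [0, 1, 0, 0])
```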
- the frame coordinate system is a coordinate transformation identifier
- Target coordinate system identifier in this case, the target coordinate system is consistent with the original coordinate system, i.e., no coordinate transformation is performed; otherwise, a target transformation is performed; or,
- a coordinate system consistent with the encoding and decoding conventions is used, but there is no corresponding target coordinate system identifier in the bitstream at this time;
- the frame coordinate system is a coordinate transformation identifier
- Target coordinate system identifier
- a coordinate system consistent with the encoding and decoding conventions is used, but there is no corresponding target coordinate system identifier in the bitstream at this time;
- the frame coordinate system is a coordinate transformation identifier
- Target coordinate system identifier
- a coordinate system consistent with the encoding and decoding conventions is used, but there is no corresponding target coordinate system identifier in the bitstream at this time;
- a coordinate system transformation identifier for the frame, for example, 1 indicates that a coordinate system transformation exists in the current frame, and 0 indicates that no coordinate system transformation exists.
- a coordinate system transformation identifier for the patch, for example, 1 indicates that the patch undergoes a coordinate system transformation, and 0 indicates that it does not;
- the target coordinate system identifier, for example, 1 indicates that the target coordinate system is a local normal coordinate system, and 2 indicates that the target coordinate system is a local cylindrical coordinate system.
- 0 can be used to indicate that no coordinate system transformation is performed.
- the Patch transformation identifier may not be present, as in Example 5.
- the origin of the target coordinate system can be directly encoded as the coordinates of the three-dimensional origin, such as (0,0,0); or it can be encoded as a recording method, such as selecting the coordinates of the previous vertex.
- the target coordinate system coordinate axis identifier can be directly encoded as a 3D coordinate axis, such as (1,0,0)(0,1,0)(0,0,1); or it can be encoded as a recording method, such as selecting the normal of the preceding patch or the direction of a certain edge.
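- The identifiers above can be pictured with a minimal sketch. It assumes the example values given in the text (1 for a local normal coordinate system, 2 for a local cylindrical coordinate system, 0 for no transformation); the names `PatchCoordInfo`, `resolve_target`, and `default_id` are hypothetical and are not syntax element names from this application.

```python
from dataclasses import dataclass
from typing import Optional

# Example identifier values taken from the text; the string labels are illustrative.
COORD_SYS_NAMES = {0: "none", 1: "local_normal", 2: "local_cylindrical"}

@dataclass
class PatchCoordInfo:
    """Hypothetical per-patch signalling; field names are illustrative only."""
    transform_flag: int               # 1: patch is transformed, 0: it is not
    target_id: Optional[int] = None   # may be absent from the bitstream

def resolve_target(frame_flag: int, patch: PatchCoordInfo, default_id: int = 1) -> str:
    """Return the coordinate system a decoder would assume for one patch."""
    if frame_flag == 0 or patch.transform_flag == 0:
        return COORD_SYS_NAMES[0]
    # With no signalled identifier, fall back to a convention shared by both ends.
    target = patch.target_id if patch.target_id is not None else default_id
    return COORD_SYS_NAMES[target]

# Four patches: no transformation, transformation, no transformation, no transformation.
patches = [PatchCoordInfo(0), PatchCoordInfo(1, 2), PatchCoordInfo(0), PatchCoordInfo(0)]
print([resolve_target(1, p) for p in patches])
```

Keeping the frame-level flag separate from the per-patch flags means a frame without any transformation costs a single flag, which is one plausible reason for the layered signalling.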
- the encoder can transform the coordinates of vertices in N blocks in the first coordinate system to coordinates in the second coordinate system.
- the encoding end can determine whether there are transformation blocks among multiple blocks; when there are transformation blocks among multiple blocks, it obtains the target coordinate system of at least one transformation block; and performs coordinate system transformation on the information of vertices in at least one transformation block according to the target coordinate system of at least one transformation block to obtain at least one transformed block.
- a transformation block is a block among multiple blocks where the coordinate system information of its vertices is transformed.
- the encoding end can determine whether there are transformation blocks among multiple blocks by methods such as rate-distortion optimization, vertex distribution before and after coordinate system transformation, triangle attributes of the faces included in the 3D mesh, or pre-setting.
- the encoding end can perform this operation for any block, that is, for the multiple blocks in the 3D mesh, it determines one by one whether each block is a transformation block; once a block is confirmed as a transformation block, it obtains that block's target coordinate system.
- Each transformation block can obtain a target coordinate system, and the target coordinate systems of different transformation blocks may be the same or different.
- the encoder uses a rate-distortion function to determine whether to perform a coordinate system transformation. Specifically, for any given block, it calculates a weighted value of the bitrate and the distortion loss of the vertices in the block under the first coordinate system and under each of the other coordinate systems (e.g., a local coordinate system). Based on these weighted values, it determines whether a coordinate system transformation is needed for the vertex information in that block and, if so, selects the best of the other coordinate systems as the target coordinate system for that block. Another example is to analyze the vertex distribution before and after transformation.
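- The rate-distortion decision can be sketched as follows. This assumes the common Lagrangian form J = D + λR for the weighted value, which the text does not specify; the function names, the value of `lam`, and the rate and distortion numbers are all made up for illustration.

```python
def rd_cost(rate_bits: float, distortion: float, lam: float) -> float:
    """Weighted value of rate and distortion, assumed here to be J = D + lam * R."""
    return distortion + lam * rate_bits

def select_coordinate_system(candidates: dict, lam: float = 0.1) -> str:
    """candidates maps a coordinate system name to (rate_bits, distortion)."""
    return min(candidates, key=lambda name: rd_cost(*candidates[name], lam))

# Illustrative measurements for one block (not real data).
measurements = {
    "first": (1200.0, 4.0),
    "local_normal": (900.0, 4.5),
    "local_cylindrical": (1000.0, 4.2),
}
best = select_coordinate_system(measurements)
# The block is a transformation block only if another system beats the first one.
is_transformation_block = best != "first"
print(best, is_transformation_block)
```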
- the purpose of determining whether there are transformed blocks among multiple blocks is to find blocks (i.e., transformed blocks) that have better compression ratio and bit rate in the target coordinate system than in the first coordinate system.
- using the vertex information of the target coordinate system can improve the compression efficiency of the vertices of the transformed blocks.
- the encoding end can set coordinate system information for the transformed block.
- This coordinate system information can include a coordinate system identifier to indicate the target coordinate system of the transformed block. For example, a coordinate system identifier of 1 indicates a local normal coordinate system; a coordinate system identifier of 0 indicates a local cylindrical coordinate system.
- the coordinate system information can also include a coordinate system transformation indicator to indicate whether coordinate system transformation is required for the vertex information in the block. For example, a coordinate system transformation indicator of 1 indicates that coordinate system transformation is required for the vertex information in the block; a coordinate system transformation indicator of 0 indicates that coordinate system transformation is not required for the vertex information in the block.
- the information of vertices in any transformed block is the information in the corresponding target coordinate system. That is, for any transformed block, once its target coordinate system is determined, the information of at least one vertex in the transformed block can be transformed to obtain the transformed information of at least one vertex in the target coordinate system, which is the transformed block.
- the encoding end can first establish a target coordinate system, which can rely on reference vertices or reference triangle faces to determine the origin and coordinate axis directions of the target coordinate system. Then, the information in the first coordinate system is transformed to the target coordinate system.
- the transformation methods include, but are not limited to, translation, rotation, scaling, and perspective transformations.
- the target coordinate system is a local normal coordinate system, whose origin can be an encoded vertex or obtained from an encoded triangle face.
- the coordinate axes of the local normal coordinate system can be generated by the normal and two tangential components at each vertex, as shown in Figure 3 ( Figure 3 is a schematic diagram of the Cartesian coordinate system and the local normal coordinate system).
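- As a rough illustration of a local normal coordinate system, the sketch below builds an orthonormal frame (tangent, bitangent, normal) and maps a point into it and back. It derives the frame from a reference triangle rather than from per-vertex normals, which is a simplifying assumption; all function names are hypothetical.

```python
import math

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def normalize(v):
    n = math.sqrt(dot(v, v))
    return tuple(x / n for x in v)

def local_normal_frame(p0, p1, p2):
    """Origin p0 and axes (tangent, bitangent, normal) of a non-degenerate triangle."""
    t = normalize(sub(p1, p0))
    n = normalize(cross(sub(p1, p0), sub(p2, p0)))
    b = cross(n, t)  # already unit length because n and t are orthonormal
    return p0, (t, b, n)

def to_local(p, origin, axes):
    """First coordinate system -> local frame (projection onto each axis)."""
    d = sub(p, origin)
    return tuple(dot(d, a) for a in axes)

def to_world(q, origin, axes):
    """Local frame -> first coordinate system (inverse of to_local)."""
    return tuple(origin[i] + sum(q[k] * axes[k][i] for k in range(3)) for i in range(3))

origin, axes = local_normal_frame((1.0, 0.0, 0.0), (1.0, 2.0, 0.0), (0.0, 0.0, 0.0))
local = to_local((0.5, 1.0, 0.25), origin, axes)
print(local, to_world(local, origin, axes))
```

For vertices lying close to the reference surface, the third (normal) component tends to be small, which is the intuition behind concentrating the data in fewer dimensions.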
- the target coordinate system is a local cylindrical coordinate system, whose origin can be an encoded vertex or obtained from an encoded triangle face.
- the coordinate axes of the local cylindrical coordinate system can be obtained based on a reference triangle.
- the vertex coordinates (x, y, z) in the Cartesian coordinate system are transformed into the dihedral angle θ, radius r, and height z in the local cylindrical coordinate system, as shown in Figure 4 (Figure 4 is a schematic diagram of the Cartesian coordinate system and the local cylindrical coordinate system).
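- A minimal sketch of the Cartesian-to-cylindrical mapping and its inverse is given below. It assumes the cylinder axis coincides with the z-axis of the first coordinate system and measures θ with `atan2`; in the application the axes come from a reference triangle, so this is a simplified special case with hypothetical function names.

```python
import math

def to_cylindrical(p, origin=(0.0, 0.0, 0.0)):
    """(x, y, z) -> (theta, r, z) about a z-aligned axis through origin."""
    x, y, z = (p[i] - origin[i] for i in range(3))
    return math.atan2(y, x), math.hypot(x, y), z

def from_cylindrical(c, origin=(0.0, 0.0, 0.0)):
    """(theta, r, z) -> (x, y, z); inverse of to_cylindrical."""
    theta, r, z = c
    return (origin[0] + r * math.cos(theta),
            origin[1] + r * math.sin(theta),
            origin[2] + z)

theta, r, z = to_cylindrical((1.0, 1.0, 2.0))
print(theta, r, z)
print(from_cylindrical((theta, r, z)))
```

Vertices lying on a roughly cylindrical surface share nearly the same r, so most of the variation moves into θ and z.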
- multiple vertices in the same patch are applicable to the same local coordinate system.
- different local coordinate systems can be created for different vertices. That is, one or more vertices can be transformed to the same local coordinate system, or different vertices can be transformed to different local coordinate systems. No specific limitation is made in this regard.
- the encoding end can perform prediction and entropy coding on at least one transformed block to obtain a bitstream.
- the transformed information of at least one vertex can be predicted to obtain the residual information of at least one vertex; then the residual information of at least one vertex is encoded to obtain the bitstream.
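- The prediction-then-residual step can be illustrated with the simplest differential predictor, where each vertex is predicted from the previously coded one. This is only one possible predictor, chosen for brevity; entropy coding is omitted and the function names are hypothetical.

```python
def encode_residuals(vertices):
    """Differential prediction: residual = vertex - previously coded vertex."""
    residuals, prev = [], (0, 0, 0)
    for v in vertices:
        residuals.append(tuple(a - b for a, b in zip(v, prev)))
        prev = v
    return residuals

def decode_residuals(residuals):
    """Residual compensation: vertex = previously decoded vertex + residual."""
    vertices, prev = [], (0, 0, 0)
    for r in residuals:
        prev = tuple(a + b for a, b in zip(r, prev))
        vertices.append(prev)
    return vertices

# Quantized (integer) vertex positions in the target coordinate system, made up for illustration.
verts = [(10, 20, 30), (11, 21, 30), (12, 21, 31)]
res = encode_residuals(verts)
print(res)
print(decode_residuals(res) == verts)
```

The residuals are what would be passed to the entropy coder; the smaller their magnitudes, the fewer bits are typically needed.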
- coordinate system information can be filled into the patch data unit syntax in the bitstream.
- One method is to add new bytes to the syntax, and another method is to use the reserved fields in the syntax. This application does not specifically limit this approach.
- One possible implementation also includes: obtaining untransformed blocks, which are blocks among multiple blocks for which coordinate system transformation is not performed on the information of vertices; and encoding based on the untransformed blocks to obtain a bitstream.
- the untransformed block can be a patch in the 3D mesh that is different from the transformed block. The untransformed block is better suited to the world coordinate system (corresponding to the first coordinate system) than to a local coordinate system (corresponding to the target coordinate system). That is, multiple vertices in the 3D mesh can be applied to different local coordinate systems or to the world coordinate system. Based on the characteristics of multiple vertices in the 3D mesh, multiple vertices can be assigned to different patches in the most suitable way, so that the same coordinate system transformation can be applied to the vertices in the same patch. In this way, the information of vertices in each patch adopts the most suitable coordinate system, which can comprehensively improve the vertex compression rate of the 3D mesh.
- the optimal coordinate system corresponding to the block to be transformed is selected, so that the vertex information in the block is placed in the corresponding coordinate system for subsequent compression processing, which can improve the compression efficiency of the vertex.
- FIG. 5 is a flowchart of process 500 of the decoding method provided in an embodiment of this application.
- Process 500 can be executed by the decoding end described above.
- Process 500 is described as a series of steps or operations. It should be understood that process 500 can be executed in various orders and/or occur simultaneously, and is not limited to the execution order shown in Figure 5.
- Process 500 may include:
- Step 501 Receive the bitstream corresponding to the 3D mesh.
- the above 3D mesh contains M blocks.
- Step 502 Decode the bitstream to obtain coordinate information and coordinate system information.
- the coordinate information includes the coordinates of some or all vertices in N blocks out of M blocks.
- the coordinate system information indicates whether the coordinate system of the N blocks is the first coordinate system, which is the original coordinate system of the M blocks, where 1 ≤ N ≤ M.
- the first coordinate system can be a Cartesian coordinate system, a local normal coordinate system, or a local cylindrical coordinate system.
- the bitstream contains a first syntax structure corresponding to the 3D mesh
- the first syntax structure contains a second syntax structure corresponding to M blocks
- coordinate system information is located in the first syntax structure and outside the second syntax structure, or the coordinate system information is located in the second syntax structure corresponding to N blocks.
- the bitstream corresponding to the 3D mesh contains two kinds of syntax elements: one is the syntax element corresponding to the 3D mesh (i.e., the mesh syntax element, corresponding to the first syntax structure mentioned above), and the other is the syntax element corresponding to the block (i.e., the block syntax element, corresponding to the second syntax structure mentioned above).
- the coordinate system information in this embodiment can be written into the first syntax structure or the second syntax structure, and there is no specific limitation on this.
- the coordinate system information indicates that the current coordinate system of the N blocks is a second coordinate system, which is a transformed version of the first coordinate system.
- the coordinate information includes the coordinates of some or all vertices in the N blocks under the second coordinate system.
- the second coordinate system can be a Cartesian coordinate system, a local normal coordinate system, or a local cylindrical coordinate system.
- the coordinate system information can also indicate that an inverse coordinate transformation should be performed during decoding.
- the coordinate system information can include information on N sets of target coordinate systems, as well as the correspondence between the information on these N sets of target coordinate systems and N blocks.
- the information on any set of target coordinate systems includes the type of the target coordinate system, the origin information of the target coordinate system, and the coordinate axis information of the target coordinate system.
- the decoding end needs to transform the coordinate information of the vertices in the N blocks back from the second coordinate system to the first coordinate system.
- the decoding end can perform this transformation based on the coordinate system information, where the target coordinate system information includes the type of the target coordinate system, the origin information, and the coordinate axis information.
- the target coordinate system can be either a first coordinate system or a second coordinate system.
- identifier 1 represents the first coordinate system and identifier 2 represents the second coordinate system; or, identifier 0 represents the first coordinate system and identifier 1 represents the second coordinate system, without specific limitations.
- the first coordinate system can be a Cartesian coordinate system, a local normal coordinate system, or a local cylindrical coordinate system.
- When the decoding end recognizes that the type of the target coordinate system is 2, it can determine that the coordinate information of the vertices in the block needs to be transformed from the second coordinate system back to the first coordinate system.
- the origin information of the target coordinate system includes the coordinate information of the origin in the first coordinate system; or, the origin information of the target coordinate system includes the origin acquisition mode.
- the encoding end can directly write the coordinate information of the origin in the first coordinate system into the bitstream, for example, (0,1,0), or the encoding end can write the origin acquisition mode into the bitstream, for example, identifier 0 represents the origin as the previous vertex, identifier 1 represents the origin as the midpoint of the two previous vertices, without specific limitations.
- the decoding end can use the vertex at position (0,1,0) in the first coordinate system as the origin of the second coordinate system, or use the previously processed vertex as the origin of the second coordinate system, or calculate the midpoint of the two vertices and then use that midpoint as the origin of the second coordinate system, etc., without making specific limitations.
- the coordinate axis information of the target coordinate system includes vectors of the coordinate axes in the first coordinate system; or, the coordinate axis information of the target coordinate system includes the acquisition mode of the coordinate axes.
- the encoding end can directly write the vectors of the coordinate axes in the first coordinate system into the bitstream, for example, (1,0,0) represents the x-axis, (0,1,0) represents the y-axis, and (0,0,1) represents the z-axis; or, the encoding end can write the acquisition mode of the coordinate axes into the bitstream, for example, identifier 1 represents the default x-axis (1,0,0), y-axis (0,1,0), z-axis (0,0,1), and identifier 2 represents a preset rotation angle.
- the aforementioned preset angle can also be written into the bitstream.
- the decoding end can use the vector represented by (1,0,0) in the first coordinate system as the x-axis of the second coordinate system, the vector represented by (0,1,0) in the first coordinate system as the y-axis of the second coordinate system, and the vector represented by (0,0,1) in the first coordinate system as the z-axis of the second coordinate system, or rotate the first coordinate system according to the rotation angle.
- the origin and coordinate axes of the target coordinate system can also be obtained in a way that is agreed upon in advance by the encoder and decoder.
- the default origin is the previous vertex
- the default x-axis is (1,0,0)
- the default y-axis is (0,1,0)
- the default z-axis is (0,0,1).
- the decoding end can construct a second coordinate system corresponding to the vertex, and then perform a transformation from the second coordinate system to the first coordinate system on the coordinate information of the vertex based on the second coordinate system.
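- The way a decoder might rebuild the second coordinate system from the signalled origin and axis information can be sketched as follows. The mode values follow the examples in the text (origin mode 0: previous vertex, 1: midpoint of the two previous vertices; axis mode 1: default axes, 2: rotation by a signalled angle). Rotating about the z-axis is an assumption, since the text does not name the rotation axis, and all function names are hypothetical.

```python
import math

def resolve_origin(mode, decoded, explicit=None):
    """mode None: explicit coordinates; 0: previous vertex; 1: midpoint of the last two."""
    if mode is None:
        return explicit
    if mode == 0:
        return decoded[-1]
    if mode == 1:
        a, b = decoded[-2], decoded[-1]
        return tuple((x + y) / 2 for x, y in zip(a, b))
    raise ValueError("unknown origin acquisition mode")

def resolve_axes(mode, angle_deg=0.0):
    """mode 1: default axes; mode 2: default axes rotated about z (assumed) by angle_deg."""
    if mode == 1:
        return ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))
    if mode == 2:
        c = math.cos(math.radians(angle_deg))
        s = math.sin(math.radians(angle_deg))
        return ((c, s, 0.0), (-s, c, 0.0), (0.0, 0.0, 1.0))
    raise ValueError("unknown axis acquisition mode")

def to_first(q, origin, axes):
    """Second coordinate system -> first coordinate system."""
    return tuple(origin[i] + sum(q[k] * axes[k][i] for k in range(3)) for i in range(3))

decoded = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]   # vertices already reconstructed
origin = resolve_origin(1, decoded)             # midpoint of the two previous vertices
axes = resolve_axes(2, 90.0)
print(to_first((1.0, 0.0, 0.0), origin, axes))
```

Signalling an acquisition mode instead of explicit coordinates lets both ends derive the same frame from already coded data, at the cost of making the frame depend on earlier reconstruction results.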
- the coordinate system information indicates that the current coordinate system of the N blocks is the first coordinate system, and the coordinate information includes the coordinates of some or all of the vertices in the N blocks in the first coordinate system.
- the coordinate system information can contain N block identifiers, which are used to indicate that the N blocks corresponding to the N block identifiers do not undergo coordinate system transformation.
- the encoding end can provide reverse indication, that is, the coordinate system information indicates blocks that do not require coordinate system transformation, while other blocks are assumed to require coordinate system transformation, and the target coordinate system of these blocks is pre-defined by the encoding and decoding ends.
- the decoding end can retain the blocks corresponding to the block identifiers in the coordinate system information, while performing coordinate system transformation on the blocks not indicated in the coordinate system information.
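- The reverse indication can be sketched in a few lines: the bitstream lists the blocks that stay in the first coordinate system, and every other block is treated as transformed. The function name and the use of 0-based block indices are assumptions made for illustration.

```python
def blocks_to_inverse_transform(num_blocks, untransformed_ids):
    """Blocks listed in the coordinate system information are kept as they are;
    all remaining blocks are assumed to need the pre-agreed inverse transformation."""
    keep = set(untransformed_ids)
    return [i for i in range(num_blocks) if i not in keep]

print(blocks_to_inverse_transform(5, [0, 3]))
```

This form of signalling is compact when most blocks are transformed, because only the exceptions are listed.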
- Step 503 Obtain the reconstruction data of the three-dimensional mesh based on coordinate information and coordinate system information.
- the reconstruction data includes M sets of reconstruction information, which correspond one-to-one with M blocks and each contains the coordinates of the vertices in the corresponding block in the first coordinate system.
- the decoder transforms the coordinates of vertices in the N blocks in the second coordinate system to coordinates in the first coordinate system.
- the decoding end obtains information about at least one first vertex in the first block to be decoded, as well as the second coordinate system corresponding to at least one first vertex, based on the bitstream; performs coordinate system transformation on the information of at least one first vertex to obtain the transformed information of at least one first vertex in the first coordinate system; and reconstructs at least one first vertex based on the transformed information of at least one first vertex.
- the decoding end can first determine whether a coordinate system transformation is needed for the vertex information in the first patch. If a coordinate system transformation is needed, the second coordinate system is obtained based on the coordinate system identifier.
- the decoding process at the decoding end can be similar to that at the encoding end. First, the stream is decoded to obtain residual information, and then residual compensation is performed based on the residual information to obtain the aforementioned information.
- the encoding and decoding ends can also employ other encoding and decoding methods, which are not specifically limited in this embodiment.
- At least one first vertex of the reconstructed first patch is in the second coordinate system. Therefore, in order to achieve the reconstruction goal, the information of at least one first vertex obtained by decoding the bitstream can be transformed into information in the first coordinate system, i.e., the transformed information of at least one first vertex.
- the transformed information of at least one first vertex may include the following two cases:
- the transformed information of at least one first vertex includes the position coordinates of at least one first vertex in the first coordinate system. That is, the information of at least one first vertex obtained by decoding the bitstream is the position coordinates of at least one first vertex in the second coordinate system, and correspondingly, the transformed information of at least one first vertex after coordinate system transformation is the position coordinates of at least one first vertex in the first coordinate system.
- the transformed information of at least one first vertex includes the position coordinate residual of at least one first vertex in the first coordinate system. That is, the information of at least one first vertex obtained by decoding the bitstream is the position coordinate residual of at least one first vertex in the second coordinate system, and correspondingly, the transformed information of at least one first vertex after coordinate system transformation is the position coordinate residual of at least one first vertex in the first coordinate system.
- transformed information of at least one first vertex may also include other information related to its position coordinates, without any specific limitations.
- the decoding end can reconstruct at least one first vertex based on the position coordinates of at least one first vertex in the first coordinate system. That is, after coordinate system transformation, the position coordinates of at least one first vertex have been transformed from the second coordinate system to the first coordinate system. Therefore, the transformed position coordinates can be directly used as the position coordinates of at least one first vertex to achieve the purpose of reconstructing at least one first vertex.
- the decoding end can reconstruct the position coordinates of at least one first vertex in the first coordinate system based on the position coordinate residual of at least one first vertex in the first coordinate system; then, it can reconstruct at least one first vertex based on those position coordinates. That is, after coordinate system transformation, the position coordinate residual of at least one first vertex has been transformed from the second coordinate system to the first coordinate system. At this point, the position coordinates of at least one first vertex in the first coordinate system are reconstructed from the residual, and the reconstructed position coordinates are then used as the position coordinates of at least one first vertex to achieve the purpose of reconstructing at least one first vertex.
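- The residual case can be illustrated under two assumptions that the text does not state: the second coordinate system differs from the first only by a rotation and a translation, and the residual is a plain difference between a vertex and its prediction. Under those assumptions the translation cancels in the difference, so only the rotation is applied to the residual before it is added to the prediction in the first coordinate system. The names below are hypothetical.

```python
def rotate_to_first(axes, v):
    """Express a vector given in the second system's axes in the first coordinate system."""
    return tuple(sum(v[k] * axes[k][i] for k in range(3)) for i in range(3))

def reconstruct_from_residual(pred_first, residual_second, axes):
    """Transform the residual to the first coordinate system, then compensate."""
    residual_first = rotate_to_first(axes, residual_second)  # no translation for differences
    return tuple(p + r for p, r in zip(pred_first, residual_first))

# Axes of the second coordinate system, written as vectors in the first one.
axes = ((0, 1, 0), (-1, 0, 0), (0, 0, 1))
print(reconstruct_from_residual((5, 5, 5), (1, 2, 0), axes))
```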
- the coordinate system of the vertices in a patch of a 3D mesh can be determined by decoding the stream, and then the coordinate system transformation is performed to obtain the reconstructed information in the world coordinate system, which can improve the compression efficiency of vertices.
- Figure 6 is a schematic diagram of the encoding and decoding framework of an embodiment of this application. As shown in Figure 6, the encoding end includes the following processing steps:
- the encoding end selects the coordinate system for the patch to be encoded in the 3D mesh, and determines the applicable coordinate system for the patch;
- the decoding process includes the following steps:
- the decoding end performs entropy decoding and dequantization on the bitstream to obtain the prediction results of the vertices in the patch to be decoded;
- FIG. 7 is a schematic diagram of the encoding and decoding framework of an embodiment of this application. As shown in Figure 7, the encoding end includes the following processing steps:
- the connectivity of a 3D mesh (i.e., the index values of the three vertices in each triangle)
- EdgeBreaker uses a finite set of state symbols (e.g., "CLERS") to identify each triangle based on its relative position to the set of already encoded triangles, thus completing the traversal encoding of all triangles.
- the encoder selects the coordinate system (local coordinate system) for the vertices of the patch to be encoded in the 3D mesh (in Cartesian coordinates).
- the coordinate system applicable to the patch to be encoded can be written into the bitstream for the decoder to obtain, or it can be obtained by the decoder based on local prior information without being written into the bitstream.
- the patch can be obtained through methods such as mesh segmentation and triangular patch clustering, and it can be a set of vertices.
- the selection method for the coordinate system applicable to the patch to be encoded can include: rate-distortion optimization, vertex position distribution before and after transformation, mesh triangle attributes, or a pre-defined coordinate system consistent with the encoder and decoder.
- rate-distortion optimization can make a decision based on the amount of data (rate) required for encoding the patch under different local coordinate systems and the amount of distortion to be optimized (mesh encoding quality loss).
- the patch to be encoded is transformed into the applicable local coordinate system.
- a local coordinate system is established, which can rely on reference vertices or reference triangles to determine the origin and coordinate axis directions. Then, the vertex information from the Cartesian coordinate system is transformed to this local coordinate system. Coordinate system transformation methods include, but are not limited to, translation, rotation, scaling, and perspective transformations.
- in a local normal coordinate system transformation, the origin can be an encoded vertex or derived from an encoded triangle.
- the local normal coordinate system's axes can be generated from the normal and two tangential components at each vertex, as shown in Figure 3.
- Another example is a local cylindrical coordinate system transformation.
- the origin can be an encoded vertex or derived from an encoded triangle.
- the local cylindrical coordinate system axes can be obtained based on a reference triangle.
- the vertex coordinates (x, y, z) in the Cartesian coordinate system are transformed into the dihedral angle θ, radius r, and height z in the local cylindrical coordinate system, as shown in Figure 4.
- Local coordinate system transformations are not limited to one coordinate system; that is, a 3D mesh can have multiple coordinate system transformations simultaneously.
- Prediction methods in the Cartesian coordinate system can be spatial prediction or spatiotemporal dual prediction based on reference relationships, or differential prediction or neighborhood weighted prediction based on reference information, or a combination of multiple prediction methods. No specific limitations are imposed on these methods.
- the prediction method in the local coordinate system can be the same as the prediction method in the Cartesian coordinate system, or it can be a different prediction method. This application does not make specific limitations on the embodiments.
- the embodiments may calculate the value of the reference vertex or triangle mapped to the coordinate system (the coordinate system where the patch to be encoded is located) as the reference information required for prediction; alternatively, the vertex may be mapped to the coordinate system of the reference vertex or triangle, the prediction result calculated, and the prediction result mapped back to the coordinate system of the vertex; alternatively, no prediction may be performed; alternatively, prediction may be based on prior information, which may be a fixed value or a corresponding reference value calculated from the mesh, etc. No specific limitations are imposed on this.
- Transformation methods can include wavelet transform, discrete cosine transform, etc.
- Quantization of the prediction residual can be performed in various ways, including scalar quantization or vector quantization, without limitation. Furthermore, the quantization operation can apply the same precision and method to patches to be encoded in different coordinate systems, or it can apply different precision and methods to patches to be encoded in different coordinate systems. If the coordinate system selected for the patch to be encoded needs to be obtained by the decoder through the bitstream, then this information can also be used for prediction and entropy coding operations.
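- Uniform scalar quantization with a precision that depends on the coordinate system can be sketched as below. The step sizes and the `STEP` table are made-up illustrations, not values from this application. Python's built-in `round` rounds exact halves to the nearest even integer, which is adequate for a sketch.

```python
# Hypothetical quantization step per coordinate system (smaller step = higher precision).
STEP = {"first": 0.01, "local_cylindrical": 0.005}

def quantize(values, step):
    """Uniform scalar quantization: map each value to an integer level."""
    return [round(v / step) for v in values]

def dequantize(levels, step):
    """Inverse operation used at the decoding end."""
    return [q * step for q in levels]

residuals = [0.123, -0.456]
levels = quantize(residuals, STEP["first"])
recon = dequantize(levels, STEP["first"])
print(levels, recon)
```

The reconstruction error of each value is bounded by half the step size, so the step directly trades bitrate against distortion.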
- the bitstream also includes coordinate system information, which indicates the coordinate system applicable to the patch to be encoded.
- the decoding process is as follows:
- Entropy decoding and dequantization operations correspond to step 4 at the encoding end.
- the decoding end can decode the bitstream to obtain coordinate system information, which identifies the coordinate system of the patch to be decoded.
- the corresponding coordinate system reconstruction method corresponds to step 3 at the encoding end; a method for calculating the reference vertex or triangle is required when the reference vertex or reference triangle and the patch to be decoded are not in the same coordinate system.
- the inverse transformation of the local coordinate system corresponds to step 2 at the encoding end.
- the origin and coordinate axes of the local coordinate system are determined, which requires reference vertices or triangles. These can be vertices or triangles already decoded in the bitstream, or vertices or triangles obtained locally a priori to determine the local coordinate system. Then, the vertex values in the local coordinate system are transformed to the Cartesian coordinate system.
- the decoding end can merge the reconstruction results of vertices from multiple patches to obtain a reconstructed 3D mesh.
- Methods for merging Cartesian and local coordinate systems include: directly using the union of vertices in the Cartesian and local coordinate systems as the reconstruction result of the 3D mesh, and then reordering the mesh vertices according to connectivity; or traversing the mesh vertices according to connectivity and obtaining the reconstruction result of the 3D mesh based on the coordinate system of the patch to be decoded. Additionally, some or all vertices in the 3D mesh can be modified to handle misalignments or discontinuities caused by merging vertices in different coordinate systems. Modification methods can include mesh filtering, hole filling, etc. This application does not specifically limit the merging method.
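- The first merging method (taking the union of reconstructed vertices and reordering them) can be sketched as follows. It assumes every patch has already been transformed back to the first coordinate system and that each patch is represented as a mapping from a global vertex index to coordinates; that representation, the function name, and the rule that the first patch wins for shared vertices are assumptions.

```python
def merge_patches(patches):
    """Union of per-patch reconstructions, ordered by global vertex index."""
    merged = {}
    for patch in patches:
        for idx, xyz in patch.items():
            # A vertex shared by two patches keeps the value from the first patch seen.
            merged.setdefault(idx, xyz)
    return [merged[i] for i in sorted(merged)]

patch_a = {0: (0.0, 0.0, 0.0), 2: (1.0, 1.0, 0.0)}   # e.g. decoded in the first system
patch_b = {1: (1.0, 0.0, 0.0), 2: (1.0, 1.0, 0.1)}   # e.g. inverse-transformed from a local system
print(merge_patches([patch_a, patch_b]))
```

Vertex 2 appears in both patches with slightly different values, which is the kind of misalignment that the filtering step mentioned above is meant to handle.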
- Cartesian coordinates are suitable for compressing vertex data where adjacent vertices have similar motion vectors, but they are less efficient for detailed, wrinkled regions in 3D meshes.
- Local coordinate systems can concentrate the data to be compressed across fewer dimensions, thereby improving coding efficiency.
- vertex quantization errors can affect the establishment of subsequent local coordinate systems, thus reducing the prediction efficiency of subsequent vertices.
- the embodiments of this application can apply appropriate coordinate system transformations to different types of vertices, solving the problem of low compression efficiency in local regions when a single transformation or no transformation is applied to the entire mesh. Compared to existing methods, the bitrate can be reduced by approximately 15% while maintaining the same reconstruction quality.
- Figure 8 is a schematic diagram of the encoding and decoding framework of an embodiment of this application.
- the encoding end first predicts the vertex information in the patch to be encoded in the Cartesian coordinate system, then performs coordinate system transformation on the prediction result, and performs prediction in the local coordinate system.
- local coordinate system prediction may not be included, that is, only local coordinate system transformation is performed, and the transformation result is quantized and entropy encoded.
- the decoding end corresponds to the encoding end. It selects the coordinate system for the entropy-decoded and dequantized values, performs inverse coordinate transformation on the vertices in the patch to be decoded, and reconstructs them in the local coordinate system. Then, it merges the meshes and finally performs reconstruction in the Cartesian coordinate system.
- FIG. 9 is a schematic diagram of the encoding and decoding framework of an embodiment of this application. As shown in Figure 9, the encoding end includes the following processing steps:
- the first patch includes at least one first vertex in the 3D mesh.
- the information of the at least one first vertex is information in the first coordinate system.
- the information of the at least one first vertex includes the position coordinates of the at least one first vertex in the first coordinate system.
- when it is determined that the first patch requires coordinate system transformation, the residual information of at least one first vertex is subjected to coordinate system transformation to obtain the transformed residual information of at least one first vertex in the second coordinate system; or, when it is determined that the first patch does not require coordinate system transformation, the residual information of at least one first vertex is encoded to obtain the bitstream.
- the encoder can predict the transformed residual information of at least one first vertex to obtain the residual information of at least one first vertex; and encode the residual information of at least one first vertex to obtain the bitstream.
- the decoding process includes the following steps:
- the residual information of at least one first vertex in the first patch is information in the first coordinate system.
- the coordinate system transformation is performed on the residual information of at least one first vertex to obtain the transformed residual information of at least one first vertex in the second coordinate system, which is different from the first coordinate system.
- the decoding end can also receive the bitstream; obtain the residual information of at least one second vertex in the second patch to be decoded based on the bitstream, the residual information of at least one second vertex being information in the second coordinate system; when it is determined that the first patch needs to undergo coordinate system transformation, reconstruct the information of at least one first vertex based on the residual information of at least one first vertex; perform coordinate system transformation on the information of at least one first vertex to obtain the transformed information of at least one first vertex in the first coordinate system; and reconstruct at least one first vertex based on the transformed information of at least one first vertex.
- this embodiment reverses the order of local coordinate system transformation and local coordinate system prediction with Cartesian coordinate system prediction. That is, local coordinate system transformation and prediction can be performed on the patch to be encoded, and the prediction result can be inversely transformed back to the Cartesian coordinate system for secondary prediction.
- the decoding end corresponds to the encoding end, performing coordinate system determination on the entropy-decoded and dequantized values, and then performing Cartesian coordinate system reconstruction and local coordinate system transformation on the patch to be decoded.
- the meshes are then merged, and finally, reconstruction and inverse local coordinate system transformation are performed to obtain the reconstructed mesh.
- the encoder can add a quantization step after the first prediction (i.e., Cartesian coordinate system prediction in Figure 8, local coordinate system prediction in Figure 9) and before the coordinate system selection, and the decoder can add an inverse quantization step accordingly.
- FIG. 10 is a schematic structural diagram of the encoding device 1000 of this application. As shown in FIG. 10, the encoding device 1000 of this embodiment can be applied to the above-mentioned encoding end.
- the encoding device 1000 may include: a transceiver module 1001 and an encoding module 1002.
- the transceiver module 1001 is used to acquire the raw data of a three-dimensional mesh, wherein the three-dimensional mesh contains M blocks, and the raw data contains M sets of raw information.
- the M sets of raw information correspond one-to-one with the M blocks and each contains the coordinates of the vertices in the corresponding block in a first coordinate system.
- the first coordinate system is the original coordinate system of the M blocks.
- the encoding module 1002 is used to encode the coordinate information and coordinate system information into the bitstream corresponding to the three-dimensional mesh based on the raw data.
- the coordinate information contains the coordinates of some or all vertices in N blocks of the M blocks, and the coordinate system information indicates whether the current coordinate system of the N blocks is the first coordinate system, where 1 ≤ N < M.
- the coordinate system information indicates that the current coordinate system of the N blocks is a second coordinate system, which is a transformed coordinate system of the first coordinate system.
- the coordinate information includes the coordinates of some or all vertices in the N blocks in the second coordinate system.
- the coordinate system information also indicates that an inverse coordinate transformation should be performed during decoding.
- the encoding module 1002 is further configured to transform the coordinates of the vertices in the N blocks in the first coordinate system to coordinates in the second coordinate system.
- the coordinate system information indicates that the current coordinate system of the N blocks is the first coordinate system, and the coordinate information includes the coordinates of some or all vertices in the N blocks in the first coordinate system.
- the bitstream includes a first syntax structure corresponding to the three-dimensional mesh, and the first syntax structure includes a second syntax structure corresponding to the M blocks; the coordinate system information is located either in the first syntax structure and outside the second syntax structure, or in the second syntax structure corresponding to the N blocks.
- the coordinate system information includes information on N sets of second coordinate systems, and the correspondence between the information on the N sets of second coordinate systems and the N blocks.
- Any set of information on the second coordinate system includes the type of the second coordinate system, the origin information of the second coordinate system, and the coordinate axis information of the second coordinate system.
- the origin information of the second coordinate system includes the coordinate information of the origin in the first coordinate system; or, the origin information of the second coordinate system includes the origin acquisition mode.
- the coordinate axis information of the second coordinate system includes the vectors of the coordinate axes in the first coordinate system; or, the coordinate axis information of the second coordinate system includes the acquisition mode of the coordinate axes.
- the apparatus in this embodiment can be used to execute the technical solution of the method embodiment shown in FIG. 2. Its implementation principle and technical effect are similar, and will not be described again here.
- FIG. 11 is a schematic structural diagram of the decoding device 1100 of this application. As shown in FIG. 11, the decoding device 1100 of this embodiment can be applied to the decoding end mentioned above.
- the decoding device 1100 may include: a transceiver module 1101 and a decoding module 1102.
- the transceiver module 1101 is used to receive the bitstream corresponding to the three-dimensional mesh, the three-dimensional mesh containing M blocks. The decoding module 1102 is used to decode the bitstream to obtain coordinate information and coordinate system information, where the coordinate information contains the coordinates of some or all vertices in N blocks of the M blocks, the coordinate system information indicates whether the coordinate system of the N blocks is a first coordinate system, the first coordinate system is the original coordinate system of the M blocks, and 1 ≤ N < M. The decoding module 1102 is further used to obtain, based on the coordinate information and the coordinate system information, the reconstruction data of the three-dimensional mesh, where the reconstruction data contains M sets of reconstruction information, and the M sets of reconstruction information correspond one-to-one with the M blocks and each contains the coordinates of the vertices in the corresponding block in the first coordinate system.
- the coordinate system information indicates that the current coordinate system of the N blocks is a second coordinate system, which is a transformed coordinate system of the first coordinate system.
- the coordinate information includes the coordinates of some or all vertices in the N blocks in the second coordinate system.
- the coordinate system information also indicates that an inverse coordinate transformation should be performed during decoding.
- the decoding module 1102 is further configured to transform the coordinates of the vertices in the N blocks in the second coordinate system to the coordinates in the first coordinate system.
- the coordinate system information indicates that the current coordinate system of the N blocks is the first coordinate system, and the coordinate information includes the coordinates of some or all vertices in the N blocks in the first coordinate system.
- the bitstream includes a first syntax structure corresponding to the three-dimensional mesh, and the first syntax structure includes a second syntax structure corresponding to the M blocks; the coordinate system information is located either in the first syntax structure and outside the second syntax structure, or in the second syntax structure corresponding to the N blocks.
- the coordinate system information includes information on N sets of second coordinate systems, and the correspondence between the information on the N sets of second coordinate systems and the N blocks.
- Any set of information on the second coordinate system includes the type of the second coordinate system, the origin information of the second coordinate system, and the coordinate axis information of the second coordinate system.
- the origin information of the second coordinate system includes the coordinate information of the origin in the first coordinate system; or, the origin information of the second coordinate system includes the origin acquisition mode.
- the coordinate axis information of the second coordinate system includes the vectors of the coordinate axes in the first coordinate system; or, the coordinate axis information of the second coordinate system includes the acquisition mode of the coordinate axes.
- the apparatus in this embodiment can be used to execute the technical solution of the method embodiment shown in FIG. 5. Its implementation principle and technical effect are similar, and will not be described again here.
- FIG. 12 is a schematic structural diagram of the electronic device 1200 provided in this application.
- the electronic device 1200 may include a processor 1201 and a transceiver circuit 1202. Optionally, it may also include a memory 1203.
- bus 1204 includes not only a data bus but also a power bus, a control bus, and a status signal bus. However, for clarity, all buses are referred to as bus 1204 in the figure.
- the memory 1203 can be used to store the instructions in the above method embodiments.
- the processor 1201 can be used to execute instructions in the memory 1203, control the transceiver circuit 1202 to receive signals, and control the transceiver circuit 1202 to send signals.
- Electronic device 1200 can be the electronic device at the encoding/decoding end in the above method embodiment or a chip in an electronic device.
- each step of the above method embodiments can be completed by integrated logic circuits in the processor hardware or by instructions in software form.
- the processor can be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or other programmable logic devices, discrete gate or transistor logic devices, or discrete hardware components.
- a general-purpose processor can be a microprocessor or any conventional processor.
- the steps of the method disclosed in this application can be directly implemented by a hardware encoding processor, or by a combination of hardware and software modules in the encoding processor.
- the software modules can reside in random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, registers, or other storage media well established in the art.
- This storage medium is located in memory; the processor reads information from the memory and, in conjunction with its hardware, completes the steps of the above method.
- the memory mentioned in the above embodiments can be volatile memory or non-volatile memory, or may include both.
- the non-volatile memory can be read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), or flash memory.
- the volatile memory can be random access memory (RAM), which is used as an external cache.
- forms of RAM include synchronous dynamic random access memory (SDRAM), double data rate synchronous dynamic random access memory (DDR SDRAM), enhanced synchronous dynamic random access memory (ESDRAM), synchronous linked dynamic random access memory (SLDRAM), and direct rambus RAM (DR RAM).
- the disclosed systems, apparatuses, and methods can be implemented in other ways.
- the apparatus embodiments described above are merely illustrative; for instance, the division of units is only a logical functional division, and in actual implementation, there may be other division methods.
- multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed.
- the coupling or direct coupling or communication connection shown or discussed may be through some interfaces; the indirect coupling or communication connection between apparatuses or units may be electrical, mechanical, or other forms.
- the units described as separate components may or may not be physically separate.
- the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units can be selected to achieve the purpose of this embodiment according to actual needs.
- the functional units in the various embodiments of this application can be integrated into one processing unit, or each unit can exist physically separately, or two or more units can be integrated into one unit.
- if the aforementioned functions are implemented as software functional units and sold or used as independent products, they can be stored in a computer-readable storage medium.
- This computer software product is stored in a storage medium and includes several instructions to cause a computer device (personal computer, server, or network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of this application.
- the aforementioned storage medium includes various media capable of storing program code, such as USB flash drives, portable hard drives, read-only memory (ROM), random access memory (RAM), magnetic disks, or optical disks.
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Physics & Mathematics (AREA)
- Computer Graphics (AREA)
- Geometry (AREA)
- Software Systems (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
- Image Generation (AREA)
Abstract
Description
This application relates to 3D technology, and in particular to an encoding and decoding method and apparatus.
Three-dimensional (3D) meshes (for example, dynamic meshes) can be used to represent volumetric video, digital humans, computer graphics (CG) content, and the like, and are characterized by each frame containing one mesh. The raw data volume of a 3D mesh is large: without compression, its bitrate typically exceeds 3 Gbps, which puts great pressure on transmission. Efficient encoding and decoding techniques are therefore needed to improve the storage or transmission efficiency of 3D meshes.
This application provides an encoding and decoding method and apparatus to improve the compression efficiency of vertices.
In a first aspect, an embodiment of this application provides an encoding method, including: acquiring raw data of a three-dimensional mesh, where the three-dimensional mesh contains M blocks, the raw data contains M sets of raw information, the M sets of raw information correspond one-to-one with the M blocks and each contains the coordinates, in a first coordinate system, of the vertices in the corresponding block, and the first coordinate system is the original coordinate system of the M blocks; and encoding, based on the raw data, coordinate information and coordinate system information into a bitstream corresponding to the three-dimensional mesh, where the coordinate information contains the coordinates of some or all vertices in N blocks of the M blocks, the coordinate system information indicates whether the current coordinate system of the N blocks is the first coordinate system, and 1 ≤ N < M.
In this embodiment, whether to apply a coordinate system transformation to the vertex information is determined separately for each of the multiple blocks in the three-dimensional mesh, and the optimal coordinate system is selected for each block to be transformed, so that the vertex information in that block is placed in the corresponding coordinate system for subsequent compression. This can improve the compression efficiency of the vertices.
In this embodiment, the three-dimensional mesh contains M blocks and the raw data contains M sets of raw information. The M sets of raw information correspond one-to-one with the M blocks, and each contains the coordinates, in a first coordinate system, of the vertices in the corresponding block. The first coordinate system is the original coordinate system of the M blocks. Optionally, the first coordinate system can be a Cartesian coordinate system, a local normal coordinate system, or a local cylindrical coordinate system.
As mentioned above, a three-dimensional mesh is characterized by one frame containing one mesh. A mesh can include multiple faces, which can be polygons (for example, triangles), and each polygon is defined by its vertices in three-dimensional space and by information about how those vertices are connected (called connectivity information). The data of a three-dimensional mesh can include multiple vertex coordinates and multiple connectivity relationships (a connectivity relationship can also be understood as a topological structure, and multiple connectivity relationships can be called a connectivity set), where:
Vertex coordinates can be expressed as three-dimensional coordinates (for example, (x, y, z)) or as spherical coordinates, and indicate the position of a vertex in 3D space. In addition, vertex coordinates can be expressed in a world coordinate system (for example, a Cartesian coordinate system) or in a local coordinate system (for example, a local normal coordinate system or a local cylindrical coordinate system).
A connectivity relationship can be represented by the vertices associated with it. For example, (1, 2, 3) represents the connection from vertex 1 to vertex 2, then to vertex 3, and then back to vertex 1. Alternatively, it can be represented by faces (for example, triangular faces or polygonal faces). For example, a triangular face has an orientation: 1, 2, 3 represents 1->2, 2->3, 3->1, or represents that the triangular face consists of the three vertices 1, 2, and 3. As another example, it can be represented by edges: 4 denotes the edge 1->2, 5 denotes the edge 2->3, and 6 denotes the edge 3->1. It should be understood that the foregoing is merely an example of how connectivity relationships can be represented, and the embodiments of this application do not specifically limit the way connectivity relationships are represented.
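The representation described above can be illustrated with a minimal sketch. The names `TriangleMesh` and `face_edges` are hypothetical and do not come from this application; the sketch only assumes triangular faces stored as oriented vertex identifiers, as in the (1, 2, 3) example.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Vertex = Tuple[float, float, float]   # (x, y, z) position in 3D space
Face = Tuple[int, int, int]           # oriented triple of vertex identifiers


@dataclass
class TriangleMesh:
    """Hypothetical container: vertex coordinates plus a connectivity set."""
    vertices: Dict[int, Vertex]
    faces: List[Face]


def face_edges(face: Face) -> List[Tuple[int, int]]:
    """Directed edges of an oriented triangle: (a, b, c) -> a->b, b->c, c->a."""
    a, b, c = face
    return [(a, b), (b, c), (c, a)]


mesh = TriangleMesh(
    vertices={1: (0.0, 0.0, 0.0), 2: (1.0, 0.0, 0.0), 3: (0.0, 1.0, 0.0)},
    faces=[(1, 2, 3)],
)
edges = face_edges(mesh.faces[0])  # [(1, 2), (2, 3), (3, 1)]
```

The face-based and edge-based views carry the same connectivity; which one a codec stores is a design choice that this application leaves open.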
In the embodiments of this application, a block (patch) may also be referred to as a region, a spatial block, and so on; this is not specifically limited. Any block (for example, a first block, also called a first patch) may include at least one vertex (for example, a first vertex) of the three-dimensional mesh. That is, the first patch includes one vertex of the three-dimensional mesh, or the first patch includes some (multiple) vertices of the three-dimensional mesh.
In one possible implementation, the bitstream contains a first syntax structure corresponding to the three-dimensional mesh, and the first syntax structure contains a second syntax structure corresponding to the M blocks. The coordinate system information is located either in the first syntax structure and outside the second syntax structure, or in the second syntax structure corresponding to the N blocks.
That is, the bitstream corresponding to the three-dimensional mesh contains two kinds of syntax elements: syntax elements corresponding to the three-dimensional mesh (mesh syntax elements, corresponding to the first syntax structure above) and syntax elements corresponding to the blocks (block syntax elements, corresponding to the second syntax structure above). The coordinate system information in this embodiment can be written into either the first syntax structure or the second syntax structure; this is not specifically limited.
In one possible implementation, the coordinate system information indicates that the current coordinate system of the N blocks is a second coordinate system, which is a transformed coordinate system of the first coordinate system, and the coordinate information contains the coordinates of some or all vertices in the N blocks in the second coordinate system. Optionally, the second coordinate system can be a Cartesian coordinate system, a local normal coordinate system, or a local cylindrical coordinate system. In addition, the coordinate system information can also indicate that an inverse coordinate transformation should be performed during decoding.
The coordinate system information can contain information on N sets of target coordinate systems, together with the correspondence between these N sets of information and the N blocks. Each set of target coordinate system information contains the type of the target coordinate system, the origin information of the target coordinate system, and the coordinate axis information of the target coordinate system.
In this embodiment, N of the M blocks contained in the three-dimensional mesh require coordinate system transformation. That is, for any one of the N blocks, the encoding end can transform the coordinate information of the vertices it contains from the first coordinate system to the target coordinate system. So that the decoding end can follow the same coordinate transformation, the encoding end can write coordinate system information into the bitstream. This coordinate system information contains the type of the target coordinate system, the origin information of the target coordinate system, and the coordinate axis information of the target coordinate system, where:
Optionally, the type of the target coordinate system includes the first coordinate system or the second coordinate system. For example, identifier 1 represents the first coordinate system and identifier 2 represents the second coordinate system; or identifier 0 represents the first coordinate system and identifier 1 represents the second coordinate system. This is not specifically limited.
Optionally, the origin information of the target coordinate system contains the coordinates of the origin in the first coordinate system; or the origin information of the target coordinate system contains an origin acquisition mode. The encoding end can write the coordinates of the origin in the first coordinate system directly into the bitstream, for example (0, 1, 0); or the encoding end can write the origin acquisition mode into the bitstream, for example, identifier 0 means the origin is the previous vertex and identifier 1 means the origin is the midpoint of the previous two vertices. This is not specifically limited.
Optionally, the coordinate axis information of the target coordinate system contains the vectors of the coordinate axes in the first coordinate system; or the coordinate axis information of the target coordinate system contains a coordinate axis acquisition mode. The encoding end can write the vectors of the coordinate axes in the first coordinate system directly into the bitstream, for example, (1, 0, 0) for the x-axis, (0, 1, 0) for the y-axis, and (0, 0, 1) for the z-axis; or the encoding end can write the coordinate axis acquisition mode into the bitstream, for example, identifier 1 means the default x-axis (1, 0, 0), y-axis (0, 1, 0), and z-axis (0, 0, 1), and identifier 2 means rotation by a preset angle. Optionally, the preset angle can also be written into the bitstream.
Optionally, the origin and coordinate axes of the target coordinate system can also be obtained in a manner agreed in advance between the encoding end and the decoding end. For example, the default origin is the previous vertex, the default x-axis is (1, 0, 0), the default y-axis is (0, 1, 0), and the default z-axis is (0, 0, 1).
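The signalling options for the origin and the coordinate axes can be sketched as follows. This is only an illustration under stated assumptions: the class `CoordSystemInfo`, its field names, and the functions `resolve_origin` and `resolve_axes` are hypothetical; the mode values follow the examples in the text (origin mode 0 = previous vertex, origin mode 1 = midpoint of the previous two vertices, axis mode 1 = default axes). The rotation mode (identifier 2) is not modelled because the text does not specify the rotation axis.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Vec3 = Tuple[float, float, float]
Axes = Tuple[Vec3, Vec3, Vec3]

# Pre-agreed default axes from the text: x, y, z unit vectors.
DEFAULT_AXES: Axes = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))


@dataclass
class CoordSystemInfo:
    """Hypothetical per-block record of target coordinate system information."""
    system_type: int                   # e.g. 0 = first, 1 = second coordinate system
    origin: Optional[Vec3] = None      # explicit origin in the first coordinate system
    origin_mode: Optional[int] = None  # 0 = previous vertex, 1 = midpoint of previous two
    axes: Optional[Axes] = None        # explicit axis vectors in the first coordinate system
    axes_mode: Optional[int] = None    # 1 = default axes (other modes not modelled here)


def resolve_origin(info: CoordSystemInfo, previous: List[Vec3]) -> Vec3:
    """Return the origin, preferring an explicit value over an acquisition mode."""
    if info.origin is not None:
        return info.origin
    if info.origin_mode == 1:
        a, b = previous[-2], previous[-1]
        return tuple((x + y) / 2.0 for x, y in zip(a, b))
    # Mode 0, or nothing signalled: fall back to the pre-agreed default.
    return previous[-1]


def resolve_axes(info: CoordSystemInfo) -> Axes:
    """Return the axes, preferring explicit vectors over an acquisition mode."""
    if info.axes is not None:
        return info.axes
    if info.axes_mode in (None, 1):
        return DEFAULT_AXES
    raise ValueError("axis acquisition mode not modelled in this sketch")
```

For instance, `resolve_origin(CoordSystemInfo(1, origin_mode=1), [(0.0, 0.0, 0.0), (2.0, 4.0, 6.0)])` yields the midpoint `(1.0, 2.0, 3.0)` without any origin coordinates being carried in the bitstream.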
In one possible implementation, the coordinate system information indicates that the current coordinate system of the N blocks is the first coordinate system, and the coordinate information contains the coordinates of some or all vertices in the N blocks in the first coordinate system.
The coordinate system information can contain N block identifiers, which indicate that the N blocks corresponding to these identifiers do not undergo coordinate system transformation.
In this embodiment, the encoding end can signal in reverse: the coordinate system information identifies the blocks that do not undergo coordinate system transformation, while all other blocks undergo coordinate system transformation by default, with target coordinate systems agreed in advance between the encoding end and the decoding end. The decoding end can then leave unchanged the blocks whose identifiers appear in the coordinate system information, and apply the coordinate system transformation to the blocks that are not indicated.
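The reverse indication can be sketched on the decoding side as a simple set difference. The function name `blocks_to_transform` and the use of integer block identifiers are assumptions made for illustration only.

```python
from typing import Iterable, List, Set


def blocks_to_transform(all_block_ids: Iterable[int],
                        signalled_keep_ids: Set[int]) -> List[int]:
    """Blocks not listed in the coordinate system information are transformed.

    signalled_keep_ids holds the N block identifiers carried in the bitstream,
    i.e. the blocks that stay in the first coordinate system.
    """
    return [b for b in all_block_ids if b not in signalled_keep_ids]


# M = 5 blocks, N = 2 identifiers signalled as "do not transform".
pending = blocks_to_transform(range(5), {1, 3})  # [0, 2, 4]
```

This form of signalling is compact when most blocks are transformed, since only the exceptions need to be listed.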
In one possible implementation, the encoding end can transform the coordinates of the vertices in the N blocks from the first coordinate system to the second coordinate system.
The encoding end determines whether there are transformation blocks among the multiple blocks; when there are, it obtains the target coordinate system of each of the at least one transformation block, and performs coordinate system transformation on the vertex information in each transformation block according to that block's target coordinate system, to obtain at least one transformed block.
A transformation block is a block, among the multiple blocks, whose vertex information is to undergo coordinate system transformation.
In this embodiment, the encoding end can determine whether there are transformation blocks among the multiple blocks by methods such as rate-distortion optimization, the vertex distribution before and after coordinate system transformation, the triangle attributes of the faces included in the three-dimensional mesh, or presetting. The encoding end can perform this operation for any block; that is, for the multiple blocks in the three-dimensional mesh, it determines block by block whether each block is a transformation block and, once a block is determined to be a transformation block, obtains its target coordinate system. Accordingly, the multiple blocks may contain no transformation block, at least one transformation block, or only transformation blocks. Each transformation block obtains one target coordinate system, and the target coordinate systems of different transformation blocks may be the same or different.
For example, the encoding end makes the determination with a rate-distortion function. That is, for any one of the multiple blocks, it calculates the weighted value of the bitrate and the distortion loss of the vertex information in the block in the first coordinate system and in each of the other coordinate systems (for example, local coordinate systems), determines from the bitrate and the weighted value whether to perform coordinate system transformation on the vertex information in the block, and selects the optimal coordinate system among the other coordinate systems as the target coordinate system of the block. As another example, the vertex-distribution method can compute histogram statistics of the vertices before and after transformation, determine from a comparison of the histograms whether to perform coordinate system transformation on the vertex information in the block, and select the coordinate system with the more concentrated distribution among the other coordinate systems as the target coordinate system of the block.
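The two decision criteria can be sketched together. The text does not give the exact weighting, so the cost form `rate + lam * distortion` is an assumption (a common Lagrangian form), as are the function names. The entropy helper stands in for the histogram comparison: a more concentrated histogram of quantized values gives a lower estimated bit count.

```python
import math
from collections import Counter
from typing import Dict, Sequence, Tuple


def estimated_bits(symbols: Sequence[int]) -> float:
    """Total bits under the empirical zeroth-order entropy of the symbols.

    Used here as a rough proxy for the rate of quantized coordinates; a more
    concentrated histogram yields a smaller value.
    """
    counts = Counter(symbols)
    n = len(symbols)
    return -sum(c * math.log2(c / n) for c in counts.values())


def select_coordinate_system(candidates: Dict[str, Tuple[float, float]],
                             lam: float) -> str:
    """Pick the candidate minimising rate + lam * distortion (assumed cost form).

    candidates maps a coordinate system name to (rate_bits, distortion).
    """
    return min(candidates,
               key=lambda name: candidates[name][0] + lam * candidates[name][1])


# Hypothetical measurements for one block.
block_costs = {"cartesian": (100.0, 1.0), "local_normal": (80.0, 1.5)}
choice = select_coordinate_system(block_costs, lam=10.0)  # "local_normal"
```

If the selected name is the first coordinate system, the block is not a transformation block; otherwise the selected system becomes its target coordinate system.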
It can be seen that the purpose of determining whether the multiple blocks include a transform block is to find those blocks (that is, the transform blocks) that have a better compression ratio and bit rate in the target coordinate system than in the first coordinate system. In other words, compared with the first coordinate system, representing the vertex information in the target coordinate system can improve the compression efficiency of the vertices of the transform block.
In one possible implementation, to improve decoding efficiency at the decoding end, the encoding end can set coordinate system information for a transform block. The coordinate system information can include a coordinate system identifier that indicates the target coordinate system of the transform block. For example, a coordinate system identifier of 1 indicates a local normal coordinate system, and a coordinate system identifier of 0 indicates a local cylindrical coordinate system. In addition, the coordinate system information can include a transform flag that indicates whether coordinate system transformation is to be performed on the vertex information in the block. For example, a transform flag of 1 indicates that coordinate system transformation is to be performed on the vertex information in the block, and a transform flag of 0 indicates that it is not.
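As a non-limiting illustration, the two indicators described above can be modeled as a small record. The identifier values (1 for local normal, 0 for local cylindrical) follow the example in the text; the class name `CoordSysInfo`, the helper names, and the two-bit packing are hypothetical and are not a syntax defined by this application.

```python
from dataclasses import dataclass

LOCAL_CYLINDRICAL = 0  # example identifier value from the text
LOCAL_NORMAL = 1       # example identifier value from the text


@dataclass(frozen=True)
class CoordSysInfo:
    transform_flag: int                 # 1: transform the block, 0: do not
    system_id: int = LOCAL_CYLINDRICAL  # meaningful only when transform_flag == 1


def pack(info: CoordSysInfo) -> int:
    """Pack into two bits (hypothetical layout): [transform_flag][system_id]."""
    system_bit = info.system_id if info.transform_flag else 0
    return (info.transform_flag << 1) | system_bit


def unpack(bits: int) -> CoordSysInfo:
    """Inverse of pack(); the system id is ignored when the flag is 0."""
    flag = (bits >> 1) & 1
    return CoordSysInfo(flag, (bits & 1) if flag else LOCAL_CYLINDRICAL)
```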
The vertex information in any transformed block is information in the corresponding target coordinate system. That is, for any transform block, once its target coordinate system has been determined, coordinate system transformation can be performed on the information of at least one vertex in the transform block to obtain the transformed information of the at least one vertex in the target coordinate system; the result is the transformed block.
In one possible implementation, the encoding end can first establish the target coordinate system, determining its origin and coordinate axis directions from a reference vertex or a reference triangle face, and then transform the information in the first coordinate system into the target coordinate system. Transformation methods include, but are not limited to, translation, rotation, scaling, and perspective transformation. For example, when the target coordinate system is a local normal coordinate system, its origin can be an already-encoded vertex or can be derived from an already-encoded triangle face, and its coordinate axes can be generated from the normal and two tangential components at each vertex. As another example, when the target coordinate system is a local cylindrical coordinate system, its origin can be an already-encoded vertex or can be derived from an already-encoded triangle face, and its coordinate axes can be obtained from a reference triangle; the vertex coordinates (x, y, z) in the Cartesian coordinate system are then transformed into the dihedral angle θ, radius r, and height z in the local cylindrical coordinate system.
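As a non-limiting illustration of the local cylindrical case, the following Python sketch converts a Cartesian vertex into (θ, r, z) relative to a local frame and back. It assumes the origin and three orthonormal axes have already been derived (for example, from a reference triangle); how they are derived is not shown, and the function names are hypothetical.

```python
import math
from typing import Tuple

Vec3 = Tuple[float, float, float]


def _sub(a: Vec3, b: Vec3) -> Vec3:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def to_local_cylindrical(p: Vec3, origin: Vec3, x_axis: Vec3, y_axis: Vec3, z_axis: Vec3) -> Vec3:
    """Cartesian (x, y, z) -> (theta, r, z) in the local frame (axes assumed orthonormal)."""
    d = _sub(p, origin)                      # translate to the local origin
    lx, ly, lz = _dot(d, x_axis), _dot(d, y_axis), _dot(d, z_axis)
    return (math.atan2(ly, lx), math.hypot(lx, ly), lz)


def from_local_cylindrical(c: Vec3, origin: Vec3, x_axis: Vec3, y_axis: Vec3, z_axis: Vec3) -> Vec3:
    """Inverse transform: (theta, r, z) in the local frame -> Cartesian (x, y, z)."""
    theta, r, h = c
    lx, ly = r * math.cos(theta), r * math.sin(theta)
    return tuple(
        origin[i] + lx * x_axis[i] + ly * y_axis[i] + h * z_axis[i] for i in range(3)
    )
```

The inverse function is what a decoding end would apply to return such a vertex to the first coordinate system.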
It should be noted that, in the embodiments of this application, the multiple vertices in the same patch use the same type of local coordinate system. When coordinate system transformation is performed, however, different local coordinate systems can be created for different vertices: one or more vertices can be transformed into the same local coordinate system, or different vertices can be transformed into different local coordinate systems. This is not specifically limited.
In the embodiments of this application, the encoding end can perform prediction and entropy coding on at least one transformed block to obtain the bitstream. For any transformed block, the transformed information of at least one vertex in the block can be predicted to obtain residual information of the at least one vertex, and the residual information of the at least one vertex is then encoded to obtain the bitstream.
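As a non-limiting illustration of the prediction step, the following Python sketch uses a simple previous-vertex (delta) predictor on integer coordinates. The choice of predictor is an assumption made for illustration; the application does not fix a particular prediction method, and entropy coding of the residuals is omitted.

```python
from typing import List, Tuple

IntVec3 = Tuple[int, int, int]


def compute_residuals(vertices: List[IntVec3]) -> List[IntVec3]:
    """Predict each vertex from the previous one and keep the difference."""
    prev = (0, 0, 0)  # the first vertex is predicted from the zero vector
    residuals = []
    for v in vertices:
        residuals.append(tuple(a - b for a, b in zip(v, prev)))
        prev = v
    return residuals


def reconstruct_vertices(residuals: List[IntVec3]) -> List[IntVec3]:
    """Decoder-side inverse: accumulate residuals to recover the vertices."""
    prev = (0, 0, 0)
    vertices = []
    for r in residuals:
        prev = tuple(a + b for a, b in zip(r, prev))
        vertices.append(prev)
    return vertices
```

When the vertices of a block are concentrated in the chosen coordinate system, residuals of this kind tend to be small, which is what makes the subsequent entropy coding cheaper.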
In one possible implementation, the coordinate system information can be carried in the patch data unit syntax of the bitstream. One approach is to add new bytes to the syntax; another is to use reserved fields in the syntax. The embodiments of this application do not specifically limit this.
In one possible implementation, the method further includes: obtaining a non-transform block, where a non-transform block is a block among the multiple blocks whose vertex information is not subjected to coordinate system transformation; and performing encoding based on the non-transform block to obtain the bitstream.
A non-transform block can be a patch in the three-dimensional mesh that is different from the transform blocks. Compared with a local coordinate system (corresponding to the target coordinate system), a non-transform block is better suited to the world coordinate system (corresponding to the first coordinate system). That is, the vertices in the three-dimensional mesh may be suited to different local coordinate systems or to the world coordinate system. Based on the characteristics of the vertices in the three-dimensional mesh, the vertices can be assigned to different patches in the most suitable way, so that the same type of coordinate system transformation is applied to the vertices in the same patch. In this way, the vertex information in each patch uses the most suitable coordinate system, which can improve the overall compression ratio of the vertices of the three-dimensional mesh.
In a second aspect, embodiments of this application provide a decoding method, including: receiving a bitstream corresponding to a three-dimensional mesh, the three-dimensional mesh containing M blocks; decoding the bitstream to obtain coordinate information and coordinate system information, the coordinate information containing the coordinates of some or all vertices in N blocks among the M blocks, the coordinate system information indicating whether the coordinate system of the N blocks is a first coordinate system, the first coordinate system being the original coordinate system of the M blocks, 1≤N<M; and obtaining reconstruction data of the three-dimensional mesh based on the coordinate information and the coordinate system information, the reconstruction data containing M sets of reconstruction information, the M sets of reconstruction information corresponding one-to-one with the M blocks and each containing the coordinates of the vertices in the corresponding block in the first coordinate system.
In the embodiments of this application, the coordinate system in which the vertex information of a block of the three-dimensional mesh is expressed can be determined by decoding the bitstream, and coordinate system transformation is then performed to obtain the reconstruction information in the world coordinate system, which can improve the compression efficiency of the vertices.
The three-dimensional mesh contains M blocks. In the embodiments of this application, the coordinate information contains the coordinates of some or all vertices in N blocks among the M blocks, and the coordinate system information indicates whether the coordinate system of the N blocks is the first coordinate system, where the first coordinate system is the original coordinate system of the M blocks and 1≤N<M. Optionally, the first coordinate system can be a Cartesian coordinate system, a local normal coordinate system, or a local cylindrical coordinate system.
In one possible implementation, the bitstream contains a first syntax structure corresponding to the three-dimensional mesh, and the first syntax structure contains second syntax structures corresponding to the M blocks. The coordinate system information is located in the first syntax structure and outside the second syntax structures, or the coordinate system information is located in the second syntax structures corresponding to the N blocks.
That is, the bitstream corresponding to the three-dimensional mesh contains two kinds of syntax elements: syntax elements corresponding to the three-dimensional mesh (mesh syntax elements, corresponding to the first syntax structure above) and syntax elements corresponding to a block (block syntax elements, corresponding to the second syntax structure above). The coordinate system information in the embodiments of this application can be written into either the first syntax structure or the second syntax structure; this is not specifically limited.
In one possible implementation, the coordinate system information indicates that the current coordinate system of the N blocks is a second coordinate system, where the second coordinate system is a coordinate system obtained by transforming the first coordinate system, and the coordinate information contains the coordinates of some or all vertices in the N blocks in the second coordinate system. Optionally, the second coordinate system can be a Cartesian coordinate system, a local normal coordinate system, or a local cylindrical coordinate system. In addition, the coordinate system information can indicate that an inverse coordinate transformation is to be performed during decoding.
The coordinate system information can contain N sets of target coordinate system information and the correspondence between the N sets of target coordinate system information and the N blocks. Each set of target coordinate system information contains the type of the target coordinate system, the origin information of the target coordinate system, and the coordinate axis information of the target coordinate system.
In the embodiments of this application, when the coordinate system information indicates the second coordinate system corresponding to N blocks among the M blocks, the encoding end has transformed the coordinate information of the vertices in the N blocks from the first coordinate system to the second coordinate system. Correspondingly, the decoding end needs to transform the coordinate information of the vertices in the N blocks from the second coordinate system back to the first coordinate system. The decoding end can do so based on the coordinate system information, in which the target coordinate system information contains the type of the target coordinate system, the origin information of the target coordinate system, and the coordinate axis information of the target coordinate system, as described below.
Optionally, the type of the target coordinate system is either the first coordinate system or the second coordinate system. For example, identifier 1 represents the first coordinate system and identifier 2 represents the second coordinate system; or identifier 0 represents the first coordinate system and identifier 1 represents the second coordinate system. This is not specifically limited. Optionally, the first coordinate system can be a Cartesian coordinate system, a local normal coordinate system, or a local cylindrical coordinate system.
When the decoding end recognizes that the identifier of the type of the target coordinate system is 2, it can determine that the coordinate information of the vertices in the block is to be transformed from the second coordinate system back to the first coordinate system.
Optionally, the origin information of the target coordinate system contains the coordinates of the origin in the first coordinate system, or the origin information of the target coordinate system contains an origin acquisition mode. The encoding end can write the coordinates of the origin in the first coordinate system directly into the bitstream, for example (0,1,0). Alternatively, the encoding end can write the origin acquisition mode into the bitstream; for example, identifier 0 indicates that the origin is the previous vertex, and identifier 1 indicates that the origin is the midpoint of the previous two vertices. This is not specifically limited.
The decoding end can use the point at (0,1,0) in the first coordinate system as the origin of the second coordinate system, or use the previously processed vertex as the origin of the second coordinate system, or compute the midpoint of the previous two vertices and use that midpoint as the origin of the second coordinate system, and so on. This is not specifically limited.
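As a non-limiting illustration of the origin handling described above, the following Python sketch resolves the origin of the second coordinate system either from explicit coordinates or from an acquisition mode. The mode values 0 and 1 follow the example in the text; the function name `resolve_origin` and its parameters are hypothetical.

```python
from typing import List, Optional, Tuple

Vec3 = Tuple[float, float, float]


def resolve_origin(
    decoded_vertices: List[Vec3],
    explicit_origin: Optional[Vec3] = None,
    mode: Optional[int] = None,
) -> Vec3:
    """Return the origin of the second coordinate system, expressed in the first.

    explicit_origin: origin coordinates read directly from the bitstream, if any.
    mode 0: the previous (most recently processed) vertex.
    mode 1: the midpoint of the previous two vertices.
    """
    if explicit_origin is not None:
        return explicit_origin
    if mode == 0:
        return decoded_vertices[-1]
    if mode == 1:
        a, b = decoded_vertices[-2], decoded_vertices[-1]
        return tuple((x + y) / 2 for x, y in zip(a, b))
    raise ValueError("unsupported origin acquisition mode")
```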
Optionally, the coordinate axis information of the target coordinate system contains the vectors of the coordinate axes in the first coordinate system, or the coordinate axis information of the target coordinate system contains a coordinate axis acquisition mode. The encoding end can write the vectors of the coordinate axes in the first coordinate system directly into the bitstream; for example, (1,0,0) represents the x-axis, (0,1,0) represents the y-axis, and (0,0,1) represents the z-axis. Alternatively, the encoding end can write the coordinate axis acquisition mode into the bitstream; for example, identifier 1 represents the default x-axis (1,0,0), y-axis (0,1,0), and z-axis (0,0,1), and identifier 2 represents rotation by a preset angle. Optionally, the preset angle can also be written into the bitstream.
The decoding end can use the vector (1,0,0) in the first coordinate system as the x-axis of the second coordinate system, the vector (0,1,0) in the first coordinate system as the y-axis of the second coordinate system, and the vector (0,0,1) in the first coordinate system as the z-axis of the second coordinate system, or it can rotate the first coordinate system by the rotation angle.
Optionally, the origin and coordinate axes of the target coordinate system can also be obtained in a manner agreed in advance between the encoding end and the decoding end. For example, the default origin is the previous vertex, the default x-axis is (1,0,0), the default y-axis is (0,1,0), and the default z-axis is (0,0,1).
Based on the above target coordinate system information, the decoding end can construct the second coordinate system corresponding to a vertex and then, based on that second coordinate system, transform the coordinate information of the vertex from the second coordinate system to the first coordinate system.
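As a non-limiting illustration, the following Python sketch builds the axes of the second coordinate system from an acquisition mode and maps a vertex from the second coordinate system back to the first. Mode values 1 and 2 follow the example in the text. Rotating about the z-axis in mode 2 is an assumption made for illustration, since the text does not specify the rotation axis, and the function names are hypothetical.

```python
import math
from typing import Tuple

Vec3 = Tuple[float, float, float]
Axes = Tuple[Vec3, Vec3, Vec3]


def axes_from_mode(mode: int, angle_rad: float = 0.0) -> Axes:
    """Return the (x, y, z) axes of the second system as vectors in the first system."""
    if mode == 1:  # default axes
        return ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))
    if mode == 2:  # rotate by a preset angle (about z here, by assumption)
        c, s = math.cos(angle_rad), math.sin(angle_rad)
        return ((c, s, 0.0), (-s, c, 0.0), (0.0, 0.0, 1.0))
    raise ValueError("unsupported axis acquisition mode")


def to_first_system(local: Vec3, origin: Vec3, axes: Axes) -> Vec3:
    """Inverse transform: p = origin + x * x_axis + y * y_axis + z * z_axis."""
    return tuple(
        origin[i] + sum(local[k] * axes[k][i] for k in range(3)) for i in range(3)
    )
```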
In one possible implementation, the coordinate system information indicates that the current coordinate system of the N blocks is the first coordinate system, and the coordinate information contains the coordinates of some or all vertices in the N blocks in the first coordinate system.
The coordinate system information can contain N block identifiers, which indicate that the N blocks corresponding to the N block identifiers are not subjected to coordinate system transformation.
In the embodiments of this application, the encoding end can use reverse indication: the coordinate system information indicates the blocks that are not subjected to coordinate system transformation, while all other blocks are subjected to coordinate system transformation by default, and the target coordinate systems of those other blocks are agreed in advance between the encoding end and the decoding end. The decoding end can then leave unchanged the blocks corresponding to the block identifiers in the coordinate system information and perform coordinate system transformation on the blocks that are not indicated in the coordinate system information.
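As a non-limiting illustration of reverse indication, the following Python sketch derives, from the signaled identifiers of the blocks that are not transformed, the blocks to which the decoding end applies the pre-agreed inverse transformation. The function name and the use of integer block indices starting at 0 are assumptions.

```python
from typing import Iterable, List


def blocks_to_inverse_transform(num_blocks: int, untransformed_ids: Iterable[int]) -> List[int]:
    """Every block not listed in the coordinate system information is transformed by default."""
    keep = set(untransformed_ids)
    return [b for b in range(num_blocks) if b not in keep]
```

For example, with five blocks and signaled identifiers 1 and 3, blocks 0, 2, and 4 would be inverse-transformed.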
In the embodiments of this application, the reconstruction data contains M sets of reconstruction information, which correspond one-to-one with the M blocks and each contain the coordinates of the vertices in the corresponding block in the first coordinate system.
When the target coordinate system is different from the first coordinate system, the decoding end can obtain the coordinate information of all vertices in the corresponding block from the bitstream, and perform an inverse coordinate system transformation on that coordinate information to obtain the coordinate information of all vertices in the corresponding block in the first coordinate system.
In a third aspect, embodiments of this application provide an encoding apparatus, including: a transceiver module, configured to obtain original data of a three-dimensional mesh, the three-dimensional mesh containing M blocks, the original data containing M sets of original information, the M sets of original information corresponding one-to-one with the M blocks and each containing the coordinates of the vertices in the corresponding block in a first coordinate system, the first coordinate system being the original coordinate system of the M blocks; and an encoding module, configured to encode, based on the original data, coordinate information and coordinate system information into a bitstream corresponding to the three-dimensional mesh, the coordinate information containing the coordinates of some or all vertices in N blocks among the M blocks, and the coordinate system information indicating whether the current coordinate system of the N blocks is the first coordinate system, 1≤N<M.
In one possible implementation, the coordinate system information indicates that the current coordinate system of the N blocks is a second coordinate system, the second coordinate system being a coordinate system obtained by transforming the first coordinate system, and the coordinate information contains the coordinates of some or all vertices in the N blocks in the second coordinate system.
In one possible implementation, the coordinate system information further indicates that an inverse coordinate transformation is to be performed during decoding.
In one possible implementation, the encoding module is further configured to transform the coordinates of the vertices in the N blocks in the first coordinate system into coordinates in the second coordinate system.
In one possible implementation, the coordinate system information indicates that the current coordinate system of the N blocks is the first coordinate system, and the coordinate information contains the coordinates of some or all vertices in the N blocks in the first coordinate system.
In one possible implementation, the bitstream contains a first syntax structure corresponding to the three-dimensional mesh, the first syntax structure contains second syntax structures corresponding to the M blocks, and the coordinate system information is located in the first syntax structure and outside the second syntax structures, or the coordinate system information is located in the second syntax structures corresponding to the N blocks.
In one possible implementation, the coordinate system information contains N sets of second coordinate system information and the correspondence between the N sets of second coordinate system information and the N blocks, and each set of second coordinate system information contains the type of the second coordinate system, the origin information of the second coordinate system, and the coordinate axis information of the second coordinate system.
In one possible implementation, the origin information of the second coordinate system contains the coordinates of the origin in the first coordinate system, or the origin information of the second coordinate system contains an origin acquisition mode.
In one possible implementation, the coordinate axis information of the second coordinate system contains the vectors of the coordinate axes in the first coordinate system, or the coordinate axis information of the second coordinate system contains a coordinate axis acquisition mode.
In a fourth aspect, embodiments of this application provide a decoding apparatus, including: a transceiver module, configured to receive a bitstream corresponding to a three-dimensional mesh, the three-dimensional mesh containing M blocks; and a decoding module, configured to decode the bitstream to obtain coordinate information and coordinate system information, the coordinate information containing the coordinates of some or all vertices in N blocks among the M blocks, the coordinate system information indicating whether the coordinate system of the N blocks is a first coordinate system, the first coordinate system being the original coordinate system of the M blocks, 1≤N<M, and to obtain reconstruction data of the three-dimensional mesh based on the coordinate information and the coordinate system information, the reconstruction data containing M sets of reconstruction information, the M sets of reconstruction information corresponding one-to-one with the M blocks and each containing the coordinates of the vertices in the corresponding block in the first coordinate system.
In one possible implementation, the coordinate system information indicates that the current coordinate system of the N blocks is a second coordinate system, the second coordinate system being a coordinate system obtained by transforming the first coordinate system, and the coordinate information contains the coordinates of some or all vertices in the N blocks in the second coordinate system.
In one possible implementation, the coordinate system information further indicates that an inverse coordinate transformation is to be performed during decoding.
In one possible implementation, the decoding module is further configured to transform the coordinates of the vertices in the N blocks in the second coordinate system into coordinates in the first coordinate system.
In one possible implementation, the coordinate system information indicates that the current coordinate system of the N blocks is the first coordinate system, and the coordinate information contains the coordinates of some or all vertices in the N blocks in the first coordinate system.
In one possible implementation, the bitstream contains a first syntax structure corresponding to the three-dimensional mesh, the first syntax structure contains second syntax structures corresponding to the M blocks, and the coordinate system information is located in the first syntax structure and outside the second syntax structures, or the coordinate system information is located in the second syntax structures corresponding to the N blocks.
In one possible implementation, the coordinate system information contains N sets of second coordinate system information and the correspondence between the N sets of second coordinate system information and the N blocks, and each set of second coordinate system information contains the type of the second coordinate system, the origin information of the second coordinate system, and the coordinate axis information of the second coordinate system.
In one possible implementation, the origin information of the second coordinate system contains the coordinates of the origin in the first coordinate system, or the origin information of the second coordinate system contains an origin acquisition mode.
In one possible implementation, the coordinate axis information of the second coordinate system contains the vectors of the coordinate axes in the first coordinate system, or the coordinate axis information of the second coordinate system contains a coordinate axis acquisition mode.
In a fifth aspect, embodiments of this application provide an encoding apparatus, including: one or more processors; and a memory configured to store one or more programs, where the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method according to any implementation of the first aspect.
In a sixth aspect, embodiments of this application provide a decoding apparatus, including: one or more processors; and a memory configured to store one or more programs, where the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method according to any implementation of the second aspect.
In a seventh aspect, embodiments of this application provide a computer-readable storage medium storing program instructions that, when executed by a device or by one or more processors, cause the device to perform the method according to any implementation of the first and second aspects.
In an eighth aspect, embodiments of this application provide a computer program product including computer program code that, when run on a device, causes the device to perform the method according to any implementation of the first and second aspects.
In a ninth aspect, embodiments of this application provide a chip including a processor and a memory, the memory being configured to store a computer program, and the processor being configured to call and run the computer program stored in the memory to perform the method according to any implementation of the first and second aspects.
In a tenth aspect, embodiments of this application provide a bitstream, the bitstream including coordinate information and coordinate system information, the coordinate information containing the coordinates of some or all vertices in N blocks among M blocks of a three-dimensional mesh, and the coordinate system information indicating whether the current coordinate system of the N blocks is the first coordinate system, 1≤N<M.
In one possible implementation, the coordinate system information indicates that the current coordinate system of the N blocks is a second coordinate system, the second coordinate system being a coordinate system obtained by transforming the first coordinate system, and the coordinate information contains the coordinates of some or all vertices in the N blocks in the second coordinate system.
In one possible implementation, the coordinate system information further indicates that an inverse coordinate transformation is to be performed during decoding.
In one possible implementation, the coordinate system information indicates that the current coordinate system of the N blocks is the first coordinate system, and the coordinate information contains the coordinates of some or all vertices in the N blocks in the first coordinate system.
In one possible implementation, the bitstream contains a first syntax structure corresponding to the three-dimensional mesh, the first syntax structure contains second syntax structures corresponding to the M blocks, and the coordinate system information is located in the first syntax structure and outside the second syntax structures, or the coordinate system information is located in the second syntax structures corresponding to the N blocks.
In one possible implementation, the coordinate system information contains N sets of second coordinate system information and the correspondence between the N sets of second coordinate system information and the N blocks, and each set of second coordinate system information contains the type of the second coordinate system, the origin information of the second coordinate system, and the coordinate axis information of the second coordinate system.
In an eleventh aspect, embodiments of this application provide a computer-readable storage medium storing the bitstream according to any implementation of the tenth aspect.
In a twelfth aspect, embodiments of this application provide a method for transmitting an encoded bitstream of video data. The method includes: obtaining a bitstream from a storage medium, where the bitstream is the bitstream according to any implementation of the tenth aspect and is stored in the storage medium; and sending the bitstream.
In a thirteenth aspect, embodiments of this application provide a system for transmitting an encoded bitstream of video data. The system includes: an obtaining unit, configured to obtain a bitstream from a storage medium, where the bitstream is the bitstream according to any implementation of the tenth aspect and is stored in the storage medium; and a sending unit, configured to send the bitstream.
In a fourteenth aspect, embodiments of this application provide a method for storing an encoded bitstream of video data. The method includes: receiving the bitstream according to any implementation of the tenth aspect; and storing the bitstream in a storage medium.
In a fifteenth aspect, embodiments of this application provide a system for storing an encoded bitstream of video data. The system includes: a receiving unit, configured to receive the bitstream according to any implementation of the tenth aspect; and a storage unit, configured to store the bitstream.
Figure 1 is an exemplary block diagram of the encoding/decoding system 10 according to an embodiment of this application;
Figure 2 is a flowchart of process 200 of the encoding method provided in an embodiment of this application;
Figure 3 is a schematic diagram of the Cartesian coordinate system and the local normal coordinate system;
Figure 4 is a schematic diagram of the Cartesian coordinate system and the local cylindrical coordinate system;
Figure 5 is a flowchart of process 500 of the decoding method provided in an embodiment of this application;
Figure 6 is a schematic diagram of the encoding/decoding framework according to an embodiment of this application;
Figure 7 is a schematic diagram of the encoding/decoding framework according to an embodiment of this application;
Figure 8 is a schematic diagram of the encoding/decoding framework according to an embodiment of this application;
Figure 9 is a schematic diagram of the encoding/decoding framework according to an embodiment of this application;
Figure 10 is a schematic structural diagram of the encoding apparatus 1000 of this application;
Figure 11 is a schematic structural diagram of the decoding apparatus 1100 of this application;
Figure 12 shows the electronic device 1200 provided in this application.
To make the objectives, technical solutions, and advantages of this application clearer, the technical solutions of this application are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, not all, of the embodiments of this application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of this application without creative effort fall within the scope of protection of this application.
The terms "first", "second", and so on in the specification, embodiments, claims, and drawings of this application are used only to distinguish between descriptions, and should not be understood as indicating or implying relative importance or order. In addition, the terms "include" and "have", and any variations thereof, are intended to cover non-exclusive inclusion. For example, a method, system, product, or device that includes a series of steps or units is not necessarily limited to the steps or units explicitly listed, but may include other steps or units that are not explicitly listed or that are inherent to the process, method, product, or device.
It should be understood that, in this application, "at least one (item)" means one or more, and "multiple" means two or more. "And/or" describes an association between associated objects and indicates that three relationships are possible. For example, "A and/or B" can mean: only A exists, only B exists, or both A and B exist, where A and B can each be singular or plural. The character "/" generally indicates an "or" relationship between the objects before and after it. "At least one of the following (items)" or a similar expression refers to any combination of these items, including any combination of single or plural items. For example, at least one of a, b, or c can mean: a, b, c, "a and b", "a and c", "b and c", or "a and b and c", where a, b, and c can each be single or multiple.
Three-dimensional meshes (for example, dynamic meshes) can be used to represent volumetric video, digital humans, computer graphics (CG) content, and so on, and are characterized by each frame containing one mesh. The data of a three-dimensional mesh typically includes vertex coordinates, connectivity relationships, texture coordinates, and texture maps. Vertex coordinates indicate the position of a vertex in three-dimensional (3D) space, and connectivity relationships indicate which vertices make up each face (for example, a triangular face) of the three-dimensional mesh.
Accordingly, compression of a three-dimensional mesh typically includes vertex coordinate encoding and connectivity encoding. When the reference frame and the frame to be encoded (also called the current frame) have the same connectivity, the connectivity of the frame to be encoded does not need to be encoded, which saves bits. If inter-frame prediction is also applied to the vertex coordinates of the frame to be encoded, encoding and decoding efficiency can be improved further.
However, because every frame contains its own mesh, the connectivity of the three-dimensional meshes may differ between frames. In that case the connectivity must be re-encoded, and inter-frame correlation cannot be used efficiently for temporal prediction.
To solve the foregoing technical problem, embodiments of this application provide an encoding method, a decoding method, and corresponding apparatuses. The technical solutions of this application are described below through embodiments.
Figure 1 is an exemplary block diagram of the encoding/decoding system 10 according to an embodiment of this application. The compressor 12 and the decompressor 16 in the encoding/decoding system 10 represent devices that can be used to perform the techniques according to the various examples described in the embodiments of this application.
As shown in Figure 1, the encoding/decoding system 10 includes an encoding end and a decoding end. The encoding end compresses the three-dimensional mesh and provides the resulting bitstream to the decoding end. The decoding end decompresses the bitstream to obtain the reconstructed three-dimensional mesh.
The encoding end includes a compressor 12 and, optionally, a data source 11 and a communication interface 13.
The data source 11 may include or be any type of three-dimensional mesh acquisition device, for example, a device that captures three-dimensional meshes through volumetric video, or a device that generates three-dimensional meshes using CG technology. The data source 11 may also include or be any type of memory or storage. The data output by the data source 11 is a three-dimensional mesh.
The compressor 12 receives the three-dimensional mesh and compresses it to obtain a bitstream.
The communication interface 13 can receive the bitstream and send it to the decoding end through the communication channel 14.
The decoding end includes a decompressor 16 and, optionally, a communication interface 15 and a post-processor 17.
The communication interface 15 receives the bitstream directly from the encoding end or from any other device, such as a storage device, and provides the bitstream to the decompressor 16.
The communication interface 13 and the communication interface 15 can send or receive the bitstream over a direct communication link between the encoding end and the decoding end, such as a direct wired or wireless connection, or over any type of network, such as a wired network, a wireless network, or any combination thereof, or any type of private network, public network, or any combination thereof.
The communication interface 13 and the communication interface 15 can each be configured as a one-way communication interface, as indicated by the arrow of the communication channel 14 pointing from the encoding end to the decoding end in Figure 1, or as a two-way communication interface. They can be used to send and receive messages to establish a connection, and to acknowledge and exchange any other information related to the communication link, and so on.
The post-processor 17 post-processes the decoded data (also called the reconstructed three-dimensional mesh) to obtain a post-processed three-dimensional mesh. The post-processing performed by the post-processor 17 may include, for example, reconstruction of a 3D figure.
Although Figure 1 shows the encoding end and the decoding end as separate devices, a device embodiment may also include both, or the functions of both, that is, the encoding end or its corresponding function together with the decoding end or its corresponding function. In such embodiments, the two may be implemented using the same hardware and/or software, separate hardware and/or software, or any combination thereof.
As will be apparent to a skilled person from the description, the existence and (exact) division of the different units or functions in the encoding end and/or the decoding end shown in Figure 1 may vary with the actual device and application.
The encoding end and the decoding end may each be any of a variety of devices, including any type of handheld or stationary device, for example, a notebook or laptop computer, a smartphone, a tablet computer, or a desktop computer, and may use no operating system or any type of operating system. In some cases, the encoding end and the decoding end may be equipped with components for wireless communication. The encoding end and the decoding end may therefore be wireless communication devices.
It should be noted that the encoding/decoding system 10 shown in Figure 1 is merely an example. In some cases, the encoding end and the decoding end may be located in the same device or in different devices. The embodiments of this application do not specifically limit the encoding/decoding system for three-dimensional meshes.
Figure 2 is a flowchart of process 200 of the encoding method provided in an embodiment of this application. Process 200 can be performed by the encoding end described above. Process 200 is described as a series of steps or operations. It should be understood that the steps of process 200 may be performed in various orders and/or concurrently, and are not limited to the order shown in Figure 2. Process 200 may include:
Step 201: Obtain the original data of the three-dimensional mesh.
In the embodiments of this application, the three-dimensional mesh contains M blocks, and the original data contains M sets of original information. The M sets of original information correspond one-to-one to the M blocks, and each set contains the coordinates, in a first coordinate system, of the vertices in the corresponding block. The first coordinate system is the original coordinate system of the M blocks. Optionally, the first coordinate system may be a Cartesian coordinate system, a local normal coordinate system, or a local cylindrical coordinate system.
As described above, a three-dimensional mesh is characterized by each frame containing one mesh. A mesh can include multiple faces, and each face can be a polygon (for example, a triangle). Each polygon is defined by its vertices in three-dimensional space and by information about how those vertices are connected (called connectivity information). Optionally, vertex attributes (for example, color or normals) can be associated with the mesh vertices. Vertex connections can be represented by a one-dimensional array (1D array verCoordConnArray) whose dimension corresponds to the vertex connection index, so that all values of all faces are arranged in a linear structure, that is, all vertex connections can be arranged sequentially. The mesh data can therefore include multiple vertex coordinates and multiple connectivity relationships (a connectivity relationship can also be understood as topology, and multiple connectivity relationships can be called a connectivity set), where:
Vertex coordinates can be expressed as three-dimensional coordinates (for example, (x, y, z)) or as spherical coordinates, and indicate the position of a vertex in 3D space. In addition, vertex coordinates can be expressed in a world coordinate system (for example, a Cartesian coordinate system) or in a local coordinate system (for example, a local normal coordinate system or a local cylindrical coordinate system).
A connectivity relationship can be represented by the vertices associated with it. For example, (1, 2, 3) represents the connection from vertex 1 to vertex 2, then to vertex 3, and then back to vertex 1. Alternatively, it can be represented by faces (for example, triangular faces or polygonal faces). For example, a triangular face has a direction, and 1, 2, 3 represents 1->2, 2->3, 3->1, or indicates that the triangular face is made up of the three vertices 1, 2, and 3. As another example, connectivity can be represented by edges: 4 denotes the edge 1->2, 5 denotes the edge 2->3, and 6 denotes the edge 3->1. It should be understood that the foregoing are merely examples of how connectivity can be represented, and the embodiments of this application do not specifically limit the representation.
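The representations described above can be sketched in a few lines of Python. This is only an illustration: the names `Mesh`, `flatten_connectivity`, and `face_edges` are hypothetical and do not appear in this application, and triangular faces are assumed.

```python
from dataclasses import dataclass
from typing import List, Tuple

Vertex = Tuple[float, float, float]   # (x, y, z) in the first coordinate system
Face = Tuple[int, int, int]           # three vertex indices of a triangular face


@dataclass
class Mesh:
    """Hypothetical container for one frame's mesh: coordinates plus connectivity."""
    vertices: List[Vertex]
    faces: List[Face]


def flatten_connectivity(mesh: Mesh) -> List[int]:
    """Arrange the vertex indices of all faces sequentially in a 1D array."""
    return [index for face in mesh.faces for index in face]


def face_edges(face: Face) -> List[Tuple[int, int]]:
    """Expand an oriented triangle (a, b, c) into the edges a->b, b->c, c->a."""
    a, b, c = face
    return [(a, b), (b, c), (c, a)]


mesh = Mesh(
    vertices=[(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)],
    faces=[(1, 2, 3), (0, 1, 2)],
)
print(flatten_connectivity(mesh))
print(face_edges(mesh.faces[0]))
```

The flattened list plays the role of the one-dimensional connectivity array, and `face_edges` corresponds to the edge-based view of the same triangle.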
In the embodiments of this application, a block (patch) may also be referred to as a region, a spatial block, and so on; this is not specifically limited. Any block (for example, a first block, also called a first patch) may include at least one vertex (for example, a first vertex) of the three-dimensional mesh. That is, the first patch includes one vertex of the three-dimensional mesh, or the first patch includes some (multiple) vertices of the three-dimensional mesh.
When the first patch includes multiple vertices, the same kind of local coordinate system applies to all of them, so the information (that is, the coordinate information) of these vertices can be transformed into the same kind of local coordinate system, for example, from the Cartesian coordinate system to the local normal coordinate system. It should be understood that, to improve coding efficiency, the vertices in one patch can use one or more coordinate systems. The options include: establishing one local coordinate system for the coordinate information of the vertices and transforming all of it into that local coordinate system; establishing a separate local coordinate system for the coordinate information of each vertex and transforming each into its own local coordinate system; or establishing several local coordinate systems for the coordinate information of the vertices and transforming each vertex into the corresponding local coordinate system according to its distribution, position, or the like.
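As a minimal sketch of transforming the vertices of one patch into a shared local coordinate system and back, the following assumes that the local system is given by an origin and three orthonormal axis vectors expressed in the first coordinate system. The application does not fix the transformation formula here, so the change of basis below and the names `to_local` and `to_first` are illustrative assumptions.

```python
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def to_local(p: Vec3, origin: Vec3, axes: Sequence[Vec3]) -> Vec3:
    """Express p in a local system (assumption: axes are orthonormal)."""
    d = [pi - oi for pi, oi in zip(p, origin)]
    # Each local coordinate is the projection of (p - origin) onto one axis.
    return tuple(sum(di * ai for di, ai in zip(d, axis)) for axis in axes)


def to_first(q: Vec3, origin: Vec3, axes: Sequence[Vec3]) -> Vec3:
    """Inverse transformation: map local coordinates back to the first system."""
    return tuple(
        origin[k] + sum(q[i] * axes[i][k] for i in range(3)) for k in range(3)
    )


# One local coordinate system shared by every vertex of the patch.
origin = (0, 1, 0)
axes = ((0, 1, 0), (0, 0, 1), (1, 0, 0))
patch_vertices = [(2, 3, 4), (0, 1, 0)]

local = [to_local(v, origin, axes) for v in patch_vertices]
restored = [to_first(q, origin, axes) for q in local]
print(local, restored)
```

A decoder that knows the same origin and axes can apply `to_first` to recover the original coordinates, which is why this information has to be signalled or agreed in advance.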
In the embodiments of this application, the coordinate information may take either of the following two forms:
(1) The coordinate information may include position coordinates (also called three-dimensional coordinates, vertex coordinates, and so on) in the first coordinate system.
(2) The coordinate information may include position coordinate residuals in the first coordinate system, where a residual is obtained by predicting the position coordinates in the first coordinate system.
It should be noted that the coordinate information may also include other information related to the position coordinates; this is not specifically limited.
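To illustrate form (2), the sketch below computes position coordinate residuals with a very simple predictor: each vertex is predicted from the previously coded vertex, and the first vertex is predicted from the origin. The predictor is an assumption made only for this example, since the application does not specify one here, and the function names are hypothetical.

```python
from typing import List, Tuple

Vec3 = Tuple[int, int, int]


def encode_residuals(coords: List[Vec3]) -> List[Vec3]:
    """Residual = actual position minus predicted position (previous vertex)."""
    residuals = []
    prediction = (0, 0, 0)  # assumed predictor for the first vertex
    for c in coords:
        residuals.append(tuple(ci - pi for ci, pi in zip(c, prediction)))
        prediction = c
    return residuals


def decode_residuals(residuals: List[Vec3]) -> List[Vec3]:
    """Rebuild positions by adding each residual to the running prediction."""
    coords = []
    prediction = (0, 0, 0)
    for r in residuals:
        prediction = tuple(ri + pi for ri, pi in zip(r, prediction))
        coords.append(prediction)
    return coords


coords = [(10, 10, 10), (11, 10, 9), (13, 12, 9)]
residuals = encode_residuals(coords)
print(residuals)
```

When neighbouring vertices are close together, the residuals are small in magnitude, which is the usual motivation for coding residuals instead of raw positions.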
Step 202: Based on the original data, encode the coordinate information and the coordinate system information into the bitstream corresponding to the three-dimensional mesh.
In the embodiments of this application, the coordinate information includes the coordinates of some or all vertices in N of the M blocks, and the coordinate system information indicates whether the current coordinate system of the N blocks is the first coordinate system, where 1≤N<M.
In one possible implementation, the bitstream includes a first syntax structure corresponding to the three-dimensional mesh, and the first syntax structure includes a second syntax structure corresponding to the M blocks. The coordinate system information is located either in the first syntax structure and outside the second syntax structure, or in the second syntax structure corresponding to the N blocks.
That is, the bitstream corresponding to the three-dimensional mesh contains two kinds of syntax elements: syntax elements corresponding to the three-dimensional mesh (mesh syntax elements, corresponding to the first syntax structure above) and syntax elements corresponding to the blocks (block syntax elements, corresponding to the second syntax structure above). The coordinate system information in the embodiments of this application can be written into either the first syntax structure or the second syntax structure; this is not specifically limited.
In one possible implementation, the coordinate system information indicates that the current coordinate system of the N blocks is a second coordinate system, which is the coordinate system obtained by transforming the first coordinate system. The coordinate information includes the coordinates of some or all vertices in the N blocks in the second coordinate system. Optionally, the second coordinate system may be a Cartesian coordinate system, a local normal coordinate system, or a local cylindrical coordinate system. In addition, the coordinate system information may further indicate that an inverse coordinate transformation is to be performed during decoding.
The coordinate system information may include N sets of target coordinate system information and the correspondence between these N sets and the N blocks. Each set of target coordinate system information includes the type of the target coordinate system, the origin information of the target coordinate system, and the coordinate axis information of the target coordinate system.
In the embodiments of this application, N of the M blocks of the three-dimensional mesh need a coordinate system transformation. That is, for any one of the N blocks, the encoding end can transform the coordinate information of the vertices in that block from the first coordinate system to the target coordinate system. So that the decoding end can follow the same transformation, the encoding end can write coordinate system information into the bitstream. The coordinate system information includes the type of the target coordinate system, the origin information of the target coordinate system, and the coordinate axis information of the target coordinate system, where:
Optionally, the type of the target coordinate system is either the first coordinate system or the second coordinate system. For example, identifier 1 denotes the first coordinate system and identifier 2 denotes the second coordinate system; or identifier 0 denotes the first coordinate system and identifier 1 denotes the second coordinate system. This is not specifically limited.
Optionally, the origin information of the target coordinate system includes the coordinates of the origin in the first coordinate system, or it includes the mode by which the origin is obtained. The encoding end can write the coordinates of the origin in the first coordinate system directly into the bitstream, for example, (0,1,0). Alternatively, the encoding end can write the origin acquisition mode into the bitstream, for example, identifier 0 means that the origin is the previous vertex, and identifier 1 means that the origin is the midpoint of the previous two vertices. This is not specifically limited.
Optionally, the coordinate axis information of the target coordinate system includes the vectors of the coordinate axes in the first coordinate system, or it includes the mode by which the coordinate axes are obtained. The encoding end can write the axis vectors in the first coordinate system directly into the bitstream, for example, (1,0,0) for the x-axis, (0,1,0) for the y-axis, and (0,0,1) for the z-axis. Alternatively, the encoding end can write the axis acquisition mode into the bitstream, for example, identifier 1 denotes the default x-axis (1,0,0), y-axis (0,1,0), and z-axis (0,0,1), and identifier 2 denotes rotation by a preset angle. Optionally, the preset angle can also be written into the bitstream.
Optionally, the origin and the coordinate axes of the target coordinate system can also be obtained in a manner agreed in advance between the encoding end and the decoding end. For example, the default origin is the previous vertex, the default x-axis is (1,0,0), the default y-axis is (0,1,0), and the default z-axis is (0,0,1).
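The origin and axis options described above can be sketched as a small record plus two resolver functions. The field and function names are hypothetical, the identifier values follow the examples in the text (origin mode 0 is the previous vertex, origin mode 1 is the midpoint of the previous two vertices), and the rotation-by-preset-angle axis mode is left out because the text does not define the rotation axis.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Vec3 = Tuple[float, float, float]
Axes = Tuple[Vec3, Vec3, Vec3]

# Default axes agreed in advance between the encoding end and the decoding end.
DEFAULT_AXES: Axes = ((1, 0, 0), (0, 1, 0), (0, 0, 1))


@dataclass
class CoordSystemInfo:
    """Hypothetical container for one set of target coordinate system information."""
    cs_type: int                        # e.g. 0: first system, 1: second system
    origin: Optional[Vec3] = None       # explicit origin in the first system
    origin_mode: Optional[int] = None   # 0: previous vertex, 1: midpoint of previous two
    axes: Optional[Axes] = None         # explicit axis vectors in the first system


def resolve_origin(info: CoordSystemInfo, decoded: List[Vec3]) -> Vec3:
    """Return the origin, from explicit coordinates or from the signalled mode."""
    if info.origin is not None:
        return info.origin
    if info.origin_mode == 1:
        a, b = decoded[-2], decoded[-1]
        return tuple((x + y) / 2 for x, y in zip(a, b))
    # Mode 0, or nothing signalled: fall back to the previous vertex.
    return decoded[-1]


def resolve_axes(info: CoordSystemInfo) -> Axes:
    """Return explicit axes if signalled, otherwise the pre-agreed defaults."""
    return info.axes if info.axes is not None else DEFAULT_AXES


decoded = [(0, 0, 0), (2, 4, 6)]
print(resolve_origin(CoordSystemInfo(cs_type=1, origin_mode=1), decoded))
```

Signalling a mode instead of explicit vectors costs fewer bits, at the price of making the origin depend on vertices that have already been decoded.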
In one possible implementation, the coordinate system information indicates that the current coordinate system of the N blocks is the first coordinate system, and the coordinate information includes the coordinates of some or all vertices in the N blocks in the first coordinate system.
The coordinate system information may include N block identifiers, which indicate that the N corresponding blocks do not undergo coordinate system transformation.
In the embodiments of this application, the encoding end can signal in reverse: the coordinate system information identifies the blocks that do not undergo coordinate system transformation, and all other blocks undergo coordinate system transformation by default, with their target coordinate system agreed in advance between the encoding end and the decoding end. The decoding end can then leave the blocks identified in the coordinate system information unchanged and apply the coordinate system transformation to the blocks that are not identified.
For example, the encoding end and the decoding end agree on the coordinate transformation pattern in advance. If the coordinate system information in the bitstream is 0, the target coordinate system of patches 0 to 29 is the second coordinate system (they are transformed), and the target coordinate system of patches 30 to 59 is the first coordinate system (they are not transformed). If the coordinate system information in the bitstream is 1, the target coordinate system of patches 0 to 29 is the first coordinate system (they are not transformed), and the target coordinate system of patches 30 to 59 is the second coordinate system (they are transformed). In this way, a single identifier in the syntax elements of the three-dimensional mesh specifies the coordinate system transformation of a fixed number of patches.
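The pre-agreed pattern in this example can be written as a one-line decision. The function name and the `split` parameter are hypothetical; the values 0, 1, 30, and 60 come from the example itself.

```python
def patch_uses_second_cs(frame_flag: int, patch_index: int, split: int = 30) -> bool:
    """Return True if the patch is coded in the second (transformed) system.

    frame_flag == 0: patches below `split` are transformed, the rest are not.
    frame_flag == 1: the two groups swap roles.
    """
    in_first_group = patch_index < split
    return in_first_group if frame_flag == 0 else not in_first_group


# With flag 0, patches 0 to 29 are transformed and patches 30 to 59 are not.
transformed = [i for i in range(60) if patch_uses_second_cs(0, i)]
print(transformed[0], transformed[-1], len(transformed))
```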
As another example, a first identifier in the bitstream specifies the number of selected patches (for example, 30); a second identifier specifies the coordinate system used for the selected patches (for example, 0 indicates the first coordinate system and 1 indicates the second coordinate system); and a third identifier specifies how the patches are selected, for example, 0 indicates sequential selection, 1 indicates odd-numbered patches, 2 indicates even-numbered patches, and 3 indicates boundary patches. In this way, multiple identifiers in the syntax elements of the three-dimensional mesh specify the coordinate system transformation of a variable number of patches.
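A sketch of how a decoder might interpret the first and third identifiers follows. Several details are assumptions made for illustration: patches are numbered from 0, "odd" and "even" refer to the patch number, the count limits how many candidates are taken in order, and the set of boundary patches is supplied by the caller because the text does not say how it is derived. The function name is hypothetical.

```python
from typing import List, Sequence


def select_patches(count: int, method: int, total: int,
                   boundary: Sequence[int] = ()) -> List[int]:
    """Return the patch numbers chosen by the selection method, limited to `count`."""
    if method == 0:        # sequential selection
        candidates = range(total)
    elif method == 1:      # odd-numbered patches
        candidates = range(1, total, 2)
    elif method == 2:      # even-numbered patches
        candidates = range(0, total, 2)
    elif method == 3:      # boundary patches, provided by the caller
        candidates = boundary
    else:
        raise ValueError(f"unknown selection method: {method}")
    return list(candidates)[:count]


print(select_patches(3, 1, 10))
```

The second identifier would then be applied to every patch number in the returned list.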
As another example, one identifier is signalled per patch. For instance, 0100 indicates that patches 1 to 4 are, respectively, not transformed, transformed, not transformed, and not transformed. In this way, multiple identifiers in the syntax elements of the three-dimensional mesh can likewise specify the coordinate system transformation of a variable number of patches.
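The per-patch identifiers can be decoded as follows. The function name is hypothetical, and patch numbering starting at 1 follows the example in the text.

```python
from typing import Dict


def parse_patch_flags(bits: str, first_patch: int = 1) -> Dict[int, bool]:
    """Map each patch number to True (transformed) or False (not transformed)."""
    if any(ch not in "01" for ch in bits):
        raise ValueError("flags must consist of the characters 0 and 1 only")
    return {first_patch + i: ch == "1" for i, ch in enumerate(bits)}


print(parse_patch_flags("0100"))
```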
The following describes, by way of example, several bitstream structures of this application:
Example 1
Frame bitstream of the three-dimensional mesh
{
……
Number of patches;
Frame coordinate system transformation identifier;
If (frame coordinate system transformation identifier == 1)
Target coordinate system identifier;
}
Patch bitstream
{
……
Patch transformation identifier;
}
Example 2
Frame bitstream
{
……
Number of patches;
Frame coordinate system transformation identifier;
}
Patch bitstream
{
……
Target coordinate system identifier (if the target coordinate system is the same as the original coordinate system, no coordinate transformation is performed; otherwise, the transformation to the target coordinate system is performed); or,
a coordinate system agreed between the encoder and the decoder, in which case the bitstream contains no target coordinate system identifier;
}
Example 3
Frame bitstream
{
……
Number of patches;
Frame coordinate system transformation identifier;
}
Patch bitstream
{
……
Patch transformation identifier;
If (patch transformation identifier == 1)
Target coordinate system identifier; or,
A coordinate system agreed upon by the encoding end and the decoding end, in which case the bitstream carries no target coordinate system identifier;
}
Example 4
Frame bitstream
{
……
Number of patches;
Frame coordinate system transformation identifier;
}
Patch bitstream
{
……
Patch transformation identifier;
If (patch transformation identifier == 1)
Target coordinate system identifier;
If (target coordinate system identifier == 1)
Target coordinate system origin identifier;
Target coordinate system axis identifier;
}
Example 5
Frame bitstream
{
……
Number of patches;
}
Patch bitstream
{
......
Target coordinate system identifier; or,
A coordinate system agreed upon by the encoding end and the decoding end, in which case the bitstream carries no target coordinate system identifier;
}
where:
The frame coordinate system transformation identifier indicates whether coordinate system transformation occurs within the frame; for example, 1 indicates that a coordinate system transformation exists in the current frame, and 0 indicates that none exists;
The patch transformation identifier indicates whether the patch undergoes coordinate system transformation; for example, 1 indicates that the patch has a coordinate system transformation, and 0 indicates that it does not;
The target coordinate system identifier indicates the target coordinate system; for example, 1 indicates that the target coordinate system is a local normal coordinate system, and 2 indicates that it is a local cylindrical coordinate system. Optionally, 0 may indicate that no coordinate system transformation is performed, in which case the patch transformation identifier may be omitted, as in Example 5;
The target coordinate system origin identifier may directly encode the three-dimensional origin coordinates, such as (0,0,0), or may encode a derivation mode, such as taking the coordinates of a preceding vertex.
The target coordinate system axis identifier may directly encode the three-dimensional coordinate axes, such as (1,0,0)(0,1,0)(0,0,1), or may encode a derivation mode, such as taking the normal of a preceding face or the direction of an edge.
It should be noted that the five bitstream structures above are examples only; they do not limit the bitstream structure, and the embodiments of this application impose no specific limitation in this regard.
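To make the conditional structure concrete, the patch-level syntax of Example 4 can be read roughly as below. This is a sketch under stated assumptions: the fields are taken from an already-decoded sequence rather than from real bits (the text gives no bit widths), the nesting of the two conditions is inferred because the listing has lost its indentation, and `parse_patch_example4` is a hypothetical name.

```python
from typing import Any, Dict, Iterator


def parse_patch_example4(fields: Iterator[Any]) -> Dict[str, Any]:
    """Read the patch-level fields of Example 4 in syntax order (hypothetical reader)."""
    patch: Dict[str, Any] = {
        "transform": next(fields),  # patch transformation identifier
        "target_cs": None,          # target coordinate system identifier
        "origin": None,             # target coordinate system origin identifier
        "axes": None,               # target coordinate system axis identifier
    }
    if patch["transform"] == 1:
        patch["target_cs"] = next(fields)
        # Assumption: origin and axes are both present only when the
        # target coordinate system identifier equals 1.
        if patch["target_cs"] == 1:
            patch["origin"] = next(fields)
            patch["axes"] = next(fields)
    return patch
```

A patch whose transformation identifier is 0 consumes a single field, which is how this layout avoids spending bits on patches that stay in the original coordinate system.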
In one possible implementation, the encoding end may transform the coordinates of the vertices in the N blocks from the first coordinate system to the second coordinate system.
The encoding end may determine whether the multiple blocks include a transformation block. When they do, it obtains the target coordinate system of each of at least one transformation block and performs coordinate system transformation on the information of the vertices in each transformation block according to that block's target coordinate system, to obtain at least one transformed block.
A transformation block is a block, among the multiple blocks, whose vertex information is to undergo coordinate system transformation.
In the embodiments of this application, the encoding end may determine whether the multiple blocks include a transformation block by means of rate-distortion optimization, the vertex distribution before and after coordinate system transformation, the triangle attributes of the faces included in the three-dimensional mesh, or a preset rule. The encoding end may perform this operation for any block; that is, for the multiple blocks in the three-dimensional mesh, it determines block by block whether each block is a transformation block and, once a block is determined to be a transformation block, obtains its target coordinate system. Consequently, the multiple blocks may contain no transformation block, at least one transformation block, or only transformation blocks. Each transformation block obtains one target coordinate system, and the target coordinate systems of different transformation blocks may be the same or different.
For example, the encoding end may use a rate-distortion function: for any one of the multiple blocks, it calculates the weighted value of the bitrate and the distortion loss of the vertex information in that block under the first coordinate system and under each of the other coordinate systems (e.g., local coordinate systems), determines from these weighted values whether to perform coordinate system transformation on the vertex information in that block, and selects the optimal one of the other coordinate systems as the target coordinate system of that block. As another example, the vertex distribution method may compute histograms of the vertices before and after transformation, determine from a comparison of the histograms whether to perform coordinate system transformation on the vertex information in that block, and select the coordinate system with the more concentrated distribution as the target coordinate system of that block.
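The rate-distortion decision described above can be sketched as follows. The text only speaks of a weighted value of bitrate and distortion loss; the common Lagrangian form J = R + λ·D used here, the parameter `lam`, and the function name are assumptions for illustration.

```python
from typing import Dict, Tuple


def choose_coordinate_system(costs: Dict[str, Tuple[float, float]],
                             lam: float,
                             original: str = "first") -> Tuple[bool, str]:
    """Pick the coordinate system with the lowest weighted cost for one block.

    costs maps a coordinate-system name to (rate, distortion) measured for the
    block under that system; `original` names the first coordinate system.
    Returns (is_transformation_block, chosen_system).
    """
    def weighted(name: str) -> float:
        rate, distortion = costs[name]
        return rate + lam * distortion  # assumed form of the weighted value

    best = min(costs, key=weighted)
    return best != original, best
```

A block is treated as a transformation block only when some other coordinate system strictly beats the first one, provided the first coordinate system is listed first in `costs` (ties go to the earliest entry).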
It follows that the purpose of determining whether the multiple blocks include a transformation block is to find the blocks (i.e., the transformation blocks) that achieve a better compression rate and bitrate under the target coordinate system than under the first coordinate system. In other words, compared with the first coordinate system, representing the vertex information in the target coordinate system can improve the compression efficiency of the vertices of a transformation block.
In one possible implementation, to improve decoding efficiency at the decoding end, the encoding end may set coordinate system information for a transformation block. The coordinate system information may include a coordinate system identifier indicating the target coordinate system of the transformation block. For example, a coordinate system identifier of 1 indicates a local normal coordinate system, and a coordinate system identifier of 0 indicates a local cylindrical coordinate system. In addition, the coordinate system information may include a transformation flag indicating whether coordinate system transformation is to be performed on the vertex information in the block. For example, a transformation flag of 1 indicates that coordinate system transformation is performed on the vertex information in the block, and a transformation flag of 0 indicates that it is not.
The vertex information in any transformed block is information in the corresponding target coordinate system. That is, for any transformation block, once its target coordinate system is determined, coordinate system transformation can be performed on the information of at least one vertex in that block to obtain the transformed information of the at least one vertex in the target coordinate system; this constitutes the transformed block.
In one possible implementation, the encoding end may first establish the target coordinate system, determining its origin and axis directions from a reference vertex or a reference triangular face, and then transform the information from the first coordinate system into the target coordinate system. The transformation methods include, but are not limited to, translation, rotation, scaling, and perspective transformation. For example, when the target coordinate system is a local normal coordinate system, its origin may be an encoded vertex or may be derived from an encoded triangular face, and its axes may be generated from the normal and two tangential components at each vertex, as shown in Figure 3 (a schematic diagram of the Cartesian coordinate system and the local normal coordinate system). As another example, when the target coordinate system is a local cylindrical coordinate system, its origin may be an encoded vertex or may be derived from an encoded triangular face, and its axes may be derived from a reference triangle; the vertex coordinates (x, y, z) in the Cartesian coordinate system are then transformed into the dihedral angle θ, radius r, and height z in the local cylindrical coordinate system, as shown in Figure 4 (a schematic diagram of the Cartesian coordinate system and the local cylindrical coordinate system).
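As an illustration of the cylindrical case, the sketch below converts a Cartesian vertex to (θ, r, z) about a local frame and back. It assumes the frame is already available as an origin plus three orthonormal axes expressed in the first coordinate system; how the patent derives that frame from a reference triangle is not reproduced here, and the function names are hypothetical.

```python
import math
from typing import Sequence, Tuple

Vec = Tuple[float, float, float]


def to_cylindrical(p: Vec, origin: Vec, axes: Sequence[Vec]) -> Vec:
    """Cartesian point -> (theta, r, z) in a local frame with orthonormal axes."""
    d = tuple(pi - oi for pi, oi in zip(p, origin))
    # Project the offset onto the local x, y and z axes.
    x, y, z = (sum(di * ai for di, ai in zip(d, axis)) for axis in axes)
    return (math.atan2(y, x), math.hypot(x, y), z)


def from_cylindrical(c: Vec, origin: Vec, axes: Sequence[Vec]) -> Vec:
    """Inverse of to_cylindrical: (theta, r, z) -> Cartesian point."""
    theta, r, z = c
    local = (r * math.cos(theta), r * math.sin(theta), z)
    return tuple(origin[i] + sum(local[k] * axes[k][i] for k in range(3))
                 for i in range(3))
```

The inverse function is what a decoding end would apply to return the vertex to the first coordinate system, up to floating-point rounding.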
It should be noted that, in the embodiments of this application, the vertices in the same patch use the same type of local coordinate system. However, when coordinate system transformation is performed, different local coordinate systems may be created for different vertices: one or more vertices may be transformed into the same local coordinate system, or different vertices may be transformed into different local coordinate systems. No specific limitation is imposed in this regard.
In the embodiments of this application, the encoding end may perform prediction and entropy coding on at least one transformed block to obtain the bitstream. For any transformed block, the transformed information of at least one vertex in the block may be predicted to obtain residual information of the at least one vertex, and the residual information is then encoded to obtain the bitstream.
In one possible implementation, the coordinate system information may be carried in the patch data unit syntax of the bitstream, either by adding new bytes to that syntax or by using its reserved fields. The embodiments of this application impose no specific limitation in this regard.
In one possible implementation, the method further includes: obtaining a non-transformation block, which is a block among the multiple blocks whose vertex information does not undergo coordinate system transformation; and encoding according to the non-transformation block to obtain the bitstream.
A non-transformation block may be a patch in the three-dimensional mesh other than the transformation blocks. Such a block is better suited to the world coordinate system (corresponding to the first coordinate system) than to a local coordinate system (corresponding to the target coordinate system). In other words, the vertices in the three-dimensional mesh may suit different local coordinate systems or the world coordinate system. Based on the characteristics of the vertices, they can be grouped into patches in the most suitable way, so that the vertices in the same patch use the same type of coordinate system transformation. Because the vertex information in each patch uses the most suitable coordinate system, the compression rate of the vertices of the three-dimensional mesh can be improved overall.
In the embodiments of this application, whether to perform coordinate system transformation on the vertex information is determined separately for each of the multiple blocks in the three-dimensional mesh, and the optimal coordinate system is selected for each block to be transformed. The vertex information in that block is then placed in the corresponding coordinate system for subsequent compression, which can improve the compression efficiency of the vertices.
Figure 5 is a flowchart of process 500 of the decoding method provided in an embodiment of this application. Process 500 may be performed by the decoding end described above. Process 500 is described as a series of steps or operations; it should be understood that these may be performed in various orders and/or concurrently, and are not limited to the order shown in Figure 5. Process 500 may include:
Step 501: Receive the bitstream corresponding to the three-dimensional mesh.
The three-dimensional mesh contains M blocks.
Step 502: Decode the bitstream to obtain coordinate information and coordinate system information.
In the embodiments of this application, the coordinate information contains the coordinates of some or all of the vertices in N of the M blocks, and the coordinate system information indicates whether the coordinate system of the N blocks is the first coordinate system, where the first coordinate system is the original coordinate system of the M blocks and 1≤N<M. Optionally, the first coordinate system may be a Cartesian coordinate system, a local normal coordinate system, or a local cylindrical coordinate system.
In one possible implementation, the bitstream contains a first syntax structure corresponding to the three-dimensional mesh, and the first syntax structure contains second syntax structures corresponding to the M blocks. The coordinate system information is located in the first syntax structure and outside the second syntax structures, or the coordinate system information is located in the second syntax structures corresponding to the N blocks.
That is, the bitstream corresponding to the three-dimensional mesh contains two kinds of syntax elements: syntax elements corresponding to the three-dimensional mesh (mesh syntax elements, corresponding to the first syntax structure above) and syntax elements corresponding to the blocks (block syntax elements, corresponding to the second syntax structure above). The coordinate system information in the embodiments of this application may be written into either the first syntax structure or the second syntax structure, and no specific limitation is imposed in this regard.
In one possible implementation, the coordinate system information indicates that the current coordinate system of the N blocks is a second coordinate system, which is the coordinate system obtained by transforming the first coordinate system, and the coordinate information contains the coordinates of some or all of the vertices in the N blocks in the second coordinate system. Optionally, the second coordinate system may be a Cartesian coordinate system, a local normal coordinate system, or a local cylindrical coordinate system. In addition, the coordinate system information may indicate that an inverse coordinate transformation is to be performed during decoding.
The coordinate system information may contain N sets of target coordinate system information together with the correspondence between these N sets and the N blocks. Each set of target coordinate system information contains the type of the target coordinate system, the origin information of the target coordinate system, and the axis information of the target coordinate system.
In the embodiments of this application, when the coordinate system information indicates the second coordinate system corresponding to N of the M blocks, this means that the encoding end has transformed the coordinate information of the vertices in those N blocks from the first coordinate system to the second coordinate system. Correspondingly, the decoding end needs to transform the coordinate information of the vertices in those N blocks from the second coordinate system back to the first coordinate system. The decoding end may do so based on the coordinate system information, in which the target coordinate system information contains the type, the origin information, and the axis information of the target coordinate system, where:
Optionally, the type of the target coordinate system is the first coordinate system or the second coordinate system. For example, identifier 1 represents the first coordinate system and identifier 2 represents the second coordinate system; or identifier 0 represents the first coordinate system and identifier 1 represents the second coordinate system. No specific limitation is imposed in this regard. Optionally, the first coordinate system may be a Cartesian coordinate system, a local normal coordinate system, or a local cylindrical coordinate system.
For example, under the first numbering above, when the decoding end identifies that the type identifier of the target coordinate system is 2, it can determine that the coordinate information of the vertices in the block is to be transformed from the second coordinate system back to the first coordinate system.
Optionally, the origin information of the target coordinate system contains the coordinates of the origin in the first coordinate system, or it contains an origin derivation mode. The encoding end may write the coordinates of the origin in the first coordinate system directly into the bitstream, for example (0,1,0); or it may write the origin derivation mode into the bitstream, for example, identifier 0 indicates that the origin is the previous vertex, and identifier 1 indicates that the origin is the midpoint of the previous two vertices. No specific limitation is imposed in this regard.
The decoding end may take the point at position (0,1,0) in the first coordinate system as the origin of the second coordinate system, or take the previously processed vertex as the origin of the second coordinate system, or calculate the midpoint of the previous two vertices and take that midpoint as the origin of the second coordinate system, and so on. No specific limitation is imposed in this regard.
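The origin options just described can be resolved at the decoding end roughly as follows. The mode numbering (0 for the previous vertex, 1 for the midpoint of the previous two vertices) follows the example in the text; the function name and the convention that explicitly signalled coordinates take priority are assumptions.

```python
from typing import Optional, Sequence, Tuple

Vec = Tuple[float, float, float]


def resolve_origin(mode: Optional[int], decoded: Sequence[Vec],
                   explicit: Optional[Vec] = None) -> Vec:
    """Return the origin of the second coordinate system (hypothetical helper).

    decoded holds already-reconstructed vertices in the first coordinate
    system, most recent last.
    """
    if explicit is not None:   # coordinates written directly in the bitstream
        return explicit
    if mode == 0:              # origin is the previous vertex
        return decoded[-1]
    if mode == 1:              # origin is the midpoint of the previous two vertices
        a, b = decoded[-2], decoded[-1]
        return tuple((x + y) / 2 for x, y in zip(a, b))
    raise ValueError(f"unknown origin derivation mode {mode}")
```

Because the derivation modes depend only on vertices the decoding end has already reconstructed, the origin can be recovered without transmitting any coordinates.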
Optionally, the axis information of the target coordinate system contains the vectors of the coordinate axes in the first coordinate system, or it contains an axis derivation mode. The encoding end may write the axis vectors in the first coordinate system directly into the bitstream, for example, (1,0,0) for the x-axis, (0,1,0) for the y-axis, and (0,0,1) for the z-axis; or it may write the axis derivation mode into the bitstream, for example, identifier 1 indicates the default x-axis (1,0,0), y-axis (0,1,0), and z-axis (0,0,1), and identifier 2 indicates rotation by a preset angle. Optionally, the preset angle may also be written into the bitstream.
The decoding end may take the vector (1,0,0) in the first coordinate system as the x-axis of the second coordinate system, the vector (0,1,0) in the first coordinate system as the y-axis of the second coordinate system, and the vector (0,0,1) in the first coordinate system as the z-axis of the second coordinate system, or it may rotate the first coordinate system by the rotation angle.
Optionally, the origin and axes of the target coordinate system may also be obtained in a manner agreed upon in advance by the encoding end and the decoding end; for example, the default origin is the previous vertex, the default x-axis is (1,0,0), the default y-axis is (0,1,0), and the default z-axis is (0,0,1).
Based on the target coordinate system information above, the decoding end can construct the second coordinate system corresponding to a vertex and then transform the coordinate information of the vertex from the second coordinate system to the first coordinate system.
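For a second coordinate system that is a translated and rotated Cartesian frame, the transformation back to the first coordinate system can be sketched as below. The sketch assumes the decoded axes are orthonormal vectors expressed in the first coordinate system and that no scaling or perspective component is involved; the function names are hypothetical.

```python
from typing import Sequence, Tuple

Vec = Tuple[float, float, float]


def to_second(p: Vec, origin: Vec, axes: Sequence[Vec]) -> Vec:
    """First -> second coordinate system (encoding-end direction)."""
    d = tuple(pi - oi for pi, oi in zip(p, origin))
    return tuple(sum(di * ai for di, ai in zip(d, axis)) for axis in axes)


def to_first(c: Vec, origin: Vec, axes: Sequence[Vec]) -> Vec:
    """Second -> first coordinate system (decoding-end direction)."""
    # Each local coordinate scales its axis vector; the origin is added back.
    return tuple(origin[i] + sum(c[k] * axes[k][i] for k in range(3))
                 for i in range(3))
```

With orthonormal axes the two functions are exact inverses up to floating-point rounding, so the decoding end needs only the signalled origin and axes to undo the encoding-end transformation.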
In one possible implementation, the coordinate system information indicates that the current coordinate system of the N blocks is the first coordinate system, and the coordinate information contains the coordinates of some or all of the vertices in the N blocks in the first coordinate system.
The coordinate system information may contain N block identifiers, which indicate that the N blocks corresponding to those identifiers do not undergo coordinate system transformation.
In the embodiments of this application, the encoding end may signal in reverse: the coordinate system information indicates the blocks that do not undergo coordinate system transformation, while all other blocks undergo coordinate system transformation by default, with their target coordinate system agreed upon in advance by the encoding end and the decoding end. The decoding end can thus leave unchanged the blocks whose identifiers appear in the coordinate system information and perform coordinate system transformation on the blocks that are not indicated.
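The reverse indication amounts to a set difference at the decoding end, as in this minimal sketch (the function name and the use of integer block identifiers are assumptions):

```python
from typing import Iterable, List


def blocks_to_inverse_transform(all_ids: Iterable[int],
                                untransformed_ids: Iterable[int]) -> List[int]:
    """Blocks not listed in the coordinate system information are transformed by default."""
    keep = set(untransformed_ids)
    return [block_id for block_id in all_ids if block_id not in keep]
```

This form of signalling is compact when most blocks are transformed, since only the exceptions are listed.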
Step 503: Obtain reconstruction data of the three-dimensional mesh based on the coordinate information and the coordinate system information.
In the embodiments of this application, the reconstruction data contains M sets of reconstruction information, which correspond one-to-one to the M blocks and each contain the coordinates, in the first coordinate system, of the vertices in the corresponding block.
In one possible implementation, the decoding end transforms the coordinates of the vertices in the N blocks from the second coordinate system to the first coordinate system.
From the bitstream, the decoding end obtains the information of at least one first vertex in a first block to be decoded and the second coordinate system corresponding to the at least one first vertex; performs coordinate system transformation on the information of the at least one first vertex to obtain its transformed information in the first coordinate system; and reconstructs the at least one first vertex according to the transformed information.
In one possible implementation, based on the content of the coordinate system information described above, the decoding end may first determine whether coordinate system transformation is needed for the vertex information in the first patch. If it is, the decoding end obtains the second coordinate system according to the coordinate system identifier.
The process by which the decoding end decodes the bitstream may mirror that of the encoding end: the bitstream is first decoded to obtain residual information, and residual compensation is then performed based on the residual information to obtain the information described above. The encoding end and the decoding end may also use other encoding and decoding methods; the embodiments of this application impose no specific limitation in this regard.
The decoded information of the at least one first vertex of the first patch is in the second coordinate system. Therefore, to achieve reconstruction, coordinate system transformation may be performed on the information of the at least one first vertex obtained by decoding the bitstream, converting it from the second coordinate system into information in the first coordinate system, namely the transformed information of the at least one first vertex.
In the embodiments of this application, the transformed information of the at least one first vertex may take either of the following two forms:
(1) The transformed information of the at least one first vertex includes the position coordinates of the at least one first vertex in the first coordinate system. That is, the information obtained by decoding the bitstream is the position coordinates of the at least one first vertex in the second coordinate system, and, correspondingly, the transformed information after coordinate system transformation is the position coordinates of the at least one first vertex in the first coordinate system.
(2) The transformed information of the at least one first vertex includes the position coordinate residual of the at least one first vertex in the first coordinate system. That is, the information obtained by decoding the bitstream is the position coordinate residual of the at least one first vertex in the second coordinate system, and, correspondingly, the transformed information after coordinate system transformation is the position coordinate residual of the at least one first vertex in the first coordinate system.
It should be noted that the transformed information of the at least one first vertex may also include other information related to the position coordinates, and no specific limitation is imposed in this regard.
The decoding end may reconstruct the at least one first vertex using either of the following two methods:
Method 1: The decoding end may reconstruct the at least one first vertex according to its position coordinates in the first coordinate system. That is, after coordinate system transformation, the position coordinates of the at least one first vertex have already been converted from the second coordinate system to the first coordinate system, so the transformed position coordinates can be used directly as the position coordinates of the at least one first vertex, thereby reconstructing it.
Method 2: The decoding end may perform reconstruction from the position coordinate residual of the at least one first vertex in the first coordinate system to obtain its position coordinates in the first coordinate system, and then reconstruct the at least one first vertex according to those position coordinates. That is, after coordinate system transformation, the position coordinate residual of the at least one first vertex has been converted from the second coordinate system to the first coordinate system; reconstruction based on the residual is still required to obtain the position coordinates in the first coordinate system, which are then used as the position coordinates of the at least one first vertex, thereby reconstructing it.
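The two reconstruction methods differ only in what is added after the coordinate system transformation, as the sketch below shows. It is illustrative only: `to_first` stands for whatever second-to-first transformation applies to the decoded quantity, `prediction` is a predicted position in the first coordinate system whose derivation the text does not specify, and the function name is hypothetical.

```python
from typing import Callable, Optional, Tuple

Vec = Tuple[float, float, float]


def reconstruct_vertex(info: Vec, kind: str,
                       to_first: Callable[[Vec], Vec],
                       prediction: Optional[Vec] = None) -> Vec:
    """Reconstruct one vertex position in the first coordinate system."""
    transformed = to_first(info)  # decoded information, now in the first system
    if kind == "position":        # Method 1: use the transformed position directly
        return transformed
    if kind == "residual":        # Method 2: residual compensation is still needed
        if prediction is None:
            raise ValueError("a prediction is required for residual reconstruction")
        return tuple(p + r for p, r in zip(prediction, transformed))
    raise ValueError(f"unknown information kind {kind!r}")
```

For residuals, `to_first` would normally apply only the linear part of the transformation (no origin offset), since a residual is a difference of positions; this is an assumption of the sketch rather than a statement from the text.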
In the embodiments of this application, the coordinate system in which the vertex information of a patch of the three-dimensional mesh is expressed can be determined by decoding the bitstream, and a coordinate system transformation then yields the reconstructed information in the world coordinate system, which can improve the compression efficiency of the vertices.
The technical solutions of the method embodiments shown in Figures 2 to 7 are described in detail below using several specific embodiments.
Figure 6 is a schematic diagram of the encoding and decoding framework of an embodiment of this application. As shown in Figure 6, the encoding end performs the following processing steps:
1. The encoding end performs coordinate system selection for the patch to be encoded in the three-dimensional mesh and determines the coordinate system applicable to the patch;
2. Perform a coordinate system transformation on the information of the vertices in the patch to be encoded to obtain the transformed information;
In this step, if the coordinate system applicable to the patch to be encoded is consistent with the world coordinate system, no coordinate system transformation is needed for the vertices in the patch to be encoded.
3. In the applicable coordinate system, predict the transformed information of the vertices to obtain the prediction results;
4. Quantize and entropy encode the prediction results, and write the results into the bitstream.
The decoding end performs the following processing steps:
1. The decoding end performs entropy decoding and dequantization on the bitstream to obtain the prediction results of the vertices in the patch to be decoded;
2. Reconstruct from the prediction results to obtain the information of the vertices;
3. Determine the coordinate system in which the patch to be decoded is expressed (corresponding to the applicable coordinate system);
4. Apply to the information of the vertices the inverse of the coordinate system transformation used at the encoding end, to obtain the reconstruction results, in the world coordinate system, of the vertices in the patch to be decoded.
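The encoder and decoder steps of Figure 6 can be illustrated with a small round-trip sketch. Everything here is an assumption made for illustration: a pure translation by `origin` stands in for the coordinate system transformation, prediction is differential (each vertex is predicted from the previously reconstructed one), entropy coding is omitted, and the "bitstream" is a plain dictionary.

```python
from typing import Dict, List, Tuple

Vec3 = Tuple[int, int, int]


def _round_div(value: int, step: int) -> int:
    """Integer division rounded to the nearest integer (ties round up)."""
    return (2 * value + step) // (2 * step)


def encode_patch(vertices: List[Vec3], use_local: bool,
                 origin: Vec3, step: int) -> Dict:
    # Steps 1-2: if a local system was selected, transform into it.
    # A translation by `origin` stands in for the real transformation.
    if use_local:
        points = [tuple(c - o for c, o in zip(v, origin)) for v in vertices]
    else:
        points = list(vertices)
    # Steps 3-4: predict each vertex from the previously reconstructed
    # vertex and quantize the residual (entropy coding is omitted).
    symbols = []
    prev = (0, 0, 0)
    for p in points:
        q = tuple(_round_div(c - pc, step) for c, pc in zip(p, prev))
        symbols.append(q)
        prev = tuple(pc + qc * step for pc, qc in zip(prev, q))
    return {"use_local": use_local, "origin": origin,
            "step": step, "symbols": symbols}


def decode_patch(stream: Dict) -> List[Vec3]:
    # Steps 1-2: dequantize and reconstruct in the coded coordinate system.
    step = stream["step"]
    points = []
    prev = (0, 0, 0)
    for q in stream["symbols"]:
        prev = tuple(pc + qc * step for pc, qc in zip(prev, q))
        points.append(prev)
    # Steps 3-4: check the coordinate system flag and invert the transform.
    if stream["use_local"]:
        o = stream["origin"]
        points = [tuple(c + oc for c, oc in zip(p, o)) for p in points]
    return points
```

With `step = 1` the round trip is lossless; with a larger step, predicting from the reconstructed rather than the original previous vertex keeps the per-coordinate error within half a step.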
Figure 7 is a schematic diagram of the encoding and decoding framework of an embodiment of this application. As shown in Figure 7, the encoding end performs the following processing steps:
I. Encoding the connectivity
The connectivity of a three-dimensional mesh, that is, the index values of the three vertices of each triangle, can be coded lossily or losslessly using algorithms such as EdgeBreaker or TFAN. For example, the EdgeBreaker algorithm labels each triangle with one of a finite set of state symbols (for example, "CLERS") according to its position relative to the set of already encoded triangles, and in this way completes a traversal encoding of all triangles. Traversing all triangles also yields a traversal order of all vertices, which is used for the subsequent encoding of vertex coordinates and other attribute information.
II. Encoding the vertex information
1. Coordinate system selection
The encoding end performs coordinate system selection on the information (in the Cartesian coordinate system) of the vertices in the patch to be encoded in the three-dimensional mesh, and determines the coordinate system (a local coordinate system) applicable to the patch to be encoded. The applicable coordinate system can be written into the bitstream so that the decoding end obtains it, or it can be left out of the bitstream and obtained by the decoding end from local prior information.
A patch can be obtained by methods such as mesh segmentation or triangle-face clustering, and may be a set of vertices.
Methods for selecting the coordinate system applicable to the patch to be encoded may include rate-distortion optimization, the distribution of vertex positions before and after transformation, mesh triangle attributes, or a preset coordinate system agreed on by the encoder and decoder. For example, rate-distortion optimization can make the decision by weighing the amount of data required for coding (the rate) against the distortion (the loss in mesh coding quality) when the patch to be encoded is coded in different local coordinate systems.
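A rate-distortion decision of this kind is commonly expressed as minimizing a Lagrangian cost J = D + λR over the candidates. The sketch below assumes that form; the text itself does not specify the cost function, and the candidate names, the numbers in the usage note and the parameter `lam` are hypothetical.

```python
from typing import Dict, Tuple


def select_coordinate_system(candidates: Dict[str, Tuple[float, float]],
                             lam: float) -> str:
    """Return the candidate with the smallest cost D + lam * R.

    `candidates` maps a coordinate system name to (rate_in_bits, distortion),
    as measured by trial-coding the patch in that coordinate system.
    """
    def cost(name: str) -> float:
        rate, distortion = candidates[name]
        return distortion + lam * rate

    return min(candidates, key=cost)
```

For instance, with `{"cartesian": (1000, 2.0), "cylindrical": (800, 2.5)}`, a multiplier of 0.01 favors the cylindrical system (cost 10.5 against 12.0), while a multiplier of 0.001 favors the Cartesian one (3.0 against 3.3).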
2. Local coordinate system transformation
After coordinate system selection, a patch to be encoded for which a local coordinate system is applicable undergoes a coordinate system transformation.
A local coordinate system is established first; this can rely on reference vertices or a reference triangle to determine the origin and the directions of the axes. The vertex information in the Cartesian coordinate system is then transformed into this local coordinate system. Coordinate system transformation methods include, but are not limited to, translation, rotation, scaling, and perspective transformations. For example, in a local normal coordinate system transformation, the origin can be an already encoded vertex or can be derived from an already encoded triangle, and the axes of the local normal coordinate system can be generated from the normal and two tangential components at each vertex; see Figure 4. As another example, in a local cylindrical coordinate system transformation, the origin can be an already encoded vertex or can be derived from an already encoded triangle, and the axes of the local cylindrical coordinate system can be obtained from a reference triangle; the vertex coordinates (x, y, z) in the Cartesian coordinate system are then transformed into the dihedral angle θ, radius r, and height z in the local cylindrical coordinate system; see Figure 5. The local coordinate system transformation is not limited to a single coordinate system; that is, several coordinate system transformations can coexist in one three-dimensional mesh.
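The local cylindrical example can be sketched as follows. The way the frame is derived from the reference triangle is an assumption of this sketch (origin at the first triangle vertex, height axis along the first edge, zero-angle direction toward the third vertex); the construction in Figure 5 may differ. The function names are hypothetical.

```python
import math
from typing import Tuple

Vec3 = Tuple[float, float, float]
Frame = Tuple[Vec3, Vec3, Vec3]


def _sub(a: Vec3, b: Vec3) -> Vec3:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a: Vec3, b: Vec3) -> Vec3:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _normalize(a: Vec3) -> Vec3:
    n = math.sqrt(_dot(a, a))  # zero for a degenerate triangle
    return (a[0] / n, a[1] / n, a[2] / n)


def frame_from_triangle(t0: Vec3, t1: Vec3, t2: Vec3) -> Tuple[Vec3, Frame]:
    """Assumed construction: origin t0, height axis w along edge t0->t1,
    u toward t2 (made perpendicular to w), v completing a right-handed frame."""
    w = _normalize(_sub(t1, t0))
    e = _sub(t2, t0)
    along = _dot(e, w)
    u = _normalize((e[0] - along * w[0], e[1] - along * w[1], e[2] - along * w[2]))
    v = _cross(w, u)
    return t0, (u, v, w)


def to_cylindrical(p: Vec3, origin: Vec3, axes: Frame) -> Vec3:
    """Cartesian (x, y, z) -> local (theta, r, z)."""
    u, v, w = axes
    d = _sub(p, origin)
    x, y, z = _dot(d, u), _dot(d, v), _dot(d, w)
    return (math.atan2(y, x), math.hypot(x, y), z)


def from_cylindrical(c: Vec3, origin: Vec3, axes: Frame) -> Vec3:
    """Local (theta, r, z) -> Cartesian (x, y, z)."""
    theta, r, z = c
    u, v, w = axes
    x, y = r * math.cos(theta), r * math.sin(theta)
    return tuple(origin[i] + x * u[i] + y * v[i] + z * w[i] for i in range(3))
```

Because the frame is orthonormal, `from_cylindrical` inverts `to_cylindrical` up to floating-point error. A degenerate reference triangle (zero-length edge or collinear vertices) makes the normalization divide by zero and would need separate handling.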
3. Prediction in the corresponding coordinate system
Prediction in the Cartesian coordinate system can use spatial prediction or spatio-temporal prediction, depending on the reference relationship; it can use differential prediction or neighborhood-weighted prediction, depending on the reference information; or it can combine several prediction methods. No specific limitation is imposed here.
Prediction in the local coordinate system can use the same prediction method as in the Cartesian coordinate system or a different one; the embodiments of this application impose no specific limitation.
If a reference vertex or reference triangle needed for prediction is not in the same coordinate system as the patch to be encoded, direct prediction is not possible. In the embodiments of this application, the value of the reference vertex or triangle mapped into the coordinate system of the patch being predicted can be computed and used as the reference information for prediction; alternatively, the vertex can be mapped into the coordinate system of the reference vertex or triangle, the prediction result computed there, and the prediction result mapped back to the coordinate system of the vertex; alternatively, prediction can be skipped; alternatively, prediction can be based on prior information, which may be a fixed value, a corresponding reference value computed from the mesh, and so on. No specific limitation is imposed here.
The values obtained after the coordinate system transformation can also be further transformed or mapped before prediction in the corresponding coordinate system is performed. The further transformation can be a wavelet transform, a discrete cosine transform, and so on.
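Two of the prediction methods named above, differential prediction and neighborhood-weighted prediction, can be sketched as follows. The function names, the choice of a zero predictor for the first vertex, and the weights are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def differential_residuals(values: List[Vec3]) -> List[Vec3]:
    """Differential prediction: each vertex is predicted by the previous one
    in traversal order; the first vertex is predicted by (0, 0, 0)."""
    residuals = []
    prev = (0, 0, 0)
    for v in values:
        residuals.append(tuple(c - p for c, p in zip(v, prev)))
        prev = v
    return residuals


def weighted_neighbor_prediction(neighbors: Sequence[Vec3],
                                 weights: Sequence[float]) -> Vec3:
    """Neighborhood-weighted prediction: weighted average of neighbor positions."""
    total = sum(weights)
    return tuple(
        sum(w * n[i] for n, w in zip(neighbors, weights)) / total
        for i in range(3)
    )
```

The residual that is passed on to quantization is the difference between a vertex and its prediction, whichever predictor is chosen.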
4. Quantize and entropy encode the prediction residuals of the vertices, and write the results into the bitstream
The prediction residuals can be quantized in several ways, for example by scalar quantization or vector quantization; no limitation is imposed here. In addition, the quantization can use the same precision and method for patches to be encoded in different coordinate systems, or different precisions and methods for patches in different coordinate systems. If the coordinate system selected for the patch to be encoded has to be conveyed to the decoding end through the bitstream, that information can itself undergo prediction and entropy coding.
The bitstream also includes coordinate system information, which indicates the coordinate system applicable to the patch to be encoded.
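A minimal sketch of uniform scalar quantization with a step size that depends on the coordinate system is shown below. The step values and the dictionary `STEP_BY_SYSTEM` are hypothetical; the text only states that the precision may be the same or different across coordinate systems.

```python
import math

# Hypothetical per-coordinate-system step sizes (smaller step = finer precision).
STEP_BY_SYSTEM = {"cartesian": 0.5, "cylindrical": 0.25}


def quantize(value: float, step: float) -> int:
    """Uniform scalar quantization: index of the nearest multiple of `step`."""
    return math.floor(value / step + 0.5)


def dequantize(level: int, step: float) -> float:
    """Inverse quantization: map the index back to a reconstruction value."""
    return level * step
```

The reconstruction error of a single value is at most half a step, so a patch coded with the finer `"cylindrical"` step above would be reconstructed more precisely, at the cost of larger quantization indices to entropy code.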
The decoding end proceeds as follows:
I. Decoding the connectivity
II. Decoding the vertex information
1. Perform entropy decoding and dequantization on the bitstream to obtain the prediction results
The entropy decoding and dequantization operations correspond to step 4 at the encoding end.
2. Coordinate system determination
By decoding the bitstream, the decoding end obtains the coordinate system information, which is used to determine the coordinate system in which the patch to be decoded is expressed.
3. Reconstruction in the corresponding coordinate system
Reconstruction in the corresponding coordinate system corresponds to step 3 at the encoding end, including the way reference vertices or triangles are computed: when a reference vertex or reference triangle is not in the same coordinate system as the patch being reconstructed, the corresponding reference value likewise has to be computed.
4. Inverse local coordinate system transformation
The inverse local coordinate system transformation corresponds to step 2 at the encoding end. First, the origin and axes of the local coordinate system are determined; the reference vertices or reference triangles this relies on may be vertices or triangles already decoded from the bitstream, or vertices or triangles obtained from local prior information for determining the local coordinate system. Then the vertex values in the local coordinate system are transformed to the Cartesian coordinate system.
5. Mesh merging
The decoding end can merge the reconstruction results of the vertices of multiple patches to obtain the reconstructed three-dimensional mesh. To merge patches coded in the Cartesian coordinate system with patches coded in local coordinate systems, the union of the vertices from both can be taken directly as the reconstruction result of the three-dimensional mesh, with the mesh vertices reordered according to the connectivity; or the mesh vertices can be traversed according to the connectivity and the reconstruction result obtained according to the coordinate system determined for each patch to be decoded. In addition, some or all vertices of the three-dimensional mesh can be modified to handle misalignment or discontinuities caused by merging vertices from different coordinate systems; the modification can be mesh filtering, hole filling, and so on. The embodiments of this application impose no specific limitation on the merging method.
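The first merging option (take the union of the reconstructed vertices, then reorder them according to the connectivity) could look like the sketch below. The data layout is an assumption: each patch is represented as a dictionary from global vertex index to a position already transformed back to the Cartesian coordinate system, and `vertex_order` stands for the vertex traversal order obtained when decoding the connectivity.

```python
from typing import Dict, List, Tuple

Vec3 = Tuple[float, float, float]


def merge_patches(patches: List[Dict[int, Vec3]],
                  vertex_order: List[int]) -> List[Vec3]:
    """Union the per-patch reconstructions and return them in `vertex_order`."""
    merged: Dict[int, Vec3] = {}
    for patch in patches:
        for index, position in patch.items():
            # If two patches both carry a vertex, keep the first value seen.
            # A real decoder might instead filter or average at the seam.
            merged.setdefault(index, position)
    return [merged[i] for i in vertex_order]
```

Keeping the first value for a shared vertex is only one possible policy; the filtering and hole-filling step mentioned above would operate on the merged result.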
The Cartesian coordinate system is well suited to compressing vertex data in which adjacent vertices have similar motion vectors, but it is less efficient for detail-rich, wrinkled regions of a three-dimensional mesh. A local coordinate system can concentrate the distribution of the data to be compressed into fewer dimensions and thereby improve coding efficiency, but the quantization error of a vertex affects the establishment of subsequent local coordinate systems and thereby lowers the prediction efficiency for subsequent vertices. The embodiments of this application can apply a suitable coordinate system transformation to each kind of vertex, which addresses the low compression efficiency in local regions that results from applying a single transformation, or no transformation, to the whole mesh. Compared with existing methods, the bit rate can be reduced by approximately 15% at the same reconstruction quality.
Figure 8 is a schematic diagram of the encoding and decoding framework of an embodiment of this application. As shown in Figure 8, and unlike the embodiment shown in Figure 7, in this embodiment the encoding end first predicts the vertex information of the patch to be encoded in the Cartesian coordinate system, then applies a coordinate system transformation to the prediction result, and then performs prediction in the local coordinate system. Optionally, the local coordinate system prediction can be omitted, that is, only the local coordinate system transformation is performed and the transformation result is quantized and entropy encoded.
The decoding end mirrors the encoding end: it performs coordinate system selection on the entropy-decoded and dequantized values, applies the inverse coordinate system transformation to the vertices in the patch to be decoded, performs reconstruction in the local coordinate system, then merges the mesh, and finally performs reconstruction in the Cartesian coordinate system.
Figure 9 is a schematic diagram of the encoding and decoding framework of an embodiment of this application. As shown in Figure 9, the encoding end performs the following processing steps:
1. Obtain a first patch to be encoded. The first patch includes at least one first vertex of the three-dimensional mesh, and the information of the at least one first vertex is information in a first coordinate system; the information of the at least one first vertex includes the position coordinates of the at least one first vertex in the first coordinate system.
2. Perform a coordinate system transformation on the information of the at least one first vertex to obtain transformed information of the at least one first vertex in a second coordinate system, the second coordinate system being different from the first coordinate system;
3. Predict the transformed information of the at least one first vertex to obtain residual information of the at least one first vertex;
4. When it is determined that the first patch does not require coordinate system transformation, perform a coordinate system transformation on the residual information of the at least one first vertex to obtain transformed residual information of the at least one first vertex in the first coordinate system; or, when it is determined that the first patch requires coordinate system transformation, encode the residual information of the at least one first vertex to obtain the bitstream.
5. Encode the transformed residual information of the at least one first vertex to obtain the bitstream.
The encoding end can predict the transformed residual information of the at least one first vertex to obtain residual information of the at least one first vertex, and encode that residual information to obtain the bitstream.
The decoding end performs the following processing steps:
1. Receive the bitstream;
2. Obtain, from the bitstream, residual information of at least one first vertex in a first patch to be decoded, the residual information of the at least one first vertex being information in the first coordinate system;
3. When it is determined that the first patch does not require coordinate system transformation, perform a coordinate system transformation on the residual information of the at least one first vertex to obtain transformed residual information of the at least one first vertex in a second coordinate system, the second coordinate system being different from the first coordinate system;
4. Perform reconstruction based on the transformed residual information of the at least one first vertex to obtain the information of the at least one first vertex;
5. Perform a coordinate system transformation on the information of the at least one first vertex to obtain transformed information of the at least one first vertex in the first coordinate system;
6. Reconstruct the at least one first vertex based on the transformed information of the at least one first vertex.
In addition, the decoding end can also receive the bitstream; obtain, from the bitstream, residual information of at least one second vertex in a second patch to be decoded, the residual information of the at least one second vertex being information in the second coordinate system; when it is determined that the first patch requires coordinate system transformation, perform reconstruction based on the residual information of the at least one first vertex to obtain the information of the at least one first vertex; perform a coordinate system transformation on the information of the at least one first vertex to obtain transformed information of the at least one first vertex in the first coordinate system; and reconstruct the at least one first vertex based on the transformed information of the at least one first vertex.
Compared with the embodiment shown in Figure 8, this embodiment swaps the order of the local coordinate system transformation and prediction with the Cartesian coordinate system prediction. That is, the patch to be encoded can first undergo local coordinate system transformation and prediction, the prediction result is then inversely transformed back to the Cartesian coordinate system, and a second prediction is performed in the Cartesian coordinate system. The decoding end mirrors the encoding end: it performs coordinate system determination on the entropy-decoded and dequantized values, performs Cartesian coordinate system reconstruction and local coordinate system transformation on the patch to be decoded, then merges the mesh, and finally performs reconstruction in the local coordinate system and the inverse local coordinate system transformation to obtain the reconstructed mesh.
Optionally, the encoding end can add a quantization step after the first prediction (that is, the Cartesian coordinate system prediction in Figure 8, or the local coordinate system prediction in Figure 9) and before the coordinate system selection; the decoding end can add a corresponding dequantization step.
Figure 10 is a schematic structural diagram of the encoding apparatus 1000 of this application. As shown in Figure 10, the encoding apparatus 1000 of this embodiment can be applied to the encoding end described above. The encoding apparatus 1000 may include a transceiver module 1001 and an encoding module 1002, where:
The transceiver module 1001 is configured to obtain original data of a three-dimensional mesh, where the three-dimensional mesh contains M blocks, the original data contains M sets of original information, the M sets of original information correspond one-to-one to the M blocks and each contains the coordinates, in a first coordinate system, of the vertices in the corresponding block, and the first coordinate system is the original coordinate system of the M blocks. The encoding module 1002 is configured to encode, based on the original data, coordinate information and coordinate system information into the bitstream corresponding to the three-dimensional mesh, where the coordinate information contains the coordinates of some or all vertices in N of the M blocks, the coordinate system information indicates whether the current coordinate system of the N blocks is the first coordinate system, and 1 ≤ N < M.
In one possible implementation, the coordinate system information indicates that the current coordinate system of the N blocks is a second coordinate system, the second coordinate system is a coordinate system obtained by transforming the first coordinate system, and the coordinate information contains the coordinates, in the second coordinate system, of some or all vertices in the N blocks.
In one possible implementation, the coordinate system information further indicates that an inverse coordinate transformation is to be performed during decoding.
In one possible implementation, the encoding module 1002 is further configured to transform the coordinates of the vertices in the N blocks from the first coordinate system to the second coordinate system.
In one possible implementation, the coordinate system information indicates that the current coordinate system of the N blocks is the first coordinate system, and the coordinate information contains the coordinates, in the first coordinate system, of some or all vertices in the N blocks.
In one possible implementation, the bitstream contains a first syntax structure corresponding to the three-dimensional mesh, and the first syntax structure contains second syntax structures corresponding to the M blocks; the coordinate system information is located in the first syntax structure and outside the second syntax structures, or the coordinate system information is located in the second syntax structures corresponding to the N blocks.
In one possible implementation, the coordinate system information contains N sets of second coordinate system information and the correspondence between the N sets of second coordinate system information and the N blocks; each set of second coordinate system information contains the type of the second coordinate system, origin information of the second coordinate system, and axis information of the second coordinate system.
In one possible implementation, the origin information of the second coordinate system contains the coordinates of the origin in the first coordinate system; or the origin information of the second coordinate system contains an acquisition mode of the origin.
In one possible implementation, the axis information of the second coordinate system contains the vectors of the axes in the first coordinate system; or the axis information of the second coordinate system contains an acquisition mode of the axes.
The apparatus of this embodiment can be used to carry out the technical solution of the method embodiment shown in Figure 2. Its implementation principle and technical effect are similar and are not repeated here.
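The content of the coordinate system information described in these implementations can be pictured as a small data structure. The sketch below is illustrative only: the class and field names are hypothetical, and integer mode identifiers are assumed for the "acquisition mode" alternatives; the actual bitstream syntax is not specified here.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class SecondCoordinateSystemInfo:
    system_type: str                                   # e.g. "cylindrical"
    origin: Optional[Vec3] = None                      # origin in the first coordinate system
    origin_mode: Optional[int] = None                  # or: how the decoder derives the origin
    axes: Optional[Tuple[Vec3, Vec3, Vec3]] = None     # axis vectors in the first coordinate system
    axes_mode: Optional[int] = None                    # or: how the decoder derives the axes

    def __post_init__(self) -> None:
        # Each of origin and axes is given either explicitly or by a mode.
        if (self.origin is None) == (self.origin_mode is None):
            raise ValueError("give exactly one of origin or origin_mode")
        if (self.axes is None) == (self.axes_mode is None):
            raise ValueError("give exactly one of axes or axes_mode")


@dataclass
class CoordinateSystemInfo:
    systems: List[SecondCoordinateSystemInfo]   # N sets of second coordinate system information
    block_to_system: Dict[int, int]             # block index -> index into `systems`
```

Signalling a mode instead of explicit coordinates lets the decoder derive the origin or axes from already decoded data, which trades a few bits of signalling for the cost of carrying full vectors.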
Figure 11 is a schematic structural diagram of the decoding apparatus 1100 of this application. As shown in Figure 11, the decoding apparatus 1100 of this embodiment can be applied to the decoding end described above. The decoding apparatus 1100 may include a transceiver module 1101 and a decoding module 1102, where:
The transceiver module 1101 is configured to receive the bitstream corresponding to a three-dimensional mesh, where the three-dimensional mesh contains M blocks. The decoding module 1102 is configured to decode the bitstream to obtain coordinate information and coordinate system information, where the coordinate information contains the coordinates of some or all vertices in N of the M blocks, the coordinate system information indicates whether the coordinate system of the N blocks is a first coordinate system, the first coordinate system is the original coordinate system of the M blocks, and 1 ≤ N < M; and to obtain, based on the coordinate information and the coordinate system information, reconstruction data of the three-dimensional mesh, where the reconstruction data contains M sets of reconstruction information, and the M sets of reconstruction information correspond one-to-one to the M blocks and each contains the coordinates, in the first coordinate system, of the vertices in the corresponding block.
In one possible implementation, the coordinate system information indicates that the current coordinate system of the N blocks is a second coordinate system, the second coordinate system is a coordinate system obtained by transforming the first coordinate system, and the coordinate information contains the coordinates, in the second coordinate system, of some or all vertices in the N blocks.
In one possible implementation, the coordinate system information further indicates that an inverse coordinate transformation is to be performed during decoding.
In one possible implementation, the decoding module 1102 is further configured to transform the coordinates of the vertices in the N blocks from the second coordinate system to the first coordinate system.
In one possible implementation, the coordinate system information indicates that the current coordinate system of the N blocks is the first coordinate system, and the coordinate information contains the coordinates, in the first coordinate system, of some or all vertices in the N blocks.
In one possible implementation, the bitstream contains a first syntax structure corresponding to the three-dimensional mesh, and the first syntax structure contains second syntax structures corresponding to the M blocks; the coordinate system information is located in the first syntax structure and outside the second syntax structures, or the coordinate system information is located in the second syntax structures corresponding to the N blocks.
In one possible implementation, the coordinate system information contains N sets of second coordinate system information and the correspondence between the N sets of second coordinate system information and the N blocks; each set of second coordinate system information contains the type of the second coordinate system, origin information of the second coordinate system, and axis information of the second coordinate system.
In one possible implementation, the origin information of the second coordinate system contains the coordinates of the origin in the first coordinate system; or the origin information of the second coordinate system contains an acquisition mode of the origin.
In one possible implementation, the axis information of the second coordinate system contains the vectors of the axes in the first coordinate system; or the axis information of the second coordinate system contains an acquisition mode of the axes.
The apparatus of this embodiment can be used to carry out the technical solution of the method embodiment shown in Figure 5. Its implementation principle and technical effect are similar and are not repeated here.
图12为本申请提供的电子设备1200的示意性结构图。电子设备1200可包括:处理器1201和收发电路1202。可选地,还包括存储器1203。Figure 12 is a schematic structural diagram of the electronic device 1200 provided in this application. The electronic device 1200 may include a processor 1201 and a transceiver circuit 1202. Optionally, it may also include a memory 1203.
电子设备1200的各个组件通过总线1204耦合在一起,其中总线1204除包括数据总线之外,还包括电源总线、控制总线和状态信号总线等。但是为了清楚说明起见,在图中将各种总线都称为总线1204。The various components of electronic device 1200 are coupled together via bus 1204, which includes not only a data bus but also a power bus, a control bus, and a status signal bus. However, for clarity, all buses are referred to as bus 1204 in the figure.
可选地,存储器1203可以用于存储上文方法实施例中的指令。Optionally, the memory 1203 can be used to store the instructions in the above method embodiments.
处理器1201可用于执行存储器1203中的指令,并控制收发电路1202接收信号,以及控制收发电路1202发送信号。The processor 1201 can be used to execute instructions in the memory 1203, control the transceiver circuit 1202 to receive signals, and control the transceiver circuit 1202 to send signals.
电子设备1200可以是上文方法实施例中的编/解码端的电子设备或电子设备中的芯片。Electronic device 1200 can be the electronic device at the encoding/decoding end in the above method embodiment or a chip in an electronic device.
In implementation, each step of the foregoing method embodiments can be completed by an integrated logic circuit of hardware in the processor or by instructions in software form. The processor may be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The general-purpose processor may be a microprocessor or any conventional processor. The steps of the methods disclosed in the embodiments of this application may be performed directly by a hardware encoding processor, or by a combination of hardware and software modules in an encoding processor. The software modules may reside in a storage medium that is mature in the art, such as random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, or a register. The storage medium is located in the memory; the processor reads the information in the memory and completes the steps of the foregoing methods in combination with its hardware.
The memory mentioned in the foregoing embodiments may be volatile memory or non-volatile memory, or may include both. The non-volatile memory may be read-only memory (ROM), programmable ROM (PROM), erasable PROM (EPROM), electrically EPROM (EEPROM), or flash memory. The volatile memory may be random access memory (RAM), which is used as an external cache. By way of example and not limitation, many forms of RAM are available, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), and direct rambus RAM (DR RAM). It should be noted that the memory of the systems and methods described herein is intended to include, but is not limited to, these and any other suitable types of memory.
A person of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, or in a combination of computer software and electronic hardware. Whether these functions are performed in hardware or software depends on the specific application and the design constraints of the technical solution. A skilled person may use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of this application.
A person skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the systems, apparatuses, and units described above can be found in the corresponding processes in the foregoing method embodiments and are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed systems, apparatuses, and methods may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative. The division into units is only a division by logical function, and other divisions are possible in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, apparatuses, or units, and may be electrical, mechanical, or in another form.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of this application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit.
If the functions are implemented in the form of software functional units and sold or used as an independent product, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (a personal computer, a server, a network device, or the like) to execute all or some of the steps of the methods described in the embodiments of this application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, or an optical disc.
The foregoing is merely a description of specific embodiments of this application, but the scope of protection of this application is not limited thereto. Any variation or substitution that a person skilled in the art could readily conceive within the technical scope disclosed in this application shall fall within the scope of protection of this application. Therefore, the scope of protection of this application shall be determined by the scope of protection of the claims.
Claims (22)
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202410539064.2A CN120881295A (en) | 2024-04-30 | 2024-04-30 | Encoding and decoding method and device |
| CN202410539064.2 | 2024-04-30 | | |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2025227794A1 (en) | 2025-11-06 |
Family
ID=97472427
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/CN2024/143135 (pending), WO2025227794A1 (en) | 2024-04-30 | 2024-12-27 | Encoding method and apparatus, and decoding method and apparatus |
Country Status (2)
| Country | Link |
|---|---|
| CN (1) | CN120881295A (en) |
| WO (1) | WO2025227794A1 (en) |
Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20090184956A1 (en) * | 2008-01-21 | 2009-07-23 | Samsung Electronics Co., Ltd. | Method, medium, and system for compressing and decoding mesh data in three-dimensional mesh model |
| US20120075302A1 (en) * | 2009-06-10 | 2012-03-29 | Thomson Licensing Llc | Method for encoding/decoding a 3d mesh model that comprises one or more components |
| WO2023164603A1 (en) * | 2022-02-24 | 2023-08-31 | Innopeak Technology, Inc. | Efficient geometry component coding for dynamic mesh coding |
| US20240054684A1 (en) * | 2022-08-10 | 2024-02-15 | Sharp Kabushiki Kaisha | 3d data decoding apparatus and 3d data coding apparatus |
| WO2024084931A1 (en) * | 2022-10-18 | 2024-04-25 | ソニーグループ株式会社 | Information processing device and method |
2024
- 2024-04-30: CN application CN202410539064.2A, published as CN120881295A (en), status: active, pending
- 2024-12-27: WO application PCT/CN2024/143135, published as WO2025227794A1 (en), status: active, pending
Also Published As
| Publication number | Publication date |
|---|---|
| CN120881295A (en) | 2025-10-31 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 24937820; Country of ref document: EP; Kind code of ref document: A1 |