WO2018121798A1

WO2018121798A1 - Video coding and decoding device and method based on depth automatic coder

Info

Publication number: WO2018121798A1
Application number: PCT/CN2018/074719
Authority: WO
Inventors: 陈天石; 支天; 罗宇哲; 刘少礼; 郭崎; 陈云霁
Original assignee: Shanghai Cambricon Information Technology Co Ltd
Current assignee: Shanghai Cambricon Information Technology Co Ltd
Priority date: 2016-12-30
Filing date: 2018-01-31
Publication date: 2018-07-05
Anticipated expiration: 2019-06-30
Also published as: CN107046646B; CN107046646A

Abstract

The present disclosure provides a video coding and decoding device and method based on depth automatic coder. The device comprises: a depth automatic coder module, a neural network codec module, and a mixed codec module. The depth automatic coder module comprises a depth automatic coder. The depth automatic coder comprises a coding end. The coding end is used for performing first compression on an original video to obtain first-compressed data. The neural network codec module is used for performing coding and compression on a decoding end parameter, so as to generate a decoding end parameter that has been coded. The mixed codec module is used for performing mixed coding on the first-compressed data and the decoding end parameter that has been coded, so as to obtain the video compression data. In the present disclosure, the compression rate of video data is improved by constructing a coding end and a decoding end that are symmetrical in the structure and by performing secondary compression and decompression on the video data. Because of a non-linear characteristic of an artificial neural network, the integration of video data compression and encryption is implemented, by using a parameter of the artificial neural network as a key. The coding result of the video data contains a characteristic of the video data, which facilitates the classification and search of the video data and provides a broad development space and application prospect. Without the complicated coding and decoding process in manual design, the automatic extraction of a data characteristic by the depth automatic coder greatly reduces manual intervention, so that the automation of the coding process is implemented, the implementation is simple, and the scalability is good, and accordingly, the present invention can be used to compress video data and can also compresses other data.

Description

Video codec device and method based on depth automatic encoder

Technical field

本披露涉及视频压缩和解压领域，尤其涉及一种基于深度自动编码器的视频编解码装置及方法。The present disclosure relates to the field of video compression and decompression, and in particular, to a video encoding and decoding apparatus and method based on a depth auto-encoder.

Background technique

随着互联网时代的到来，视频数据的大量产生对传输能力提出了更高的要求。为了缓解传输压力，视频编码解码技术应运而生，并对压缩视频以便于传输发挥了巨大的作用。With the advent of the Internet era, the massive generation of video data puts higher demands on transmission capabilities. In order to alleviate the transmission pressure, video coding and decoding technology came into being, and played a huge role in compressing video for transmission.

传统的视频编码技术是针对视频中存在的各种类型的冗余采用不同的方法予以消除从而达到压缩视频的目的。例如，针对视频的时间冗余、空间冗余、视觉冗余和编码冗余技术人员分别采取帧间编码、帧内编码、量化和熵编码等方法。变换也是去除空间冗余的常用方法。每种视频编码方法都有对应的解码方法。复杂的编码标准通过组合不同的方法和采用不同的实现方式以达到较好的压缩比。The traditional video coding technology is to eliminate the various types of redundancy existing in the video by different methods to achieve the purpose of compressing video. For example, techniques for temporal redundancy, spatial redundancy, visual redundancy, and coding redundancy for video use inter-frame coding, intra-frame coding, quantization, and entropy coding, respectively. Transforming is also a common method of removing spatial redundancy. Each video encoding method has a corresponding decoding method. Complex coding standards achieve better compression ratios by combining different methods and using different implementations.

传统的视频编码技术虽然已经较为成熟，但是比较复杂，需要精巧的人工设计，才能达到较好的压缩效果。Although the traditional video coding technology is relatively mature, it is more complicated and requires sophisticated manual design to achieve better compression.

公开内容Public content

有鉴于此，本披露的主要目的在于提供一种基于深度自动编码器的视频编解码装置及方法。In view of this, the main purpose of the present disclosure is to provide a video encoding and decoding apparatus and method based on a depth auto-encoder.

本披露的基于深度自动编码器的视频编解码装置，包括：深度自动编码器模块，包括深度自动编码器，所述深度自动编码器包括编码端，所述编码端用于对原始视频进行首次压缩得到首次压缩数据；神经网络编解码模块，用于对解码端参数进行编码压缩，生成编码后的解码端参数；混合编解码模块，用于对首次压缩数据和编码后的解码端参数进行混合编码，得到视频压缩数据。The depth codec-based video codec device of the present disclosure includes: a depth auto-encoder module, including a depth auto-encoder, the depth auto-encoder includes an encoding end, and the encoding end is used to compress the original video for the first time. Obtaining the first compressed data; the neural network codec module is configured to encode and compress the decoding end parameters to generate the encoded decoding end parameters; the hybrid codec module is configured to perform hybrid encoding on the first compressed data and the encoded decoding end parameters. , get video compression data.

在本披露的一些实施例中，所述编码端为N层人工神经网络结构。In some embodiments of the disclosure, the encoding end is an N-layer artificial neural network structure.

在本披露的一些实施例中，N层人工神经网络结构的第1层为输入层，第2至N层为隐含层，层间单元全连接，层内单元无连接，第N层隐含层的隐单元数小于输入层的输入单元数。In some embodiments of the present disclosure, the first layer of the N-layer artificial neural network structure is an input layer, the second to N layers are hidden layers, the inter-layer units are fully connected, the intra-layer elements are not connected, and the N-th layer is implicit. The number of hidden cells in the layer is less than the number of input cells in the input layer.

在本披露的一些实施例中，所述混合编码包括熵编码。In some embodiments of the disclosure, the hybrid encoding comprises entropy encoding.

在本披露的一些实施例中，所述熵编码包括哈夫曼编码。In some embodiments of the disclosure, the entropy encoding comprises Huffman encoding.

在本披露的一些实施例中，还包括：存储模块，用于存储所述首次压缩数据、解码端参数和视频压缩数据。In some embodiments of the present disclosure, the method further includes: a storage module, configured to store the first compressed data, the decoding end parameter, and the video compressed data.

在本披露的一些实施例中，所述神经网络编解码模块用于从所述存储模块读取所述解码端参数，以对所述解码端参数进行编码压缩。In some embodiments of the present disclosure, the neural network codec module is configured to read the decoding end parameter from the storage module to encode and compress the decoding end parameter.

在本披露的一些实施例中，所述混合编解码模块用于从所述存储模块读取所述首次压缩数据，并从所述神经网络编解码模块读取编码后的解码端参数，以进行所述混合编码，并将所述视频压缩数据存储至所述存储模块。In some embodiments of the present disclosure, the hybrid codec module is configured to read the first compressed data from the storage module, and read the encoded decoding end parameters from the neural network codec module to perform The hybrid encoding and storing the video compressed data to the storage module.

在本披露的一些实施例中，所述深度自动编码器还包括：解码端；所述混合编解码模块还用于对视频压缩数据进行解码，得到首次解压数据和编码后的解码端参数；所述神经网络编解码模块还用于对编码后的解码端参数进行解码，得到解码端参数；所述解码端用于对首次解压数据解码，得到原始视频数据。In some embodiments of the disclosure, the depth auto-encoder further includes: a decoding end; the hybrid codec module is further configured to decode the video compressed data to obtain the first decompressed data and the encoded decoding end parameter; The neural network codec module is further configured to decode the encoded decoding end parameter to obtain a decoding end parameter; the decoding end is configured to decode the first decompressed data to obtain original video data.

在本披露的一些实施例中，所述存储模块还用于存储所述首次解压数据、编码后的解码端参数和原始视频数据。In some embodiments of the disclosure, the storage module is further configured to store the first decompressed data, the encoded decoding end parameter, and the original video data.

在本披露的一些实施例中，所述混合编解码模块还用于从所述存储模块读取所述视频压缩数据，以对所述视频压缩数据进行解码。In some embodiments of the present disclosure, the hybrid codec module is further configured to read the video compressed data from the storage module to decode the video compressed data.

在本披露的一些实施例中，所述神经网络编解码模块还用于从所述存储模块读取所述编码后的解码端参数，以对所述编码后的解码端参数进行解码。In some embodiments of the disclosure, the neural network codec module is further configured to read the encoded decoding end parameter from the storage module to decode the encoded decoding end parameter.

在本披露的一些实施例中，所述深度自动编码器模块还用于从所述存储模块读取所述首次解压数据，从所述神经网络编解码模块读取所述解码端的参数，以使所述解码端对所述首次解压数据解码。In some embodiments of the present disclosure, the depth autoencoder module is further configured to read the first decompressed data from the storage module, and read parameters of the decoding end from the neural network codec module, so that The decoding end decodes the first decompressed data.

在本披露的一些实施例中，所述解码端是与编码端结构对称的N层人工神经网络结构。In some embodiments of the present disclosure, the decoding end is an N-layer artificial neural network structure that is symmetric with the encoding end structure.

在本披露的一些实施例中，所述解码端的第n层为所述编码端的第 (N-n+1)层，所述解码端第n层与第n+1层之间的权重矩阵，为所述编码端第(N-n)层和第(N-n+1)层之间的权重矩阵的转置，其中，1≤n≤N。In some embodiments of the disclosure, the nth layer of the decoding end is the (N-n+1)th layer of the encoding end, and the weight matrix between the nth layer and the n+1th layer of the decoding end, Is a transposition of a weight matrix between the (Nn)th layer and the (N-n+1)th layer of the encoding end, where 1≤n≤N.

在本披露的一些实施例中，所述深度自动编码器模块还用于初始化所述深度自动编码器，并利用训练用视频对所述深度自动编码器进行训练，得到用于视频编码的深度自动编码器。In some embodiments of the present disclosure, the depth autoencoder module is further configured to initialize the depth auto-encoder and train the depth auto-encoder by using training video to obtain depth automatic for video coding. Encoder.

在本披露的一些实施例中，所述深度自动编码器模块还用于利用训练用视频对所述深度自动编码器进行训练包括：将所述深度自动编码器编码端的相邻两层作为一个限制玻尔兹曼机；初始化所述限制玻尔兹曼机；利用所述训练用视频数据对所述限制玻尔兹曼机进行训练；用反向传播算法精细调整所述深度自动编码器编码端的权值矩阵，以最小化对原始输入的重构误差。In some embodiments of the disclosure, the depth autoencoder module is further configured to perform training on the depth autocoder by using training video, including: using two adjacent layers of the depth autoencoder encoding end as a limitation a Boltzmann machine; initializing the limited Boltzmann machine; training the limited Boltzmann machine with the training video data; fine-tuning the depth autoencoder encoding end with a backpropagation algorithm A weight matrix to minimize the reconstruction error to the original input.

在本披露的一些实施例中，还包括控制器，其与所述深度自动编码器模块、神经网络编解码模块和混合编解码模块互联，用于控制上述模块。In some embodiments of the present disclosure, a controller is further included, interconnected with the depth autoencoder module, the neural network codec module, and the hybrid codec module for controlling the above modules.

本披露还提供了一种基于深度自动编码器的视频编码方法，利用权上述任一项视频编解码装置进行视频编码，包括：对原始视频进行首次压缩，得到首次压缩数据；对解码端参数进行编码压缩，得到编码后的解码端参数；对所述首次压缩数据和编码后的解码端参数进行混合编码，得到视频压缩数据。The present disclosure also provides a video encoding method based on a depth auto-encoder, which uses the video encoding and decoding apparatus of any of the above to perform video encoding, including: compressing the original video for the first time, and obtaining the first compressed data; Encoding compression, obtaining encoded decoding end parameters; performing hybrid encoding on the first compressed data and the encoded decoding end parameters to obtain video compressed data.

在本披露的一些实施例中，利用第一N层人工神经网络结构对所述原始视频进行首次压缩。In some embodiments of the present disclosure, the original video is first compressed using a first N-layer artificial neural network structure.

在本披露的一些实施例中，所述第一N层人工神经网络结构的第1层为输入层，第2至N层为隐含层，层间单元全连接，层内单元无连接，第N层隐含层的隐单元数小于输入层的输入单元数。In some embodiments of the disclosure, the first layer of the first N-layer artificial neural network structure is an input layer, the second to N layers are hidden layers, the inter-layer units are fully connected, and the intra-layer elements are not connected. The number of hidden cells of the N-layer hidden layer is smaller than the number of input cells of the input layer.

在本披露的一些实施例中，还包括：存储所述首次压缩数据、解码端参数和视频压缩数据。In some embodiments of the present disclosure, the method further includes: storing the first compressed data, the decoding end parameter, and the video compressed data.

在本披露的一些实施例中，读取所述解码端参数，以对所述解码端参数进行编码压缩。In some embodiments of the present disclosure, the decoding end parameters are read to encode and compress the decoding end parameters.

在本披露的一些实施例中，读取所述首次压缩数据和编码后的解码端参数，以进行所述混合编码，并存储所述视频压缩数据。In some embodiments of the present disclosure, the first compressed data and the encoded decoded end parameters are read to perform the hybrid encoding, and the video compressed data is stored.

在本披露的一些实施例中，还包括：对所述视频压缩数据进行解码，得到首次解压数据和编码后的解码端参数；对所述编码后的解码端参数进行解码，得到解码端参数；对所述首次解压数据解码，得到原始视频数据。In some embodiments of the present disclosure, the method further includes: decoding the video compressed data to obtain first decompressed data and the encoded decoding end parameter; and decoding the encoded decoding end parameter to obtain a decoding end parameter; Decoding the first decompressed data to obtain original video data.

在本披露的一些实施例中，还包括：存储所述首次解压数据、编码后的解码端参数和原始视频数据。In some embodiments of the present disclosure, the method further includes: storing the first decompressed data, the encoded decoding end parameters, and the original video data.

在本披露的一些实施例中，读取所述视频压缩数据，以对所述视频压缩数据进行解码。In some embodiments of the disclosure, the video compression data is read to decode the video compression data.

在本披露的一些实施例中，读取所述编码后的解码端参数，以对所述编码后的解码端参数进行解码。In some embodiments of the present disclosure, the encoded decoder parameters are read to decode the encoded decoder parameters.

在本披露的一些实施例中，读取所述首次解压数据和所述解码端的参数，以对所述首次解压数据解码。In some embodiments of the present disclosure, the first decompressed data and parameters of the decoding end are read to decode the first decompressed data.

在本披露的一些实施例中，利用第二N层人工神经网络结构对所述首次解压数据解码，所述第二N层人工神经网络结构与所述第一N层人工神经网络结构对称。In some embodiments of the present disclosure, the first decompressed data is decoded using a second N-layer artificial neural network structure that is symmetric with the first N-layer artificial neural network structure.

在本披露的一些实施例中，所述第二N层人工神经网络结构的第n层为所述第一N层人工神经网络结构的第(N-n+1)层，所述第二N层人工神经网络结构第n层与第n+1层之间的权重矩阵，为所述第一N层人工神经网络结构第(N-n)层和第(N-n+1)层之间的权重矩阵的转置，其中，1≤n≤N。In some embodiments of the present disclosure, the nth layer of the second N-layer artificial neural network structure is the (N-n+1)th layer of the first N-layer artificial neural network structure, and the second N The weight matrix between the nth layer and the n+1th layer of the layer artificial neural network structure is the weight between the (Nn)th layer and the (N-n+1)th layer of the first N layer artificial neural network structure The transposition of the matrix, where 1 ≤ n ≤ N.

在本披露的一些实施例中，在所述对原始视频进行首次压缩之前还包括：初始化深度自动编码器；利用训练用视频数据对所述深度自动编码器进行训练。In some embodiments of the present disclosure, before the first compression of the original video, the method further includes: initializing a depth auto-encoder; and training the deep auto-encoder with the training video data.

在本披露的一些实施例中，所述利用训练用视频数据对深度自动编码器进行训练包括：将深度自动编码器编码端的相邻两层作为一个限制玻尔兹曼机；初始化所述限制玻尔兹曼机；利用所述训练用视频数据对所述限制玻尔兹曼机进行训练；用反向传播方法调整所述深度自动编码器编码端的权值矩阵，最小化对原始输入的重构误差。In some embodiments of the present disclosure, the training the depth autoencoder with the training video data comprises: using two adjacent layers of the depth autoencoder encoding end as a limited Boltzmann machine; initializing the limiting glass Using the training video data to train the restricted Boltzmann machine; using a backpropagation method to adjust the weight matrix of the depth autoencoder encoding end, minimizing the reconstruction of the original input error.

在本披露的一些实施例中，还包括：利用控制器对上述步骤进行控制。In some embodiments of the present disclosure, the method further includes: controlling the foregoing steps by using a controller.

在本披露的一些实施例中，对所述N层人工神经网络结构中的各层依次执行正向运算；按照与所述正向运算相反的顺序，对所述N层人工神经网络结构中的各层依次执行反向运算；对所述N层人工神经网络结构中的各层进行权值更新；重复执行上述各个步骤多次，完成所述N层人工神经网络结构的训练。In some embodiments of the present disclosure, a forward operation is sequentially performed on each layer in the N-layer artificial neural network structure; in an inverse order of the forward operation, in the N-layer artificial neural network structure Each layer sequentially performs a reverse operation; performs weight update on each layer in the N-layer artificial neural network structure; and repeatedly performs the above steps multiple times to complete the training of the N-layer artificial neural network structure.

在本披露的一些实施例中，所述对所述N层人工神经网络结构中的各层依次执行反向运算包括：第一运算部分：由所述输出神经元梯度和所述输入神经元得到权值梯度；第二运算部分：使用所述输出神经元梯度和权值，计算出输入神经元梯度。In some embodiments of the disclosure, the performing the inverse operations sequentially on the layers in the N-layer artificial neural network structure includes: a first computing portion: obtained by the output neuron gradient and the input neurons Weight gradient; second operation portion: calculating the input neuron gradient using the output neuron gradient and weight.

在本披露的一些实施例中，所述对所述N层人工神经网络结构中的各层进行权值更新包括：利用所述权值梯度对所述权值进行更新，得到更新后的权值。In some embodiments of the disclosure, performing weight update on each layer in the N-layer artificial neural network structure includes: updating the weight by using the weight gradient to obtain an updated weight .

从上述技术方案可以看出，本披露的基于深度自动编码器的视频编解码装置及方法具有以下有益效果：It can be seen from the above technical solution that the video encoding and decoding apparatus and method based on the depth auto-encoder of the present disclosure have the following beneficial effects:

(1)利用人工神经网络度视频对视频进行两次编码压缩，提高了视频数据的压缩率；(1) Using the artificial neural network degree video to encode the video twice, which improves the compression ratio of the video data;

(2)由于人工神经网络具有非线性的特征，通过将人工神经网络的参数作为秘钥，实现了视频数据的压缩加密一体化；(2) Since the artificial neural network has nonlinear characteristics, the compression and encryption integration of video data is realized by using the parameters of the artificial neural network as the secret key;

(3)深度自动编码器对视频数据的编码结果包含了视频数据的特征，便于视频数据的分类与搜索，将机器学习引入视频编码领域，具有广阔的发展空间和应用前景；(3) The encoding result of video data by deep automatic encoder includes the characteristics of video data, which facilitates the classification and search of video data, and introduces machine learning into the field of video coding, which has broad development space and application prospects;

(4)无需人工设计复杂的编解码流程，利用深度自动编码器自动提取数据特征的功能，大大减少了人工干预，实现编码过程的自动化，实现简单，并且可拓展性良好，不仅可以用于视频数据压缩，还可用于其它数据压缩。(4) It is not necessary to manually design complex codec process, and the function of automatically extracting data features by using deep automatic encoder, greatly reducing manual intervention, realizing automation of coding process, simple implementation, and good expandability, not only for video Data compression can also be used for other data compression.

DRAWINGS

附图是用来提供对本公开的进一步理解，并且构成说明书的一部分，与下面的具体实施方式一起用于解释本公开，但并不构成对本公开的限制。在附图中：The drawings are intended to provide a further understanding of the disclosure, and are in the In the drawing:

图1是依据本披露实施例的视频编解码装置的结构示意图；1 is a schematic structural diagram of a video codec apparatus according to an embodiment of the present disclosure;

图2是本披露实施例的深度自动编码器的示意图；2 is a schematic diagram of a depth autoencoder of an embodiment of the present disclosure;

图3是本披露实施例的视频编解码方法的编码流程图；3 is a coding flowchart of a video encoding and decoding method according to an embodiment of the present disclosure;

图4是本披露实施例的视频编解码方法的深度自动编码器训练流程图；4 is a flowchart of a deep autoencoder training of a video encoding and decoding method according to an embodiment of the present disclosure;

图5是本披露实施例的视频编解码方法的解码流程图。FIG. 5 is a flowchart of decoding of a video encoding and decoding method according to an embodiment of the present disclosure.

【符号说明】【Symbol Description】

10-控制器；20-深度自动编码器模块；30-神经网络编解码模块；40-混合编解码模块；50-存储模块50。10-controller; 20-depth autoencoder module; 30-neural network codec module; 40-hybrid codec module; 50-storage module 50.

detailed description

为使本披露的目的、技术方案和优点更加清楚明白，以下结合具体实施例，并参照附图，对本披露进一步详细说明。The present disclosure will be further described in detail below with reference to the specific embodiments thereof and the accompanying drawings.

随着智能时代的到来，将人工智能的方法引入视频编解码领域，以寻求更大的突破应成为未来的发展趋势。本披露实施例提供了一种基于深度自动编码器的视频编解码装置，图1所示为该视频编解码装置的结构示意图，包括控制器10，深度自动编码器模块20，神经网络编解码模块30，混合编解码模块40，存储模块50；其中，With the advent of the intelligent era, the introduction of artificial intelligence methods into the field of video codec, in order to seek greater breakthroughs should become the future development trend. The embodiment of the present disclosure provides a video encoding and decoding device based on a depth auto-encoder, and FIG. 1 is a schematic structural diagram of the video encoding and decoding device, including a controller 10, a depth auto-encoder module 20, and a neural network codec module. 30, a hybrid codec module 40, a storage module 50; wherein

控制器10与深度自动编码器模块20、神经网络编解码模块30和混合编解码模块40互联。控制器10包括一本地的指令队列。控制器10用于将用户程序所编译而成的控制指令存储于指令队列之中，并将其译码为控制信号以控制各模块完成各自的功能，实现视频编码和解码。存储模块50也与深度自动编码器模块20、神经网络编解码模块30和混合编解码模块40互联，用于存储视频编解码过程中的各种数据和参数。Controller 10 is interconnected with depth autoencoder module 20, neural network codec module 30, and hybrid codec module 40. Controller 10 includes a local command queue. The controller 10 is configured to store the control instructions compiled by the user program in the instruction queue, and decode them into control signals to control each module to complete its respective functions, and implement video encoding and decoding. The storage module 50 is also interconnected with the depth autoencoder module 20, the neural network codec module 30, and the hybrid codec module 40 for storing various data and parameters in the video codec process.

深度自动编码器模块20包括深度自动编码器，深度自动编码器包括结构对称的编码端和解码端，所述编码端为N层人工神经网络结构，其中第1层为输入层，第2至N层为隐含层，层间单元全连接、层内单元无连接，第N层隐含层的隐单元数小于输入层的输入单元数，从而可以达到视频压缩的效果，其中N大于等于2。The depth autoencoder module 20 includes a depth autoencoder including a structurally symmetric encoding end and a decoding end, the encoding end being an N-layer artificial neural network structure, wherein the first layer is an input layer, and the second to N The layer is a hidden layer, the inter-layer unit is fully connected, and the intra-layer unit has no connection. The number of hidden units in the hidden layer of the N-th layer is smaller than the number of input units in the input layer, so that the effect of video compression can be achieved, wherein N is greater than or equal to 2.

解码端是与编码端结构对称的N层人工神经网络结构，具体来说，解码端的第1层(即输入层)为编码端的第N层隐含层，其第2层(即第1 层隐含层)为编码端的第N-1层隐含层，解码端的第1层与第2层之间的权重矩阵为编码端的第N-1层与第N层之间的权重矩阵的转置。The decoding end is an N-layer artificial neural network structure symmetric with the coding end structure. Specifically, the first layer (ie, the input layer) of the decoding end is the Nth layer hidden layer of the coding end, and the second layer (ie, the first layer is hidden) The layer containing) is the N-1 layer hidden layer of the encoding end, and the weight matrix between the first layer and the second layer of the decoding end is the transposition of the weight matrix between the N-1th layer and the Nth layer of the encoding end.

解码端的第3层(即第2层隐含层)为编码端的第N-2层隐含层，解码端的第2层与第3层之间的权重矩阵为编码端的第N-2层与第N-1层之间的权重矩阵的转置。The third layer of the decoding end (ie, the second layer hidden layer) is the N-2 layer hidden layer of the encoding end, and the weight matrix between the second layer and the third layer of the decoding end is the N-2 layer and the encoding end of the encoding end. Transpose of the weight matrix between the N-1 layers.

依次类推，解码端的第N层(即第N层隐含层)为编码端的第1层(即输入层)，解码端的第N-1层与第N层之间的权重矩阵为编码端的第1层与第2层之间的权重矩阵的转置。By analogy, the Nth layer of the decoding end (ie, the Nth layer hidden layer) is the first layer of the encoding end (ie, the input layer), and the weight matrix between the N-1th layer and the Nth layer of the decoding end is the first of the encoding end. The transpose of the weight matrix between the layer and the second layer.

即解码端的第n层为编码端的第N-n+1层，解码端相邻两层(第n层和第n+1层)之间的权重矩阵，为编码端对应相邻两层(第N-n层和第N-n+1层)之间的权重矩阵的转置。That is, the nth layer of the decoding end is the N-n+1 layer of the encoding end, and the weight matrix between the two adjacent layers (the nth layer and the n+1th layer) of the decoding end is the adjacent two layers of the coding end (the first layer) Transposition of the weight matrix between the Nn layer and the N-n+1th layer).

可对上述人工神经网络结构进行训练，训练的步骤是对一个(多层)人工神经网络中的各层依次执行正向运算，然后按照相反的层的顺序依次执行反向运算，最后用计算得到的权值的梯度去更新权值；这就是神经网络的训练的依次迭代，整个训练过程需要重复执行这个过程多次。具体来说，使用人工神经网络结构实现人工神经网络训练的方法包括以下内容：The artificial neural network structure can be trained. The training step is to perform a forward operation on each layer in a (multi-layer) artificial neural network, and then perform reverse operations in the order of the opposite layers, and finally calculate The gradient of the weights is used to update the weights; this is the sequential iteration of the training of the neural network, and the entire training process needs to be repeated several times. Specifically, the method for implementing artificial neural network training using an artificial neural network structure includes the following contents:

正向运算步骤：首先，对多层人工神经网络中的各层依次执行正向运算，得到各层的输出神经元。Forward operation steps: First, a forward operation is sequentially performed on each layer in the multi-layer artificial neural network to obtain output neurons of each layer.

反向运算步骤：然后，按照与正向运算相反的顺序，对多层人工神经网络中的各层依次执行反向运算，得到各层的权值梯度和输入神经元梯度。Reverse operation steps: Then, in the reverse order of the forward operation, the layers in the multi-layer artificial neural network are sequentially subjected to an inverse operation to obtain a weight gradient of each layer and an input neuron gradient.

该步骤包括第一运算部分和第二运算部分。第一运算部分用于计算权值梯度。对于人工神经网络的每一层，由该层的输出神经元梯度和输入神经元通过矩阵乘法或卷积得到该层的权值梯度。第二运算部分用于计算输入神经元梯度。对于人工神经网络的每一层，可以使用输出神经元梯度和权值，计算出输入神经元梯度。This step includes a first arithmetic part and a second arithmetic part. The first arithmetic part is used to calculate the weight gradient. For each layer of the artificial neural network, the gradient of the weight of the layer is obtained by matrix multiplication or convolution of the output neuron gradient of the layer and the input neurons. The second computational portion is used to calculate the input neuron gradient. For each layer of the artificial neural network, the input neuron gradient and weight can be used to calculate the input neuron gradient.

权值更新步骤：接着，对多层人工神经网络中的各层进行权值更新，得到更新后的权值。在这个步骤中，对于人工神经网络的每一层，利用权值梯度对权值进行更新，得到更新后的权值。Weight update step: Next, weight updates are performed on each layer in the multi-layer artificial neural network to obtain an updated weight. In this step, for each layer of the artificial neural network, the weight is updated with a weight gradient to obtain an updated weight.

重复执行正向运算步骤、反向运算步骤和权值更新步骤多次，完成多层人工神经网络的训练。The forward operation step, the reverse operation step, and the weight update step are repeatedly performed multiple times to complete the training of the multi-layer artificial neural network.

整个训练方法需要多次重复执行上述过程，直至人工神经网络的参数达到要求，训练过程完毕。The entire training method requires repeated execution of the above process until the parameters of the artificial neural network meet the requirements, and the training process is completed.

如图2所示，其示例性地给出了一种深度自动编码器的示意图，编码端和解码端均为五层人工神经网络结构，其中，深度自动编码器的第1层隐含层有2000个单元，第2层隐含层有1000个单元，第3层隐含层有500个单元，第4层隐含层有30个单元，输入层与第1层隐含层之间的权重矩阵为W1，第1层隐含层和第2层隐含层之间的权重矩阵为W2，第2层隐含层和第3层隐含层之间的权重矩阵为W3，第3层隐含层和第4层隐含层之间的权重矩阵为W4。对应地，解码端的输入层有30个单元，第1层隐含层有500个单元，第2层隐含层有1000个单元，第3层隐含层有2000个单元，输入层和第1层隐含层之间的权重矩阵为W ^T ₄，第1层隐含层和第2层隐含层之间的权重矩阵为W ^T ₃，第2层隐含层和第3层隐含层之间的权重矩阵为W ^T ₂，第3层隐含层和第4层隐含层之间的权重矩阵为W ^T ₁。 As shown in FIG. 2, a schematic diagram of a depth auto-encoder is exemplarily given. The encoding end and the decoding end are five-layer artificial neural network structures, wherein the first layer hidden layer of the deep auto-encoder has 2000 units, the second layer has 1000 cells in the hidden layer, the hidden layer in the third layer has 500 cells, the hidden layer in the fourth layer has 30 cells, and the weight between the input layer and the hidden layer in the first layer The matrix is W1, the weight matrix between the first layer hidden layer and the second layer hidden layer is W2, and the weight matrix between the second layer hidden layer and the third layer hidden layer is W3, and the third layer is hidden. The weight matrix between the containing layer and the layer 4 hidden layer is W4. Correspondingly, the input layer of the decoding end has 30 units, the first layer has 500 cells, the second layer has 1000 cells, and the third layer has 2000 cells, the input layer and the first layer. The weight matrix between the layer hidden layers is W ^T ₄ , and the weight matrix between the first layer hidden layer and the second layer hidden layer is W ^T ₃ , the second layer hidden layer and the third layer hidden layer The weight matrix between them is W ^T ₂ , and the weight matrix between the layer 3 hidden layer and the layer 4 hidden layer is W ^T ₁ .

深度自动编码器模块20利用深度自动编码器的编码端对原始视频进行首次压缩，原始视频数据输入编码端的输入层，经编码端各层压缩后由第N层隐含层输出，得到首次压缩数据，并存储于存储模块50，同时将解码端的参数存储于存储模块50，该参数包括解码端的层数N、各层的单元数目和各层之间的权值矩阵。The depth autoencoder module 20 uses the encoding end of the depth autoencoder to compress the original video for the first time. The original video data is input to the input layer of the encoding end, and is compressed by each layer of the encoding end and output by the hidden layer of the Nth layer to obtain the first compressed data. And stored in the storage module 50, and the parameters of the decoding end are stored in the storage module 50, the parameters include the number of layers N of the decoding end, the number of units of each layer, and the weight matrix between the layers.

神经网络编解码模块30从存储模块50读取解码端的参数，并对参数进行编码压缩，生成编码后的解码端参数。其中，可以采用常用的编码方式对参数进行编码。The neural network codec module 30 reads the parameters of the decoding end from the storage module 50, and encodes and compresses the parameters to generate the encoded decoding end parameters. Among them, the parameters can be encoded by a common coding method.

混合编解码模块40对首次压缩数据进行二次压缩，具体地，其从存储模块50中读取首次压缩数据，并从神经网络编解码模块30中读取编码后的解码端参数，并对首次压缩数据和编码后的解码端参数进行混合编码，得到视频压缩数据，并存储于存储模块50，完成视频压缩。其中，混合编码可以采用哈夫曼编码等熵编码方式。The hybrid codec module 40 performs secondary compression on the first compressed data. Specifically, it reads the first compressed data from the storage module 50, and reads the encoded decoding end parameters from the neural network codec module 30, and for the first time. The compressed data and the encoded decoding end parameters are mixed and encoded to obtain video compressed data, and stored in the storage module 50 to complete video compression. Among them, the hybrid coding can adopt the Huffman coding isentropic coding mode.

本披露的视频编解码装置，利用人工神经网络度视频对视频进行两次编码压缩，提高了视频数据的压缩率，而且由于人工神经网络具有非线性的特征，通过将人工神经网络的参数作为秘钥，实现了视频数据的压缩加密一体化。深度自动编码器对视频数据的编码结果包含了视频数据的特征，便于视频数据的分类与搜索，将机器学习引入视频编码领域，具有广阔的发展空间和应用前景。The video codec device of the present disclosure uses the artificial neural network degree video to compress and compress the video twice, thereby improving the compression ratio of the video data, and because the artificial neural network has nonlinear characteristics, the parameters of the artificial neural network are taken as secrets. The key realizes the integration of compression and encryption of video data. The encoding result of video data by deep automatic encoder includes the characteristics of video data, which facilitates the classification and search of video data, and introduces machine learning into the field of video coding, which has broad development space and application prospects.

进一步地，本实施例的视频编解码装置可以对视频压缩数据进行解码以重构原始视频数据。Further, the video codec device of this embodiment may decode the video compressed data to reconstruct the original video data.

混合编解码模块40对视频压缩数据进行首次解压，具体地，其从存储模块50读取视频压缩数据，并对视频压缩数据进行解码，得到首次解压数据和编码后的解码端参数，并存储于存储模块50。其中该解码采用与混合编码对应的解码方式，该首次解压数据即编码过程中的首次压缩数据。The hybrid codec module 40 decompresses the video compressed data for the first time. Specifically, it reads the video compressed data from the storage module 50, and decodes the video compressed data to obtain the first decompressed data and the encoded decoding end parameters, and stores the data in the decoded data. Storage module 50. The decoding adopts a decoding manner corresponding to the hybrid encoding, and the first decompressed data is the first compressed data in the encoding process.

神经网络编解码模块30从存储模块50读取编码后的解码端参数，并对编码后的解码端参数进行解码，得到解码端的参数。其中该解码采用与编码过程中解码端参数的编码方式对应的解码方式。The neural network codec module 30 reads the encoded decoding end parameters from the storage module 50, and decodes the encoded decoding end parameters to obtain parameters of the decoding end. The decoding adopts a decoding manner corresponding to the encoding mode of the decoding end parameter in the encoding process.

深度自动编码器模块20利用解码端对首次解压数据进行二次解压，具体地，深度自动编码器模块20从存储模块50读取首次解压数据，从神经网络编解码模块30读取解码端的参数，首次解压数据输入解码端的输入层，经解码端各层解压后由第N层隐含层输出，得到原始视频数据，并存储于存储模块50。The depth autoencoder module 20 uses the decoding end to perform secondary decompression on the first decompressed data. Specifically, the deep autoencoder module 20 reads the first decompressed data from the storage module 50, and reads the parameters of the decoding end from the neural network codec module 30. The input layer of the data input decoding end is decompressed for the first time, and is decompressed by each layer of the decoding end and output by the hidden layer of the Nth layer to obtain original video data, and is stored in the storage module 50.

由此可见，本披露的视频编解码装置，无需人工设计复杂的编解码流程，利用深度自动编码器自动提取数据特征的功能，大大减少了人工干预，实现编码过程的自动化，实现简单，并且可拓展性良好，不仅可以用于视频数据压缩，还可用于其它数据压缩。It can be seen that the video codec device of the present disclosure does not need to manually design a complicated codec process, and automatically extracts data features by using a deep automatic encoder, thereby greatly reducing manual intervention, realizing automation of the encoding process, and realizing simplicity and Good scalability, not only for video data compression, but also for other data compression.

进一步地，本披露的视频编解码装置，深度自动编码器通过训练的方式生成。深度自动编码器模块20首先初始化一深度自动编码器，然后利用训练用视频对深度自动编码器的编码端进行训练，得到用于视频编码的深度自动编码器编码端。具体包括，Further, in the video codec device of the present disclosure, the depth auto-encoder is generated by training. The depth autoencoder module 20 first initializes a depth autoencoder, and then trains the encoding end of the depth autoencoder with the training video to obtain a depth autoencoder encoding end for video encoding. Specifically,

首先，将深度自动编码器编码端的相邻两层作为一个限制玻尔兹曼机，将相邻两层的上一层作为可见层，下一层作为隐含层，对限制玻尔兹曼机进行训练。First, the adjacent two layers of the deep autoencoder encoding end are used as a limited Boltzmann machine, the upper layer of the adjacent two layers is used as the visible layer, and the next layer is used as the hidden layer to limit the Boltzmann machine. Train.

限制玻尔兹曼机采用二值单元，其能量函数为：Limiting the Boltzmann machine to use a binary unit whose energy function is:

式中，v _i为第i个可见单元，h _j为第j个隐单元，a _i为第i个可见单元v _i的偏置，b _j为第j个隐单元h _j的偏置，w _j，i为连接第j个隐单元和第i个可见单元的权值，n _v和n _h分别是可见单元和隐单元的数目。 Where v _i is the i-th visible unit, h _j is the j-th hidden unit, a _i is the offset of the i-th visible unit v _i , b _j is the offset of the j-th hidden unit h _j , w _{j, i} is the weight connecting the jth hidden unit and the ith visible unit, and n _v and n _h are the number of visible and hidden units, respectively.

然后：初始化限制玻尔兹曼机。包括：将训练用视频作为训练样本集合S(|S|＝n _s)，设定训练周期J、学习率η、CD-K算法参数k；指定可见层和隐藏层单元数n _v和n _h；设定偏置向量a，b和权值矩阵w。 Then: Initialize the limit Boltzmann machine. Including: training video as training sample set S(|S|=n _s ), setting training period J, learning rate η, CD-K algorithm parameter k; specifying visible layer and hidden layer unit numbers n _v and n _h ; set the offset vectors a, b and the weight matrix w.

其中，第i个可见单元v _i的偏置a _i为偏置向量a的第i项，第j个隐单元h _j的偏置b _j为偏置向量b的第j项，w _j，i为权值矩阵W中第j行第i列的元素，n _s为训练样本集合的单元数。 Wherein, the offset a _i of the i-th visible cell v _i is the i-th term of the offset vector a, and the offset b _j of the j-th hidden cell h _j is the j-th term of the offset vector b, w _j,i Is the element of the i-th column of the jth row in the weight matrix W, and n _s is the number of cells of the training sample set.

接着，对限制玻尔兹曼机进行训练。包括：Next, the Boltzmann machine is trained. include:

首先，使用CD-K算法得到ΔW，Δa和Δb；First, ΔW, Δa and Δb are obtained using the CD-K algorithm;

然后，使用ΔW，Δa和Δb更新限制玻尔兹曼机的参数：Then, use ΔW, Δa and Δb to update the parameters that limit the Boltzmann machine:

循环上述两个步骤J次，得到训练好的限制玻尔兹曼机，作为深度自动编码器。The above two steps are cycled J times to obtain a trained limited Boltzmann machine as a depth auto-encoder.

其中，使用CD-K算法得到ΔW，Δa和Δb的步骤如下：Among them, the steps of obtaining ΔW, Δa and Δb using the CD-K algorithm are as follows:

初始化：ΔW＝0，Δa＝0，Δb＝0；Initialization: ΔW=0, Δa=0, Δb=0;

对训练样本集合S中的每一个样本v进行如下循环：The following loop is performed for each sample v in the training sample set S:

(1)初始化v ₀＝v (1) Initialization v ₀ = v

(2)进行k次采样，在每次采样中，先从可见单元组v _t采样出隐单元组h _t，再从隐单元组h _t采样出可见单元组v _t+1，其中t为整数且0≤t≤k-1。 (2) for k samples in each sample, the visible start sampling the cell group V _t H _t group of hidden units, and from the group of hidden units H _t sampling a visible cell group v _{t + 1,} wherein t is an integer And 0 ≤ t ≤ k-1.

(3)对于每一个i和j(i和j均为整数，1≤i≤n _h，1≤j≤n _v)进行如下计算： (3) For each of i and j (i and j are integers, 1 ≤ i ≤ n _h , 1 ≤ j ≤ n _v ), the following calculation is performed:

Δb _i＝Δb _i+[P(h _i＝1|v ₀)-p(h _i＝1|v _k)] Δb _i =Δb _i +[P(h _i =1|v ₀ )-p(h _i =1|v _k )]

其中，

和

分别为序号为0的可见单元组中的第j个可见单元和序号为k的可见单元组中第j个可见单元。 among them,

with

The jth visible unit in the visible unit group with the sequence number 0 and the jth visible unit in the visible unit group with the sequence number k, respectively.

最后，用反向传播算法精细调整深度自动编码器编码端的权值矩阵，以最小化对原始输入的重构误差。例如，在精细调整深度自动编码器编码端的权值矩阵时，不再将编码端的输入输出单元和隐单元看成限制波尔兹曼机的单元，而是直接使用各单元的实数输出值。由于编码端已经经过训练，可以用反向传播算法来调整权值矩阵以最小化编码端输出的重构误差。Finally, the back propagation algorithm is used to fine tune the weight matrix of the depth autoencoder's encoding end to minimize the reconstruction error to the original input. For example, when the weight matrix of the encoding end of the depth autoencoder is finely adjusted, the input/output unit and the hidden unit of the encoding end are no longer regarded as the unit of the Boltzmann machine, but the real output value of each unit is directly used. Since the encoder has been trained, a backpropagation algorithm can be used to adjust the weight matrix to minimize the reconstruction error of the encoder output.

本披露另一实施例提供了一种基于深度自动编码器的视频编解码方法，参见图3，包括：Another embodiment of the present disclosure provides a video encoding and decoding method based on a depth auto-encoder. Referring to FIG. 3, the method includes:

步骤S101，控制器10向深度自动编码器模块20发送编码指令，深度自动编码器的编码端对原始视频进行首次压缩。In step S101, the controller 10 sends an encoding instruction to the depth autoencoder module 20, and the encoding end of the deep autoencoder first compresses the original video.

步骤S102，控制器10向深度自动编码器模块20发送IO指令，首次压缩数据和解码端的参数存储于存储模块50。In step S102, the controller 10 sends an IO command to the depth autoencoder module 20, and the parameters of the first compressed data and the decoding end are stored in the storage module 50.

步骤S103，控制器10向神经网络编解码模块30发送IO指令，神经网络编解码模块30从存储模块50读取解码端的参数。In step S103, the controller 10 sends an IO command to the neural network codec module 30, and the neural network codec module 30 reads the parameters of the decoding end from the storage module 50.

步骤S104，控制器10向神经网络编解码模块30发送编码指令，神经网络编解码模块30对参数进行编码压缩。In step S104, the controller 10 sends an encoding instruction to the neural network codec module 30, and the neural network codec module 30 encodes and compresses the parameters.

步骤S105，控制器10向混合编解码模块40发送IO指令，混合编解码模块40从存储模块50中读取首次压缩数据，并从神经网络编解码模块30中读取编码后的解码端参数。In step S105, the controller 10 sends an IO command to the hybrid codec module 40, and the hybrid codec module 40 reads the first compressed data from the storage module 50, and reads the encoded decoded terminal parameters from the neural network codec module 30.

步骤S106，控制器10向混合编解码模块40发送编码指令，混合编解码模块40对首次压缩数据和编码后的解码端参数进行混合编码，得到视频压缩数据。In step S106, the controller 10 sends an encoding instruction to the hybrid codec module 40, and the hybrid codec module 40 performs hybrid encoding on the first compressed data and the encoded decoding end parameters to obtain video compressed data.

步骤S107，控制器10向混合编解码模块40发送IO指令，混合编解码模块40将视频压缩数据存储于存储模块50。In step S107, the controller 10 sends an IO command to the hybrid codec module 40, and the hybrid codec module 40 stores the video compressed data in the storage module 50.

其中，参见图4，在步骤S101之前还可以包括：Wherein, referring to FIG. 4, before step S101, the method may further include:

从存储模块50读取训练用视频数据；Reading training video data from the storage module 50;

利用训练用视频数据对深度自动编码器进行训练。The depth autoencoder is trained using training video data.

参见图5，该视频编解码方法还包括：Referring to FIG. 5, the video encoding and decoding method further includes:

步骤S201，控制器10向混合编解码模块40发送IO指令，混合编解码模块40从存储模块50读取视频压缩数据。In step S201, the controller 10 sends an IO command to the hybrid codec module 40, and the hybrid codec module 40 reads the video compressed data from the storage module 50.

步骤S202，控制器10向混合编解码模块40发送解码指令，混合编解码模块40对视频压缩数据进行解码，得到首次解压数据和编码后的解码端参数。In step S202, the controller 10 sends a decoding instruction to the hybrid codec module 40, and the hybrid codec module 40 decodes the video compressed data to obtain the first decompressed data and the encoded decoding end parameters.

步骤S203，控制器10向混合编解码模块40发送IO指令，混合编解码模块40将首次解压数据和编码后的解码端参数存储于存储模块50。In step S203, the controller 10 sends an IO command to the hybrid codec module 40, and the hybrid codec module 40 stores the first decompressed data and the encoded decoder parameters in the storage module 50.

步骤S204，控制器10向神经网络编解码模块30发送IO指令，神经网络编解码模块30从存储模块50读取编码后的解码端参数。In step S204, the controller 10 sends an IO command to the neural network codec module 30, and the neural network codec module 30 reads the encoded decoder parameters from the storage module 50.

步骤S205，控制器10向神经网络编解码模块30发送解码指令，神经网络编解码模块30对编码后的解码端参数进行解码，得到解码端的参数。In step S205, the controller 10 sends a decoding instruction to the neural network codec module 30, and the neural network codec module 30 decodes the encoded decoding end parameters to obtain parameters of the decoding end.

步骤S206，控制器10向深度自动编码器模块20发送IO指令，深度自动编码器模块20从存储模块50读取首次解压数据，从神经网络编解码模块30读取解码端的参数。In step S206, the controller 10 sends an IO command to the depth autoencoder module 20, and the deep autoencoder module 20 reads the first decompressed data from the storage module 50, and reads the parameters of the decoding end from the neural network codec module 30.

步骤S207，控制器10向深度自动编码器模块20发送解码指令，深度自动编码器模块20对首次解压数据进行二次解压，得到原始视频数据。In step S207, the controller 10 sends a decoding instruction to the depth autoencoder module 20, and the depth autoencoder module 20 performs second decompression on the first decompressed data to obtain original video data.

步骤S208，控制器10向深度自动编码器模块20发送IO指令，深度自动编码器模块20将原始视频数据存储于存储模块50。In step S208, the controller 10 sends an IO command to the depth autoencoder module 20, and the depth autoencoder module 20 stores the original video data in the storage module 50.

在一个实施例里，本披露公开了一个芯片，其包括了上述视频编解码装置。In one embodiment, the present disclosure discloses a chip that includes the video codec described above.

在一个实施例里，本披露公开了一个芯片封装结构，其包括了上述芯片。In one embodiment, the present disclosure discloses a chip package structure that includes the chip described above.

在一个实施例里，本披露公开了一个板卡，其包括了上述芯片封装结构。In one embodiment, the present disclosure discloses a board that includes the chip package structure described above.

在一个实施例里，本披露公开了一个电子装置，其包括了上述板卡。In one embodiment, the present disclosure discloses an electronic device that includes the above-described card.

电子装置包括数据处理装置、机器人、电脑、打印机、扫描仪、平板电脑、智能终端、手机、行车记录仪、导航仪、传感器、摄像头、云端服务器、相机、摄像机、投影仪、手表、耳机、移动存储、可穿戴设备交通工具、家用电器、和/或医疗设备。Electronic devices include data processing devices, robots, computers, printers, scanners, tablets, smart terminals, mobile phones, driving recorders, navigators, sensors, cameras, cloud servers, cameras, cameras, projectors, watches, headphones, mobile Storage, wearable device vehicles, household appliances, and/or medical devices.

所述交通工具包括飞机、轮船和/或车辆；所述家用电器包括电视、空调、微波炉、冰箱、电饭煲、加湿器、洗衣机、电灯、燃气灶、油烟机；所述医疗设备包括核磁共振仪、B超仪和/或心电图仪。The vehicle includes an airplane, a ship, and/or a vehicle; the household appliance includes a television, an air conditioner, a microwave oven, a refrigerator, a rice cooker, a humidifier, a washing machine, an electric lamp, a gas stove, a range hood; the medical device includes a nuclear magnetic resonance instrument, B-ultrasound and / or electrocardiograph.

另外，在本披露各个实施例中的各功能单元可以集成在一个处理单元中，也可以是各个单元单独物理存在，也可以两个或两个以上单元集成在一个单元中。上述集成的单元既可以采用硬件的形式实现，也可以采用软件程序模块的形式实现。In addition, each functional unit in each embodiment of the present disclosure may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit. The above integrated unit can be implemented in the form of hardware or in the form of a software program module.

所述集成的单元如果以软件程序模块的形式实现并作为独立的产品销售或使用时，可以存储在一个计算机可读取存储器中。基于这样的理解，本披露的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的全部或部分可以以软件产品的形式体现出来，该计算机软件产品存储在一个存储器中，包括若干指令用以使得一台计算机设备(可为个人计算机、服务器或者网络设备等)执行本披露各个实施例所述方法的全部或部分步骤。而前述的存储器包括：U盘、只读存储器(ROM，Read-Only Memory)、随机存取存储器(RAM，Random Access Memory)、移动硬盘、磁碟或者光盘等各种可以存储程序代码的介质。The integrated unit, if implemented in the form of a software program module and sold or used as a standalone product, may be stored in a computer readable memory. Based on such understanding, the technical solution of the present disclosure may be embodied in the form of a software product in the form of a software product in essence or in the form of a contribution to the prior art, and the computer software product is stored in a memory. A number of instructions are included to cause a computer device (which may be a personal computer, server or network device, etc.) to perform all or part of the steps of the methods described in the various embodiments of the present disclosure. The foregoing memory includes: a U disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, or an optical disk, and the like, which can store program codes.

各功能单元/模块都可以是硬件，比如该硬件可以是电路，包括数字电路，模拟电路等等。硬件结构的物理实现包括但不局限于物理器件，物理器件包括但不局限于晶体管，忆阻器等等。所述计算装置中的计算模块可以是任何适当的硬件处理器，比如CPU、GPU、FPGA、DSP和ASIC等等。所述存储单元可以是任何适当的磁存储介质或者磁光存储介质，比如RRAM，DRAM，SRAM，EDRAM，HBM，HMC等等。Each functional unit/module may be hardware, such as the hardware may be a circuit, including digital circuits, analog circuits, and the like. Physical implementations of hardware structures include, but are not limited to, physical devices including, but not limited to, transistors, memristors, and the like. The computing modules in the computing device can be any suitable hardware processor, such as a CPU, GPU, FPGA, DSP, ASIC, and the like. The storage unit may be any suitable magnetic storage medium or magneto-optical storage medium such as RRAM, DRAM, SRAM, EDRAM, HBM, HMC, and the like.

前面的附图中所描绘的进程或方法可通过包括硬件(例如，电路、专用逻辑等)、固件、软件(例如，被具体化在非瞬态计算机可读介质上的软件)，或两者的组合的处理逻辑来执行。虽然上文按照某些顺序操作描述了进程或方法，但是，应该理解，所描述的某些操作能以不同顺序来执行。此外，可并行地而非顺序地执行一些操作。The processes or methods depicted in the preceding figures may include hardware (eg, circuitry, dedicated logic, etc.), firmware, software (eg, software embodied on a non-transitory computer readable medium), or both The combined processing logic is executed. Although the processes or methods have been described above in some order, it should be understood that certain operations described can be performed in a different order. Moreover, some operations may be performed in parallel rather than sequentially.

需要说明的是，在附图或说明书正文中，未绘示或描述的实现方式，均为所属技术领域中普通技术人员所知的形式，并未进行详细说明。此外，上述对各元件的定义并不仅限于实施例中提到的各种具体结构、形状，本领域普通技术人员可对其进行简单地更改或替换；本文可提供包含特定值的参数的示范，但这些参数无需确切等于相应的值，而是可在可接受的误差容限或设计约束内近似于相应值；实施例中提到的方向用语，例如“上”、“下”、“前”、“后”、“左”、“右”等，仅是参考附图的方向，并非用来限制本披露的保护范围；上述实施例可基于设计及可靠度的考虑，彼此混合搭配使用或与其他实施例混合搭配使用，即不同实施例中的技术特征可以自由组合形成更多的实施例。It should be noted that the implementations that are not shown or described in the drawings or the text of the specification are all known to those of ordinary skill in the art and are not described in detail. In addition, the above definitions of the various elements are not limited to the specific structures and shapes mentioned in the embodiments, and those skilled in the art may simply change or replace them; an example of parameters including specific values may be provided herein. However, these parameters need not be exactly equal to the corresponding values, but may approximate the corresponding values within acceptable error tolerances or design constraints; the directional terms mentioned in the examples, such as "upper", "lower", "front" , "post", "left", "right", etc., are merely referring to the directions of the drawings, and are not intended to limit the scope of protection of the present disclosure; the above embodiments may be used in combination or in combination with each other based on design and reliability considerations. Other embodiments are used in combination, that is, the technical features in different embodiments can be freely combined to form more embodiments.

以上所述的具体实施例，对本披露的目的、技术方案和有益效果进行了进一步详细说明，所应理解的是，以上所述仅为本披露的具体实施例而已，并不用于限制本披露，凡在本披露的精神和原则之内，所做的任何修改、等同替换、改进等，均应包含在本披露的保护范围之内。The specific embodiments of the present invention have been described in detail with reference to the specific embodiments of the present disclosure. It is understood that the above description is only the specific embodiment of the disclosure, and is not intended to limit the disclosure. Any modifications, equivalent substitutions, improvements, etc. made within the spirit and principles of this disclosure are intended to be included within the scope of the disclosure.

Claims

A video encoding and decoding device based on a depth auto-encoder, comprising:

a depth auto-encoder module, including a depth auto-encoder, the depth auto-encoder includes an encoding end, and the encoding end is used for first compressing the original video to obtain first-time compressed data;

a neural network codec module, configured to encode and compress parameters of the decoding end, and generate encoded decoding end parameters;

The hybrid codec module is configured to perform hybrid coding on the first compressed data and the encoded decoding end parameters to obtain video compressed data.

The video codec device of claim 1 wherein:

The coding end is an N-layer artificial neural network structure.

The video encoding and decoding apparatus according to claim 2, wherein the first layer of the N-layer artificial neural network structure is an input layer, and the second to N layers are hidden layers, and the inter-layer units are fully connected, and the layers are The unit has no connection, and the number of hidden units of the hidden layer of the Nth layer is smaller than the number of input units of the input layer.

The video codec device according to any one of claims 1 to 3, wherein the hybrid coding comprises entropy coding.

The video encoding and decoding apparatus according to claim 4, wherein said entropy encoding comprises Huffman encoding.

The video codec device according to any one of claims 1 to 5, further comprising:

And a storage module, configured to store the first compressed data, the decoding end parameter, and the video compressed data.

The video codec device of claim 6 wherein:

The neural network codec module is configured to read the decoding end parameter from the storage module to encode and compress the decoding end parameter.

A video encoding and decoding apparatus according to claim 6 or 7, wherein:

The hybrid codec module is configured to read the first compressed data from the storage module, and read the encoded decoding end parameters from the neural network codec module to perform the hybrid encoding, and Video compression data is stored to the storage module.

The video encoding and decoding apparatus according to any one of claims 2 to 8, wherein the depth automatic encoder further comprises: a decoding end;

The hybrid codec module is further configured to decode video compressed data to obtain first decompressed data and encoded decoding end parameters;

The neural network codec module is further configured to decode the encoded decoding end parameter to obtain a decoding end parameter;

The decoding end is configured to decode the first decompressed data to obtain original video data.

The video encoding and decoding apparatus according to claim 9, wherein the storage module is further configured to store the first decompressed data, the encoded decoding end parameter, and the original video data.

A video encoding and decoding apparatus according to claim 10, wherein

The hybrid codec module is further configured to read the video compressed data from the storage module to decode the video compressed data.

The video encoding and decoding apparatus according to claim 10 or 11, wherein the neural network codec module is further configured to read the encoded decoding end parameter from the storage module, after the encoding The decoding end parameters are decoded.

The video encoding and decoding apparatus according to any one of claims 10 to 12, wherein the depth autoencoder module is further configured to read the first decompressed data from the storage module, from the neural network. The decoding module reads the parameters of the decoding end to cause the decoding end to decode the first decompressed data.

The video encoding and decoding apparatus according to any one of claims 9 to 13, wherein the decoding end is an N-layer artificial neural network structure symmetric with the coding end structure.

The video encoding and decoding apparatus according to claim 14, wherein the nth layer of the decoding end is the (N-n+1)th layer of the encoding end, and the nth layer and the n+1th of the decoding end are The weight matrix between the layers is a transposition of the weight matrix between the (Nn)th layer and the (N-n+1)th layer of the encoding end, where 1≤n≤N.

A video encoding and decoding apparatus according to any one of claims 1 to 15, wherein

The depth autoencoder module is further configured to initialize the depth auto-encoder, and train the depth auto-encoder by using training video to obtain a depth auto-encoder for video coding.

The video encoding and decoding apparatus according to claim 16, wherein the depth autoencoder module is further configured to train the depth autoencoder by using training video, including:

The adjacent two layers of the code end of the depth autoencoder are used as a limited Boltzmann machine;

Initializing the restricted Boltzmann machine;

Training the restricted Boltzmann machine with the training video data;

The weight matrix of the codec of the depth autoencoder is finely adjusted by a back propagation algorithm to minimize the reconstruction error to the original input.

The video codec device according to any one of claims 1 to 17, further comprising:

And a controller interconnected with the depth autoencoder module, the neural network codec module, and the hybrid codec module for controlling the module.

A video encoding method based on a depth auto-encoder, wherein the video encoding and decoding apparatus according to any one of claims 1 to 18 is used for video encoding, comprising:

The first compression of the original video, the first compressed data;

Encoding and compressing the parameters of the decoding end to obtain the encoded decoding end parameters;

The first compressed data and the encoded decoding end parameters are mixed and encoded to obtain video compressed data.

The video encoding and decoding method according to claim 19, wherein the original video is first compressed using a first N-layer artificial neural network structure.

The video encoding and decoding method according to claim 20, wherein the first layer of the first N-layer artificial neural network structure is an input layer, the second to N layers are hidden layers, and the inter-layer units are fully connected. There is no connection in the intra-layer unit, and the number of hidden units in the hidden layer of the N-th layer is smaller than the number of input units in the input layer.

The video encoding and decoding method according to any one of claims 19 to 21, wherein the hybrid encoding comprises entropy encoding.

The video encoding and decoding method according to claim 22, wherein said entropy encoding comprises Huffman encoding.

The video encoding and decoding method according to any one of claims 19 to 23, further comprising:

The first compressed data, the decoding end parameters, and the video compressed data are stored.

A video encoding and decoding method according to claim 24, wherein

The decoding end parameter is read to encode and compress the decoding end parameter.

A video encoding and decoding method according to claim 24 or 25, wherein

The first compressed data and the encoded decoding end parameters are read to perform the hybrid encoding, and the video compressed data is stored.

The video encoding and decoding method according to any one of claims 20 to 26, further comprising:

Decoding the video compressed data to obtain first decompressed data and encoded decoding end parameters;

Decoding the encoded decoding end parameter to obtain a decoding end parameter;

Decoding the first decompressed data to obtain original video data.

The video encoding and decoding method according to claim 27, further comprising: storing the first decompressed data, the encoded decoding end parameters, and the original video data.

A video encoding and decoding method according to claim 28, wherein

The video compressed data is read to decode the video compressed data.

The video encoding and decoding method according to claim 28 or 29, wherein the encoded decoding end parameter is read to decode the encoded decoding end parameter.

The video encoding and decoding method according to any one of claims 28 to 30, wherein the first decompressed data and the parameters of the decoding end are read to decode the first decompressed data.

The video encoding and decoding method according to any one of claims 27 to 31, wherein the first decompressed data is decoded by using a second N-layer artificial neural network structure, and the second N-layer artificial neural network structure and The structure of the first N-layer artificial neural network is symmetrical.

The video encoding and decoding method according to claim 32, wherein the nth layer of the second N-layer artificial neural network structure is the (N-n+1) of the first N-layer artificial neural network structure a weight matrix between the nth layer and the n+1th layer of the second N layer artificial neural network structure, which is the (Nn)th layer and the (N-n+) of the first N layer artificial neural network structure 1) Transposition of a weight matrix between layers, where 1 ≤ n ≤ N.

The video encoding method according to any one of claims 19 to 33, further comprising: before the first compression of the original video:

Initialize the depth autoencoder;

The depth autoencoder is trained using training video data.

The video encoding method according to claim 34, wherein said training the depth auto-encoder by using video data for training comprises:

The adjacent two layers of the coding end of the depth autoencoder are used as a limited Boltzmann machine;

Initializing the restricted Boltzmann machine;

Training the restricted Boltzmann machine with the training video data;

The weighting matrix of the encoding end of the depth autoencoder is adjusted by a backpropagation method to minimize the reconstruction error of the original input.

The video encoding and decoding method according to any one of claims 19 to 35, further comprising: controlling the step by using a controller.

A video encoding and decoding method according to any one of claims 19 to 36, characterized in that

Performing a forward operation on each layer in the N-layer artificial neural network structure in sequence;

Performing inverse operations on the layers in the N-layer artificial neural network structure in the reverse order of the forward operation;

Performing weight update on each layer in the N-layer artificial neural network structure;

The above steps are repeatedly performed multiple times to complete the training of the N-layer artificial neural network structure.

The video encoding and decoding method according to claim 37, wherein the performing the inverse operations sequentially on the layers in the N-layer artificial neural network structure comprises:

a first operation portion: obtaining a weight gradient from the output neuron gradient and the input neuron;

A second operation portion: calculating an input neuron gradient using the output neuron gradient and weight.

The video encoding and decoding method according to claim 38, wherein the performing weight update on each layer in the N-layer artificial neural network structure comprises: updating the weight by using the weight gradient , get the updated weight.