CN110874553A - Recognition model training method and device - Google Patents
- Publication number
- CN110874553A (application CN201811019880.1A)
- Authority
- CN
- China
- Prior art keywords
- probability
- sequence
- preset
- target sequence
- backward
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2218/00—Aspects of pattern recognition specially adapted for signal processing
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2218/00—Aspects of pattern recognition specially adapted for signal processing
- G06F2218/12—Classification; Matching
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Artificial Intelligence (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Life Sciences & Earth Sciences (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Signal Processing (AREA)
- Image Analysis (AREA)
Abstract
The embodiments of the present application provide a recognition model training method and device. The method includes: acquiring a sequence sample; inputting the sequence sample into a recognition model to obtain a first probability of a preset forward target sequence and a second probability of a preset backward target sequence; arranging the preset forward target sequence and the preset backward target sequence so that, at each position, the target from the preset backward target sequence comes first and the target from the preset forward target sequence comes second, to obtain a forward-backward target sequence, and calculating a third probability of the forward-backward target sequence; calculating an objective function from the first probability, the second probability, and the third probability; and training the recognition model according to the objective function using a preset training algorithm. This solution enables the recognition model to perform real-time recognition.
Description
Technical Field
The present application relates to the technical field of machine learning, and in particular to a recognition model training method and device.
Background
With the development of artificial intelligence, machine learning, as its core technology, has been widely applied to object detection and tracking, behavior detection and recognition, speech recognition, and similar tasks. The deep neural network (DNN), an emerging area of machine learning research, interprets data by imitating the mechanisms of the human brain; it is an intelligent model that analyzes and learns by building a simulation of the human brain.
In a traditional DNN such as a convolutional neural network (CNN), the network model establishes a mapping between input data and output results: input data is fed into the network model to obtain an output, and the outputs obtained for inputs at different moments are independent of one another. In some application scenarios, however, such as speech recognition and video target tracking, the data at each moment is strongly correlated with the data at other moments. A recurrent neural network (RNN) is a DNN that performs recurrent sequence operations: its computation on each input depends on the results of its computations on the inputs at other moments.
When a recognition model built on an RNN is trained, forward calculation is mostly used, in which the results computed at past moments are fed into the computation at the current moment. A model trained this way tends to use as much future information as possible, so the result for each moment is often delayed, and the recognition model cannot meet the requirements of real-time recognition.
Summary of the Invention
The purpose of the embodiments of the present application is to provide a recognition model training method and device that enable the recognition model to perform real-time recognition. The specific technical solutions are as follows:
In a first aspect, an embodiment of the present application provides a recognition model training method, the method comprising:
acquiring a sequence sample;
inputting the sequence sample into a recognition model to obtain a first probability of a preset forward target sequence and a second probability of a preset backward target sequence;
arranging the preset forward target sequence and the preset backward target sequence so that, at each position, the target from the preset backward target sequence comes first and the target from the preset forward target sequence comes second, to obtain a forward-backward target sequence, and calculating a third probability of the forward-backward target sequence;
calculating an objective function according to the first probability, the second probability, and the third probability;
training the recognition model according to the objective function using a preset training algorithm.
Optionally, the recognition model includes a recurrent neural network and a connectionist temporal classification (CTC) algorithm;
the inputting the sequence sample into the recognition model to obtain the first probability of the preset forward target sequence and the second probability of the preset backward target sequence includes:
inputting the sequence sample into the recurrent neural network, obtaining, through the forward calculation of the recurrent neural network, a first probability sequence composed of the output probabilities of the features in the sequence sample, and calculating the first probability of the preset forward target sequence from the first probability sequence using the CTC algorithm;
obtaining, through the backward calculation of the recurrent neural network, a second probability sequence composed of the output probabilities of the features in the sequence sample, and calculating the second probability of the preset backward target sequence from the second probability sequence using the CTC algorithm.
Optionally, the calculating the third probability of the forward-backward target sequence includes:
calculating, from the first probability sequence and the second probability sequence, the mean of each output probability in the first probability sequence and the output probability at the same moment in the second probability sequence, to obtain a third probability sequence;
calculating the third probability of the forward-backward target sequence from the third probability sequence using the CTC algorithm.
Optionally, the calculating the objective function according to the first probability, the second probability, and the third probability includes:
calculating the objective function from the first probability, the second probability, and the third probability using an objective function formula, where the formula is:
$$g = -\log(P_f) - \log(P_b) - \log(P_{fb})$$
where $g$ is the objective function, $P_f$ is the first probability, $P_b$ is the second probability, and $P_{fb}$ is the third probability.
Optionally, the preset training algorithm includes a backpropagation algorithm;
the training the recognition model according to the objective function using the preset training algorithm includes:
determining, according to the objective function, the error between the predicted sequence obtained after the sequence sample is input into the recognition model and a preset target sequence, the preset target sequence being the preset forward target sequence or the preset backward target sequence;
training the recognition model by adjusting its parameters according to the error using the backpropagation algorithm.
In a second aspect, an embodiment of the present application provides a recognition model training device, the device comprising:
an acquisition module, configured to acquire a sequence sample;
a recognition module, configured to input the sequence sample into a recognition model to obtain a first probability of a preset forward target sequence and a second probability of a preset backward target sequence;
an arrangement module, configured to arrange the preset forward target sequence and the preset backward target sequence so that, at each position, the target from the preset backward target sequence comes first and the target from the preset forward target sequence comes second, to obtain a forward-backward target sequence;
a calculation module, configured to calculate a third probability of the forward-backward target sequence, and to calculate an objective function according to the first probability, the second probability, and the third probability;
a training module, configured to train the recognition model according to the objective function using a preset training algorithm.
Optionally, the recognition model includes a recurrent neural network and a connectionist temporal classification (CTC) algorithm;
the recognition module is specifically configured to:
input the sequence sample into the recurrent neural network, obtain, through the forward calculation of the recurrent neural network, a first probability sequence composed of the output probabilities of the features in the sequence sample, and calculate the first probability of the preset forward target sequence from the first probability sequence using the CTC algorithm;
obtain, through the backward calculation of the recurrent neural network, a second probability sequence composed of the output probabilities of the features in the sequence sample, and calculate the second probability of the preset backward target sequence from the second probability sequence using the CTC algorithm.
Optionally, the calculation module is specifically configured to:
calculate, from the first probability sequence and the second probability sequence, the mean of each output probability in the first probability sequence and the output probability at the same moment in the second probability sequence, to obtain a third probability sequence;
calculate the third probability of the forward-backward target sequence from the third probability sequence using the CTC algorithm.
Optionally, the calculation module is specifically configured to:
calculate the objective function from the first probability, the second probability, and the third probability using an objective function formula, where the formula is:
$$g = -\log(P_f) - \log(P_b) - \log(P_{fb})$$
where $g$ is the objective function, $P_f$ is the first probability, $P_b$ is the second probability, and $P_{fb}$ is the third probability.
Optionally, the preset training algorithm includes a backpropagation algorithm;
the training module is specifically configured to:
determine, according to the objective function, the error between the predicted sequence obtained after the sequence sample is input into the recognition model and a preset target sequence, the preset target sequence being the preset forward target sequence or the preset backward target sequence;
train the recognition model by adjusting its parameters according to the error using the backpropagation algorithm.
In the recognition model training method and device provided by the embodiments of the present application, a sequence sample is acquired and input into a recognition model to obtain a first probability of a preset forward target sequence and a second probability of a preset backward target sequence; the preset forward target sequence and the preset backward target sequence are arranged so that, at each position, the target from the preset backward target sequence comes first and the target from the preset forward target sequence comes second, to obtain a forward-backward target sequence, and a third probability of the forward-backward target sequence is calculated; an objective function is calculated from the first probability, the second probability, and the third probability; and the recognition model is trained according to the objective function using a preset training algorithm. Rearranging the preset forward and backward target sequences constrains the forward-backward target sequence so that, at every position, the backward target precedes the forward target. The resulting objective function therefore includes a constraint on the decoding positions of the forward and backward calculations: for the target at each position, the backward calculation decodes earlier than the forward calculation. Because forward calculation tends to be delayed and backward calculation tends to be early, this constraint on decoding positions means that, in the trained recognition model, the forward results are not delayed and the backward results are not early, which enables the recognition model to perform real-time recognition.
Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings needed for that description are briefly introduced below. The drawings described below are only some embodiments of the present application; a person of ordinary skill in the art can derive other drawings from them without creative effort.
FIG. 1 is a block diagram of prior-art RNN-based speech recognition;
FIG. 2 is a schematic flowchart of a recognition model training method according to an embodiment of the present application;
FIG. 3 is a schematic structural diagram of a recognition model training device according to an embodiment of the present application;
FIG. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings. The described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on these embodiments without creative effort fall within the protection scope of the present application.
FIG. 1 shows a block diagram of RNN-based speech recognition. Denote the input speech feature sequence as $x = [x_1, x_2, x_3, \ldots, x_T]$. The RNN is a sequence learning network model and may be an LSTM (Long Short-Term Memory) network, a GRU (Gated Recurrent Unit) network, or the like.
RNNs in common use today typically process the input feature sequence with either bidirectional calculation or forward calculation. With bidirectional calculation, the output at any moment depends on the entire input feature sequence, so the model can be used only for offline recognition, not real-time recognition. With forward calculation, the network model is also trained using forward calculation, and the trained model exhibits delays of varying length during recognition, so the recognition model again cannot meet the requirements of real-time recognition.
To enable the recognition model to perform real-time recognition, the embodiments of the present application provide a recognition model training method, a device, an electronic device, and a machine-readable storage medium.
The recognition model training method provided by the embodiments of the present application is introduced first.
The method may be executed by an electronic device that runs intelligent algorithms. The electronic device may be an intelligent device with functions such as target detection and tracking, behavior detection and recognition, or speech recognition, for example a remote computer, a remote server, a smart camera, or a smart voice device, and should include at least a processor equipped with a core processing chip. The method may be implemented as at least one of software, a hardware circuit, and a logic circuit provided in the executing device.
As shown in FIG. 2, the recognition model training method provided by the embodiments of the present application may include the following steps:
S201: acquire a sequence sample.
This embodiment can be applied to scenarios such as speech recognition and target tracking, so the sequence sample may be a speech sequence, a video frame sequence, a text sequence, or the like.
S202: input the sequence sample into the recognition model to obtain a first probability of a preset forward target sequence and a second probability of a preset backward target sequence.
The recognition model is a DNN model that implements functions such as speech recognition and target recognition. To perform recurrent operations on sequences, the recognition model includes an RNN and an algorithm unit for probability calculation. The probability calculation may use CTC (Connectionist Temporal Classification), an HMM (Hidden Markov Model), an attention mechanism, or the like.
The RNN's operation includes forward calculation and backward calculation. In forward calculation, the computation for the feature at a given moment takes as input not only that feature but also the operation states of the moments before it. In backward calculation, the computation for the feature at a given moment takes as input not only that feature but also the operation states of the moments after it. The preset forward target sequence is the target sequence expected from the forward calculation, and the preset backward target sequence is the target sequence expected from the backward calculation; usually the two are identical. The first probability is the probability of obtaining the preset forward target sequence through forward calculation, and the second probability is the probability of obtaining the preset backward target sequence through backward calculation.
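The difference between the two calculation directions can be illustrated with a toy recurrent cell. This is a minimal sketch, not the network described in the application: the scalar cell, the function name `rnn_pass`, and the weights `w_in` and `w_rec` are all hypothetical and chosen only to show which inputs each state depends on.

```python
import math

def rnn_pass(features, w_in=0.5, w_rec=0.8, reverse=False):
    """Run a toy scalar recurrent cell over `features` (hypothetical weights).

    reverse=False: the state at time t depends on features[0..t]   (forward calculation).
    reverse=True:  the state at time t depends on features[t..T-1] (backward calculation).
    """
    T = len(features)
    order = range(T - 1, -1, -1) if reverse else range(T)
    states = [0.0] * T
    h = 0.0  # initial operation state
    for t in order:
        # The new state mixes the current feature with the previously visited state.
        h = math.tanh(w_in * features[t] + w_rec * h)
        states[t] = h
    return states

x = [1.0, 0.0, 0.0, 2.0]
forward_states = rnn_pass(x)                 # each state sees only the past
backward_states = rnn_pass(x, reverse=True)  # each state sees only the future
```

In this sketch the forward state at the first moment is unaffected by later features, and the backward state at the last moment is unaffected by earlier ones, which mirrors the dependency structure described above.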
Optionally, the recognition model may include an RNN and a CTC algorithm.
An RNN is a powerful sequence learning model, but it requires pre-segmented input data, which greatly limits its application. Combining it with CTC removes the pre-segmentation requirement. The basic idea is to interpret the RNN's network output as a probability distribution over all possible class sequences; given this distribution, the objective is to maximize the probability of the preset target sequence.
For the input at each moment t, the network output layer has L+1 nodes, where L is the number of classes. The outputs of the first L nodes are the probabilities of observing each class at moment t, and the output of node L+1 is the probability of observing a blank. The blank output lets CTC handle ground-truth preset target sequences in which adjacent targets belong to the same class. Combining the output-layer values at all moments gives the probability of any target sequence. Because CTC takes all possible alignments into account, the data does not need to be pre-segmented.
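How the per-moment outputs are combined into the probability of a whole target sequence can be sketched with the standard CTC forward (dynamic programming) recursion. This is an illustrative sketch under assumptions, not code from the application: the function name `ctc_probability`, its parameters, and the toy numbers are hypothetical, and plain probabilities are used instead of log-probabilities, so it is only suitable for short sequences.

```python
def ctc_probability(probs, target, blank):
    """Probability of `target` given per-frame class probabilities (CTC forward recursion).

    probs[t][k] is the probability of class k at frame t; `blank` is the blank index.
    All alignments that collapse to `target` are summed.
    """
    # Extended sequence: blank, l1, blank, l2, ..., lU, blank
    ext = [blank]
    for label in target:
        ext += [label, blank]
    S, T = len(ext), len(probs)

    # At the first frame, an alignment may start with the leading blank or the first label.
    alpha = [0.0] * S
    alpha[0] = probs[0][ext[0]]
    if S > 1:
        alpha[1] = probs[0][ext[1]]

    for t in range(1, T):
        new = [0.0] * S
        for s in range(S):
            total = alpha[s]              # stay on the same symbol
            if s >= 1:
                total += alpha[s - 1]     # advance by one symbol
            # Skipping the blank between two labels is allowed only if the labels differ.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                total += alpha[s - 2]
            new[s] = total * probs[t][ext[s]]
        alpha = new

    # A valid alignment ends on the last label or on the trailing blank.
    return alpha[-1] + (alpha[-2] if S > 1 else 0.0)

# Toy example: one real class (index 0) plus blank (index 1), two frames.
frames = [[0.6, 0.4], [0.3, 0.7]]
p = ctc_probability(frames, target=[0], blank=1)
```

In the toy example the alignments "a a", "a blank", and "blank a" all collapse to the single-label target, so `p` is 0.18 + 0.42 + 0.12, approximately 0.72. The repeated-label rule in the inner loop is what makes the blank necessary when adjacent targets share a class.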
Correspondingly, S202 may specifically be:
input the sequence sample into the RNN, obtain, through the RNN's forward calculation, a first probability sequence composed of the output probabilities of the features in the sequence sample, and calculate the first probability of the preset forward target sequence from the first probability sequence using the CTC algorithm;
obtain, through the RNN's backward calculation, a second probability sequence composed of the output probabilities of the features in the sequence sample, and calculate the second probability of the preset backward target sequence from the second probability sequence using the CTC algorithm.
The RNN's forward calculation yields the first probability sequence $[p_{f1}, p_{f2}, p_{f3}, \ldots, p_{fT}]$, where the index $n \in [1, T]$ denotes a moment of the input feature sequence. With the preset forward target sequence set to $[\pi_{f1}, \pi_{f2}, \pi_{f3}, \ldots, \pi_{fU}]$, the CTC algorithm gives the first probability $P_f$ of the preset forward target sequence. The RNN's backward calculation yields the second probability sequence $[p_{b1}, p_{b2}, p_{b3}, \ldots, p_{bT}]$. With the preset backward target sequence set to $[\pi_{b1}, \pi_{b2}, \pi_{b3}, \ldots, \pi_{bU}]$, the CTC algorithm gives the second probability $P_b$ of the preset backward target sequence.
S203: arrange the preset forward target sequence and the preset backward target sequence so that, at each position, the target from the preset backward target sequence comes first and the target from the preset forward target sequence comes second, to obtain a forward-backward target sequence, and calculate a third probability of the forward-backward target sequence.
In RNN operation, forward calculation tends to be delayed and backward calculation tends to be early. So that the forward calculation is not delayed and the backward calculation is not early, this embodiment uses the decoding results of the forward and backward RNN passes to constrain each other: the preset forward and backward target sequences are arranged so that, at each position, the backward target precedes the forward target. With the preset forward target sequence $[\pi_{f1}, \pi_{f2}, \pi_{f3}, \ldots, \pi_{fU}]$ and the preset backward target sequence $[\pi_{b1}, \pi_{b2}, \pi_{b3}, \ldots, \pi_{bU}]$ as above, the resulting forward-backward target sequence is $[\pi_{b1}, \pi_{f1}, \pi_{b2}, \pi_{f2}, \pi_{b3}, \pi_{f3}, \ldots, \pi_{bU}, \pi_{fU}]$. The third probability of the forward-backward target sequence can be calculated with mechanisms such as CTC, an HMM, or attention.
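The arrangement step amounts to interleaving the two target sequences position by position, backward target first. A minimal sketch follows; the function name `interleave_targets` and the string labels are hypothetical and serve only to make the ordering visible.

```python
def interleave_targets(forward_target, backward_target):
    """Build the forward-backward target sequence.

    At each position the backward target comes first and the forward target second:
    [b1, f1, b2, f2, ..., bU, fU].
    """
    if len(forward_target) != len(backward_target):
        raise ValueError("forward and backward target sequences must have the same length")
    merged = []
    for b, f in zip(backward_target, forward_target):
        merged.extend([b, f])
    return merged

forward_target = ["f1", "f2", "f3"]
backward_target = ["b1", "b2", "b3"]
fb_target = interleave_targets(forward_target, backward_target)
# fb_target == ["b1", "f1", "b2", "f2", "b3", "f3"]
```

The result has length 2U for target sequences of length U, and every backward target sits immediately before the forward target for the same position.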
Optionally, the step in S203 of calculating the third probability of the forward-backward target sequence may specifically be:
calculate, from the first probability sequence and the second probability sequence, the mean of each output probability in the first probability sequence and the output probability at the same moment in the second probability sequence, to obtain a third probability sequence;
calculate the third probability of the forward-backward target sequence from the third probability sequence using the CTC algorithm.
Before the CTC algorithm is used to calculate the third probability, the mutually constrained forward-backward third probability sequence must be computed. It is obtained by adding each output probability in the first probability sequence to the output probability at the same moment in the second probability sequence and dividing by two, that is, by taking their mean. For example, if the first probability sequence is $[p_{f1}, p_{f2}, p_{f3}, \ldots, p_{fT}]$ and the second probability sequence is $[p_{b1}, p_{b2}, p_{b3}, \ldots, p_{bT}]$, the third probability sequence is $[(p_{f1}+p_{b1})/2, (p_{f2}+p_{b2})/2, (p_{f3}+p_{b3})/2, \ldots, (p_{fT}+p_{bT})/2]$. The CTC algorithm then gives the third probability $P_{fb}$ of the forward-backward target sequence.
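The averaging step can be sketched as an element-wise mean over two probability sequences of equal length. This sketch assumes that each moment's output is a list of class probabilities; the function name `average_probability_sequences` and the numbers are hypothetical.

```python
def average_probability_sequences(first, second):
    """Element-wise mean of two probability sequences.

    first[t][k] and second[t][k] are the forward and backward output probabilities
    of class k at moment t; the result is the third probability sequence.
    """
    if len(first) != len(second):
        raise ValueError("both probability sequences must cover the same moments")
    return [
        [(pf + pb) / 2 for pf, pb in zip(frame_f, frame_b)]
        for frame_f, frame_b in zip(first, second)
    ]

first_seq = [[0.6, 0.4], [0.2, 0.8]]   # forward outputs at two moments
second_seq = [[0.2, 0.8], [0.4, 0.6]]  # backward outputs at the same moments
third_seq = average_probability_sequences(first_seq, second_seq)
```

Because each input frame sums to one, each averaged frame also sums to one, so the third probability sequence is still a valid per-moment distribution that a CTC computation can consume.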
S204: compute an objective function from the first probability, the second probability, and the third probability.
The objective function is the function on which recognition model training is based, for example the gradient function in gradient-based training; it characterizes the direction and magnitude of the model parameter adjustments. This embodiment computes the first probability of the preset forward target sequence, the second probability of the preset backward target sequence, and the third probability of the forward-backward target sequence. Combining these three probabilities yields an objective function that characterizes the direction and magnitude of the parameter adjustments more completely. This objective function preserves the recognition rate of both the forward and backward passes while also constraining their decoding order.
Optionally, S204 may specifically be:
Compute the objective function from the first probability, the second probability, and the third probability using the objective function formula:
$$g = -\log(P_f) - \log(P_b) - \log(P_{fb}) \qquad (1)$$
where $g$ is the objective function, $P_f$ is the first probability, $P_b$ is the second probability, and $P_{fb}$ is the third probability.
For training algorithms such as backpropagation, the objective function is logarithmic in the probabilities: take the negative logarithm of the first probability, the second probability, and the third probability separately, then add the three results to obtain the objective function.
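Formula (1) translates directly into code. The sketch below is illustrative only: the function name `objective` and the example probabilities are hypothetical, and the natural logarithm is assumed because the text does not specify a base.

```python
import math


def objective(p_f: float, p_b: float, p_fb: float) -> float:
    """g = -log(P_f) - log(P_b) - log(P_fb), using the natural logarithm."""
    for name, p in (("p_f", p_f), ("p_b", p_b), ("p_fb", p_fb)):
        if not 0.0 < p <= 1.0:
            raise ValueError(f"{name} must be a probability in (0, 1], got {p}")
    return -math.log(p_f) - math.log(p_b) - math.log(p_fb)


# Hypothetical probabilities for one training sample.
g = objective(0.5, 0.5, 0.5)
print(g)
```

Because each term is a negative log-probability, g is zero only when all three target sequences are assigned probability 1, and it grows as any one of them becomes less likely.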
S205: train the recognition model on the objective function using a preset training algorithm.
The preset training algorithm may be a commonly used training algorithm such as backpropagation or a gradient algorithm; it is not specifically limited here.
Optionally, the preset training algorithm may include a backpropagation algorithm.
Correspondingly, S205 may specifically be:
According to the objective function, determine the error between a preset target sequence and the predicted sequence obtained after the sequence sample is input into the recognition model, where the preset target sequence is the preset forward target sequence or the preset backward target sequence;
According to the error between the predicted sequence and the preset target sequence, train the recognition model by adjusting its parameters using the backpropagation algorithm.
Training consists of inputting the sequence sample into the recognition model to obtain a predicted sequence, computing the error between the predicted sequence and the preset target sequence, and, based on that error, repeatedly adjusting the model parameters with the backpropagation algorithm; the recognition model is trained over many such iterations. Once the final recognition model has been trained, inputs such as speech sequences or video sequences can be recognized in real time using only the forward pass of the RNN.
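The iterative structure of this training process can be illustrated with a deliberately small stand-in. In the sketch below the "model" is a single softmax over trainable logits rather than an RNN, the objective is one negative log-probability rather than formula (1), and the gradient is written out by hand instead of being obtained by backpropagation through a network. All names and hyperparameters (`train_toy`, `lr`, `steps`) are hypothetical; only the loop of predict, measure error, adjust parameters, repeat mirrors the text.

```python
import math
from typing import List, Tuple


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train_toy(target: int, num_labels: int = 3,
              lr: float = 0.5, steps: int = 200) -> Tuple[List[float], List[float]]:
    """Gradient descent on -log p[target] for a single softmax 'model'."""
    logits = [0.0] * num_labels          # model parameters
    losses = []
    for _ in range(steps):
        probs = softmax(logits)          # prediction
        losses.append(-math.log(probs[target]))   # error w.r.t. the target
        for k in range(num_labels):
            # d(-log p[target]) / d logits[k] = p[k] - 1{k == target}
            grad = probs[k] - (1.0 if k == target else 0.0)
            logits[k] -= lr * grad       # parameter update
    return logits, losses


logits, losses = train_toy(target=1)
print(losses[0], losses[-1])
```

The loss starts at log 3 (uniform prediction over three labels) and decreases as the parameters are adjusted toward the target label.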
In this embodiment, a sequence sample is obtained and input into the recognition model to obtain the first probability of the preset forward target sequence and the second probability of the preset backward target sequence. The preset forward target sequence and the preset backward target sequence are then interleaved so that, at each position, the target from the preset backward target sequence precedes the target from the preset forward target sequence, yielding the forward-backward target sequence, whose third probability is computed. An objective function is computed from the first probability, the second probability, and the third probability, and the recognition model is trained on that objective function with a preset training algorithm. Because the interleaving fixes, at every position, the order "backward target first, forward target second", the resulting objective function carries a constraint on the decoding positions of the forward and backward passes: for each target, the backward pass decodes earlier than the forward pass. Since the forward pass tends to be delayed and the backward pass tends to be early, this constraint on decoding positions keeps the forward results from being delayed and the backward results from being advanced in the trained recognition model, which enables real-time recognition.
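Corresponding to the above method embodiment, an embodiment of this application provides a recognition model training apparatus. As shown in FIG. 3, the recognition model training apparatus may include:
an acquisition module 310, configured to obtain a sequence sample;
a recognition module 320, configured to input the sequence sample into a recognition model to obtain a first probability of a preset forward target sequence and a second probability of a preset backward target sequence;
an arrangement module 330, configured to interleave the preset forward target sequence and the preset backward target sequence so that, at each position, the target from the preset backward target sequence precedes the target from the preset forward target sequence, to obtain a forward-backward target sequence;
a calculation module 340, configured to compute a third probability of the forward-backward target sequence, and to compute an objective function from the first probability, the second probability, and the third probability;
a training module 350, configured to train the recognition model on the objective function using a preset training algorithm.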
Optionally, the recognition model may include a recurrent neural network and a connectionist temporal classification (CTC) algorithm;
the recognition module 320 may specifically be configured to:
input the sequence sample into the recurrent neural network; obtain, through the forward pass of the recurrent neural network, a first probability sequence composed of the output probabilities of the features in the sequence sample; and compute, from the first probability sequence and using the connectionist temporal classification algorithm, the first probability of the preset forward target sequence;
obtain, through the backward pass of the recurrent neural network, a second probability sequence composed of the output probabilities of the features in the sequence sample; and compute, from the second probability sequence and using the connectionist temporal classification algorithm, the second probability of the preset backward target sequence.
Optionally, the calculation module 340 may specifically be configured to:
compute, from the first probability sequence and the second probability sequence, the mean of each output probability in the first probability sequence and the output probability at the same time step in the second probability sequence, to obtain a third probability sequence;
compute, from the third probability sequence and using the connectionist temporal classification algorithm, the third probability of the forward-backward target sequence.
Optionally, the calculation module 340 may specifically be configured to:
compute the objective function from the first probability, the second probability, and the third probability using the objective function formula:
$$g = -\log(P_f) - \log(P_b) - \log(P_{fb})$$
where $g$ is the objective function, $P_f$ is the first probability, $P_b$ is the second probability, and $P_{fb}$ is the third probability.
Optionally, the preset training algorithm may include a backpropagation algorithm;
the training module 350 may specifically be configured to:
determine, according to the objective function, the error between a preset target sequence and the predicted sequence obtained after the sequence sample is input into the recognition model, where the preset target sequence is the preset forward target sequence or the preset backward target sequence;
train the recognition model, according to the error, by adjusting the parameters of the recognition model using the backpropagation algorithm.
In this embodiment, a sequence sample is obtained and input into the recognition model to obtain the first probability of the preset forward target sequence and the second probability of the preset backward target sequence. The preset forward target sequence and the preset backward target sequence are then interleaved so that, at each position, the target from the preset backward target sequence precedes the target from the preset forward target sequence, yielding the forward-backward target sequence, whose third probability is computed. An objective function is computed from the first probability, the second probability, and the third probability, and the recognition model is trained on that objective function with a preset training algorithm. Because the interleaving fixes, at every position, the order "backward target first, forward target second", the resulting objective function carries a constraint on the decoding positions of the forward and backward passes: for each target, the backward pass decodes earlier than the forward pass. Since the forward pass tends to be delayed and the backward pass tends to be early, this constraint on decoding positions keeps the forward results from being delayed and the backward results from being advanced in the trained recognition model, which enables real-time recognition.
Corresponding to the above method embodiment, an embodiment of this application provides an electronic device. As shown in FIG. 4, the electronic device includes a processor 401 and a memory 402, where:
the memory 402 is configured to store a computer program;
the processor 401 is configured to implement any step of the above recognition model training method when executing the computer program stored in the memory 402.
The memory may include RAM (Random Access Memory) and may also include NVM (Non-Volatile Memory), for example at least one disk memory. Optionally, the memory may also be at least one storage device located remotely from the processor.
The processor may be a general-purpose processor, including a GPU (Graphics Processing Unit), a CPU (Central Processing Unit), an NP (Network Processor), and the like; it may also be a DSP (Digital Signal Processor), an ASIC (Application-Specific Integrated Circuit), an FPGA (Field-Programmable Gate Array) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
In this embodiment, by reading and running the computer program stored in the memory, the processor of the electronic device can do the following. A sequence sample is obtained and input into the recognition model to obtain the first probability of the preset forward target sequence and the second probability of the preset backward target sequence. The preset forward target sequence and the preset backward target sequence are then interleaved so that, at each position, the target from the preset backward target sequence precedes the target from the preset forward target sequence, yielding the forward-backward target sequence, whose third probability is computed. An objective function is computed from the first probability, the second probability, and the third probability, and the recognition model is trained on that objective function with a preset training algorithm. Because the interleaving fixes, at every position, the order "backward target first, forward target second", the resulting objective function carries a constraint on the decoding positions of the forward and backward passes: for each target, the backward pass decodes earlier than the forward pass. Since the forward pass tends to be delayed and the backward pass tends to be early, this constraint on decoding positions keeps the forward results from being delayed and the backward results from being advanced in the trained recognition model, which enables real-time recognition.
In addition, corresponding to the recognition model training method provided by the above embodiments, an embodiment of this application provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements any step of the above recognition model training method.
In this embodiment, the computer-readable storage medium stores a computer program that, when run, executes the recognition model training method provided by the embodiments of this application, and can therefore do the following. A sequence sample is obtained and input into the recognition model to obtain the first probability of the preset forward target sequence and the second probability of the preset backward target sequence. The preset forward target sequence and the preset backward target sequence are then interleaved so that, at each position, the target from the preset backward target sequence precedes the target from the preset forward target sequence, yielding the forward-backward target sequence, whose third probability is computed. An objective function is computed from the first probability, the second probability, and the third probability, and the recognition model is trained on that objective function with a preset training algorithm. Because the interleaving fixes, at every position, the order "backward target first, forward target second", the resulting objective function carries a constraint on the decoding positions of the forward and backward passes: for each target, the backward pass decodes earlier than the forward pass. Since the forward pass tends to be delayed and the backward pass tends to be early, this constraint on decoding positions keeps the forward results from being delayed and the backward results from being advanced in the trained recognition model, which enables real-time recognition.
Because the method content involved in the electronic device and computer-readable storage medium embodiments is substantially similar to the foregoing method embodiments, those embodiments are described only briefly; for the relevant details, refer to the corresponding parts of the method embodiments.
It should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "include", "comprise", and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element introduced by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that includes that element.
The embodiments in this specification are described in a related manner; for parts that are the same or similar across embodiments, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, because the apparatus, electronic device, and computer-readable storage medium embodiments are substantially similar to the method embodiments, they are described only briefly; for the relevant details, refer to the corresponding parts of the method embodiments.
The above are merely preferred embodiments of this application and are not intended to limit its scope of protection. Any modification, equivalent replacement, or improvement made within the spirit and principles of this application falls within the scope of protection of this application.
Claims (10)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201811019880.1A CN110874553A (en) | 2018-09-03 | 2018-09-03 | Recognition model training method and device |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201811019880.1A CN110874553A (en) | 2018-09-03 | 2018-09-03 | Recognition model training method and device |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN110874553A true CN110874553A (en) | 2020-03-10 |
Family
ID=69716749
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN201811019880.1A Pending CN110874553A (en) | 2018-09-03 | 2018-09-03 | Recognition model training method and device |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN110874553A (en) |
Cited By (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN111737920A (en) * | 2020-06-24 | 2020-10-02 | 深圳前海微众银行股份有限公司 | Data processing method, equipment and medium based on recurrent neural network |
| CN114463376A (en) * | 2021-12-24 | 2022-05-10 | 北京达佳互联信息技术有限公司 | Video character tracking method and device, electronic equipment and storage medium |
- 2018-09-03: CN application CN201811019880.1A filed (patent/CN110874553A/en), status: active, Pending
Cited By (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN111737920A (en) * | 2020-06-24 | 2020-10-02 | 深圳前海微众银行股份有限公司 | Data processing method, equipment and medium based on recurrent neural network |
| CN111737920B (en) * | 2020-06-24 | 2024-04-26 | 深圳前海微众银行股份有限公司 | Data processing method, equipment and medium based on cyclic neural network |
| CN114463376A (en) * | 2021-12-24 | 2022-05-10 | 北京达佳互联信息技术有限公司 | Video character tracking method and device, electronic equipment and storage medium |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| CN111368937B (en) | Image classification method and device, training method and device, equipment and medium | |
| US11403479B2 (en) | Feedback signaling to facilitate data classification functionality of a spiking neural network | |
| US12106058B2 (en) | Multi-turn dialogue response generation using asymmetric adversarial machine classifiers | |
| CN110807437B (en) | Video granularity characteristic determination method and device and computer-readable storage medium | |
| US20240029436A1 (en) | Action classification in video clips using attention-based neural networks | |
| US20230394245A1 (en) | Adversarial Bootstrapping for Multi-Turn Dialogue Model Training | |
| WO2021103761A1 (en) | Compound property analysis method and apparatus, compound property analysis model training method, and storage medium | |
| Kim et al. | Orchard: Visual object recognition accelerator based on approximate in-memory processing | |
| WO2021238262A1 (en) | Vehicle recognition method and apparatus, device, and storage medium | |
| JP2019528502A (en) | Method and apparatus for optimizing a model applicable to pattern recognition and terminal device | |
| CN114118259B (en) | Target detection method and device | |
| CN115691475B (en) | Methods for training speech recognition models and speech recognition methods | |
| CN109523014B (en) | News comment automatic generation method and system based on generative confrontation network model | |
| CN112990444A (en) | Hybrid neural network training method, system, equipment and storage medium | |
| CN111553477A (en) | Image processing method, device and storage medium | |
| CN114358111A (en) | Object clustering model acquisition method, object clustering method and device | |
| CN106296734B (en) | Method for tracking target based on extreme learning machine and boosting Multiple Kernel Learnings | |
| CN111858999B (en) | Retrieval method and device based on segmentation difficult sample generation | |
| CN114764593A (en) | Model training method, model training device and electronic equipment | |
| CN110874553A (en) | Recognition model training method and device | |
| CN111950629A (en) | Adversarial sample detection method, device and equipment | |
| CN110880018A (en) | Convolutional neural network target classification method based on novel loss function | |
| CN113257281B (en) | Method for carrying out hierarchical uncertainty quantitative estimation on multi-modal emotion recognition | |
| CN110414515B (en) | Chinese character image recognition method, device and storage medium based on information fusion processing | |
| CN116861374A (en) | Intention deviation prompting method and device based on behavior habit, medium and equipment |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| PB01 | Publication | ||
| SE01 | Entry into force of request for substantive examination | ||