
CN111832601A - State detection method, model training method, storage medium and electronic device - Google Patents


Info

Publication number
CN111832601A
Authority
CN
China
Prior art keywords
detected
model
branch task
image
feature
Prior art date
Legal status
Pending
Application number
CN202010287493.7A
Other languages
Chinese (zh)
Inventor
赵元
沈海峰
Current Assignee
Beijing Didi Infinity Technology and Development Co Ltd
Original Assignee
Beijing Didi Infinity Technology and Development Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Didi Infinity Technology and Development Co Ltd filed Critical Beijing Didi Infinity Technology and Development Co Ltd
Priority to CN202010287493.7A
Publication of CN111832601A
Legal status: Pending

Classifications

    • G PHYSICS > G06F ELECTRIC DIGITAL DATA PROCESSING > G06F18/00 Pattern recognition > G06F18/24 Classification techniques > G06F18/2411 Classification techniques relating to the classification model based on the proximity to a decision surface, e.g. support vector machines
    • G PHYSICS > G06F ELECTRIC DIGITAL DATA PROCESSING > G06F18/00 Pattern recognition > G06F18/21 Design or setup of recognition systems or techniques > G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS > G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS > G06N3/00 Computing arrangements based on biological models > G06N3/04 Architecture, e.g. interconnection topology > G06N3/045 Combinations of networks
    • G PHYSICS > G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS > G06N3/00 Computing arrangements based on biological models > G06N3/08 Learning methods > G06N3/084 Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computational Linguistics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The embodiments of the present invention disclose a state detection method, a model training method, a storage medium and an electronic device. Multiple feature vectors of an image sequence to be detected are obtained, multiple branch tasks and the corresponding feature weight vector sets are determined, the input vector of each branch task is determined from the feature vectors and the corresponding feature weight vector set, and a branch task state is output for each branch task; the state detection result is finally determined from the multiple branch task states. Because the input vector of each branch task is computed from that branch task's own feature weight vector set and the multiple feature vectors, the input vector of each branch task can be determined in a targeted manner, which improves the accuracy of the branch task state produced for each branch task and thus the accuracy of the final state detection result.

Description

State detection method, model training method, storage medium and electronic device

Technical Field

The present invention relates to the field of computer technology, and in particular to a state detection method, a model training method, a storage medium and an electronic device.

Background

In the field of deep learning, detecting the state of a piece of data usually involves multiple models, and there are currently two main approaches. Single-task detection methods perform poorly in complex application scenarios, their computation is complicated and occupies a large amount of computing resources in practice, so they are unsuitable for most real-world scenarios. Multi-task detection methods can save computational overhead, but because multiple models share a single set of input features, these features are forced into the training of every model, which makes the training of each model inefficient and makes it difficult to reach detection performance better than that of single-task models.

Summary of the Invention

In view of this, the embodiments of the present invention provide a state detection method, a model training method, a storage medium and an electronic device, aiming to improve the accuracy of the detection result obtained when a multi-task model performs state detection.

In a first aspect, an embodiment of the present invention discloses a state detection method, the method comprising:

determining an image sequence to be detected, the image sequence to be detected comprising at least one image to be detected;

passing each image to be detected through N convolutional layers in sequence to obtain the corresponding N feature vectors, where N is an integer greater than or equal to 2;

determining a plurality of branch tasks;

determining a feature weight vector set corresponding to each branch task, each feature weight vector corresponding to one of the feature vectors;

determining the input vector of each branch task according to the N feature vectors corresponding to each image to be detected and the feature weight vector set;

inputting each input vector into an image classifier used to process the corresponding branch task to determine the branch task state corresponding to the image sequence to be detected;

determining a state detection result according to the branch task state corresponding to the image sequence to be detected.

Further, passing each image to be detected through N convolutional layers in sequence to obtain the corresponding N feature vectors is specifically:

inputting each image to be detected into a pre-trained convolutional neural network, and taking the outputs of N convolutional layers in the convolutional neural network as the N feature vectors corresponding to the input image to be detected.

Further, determining the input vector of each branch task according to the N feature vectors corresponding to each image to be detected and the feature weight vector set comprises:

determining a target branch task;

weighting the N feature vectors corresponding to each image to be detected according to the feature weight vector set corresponding to the target branch task, so as to determine the input vector of the target branch task.

Further, weighting the N feature vectors corresponding to each image to be detected according to the feature weight vector set corresponding to the target branch task to determine the input vector of the target branch task is specifically:

determining a target image to be detected;

for each target branch task, weighting each feature vector corresponding to the target image to be detected according to each feature weight in the corresponding feature weight vector set;

inputting each weighted feature vector into a nonlinear mapping function, and computing the sum of the outputs to determine the input vector of the target image to be detected for each target branch task.

Further, determining the state detection result according to the branch task state corresponding to the image sequence to be detected is specifically:

inputting the branch task state corresponding to the image sequence to be detected into a state sub-model, which outputs the state detection result.

In a second aspect, an embodiment of the present invention discloses a state detection model training method, the method comprising:

determining a training set, the training set comprising a plurality of images to be detected and the corresponding branch task states;

taking each image to be detected as the input of a state detection model and the corresponding branch task state as the output, and predicting the undetermined model parameters in the state detection model, the state detection model comprising a feature vector extraction sub-model and a classification sub-model;

iteratively inputting each image to be detected in the training set into a state detection model whose model parameters are the undetermined model parameters, so as to obtain the corresponding predicted branch task state, the model parameters comprising feature vector extraction parameters, feature weight vector parameters and classification model parameters;

determining a loss coefficient according to the predicted branch task state and the branch task state corresponding to each image to be detected;

updating the undetermined model parameters until the loss coefficient no longer decreases.

Further, updating the undetermined model parameters until the loss coefficient no longer decreases is specifically:

determining the undetermined model parameters to be updated according to a back-propagation algorithm;

updating the undetermined model parameters to be updated to new undetermined model parameters until the loss coefficient no longer decreases.

Further, the state detection model also includes an attention mechanism;

the feature vector extraction parameters are parameters of the feature vector extraction sub-model, the feature weight vector parameters are parameters of the attention mechanism, and the classification model parameters are parameters of the classification sub-model.

In a third aspect, an embodiment of the present invention discloses a computer-readable storage medium for storing computer program instructions which, when executed by a processor, implement the method of any one of the first and second aspects.

In a fourth aspect, an embodiment of the present invention discloses an electronic device comprising a memory and a processor, the memory being configured to store one or more computer program instructions, wherein the one or more computer program instructions are executed by the processor to implement the method of any one of the first and second aspects.

In the embodiments of the present invention, the input vector of each branch task is computed from that branch task's own feature weight vector set and the multiple feature vectors, so that the input vector of each branch task can be determined in a targeted manner; this improves the accuracy of the branch task state produced for each branch task and thus the accuracy of the final state detection result.

Description of the Drawings

The above and other objects, features and advantages of the present invention will become clearer from the following description of embodiments of the present invention with reference to the accompanying drawings, in which:

Fig. 1 is a flowchart of a state detection method according to an embodiment of the present invention;

Fig. 2 is a schematic diagram of a system for performing the state detection method according to an embodiment of the present invention;

Fig. 3 is a schematic diagram of the data flow of the state detection method according to an embodiment of the present invention;

Fig. 4 is a flowchart of a state detection model training method according to an embodiment of the present invention;

Fig. 5 is a schematic diagram of the data flow of the state detection model training method according to an embodiment of the present invention;

Fig. 6 is a schematic diagram of an electronic device according to an embodiment of the present invention.

Detailed Description

The present invention is described below on the basis of embodiments, but it is not limited to these embodiments. In the following detailed description of the invention, certain specific details are described at length; the invention can be fully understood by those skilled in the art without these details. In order to avoid obscuring the essence of the present invention, well-known methods, procedures, flows, components and circuits are not described in detail.

Furthermore, those of ordinary skill in the art will appreciate that the drawings provided herein are for illustrative purposes and are not necessarily drawn to scale.

Unless the context clearly requires otherwise, words such as "including" and "comprising" in the specification should be construed in an inclusive rather than an exclusive or exhaustive sense; that is, in the sense of "including but not limited to".

In the description of the present invention, it should be understood that terms such as "first" and "second" are used for descriptive purposes only and should not be construed as indicating or implying relative importance. In addition, in the description of the present invention, unless otherwise specified, "plurality" means two or more.

Fig. 1 is a flowchart of a state detection method according to an embodiment of the present invention. As shown in Fig. 1, the state detection method includes the following steps:

Step S100: determining an image sequence to be detected.

Specifically, the image sequence to be detected includes at least one image to be detected and is determined by a server. The server may determine the image sequence to be detected by, for example, receiving images to be detected uploaded by a terminal device through a preset application programming interface, where the terminal device is a general-purpose data processing terminal capable of running a computer program and having a communication function, such as a smartphone, a tablet computer, a notebook computer or another terminal device equipped with a communication module. At least one image to be detected in the sequence contains image information used to detect the corresponding state. For example, when the state detection method is used to detect the traffic flow state at an intersection, each image to be detected in the sequence contains intersection information; when the state detection method is used to detect a face state, the images to be detected in the sequence contain face information.

Fig. 2 is a schematic diagram of a system for performing the state detection method according to an embodiment of the present invention. As shown in Fig. 2, the system performing the state detection method may include only a terminal device 21 that acquires the image sequence to be detected and performs state detection on it, or may include a server 20 and a terminal device 21 connected through a network, where the terminal device 21 uploads the image sequence to be detected to the server 20 and the server 20 performs state detection based on the image sequence to be detected.

Specifically, the system of the state detection method described in the embodiments of the present invention can be applied to various state detection scenarios, such as driver fatigue detection, road congestion detection and user emotion detection. Taking the detection of driver fatigue as an example, in an optional implementation of the embodiment of the present invention, the terminal device 21 is a data processing apparatus installed in the vehicle and having a camera function, which periodically collects an image sequence to be detected that includes the driver's facial state and extracts the driver's facial features from the sequence to detect whether the driver is in a fatigued driving state. In another optional implementation of the embodiment of the present invention, the terminal device 21 may be a data processing apparatus installed in the vehicle and having a communication function and a camera function, which periodically collects an image sequence to be detected that includes the driver's facial state and uploads the sequence to the server 20. The server 20 extracts the driver's facial features from the image sequence to be detected uploaded by the terminal device 21 to detect whether the driver is in a fatigued driving state.

Taking the detection of road congestion as an example, in an optional implementation of the embodiment of the present invention, the terminal device 21 is a data processing apparatus installed at an intersection and having a camera function, which periodically collects an image sequence to be detected that includes the state of vehicles passing through the intersection and extracts vehicle motion features from the sequence to detect whether the intersection is congested. In another optional implementation of the embodiment of the present invention, the terminal device 21 may be a data processing apparatus installed at an intersection and having a communication function and a camera function, which periodically collects an image sequence to be detected that includes the vehicle state at the intersection and uploads the sequence to the server 20. The server 20 extracts vehicle motion features from the image sequence to be detected uploaded by the terminal device 21 to detect whether the intersection is congested.

Further, the state detection method described in the embodiments of the present invention and the scenarios in which it is applied are also applicable to detecting the state of other types of data, such as state detection based on content recorded as audio or as text.

Step S200: passing each image to be detected through N convolutional layers in sequence to obtain the corresponding N feature vectors.

Specifically, after determining the image sequence to be detected, the server performs feature extraction on each image to be detected in the sequence in turn. In the embodiment of the present invention, the feature extraction process passes each image to be detected through N convolutional layers in sequence and takes the output of each convolutional layer as the corresponding feature vector. In an optional implementation of the embodiment of the present invention, each convolutional layer is a common feature extraction structure of an image classifier, including but not limited to architectures such as AlexNet, GoogLeNet, VGG and ResNet, and the feature vector output by each layer is computed from the feature vector output by the previous layer.

Fig. 3 is a schematic diagram of the data flow of the state detection method according to an embodiment of the present invention. As shown in Fig. 3, the N convolutional layers may belong to the same convolutional neural network 30, which contains N or more convolutional layers. When performing feature extraction on an image to be detected, the server inputs the image into the convolutional neural network 30, and the image passes through multiple convolutional layers in sequence inside the network, the output of each convolutional layer serving as the input of the next. The server takes the N outputs f_i (1 ≤ i ≤ N) of convolutional layers 1 to N of the convolutional neural network 30 as the feature vectors corresponding to the image to be detected. The convolutional neural network 30 can be trained from multiple labeled images and the multiple feature vectors corresponding to each image, that is, the image information is used as the input and the multiple corresponding feature vectors are used as the outputs of the individual convolutional layers when the convolutional neural network 30 is trained.
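The following is a minimal sketch, not part of the patent, of how N intermediate feature vectors could be tapped from a pre-trained CNN; PyTorch, the resnet18 backbone, the choice of tapped layers and the pooling step are assumptions made purely for illustration.

```python
# Illustrative sketch: take the outputs of N convolutional stages of a
# pre-trained CNN as the feature vectors f_1..f_N (N = 3 here).
import torch
import torchvision.models as models

backbone = models.resnet18(pretrained=True)
backbone.eval()

tap_layers = [backbone.layer2, backbone.layer3, backbone.layer4]
features = []

def make_hook(store):
    def hook(module, inputs, output):
        # Global-average-pool each feature map into a flat feature vector.
        store.append(torch.nn.functional.adaptive_avg_pool2d(output, 1).flatten(1))
    return hook

for layer in tap_layers:
    layer.register_forward_hook(make_hook(features))

image = torch.randn(1, 3, 224, 224)   # one image of the sequence to be detected
with torch.no_grad():
    backbone(image)

# features now holds N vectors, playing the role of f_1, f_2, f_3 in the text.
```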

Taking the use of the state detection method for detecting driver fatigue as an example: after obtaining the image sequence to be detected, the server inputs each image to be detected in the sequence into the pre-trained convolutional neural network 30 in turn; the first convolutional layer of the convolutional neural network 30 outputs the feature vector f_1 corresponding to the person information in each image to be detected, the second convolutional layer outputs the feature vector f_2 corresponding to the face information within the person information, and the third convolutional layer outputs the feature vector f_3 corresponding to the eye information within the face information.

Step S300: determining a plurality of branch tasks.

Specifically, the server may preset a plurality of branch tasks, each of which is used to determine a corresponding branch task state, so that the server can determine the state of the image sequence to be detected from the branch task states of the branch tasks. As shown in Fig. 3, in the embodiment of the present invention each branch task corresponds to a pre-trained image classifier 32, so that each image classifier 32 determines the corresponding branch task state from the feature vectors f_i (1 ≤ i ≤ N) corresponding to each image to be detected.

Taking the use of the state detection method for detecting driver fatigue as an example, the branch tasks may include a face detection task, an eye open/closed detection task and a yawn detection task. The branch tasks correspond respectively to image classifier 1 for detecting the degree of fatigue, image classifier 2 for detecting blinking and image classifier 3 for detecting yawning. The branch task states output by the image classifiers are whether the face shows fatigue, whether the eyes are blinking and whether the driver is yawning, respectively.

Step S400: determining the feature weight vector set corresponding to each branch task.

Specifically, since the branch tasks detect different branch task states, a corresponding input vector needs to be generated for each branch task in order to improve the accuracy of each branch task state. In the embodiment of the present invention, the input vector of each branch task is determined from the feature weight vector set corresponding to that branch task, where each feature weight vector is obtained by pre-training and corresponds to one of the feature vectors. The feature weight vectors may be obtained by training an attention mechanism on the server. For example, the server may construct an input/output training set for the attention mechanism in advance, train the attention mechanism with this training set, and adjust the parameters of the attention mechanism during training; when the loss reaches its minimum, the parameters of the attention mechanism at that point are taken as the feature weight vectors that form the feature weight vector set.

Step S500: determining the input vector of each branch task according to the N feature vectors corresponding to each image to be detected and the feature weight vector set.

Specifically, after determining the N feature vectors corresponding to each image to be detected, the branch tasks and the feature weight vector set corresponding to each branch task, the server determines the input vector of each branch task from these parameters. In the embodiment of the present invention, the server may select a target branch task among the branch tasks, take the elements of the feature weight vector set corresponding to the target branch task as the parameters of the attention mechanism, take the N feature vectors corresponding to one image to be detected as the input of the attention mechanism, and output the input vector that characterizes the image to be detected for the target branch task. The attention mechanism determines the input vector by weighting the N feature vectors corresponding to each image to be detected according to the feature weight vector set corresponding to the target branch task, and then determining the input vector of the target branch task from the N weighted feature vectors. That is, a target image to be detected is first determined; for each target branch task, each feature vector corresponding to the target image to be detected is weighted according to the feature weights in the corresponding feature weight vector set; finally, each weighted feature vector is input into a nonlinear mapping function, and the sum of the outputs is computed to determine the input vector of the target image to be detected for each target branch task. The target image to be detected is selected from the image sequence to be detected; after the input vectors corresponding to the target image to be detected have been determined, the server may select the next target image to be detected from the sequence and determine its input vectors in the same way.

Further, the input vector of the target branch task can be determined by the following formula:

F_j = Σ_{i=1}^{N} Φ(α_ji · f_i),  1 ≤ j ≤ M

where F_j is the input vector corresponding to the target branch task, M is the number of branch tasks, N is the number of feature vectors, f_i is a feature vector, α_ji is the feature weight vector corresponding to the target branch task and the feature vector, and Φ is a nonlinear mapping function. As shown in Fig. 3, the server inputs the image to be detected into the convolutional neural network 30, which outputs N feature vectors f_i (1 ≤ i ≤ N) containing features of the image; each feature vector f_i (1 ≤ i ≤ N) is input into the attention mechanism 31, which determines the input vector F_j (1 ≤ j ≤ M) of the target branch task from the feature weight vector set α_ji (1 ≤ i ≤ N, 1 ≤ j ≤ M) corresponding to the target branch task and the feature vectors f_i (1 ≤ i ≤ N). For example, when the target branch task is processed by image classifier 1, the server substitutes the feature weight vector set α_1i (1 ≤ i ≤ N) corresponding to image classifier 1 and the feature vectors f_i (1 ≤ i ≤ N) into the above formula to obtain the input vector F_1 of the target branch task. Further, after determining the input vector of one target branch task, the server takes a branch task processed by another image classifier as the target branch task and determines the input vector of the new target branch task, until the input vectors of all branch tasks have been obtained.
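A minimal sketch of the computation F_j = Σ_i Φ(α_ji · f_i) described above; it is not the patent's implementation, and it assumes PyTorch, ReLU as the nonlinear mapping function Φ, a scalar weight α_ji per (branch, layer) pair, and feature vectors of equal dimension so that they can be summed.

```python
import torch
import torch.nn.functional as F

def branch_input_vectors(features, alpha):
    """features: list of N tensors f_i, each of shape [batch, dim]
    alpha: tensor of shape [M, N] holding the feature weights α_ji
    returns: list of M tensors, the input vectors F_j of shape [batch, dim]"""
    M, N = alpha.shape
    assert N == len(features)
    inputs = []
    for j in range(M):
        # Weight every feature vector, map it through Φ, then sum the results.
        weighted = [F.relu(alpha[j, i] * features[i]) for i in range(N)]
        inputs.append(torch.stack(weighted, dim=0).sum(dim=0))
    return inputs
```

If the feature weights are full vectors rather than scalars, the same sketch applies with element-wise multiplication in place of the scalar product.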

Step S600: inputting each input vector into the image classifier used to process the corresponding branch task to determine the branch task state corresponding to the image sequence to be detected.

Specifically, for each image to be detected, after the server has determined the input vector of each branch task from the image's N feature vectors and the feature weight vector set corresponding to each branch task, it inputs the input vector of each branch task into the image classifier used to process that branch task, which outputs the corresponding branch task state. After the branch task states of all images to be detected in the sequence have been determined, the server determines the branch task state corresponding to the image sequence to be detected from the branch task states of all the images. Each image classifier is pre-trained by the server.

Taking the use of the state detection method for detecting driver fatigue as an example, where the branch tasks include a face detection task, an eye open/closed detection task and a yawn detection task: for each image to be detected in the sequence, the server inputs the input vector of each branch task into the corresponding image classifier, and determines from the outputs the branch task states of whether the face in the image shows fatigue, whether the eyes are blinking and whether the driver is yawning. Further, the server determines the branch task state corresponding to the image sequence to be detected from the branch task states of all images to be detected, for example the number of images judged to show fatigue, the number of images judged to show blinking and the number of images judged to show yawning.

Step S700: determining a state detection result according to the branch task state corresponding to the image sequence to be detected.

Specifically, after determining the branch task state corresponding to the image sequence to be detected, the server determines the state detection result of the sequence from the branch task states and a preset rule. In an optional implementation of the embodiment of the present invention, the preset rule may be, for example, to input the branch task states into a pre-trained state sub-model, which outputs the corresponding state detection result. In another optional implementation of the embodiment of the present invention, the preset rule may be, for example, to set a condition for each branch task state and to judge the final state detection result according to whether each branch task state satisfies its corresponding condition.

Taking the use of the state detection method for detecting driver fatigue as an example, where the branch tasks include a face detection task, an eye open/closed detection task and a yawn detection task, and the branch task states corresponding to the image sequence to be detected are the number of fatigue-state images, the number of blinking images and the number of yawning images: in the embodiment of the present invention, the server may set a fatigue-image probability threshold, a blinking-image probability threshold and a yawning-image probability threshold. After determining the branch task states of the image sequence to be detected, the server computes the proportions of fatigue, blinking and yawning images among all images to be detected. When P of these proportions are greater than the corresponding probability thresholds, the server judges that the state detection result of the image sequence is fatigued driving, where P is a constant preset by the server. As shown in Fig. 3, the specific process of the state detection method is as follows: the server inputs each image to be detected in the sequence into the convolutional neural network 30 in turn to determine the corresponding N feature vectors f_i (1 ≤ i ≤ N), then inputs the N feature vectors f_i (1 ≤ i ≤ N) into attention mechanisms 31 with different parameters, which output the input vectors F_j (1 ≤ j ≤ M) of the respective branch tasks. When the input vector of branch task j is determined, the parameters of the attention mechanism are the feature weight vector set α_ji (1 ≤ i ≤ N, 1 ≤ j ≤ M) corresponding to branch task j. After determining the input vectors F_j (1 ≤ j ≤ M), the server inputs each input vector F_j (1 ≤ j ≤ M) into the image classifier 32 used to process the corresponding branch task to obtain the corresponding branch task state. After the branch task states of all images to be detected have been determined, the server further obtains the branch task states of the image sequence to be detected and inputs them into the state detection module 33, which determines the final state detection result according to the predetermined rule.
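A small sketch of the threshold rule described above, not taken from the patent: the proportions of fatigue, blinking and yawning frames are compared with per-branch probability thresholds, and the sequence is judged as fatigued driving when at least P proportions exceed their thresholds. The branch names, threshold values and frame counts are made up for illustration.

```python
def detect_fatigue(branch_counts, total_frames, thresholds, P=2):
    """branch_counts: dict, e.g. {"fatigue": 12, "blink": 30, "yawn": 5}
    thresholds: dict with the same keys holding probability thresholds."""
    exceeded = 0
    for name, count in branch_counts.items():
        ratio = count / total_frames
        if ratio > thresholds[name]:
            exceeded += 1
    return exceeded >= P  # True => state detection result "fatigued driving"

# Example: 60-frame sequence, P = 2 of the 3 proportions must exceed thresholds.
result = detect_fatigue({"fatigue": 12, "blink": 30, "yawn": 5}, 60,
                        {"fatigue": 0.15, "blink": 0.40, "yawn": 0.05})
```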

Further, in the embodiment of the present invention, the convolutional neural network 30, the attention mechanism 31 and the image classifiers 32 used by the server in determining the state detection result may each be pre-trained separately; in another implementation of the embodiment of the present invention, the convolutional neural network 30, the attention mechanism 31 and the image classifiers 32 may also be trained together as sub-models of a single state detection model.

Fig. 4 is a flowchart of a state detection model training method according to an embodiment of the present invention. The training of the state detection model may be completed on the server used for the state detection process described above, or on another server or another terminal device. As shown in Fig. 4, the state detection model training method includes the following steps:

Step S800: determining a training set.

Specifically, the training set includes a plurality of images to be detected and the corresponding branch task states, which serve as the inputs and outputs of the state detection model. Each image to be detected may correspond to one or more branch task states, and the branch task states are obtained by pre-labeling and fusion. Further, the training set may also include multiple sub-training sets for different types of branch task states, where the images to be detected and the branch task states in each sub-training set correspond one to one. The state detection model may include multiple input/output sub-paths, and each sub-training set corresponds to one sub-path of the state detection model and is used to train that sub-path. For example, when the state detection model is used to detect driver fatigue and includes sub-paths for detecting whether the driver is fatigued, blinking or yawning, the training set includes a fatigue sub-training set, a blinking sub-training set and a yawning sub-training set for training the corresponding sub-models.

Step S900: taking each image to be detected as the input of the state detection model and the corresponding branch task state as the output, and predicting the undetermined model parameters in the state detection model.

Specifically, after the training set is determined, each image to be detected in the training set is taken as the input of the state detection model and the corresponding branch task state as the output, in order to predict the undetermined model parameters in the state detection model. Further, the state detection model includes a feature vector extraction sub-model, an attention mechanism and a classification sub-model, whose model parameters are the feature vector extraction parameters, the feature weight vector parameters and the classification model parameters, respectively. In an optional implementation of the embodiment of the present invention, the state detection model includes only one input/output path, and training consists of passing the images to be detected in the training set through the feature vector extraction sub-model, the attention mechanism and the classification sub-model in sequence, with the corresponding branch task state as the output, in order to predict the undetermined model parameters in the state detection model.

In another optional implementation of the embodiment of the present invention, the state detection model includes multiple input/output sub-paths. The attention mechanism may correspond to multiple feature weight vector parameters, there may be multiple classification sub-models, and each feature weight vector parameter and each classification sub-model corresponds to a sub-path composed of the feature vector extraction sub-model, the attention mechanism and a classification sub-model connected in sequence. When a sub-path is trained, the images to be detected in the sub-training set corresponding to that sub-path are passed through the feature vector extraction sub-model, the attention mechanism and the classification sub-model in sequence, and the corresponding branch task state is taken as the output, in order to predict the undetermined model parameters of the sub-models in that sub-path.
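Below is an illustrative sketch, not the patent's implementation, of a state detection model with one shared feature extraction sub-model, per-branch feature weights and per-branch classification sub-models; PyTorch, the layer sizes and the assumption that the backbone returns a list of N equal-dimension feature vectors are choices made here for illustration only.

```python
import torch
import torch.nn as nn

class MultiBranchStateModel(nn.Module):
    def __init__(self, backbone, num_layers_N, feat_dim, num_branches_M):
        super().__init__()
        self.backbone = backbone          # feature vector extraction sub-model
        # Feature weight vector parameters of the attention mechanism:
        # one weight per (branch j, tapped layer i).
        self.alpha = nn.Parameter(torch.ones(num_branches_M, num_layers_N))
        # One classification sub-model (a binary classifier head) per branch task.
        self.heads = nn.ModuleList(
            [nn.Linear(feat_dim, 1) for _ in range(num_branches_M)])
        self.phi = nn.ReLU()              # nonlinear mapping function Φ

    def forward(self, x):
        feats = self.backbone(x)          # assumed: a list of N vectors f_i
        outputs = []
        for j, head in enumerate(self.heads):
            # Input vector of branch j: F_j = Σ_i Φ(α_ji · f_i)
            F_j = sum(self.phi(self.alpha[j, i] * f) for i, f in enumerate(feats))
            outputs.append(torch.sigmoid(head(F_j)))  # predicted branch task state
        return outputs
```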

Step S1000: iteratively inputting each image to be detected in the training set into the state detection model whose model parameters are the undetermined model parameters, so as to obtain the corresponding predicted branch task state.

Specifically, after the undetermined model parameters have been obtained through prediction, each image to be detected in the training set is input into the state detection model whose model parameters are those undetermined model parameters, and the corresponding predicted branch task state is output. That is, when the state detection model includes only one input/output path, the image to be detected is passed through the feature vector extraction sub-model, the attention mechanism and the classification sub-model in sequence, and the corresponding predicted branch task state is output. When the state detection model includes multiple input/output sub-paths, the images to be detected in each sub-training set are input into the corresponding sub-path composed of the feature vector extraction sub-model, the attention mechanism and the classification sub-model, and the corresponding sub-predicted branch task state is output.

Step S1100: determining a loss coefficient according to the predicted branch task state and the branch task state corresponding to each image to be detected.

Specifically, the predicted branch task state is compared with the branch task state corresponding to the image to be detected that was input into the state detection model, in order to determine the corresponding loss coefficient. For example, when the output of the state detection model is 0.5 and the branch task state corresponding to the image to be detected is 1, the loss coefficient is 0.5, the difference between the branch task state and the predicted branch task state. When the state detection model includes multiple input/output sub-paths, the loss coefficient is determined as the sum of the sub-loss coefficients of the sub-paths.
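A tiny sketch of the loss coefficient as described above: each sub-path contributes a sub-loss (here the absolute difference used in the example, 1 vs 0.5 gives 0.5), and the total loss is their sum. The particular difference-based sub-loss is an assumption for illustration; a standard classification loss such as binary cross-entropy would fit the same structure.

```python
def loss_coefficient(predicted_states, labelled_states):
    # Sum of per-sub-path losses, one term per branch task.
    return sum(abs(p - y) for p, y in zip(predicted_states, labelled_states))

loss = loss_coefficient([0.5, 0.9, 0.1], [1.0, 1.0, 0.0])  # 0.5 + 0.1 + 0.1 = 0.7
```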

Step S1200: updating the undetermined model parameters until the loss coefficient no longer decreases.

Specifically, after the loss coefficient is determined, the undetermined model parameters are determined anew, and the new undetermined model parameters are used as the parameters of the state detection model. Steps S1000 and S1100 are then repeated until the loss coefficient no longer decreases. In the embodiment of the present invention, the undetermined model parameters may be determined through a back-propagation algorithm. Further, the condition for stopping the update of the undetermined model parameters may also be a preset loss coefficient threshold, with training of the model stopping when the loss coefficient reaches that threshold.

Fig. 5 is a schematic diagram of the data flow of the state detection model training method according to an embodiment of the present invention, showing the data flow during the training of a state detection model that contains only one input/output path, or the data flow of training one sub-path of a state detection model that contains multiple input/output sub-paths. As shown in Fig. 5, the training method of the state detection model is as follows:

An image to be detected is taken from the training set and passed through the feature vector extraction sub-model 50, the attention mechanism 51 and the classification sub-model 52 in sequence; its corresponding branch task state is taken as the output, and the undetermined feature vector extraction parameters, undetermined feature weight vector parameters and undetermined classification model parameters of the feature vector extraction sub-model 50, the attention mechanism 51 and the classification sub-model 52 are predicted. Each image to be detected in the training set is then passed through the feature vector extraction sub-model 50, the attention mechanism 51 and the classification sub-model 52 in sequence, with the undetermined feature vector extraction parameters, undetermined feature weight vector parameters and undetermined classification model parameters as the model parameters of these sub-models, and the corresponding predicted branch task state is output. After the predicted branch task state is obtained, it is compared with the branch task state corresponding to the image to be detected to obtain the corresponding loss coefficient. When the loss coefficient does not satisfy the condition preset by the server, the undetermined feature vector extraction parameters, undetermined feature weight vector parameters and undetermined classification model parameters to be updated are determined anew, the parameters to be updated are replaced by the new model parameters, and the loss coefficient is determined again until it satisfies the preset condition. The undetermined model parameters to be updated may be determined through a back-propagation algorithm. In an optional implementation of the embodiment of the present invention, the preset condition may be that the loss function no longer decreases over several rounds of adjusting the model parameters. In another optional implementation of the embodiment of the present invention, the preset condition may also be a preset loss coefficient threshold, with training of the model stopping when the loss coefficient reaches that threshold.
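The training loop below is an illustrative sketch only, assuming PyTorch, the MultiBranchStateModel sketched earlier and one label per branch per image; the optimizer, loss and stopping rule are assumptions that merely mirror the description above (back-propagation updates, stop once the epoch loss no longer decreases).

```python
import torch

def train(model, loader, lr=1e-3, max_epochs=50):
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    criterion = torch.nn.BCELoss()
    best_loss = float("inf")
    for epoch in range(max_epochs):
        epoch_loss = 0.0
        for images, branch_labels in loader:      # branch_labels: [batch, M] floats
            optimizer.zero_grad()
            outputs = model(images)                # list of M predicted branch states
            loss = sum(criterion(out.squeeze(1), branch_labels[:, j])
                       for j, out in enumerate(outputs))
            loss.backward()                        # back-propagation
            optimizer.step()                       # update the undetermined parameters
            epoch_loss += loss.item()
        if epoch_loss >= best_loss:                # loss coefficient no longer decreases
            break
        best_loss = epoch_loss
    return model
```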

Further, the feature vector extraction sub-model 50, the attention mechanism 51 and the classification sub-model 52 may also have their model parameters determined separately in advance; that is, when the state detection model is trained, it may contain sub-models that have already been trained. Since the model parameters of the already trained sub-models have been determined and do not need to participate in the training process again, the server only adjusts the undetermined model parameters when training the state detection model. For example, when the feature vector extraction sub-model and the classification sub-model are pre-trained models, the server considers the feature vector extraction parameters and the classification model parameters to be determined, and during training of the state detection model it adjusts only the feature weight vector parameters until the loss function satisfies the preset condition.
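A short sketch of training only the feature weight vector parameters when the feature extraction and classification sub-models are already pre-trained: their parameters are frozen so that back-propagation updates only the attention weights. This assumes the MultiBranchStateModel structure from the earlier sketch and is not taken from the patent.

```python
for p in model.backbone.parameters():
    p.requires_grad = False          # feature vector extraction parameters fixed
for p in model.heads.parameters():
    p.requires_grad = False          # classification model parameters fixed

optimizer = torch.optim.SGD([model.alpha], lr=1e-3)   # only the α_ji weights are updated
```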

In the embodiments of the present invention, multiple feature vectors of the image sequence to be detected are obtained, multiple branch tasks and the corresponding feature weight vector sets are determined, the input vector of each branch task is determined from the feature vectors and the feature weight vector sets, the branch task states are output, and the state detection result is finally determined from the multiple branch task states. Because the method computes the input vector of each branch task from that branch task's own feature weight vector set and the multiple feature vectors, the input vector of each branch task can be determined in a targeted manner, which improves the accuracy of the branch task state produced for each branch task and thus the accuracy of the final state detection result.

FIG. 6 is a schematic diagram of an electronic device according to an embodiment of the present invention. The electronic device shown in FIG. 6 is a general-purpose data processing apparatus with a general-purpose computer hardware structure that includes at least a processor 60 and a memory 61, connected by a bus 62. The memory 61 is adapted to store instructions or programs executable by the processor 60. The processor 60 may be an independent microprocessor or a set of one or more microprocessors. By executing the instructions stored in the memory 61, the processor 60 carries out the method flows of the embodiments of the present invention described above, thereby processing data and controlling other devices. The bus 62 connects the above components together and also connects them to a display controller 63, a display device, and input/output (I/O) devices 64. The input/output (I/O) devices 64 may be a mouse, a keyboard, a modem, a network interface, a touch input device, a motion-sensing input device, a printer, or other devices known in the art. Typically, the input/output (I/O) devices 64 are connected to the system through an input/output (I/O) controller 65.

The memory 61 may store software components such as an operating system, a communication module, an interaction module and application programs. Each of the modules and applications described above corresponds to a set of executable program instructions that perform one or more functions and the methods described in the embodiments of the invention.

The flowcharts and/or block diagrams of the methods, devices (systems) and computer program products according to the embodiments of the present invention describe various aspects of the present invention. It should be understood that each block of the flowcharts and/or block diagrams, and combinations of blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer or other programmable data processing apparatus to produce a machine, such that the instructions, executed via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.

Meanwhile, as those skilled in the art will appreciate, aspects of the embodiments of the present invention may be implemented as a system, a method or a computer program product. Accordingly, aspects of the embodiments of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, microcode, etc.), or an embodiment combining software and hardware aspects, which may all generally be referred to herein as a "circuit", "module" or "system". Furthermore, aspects of the present invention may take the form of a computer program product embodied in one or more computer-readable media having computer-readable program code embodied thereon.

Any combination of one or more computer-readable media may be used. A computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium. A computer-readable storage medium may be, for example but not limited to, an electronic, magnetic, optical, electromagnetic, infrared or semiconductor system, apparatus or device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer-readable storage medium include: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of the embodiments of the present invention, a computer-readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus or device.

A computer-readable signal medium may include a propagated data signal with computer-readable program code embodied therein, for example in baseband or as part of a carrier wave. Such a propagated signal may take any of a variety of forms, including but not limited to electromagnetic, optical, or any suitable combination thereof. A computer-readable signal medium may be any computer-readable medium that is not a computer-readable storage medium and that can communicate, propagate or transport a program for use by or in connection with an instruction execution system, apparatus or device.

Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java, Smalltalk, C++, PHP and Python, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer or partly on the user's computer as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the latter case, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).

The present invention also relates to a computer-readable storage medium for storing a computer-readable program, the computer-readable program being used by a computer to execute some or all of the above method embodiments.

That is, those skilled in the art will understand that all or part of the steps of the methods of the above embodiments may be completed by instructing the relevant hardware through a program. The program is stored in a storage medium and includes several instructions for causing a device (which may be a single-chip microcomputer, a chip, etc.) or a processor to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disc.

The above are only preferred embodiments of the present invention and are not intended to limit the present invention. Various modifications and changes may be made to the present invention by those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.

Claims (10)

1. A state detection method, the method comprising:
determining an image sequence to be detected, wherein the image sequence to be detected comprises at least one image to be detected;
sequentially passing each image to be detected through N convolutional layers to obtain N corresponding feature vectors, wherein N is an integer greater than or equal to 2;
determining a plurality of branch tasks;
determining a feature weight vector set corresponding to each branch task, wherein each feature weight vector corresponds to each feature vector;
respectively determining the input vector of each branch task according to the N characteristic vectors corresponding to each image to be detected and the characteristic weight vector set;
inputting each input vector into an image classifier for processing the corresponding branch task to determine a branch task state corresponding to the image sequence to be detected;
and determining a state detection result according to the branch task state corresponding to the image sequence to be detected.
2. The method according to claim 1, wherein the obtaining of the N corresponding feature vectors by sequentially passing each image to be detected through the N convolutional layers is specifically:
inputting each image to be detected into a convolutional neural network obtained by pre-training, and respectively taking the outputs of N convolutional layers in the convolutional neural network as the N feature vectors corresponding to the input image to be detected.
3. The method according to claim 1, wherein the determining the input vector of each of the branch tasks according to the N feature vectors corresponding to each of the images to be detected and the feature weight vector set respectively comprises:
determining a target branch task;
and weighting the N feature vectors corresponding to the images to be detected according to the feature weight vector set corresponding to the target branch task to determine the input vector of the target branch task.
4. The method according to claim 3, wherein the weighting the N feature vectors corresponding to each image to be detected according to the feature weight vector set corresponding to the target branch task to determine the input vector of the target branch task specifically comprises:
determining a target image to be detected;
for each target branch task, weighting each feature vector corresponding to the target image to be detected according to each feature weight in the corresponding feature weight vector set;
and respectively inputting each weighted feature vector into a nonlinear mapping function, and calculating the sum of each output result to determine the input vector of the target image to be detected for each target branch task.
5. The method according to claim 1, wherein the determining a state detection result according to the branch task state corresponding to the image sequence to be detected specifically comprises:
and inputting the branch task state corresponding to the image sequence to be detected into a state sub-model and outputting a state detection result.
6. A method for training a state detection model, the method comprising:
determining a training set, wherein the training set comprises a plurality of images to be detected and corresponding branch task states;
taking each image to be detected as the input of a state detection model, taking the corresponding branch task state as the output, and predicting the model parameters to be determined in the state detection model, wherein the state detection model comprises a feature vector extraction sub-model and a classification sub-model;
inputting each image to be detected in the training set into a state detection model taking each model parameter to be determined as a model parameter in an iterative manner so as to obtain a corresponding predicted branch task state, wherein the model parameters comprise a feature vector extraction parameter, a feature weight vector parameter and a classification model parameter;
determining a loss coefficient according to the predicted branch task state and the branch task state corresponding to each image to be detected;
updating the undetermined model parameters until the loss coefficient no longer decreases.
7. The method of claim 6, wherein said updating the undetermined model parameters until the loss coefficient no longer decreases is specifically:
determining undetermined model parameters to be updated according to a back propagation algorithm;
and updating the undetermined model parameters to be updated into new undetermined model parameters until the loss coefficient no longer decreases.
8. The method of claim 6, further comprising an attention mechanism in the state detection model;
the feature vector extraction parameters are parameters of the feature vector extraction submodel, the feature weight vector parameters are parameters of the attention mechanism, and the classification model parameters are parameters of the classification submodel.
9. A computer readable storage medium storing computer program instructions, which when executed by a processor implement the method of any one of claims 1-8.
10. An electronic device comprising a memory and a processor, wherein the memory is configured to store one or more computer program instructions, wherein the one or more computer program instructions are executed by the processor to implement the method of any of claims 1-8.
CN202010287493.7A 2020-04-13 2020-04-13 State detection method, model training method, storage medium and electronic device Pending CN111832601A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010287493.7A CN111832601A (en) 2020-04-13 2020-04-13 State detection method, model training method, storage medium and electronic device

Publications (1)

Publication Number Publication Date
CN111832601A true CN111832601A (en) 2020-10-27

Family

ID=72913813

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010287493.7A Pending CN111832601A (en) 2020-04-13 2020-04-13 State detection method, model training method, storage medium and electronic device


Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110414544A (en) * 2018-04-28 2019-11-05 杭州海康威视数字技术股份有限公司 A kind of dbjective state classification method, apparatus and system
CN109190646A (en) * 2018-06-25 2019-01-11 北京达佳互联信息技术有限公司 A kind of data predication method neural network based, device and nerve network system
CN110321970A (en) * 2019-07-11 2019-10-11 山东领能电子科技有限公司 A kind of fine-grained objective classification method of multiple features based on branch neural network
CN110458829A (en) * 2019-08-13 2019-11-15 腾讯医疗健康(深圳)有限公司 Image quality control method, device, equipment and storage medium based on artificial intelligence
CN110728298A (en) * 2019-09-05 2020-01-24 北京三快在线科技有限公司 Multi-task classification model training method, multi-task classification method and device
CN110796166A (en) * 2019-09-25 2020-02-14 浙江大学 Attention mechanism-based multitask image processing method

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
SHIKUN LIU 等: "End-to-End Multi-Task Learning with Attention", 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 9 January 2020 (2020-01-09), pages 1871 - 1880 *
中国大学生计算机设计大赛组织委员会组织: "中国大学生计算机设计大赛2019年参赛指南", vol. 2019, 30 April 2019, 中国铁道出版社, pages: 180 *
陈慧岩: "智能车辆理论与应用", vol. 2018, 31 July 2018, 北京理工大学出版社, pages: 85 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112419413A (en) * 2020-12-07 2021-02-26 萱闱(北京)生物科技有限公司 Movement direction monitoring method, medium, device and computing device of terminal equipment
CN112419413B (en) * 2020-12-07 2024-01-05 萱闱(北京)生物科技有限公司 Method, medium, device and computing equipment for monitoring movement direction of terminal equipment
CN114463696A (en) * 2022-01-24 2022-05-10 重庆邮电大学 Mine underground conveyor belt state and personnel detection method based on multi-task learning

Similar Documents

Publication Publication Date Title
CN108520220B (en) Model generation method and device
CN111523640B (en) Training methods and devices for neural network models
US12217139B2 (en) Transforming a trained artificial intelligence model into a trustworthy artificial intelligence model
KR20200022739A (en) Method and device to recognize image and method and device to train recognition model based on data augmentation
US20230267307A1 (en) Systems and Methods for Generation of Machine-Learned Multitask Models
KR101828215B1 (en) A method and apparatus for learning cyclic state transition model on long short term memory network
KR20190029083A (en) Apparatus and Method for learning a neural network
CN113469204A (en) Data processing method, device, equipment and computer storage medium
CN115982584A (en) A data-adaptive neural network model training method and device
CN112861601B (en) Method and related device for generating adversarial samples
CN111832601A (en) State detection method, model training method, storage medium and electronic device
CN113449840A (en) Neural network training method and device and image classification method and device
JP7178394B2 (en) Methods, apparatus, apparatus, and media for processing audio signals
CN118194012A (en) Reliability enhancement method and device for data knowledge dual-driven multi-mode large model
CN112742026B (en) Game control method, game control device, storage medium and electronic equipment
CN111046380B (en) Method and system for enhancing anti-attack capability of model based on confrontation sample
WO2022190301A1 (en) Learning device, learning method, and computer-readable medium
CN110941824A (en) Method and system for enhancing anti-attack capability of model based on confrontation sample
KR20190031786A (en) Electronic device and method of obtaining feedback information thereof
CN111753519B (en) Model training and identifying method and device, electronic equipment and storage medium
CN110334244B (en) Data processing method and device and electronic equipment
US11451694B1 (en) Mitigation of obstacles while capturing media content
CN111401112A (en) Face recognition method and device
CN117132754A (en) Training of bounding box distribution model, target detection method and device
CN117952188B (en) Online knowledge distillation method, target detection method and system, terminal and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication (application publication date: 20201027)