CN109858558A - Training method and device for a classification model, electronic device and storage medium - Google Patents
- Publication number
- CN109858558A CN109858558A CN201910113211.9A CN201910113211A CN109858558A CN 109858558 A CN109858558 A CN 109858558A CN 201910113211 A CN201910113211 A CN 201910113211A CN 109858558 A CN109858558 A CN 109858558A
- Authority
- CN
- China
- Prior art keywords
- training
- classification model
- stage
- training stage
- classification
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Abstract
The present disclosure relates to a training method and apparatus for a classification model, an electronic device, and a storage medium. In the training method, the classification labels of the sample data are divided in advance, according to semantics, into at least two levels having an upper-lower relation; the target number of training stages required by the classification model to be trained is determined; in each training stage, the target level corresponding to that training stage is determined, the classification model is trained using the sample data in the sample data set that corresponds to the classification labels of the target level, and the training process of that stage is ended when a predetermined convergence condition is satisfied; among the target number of training stages, at least two training stages correspond to different target levels; after the last training stage, the training of the classification model is completed. The training method of the classification model provided by the present disclosure can realize effective training of the classification model on the premise that the upper-lower relation between the classification labels of the sample data is taken into account.
Description
Technical Field
The present disclosure relates to the field of classification model technologies, and in particular, to a method and an apparatus for training a classification model, an electronic device, and a storage medium.
Background
In the related art, a training mode of a classification model is to randomly select sample data from a sample data set and input the sample data into the model, and then train the model by using the selected sample data to obtain a trained classification model.
Although the above training mode of the classification model can realize the training of the model, the inventor finds that the related art has at least the following problems in the process of realizing the invention:
In the sample data set, the classification labels of some sample data are upper-level (more general) classification labels, while the classification labels of other sample data are lower-level (more specific) classification labels. For example, when an image classification model is trained, the classification label of some sample images is dog, while the classification label of other sample images is a specific dog breed, such as husky. However, the related art does not take this into account when training the classification model, which inevitably affects the accuracy of the model. Therefore, how to effectively train the classification model while taking the upper-lower relation between the classification labels of the sample data into account, so as to obtain a classification model with higher accuracy, is a technical problem that needs to be solved urgently.
Disclosure of Invention
In order to overcome the problems in the related art, the present disclosure provides a training method and apparatus for a classification model, an electronic device, and a storage medium.
According to a first aspect of the embodiments of the present disclosure, a training method of a classification model is provided, where in a sample data set utilized in training of the classification model, each sample data has a classification label, and the classification labels of each sample data are divided into at least two levels having a top-bottom relationship according to semantics, and each classification label belongs to one level; the method comprises the following steps:
determining the target number of the training stages required by the classification model to be trained;
for each training stage in the target number of training stages, determining a target level corresponding to the training stage, training the classification model by using sample data corresponding to a classification label of the target level in the sample data set, and finishing the training process of the training stage when the classification model meets a preset convergence condition; in the target number of training stages, at least two training stages have different corresponding target levels;
and after the final training stage is finished, taking the classification model obtained by current training as the classification model after the training is finished.
Optionally, for a first training stage of the target number of training stages, the step of determining a target level corresponding to the training stage includes:
taking the top layer of the at least two levels as a target level corresponding to the first training stage;
for the last training stage of the target number of training stages, the step of determining the target level corresponding to the training stage includes:
and taking all the levels in the at least two levels as target levels corresponding to the last training stage.
Optionally, when the number of targets is greater than 2, for each intermediate training stage except for the first training stage and the last training stage in the number of training stages with the number of targets, the step of determining the target level corresponding to the training stage includes:
taking the top layer and the preset middle layer corresponding to the middle training stage as a target level corresponding to the middle training stage;
wherein, the predetermined middle layer corresponding to each middle training stage comprises: one or more levels other than the top and bottom levels.
Optionally, the target number is the same as the number of tiers of the at least two tiers;
the preset middle layer corresponding to each middle training stage comprises: a first intermediate layer and each layer located above the first intermediate layer, wherein the layer number of the first intermediate layer is equal to the stage number of the intermediate training stage.
Optionally, for a first training stage of the target number of training stages, when the classification model satisfies a predetermined convergence condition, ending the training process of the training stage includes:
when the loss value which is obtained by utilizing the first loss function and corresponds to the classification model is smaller than a first threshold value, finishing the training process of the first training stage;
for the last training phase of the target number of training phases, when the classification model satisfies a predetermined convergence condition, ending the training process of the training phase, including:
and when the loss value which is obtained by utilizing the second loss function and corresponds to the classification model is smaller than a second threshold value, finishing the training process of the last training stage.
Optionally, for each intermediate training phase in the target number of training phases, the ending the training process of the training phase when the classification model satisfies a predetermined convergence condition includes:
when the loss value which is obtained by utilizing the first loss function and corresponds to the classification model is smaller than a third threshold value, finishing the training process of the intermediate training stage; or,
and when the loss value which is obtained by utilizing the second loss function and corresponds to the classification model is smaller than a fourth threshold value, finishing the training process of the intermediate training stage.
Optionally, in the first loss function, the predicted distribution probability value of each classification label is calculated based on a Sigmoid function;
in the second loss function, for each classification label, the predicted distribution probability value of the classification label is calculated based on a Sigmoid function or based on a normalized exponential (Softmax) function.
According to a second aspect of the embodiments of the present disclosure, there is provided a training apparatus for a classification model, in a sample data set utilized by the classification model during training, each sample data has a classification label, and the classification labels of each sample data are divided into at least two levels having a top-bottom relationship according to semantics, and each classification label belongs to one level; the device includes:
a determining module configured to determine a target number of training phases required for a classification model to be trained;
a training module configured to determine, for each of the target number of training stages, a target level corresponding to the training stage, train the classification model using sample data corresponding to a classification label of the target level in the sample data set, and terminate a training process of the training stage when the classification model satisfies a predetermined convergence condition; in the target number of training stages, at least two training stages have different corresponding target levels;
and the training completion module is configured to take the classification model obtained by current training as the trained classification model after the last training stage is finished.
Optionally, the determining, by the training module, a target level corresponding to a first training stage of the target number of training stages includes: determining the top level of the at least two levels as a target level corresponding to the first training stage;
the training module determines, for a last training stage of the target number of training stages, a target level corresponding to the training stage, including: and determining all the levels in the at least two levels as the target level corresponding to the last training stage.
Optionally, when the number of targets is greater than 2, the training module, for each intermediate training stage except for the first training stage and the last training stage in the number of training stages with the number of targets, determines a target level corresponding to the training stage, including:
determining the top layer and a preset middle layer corresponding to the middle training stage as a target level corresponding to the middle training stage;
wherein, the predetermined middle layer corresponding to each middle training stage comprises: one or more levels other than the top and bottom levels.
Optionally, the target number is the same as the number of tiers of the at least two tiers;
the preset middle layer corresponding to each middle training stage comprises: a first intermediate layer and each layer located above the first intermediate layer, wherein the layer number of the first intermediate layer is equal to the stage number of the intermediate training stage.
Optionally, the training module, for a first training stage of the target number of training stages, when the classification model satisfies a predetermined convergence condition, ends the training process of the training stage, including: when the loss value which is obtained by utilizing the first loss function and corresponds to the classification model is smaller than a first threshold value, finishing the training process of the first training stage;
the training module, aiming at the last training stage in the target number of training stages, when the classification model meets the predetermined convergence condition, ends the training process of the training stage, and includes:
and when the loss value which is obtained by utilizing the second loss function and corresponds to the classification model is smaller than a second threshold value, finishing the training process of the last training stage.
Optionally, the training module, for each intermediate training stage in the target number of training stages, when the classification model satisfies a predetermined convergence condition, ends the training process of the training stage, including:
when the loss value which is obtained by utilizing the first loss function and corresponds to the classification model is smaller than a third threshold value, finishing the training process of the intermediate training stage; or,
and when the loss value which is obtained by utilizing the second loss function and corresponds to the classification model is smaller than a fourth threshold value, finishing the training process of the intermediate training stage.
Optionally, in the first loss function, the predicted distribution probability value of each classification label is calculated based on a Sigmoid function;
in the second loss function, for each classification label, the predicted distribution probability value of the classification label is calculated based on a Sigmoid function or based on a normalized exponential (Softmax) function.
According to a third aspect of the embodiments of the present disclosure, there is provided an electronic apparatus including:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to: implement any one of the above training methods of a classification model when executing the executable instructions stored in the memory.
According to a fourth aspect of embodiments of the present disclosure, there is provided a non-transitory computer-readable storage medium, wherein instructions of the storage medium, when executed by a processor of an electronic device, enable the electronic device to implement any one of the above-mentioned training methods for a classification model.
According to a fifth aspect of embodiments of the present disclosure, there is provided a computer program product, which, when executed by a processor of an electronic device, enables the electronic device to perform any one of the above-mentioned training methods of a classification model.
The technical scheme provided by the embodiment of the disclosure can have the following beneficial effects: in the sample data set, classifying labels of the sample data are divided into at least two levels with a top-bottom relation according to semantics; when training the classification model, training the model in stages; in each training stage, sample data is selected based on the hierarchy to which the classification label of the sample data belongs, namely, in the training process of the model, the hierarchy factor of the classification label of the sample data is increased. Therefore, according to the technical scheme provided by the embodiment of the disclosure, the effective training of the classification model can be realized on the premise that the upper-lower relation between the classification labels of the sample data is taken into consideration, so that the classification model with higher accuracy is obtained.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and together with the description, serve to explain the principles of the invention.
FIG. 1 is a flow diagram illustrating a method of training a classification model according to an exemplary embodiment.
FIG. 2 is a block diagram illustrating a training apparatus for classification models according to an exemplary embodiment.
FIG. 3 is a block diagram illustrating an electronic device in accordance with an example embodiment.
FIG. 4 is a block diagram illustrating an apparatus for training a classification model according to an example embodiment.
FIG. 5 is a block diagram illustrating another apparatus for training a classification model according to an example embodiment.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present invention. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the invention, as detailed in the appended claims.
In order to solve the prior art problems, embodiments of the present disclosure provide a method and an apparatus for training a classification model, an electronic device, and a storage medium.
It should be noted that an executing subject of the training method for the classification model provided by the embodiment of the present disclosure may be a training device for the classification model, and the training device may be applied to an electronic device. Specifically, the electronic device may be a server or a terminal device, and in a specific application, the terminal device may be a computer, a smart phone, a tablet device, and the like.
In the embodiment of the present disclosure, in a sample data set utilized in training a classification model, each sample data has a classification label, and before training the classification model, the classification labels of each sample data are hierarchically divided in advance. Specifically, each classification label is divided into at least two levels with a top-bottom relation according to semantics, and each classification label belongs to one level.
For example, in one implementation, the number of hierarchies may be specifically determined according to the classification tag of each sample data after the sample data set is determined. For example, in a sample data set consisting of pictures, the classification labels of the sample pictures include: husky, human, dog, cat, animal, man, building, etc.; the classification labels can be divided into three levels according to the upper-lower relation of the label contents of the classification labels. Wherein, the top-level classification label can include: humans, animals, and buildings; the classification labels of the middle layer may include: cats, dogs, buildings, and men; the underlying class labels may include: husky.
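Purely as an illustrative sketch (the mapping structure and helper function below are hypothetical and only reuse the example labels above), such a three-level division could be represented as:

```python
# Illustrative sketch only: assigning each classification label of the example
# above to one of three semantic levels. The mapping and helper are hypothetical.
LABEL_LEVELS = {
    1: ["human", "animal", "building"],      # top level: most general labels
    2: ["cat", "dog", "building", "man"],    # middle level
    3: ["husky"],                            # bottom level: most specific labels
}

def level_of(label):
    """Return the level that a classification label belongs to."""
    for level, labels in LABEL_LEVELS.items():
        if label in labels:
            return level
    raise KeyError("unknown classification label: %s" % label)

print(level_of("husky"))  # -> 3
```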
For example, in another implementation manner, the number of the hierarchies may be a predetermined number of hierarchies, and each classification label is semantically allocated to one hierarchy based on the predetermined number of hierarchies. For example, if the predetermined number of levels is 2, then when the above 8 classification tags are hierarchically divided, the classification tags at the top level may include: people, men, animals and buildings, and the bottom-level classification tags may be cats, dogs, husky and buildings.
In addition, if the sample data set lacks upper-level classification tags or lacks lower-level classification tags, upper-level or lower-level classification tags may be added for the classification tags in the sample data set. For example, for the classification tag animal, a lower-level classification tag named mammal may be added, and the classification tags cat and dog may both serve as lower-level tags of the mammal tag.
It is understood that after the classification tags of the sample data are hierarchically divided, the hierarchical classification tags can form a tag tree. In the tag tree, the classification tags at the top layer are semantically the most general, so the classification tags at the top layer can also be called parent tags; the classification tags in the levels below the top level may be referred to as sub-tags, and the classification tags in the bottom level, being semantically the most specific, may be referred to as leaf tags.
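To make the parent/sub/leaf terminology concrete, a minimal hypothetical label-tree sketch using the example labels above (the representation itself is an assumption, not part of the disclosure):

```python
# Minimal hypothetical label tree: each parent label maps to its child labels.
# Leaf labels are labels that never appear as a parent key.
TAG_TREE = {
    "animal": ["cat", "dog"],
    "dog": ["husky"],
    "human": ["man"],
}

def leaf_labels(tree):
    children = {child for kids in tree.values() for child in kids}
    return {child for child in children if child not in tree}

print(leaf_labels(TAG_TREE))  # {'cat', 'husky', 'man'}
```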
It should be noted that the training method of the classification model provided in the embodiment of the present disclosure is also applicable to training of the classification model of multi-label sample data, where the multi-label sample data means that one sample data has multiple classification labels. In a specific application, for a classification model in which sample data is multi-label sample data, it is only necessary to divide each classification label of the multi-label sample data into at least two levels having a top-bottom relation according to semantics.
It should be emphasized that the above specific division manner for dividing the classification label hierarchy is merely an example, and should not be construed as a limitation to the embodiments of the present disclosure, and any implementation manner for dividing the classification label hierarchy based on semantics may be applied to the training method of the classification model provided in the embodiments of the present disclosure.
In addition, in the embodiment of the present disclosure, the training process of the classification model is performed in stages, that is, the training process of the classification model is composed of a plurality of training stages; and in each training stage, selecting a certain amount of sample data from the sample data set to train, calculating the loss value of the classification model by using a loss function corresponding to the training stage, ending the training stage when the loss value is less than a preset threshold value of the training stage, entering the next training stage until the last training stage is ended, and finishing the training of the classification model. When the sample data is selected for each training stage, the target level corresponding to each training stage may be determined, and the classification label is determined according to the target level corresponding to each training stage, so that the sample data to be selected is determined. It is to be understood that the target hierarchy is one or more of the at least two hierarchies into which the category labels are divided as described above. It should be noted that, in the embodiment of the present disclosure, the target levels corresponding to at least two training stages of the classification model are different. In addition, different loss functions may be used for different training phases, and of course, the loss functions used for some training phases may be the same. When the loss value calculated by using the loss function in each training stage meets the preset threshold value, the training process of the training stage can be ended, and the next training stage is started until all the training stages with the target number are completely trained.
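A rough sketch of this staged procedure is given below; all names (StageConfig, train_one_epoch, and so on) are hypothetical placeholders rather than anything defined by the disclosure:

```python
from dataclasses import dataclass
from typing import Callable, List, Set

# Rough, hypothetical sketch of the staged training procedure described above;
# none of these names come from the disclosure itself.

@dataclass
class StageConfig:
    target_levels: Set[int]          # levels whose labels are used in this stage
    loss_fn: Callable                # loss function used for this stage
    threshold: float                 # predetermined convergence threshold

def train_in_stages(model, dataset, stages: List[StageConfig], train_one_epoch):
    for stage in stages:
        # Use only sample data whose classification label belongs to a target level.
        samples = [s for s in dataset if s["label_level"] in stage.target_levels]
        while True:
            loss = train_one_epoch(model, samples, stage.loss_fn)
            if loss < stage.threshold:   # predetermined convergence condition met
                break                    # end this stage and move to the next one
    return model                         # classification model after the last stage
```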
First, a method for training a classification model provided in the embodiments of the present disclosure is described below.
Fig. 1 is a flowchart illustrating a training method of a classification model according to an exemplary embodiment, and as shown in fig. 1, the training method of a classification model provided by an embodiment of the present disclosure may include the following steps:
s11: and determining the target number of the training stages required by the classification model to be trained.
Here, there are various specific implementations of determining the target number of training phases required for the classification model to be trained.
For example, in one implementation, the target number of training phases required for the classification model to be trained may be determined based on the number of levels of the at least two levels. For example: determining the number of levels of the at least two levels as the target number; alternatively, determining a value greater than or less than the number of levels of the at least two levels as the target number.
For example, in another implementation, the number of stages input by the operator may be obtained, with the obtained number of stages being the target number.
It should be emphasized that the above-described specific implementation of determining the target number of training phases required for the classification model to be trained is merely an example, and should not be construed as limiting the embodiments of the present disclosure.
In addition, it should be noted that any model for classification can be used as the classification model to be trained in the present disclosure. The specific structure of the classification model to be trained is not the inventive point of the present disclosure, and therefore, the present disclosure does not limit the specific structure of the classification model to be trained.
S12: and aiming at each training stage in the target number of training stages, determining a target level corresponding to the training stage, training the classification model by using sample data corresponding to the classification label of the target level in the sample data set, and finishing the training process of the training stage when the classification model meets a preset convergence condition.
And in the training stages with the target number, at least two training stages have different corresponding target levels.
It can be understood that the sample data corresponding to the classification label utilized in each training stage may be sample data corresponding to a part of the classification labels in the target hierarchy, or may be sample data corresponding to all the classification labels in the target hierarchy.
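For illustration only, selecting the sample data for one stage could look like the hypothetical helper below, which supports using either all labels of the target levels or only a subset of them:

```python
# Hypothetical helper: pick the sample data for one training stage.
# label_to_level maps each classification label to its level; allowed_labels,
# when given, restricts the stage to part of the target levels' labels.
def select_samples(dataset, label_to_level, target_levels, allowed_labels=None):
    picked = []
    for sample in dataset:
        label = sample["label"]
        if label_to_level.get(label) not in target_levels:
            continue
        if allowed_labels is not None and label not in allowed_labels:
            continue
        picked.append(sample)
    return picked
```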
S13: and after the final training stage is finished, taking the classification model obtained by current training as the classification model after the training is finished.
In step S12, for each of the target number of training phases, there are a plurality of specific implementation manners for determining the target level corresponding to the training phase.
For example, in an implementation manner, for a first training phase of the target number of training phases, the step of determining a target level corresponding to the training phase may include: taking the top layer of the at least two levels as a target level corresponding to the first training stage;
for the last training stage of the target number of training stages, the step of determining the target level corresponding to the training stage may include:
and taking all the levels in the at least two levels as target levels corresponding to the last training stage.
It can be understood that, when the target number of the training phases required by the classification model to be trained is 2, the second training phase is the last training phase; and when the target number of the training stages required by the classification model to be trained is more than 2, the target number of the training stages also comprises an intermediate training stage between the first training stage and the last training stage.
Here, there are various ways of determining the target level corresponding to each intermediate training stage.
For example, in an implementation manner, when the target number is greater than 2, for each intermediate training stage, except for the first training stage and the last training stage, of the target number of training stages, the step of determining the target level corresponding to the training stage may include:
taking the top layer and the preset middle layer corresponding to the middle training stage as a target level corresponding to the middle training stage;
wherein, the predetermined middle layer corresponding to each middle training stage comprises: one or more levels other than the top and bottom levels.
In another implementation, the target level corresponding to each intermediate training stage may also include only the predetermined intermediate layer corresponding to that intermediate training stage.
For clarity of the scheme and clear layout, specific levels included in the predetermined middle layer corresponding to each middle training stage are exemplified in the following.
It can be understood that, in the embodiment of the present disclosure, the training process of the classification model is progressive, and the model is trained first to realize the basic classification, so that the model learns the layer-by-layer refined classification until the classification model can realize the more refined classification. For example, when a classification model for classifying pictures is trained, after the first training stage of the classification model is completed, the classification model can perform basic classification on the pictures, such as predicting whether the class of the picture to be classified is a picture related to landscape, a picture related to people, a picture related to animals, or the like; after the training of the second training stage is completed, for the picture containing the animal to be predicted and classified, the picture can be predicted to be the picture related to the animal; when the last training stage is completed, the classification labels in all the levels are already involved in the training of the classification model, and at the moment, the classification model can realize the prediction of the detailed variety of a certain animal in the picture.
In the training method for the classification model provided by the embodiment of the disclosure, in a sample data set, classification labels of the sample data are divided into at least two levels with a top-bottom relation according to semantics; when training the classification model, training the model in stages; in each training stage, sample data is selected based on the hierarchy to which the classification label of the sample data belongs, namely, in the training process of the model, the hierarchy factor of the classification label of the sample data is increased. Therefore, according to the technical scheme provided by the embodiment of the disclosure, the effective training of the classification model can be realized on the premise that the upper-lower relation between the classification labels of the sample data is taken into consideration, so that the classification model with higher accuracy is obtained.
For clarity of the scheme and clarity of layout, specific levels included in the predetermined middle layer corresponding to each middle training stage are exemplarily described below.
For example, in one implementation, the target number of training stages required for the classification model to be trained may be the same as the number of levels of the at least two levels;
the predetermined intermediate layer corresponding to each intermediate training stage may include: a first intermediate layer and each layer located above the first intermediate layer, where the layer number of the first intermediate layer is equal to the stage number of the intermediate training stage.
In the above-described embodiments, it has been mentioned that the predetermined intermediate layer is one or more layers other than the top layer and the bottom layer, and therefore, the first intermediate layer and the layers above the first intermediate layer herein do not include the top layer and the bottom layer.
For example, when the number of levels of the at least two levels is 4, the classification model to be trained includes 4 training phases. Wherein, the target level corresponding to the first training stage is the top level of the at least two levels; the second training stage is an intermediate training stage, the number of stages is 2, so that the first intermediate layer corresponding to the second training stage is the second layer of the at least two levels, and the predetermined intermediate layer corresponding to the second training stage only comprises the second layer, so that the target level corresponding to the second training stage is the top layer and the second layer; the third training stage is also an intermediate training stage, and the number of stages is 3, so that the first intermediate layer corresponding to the third training stage is the third layer of the at least two levels, and further, the predetermined intermediate layer corresponding to the third training stage comprises the third layer and the second layer, so that the target levels corresponding to the third training stage are the top layer, the second layer and the third layer; the fourth training phase is the last training phase, so the target level corresponding to the fourth training phase is all the layers of the at least two levels, i.e., the top layer, the second layer, the third layer, and the fourth layer.
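The stage-to-target-level rule in this four-level example could be sketched as follows (a hypothetical illustration, with levels numbered 1 for the top layer through 4 for the bottom layer):

```python
def target_levels_for_stage(stage, num_levels):
    """Target levels for a given training stage, assuming the number of stages
    equals the number of levels (levels numbered 1 = top ... num_levels = bottom)."""
    if stage == 1:
        return [1]                               # first stage: top level only
    if stage == num_levels:
        return list(range(1, num_levels + 1))    # last stage: all levels
    # intermediate stage: top level plus the 2nd through stage-th levels
    return list(range(1, stage + 1))

for s in range(1, 5):
    print(s, target_levels_for_stage(s, 4))
# 1 [1]
# 2 [1, 2]
# 3 [1, 2, 3]
# 4 [1, 2, 3, 4]
```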
It can be understood that after the target level corresponding to each training stage is determined, the classification label can be selected from the level corresponding to the label tree, and then the sample data to be utilized by the training classification model in each training stage is selected.
In addition, in the embodiment of the present disclosure, there are a plurality of predetermined convergence conditions for each training phase when training the classification model.
Optionally, in an implementation manner, for a first training stage of the target number of training stages, the ending the training process of the training stage when the classification model satisfies a predetermined convergence condition may include:
when the loss value which is obtained by utilizing the first loss function and corresponds to the classification model is smaller than a first threshold value, finishing the training process of the first training stage;
for the last training stage of the target number of training stages, the ending the training process of the training stage when the classification model satisfies the predetermined convergence condition may include:
and when the loss value which is obtained by utilizing the second loss function and corresponds to the classification model is smaller than a second threshold value, finishing the training process of the last training stage.
In addition, for each intermediate training phase in the target number of training phases, the ending the training process of the training phase when the classification model satisfies a predetermined convergence condition may include:
when the loss value which is obtained by utilizing the first loss function and corresponds to the classification model is smaller than a third threshold value, finishing the training process of the intermediate training stage; or,
and when the loss value which is obtained by utilizing the second loss function and corresponds to the classification model is smaller than a fourth threshold value, finishing the training process of the intermediate training stage.
The first threshold, the second threshold, the third threshold, and the fourth threshold may be numerically the same value or different values, though they are named differently, and the present invention is not limited thereto.
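As a purely illustrative sketch of such a per-stage configuration (the threshold values below are made up, and the intermediate stages could equally use the first loss function):

```python
# Purely illustrative per-stage setup: which loss function and convergence
# threshold each stage uses. The numeric thresholds are invented.
STAGE_LOSS_AND_THRESHOLD = {
    1: ("first_loss",  0.10),   # first training stage, first threshold
    2: ("second_loss", 0.08),   # intermediate training stage
    3: ("second_loss", 0.06),   # intermediate training stage
    4: ("second_loss", 0.05),   # last training stage, second threshold
}

def stage_converged(stage, loss_value):
    """Predetermined convergence condition: the stage ends once its loss value
    drops below the stage's own threshold."""
    _, threshold = STAGE_LOSS_AND_THRESHOLD[stage]
    return loss_value < threshold
```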
In the first loss function, the predicted distribution probability value of each classification label may be calculated based on a Sigmoid function, where each classification label in the first loss function refers to the classification label of each sample data utilized in a training stage that uses the first loss function to calculate the loss value.
In the second loss function, for each classification label, the predicted distribution probability value of the classification label may be calculated based on a Sigmoid function or based on a normalized exponential (Softmax) function, where each classification label in the second loss function refers to the classification label of each sample data utilized in a training stage that uses the second loss function to calculate the loss value.
Wherein the first loss function may take the following form (the formula itself is not reproduced in this text; a hedged reconstruction is sketched below):
In the first loss function, p_n is the true distribution probability value of the n-th classification label in a training stage that uses the first loss function to calculate the loss value; it is a value that can be obtained in advance, specifically a value between 0 and 1. p̂_n is the predicted distribution probability value of the n-th classification label, calculated with a Sigmoid function, and is also a value between 0 and 1. C is the number of classification labels of the sample data utilized in that training stage. The result computed by the first loss function from each p_n and p̂_n is the loss value of the training stage that uses the first loss function.
The second loss function may take the following form (again, the formula itself is not reproduced in this text; a hedged reconstruction is sketched below):
In the second loss function, p_n is likewise the true distribution probability value of the n-th classification label in a training stage that uses the second loss function to calculate the loss value, and p̂_n is likewise the predicted distribution probability value of the n-th classification label. Unlike the first loss function, the second loss function contains two summation terms: in the first summation, p̂_n is calculated with the Softmax function, while in the second summation, p̂_n is calculated with the Sigmoid function. C_0 is the number of classification labels, among the classification labels of the sample data utilized in that training stage, whose p̂_n is calculated with the Sigmoid function, and C is the total number of classification labels of the sample data utilized in that training stage; it can be understood that, in the first summation, the Softmax function is used to calculate p̂_n for the classification labels from C_0 to C. The result computed by the second loss function from each p_n and p̂_n is the loss value of the training stage that uses the second loss function.
For clarity of the scheme and clarity of layout, the following describes an exemplary way of calculating the loss value of each training stage, taking a training process of the classification model that includes four training stages as an example. For the training stages that use the second loss function to calculate the loss value, which classification labels have their p̂_n calculated with the Softmax function and which classification labels have their p̂_n calculated with the Sigmoid function will also be exemplified here.
For example, in one implementation, the first training stage of the classification model may use the first loss function to calculate the loss value, while the second, third, and fourth training stages may use the second loss function to calculate the loss value. In addition, when calculating the loss value of each training stage, the way in which p̂_n is calculated for each classification label may be determined according to the hierarchy of that classification label in the label tree.
In practical applications, leaf tags in a tag tree are usually independent of (mutually exclusive with) each other, while parent tags, or classification tags at intermediate levels, are not necessarily independent of each other. For example, for a classification model for classifying pictures, the two parent tags landscape and plant may intersect: a landscape may include flowers and plants, and flowers and plants may also constitute a landscape. The two leaf tags Husky and Teddy, in contrast, are independent and cannot intersect. Thus, when calculating p̂_n for leaf tags, the Softmax function may be employed, because the p̂_n values calculated with the Softmax function sum to 1, whereas the distribution probabilities calculated with the Sigmoid function may sum to more than 1; accordingly, when calculating p̂_n for each parent tag, a Sigmoid function may be employed. For classification tags located in the middle levels of the tag tree, which are neither parent tags nor leaf tags, p̂_n may be calculated with either the Softmax function or the Sigmoid function; in particular, it may be predetermined which of the two functions is used to calculate p̂_n for the classification tags of each intermediate level, and the present disclosure is not limited in this regard.
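A small numerical sketch of why the two functions suit different label types (the logit values are made up; Sigmoid outputs need not sum to 1, Softmax outputs always do):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def softmax(zs):
    exps = [math.exp(z) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]

# Made-up logits for two parent labels and two leaf labels.
parent_logits = [2.0, 1.5]    # e.g. "landscape", "plant": may overlap
leaf_logits   = [2.0, -1.0]   # e.g. "Husky", "Teddy": mutually exclusive

parent_probs = [sigmoid(z) for z in parent_logits]   # may sum to more than 1
leaf_probs   = softmax(leaf_logits)                   # always sums to 1

print(sum(parent_probs))  # > 1 is allowed: labels are not mutually exclusive
print(sum(leaf_probs))    # == 1: exactly one leaf label applies
```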
Therefore, when calculating the value of the loss function in the first training stage, since the target level corresponding to the first training stage is the top level and the classification labels in the top level are all parent labels, the p̂_n of each classification label in the first training stage can be obtained with the Sigmoid function.
When calculating the loss value of the second training stage, since the target levels corresponding to the second training stage are the top level and the second level, the p̂_n of each classification label belonging to the top level can still be calculated with the Sigmoid function, while the p̂_n of each classification label belonging to the second layer can be calculated with either the Sigmoid function or the Softmax function.
When calculating the loss value of the third training stage, since the target levels corresponding to the third training stage are the top level, the second level, and the third level, the p̂_n of each classification label belonging to the top level can still be calculated with the Sigmoid function, while the p̂_n of each classification label belonging to the second layer or the third layer can be calculated with either the Sigmoid function or the Softmax function.
When calculating the value of the loss function in the fourth training stage, since the target levels corresponding to the fourth training stage are the top level, the second level, the third level, and the fourth (bottom) level, the p̂_n of each classification label belonging to the top level can still be calculated with the Sigmoid function, the p̂_n of each classification label belonging to the second layer or the third layer can be calculated with either the Sigmoid function or the Softmax function, and the p̂_n of each classification label belonging to the bottom layer can be calculated with the Softmax function.
It will be appreciated that, as the training stages progress, the p̂_n of each classification label is continuously updated.
Therefore, in the training method of the classification model provided by the embodiment of the disclosure, each training stage adopts the loss function adapted to the target level of the training stage, so that the accuracy of the trained classification model is further improved.
Corresponding to the above training method of the classification model, the embodiment of the present disclosure further provides a training device of the classification model. It should be noted that, in the sample data set utilized in the training of the classification model, each sample data has a classification label, and the classification labels of each sample data are divided into at least two levels having a top-bottom relationship according to semantics, and each classification label belongs to one level.
FIG. 2 is a block diagram illustrating a training apparatus for classification models according to an exemplary embodiment. Referring to fig. 2, the apparatus includes a determination module 121, a training module 122, and a completion training module 123.
The determining module 121 is configured to determine a target number of training phases required for the classification model to be trained, based on the number of levels of the at least two levels.
The training module 122 is configured to, for each of the target number of training stages, determine a target level corresponding to the training stage, train the classification model using the sample data corresponding to the classification label of the target level in the sample data set, and end the training process of the training stage when the classification model satisfies a predetermined convergence condition; and in the training stages with the target number, at least two training stages have different corresponding target levels.
The training completion module 123 is configured to take the currently trained classification model as the trained classification model after the last training phase is completed.
Optionally, in an implementation manner, the determining, by the training module 122, a target level corresponding to a first training phase of the target number of training phases includes: determining the top level of the at least two levels as a target level corresponding to the first training stage;
the training module 122, for the last training stage of the target number of training stages, determines a target level corresponding to the training stage, including: and determining all the levels in the at least two levels as the target level corresponding to the last training stage.
Optionally, in an implementation manner, when the target number is greater than 2, the determining, by the training module 122, for each intermediate training stage, except for the first training stage and the last training stage, of the target number of training stages, a target level corresponding to the training stage includes:
determining the top layer and a preset middle layer corresponding to the middle training stage as a target level corresponding to the middle training stage;
wherein, the predetermined middle layer corresponding to each middle training stage comprises: one or more levels other than the top and bottom levels.
Optionally, in one implementation, the target number is the same as the number of tiers of the at least two tiers;
the preset middle layer corresponding to each middle training stage comprises: a first intermediate layer and each layer located above the first intermediate layer, wherein the layer number of the first intermediate layer is equal to the stage number of the intermediate training stage.
Optionally, in an implementation manner, the training module 122, for a first training stage of the target number of training stages, when the classification model satisfies a predetermined convergence condition, ends the training process of the training stage, including: when the loss value which is obtained by utilizing the first loss function and corresponds to the classification model is smaller than a first threshold value, finishing the training process of the first training stage;
the training module 122, aiming at the last training stage in the target number of training stages, when the classification model satisfies a predetermined convergence condition, ends the training process of the training stage, including:
and when the loss value which is obtained by utilizing the second loss function and corresponds to the classification model is smaller than a second threshold value, finishing the training process of the last training stage.
Optionally, in an implementation manner, the training module 122, for each intermediate training stage in the target number of training stages, when the classification model satisfies a predetermined convergence condition, ends the training process of the training stage, including:
when the loss value which is obtained by utilizing the first loss function and corresponds to the classification model is smaller than a third threshold value, finishing the training process of the intermediate training stage; or,
and when the loss value which is obtained by utilizing the second loss function and corresponds to the classification model is smaller than a fourth threshold value, finishing the training process of the intermediate training stage.
Optionally, in the first loss function, the predicted distribution probability value of each classification label is calculated based on a Sigmoid function;
in the second loss function, for each classification label, the predicted distribution probability value of the classification label is calculated based on a Sigmoid function or based on a normalized exponential (Softmax) function.
With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.
In addition, corresponding to the training method of the classification model provided in the foregoing embodiment, an embodiment of the present disclosure further provides an electronic device, as shown in fig. 3, where the electronic device may include:
a processor 310;
a memory 320 for storing processor-executable instructions;
wherein the processor 310 is configured to: when the executable instructions stored in the memory 320 are executed, the steps of any one of the training methods for the classification model provided by the embodiments of the present disclosure are implemented.
It is understood that the electronic device may be a server or a terminal device, and in a specific application, the terminal device may be a computer, a smart phone, a tablet device, and the like.
FIG. 4 is a block diagram illustrating an apparatus 400 for training a classification model according to an exemplary embodiment. For example, the device 400 may be a computer, a smartphone, a tablet device, and the like.
Referring to fig. 4, device 400 may include one or more of the following components: a processing component 402, a memory 404, a power component 406, a multimedia component 408, an audio component 410, an interface for input/output (I/O) 412, a sensor component 414, and a communication component 416.
The processing component 402 generally controls the overall operation of the device 400, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing component 402 may include one or more processors 420 to execute instructions to perform all or a portion of the steps of the methods described above. Further, the processing component 402 can include one or more modules that facilitate interaction between the processing component 402 and other components. For example, the processing component 402 can include a multimedia module to facilitate interaction between the multimedia component 408 and the processing component 402.
The memory 404 is configured to store various types of data to support operations at the device 400. Examples of such data include instructions for any application or method operating on device 400, contact data, phonebook data, messages, pictures, videos, and so forth. The memory 404 may be implemented by any type or combination of volatile or non-volatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disks.
Power components 406 provide power to the various components of device 400. Power components 406 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for device 400.
The multimedia component 408 includes a screen providing an output interface between the device 400 and a user. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive an input signal from a user. The touch panel includes one or more touch sensors to sense touch, slide, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 408 includes a front facing camera and/or a rear facing camera. The front-facing camera and/or the rear-facing camera may receive external multimedia data when the device 400 is in an operational mode, such as a shooting mode or a video mode. Each front camera and rear camera may be a fixed optical lens system or have a focal length and optical zoom capability.
The audio component 410 is configured to output and/or input audio signals. For example, the audio component 410 includes a Microphone (MIC) configured to receive external audio signals when the device 400 is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signals may further be stored in the memory 404 or transmitted via the communication component 416. In some embodiments, audio component 410 also includes a speaker for outputting audio signals.
The I/O interface 412 provides an interface between the processing component 402 and peripheral interface modules, which may be keyboards, click wheels, buttons, etc. These buttons may include, but are not limited to: a home button, a volume button, a start button, and a lock button.
The sensor component 414 includes one or more sensors for providing status assessment of various aspects of the device 400. For example, the sensor component 414 can detect an open/closed state of the device 400, the relative positioning of components, such as a display and keypad of the device 400, the sensor component 414 can also detect a change in the position of the device 400 or a component of the device 400, the presence or absence of user contact with the device 400, orientation or acceleration/deceleration of the device 400, and a change in the temperature of the device 400. The sensor assembly 414 may include a proximity sensor configured to detect the presence of a nearby object without any physical contact. The sensor assembly 414 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor assembly 414 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 416 is configured to facilitate wired or wireless communication between the device 400 and other devices. The device 400 may access a wireless network based on a communication standard, such as WiFi, an operator network (such as 2G, 3G, 4G, or 5G), or a combination thereof. In an exemplary embodiment, the communication component 416 receives broadcast signals or broadcast related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 416 further includes a Near Field Communication (NFC) module to facilitate short-range communications.
In an exemplary embodiment, the apparatus 400 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, micro-controllers, microprocessors or other electronic components for performing any of the above-described training methods for classification models.
FIG. 5 is a block diagram illustrating an apparatus 500 for training a classification model according to an example embodiment. For example, the device 500 may be provided as a server. Referring to FIG. 5, the device 500 includes a processing component 522, which further includes one or more processors, and memory resources, represented by a memory 532, for storing instructions, such as application programs, executable by the processing component 522. The application programs stored in the memory 532 may include one or more modules, each of which corresponds to a set of instructions. Further, the processing component 522 is configured to execute the instructions to perform any of the above-described methods of training a classification model.
The device 500 may also include a power component 526 configured to perform power management for the device 500, a wired or wireless network interface 550 configured to connect the device 500 to a network, and an input/output (I/O) interface 558. The device 500 may operate based on an operating system stored in the memory 532, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, or the like.
In addition, the embodiments of the present disclosure also provide a non-transitory computer-readable storage medium, where instructions of the storage medium, when executed by a processor of an electronic device, enable the electronic device to perform any one of the training methods of the classification model provided by the embodiments of the present disclosure.
In an exemplary embodiment, a non-transitory computer-readable storage medium comprising instructions, such as the memory 404 comprising instructions, executable by the processor 420 of the device 400 to perform any of the above-described methods of training a classification model is also provided. For example, the non-transitory computer readable storage medium may be a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
Other embodiments of the invention will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This disclosure is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the invention and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the invention being indicated by the following claims.
It will be understood that the invention is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the invention is limited only by the appended claims.
Claims (10)
1. A training method for a classification model, wherein, in a sample data set used for training the classification model, each sample data is provided with a classification label, the classification labels of the sample data are divided according to semantics into at least two levels having a superior-subordinate relation, and each classification label belongs to one of the levels; the method comprises the following steps:
determining the target number of the training stages required by the classification model to be trained;
for each training stage in the target number of training stages, determining a target level corresponding to the training stage, training the classification model by using sample data corresponding to a classification label of the target level in the sample data set, and finishing the training process of the training stage when the classification model meets a preset convergence condition; in the target number of training stages, at least two training stages have different corresponding target levels;
and after the last training stage is finished, taking the classification model obtained from the current training as the trained classification model (a sketch of this staged training loop is given after the claims).
2. The method of claim 1, wherein, for the first training stage of the target number of training stages, the step of determining the target level corresponding to the training stage comprises:
taking the top level of the at least two levels as the target level corresponding to the first training stage;
for the last training stage of the target number of training stages, the step of determining the target level corresponding to the training stage includes:
and taking all the levels in the at least two levels as target levels corresponding to the last training stage.
3. The method according to claim 2, wherein, when the target number is greater than 2, for each intermediate training stage, other than the first training stage and the last training stage, of the target number of training stages, the step of determining the target level corresponding to the training stage comprises:
taking the top level and the preset intermediate levels corresponding to the intermediate training stage as the target levels corresponding to that intermediate training stage;
wherein the preset intermediate levels corresponding to each intermediate training stage comprise: one or more levels other than the top level and the bottom level.
4. The method of claim 3, wherein the target number is equal to the number of the at least two levels;
the preset intermediate levels corresponding to each intermediate training stage comprise: the first several intermediate levels, the number of which is equal to the stage number of that intermediate training stage (see the level-selection sketch after the claims).
5. The method according to claim 2, wherein, for the first training stage of the target number of training stages, ending the training process of the training stage when the classification model satisfies the predetermined convergence condition comprises:
ending the training process of the first training stage when the loss value of the classification model, obtained by using a first loss function, is smaller than a first threshold;
and for the last training stage of the target number of training stages, ending the training process of the training stage when the classification model satisfies the predetermined convergence condition comprises:
ending the training process of the last training stage when the loss value of the classification model, obtained by using a second loss function, is smaller than a second threshold.
6. The method of claim 3, wherein, for each intermediate training stage of the target number of training stages, ending the training process of the training stage when the classification model satisfies the predetermined convergence condition comprises:
ending the training process of the intermediate training stage when the loss value of the classification model, obtained by using the first loss function, is smaller than a third threshold; or,
ending the training process of the intermediate training stage when the loss value of the classification model, obtained by using the second loss function, is smaller than a fourth threshold.
7. The method according to claim 5 or 6, wherein, in the first loss function, the predicted distribution probability value of each classification label is calculated based on a Sigmoid function;
and in the second loss function, for each classification label, the predicted distribution probability value of the classification label is calculated based on a Sigmoid function or a normalized exponential (Softmax) function (see the loss-function sketch after the claims).
8. A training device for a classification model, wherein, in a sample data set used for training the classification model, each sample data is provided with a classification label, the classification labels of the sample data are divided according to semantics into at least two levels having a superior-subordinate relation, and each classification label belongs to one of the levels; the device comprises:
a determining module configured to determine a target number of training phases required for a classification model to be trained;
a training module configured to determine, for each of the target number of training stages, a target level corresponding to the training stage, train the classification model using sample data corresponding to a classification label of the target level in the sample data set, and terminate a training process of the training stage when the classification model satisfies a predetermined convergence condition; in the target number of training stages, at least two training stages have different corresponding target levels;
and the training completion module is configured to take the classification model obtained by current training as the trained classification model after the last training stage is finished.
9. The apparatus of claim 8, wherein the training module is configured to determine, for the first training stage of the target number of training stages, the target level corresponding to the training stage by: determining the top level of the at least two levels as the target level corresponding to the first training stage;
and the training module is configured to determine, for the last training stage of the target number of training stages, the target level corresponding to the training stage by: determining all the levels of the at least two levels as the target levels corresponding to the last training stage.
10. The apparatus of claim 9, wherein, when the target number is greater than 2, the training module is configured to determine, for each intermediate training stage of the target number of training stages other than the first training stage and the last training stage, the target level corresponding to the training stage by:
determining the top level and the preset intermediate levels corresponding to the intermediate training stage as the target levels corresponding to that intermediate training stage;
wherein the preset intermediate levels corresponding to each intermediate training stage comprise: one or more levels other than the top level and the bottom level.
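The following Python sketch illustrates one plausible reading of the target-level schedule in claims 2-4, assuming levels are numbered 1 (top) to num_levels (bottom), that the target number of stages equals the number of levels (claim 4), and that the "stage number" in claim 4 counts the intermediate stages themselves, so the k-th overall stage trains on the top level plus the first k - 1 intermediate levels. The function name and numbering convention are illustrative, not taken from the patent.

```python
# A helper mapping each training stage to the label levels it trains on.
def target_levels_for_stage(stage, num_stages, num_levels):
    """Return the label levels (1 = top ... num_levels = bottom) used in the given stage."""
    if stage == 1:
        return [1]                              # first stage: top level only (claim 2)
    if stage == num_stages:
        return list(range(1, num_levels + 1))   # last stage: all levels (claim 2)
    # intermediate stage k: top level plus the first k - 1 intermediate levels (claims 3-4)
    return list(range(1, stage + 1))

# Example with three semantic levels (e.g. animal -> mammal -> cat/dog) and three stages:
# stage 1 -> [1], stage 2 -> [1, 2], stage 3 -> [1, 2, 3]
for s in (1, 2, 3):
    print(s, target_levels_for_stage(s, num_stages=3, num_levels=3))
```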
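Building on the helper above, the next sketch outlines the staged training loop of claim 1, under the assumption that the caller supplies the model update step, the per-stage loss functions, and the convergence thresholds of claims 5 and 6. The train_step_fn callable and the (sample, label, level) triple layout are assumptions made for illustration only, not details given in the patent.

```python
from typing import Callable, Iterable, Tuple

def train_in_stages(model,
                    dataset: Iterable[Tuple[object, object, int]],
                    num_stages: int,
                    num_levels: int,
                    loss_fns: dict,           # stage index -> loss function for that stage
                    thresholds: dict,         # stage index -> convergence threshold
                    train_step_fn: Callable,  # (model, data, loss_fn) -> current loss value
                    max_steps: int = 10_000):
    dataset = list(dataset)
    for stage in range(1, num_stages + 1):
        levels = target_levels_for_stage(stage, num_stages, num_levels)
        # keep only samples whose classification label lies on one of this stage's target levels
        stage_data = [(x, y) for (x, y, level) in dataset if level in levels]
        loss_fn, threshold = loss_fns[stage], thresholds[stage]
        for _ in range(max_steps):
            loss = train_step_fn(model, stage_data, loss_fn)
            if loss < threshold:   # predetermined convergence condition for this stage
                break
    # the model obtained after the last stage is taken as the trained classification model
    return model
```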
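For claims 5-7, the sketch below assumes the two losses take the usual cross-entropy forms: the first loss scores every classification label independently with a Sigmoid function (so a sample may carry labels from several levels at once), while the second loss may normalise the scores with a Softmax function instead. All names, shapes, and the toy data are illustrative, not prescribed by the patent.

```python
import numpy as np

def _sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def _softmax(z):
    e = np.exp(z - np.max(z, axis=-1, keepdims=True))
    return e / np.sum(e, axis=-1, keepdims=True)

def first_loss(logits, targets, eps=1e-12):
    """Sigmoid-based loss: binary cross-entropy averaged over all classification labels."""
    p = _sigmoid(logits)
    return float(-np.mean(targets * np.log(p + eps) + (1 - targets) * np.log(1 - p + eps)))

def second_loss(logits, targets, use_softmax=True, eps=1e-12):
    """Loss for the last stage: label probabilities from Softmax or, alternatively, Sigmoid."""
    if not use_softmax:
        return first_loss(logits, targets, eps)  # the Sigmoid variant keeps the same form here
    p = _softmax(logits)
    return float(-np.mean(np.sum(targets * np.log(p + eps), axis=-1)))

# Toy check: two samples over four labels (e.g. animal, mammal, cat, dog), multi-hot targets.
logits = np.array([[2.0, 1.5, 3.0, -2.0], [1.0, 0.5, -1.0, 2.5]])
targets = np.array([[1, 1, 1, 0], [1, 1, 0, 1]], dtype=float)
print(first_loss(logits, targets), second_loss(logits, targets))
```

The Sigmoid form suits the earlier stages, where a sample can legitimately carry labels from several active levels; Softmax is an option once the last stage predicts over the full label set. The patent leaves this choice open, so the code exposes it as a flag.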
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201910113211.9A CN109858558B (en) | 2019-02-13 | 2019-02-13 | Method and device for training classification model, electronic equipment and storage medium |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201910113211.9A CN109858558B (en) | 2019-02-13 | 2019-02-13 | Method and device for training classification model, electronic equipment and storage medium |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN109858558A true CN109858558A (en) | 2019-06-07 |
| CN109858558B CN109858558B (en) | 2022-01-21 |
Family
ID=66897921
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN201910113211.9A Active CN109858558B (en) | 2019-02-13 | 2019-02-13 | Method and device for training classification model, electronic equipment and storage medium |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN109858558B (en) |
Cited By (8)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN110659367A (en) * | 2019-10-12 | 2020-01-07 | 中国科学技术信息研究所 | Text classification number determination method and device and electronic equipment |
| CN111553378A (en) * | 2020-03-16 | 2020-08-18 | 北京达佳互联信息技术有限公司 | Image classification model training method and device, electronic equipment and computer readable storage medium |
| CN112507120A (en) * | 2021-02-07 | 2021-03-16 | 上海二三四五网络科技有限公司 | Prediction method and device for keeping classification consistency |
| CN112802606A (en) * | 2021-01-28 | 2021-05-14 | 联仁健康医疗大数据科技股份有限公司 | Data screening model establishing method, data screening device, data screening equipment and data screening medium |
| CN113011529A (en) * | 2021-04-28 | 2021-06-22 | 平安科技(深圳)有限公司 | Training method, device and equipment of text classification model and readable storage medium |
| CN113744006A (en) * | 2020-05-29 | 2021-12-03 | 北京达佳互联信息技术有限公司 | Category recommendation method and device, electronic equipment and storage medium |
| CN115641463A (en) * | 2022-07-01 | 2023-01-24 | 南京大学 | Classification of test methods, devices, equipment, media and program products |
| CN118888122A (en) * | 2024-07-11 | 2024-11-01 | 深圳市第二人民医院(深圳市转化医学研究院) | User classification method, device, equipment and medium based on medical test data |
Citations (7)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN106897746A (en) * | 2017-02-28 | 2017-06-27 | 北京京东尚科信息技术有限公司 | Data classification model training method and device |
| CN107169518A (en) * | 2017-05-18 | 2017-09-15 | 北京京东金融科技控股有限公司 | Data classification method, device, electronic installation and computer-readable medium |
| CN107766873A (en) * | 2017-09-06 | 2018-03-06 | 天津大学 | The sample classification method of multi-tag zero based on sequence study |
| CN108009589A (en) * | 2017-12-12 | 2018-05-08 | 腾讯科技(深圳)有限公司 | Sample data processing method, device and computer-readable recording medium |
| CN108171254A (en) * | 2017-11-22 | 2018-06-15 | 北京达佳互联信息技术有限公司 | Image tag determines method, apparatus and terminal |
| CN108615044A (en) * | 2016-12-12 | 2018-10-02 | 腾讯科技(深圳)有限公司 | A kind of method of disaggregated model training, the method and device of data classification |
| CN108805185A (en) * | 2018-05-29 | 2018-11-13 | 腾讯科技(深圳)有限公司 | Training method, device, storage medium and the computer equipment of model |
Patent Citations (7)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN108615044A (en) * | 2016-12-12 | 2018-10-02 | 腾讯科技(深圳)有限公司 | A kind of method of disaggregated model training, the method and device of data classification |
| CN106897746A (en) * | 2017-02-28 | 2017-06-27 | 北京京东尚科信息技术有限公司 | Data classification model training method and device |
| CN107169518A (en) * | 2017-05-18 | 2017-09-15 | 北京京东金融科技控股有限公司 | Data classification method, device, electronic installation and computer-readable medium |
| CN107766873A (en) * | 2017-09-06 | 2018-03-06 | 天津大学 | The sample classification method of multi-tag zero based on sequence study |
| CN108171254A (en) * | 2017-11-22 | 2018-06-15 | 北京达佳互联信息技术有限公司 | Image tag determines method, apparatus and terminal |
| CN108009589A (en) * | 2017-12-12 | 2018-05-08 | 腾讯科技(深圳)有限公司 | Sample data processing method, device and computer-readable recording medium |
| CN108805185A (en) * | 2018-05-29 | 2018-11-13 | 腾讯科技(深圳)有限公司 | Training method, device, storage medium and the computer equipment of model |
Cited By (12)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN110659367A (en) * | 2019-10-12 | 2020-01-07 | 中国科学技术信息研究所 | Text classification number determination method and device and electronic equipment |
| CN110659367B (en) * | 2019-10-12 | 2022-03-25 | 中国科学技术信息研究所 | Text classification number determination method and device and electronic equipment |
| CN111553378A (en) * | 2020-03-16 | 2020-08-18 | 北京达佳互联信息技术有限公司 | Image classification model training method and device, electronic equipment and computer readable storage medium |
| CN111553378B (en) * | 2020-03-16 | 2024-02-20 | 北京达佳互联信息技术有限公司 | Image classification model training method, device, electronic equipment and computer readable storage medium |
| CN113744006A (en) * | 2020-05-29 | 2021-12-03 | 北京达佳互联信息技术有限公司 | Category recommendation method and device, electronic equipment and storage medium |
| CN112802606A (en) * | 2021-01-28 | 2021-05-14 | 联仁健康医疗大数据科技股份有限公司 | Data screening model establishing method, data screening device, data screening equipment and data screening medium |
| CN112507120A (en) * | 2021-02-07 | 2021-03-16 | 上海二三四五网络科技有限公司 | Prediction method and device for keeping classification consistency |
| CN113011529A (en) * | 2021-04-28 | 2021-06-22 | 平安科技(深圳)有限公司 | Training method, device and equipment of text classification model and readable storage medium |
| WO2022227217A1 (en) * | 2021-04-28 | 2022-11-03 | 平安科技(深圳)有限公司 | Text classification model training method and apparatus, and device and readable storage medium |
| CN113011529B (en) * | 2021-04-28 | 2024-05-07 | 平安科技(深圳)有限公司 | Training method, training device, training equipment and training equipment for text classification model and readable storage medium |
| CN115641463A (en) * | 2022-07-01 | 2023-01-24 | 南京大学 | Classification of test methods, devices, equipment, media and program products |
| CN118888122A (en) * | 2024-07-11 | 2024-11-01 | 深圳市第二人民医院(深圳市转化医学研究院) | User classification method, device, equipment and medium based on medical test data |
Also Published As
| Publication number | Publication date |
|---|---|
| CN109858558B (en) | 2022-01-21 |
Similar Documents
| Publication | Title |
|---|---|
| CN109858558B (en) | Method and device for training classification model, electronic equipment and storage medium |
| US11436449B2 (en) | Method and electronic apparatus for processing image and training image tag classification model |
| CN111651263B (en) | Resource processing method and device of mobile terminal, computer equipment and storage medium |
| CN110516745B (en) | Training method and device of image recognition model and electronic equipment |
| CN110782468B (en) | Training method and device of image segmentation model and image segmentation method and device |
| CN109800325B (en) | Video recommendation method and device and computer-readable storage medium |
| CN109859096A (en) | Image style transfer method, apparatus, electronic equipment and storage medium |
| CN110598504B (en) | Image recognition method and device, electronic equipment and storage medium |
| TW202030648A (en) | Method, device and electronic apparatus for target object processing and storage medium thereof |
| CN112287994A (en) | Pseudo label processing method, device, equipment and computer readable storage medium |
| CN109961094B (en) | Sample acquisition method and device, electronic equipment and readable storage medium |
| CN107194464B (en) | Convolutional neural network model training method and device |
| CN113052874B (en) | Target tracking method and device, electronic equipment and storage medium |
| US10248855B2 (en) | Method and apparatus for identifying gesture |
| CN111160448A (en) | An image classification model training method and device |
| CN112269650A (en) | Task scheduling method and device, electronic equipment and storage medium |
| CN110941727A (en) | Resource recommendation method and device, electronic equipment and storage medium |
| CN110764627A (en) | Input method and device and electronic equipment |
| CN109886211B (en) | Data labeling method and device, electronic equipment and storage medium |
| CN109784537B (en) | Advertisement click rate estimation method and device, server and storage medium |
| CN114648116A (en) | Model quantification method and device, vehicle and storage medium |
| CN113495966B (en) | Interactive operation information determining method and device and video recommendation system |
| CN111046927B (en) | Method and device for processing annotation data, electronic equipment and storage medium |
| CN110929771B (en) | Image sample classification method and device, electronic equipment and readable storage medium |
| CN112969032A (en) | Illumination pattern recognition method and device, computer equipment and storage medium |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | GR01 | Patent grant | |