WO2022036921A1 - Acquisition of target model - Google Patents
- Publication number
- WO2022036921A1 (PCT/CN2020/132785)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- network
- sub
- feature extraction
- model
- candidate
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/082—Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/096—Transfer learning
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02T—CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
- Y02T10/00—Road transport of goods or passengers
- Y02T10/10—Internal combustion engine [ICE] based vehicles
- Y02T10/40—Engine management systems
Definitions
- the present disclosure relates to the field of information technology, and in particular, to a method and device for acquiring a target model, an electronic device, and a storage medium.
- Transfer learning aims to adapt an original model used to perform a certain task so as to obtain a target model that can be applied to a target task.
- transfer learning has been applied in many scenarios. Therefore, how to improve the performance of the target model has become a topic of great research value.
- the present disclosure provides a method and apparatus for acquiring a target model, an electronic device and a storage medium.
- a first aspect of the present disclosure provides a method for acquiring a target model.
- The method may include: pre-training an original model by using a first training sample set to adjust network parameters of the original model, wherein the original model includes a first sub-network for feature extraction; obtaining a target model by using a second sub-network and at least part of the structure of the pre-trained first sub-network, wherein the second sub-network is used to perform a target task based on the features extracted by the first sub-network; and training the target model by using a second training sample set corresponding to the target task to adjust the network parameters of the target model.
- the first sub-network may include a plurality of network segments, and each of the network segments may include sequentially connected at least one feature extraction unit for performing feature extraction.
- the first sub-network may include at least one branch network, and each branch network may include at least one of the network segments connected in sequence.
- Obtaining the target model by using the second sub-network and at least part of the structure of the pre-trained first sub-network may include: obtaining at least one candidate sub-network by using different partial structures of the first sub-network, and selecting the candidate sub-network that satisfies a preset condition as a selected sub-network; and obtaining the target model by using the selected sub-network and the second sub-network.
- the preset condition may include: a candidate model obtained by using the candidate sub-network and the second sub-network satisfies a preset performance condition.
- the preset condition may further include: the number of feature extraction units in the candidate sub-network reaches a preset number.
- The candidate sub-network may include at least one feature extraction unit in each of the network segments in the same branch network, and the feature extraction units in different candidate sub-networks may be at least partially different.
- Obtaining at least one candidate sub-network by using different partial structures of the first sub-network, and selecting the candidate sub-network that satisfies a preset condition as the selected sub-network, may include: selecting the first feature extraction unit in each of the network segments in the same branch network to obtain an initial pending sub-network; adding at least one feature extraction unit to the pending sub-network to obtain a candidate sub-network, wherein the added feature extraction unit is located after the feature extraction unit already selected in the network segment; forming candidate models from a plurality of candidate sub-networks and the second sub-network respectively, and taking the candidate sub-network corresponding to the best-performing candidate model as the pending sub-network; and, in the case where the number of feature extraction units in the pending sub-network is less than the preset number, obtaining a new candidate sub-network based on the pending sub-network, and selecting a candidate
- Adding at least one feature extraction unit to the pending sub-network to obtain a candidate sub-network may include: taking each of the network segments in turn as a target segment, and increasing the number of selected feature extraction units in the target segment by an integer value while keeping the number of selected feature extraction units in the other network segments unchanged, to obtain a candidate sub-network corresponding to the target segment.
- Obtaining the new candidate sub-network based on the pending sub-network may include: adding at least one feature extraction unit to the pending sub-network to obtain a new candidate sub-network, wherein the added feature extraction unit is located after the feature extraction unit already selected in the network segment.
- Determining that the candidate model obtained by using the candidate sub-network and the second sub-network satisfies a preset performance condition may include: using verification samples corresponding to the target task to verify the candidate model obtained from the candidate sub-network and the second sub-network, so as to obtain a performance score of the candidate model for performing the target task; and determining, based on the performance score, whether the candidate model satisfies the preset performance condition.
- The first sub-network includes a first number of feature extraction units and a second number of network segments, and the preset number is less than the first number and greater than or equal to the second number.
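- The segment-by-segment search described above can be read as a greedy loop. The following is a minimal sketch under stated assumptions, not the disclosed implementation: `greedy_search`, `evaluate`, `toy_score`, and the default step of 1 are hypothetical names and choices introduced only for illustration. In practice, `evaluate` would build a candidate model from the candidate sub-network and the second sub-network and score it on verification samples.

```python
from typing import Callable, List, Sequence, Tuple

Depths = Tuple[int, ...]  # number of selected units in each network segment


def greedy_search(
    segment_sizes: Sequence[int],
    preset_number: int,
    evaluate: Callable[[Depths], float],
    step: int = 1,
) -> Depths:
    """Grow a pending sub-network until it holds `preset_number` units."""
    # Initial pending sub-network: the first unit of every segment.
    pending: Depths = tuple(1 for _ in segment_sizes)
    while sum(pending) < preset_number:
        candidates: List[Depths] = []
        for i, size in enumerate(segment_sizes):
            # Take segment i as the target segment; other segments stay unchanged.
            if pending[i] + step <= size:
                grown = list(pending)
                grown[i] += step
                candidates.append(tuple(grown))
        if not candidates:
            break  # no segment can be extended any further
        # The best-scoring candidate becomes the new pending sub-network.
        pending = max(candidates, key=evaluate)
    return pending


def toy_score(depths: Depths) -> float:
    # Hypothetical stand-in for a verification score; it favours later segments.
    return float(sum((i + 1) * d for i, d in enumerate(depths)))


result = greedy_search([4, 4, 4], preset_number=6, evaluate=toy_score)
print(result)
```

With three segments of four units each, a preset number of 6 lies between the number of segments (3) and the total number of units (12), consistent with the bound stated above.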
- The first sub-network may comprise a single branch network, the branch network may comprise a plurality of network segments connected in sequence, and each of the network segments may comprise at least one feature extraction unit connected in sequence.
- In that case, using the first training sample set to pre-train the original model to adjust the network parameters of the original model may include: using a preset selection strategy before each training to select one feature extraction unit in each of the network segments; and using the first training sample set to train the part of each network segment located before the selected feature extraction unit, so as to adjust the network parameters of that part.
- The first sub-network may include multiple branch networks, each branch network may include at least one network segment connected in sequence, and each of the network segments may include at least one feature extraction unit connected in sequence. In that case, using the first training sample set to pre-train the original model to adjust the network parameters of the original model may include: using a preset selection strategy before each training to select one branch network from the first sub-network and to select one feature extraction unit in each of the network segments included in the selected branch network; and using the first training sample set to train the part of each of those network segments located before the selected feature extraction unit, so as to adjust the network parameters of that part.
- Using a preset selection strategy before each training to select one branch network in the first sub-network and one feature extraction unit in each network segment included in the selected branch network may include: randomly selecting one of the branch networks in the first sub-network, and selecting one feature extraction unit in each of the network segments included in the selected branch network.
- the first sub-network may further include a downsampling layer between adjacent segments of the network.
- The feature extraction unit may include sequentially connected convolutional layers, activation layers, and batch normalization layers.
- the method may further include: training the original model by using the second training sample set to adjust the network parameters of the original model.
- The method may further include: using the first training sample set to train the target model to adjust the network parameters of the target model.
- the original model may further include a third sub-network for performing a preset task based on the extracted features, wherein the preset task may be the same as the target task or different.
- the number of first training samples in the first training sample set may be greater than the number of second training samples in the second training sample set.
- A second aspect of the present disclosure provides an apparatus for obtaining a target model, including a first training module, a model obtaining module, and a second training module. The first training module is used to pre-train an original model by using the first training sample set, so as to adjust the network parameters of the original model, wherein the original model includes a first sub-network for feature extraction; the model obtaining module is used to obtain the target model by using the second sub-network and at least part of the structure of the pre-trained first sub-network, wherein the second sub-network is used to perform the target task based on the features extracted by the first sub-network; and the second training module is used to train the target model by using the second training sample set corresponding to the target task, so as to adjust the network parameters of the target model.
- a third aspect of the present disclosure provides an electronic device, including a memory and a processor coupled to each other, where the processor is configured to execute program instructions stored in the memory, so as to implement the method for acquiring the target model in the first aspect.
- a fourth aspect of the present disclosure provides a computer-readable storage medium on which program instructions are stored, and when the program instructions are executed by a processor, implement the method for acquiring the target model in the first aspect above.
- FIG. 1 is a schematic flowchart of a method for acquiring a target model according to an embodiment of the present disclosure
- FIG. 2 is a schematic diagram of a framework of a first sub-network according to an embodiment of the present disclosure
- FIG. 3 is a schematic diagram of a framework of a first sub-network according to another embodiment of the present disclosure.
- FIG. 4 is a schematic flowchart of step S12 in FIG. 1 according to an embodiment of the present disclosure
- FIG. 5 is a schematic diagram of a framework of a sub-network to be determined according to an embodiment of the present disclosure
- FIG. 6 is a schematic flowchart of a method for acquiring a target model according to another embodiment of the present disclosure
- FIG. 7 is a schematic diagram of a framework of an apparatus for acquiring a target model according to an embodiment of the present disclosure
- FIG. 8 is a schematic diagram of a framework of an electronic device according to an embodiment of the present disclosure.
- FIG. 9 is a schematic diagram of a frame of a computer-readable storage medium according to an embodiment of the present disclosure.
- The terms “system” and “network” are used interchangeably herein.
- The term “and/or” herein only describes an association relationship between associated objects and indicates that three kinds of relationships are possible; for example, A and/or B can mean that A exists alone, that A and B exist at the same time, or that B exists alone.
- The character “/” in this document generally indicates that the related objects are in an “or” relationship.
- “multiple” herein means two or more than two.
- FIG. 1 is a schematic flowchart of a method for acquiring a target model according to an embodiment of the present disclosure. Specifically, steps S11 to S13 may be included.
- Step S11: Pre-train the original model by using the first training sample set to adjust the network parameters of the original model, wherein the original model includes a first sub-network for feature extraction.
- the first sub-network may include a plurality of network segments (stages), and each network segment may include at least one feature extraction unit (block) connected in sequence.
- the feature extraction unit is used for feature extraction, and may include sequentially connected convolutional layers, activation layers, and batch normalization (BN) layers.
- a convolutional layer can include several convolution kernels for feature extraction.
- The activation layer can include activation functions such as sigmoid, tanh, ReLU, etc., to introduce nonlinear factors.
- Batch normalization layers can be used for normalization operations.
- The sequential connection of convolutional layers, activation layers, and batch normalization layers can help improve the learning effect of the feature extraction unit during the training process.
- a pooling layer can be connected after the convolutional layer to downsample the features extracted by the convolutional layer.
- the first sub-network also includes a downsampling layer located between adjacent network segments, which can facilitate feature dimension reduction, compress the number of data and parameters, reduce overfitting, and improve fault tolerance.
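- To make the layer order inside a feature extraction unit concrete, the following is a small numeric sketch, assuming a single-channel 1-D input and a batch normalization step without learnable scale and shift. The function names (`conv1d`, `relu`, `batch_norm`, `feature_extraction_unit`) are hypothetical and written in plain Python for illustration; a real unit would normally use a deep learning framework's layers.

```python
import math
from typing import List


def conv1d(signal: List[float], kernel: List[float]) -> List[float]:
    """Valid 1-D cross-correlation, the operation usually called convolution in deep learning."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def relu(values: List[float]) -> List[float]:
    """Activation layer: introduces the nonlinearity."""
    return [max(0.0, v) for v in values]


def batch_norm(batch: List[List[float]], eps: float = 1e-5) -> List[List[float]]:
    """Normalise one channel over the whole batch (no learnable scale or shift)."""
    flat = [v for row in batch for v in row]
    mean = sum(flat) / len(flat)
    var = sum((v - mean) ** 2 for v in flat) / len(flat)
    return [[(v - mean) / math.sqrt(var + eps) for v in row] for row in batch]


def feature_extraction_unit(batch: List[List[float]], kernel: List[float]) -> List[List[float]]:
    """Convolution, then activation, then batch normalization, in that order."""
    activated = [relu(conv1d(x, kernel)) for x in batch]
    return batch_norm(activated)


features = feature_extraction_unit([[1.0, 2.0, 3.0, 4.0], [4.0, 3.0, 2.0, 1.0]], [1.0, -1.0])
print(features)
```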
- The number of feature extraction units included in each network segment may be the same; for example, each network segment includes 3, or 4, or 5 feature extraction units.
- The number of feature extraction units contained in each network segment can also be completely different; for example, the first network segment contains 3 feature extraction units, the second network segment contains 4 feature extraction units, and the third network segment contains 5 feature extraction units.
- The number of feature extraction units included in each network segment may also be not exactly the same; for example, the first network segment includes 3 feature extraction units, the second network segment also includes 3 feature extraction units, and the third network segment contains 5 feature extraction units. That is to say, the numbers can be set according to actual application requirements, which is not limited here.
- The first sub-network may have multiple branch networks, and each branch network may include at least one network segment connected in sequence.
- FIG. 2 is a schematic diagram of a frame of a first sub-network according to an embodiment of the present disclosure.
- a dotted rectangle represents a network segment.
- Each network segment includes 4 feature extraction units.
- the sub-network includes three branch networks connected in parallel, the first branch network is the first row of network segments (to simplify the schematic diagram, the first branch network only schematically depicts one network segment), the second branch network is the second row network segment, the third branch network is the third row network segment (to simplify the schematic diagram, the third branch network only schematically depicts one network segment), and the three branch networks have the same input node.
- The first sub-network may also be set to include another number of branch networks according to actual application requirements.
- the first sub-network may be set to include a 2-way branch network, a 4-way branch network, etc., which is not limited herein.
- multiple network segments may also be connected in series.
- the first sub-network may have only one branch network, and the branch network includes multiple network segments connected in sequence.
- FIG. 3 is a schematic diagram of a frame of a first sub-network according to another embodiment of the present disclosure. As shown in FIG. 3 , the dotted rectangles represent network segments, and each network segment includes 4 feature extraction units.
- the first sub-network includes two serially connected network segments, in this case the first sub-network contains only one branch network.
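- The branch / segment / unit hierarchy described above can be captured in a small data structure. This is an illustrative sketch only: the class names `NetworkSegment`, `BranchNetwork`, and `FirstSubNetwork` are hypothetical, and the segment counts used for the three-branch example are assumptions, since the figure only sketches some segments.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class NetworkSegment:
    num_units: int  # feature extraction units connected in sequence


@dataclass
class BranchNetwork:
    segments: List[NetworkSegment]  # network segments connected in sequence

    def total_units(self) -> int:
        return sum(segment.num_units for segment in self.segments)


@dataclass
class FirstSubNetwork:
    branches: List[BranchNetwork]  # branches share the same input node


# One branch with two serially connected segments of four units each (as in FIG. 3).
single_branch = FirstSubNetwork(
    branches=[BranchNetwork([NetworkSegment(4), NetworkSegment(4)])]
)

# Three parallel branches (as in FIG. 2); the number of segments per branch is assumed.
multi_branch = FirstSubNetwork(
    branches=[
        BranchNetwork([NetworkSegment(4)]),
        BranchNetwork([NetworkSegment(4), NetworkSegment(4)]),
        BranchNetwork([NetworkSegment(4)]),
    ]
)

print(single_branch.branches[0].total_units(), len(multi_branch.branches))
```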
- the original model may further include another sub-network, for example, a third sub-network, and the other sub-network is used to perform preset tasks based on the extracted features.
- the preset task may be a target detection task, an image classification task, a scene segmentation task, etc., which are not limited herein.
- The target detection task means detecting target objects in the image, for example, detecting vehicles, pedestrians, etc. in the image;
- the image classification task means classifying the image into a certain category, for example, classifying the image as a cat, a dog, a turtle, etc.;
- the scene segmentation task means detecting the category to which the pixels in the image belong, for example, detecting the pixels in the image that belong to lanes, cars, green belts, and the sky.
- The preset tasks listed above only indicate use cases that may exist in practical applications and do not limit the scope of use.
- the specific structure of the other sub-network can be set according to actual application requirements, which is not limited here.
- The other sub-network may include several (e.g., 2, 3, etc.) sequentially connected fully connected layers, a softmax layer, etc., which is not limited here.
- another sub-network may include a fully connected layer and a softmax layer, which is not limited herein.
- The first training sample set may be a large-scale data set, that is, the number of first training samples in the first training sample set is greater than a preset value (e.g., 1000, 5000, 10000, etc.). Therefore, the original model can be fully pre-trained by using the first training sample set, which is beneficial to improving the accuracy of the subsequently acquired target model.
- the first sub-network may only include a one-way branch network, and the one-way branch network may include a plurality of sequentially connected network segments.
- A preset selection strategy can be used before each training to select a feature extraction unit in each network segment, so that the first training sample set can be used to train the part of each network segment located before the selected feature extraction unit (including the selected feature extraction unit), so as to adjust the network parameters of that part.
- the above method can help to improve the efficiency of pre-training.
- the above-mentioned preset selection strategy may include: randomly selecting a feature extraction unit in each network segment.
- The preset selection strategy may include: randomly selecting an integer value within a preset value range corresponding to each network segment, where the upper limit of the preset value range is the number of feature extraction units included in the corresponding network segment.
- For convenience of description, this number can be denoted as N_i, which represents the number of feature extraction units included in the i-th network segment.
- An integer value is randomly selected within the preset value range bounded by N_i.
- The randomly selected integer value can be denoted as S_i, which represents the randomly selected integer value of the i-th network segment, and the first training sample set can be used to train the part (i.e.
- The output of the selected feature extraction unit in each network segment can be used as the input data for the next network segment.
- For example, when the feature extraction unit selected in the first network segment is the second feature extraction unit, the output result of the second feature extraction unit can be used as the input data for the next network segment; when the feature extraction unit selected in the first network segment is the third feature extraction unit, the output result of the third feature extraction unit can be used as the input data for the next network segment.
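- The random selection strategy and the resulting forward path can be sketched as follows. This is an illustration under stated assumptions rather than the disclosed training procedure: `sample_depths` and `forward_prefix` are hypothetical helpers, the lower bound of the sampling range is assumed to be 1, and each feature extraction unit is replaced by a toy scalar function so that the data flow is easy to follow.

```python
import random
from typing import Callable, List, Sequence

Unit = Callable[[float], float]  # toy stand-in for a feature extraction unit


def sample_depths(segment_sizes: Sequence[int], rng: random.Random) -> List[int]:
    """Draw S_i uniformly from 1..N_i for every segment (lower bound 1 is assumed)."""
    return [rng.randint(1, n) for n in segment_sizes]


def forward_prefix(segments: Sequence[Sequence[Unit]], depths: Sequence[int], x: float) -> float:
    """Run only the first S_i units of each segment, in order."""
    for units, depth in zip(segments, depths):
        for unit in units[:depth]:
            x = unit(x)
        # The output of the selected unit is the input of the next segment.
    return x


# Two segments of four toy units each.
segments = [[lambda v: v + 1.0] * 4, [lambda v: v * 2.0] * 4]
rng = random.Random(0)

depths = sample_depths([len(s) for s in segments], rng)  # re-drawn before each training
output = forward_prefix(segments, depths, 0.0)
print(depths, output)
```

Only the units on the sampled path take part in the forward pass, so only their parameters would be updated in that training iteration.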
- When the first sub-network includes multiple branch networks, a preset selection strategy can be used before each training to select one branch network in the first sub-network and to select a feature extraction unit in each network segment included in the selected branch network, so that the first training sample set can be used to train the part of each network segment of the selected branch network located before the selected feature extraction unit and to adjust the network parameters of that part.
- an integer value can be randomly selected within a preset value range, and the upper limit of the preset value range is the number of branch networks included in the first sub-network.
- For example, if the first sub-network includes N branch networks in total, the lower limit of the preset value range may be 1, and an integer value may be randomly selected from the preset value range of 1 to N.
- the randomly selected integer value may be denoted as S, indicating that the S-th branch network is selected in the first sub-network. Please refer to FIG. 2 , the first sub-network shown in FIG.
- the branch network that contains the largest number of feature extraction units in the first sub-network may be selected. Then, a feature extraction unit is selected in each network segment included in the branch network. For details, please refer to the foregoing description, which will not be repeated here.
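- The two branch selection options mentioned above, a random index S between 1 and N or the branch containing the most feature extraction units, can be sketched as follows. The helper names `select_random_branch` and `select_largest_branch` are hypothetical, and each branch is represented simply by the list of unit counts of its segments.

```python
import random
from typing import Sequence


def select_random_branch(num_branches: int, rng: random.Random) -> int:
    """Return a 1-based branch index S drawn uniformly from 1..N."""
    return rng.randint(1, num_branches)


def select_largest_branch(branches: Sequence[Sequence[int]]) -> int:
    """Return the 1-based index of the branch with the most feature extraction units."""
    totals = [sum(segment_sizes) for segment_sizes in branches]
    return totals.index(max(totals)) + 1  # the first branch wins a tie


# Three branches described by the unit counts of their segments (illustrative values).
branches = [[4], [4, 4, 4], [4]]

random_choice = select_random_branch(len(branches), random.Random(0))
largest_choice = select_largest_branch(branches)
print(random_choice, largest_choice)
```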
- The preset end condition may include: the number of times each first training sample participates in training has reached a preset number-of-times threshold. The threshold may be set according to actual application needs, for example, to 100, 120, 150, etc., which is not limited here.
- Step S12: Obtain a target model by using the second sub-network and at least part of the structure of the pre-trained first sub-network.
- the second sub-network is configured to perform the target task based on the features extracted by the first sub-network.
- the target task may include any one of the following: target detection task, image classification task, and scene segmentation task.
- the original model may further include a third sub-network, and the third sub-network is used to perform a preset task based on the extracted features, and the preset task may be the same as or different from the target task.
- both the preset task and the target task may be image classification tasks.
- the preset task and the target task may both be target detection tasks.
- the preset task may be a target detection task, and the target task may be an image classification task, which is not limited herein.
- the third sub-network may be the same as the second sub-network, or may not be the same.
- the third sub-network may include a fully connected layer and a softmax layer, and the second sub-network may include two sequentially connected fully connected layers and a softmax layer connected after these two fully connected layers.
- the second sub-network like the third sub-network, may also include a fully connected layer and a softmax layer, which is not limited herein.
- Different partial structures of the first sub-network can be used to obtain at least one candidate sub-network, and a candidate sub-network that satisfies a preset condition can be selected as the selected sub-network, so that the selected sub-network and the second sub-network can be used to obtain the target model.
- Where the candidate sub-network includes at least one feature extraction unit in each network segment in the same branch network, and the feature extraction units in different candidate sub-networks are at least partially different, then each time a partial structure of the first sub-network is selected, one branch network can be selected, and a feature extraction unit can be selected in each network segment of the selected branch network, so that the combination of the parts of each network segment located before the selected feature extraction unit is used as a partial structure of the first sub-network. In this way, different partial structures of the first sub-network can be obtained.
- Taking the case where the first sub-network includes only one branch network as an example, please refer to FIG. 3. When selecting the partial structure of the first sub-network for the first time, the third feature extraction unit can be randomly selected in the first network segment and the second feature extraction unit can be randomly selected in the second network segment; the part of the first network segment before the third feature extraction unit and the part of the second network segment before the second feature extraction unit can then be combined as the partial structure of the first sub-network.
- When selecting the partial structure of the first sub-network for the second time, the second feature extraction unit can be randomly selected in the first network segment and the third feature extraction unit can be randomly selected in the second network segment; the part of the first network segment before the second feature extraction unit and the part of the second network segment before the third feature extraction unit can then be combined as a partial structure of the first sub-network, and so on; further cases are not enumerated one by one here.
- the number of times of selection can be set according to actual application requirements, for example, 10, or 15, or 20, etc. can be used as the number of times of selection according to the computational complexity, which is not limited here.
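- Repeated random selection of partial structures can be sketched as follows. The helper `sample_candidates` is hypothetical; it assumes that a partial structure is fully described by how many leading units are kept in each network segment, and it drops repeated draws so that the remaining candidates differ from one another.

```python
import random
from typing import List, Sequence, Tuple


def sample_candidates(
    segment_sizes: Sequence[int],
    num_selections: int,
    rng: random.Random,
) -> List[Tuple[int, ...]]:
    """Draw `num_selections` partial structures and keep the distinct ones."""
    seen = set()
    for _ in range(num_selections):
        # One selected unit per segment; units up to it form the partial structure.
        depths = tuple(rng.randint(1, n) for n in segment_sizes)
        seen.add(depths)
    return sorted(seen)


# Two segments of four units each, with ten selections.
candidates = sample_candidates([4, 4], num_selections=10, rng=random.Random(0))
print(candidates)
```

Because duplicates are discarded, the number of candidates returned can be smaller than the number of selections.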
- the preset condition may include: the candidate model obtained by using the candidate sub-network and the second sub-network satisfies the preset performance condition.
- The verification samples corresponding to the target task can be used to verify the candidate model obtained by using the candidate sub-network and the second sub-network, so as to obtain the performance score of the candidate model for executing the target task and to determine, based on the performance score, whether the candidate model satisfies the preset performance condition. For example, when the performance score is the highest of the performance scores of all candidate models, the corresponding candidate model may be considered to satisfy the preset performance condition.
- the preset condition may further include: the number of feature extraction units in the candidate sub-network is not less than the preset number.
- the preset number can be set according to actual application requirements, for example, it can be set to 4, 5, 6, etc., which is not limited here.
- the preset number can constrain the complexity of the target model.
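- The two preset conditions, a best performance score and a number of feature extraction units not less than the preset number, can be combined as in the following sketch. The function `select_sub_network` and the scores shown are hypothetical illustration values; in practice the scores would come from verifying each candidate model on verification samples of the target task.

```python
from typing import Dict, Optional, Sequence, Tuple

Depths = Tuple[int, ...]  # number of selected units in each network segment


def select_sub_network(
    candidates: Sequence[Depths],
    scores: Dict[Depths, float],
    preset_number: int,
) -> Optional[Depths]:
    """Pick the best-scoring candidate among those with enough units."""
    eligible = [c for c in candidates if sum(c) >= preset_number]
    if not eligible:
        return None  # no candidate satisfies the unit-count condition
    return max(eligible, key=lambda c: scores[c])


# Made-up verification scores for four candidate sub-networks.
scores = {(1, 1): 0.90, (3, 2): 0.70, (2, 3): 0.80, (4, 4): 0.75}
selected = select_sub_network(list(scores), scores, preset_number=5)
print(selected)
```

In this toy example, the candidate `(1, 1)` has the highest score but is excluded because it contains fewer units than the preset number.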
- the selected sub-network and the second sub-network may be sequentially connected to obtain the target model.
- the candidate sub-network includes at least one feature extraction unit in each network segment of the same network branch, and the feature extraction units in different candidate sub-networks are at least partially different.
- Each network segment in the first sub-network can be assigned a count value count_i, which represents the count value of the i-th network segment; the first count_i feature extraction units of each network segment are combined as the partial structure selected this time. The number of network segments included in the first sub-network may be denoted as N_s.
- The initial value of each count value count_i can be set to 1, and after each partial structure is selected, the count is incremented by 1 until all the different partial structures in the first sub-network have been enumerated.
- For example, when the partial structure is selected for the first time, the count values of the four network segments can be recorded as 1, 1, 1, and 1, respectively; at this time, the combination of the first feature extraction unit in each network segment can be used as the partial structure selected this time.
- When the partial structure is selected for the second time, the count values of the four network segments can be recorded as 1, 1, 1, and 2, respectively; at this time, the combination of the first feature extraction unit of each of the first three network segments and the first two feature extraction units of the last network segment can be used as the partial structure selected this time.
- When the partial structure is selected for the third time, the count values of the four network segments can be recorded as 1, 1, 1, and 3, respectively; at this time, the combination of the first feature extraction unit of each of the first three network segments and the first three feature extraction units of the last network segment can be used as the partial structure selected this time.
- When the partial structure is selected for the fourth time, the count values of the four network segments can be recorded as 1, 1, 1, and 4, respectively; at this time, the first feature extraction unit of each of the first three network segments and all the feature extraction units of the last network segment are taken as the partial structure selected this time.
- When the partial structure is selected for the fifth time, the count values of the four network segments can be recorded as 1, 1, 2, and 1, respectively; at this time, the combination of the first feature extraction unit of each of the first two network segments, the first two feature extraction units of the third network segment, and the first feature extraction unit of the last network segment can be used as the partial structure selected this time.
- When the partial structure is selected for the sixth time, the count values of the four network segments can be recorded as 1, 1, 2, and 2, respectively; at this time, the combination of the first feature extraction unit of each of the first two network segments and the first two feature extraction units of each of the last two network segments is used as the partial structure selected this time, and so on, which will not be repeated here.
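The counting scheme above behaves like an odometer whose last position (the last network segment) advances fastest. The following is a minimal Python sketch of that enumeration, assuming each count value ranges from 1 to the number of feature extraction units in its segment. The function name `enumerate_partial_structures` and the segment sizes `[2, 2, 3, 4]` are hypothetical; only the size of the last segment (4) follows from the example.

```python
from itertools import product


def enumerate_partial_structures(segment_sizes):
    """Yield count vectors [c_1, ..., c_k] with 1 <= c_i <= segment_sizes[i].

    c_i means "use the first c_i feature extraction units of segment i".
    itertools.product varies the last position fastest, which matches the
    order 1,1,1,1 -> 1,1,1,2 -> ... -> 1,1,1,4 -> 1,1,2,1 described above.
    """
    ranges = [range(1, size + 1) for size in segment_sizes]
    for counts in product(*ranges):
        yield list(counts)


# Hypothetical segment sizes; only the last value (4) follows from the text.
structures = list(enumerate_partial_structures([2, 2, 3, 4]))
for counts in structures[:6]:
    print(counts)
```

The number of partial structures is the product of the segment sizes, so exhaustive enumeration grows quickly as segments are added or lengthened.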
- The second training sample set may be used to train the original model so as to adjust the network parameters of the original model. In this way, the network parameters of the original model are first adjusted based on the target task, which can help improve the accuracy of the subsequent adjustment in the network structure dimension.
- at least part of the structure of the first sub-network includes the adjusted network parameters after being trained by the second training sample set, that is, during the transfer learning process, both the network parameters and the network structure are adjusted.
- When the number of first training samples in the first training sample set is greater than the number of second training samples in the second training sample set, the embodiments of the present disclosure obtain a target model suitable for the target task on the basis of large-scale data samples (i.e., the first training sample set) and small-scale data samples corresponding to the target task (i.e., the second training sample set). This can help reduce the difficulty of collecting the small-scale data samples and the workload of labeling the second training sample set, and can thereby improve the efficiency of obtaining the target model.
- the number of first training samples in the first training sample set may be 5000, 10000, 15000, etc.
- the number of second training samples in the second training sample set may be 100, 200, 300, etc., which can be set according to actual usage and is not limited here.
- Step S13 Use the second training sample set corresponding to the target task to train the target model to adjust the network parameters of the target model.
- the target model may be trained only by using the second training sample set corresponding to the target task, so as to adjust the network parameters of the target model.
- the number of second training samples in the second training sample set is smaller than the number of first training samples in the first training sample set; for example, the second training sample set consists of small-scale data samples, and the first training sample set consists of large-scale data samples.
- In order to improve the accuracy of the target model, the target model can also first be trained by using the first training sample set to adjust the network parameters of the target model, and then be trained by using the second training sample set corresponding to the target task to adjust the network parameters of the target model again.
- the original model is first pre-trained by using the first training sample set to adjust the network parameters of the original model, wherein the original model includes a first sub-network for feature extraction.
- a target model is then obtained using the second sub-network and at least part of the pre-trained structure of the first sub-network, wherein the second sub-network is used to perform the target task based on the features extracted by the first sub-network.
- the target model is trained by using the second training sample set corresponding to the target task to adjust the network parameters of the target model.
- the original model is a single-branch network structure, that is, the first sub-network includes a branch network, and the branch network includes a plurality of network segments connected in sequence.
- Each network segment includes at least one feature extraction unit connected in sequence.
- the candidate sub-network includes at least one feature extraction unit in each network segment. Feature extraction units in different candidate sub-networks are at least partially different.
- Step S12 may include steps S41 to S46.
- Step S41 Select the first feature extraction unit in each network segment to obtain an initial undetermined sub-network.
- the first feature extraction unit in each network segment is selected, and each first feature extraction unit is connected together to obtain the initial undetermined sub-network.
- the output result of the first feature extraction unit in the first network segment can be used as the input data of the second network segment.
- the initial pending sub-network is denoted as [1,1].
- Step S42 for the undetermined sub-network, add at least one feature extraction unit to obtain a candidate sub-network, wherein the added feature extraction unit is located after the feature extraction unit selected by at least one network segment.
- the at least one feature extraction unit may be one feature extraction unit, two feature extraction units, etc., which is not limited herein.
- the at least one feature extraction unit may be multiple feature extraction units, such as two or three; where network structure adjustment demands high accuracy, it may be a single feature extraction unit. This can be set according to actual application needs and is not limited here.
- Each network segment can be used as a target segment in turn, and the sequence number of the feature extraction unit selected in the target segment in the previous step is increased by 1, that is, one feature extraction unit is added, while the number of feature extraction units selected in the previous step in the other network segments of the first sub-network remains unchanged, so that the candidate sub-network corresponding to the target segment is obtained.
- The first network segment and the second network segment can each be used as the target segment. The first two feature extraction units in the first target segment (in step S41 the feature extraction unit selected in the first network segment is the first feature extraction unit, so in this step the selection becomes the first two feature extraction units) and the first feature extraction unit in the second network segment are used to obtain the candidate sub-network for the first target segment.
- the output result of the second feature extraction unit in the first target segment can be used as the input data of the second network segment.
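- Similarly, the first feature extraction unit in the first network segment and the first two feature extraction units in the second target segment are used to obtain the candidate sub-network of the second target segment.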
- the output result of the first feature extraction unit in the first network segment can be used as the input data of the second target segment.
- The candidate sub-network corresponding to the first target segment can be denoted as [2, 1], indicating that it is composed of the first two feature extraction units of the first network segment and the first feature extraction unit of the second network segment. The candidate sub-network corresponding to the second target segment can be denoted as [1, 2], indicating that it is composed of the first feature extraction unit of the first network segment and the first two feature extraction units of the second network segment.
- Alternatively, each network segment can be used as a target segment, and the sequence number of the feature extraction unit selected in the target segment in the previous step is increased by an integer such as 2 or 3, that is, two feature extraction units, three feature extraction units, and so on are added, while the number of feature extraction units selected in the previous step in the other network segments of the first sub-network remains unchanged, so that a candidate sub-network corresponding to the target segment is obtained.
- the number of added feature extraction units can be set according to actual application requirements, which is not limited here.
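The candidate generation of step S42 can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `expand_candidates`, the `step` parameter, and the segment sizes `[3, 2]` are hypothetical, and skipping a segment that has no further units to add is an assumption that the text does not spell out.

```python
def expand_candidates(pending, segment_sizes, step=1):
    """Build one candidate per target segment from an undetermined sub-network.

    pending[i] is the number of leading feature extraction units currently
    selected in segment i. Each candidate adds `step` units to exactly one
    target segment and leaves all other segments unchanged.
    """
    candidates = []
    for target, (count, size) in enumerate(zip(pending, segment_sizes)):
        if count + step > size:
            # Assumption: a segment with too few remaining units is skipped.
            continue
        candidate = list(pending)
        candidate[target] = count + step
        candidates.append(candidate)
    return candidates


# Initial undetermined sub-network [1, 1] with hypothetical segment sizes.
print(expand_candidates([1, 1], [3, 2]))  # [[2, 1], [1, 2]]
```

With `step=1` the search moves one unit at a time, which corresponds to the finer-grained adjustment mentioned above; a larger `step` trades granularity for fewer rounds.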
- Step S43 Form candidate models from the plurality of candidate sub-networks and the second sub-network respectively, and select the candidate sub-network corresponding to the candidate model with the best performance among the plurality of candidate models as the undetermined sub-network.
- The verification samples corresponding to the target task can be used to verify the candidate models formed by the multiple candidate sub-networks and the second sub-network respectively, so as to obtain performance scores of the multiple candidate models on the target task and determine the candidate model with the best performance based on the performance scores, where the verification samples can be the same as, or a part of, the second training sample set. For details, reference may be made to the relevant descriptions in the foregoing disclosed embodiments, which will not be repeated here. Assume that, after verification with the verification samples, the candidate sub-network [2, 1] is taken as the undetermined sub-network.
- Step S44 Determine whether the number of feature extraction units in the sub-network to be determined is less than the preset number, and if so, go to Step S42, otherwise go to Step S45.
- the preset number may be preset according to actual application requirements. Specifically, it can be set according to the desired complexity of the target model. For example, it can be set to 4, 5, 6, etc., which is not limited here.
- When the preset number is 4, since the number of feature extraction units in the undetermined sub-network [2, 1] is 3, which is less than the preset number, step S42 may be executed again.
- When the preset number is 3, since the number of feature extraction units in the undetermined sub-network [2, 1] is 3, which is not less than the preset number, step S45 can be executed; that is, the undetermined sub-network [2, 1] can be used directly as the selected sub-network.
- Other situations can be deduced by analogy, and no examples are given here.
- The first sub-network may include a first number of feature extraction units and a second number of network segments, and the preset number may be set to be smaller than the first number and greater than the second number.
- the preset number may be set according to the computational complexity of the target model for completing the target task.
- the candidate sub-network [2, 1] can be used as an example to continue the description.
- When step S44 determines that the number of feature extraction units in the undetermined sub-network [2, 1] is less than the preset number, the process returns to step S42. At this time, new candidate sub-networks can be obtained based on the undetermined sub-network [2, 1].
- the first feature extraction unit located after the selected feature extraction unit in the target segment may be added after the selected feature extraction unit of the target segment, thereby obtaining a new candidate sub-network.
- The first network segment can be used as the target segment first, and the first feature extraction unit (the third feature extraction unit) located after the selected feature extraction unit (the second feature extraction unit) in the target segment is added after the selected feature extraction unit (the second feature extraction unit) of the target segment to obtain a new candidate sub-network, which can be denoted as [3, 1]; that is, the new candidate sub-network is composed of the first three feature extraction units in the first network segment and the first feature extraction unit in the second network segment.
- The second network segment can then be used as the target segment, and the first feature extraction unit (the second feature extraction unit) located after the selected feature extraction unit (the first feature extraction unit) in the target segment is added after the selected feature extraction unit (the first feature extraction unit) of the target segment to obtain another new candidate sub-network, which can be denoted as [2, 2] for convenience of description.
- In other words, the sub-network with the best performance is selected from the current candidate sub-networks as the undetermined sub-network; then at least one feature extraction unit is added to the previously selected feature extraction units to obtain new candidate sub-networks based on the previous undetermined sub-network, and the sub-network with the best performance among the new candidate sub-networks is selected as the new undetermined sub-network.
- FIG. 5 is a schematic diagram of a frame of a sub-network to be determined according to an embodiment of the present disclosure.
- the feature extraction unit in the solid rectangle represents the selected feature extraction unit
- the feature extraction unit in the dotted rectangle represents the selected feature extraction unit.
- The undetermined sub-network shown in FIG. 5 is [2, 2]. Taking the undetermined sub-network [2, 2] and a preset number of 4 as an example, since the number of feature extraction units in the undetermined sub-network is equal to 4, which is not less than the preset number, the following step S45 may be performed; that is, the undetermined sub-network [2, 2] may be used as the selected sub-network.
- Step S45 Take the undetermined sub-network as the selected sub-network.
- the sub-network to be determined may be used as the selected sub-network.
- In this way, a selected sub-network with the best performance can be obtained under the model complexity constraint imposed by the preset number.
- Step S46 Obtain the target model by using the selected sub-network and the second sub-network.
- the selected sub-network and the second sub-network can be sequentially connected to obtain the target model.
- the above method can not only constrain the target model in terms of "model complexity" and "model performance", but also improve the efficiency of acquiring the target model.
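Steps S41 to S45 amount to a greedy search over per-segment unit counts. The Python sketch below illustrates that loop under stated assumptions: `greedy_search`, `expand_candidates`, and `toy_scores` are hypothetical names, the scores are made-up numbers standing in for the performance scores measured on verification samples, and stopping when no segment can be extended is an assumption not stated in the text.

```python
def expand_candidates(pending, segment_sizes):
    """Step S42: add one unit to each segment in turn (full segments skipped)."""
    return [
        pending[:i] + [count + 1] + pending[i + 1:]
        for i, (count, size) in enumerate(zip(pending, segment_sizes))
        if count < size
    ]


def greedy_search(segment_sizes, preset_number, score_fn):
    """Return the selected sub-network as a list of per-segment unit counts."""
    pending = [1] * len(segment_sizes)  # step S41: first unit of every segment
    while True:
        candidates = expand_candidates(pending, segment_sizes)  # step S42
        if not candidates:
            break  # assumption: nothing left to add, so stop early
        pending = max(candidates, key=score_fn)  # step S43: best candidate
        if sum(pending) >= preset_number:  # step S44: size check
            break
    return pending  # step S45: undetermined sub-network becomes selected


# Made-up performance scores, standing in for validation results.
toy_scores = {(2, 1): 0.80, (1, 2): 0.70, (3, 1): 0.75, (2, 2): 0.90}


def score(candidate):
    return toy_scores[tuple(candidate)]


print(greedy_search([3, 2], preset_number=4, score_fn=score))  # [2, 2]
print(greedy_search([3, 2], preset_number=3, score_fn=score))  # [2, 1]
```

Each round evaluates at most one candidate per network segment, so the number of candidate models evaluated grows with the preset number times the number of segments rather than with the product of the segment sizes.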
- a branch network may be selected from the first sub-network first, and then steps S41 to S46 are performed for the branch network.
- FIG. 6 is a schematic flowchart of a method for acquiring a target model according to another embodiment of the present disclosure.
- The original model includes a first sub-network for feature extraction, the first sub-network includes a branch network, the branch network includes a plurality of network segments connected in sequence, and each network segment includes at least one feature extraction unit connected in sequence.
- Steps S601 to S612 may be included.
- Step S601 Select a feature extraction unit in each network segment by using a preset selection strategy before training the original model each time.
- Step S602 Use the first training sample set to train the part of each network segment before the selected feature extraction unit to adjust the network parameters of the part of each network segment before the selected feature extraction unit.
- Step S603 Use the second training sample set corresponding to the target task to train the original model to adjust the network parameters of the original model.
- Step S604 Select the first feature extraction unit in each network segment to obtain the initial undetermined sub-network.
- Step S605 for the undetermined sub-network, add at least one feature extraction unit to obtain a candidate sub-network, wherein the added feature extraction unit is located after the feature extraction unit selected by at least one network segment.
- Step S606 Form candidate models from the multiple candidate sub-networks and the second sub-network respectively, and select the candidate sub-network corresponding to the candidate model with the best performance among the multiple candidate models as the undetermined sub-network.
- the second sub-network is configured to perform the target task based on the features extracted by the first sub-network.
- Step S607 Determine whether the number of feature extraction units of the sub-network to be determined is less than the preset number, if so, go to step S605, otherwise go to step S608.
- Step S608 Obtain a new candidate sub-network based on the undetermined sub-network.
- Step S609 Take the undetermined sub-network as the selected sub-network.
- Step S610 Obtain the target model by using the selected sub-network and the second sub-network.
- When the undetermined sub-network is the sub-network [2, 2] shown in FIG. 5 and the preset number is 4, the undetermined sub-network [2, 2] can be used as the selected sub-network, and the selected sub-network and the second sub-network can be used to obtain the target model. Specifically, the selected sub-network and the second sub-network can be sequentially connected to obtain the target model.
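Connecting the selected sub-network and the second sub-network in sequence can be pictured as function composition. The Python sketch below is a toy illustration only: `build_target_model` is a hypothetical name, the feature extraction units and the second sub-network are stand-in arithmetic functions rather than real network layers, and any downsampling layer between adjacent network segments is omitted.

```python
def build_target_model(segments, counts, head):
    """Chain the first counts[i] units of each segment, then apply the head.

    segments: list of lists of callables (stand-ins for feature extraction units)
    counts:   selected sub-network, e.g. [2, 2]
    head:     callable standing in for the second sub-network
    """
    if len(segments) != len(counts):
        raise ValueError("one count is required per network segment")
    chain = []
    for units, count in zip(segments, counts):
        if not 1 <= count <= len(units):
            raise ValueError("count must select at least one existing unit")
        chain.extend(units[:count])  # keep only the leading units

    def forward(x):
        for unit in chain:
            x = unit(x)  # output of one unit feeds the next unit or segment
        return head(x)

    return forward


# Toy stand-ins: segment 1 has three "+1" units, segment 2 has two "*2" units.
segments = [[lambda x: x + 1] * 3, [lambda x: x * 2] * 2]
head = lambda x: -x
model = build_target_model(segments, [2, 2], head)
print(model(1.0))  # -12.0
```

In this toy run the input passes through two units of the first segment, two units of the second segment, and finally the stand-in for the second sub-network.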
- Step S611 Use the first training sample set to train the target model to adjust the network parameters of the target model.
- When the selected sub-network is the undetermined sub-network [2, 2] shown in FIG. 5, the first training sample set can be used to train the target model formed by the undetermined sub-network [2, 2] and the second sub-network, so as to adjust the network parameters of the target model.
- Step S612 Use the second training sample set corresponding to the target task to train the target model to adjust the network parameters of the target model.
- When the selected sub-network is the undetermined sub-network [2, 2] shown in FIG. 5, the second training sample set can be further used to train the target model formed by the undetermined sub-network [2, 2] and the second sub-network, so as to adjust the network parameters of the target model.
- The above method can help improve the efficiency of acquiring the selected sub-network at the level of "network structure adjustment". Since the original model is adjusted at both the "network parameter adjustment" and "network structure adjustment" levels, the degree of freedom of network adjustment can be greatly improved, and the potential of the pre-trained original model can be fully tapped in both the "network parameter dimension" and the "network structure dimension", which is beneficial to improving the performance of the target model.
- a branch network may be selected from the first sub-network first, and then steps S601 to S612 are performed for the branch network.
- FIG. 7 is a schematic diagram of a framework of an apparatus 70 for acquiring a target model according to an embodiment of the present disclosure.
- The apparatus 70 for acquiring the target model may include a first training module 71, a model acquisition module 72, and a second training module 73. The first training module 71 is configured to pre-train the original model by using the first training sample set to adjust the network parameters of the original model, wherein the original model may include a first sub-network for feature extraction. The model acquisition module 72 is configured to obtain a target model by using the second sub-network and at least part of the structure of the pre-trained first sub-network, wherein the second sub-network is used to perform the target task based on the features extracted by the first sub-network. The second training module 73 is configured to train the target model by using the second training sample set corresponding to the target task to adjust the network parameters of the target model.
- the first sub-network may include a plurality of network segments, and each of the network segments may include sequentially connected at least one feature extraction unit for performing feature extraction.
- the first sub-network may include at least one branch network, and each branch network includes at least one of the network segments connected in sequence. Therefore, the first sub-network can be set as a "single-chain" single-branch network or as a "multi-chain" multi-branch network, so that the target model can be obtained from either a single-branch network or a multi-branch network, which can help expand the scope of use.
- The model acquisition module 72 may include a structure search sub-module, configured to obtain at least one candidate sub-network by using different partial structures of the first sub-network and to select a candidate sub-network that satisfies a preset condition as the selected sub-network. The model acquisition module 72 may further include a model building module, configured to obtain the target model by using the selected sub-network and the second sub-network.
- At least one candidate sub-network is obtained, and the candidate sub-network that satisfies the preset condition is selected as the selected sub-network, so that the selected sub-network and the second sub-network are used to obtain the target model. This can help expand the adjustment space in the "network structure dimension" and thus help improve the performance of the target model.
- the preset condition may further include: the number of feature extraction units in the candidate sub-network reaches a preset number.
- the target model can be constrained from the levels of "model complexity" and "model performance".
- the candidate sub-networks may include at least one feature extraction unit in each network segment in the same branch network, and the feature extraction units in different candidate sub-networks are at least partially different.
- The structure search sub-module may include an initialization unit configured to select the first feature extraction unit in each of the network segments in the same branch network to obtain an initial undetermined sub-network. The initialization unit is further configured to add at least one feature extraction unit to the undetermined sub-network to obtain a candidate sub-network, wherein the added feature extraction unit is located after the feature extraction unit selected in the network segment. The structure search sub-module may include a performance evaluation unit configured to form candidate models from a plurality of the candidate sub-networks and the second sub-network respectively, and to select the candidate sub-network corresponding to the candidate model with the best performance among the plurality of candidate models as the undetermined sub-network.
- The structure search sub-module includes a repeated search unit configured to, when the number of feature extraction units in the undetermined sub-network is less than a preset number, obtain new candidate sub-networks based on the undetermined sub-network and select, among the candidate models formed by the plurality of new candidate sub-networks and the second sub-network, the candidate sub-network corresponding to the candidate model with the best performance as the new undetermined sub-network. The structure search sub-module may include a selection acquisition unit configured to use the undetermined sub-network as the selected sub-network when the number of feature extraction units in the undetermined sub-network is not less than the preset number.
- The initialization unit may be configured to use each of the network segments as a target segment and increase the number of feature extraction units selected in the target segment by an integer value while keeping the number of selected feature extraction units in the other network segments unchanged, to obtain the candidate sub-network corresponding to the target segment. The repeated search unit may be configured to add at least one feature extraction unit to the undetermined sub-network to obtain a new candidate sub-network, wherein the added feature extraction unit is located after the selected feature extraction unit in the network segment.
- By taking each network segment as a target segment and using the first feature extraction unit in each target segment to obtain the initial undetermined sub-network, the network structure can be adjusted starting from the head of each network segment of the first sub-network. By increasing the number of feature extraction units selected in the target segment by an integer value while keeping the number of feature extraction units selected in the other network segments unchanged, the candidate sub-network corresponding to the target segment is obtained; therefore, in the subsequent adjustment process, the feature extraction units of different network segments can be adjusted one by one, which can help improve the accuracy of network adjustment.
- the performance evaluation unit may be configured to use the verification samples corresponding to the target task to verify the candidate model obtained by using the candidate sub-network and the second sub-network to obtain a performance score of the candidate model for executing the target task.
- the first sub-network includes a first number of feature extraction units, the first sub-network includes a second number of network segments, and the preset number is less than the first number and greater than or equal to the second number.
- The candidate model obtained by using the candidate sub-network and the second sub-network is verified to obtain the performance score of the candidate model for executing the target task, and whether the candidate model meets the preset performance condition is determined based on the performance score, so the accuracy of selecting the selected sub-network can be improved.
- Since the first sub-network includes a first number of feature extraction units and a second number of network segments, setting the preset number to be smaller than the first number and greater than or equal to the second number can help reduce the complexity of the target model.
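The range stated above for the preset number can be checked with a few lines of Python. This is a minimal sketch; `is_valid_preset_number` is a hypothetical helper, and the first sub-network is represented simply as a list of per-segment unit counts.

```python
def is_valid_preset_number(preset_number, segment_sizes):
    """Check second number <= preset number < first number.

    first number  = total feature extraction units in the first sub-network
    second number = number of network segments in the first sub-network
    """
    first_number = sum(segment_sizes)
    second_number = len(segment_sizes)
    return second_number <= preset_number < first_number


# Hypothetical first sub-network: two segments with 3 and 2 units.
for preset in range(1, 6):
    print(preset, is_valid_preset_number(preset, [3, 2]))
```

One reading of these bounds, offered here as an interpretation rather than a statement from the text, is that the lower bound reflects every candidate keeping at least one unit per segment, while the upper bound keeps the selected sub-network strictly smaller than the full first sub-network.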
- In the case where the first sub-network includes a single branch network, the branch network includes a plurality of network segments connected in sequence, and each network segment includes at least one feature extraction unit connected in sequence, the first training module 71 includes a unit selection sub-module configured to select, by using a preset selection strategy before each training, one feature extraction unit in each network segment, and a sample training sub-module configured to use the first training sample set to train the part of each network segment before the selected feature extraction unit, so as to adjust the network parameters of that part.
- In the case where the first sub-network comprises a multi-branch network, each of the branch networks comprises at least one network segment connected in sequence, and each network segment comprises at least one feature extraction unit connected in sequence, the unit selection sub-module is further configured to use a preset selection strategy before each training to select one of the branch networks in the first sub-network and to select one of the feature extraction units in each network segment included in the selected branch network; the sample training sub-module is configured to use the first training sample set to train the part of each of the network segments before the selected feature extraction unit, so as to adjust the network parameters of that part.
- one of the branch networks may be randomly selected in the first sub-network and one of the feature extraction units may be selected in each of the network segments included in the selected branch network.
- a branch network including the largest number of feature extraction units may be selected in the first sub-network, and one of the feature extraction units may be selected in each of the network segments included in the selected branch network.
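The two branch selection strategies described above can be sketched in Python as follows. The names `select_for_training`, `branches`, and `strategy` are hypothetical, and drawing the unit within each segment uniformly at random is an assumption, since the text only refers to a preset selection strategy.

```python
import random


def select_for_training(branches, rng, strategy="random"):
    """Pick a branch and one feature extraction unit per segment of it.

    branches: list of branches, each a list of per-segment unit counts
    rng:      a random.Random instance (passed in for reproducibility)
    Returns (branch_index, selected_units), where selected_units[i] is the
    1-based position of the unit chosen in segment i of the chosen branch.
    """
    if strategy == "random":
        branch_index = rng.randrange(len(branches))
    elif strategy == "largest":
        # Branch containing the largest total number of units.
        branch_index = max(range(len(branches)), key=lambda i: sum(branches[i]))
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    # Assumption: the unit in each segment is drawn uniformly at random.
    selected_units = [rng.randint(1, size) for size in branches[branch_index]]
    return branch_index, selected_units


# Hypothetical two-branch first sub-network.
branches = [[2, 2], [3, 4]]
rng = random.Random(0)
print(select_for_training(branches, rng, strategy="largest"))
```

Only the part of each segment preceding the selected unit would then take part in that training iteration, so different iterations exercise sub-networks of different depths.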
- the first sub-network may also include a downsampling layer between adjacent network segments.
- The learning effect of the feature extraction units in the training process can be improved.
- The downsampling layer between adjacent network segments can help achieve feature dimensionality reduction, compress the amount of data and the number of parameters, reduce overfitting, and improve fault tolerance.
- the apparatus 70 for acquiring the target model may further include a third training module for training the original model by using the second training sample set to adjust the network parameters of the original model.
- the original model is first trained with the second training sample set corresponding to the target task to adjust the network parameters of the original model, which can help to improve the accuracy of subsequent network structure dimension adjustment.
- the apparatus 70 for acquiring the target model may further include a fourth training module, configured to train the target model by using the first training sample set, so as to adjust the network parameters of the target model.
- the first training sample set is used to train the target model, and then the second training sample set corresponding to the target task is used to train the target model again, which can help improve the performance of the target model.
- the original model may further include a third sub-network for performing a preset task based on the extracted features, wherein the preset task is the same as or different from the target task.
- By setting the original model to include a third sub-network that performs a preset task based on the extracted features, where the preset task is the same as or different from the target task, the range of application of acquiring the target model can be further expanded.
- the number of first training samples in the first training sample set may be greater than the number of second training samples in the second training sample set.
- Because the number of first training samples in the first training sample set is greater than the number of second training samples in the second training sample set, the workload of labeling samples for the target task can be reduced.
- FIG. 8 is a schematic diagram of a frame of an electronic device 80 according to an embodiment of the present disclosure.
- the electronic device 80 includes a memory 81 and a processor 82 coupled to each other, and the processor 82 is configured to execute program instructions stored in the memory 81 to implement the steps of any of the above-mentioned embodiments of the method for acquiring a target model.
- the electronic device 80 may include, but is not limited to, a microcomputer and a server.
- the electronic device 80 may also include mobile devices such as a notebook computer and a tablet computer, which are not limited herein.
- the processor 82 is configured to control itself and the memory 81 to implement the steps of any of the above-mentioned embodiments of the method for acquiring a target model.
- the processor 82 may also be referred to as a CPU (Central Processing Unit, central processing unit).
- the processor 82 may be an integrated circuit chip with signal processing capability.
- the processor 82 can also be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
- a general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
- the processor 82 may be jointly implemented by an integrated circuit chip.
- not only can the second training sample set corresponding to the target task be used for adjustment in the "network parameter dimension", but the original model can also be adjusted in the "network structure dimension", which greatly increases the freedom of network adjustment. The potential of the pre-trained original model can therefore be fully tapped in both the "network parameter dimension" and the "network structure dimension", which helps improve the performance of the target model.
- FIG. 9 is a schematic diagram of a framework of an embodiment of the disclosed computer-readable storage medium 90.
- the computer-readable storage medium 90 stores program instructions 901 that can be executed by the processor, and the program instructions 901 are used to implement the steps of any of the above-mentioned embodiments of the method for acquiring a target model.
- not only can the second training sample set corresponding to the target task be used for adjustment in the "network parameter dimension", but the original model can also be adjusted in the "network structure dimension", which greatly increases the freedom of network adjustment. The potential of the pre-trained original model can therefore be fully tapped in both the "network parameter dimension" and the "network structure dimension", which helps improve the performance of the target model.
- the functions or modules included in the apparatuses provided in the embodiments of the present disclosure may be used to execute the methods described in the above method embodiments.
- the disclosed method and apparatus may be implemented in other manners.
- the device implementations described above are only illustrative.
- the division of modules or units is only a logical function division. In actual implementation, there may be other divisions.
- units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
- the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection of devices or units through some interfaces, and may be in electrical, mechanical or other forms.
- Units described as separate components may or may not be physically separated, and components shown as units may or may not be physical units, that is, may be located in one place, or may be distributed over network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution in this implementation manner.
- each functional unit in each embodiment of the present disclosure may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit.
- the above-mentioned integrated units may be implemented in the form of hardware, or may be implemented in the form of software functional units.
- the integrated unit if implemented as a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium.
- the computer-readable storage medium includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) or a processor to execute all or part of the steps of the methods of the various embodiments of the present disclosure.
- the aforementioned storage medium includes: a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, an optical disk, and other media that can store program code.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- Computing Systems (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Life Sciences & Earth Sciences (AREA)
- Molecular Biology (AREA)
- Artificial Intelligence (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
- Image Analysis (AREA)
- Holo Graphy (AREA)
- Automatic Disk Changers (AREA)
- Signal Processing For Digital Recording And Reproducing (AREA)
Abstract
Description
CROSS-REFERENCE TO RELATED APPLICATIONS
This disclosure claims priority to Chinese patent application No. 202010852192.4, filed on August 21, 2020 and entitled "Method and Device for Obtaining Target Model, Electronic Device and Storage Medium", the entire contents of which are incorporated herein by reference.
The present disclosure relates to the field of information technology, and in particular, to a method and apparatus for acquiring a target model, an electronic device, and a storage medium.
Transfer learning aims to process an original model built to perform one task so as to obtain a target model that can be applied to a target task. With the rapid development of deep learning, computer vision and other technologies, transfer learning has been applied in many scenarios. Therefore, how to improve the performance of the target model has become a topic of great research value.
SUMMARY OF THE INVENTION
The present disclosure provides a method and apparatus for acquiring a target model, an electronic device, and a storage medium.
A first aspect of the present disclosure provides a method for acquiring a target model. The method may include: pre-training an original model by using a first training sample set to adjust network parameters of the original model, wherein the original model includes a first sub-network for feature extraction; obtaining a target model by using a second sub-network and at least part of the structure of the pre-trained first sub-network, wherein the second sub-network is used to perform a target task based on the features extracted by the first sub-network; and training the target model by using a second training sample set corresponding to the target task to adjust network parameters of the target model.
In some embodiments, the first sub-network may include a plurality of network segments, each of which may include at least one sequentially connected feature extraction unit, the feature extraction unit being used for feature extraction.
In some embodiments, the first sub-network may include at least one branch network, and each branch network may include at least one sequentially connected network segment.
In some embodiments, obtaining the target model by using the second sub-network and at least part of the structure of the pre-trained first sub-network may include: obtaining at least one candidate sub-network by using different partial structures of the first sub-network, and selecting a candidate sub-network that satisfies a preset condition as a selected sub-network; and obtaining the target model by using the selected sub-network and the second sub-network.
In some embodiments, the preset condition may include: a candidate model obtained by using the candidate sub-network and the second sub-network satisfies a preset performance condition.
In some embodiments, the preset condition may further include: the number of feature extraction units in the candidate sub-network reaches a preset number.
In some embodiments, the candidate sub-network may include at least one feature extraction unit in each network segment of the same branch network, and the feature extraction units in different candidate sub-networks may be at least partially different.
In some embodiments, obtaining at least one candidate sub-network by using different partial structures of the first sub-network, and selecting a candidate sub-network that satisfies the preset condition as the selected sub-network, may include: selecting the first feature extraction unit in each network segment of the same branch network to obtain an initial pending sub-network; adding at least one feature extraction unit to the pending sub-network to obtain candidate sub-networks, wherein each added feature extraction unit is located after the feature extraction units already selected in its network segment; combining each of the candidate sub-networks with the second sub-network to form candidate models, and taking the candidate sub-network corresponding to the best-performing candidate model as the pending sub-network; when the number of feature extraction units in the pending sub-network is less than a preset number, obtaining new candidate sub-networks based on the pending sub-network, and taking, as the new pending sub-network, the candidate sub-network corresponding to the best-performing candidate model among the candidate models formed by the new candidate sub-networks and the second sub-network; and when the number of feature extraction units in the pending sub-network is not less than the preset number, taking the pending sub-network as the selected sub-network.
In some embodiments, adding at least one feature extraction unit to the pending sub-network to obtain candidate sub-networks may include: taking each network segment in turn as a target segment, and increasing the number of feature extraction units selected in the target segment by an integer value while keeping the number selected in the other network segments unchanged, to obtain the candidate sub-network corresponding to that target segment.
In some embodiments, obtaining the new candidate sub-networks based on the pending sub-network may include: adding at least one feature extraction unit to the pending sub-network to obtain a new candidate sub-network, wherein the added feature extraction unit is located after the feature extraction units already selected in its network segment.
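The search over candidate sub-networks described in the preceding embodiments can be sketched as a greedy loop. The following is a minimal illustration under stated assumptions, not the disclosed implementation: a sub-network is represented only by the number of units selected in each segment of one branch, and `score` is a hypothetical callable standing in for "build the candidate model with the second sub-network and measure its performance". The names `expand`, `greedy_search`, `stage_sizes` and `step` are introduced here for illustration.

```python
from typing import Callable, List, Sequence, Tuple

# Number of selected feature extraction units in each network segment.
Depths = Tuple[int, ...]


def expand(depths: Depths, stage_sizes: Sequence[int], step: int = 1) -> List[Depths]:
    """Grow one segment at a time by `step` units, keeping the others fixed."""
    candidates: List[Depths] = []
    for i, (selected, available) in enumerate(zip(depths, stage_sizes)):
        if selected + step <= available:  # a segment cannot exceed its own size
            candidates.append(depths[:i] + (selected + step,) + depths[i + 1:])
    return candidates


def greedy_search(
    stage_sizes: Sequence[int],
    score: Callable[[Depths], float],  # hypothetical: higher means better
    preset_number: int,
    step: int = 1,
) -> Depths:
    """Return the selected sub-network as per-segment unit counts."""
    # Initial pending sub-network: the first unit of every segment.
    pending: Depths = tuple(1 for _ in stage_sizes)
    while True:
        candidates = expand(pending, stage_sizes, step)
        if not candidates:  # nothing left to add
            break
        # Keep the candidate whose candidate model performs best.
        pending = max(candidates, key=score)
        if sum(pending) >= preset_number:
            break
    return pending
```

For example, with two segments of four units each and a toy score that favours the first segment, `greedy_search((4, 4), lambda d: 2 * d[0] + d[1], 4)` grows the first segment twice and stops once four units are selected.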
In some embodiments, the candidate model obtained by using the candidate sub-network and the second sub-network satisfying the preset performance condition may include: verifying the candidate model obtained by using the candidate sub-network and the second sub-network with verification samples corresponding to the target task, to obtain a performance score of the candidate model for performing the target task; and determining, based on the performance score, whether the candidate model satisfies the preset performance condition.
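One way to picture this verification step is shown below. The text does not say which metric the performance score uses, so classification accuracy and a simple threshold test are assumptions made purely for illustration, as are the names `performance_score`, `meets_condition`, `predict` and `threshold`.

```python
from typing import Any, Callable, Iterable, Tuple


def performance_score(
    predict: Callable[[Any], Any],
    validation_samples: Iterable[Tuple[Any, Any]],
) -> float:
    """Fraction of validation samples the candidate model gets right.

    Accuracy is an assumed metric; the source only speaks of a "performance score".
    """
    samples = list(validation_samples)
    if not samples:
        raise ValueError("at least one validation sample is required")
    correct = sum(1 for x, label in samples if predict(x) == label)
    return correct / len(samples)


def meets_condition(score: float, threshold: float) -> bool:
    """Assumed form of the preset performance condition: score >= threshold."""
    return score >= threshold
```

In the greedy search, the condition is instead "best among the current candidates", which needs only the scores and no threshold.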
The first sub-network includes a first number of feature extraction units and a second number of network segments, and the preset number is less than the first number and greater than or equal to the second number.
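Written as an inequality, with $N_{\text{unit}}$ for the first number (feature extraction units), $N_{\text{seg}}$ for the second number (network segments) and $K$ for the preset number, the constraint reads as follows. These symbols are introduced here for illustration and do not appear in the original text.

```latex
N_{\text{seg}} \le K < N_{\text{unit}}
```

The lower bound is consistent with a candidate sub-network containing at least one feature extraction unit per network segment, and the strict upper bound means the selected sub-network uses fewer units than the full first sub-network.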
In some embodiments, the first sub-network may include one branch network, which may include a plurality of sequentially connected network segments, each of which may include at least one sequentially connected feature extraction unit. Pre-training the original model by using the first training sample set to adjust the network parameters of the original model may include: before each round of training, selecting one feature extraction unit in each network segment by using a preset selection strategy; and training, by using the first training sample set, the part of each network segment located before the selected feature extraction unit, to adjust the network parameters of that part.
In some embodiments, the first sub-network may include multiple branch networks, each of which may include at least one sequentially connected network segment, and each network segment may include at least one sequentially connected feature extraction unit. Pre-training the original model by using the first training sample set to adjust the network parameters of the original model may include: before each round of training, selecting one branch network in the first sub-network by using a preset selection strategy, and selecting one feature extraction unit in each network segment included in the selected branch network; and training, by using the first training sample set, the part of each such network segment located before the selected feature extraction unit, to adjust the network parameters of that part.
In some embodiments, selecting one branch network in the first sub-network by using the preset selection strategy before each round of training, and selecting one feature extraction unit in each network segment included in the selected branch network, may include: randomly selecting one branch network in the first sub-network, and selecting one feature extraction unit in each network segment included in the selected branch network.
In some embodiments, selecting one branch network in the first sub-network by using the preset selection strategy before each round of training, and selecting one feature extraction unit in each network segment included in the selected branch network, may include: selecting, in the first sub-network, the branch network that contains the largest number of feature extraction units, and selecting one feature extraction unit in each network segment included in the selected branch network.
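The two branch-selection strategies just described (random, and largest by unit count) can be sketched as follows. This is an illustrative sketch only: a branch is represented as a sequence of per-segment unit counts, and the function names are hypothetical.

```python
import random
from typing import Sequence

# One branch network: the number of feature extraction units in each of its segments.
Branch = Sequence[int]


def pick_random_branch(branches: Sequence[Branch], rng: random.Random) -> int:
    """Index of a branch chosen uniformly at random."""
    return rng.randrange(len(branches))


def pick_largest_branch(branches: Sequence[Branch]) -> int:
    """Index of the branch with the most feature extraction units in total."""
    return max(range(len(branches)), key=lambda b: sum(branches[b]))


# Layout loosely modelled on a three-branch network where the middle branch is deepest.
branches = [(4,), (4, 4, 4), (4,)]
largest = pick_largest_branch(branches)
sampled = pick_random_branch(branches, random.Random(0))
```

Passing an explicit `random.Random` instance keeps the random strategy reproducible across training runs, which is a design choice of this sketch rather than something the text requires.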
In some embodiments, the first sub-network may further include a downsampling layer located between adjacent network segments.
In some embodiments, the feature extraction unit may include a sequentially connected convolutional layer, activation layer and batch normalization layer.
In some embodiments, after pre-training the original model by using the first training sample set to adjust the network parameters of the original model, and before obtaining the target model by using the second sub-network and at least part of the structure of the pre-trained first sub-network, the method may further include: training the original model by using the second training sample set to adjust the network parameters of the original model.
In some embodiments, after obtaining the target model by using the second sub-network and at least part of the structure of the pre-trained first sub-network, and before training the target model by using the second training sample set corresponding to the target task to adjust the network parameters of the target model, the method may further include: training the target model by using the first training sample set to adjust the network parameters of the target model.
In some embodiments, the original model may further include a third sub-network for performing a preset task based on the extracted features, wherein the preset task may be the same as or different from the target task.
In some embodiments, the number of first training samples in the first training sample set may be greater than the number of second training samples in the second training sample set.
A second aspect of the present disclosure provides an apparatus for acquiring a target model, including a first training module, a model acquisition module, and a second training module. The first training module is used to pre-train an original model by using a first training sample set to adjust network parameters of the original model, wherein the original model includes a first sub-network for feature extraction. The model acquisition module is used to obtain a target model by using a second sub-network and at least part of the structure of the pre-trained first sub-network, wherein the second sub-network is used to perform a target task based on the features extracted by the first sub-network. The second training module is used to train the target model by using a second training sample set corresponding to the target task to adjust network parameters of the target model.
A third aspect of the present disclosure provides an electronic device, including a memory and a processor coupled to each other, where the processor is configured to execute program instructions stored in the memory to implement the method for acquiring a target model in the first aspect.
A fourth aspect of the present disclosure provides a computer-readable storage medium on which program instructions are stored, where the program instructions, when executed by a processor, implement the method for acquiring a target model in the first aspect.
FIG. 1 is a schematic flowchart of a method for acquiring a target model according to an embodiment of the present disclosure;
FIG. 2 is a schematic diagram of a framework of a first sub-network according to an embodiment of the present disclosure;
FIG. 3 is a schematic diagram of a framework of a first sub-network according to another embodiment of the present disclosure;
FIG. 4 is a schematic flowchart of step S12 in FIG. 1 according to an embodiment of the present disclosure;
FIG. 5 is a schematic diagram of a framework of a pending sub-network according to an embodiment of the present disclosure;
FIG. 6 is a schematic flowchart of a method for acquiring a target model according to another embodiment of the present disclosure;
FIG. 7 is a schematic diagram of a framework of an apparatus for acquiring a target model according to an embodiment of the present disclosure;
FIG. 8 is a schematic diagram of a framework of an electronic device according to an embodiment of the present disclosure;
FIG. 9 is a schematic diagram of a framework of a computer-readable storage medium according to an embodiment of the present disclosure.
The embodiments of the present disclosure are described in detail below with reference to the accompanying drawings.
In the following description, for purposes of explanation rather than limitation, specific details such as particular system structures, interfaces and techniques are set forth in order to provide a thorough understanding of the present disclosure.
The terms "system" and "network" are used interchangeably herein. The term "and/or" herein merely describes an association between associated objects and indicates that three relationships are possible; for example, "A and/or B" may mean A alone, both A and B, or B alone. In addition, the character "/" herein generally indicates an "or" relationship between the objects before and after it. Also, "multiple" herein means two or more.
Please refer to FIG. 1, which is a schematic flowchart of a method for acquiring a target model according to an embodiment of the present disclosure. Specifically, the method may include steps S11 to S13.
Step S11: Pre-train the original model by using the first training sample set to adjust the network parameters of the original model, wherein the original model includes a first sub-network for feature extraction.
In one implementation scenario, the first sub-network may include a plurality of network segments (stages), and each network segment may include at least one sequentially connected feature extraction unit (block). Specifically, the feature extraction unit is used for feature extraction and may include a sequentially connected convolutional layer, activation layer and batch normalization (BN) layer. The convolutional layer may include several convolution kernels for extracting features. The activation layer may use an activation function such as sigmoid, tanh or ReLU to introduce nonlinearity. The batch normalization layer may be used for normalization. Connecting the convolutional layer, activation layer and batch normalization layer in sequence helps improve the learning effect of the feature extraction unit during training. In addition, within the feature extraction unit, a pooling layer may be connected after the convolutional layer to downsample the features extracted by the convolutional layer. Furthermore, the first sub-network also includes a downsampling layer located between adjacent network segments, which helps reduce feature dimensionality, compress the amount of data and parameters, reduce overfitting, and improve fault tolerance. The number of feature extraction units in each network segment may be the same; for example, each network segment may contain 3, 4 or 5 feature extraction units. The numbers may also be completely different; for example, the first network segment contains 3 feature extraction units, the second contains 4, and the third contains 5. The numbers may also be only partly the same; for example, the first network segment contains 3 feature extraction units, the second also contains 3, and the third contains 5. That is, the numbers can be set according to actual application requirements, which is not limited here.
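As a rough illustration of the layout described above, the following sketch lists the layers of a single-branch first sub-network in order. It builds no real layers: the function name, the string labels and the omission of the optional pooling layer are all simplifying assumptions made for illustration.

```python
from typing import List, Sequence


def describe_first_subnetwork(stage_sizes: Sequence[int]) -> List[str]:
    """List layer labels for a single-branch first sub-network.

    stage_sizes[i] is the number of feature extraction units in segment i + 1.
    """
    layers: List[str] = []
    for i, num_units in enumerate(stage_sizes, start=1):
        for j in range(1, num_units + 1):
            prefix = f"stage{i}.block{j}"
            # Each unit: convolution, then activation, then batch normalization.
            layers += [f"{prefix}.conv", f"{prefix}.act", f"{prefix}.bn"]
        if i < len(stage_sizes):
            # A downsampling layer sits between adjacent segments.
            layers.append(f"downsample{i}")
    return layers


# The three configurations mentioned in the text.
same = describe_first_subnetwork([3, 3, 3])
all_different = describe_first_subnetwork([3, 4, 5])
partly_same = describe_first_subnetwork([3, 3, 5])
```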
In a specific implementation scenario, several of the above network segments may share the same input node. In this case, the first sub-network may have multiple branch networks, and each branch network may include at least one sequentially connected network segment.
Please refer to FIG. 2, which is a schematic diagram of a framework of a first sub-network according to an embodiment of the present disclosure. As shown in FIG. 2, each dotted rectangle represents a network segment, and each network segment contains 4 feature extraction units. The first sub-network contains three branch networks connected in parallel: the first branch network is the first row of network segments (to simplify the diagram, only one network segment is drawn for the first branch network), the second branch network is the second row of network segments, and the third branch network is the third row of network segments (again, only one network segment is drawn). The three branch networks share the same input node. In other scenarios, a first sub-network containing a different number of branch networks may be set according to actual application requirements; for example, the first sub-network may include 2 branch networks, 4 branch networks, and so on, which is not limited herein.
In another specific implementation scenario, the network segments may also be connected in series. In this case, the first sub-network may have only one branch network, which includes a plurality of sequentially connected network segments.
Please refer to FIG. 3, which is a schematic diagram of a framework of a first sub-network according to another embodiment of the present disclosure. As shown in FIG. 3, each dotted rectangle represents a network segment, and each network segment contains 4 feature extraction units. The first sub-network includes two network segments connected in series, so in this case it contains only one branch network.
In yet another specific implementation scenario, in addition to the first sub-network, the original model may include another sub-network, for example a third sub-network, which is used to perform a preset task based on the extracted features. The preset task may be a target detection task, an image classification task, a scene segmentation task, and so on, which is not limited herein. A target detection task detects target objects in an image, for example vehicles or pedestrians. An image classification task assigns an image to a category, for example cat, dog or turtle. A scene segmentation task determines the category of each pixel in an image, for example which pixels belong to the lane, vehicles, green belts or the sky. These examples only illustrate possible uses in practice and do not limit the scope of use. The specific structure of the other sub-network can be set according to actual application requirements, which is not limited here. For example, when the preset task is a target detection task or an image classification task, the other sub-network may include several (e.g., 2 or 3) sequentially connected fully connected layers, a softmax layer, and so on. As another example, when the preset task is scene segmentation, the other sub-network may include a fully connected layer and a softmax layer, which is not limited herein.
在一个实施场景中,为了提高预训练的准确性,第一训练样本集可以为大规模数据集,即第一训练样本集中第一训练样本的数量大于预设数值(如,1000、5000、10000等等)。因此,可以利用第一训练样本集充分地对原始模型进行预训练,有利于提高后续获取的目标模型的准确性。在一个实施场景中,为了提高预训练的效率,第一子网络可以仅包括一路分支网络,且该路分支网络可以包括多个顺序连接的网络区段。在此情况下,可以在每次训练前利用预设选择策略,在每一网络区段中选择一特征提取单元,从而可以利用第一训练样本集,对每一网络区段中位于选择的特征提取单元之前的部分(包含该选择的特征提取单元)进行训练,以调整每一网络区段中位于选择的特征提取单元之前的部分的网络参数。上述方式,能够有利于提高预训练的效率。具体地,上述预设选择策略可以包括:在每一网络区段内随机选择一特征提取单元。In an implementation scenario, in order to improve the accuracy of pre-training, the first training sample set may be a large-scale data set, that is, the number of first training samples in the first training sample set is greater than a preset value (eg, 1000, 5000, 10000 etc). Therefore, the original model can be fully pre-trained by using the first training sample set, which is beneficial to improve the accuracy of the subsequently acquired target model. In an implementation scenario, in order to improve the efficiency of pre-training, the first sub-network may only include a one-way branch network, and the one-way branch network may include a plurality of sequentially connected network segments. In this case, a preset selection strategy can be used before each training to select a feature extraction unit in each network segment, so that the first training sample set can be used to analyze the selected features in each network segment. The portion preceding the extraction unit (including the selected feature extraction unit) is trained to adjust the network parameters of the portion preceding the selected feature extraction unit in each network segment. The above method can help to improve the efficiency of pre-training. Specifically, the above-mentioned preset selection strategy may include: randomly selecting a feature extraction unit in each network segment.
In a specific implementation scenario, the preset selection strategy may include randomly selecting an integer within a preset value range corresponding to each network segment. The upper limit of the preset value range is the number of feature extraction units contained in the corresponding network segment; for ease of description, it is denoted N_i, the number of feature extraction units in the i-th network segment. The lower limit of the preset value range may be 1, so an integer is randomly selected in the range 1 to N_i. For ease of description, the randomly selected integer is denoted S_i, the integer selected for the i-th network segment. The first training sample set is then used to train, in each network segment, the part up to and including the S_i-th feature extraction unit (that is, units 1 to S_i), so as to adjust the network parameters of that part.
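The selection strategy described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `sample_depths` and `active_units` and the segment sizes are hypothetical and do not come from the disclosure.

```python
import random

def sample_depths(units_per_segment, rng=random):
    """Pick S_i uniformly from 1..N_i for every network segment i."""
    return [rng.randint(1, n_i) for n_i in units_per_segment]

def active_units(depths):
    """Units 1..S_i of each segment are the ones trained in this iteration."""
    return [list(range(1, s_i + 1)) for s_i in depths]

# Hypothetical example: three segments with N = [4, 3, 5] units.
rng = random.Random(0)               # seeded only to make the sketch repeatable
depths = sample_depths([4, 3, 5], rng)
print(depths, active_units(depths))
```

A new set of S_i values would be drawn before every training iteration, so over many iterations each prefix of each segment is trained.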
As shown by the dashed arrows in FIG. 3, when the first training sample set is used to train, in each network segment, the part up to and including the selected feature extraction unit, the output of the feature extraction unit selected in each network segment is used as the input data of the next network segment. Referring to FIG. 3, for example, when the feature extraction unit selected in the first network segment is the second feature extraction unit, the output of the second feature extraction unit is used as the input data of the next network segment. As another example, when the feature extraction unit selected in the first network segment is the third feature extraction unit, the output of the third feature extraction unit is used as the input data of the next network segment. Other cases follow by analogy and are not listed one by one here.
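The data flow indicated by the dashed arrows can be illustrated as follows. This is a toy sketch, assuming each feature extraction unit is an ordinary Python callable; real units would be neural network blocks, and the name `forward` is hypothetical.

```python
def forward(x, segments, depths):
    """Run x through units 1..S_i of each segment in order.

    The output of the selected (S_i-th) unit of one segment becomes
    the input of the next segment; later units in the segment are skipped.
    """
    for units, s_i in zip(segments, depths):
        for unit in units[:s_i]:
            x = unit(x)
    return x

# Toy stand-ins for feature extraction units (assumption for illustration).
segments = [
    [lambda v: v + 1, lambda v: v * 2, lambda v: v - 3],  # segment 1, N_1 = 3
    [lambda v: v * 10, lambda v: v + 5],                  # segment 2, N_2 = 2
]

print(forward(1, segments, [2, 1]))  # (1 + 1) * 2 = 4, then 4 * 10 -> 40
print(forward(1, segments, [3, 2]))  # ((1 + 1) * 2) - 3 = 1, then 1 * 10 + 5 -> 15
```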
In one implementation scenario, when the first sub-network includes multiple branch networks, the preset selection strategy may be applied before each training iteration to select one branch network in the first sub-network and to select one feature extraction unit in each network segment of the selected branch network. The first training sample set is then used to train, in each network segment of the selected branch network, the part up to and including the selected feature extraction unit, so as to adjust the network parameters of that part. For the specific manner of selecting a feature extraction unit in a network segment, refer to the foregoing description, which is not repeated here.
In a specific implementation scenario, a branch network may be selected in the first sub-network in the same manner as a feature extraction unit is selected in a network segment. Specifically, an integer may be randomly selected within a preset value range whose upper limit is the number of branch networks contained in the first sub-network; for ease of description, this number is denoted N, meaning the first sub-network contains N branch networks in total. The lower limit of the preset value range may be 1, so an integer is randomly selected in the range 1 to N. For ease of description, the randomly selected integer is denoted S, meaning the S-th branch network is selected in the first sub-network. Referring to FIG. 2, the first sub-network shown there contains 3 branch networks, so an integer is randomly selected from 1 to 3; if, for example, 2 is selected, the second branch network S2 is taken as the selected branch network. A feature extraction unit is then selected in each network segment of the second branch network S2; for details, refer to the foregoing description, which is not repeated here.
In another specific implementation scenario, the branch network containing the largest number of feature extraction units in the first sub-network may be selected. A feature extraction unit is then selected in each network segment of that branch network; for details, refer to the foregoing description, which is not repeated here.
In one implementation scenario, pre-training of the original model may end when a preset end condition is satisfied. Specifically, the preset end condition may include that the number of times each first training sample has participated in training reaches a preset count threshold. The preset count threshold may be set according to actual application needs, for example 100, 120, or 150, which is not limited here.
Step S12: Obtain a target model by using a second sub-network and at least part of the structure of the pre-trained first sub-network.
In the embodiments of the present disclosure, the second sub-network performs the target task based on the features extracted by the first sub-network. In one implementation scenario, the target task may be any one of a target detection task, an image classification task, and a scene segmentation task. For the specific meanings of these tasks, refer to the foregoing description, which is not repeated here. In addition, as described above, the original model may further include a third sub-network that performs a preset task based on the extracted features, and the preset task may be the same as or different from the target task. For example, the preset task and the target task may both be image classification tasks, or both be target detection tasks. As another example, the preset task may be a target detection task while the target task is an image classification task, which is not limited here. Furthermore, the third sub-network may be the same as or different from the second sub-network. For example, the third sub-network may include one fully connected layer and one softmax layer, while the second sub-network includes two sequentially connected fully connected layers followed by a softmax layer. As another example, the second sub-network may, like the third sub-network, include one fully connected layer and one softmax layer, which is not limited here.
In one implementation scenario, different partial structures of the first sub-network may be used to obtain at least one candidate sub-network, and a candidate sub-network that satisfies a preset condition is chosen as the selected sub-network. The target model is then obtained from the selected sub-network and the second sub-network.
In a specific implementation scenario, a candidate sub-network includes at least one feature extraction unit in each network segment of the same branch network, and the feature extraction units in different candidate sub-networks are at least partially different. Each time a partial structure of the first sub-network is selected, one branch network may be selected and one feature extraction unit may be selected in each network segment of that branch network. The combination of the parts of each network segment up to and including the selected feature extraction unit is taken as a partial structure of the first sub-network, so that different partial structures of the first sub-network are obtained. Take as an example a first sub-network that includes only one branch network, and refer to FIG. 3. The first time a partial structure is selected, the third feature extraction unit may be randomly selected in the first network segment and the second feature extraction unit in the second network segment; the partial structure is then the combination of the part of the first network segment up to the third feature extraction unit and the part of the second network segment up to the second feature extraction unit. The second time a partial structure is selected, the second feature extraction unit may be randomly selected in the first network segment and the third feature extraction unit in the second network segment; the partial structure is then the combination of the part of the first network segment up to the second feature extraction unit and the part of the second network segment up to the third feature extraction unit. Further selections follow by analogy and are not listed one by one here. The number of selections may be set according to actual application needs; for example, 10, 15, or 20 may be used depending on the computational complexity, which is not limited here.
In another specific implementation scenario, the preset condition may include that the candidate model obtained from the candidate sub-network and the second sub-network satisfies a preset performance condition. Specifically, validation samples corresponding to the target task may be used to validate the candidate model obtained from the candidate sub-network and the second sub-network, yielding a performance score for the candidate model on the target task; whether the candidate model satisfies the preset performance condition is then determined from the performance score. For example, a candidate model may be considered to satisfy the preset performance condition when its performance score is the highest among all candidate models.
In yet another specific implementation scenario, the preset condition may further include that the number of feature extraction units in the candidate sub-network is not less than a preset number. The preset number may be set according to actual application needs, for example 4, 5, or 6, which is not limited here. The preset number can be used to constrain the complexity of the target model.
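The two preset conditions above (highest performance score, and at least a preset number of feature extraction units) can be combined as in the following sketch. It assumes a candidate sub-network is represented by the list of per-segment unit counts, and that `score_fn` is a hypothetical stand-in for validating the candidate model on the validation samples; neither representation is prescribed by the disclosure.

```python
def select_sub_network(candidates, score_fn, min_units):
    """Return the best-scoring candidate with at least `min_units` units.

    candidates: list of per-segment unit counts, e.g. [2, 1]
    score_fn:   hypothetical callable returning a validation score
    Returns None if no candidate has enough units.
    """
    eligible = [c for c in candidates if sum(c) >= min_units]
    if not eligible:
        return None
    return max(eligible, key=score_fn)

# Toy score (assumption): candidates closest to [2, 1] score highest.
toy_score = lambda c: -abs(c[0] - 2) - abs(c[1] - 1)
candidates = [[1, 1], [2, 1], [1, 2], [2, 2]]
print(select_sub_network(candidates, toy_score, min_units=3))  # [2, 1]
```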
In yet another specific implementation scenario, the selected sub-network and the second sub-network may be connected in sequence to obtain the target model.
In another implementation scenario, a candidate sub-network includes at least one feature extraction unit in each network segment of the same branch network, and the feature extraction units in different candidate sub-networks are at least partially different. In order to obtain a globally optimal solution, all the different partial structures of the first sub-network may be enumerated exhaustively, each partial structure being taken as a corresponding candidate sub-network. A candidate sub-network that satisfies the preset condition is chosen as the selected sub-network, and the target model is obtained from the selected sub-network and the second sub-network. For the manner of setting the preset condition, refer to the foregoing description, which is not repeated here.
In a specific implementation scenario, take as an example a first sub-network that includes only one branch network. When exhaustively enumerating all the different partial structures of the first sub-network, each network segment may be assigned a count value count_i (the count value of the i-th network segment) in order to avoid selecting the same structure twice. Each time a partial structure is selected, the first count_i feature extraction units of every network segment are combined to form that partial structure. The number of network segments in the first sub-network may be denoted N_s. The initial value of each count_i may be set to 1, and after each partial structure is selected the count is advanced by 1 until all the different partial structures of the first sub-network have been enumerated. As the example below shows, the count values advance like a multi-digit counter: the count of the last network segment increases first and carries over to the preceding segment once it exceeds that segment's number of units.

Take a first sub-network with 4 network segments, each containing 4 feature extraction units. For the first selection, the count values of the 4 network segments are 1, 1, 1, 1, and the partial structure is the combination of the first feature extraction unit of every network segment. For the second selection, the count values are 1, 1, 1, 2, and the partial structure is the combination of the first feature extraction unit of the first 3 network segments and the first 2 feature extraction units of the last network segment. For the third selection, the count values are 1, 1, 1, 3, and the partial structure is the combination of the first feature extraction unit of the first 3 network segments and the first 3 feature extraction units of the last network segment. For the fourth selection, the count values are 1, 1, 1, 4, and the partial structure is the combination of the first feature extraction unit of the first 3 network segments and all feature extraction units of the last network segment. For the fifth selection, the count values are 1, 1, 2, 1, and the partial structure is the combination of the first feature extraction unit of the first 2 network segments, the first 2 feature extraction units of the third network segment, and the first feature extraction unit of the last network segment. For the sixth selection, the count values are 1, 1, 2, 2, and the partial structure is the combination of the first feature extraction unit of the first 2 network segments and the first 2 feature extraction units of the last 2 network segments. Further selections follow by analogy and are not described here.
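The counter-based enumeration in this example corresponds to a Cartesian product over the ranges 1..N_i. The sketch below uses Python's standard `itertools.product`, whose rightmost element varies fastest and therefore reproduces the order of count values given in the example; the function name `enumerate_structures` is hypothetical.

```python
from itertools import product

def enumerate_structures(units_per_segment):
    """List every combination of count values (count_1, ..., count_Ns).

    Each combination means: take the first count_i units of segment i.
    The last segment's count advances first, like a multi-digit counter.
    """
    ranges = (range(1, n_i + 1) for n_i in units_per_segment)
    return [list(counts) for counts in product(*ranges)]

# 4 segments with 4 units each, as in the example above.
structures = enumerate_structures([4, 4, 4, 4])
print(len(structures))   # 4 ** 4 = 256 partial structures
print(structures[:6])
# [[1, 1, 1, 1], [1, 1, 1, 2], [1, 1, 1, 3], [1, 1, 1, 4], [1, 1, 2, 1], [1, 1, 2, 2]]
```

The count grows multiplicatively with the number of segments, which is why exhaustive search gives a global optimum only at a higher validation cost than random or greedy selection.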
In one implementation scenario, in order to improve the accuracy of the adjustment in the network structure dimension, the original model may be trained with the second training sample set to adjust its network parameters before the target model is obtained from the second sub-network and at least part of the structure of the pre-trained first sub-network. The network parameters of the original model are thus first adjusted for the target task, which helps improve the accuracy of the adjustment in the network structure dimension. In this case, the at least part of the structure of the first sub-network carries the network parameters as adjusted by training on the second training sample set; that is, both the network parameters and the network structure are adjusted during transfer learning.
In a specific implementation scenario, the number of first training samples in the first training sample set is greater than the number of second training samples in the second training sample set. With the embodiments of the present disclosure, a target model suited to the target task can therefore be obtained from large-scale data samples (the first training sample set) together with small-scale data samples corresponding to the target task (the second training sample set). This helps reduce the difficulty of collecting the small-scale data samples and the workload of labeling the second training sample set, and thus further improves the efficiency of obtaining the target model. Specifically, the number of first training samples in the first training sample set may be 5000, 10000, 15000, and so on, and correspondingly the number of second training samples in the second training sample set may be 100, 200, 300, and so on. These numbers may be set according to actual use, which is not limited here.
Step S13: Train the target model with the second training sample set corresponding to the target task, so as to adjust the network parameters of the target model.
In one implementation scenario, the target model may be trained using only the second training sample set corresponding to the target task, so as to adjust the network parameters of the target model. Specifically, the number of second training samples in the second training sample set is smaller than the number of first training samples in the first training sample set; for example, the second training sample set consists of small-scale data samples while the first training sample set consists of large-scale data samples. For details, refer to the foregoing description, which is not repeated here.
In another implementation scenario, in order to improve the accuracy of the target model, the target model may first be trained with the first training sample set to adjust its network parameters, and then trained with the second training sample set corresponding to the target task to adjust its network parameters again.
In the above embodiment, the original model is first pre-trained with the first training sample set to adjust its network parameters, where the original model includes a first sub-network for feature extraction. A target model is then obtained from a second sub-network and at least part of the structure of the pre-trained first sub-network, where the second sub-network performs the target task based on the features extracted by the first sub-network. Finally, the target model is trained with the second training sample set corresponding to the target task to adjust the network parameters of the target model. As a result, adjustment is possible not only in the "network parameter dimension", using the second training sample set corresponding to the target task, but also in the "network structure dimension", by adjusting the original model. This greatly increases the degrees of freedom available for network adjustment, exploits the potential of the pre-trained original model in both dimensions, and helps improve the performance of the target model.
Referring to FIG. 4, FIG. 4 is a schematic flowchart of step S12 in FIG. 1 according to an embodiment of the present disclosure. Specifically, in this embodiment the original model has a single-branch network structure; that is, the first sub-network includes one branch network, and that branch network includes a plurality of sequentially connected network segments. Each network segment includes at least one sequentially connected feature extraction unit. A candidate sub-network includes at least one feature extraction unit in each network segment, and the feature extraction units in different candidate sub-networks are at least partially different. Step S12 may include steps S41 to S46.
Step S41: Select the first feature extraction unit in each network segment to obtain an initial pending sub-network.
The first feature extraction unit in each network segment is selected, and these first feature extraction units are connected together to obtain the initial pending sub-network. Referring to FIG. 3, as shown by the dashed arrow, the output of the first feature extraction unit in the first network segment is used as the input data of the second network segment. The initial pending sub-network is denoted [1,1].
Step S42: For the pending sub-network, add at least one feature extraction unit to obtain candidate sub-networks, where the added feature extraction unit follows the feature extraction unit already selected in at least one network segment.
In the embodiments of the present disclosure, the at least one feature extraction unit may be one feature extraction unit, two feature extraction units, and so on, which is not limited here. For example, when the precision required of the network structure adjustment is relatively low, two, three, or more feature extraction units may be added at a time; when the required precision is higher, one feature extraction unit may be added at a time. This may be set according to actual application needs and is not limited here.
Each network segment may in turn be taken as the target segment. The index of the feature extraction unit selected in the target segment in the previous step is increased by 1, that is, one feature extraction unit is added, while the number of feature extraction units selected in the previous step in the other network segments of the first sub-network is kept unchanged. This yields the candidate sub-network corresponding to that target segment. Referring to FIG. 3, the first network segment and the second network segment are each taken as the target segment.

With the first network segment as the target segment, its first two feature extraction units (in step S41 the first feature extraction unit was selected in the first network segment, so in this step the selection becomes the first two) and the first feature extraction unit of the second network segment form the candidate sub-network of the first target segment. Specifically, as shown by the dashed arrow in FIG. 3, the output of the second feature extraction unit in the first target segment is used as the input data of the second network segment. With the second network segment as the target segment, its first two feature extraction units and the first feature extraction unit of the first network segment form the candidate sub-network of the second target segment. Specifically, as shown by the dashed arrow in FIG. 3, the output of the first feature extraction unit in the first network segment is used as the input data of the second target segment.

Still referring to FIG. 3, for ease of description the candidate sub-network corresponding to the first target segment is denoted [2,1], meaning it consists of the first two feature extraction units of the first network segment and the first feature extraction unit of the second network segment. The candidate sub-network corresponding to the second target segment is denoted [1,2], meaning it consists of the first feature extraction unit of the first network segment and the first two feature extraction units of the second network segment. Other cases follow by analogy and are not listed one by one here.
In one implementation scenario, each network segment may in turn be taken as the target segment, and the index of the feature extraction unit selected in the target segment in the previous step may be increased by an integer such as 2 or 3, that is, two or three feature extraction units are added, while the number of feature extraction units selected in the previous step in the other network segments of the first sub-network is kept unchanged. This yields the candidate sub-network corresponding to that target segment. The number of feature extraction units added may be set according to actual application needs, which is not limited here.
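Step S42 can be sketched as a function that derives candidate sub-networks from the current pending sub-network, using the bracket notation introduced above. The function name `expand` and its `step` parameter are hypothetical; skipping a segment that has no further units to add is an assumption of this sketch, since the text does not say how that case is handled.

```python
def expand(pending, units_per_segment, step=1):
    """Generate one candidate per target segment (step S42).

    pending:           current per-segment unit counts, e.g. [1, 1]
    units_per_segment: N_i for every segment
    step:              number of units added to the target segment
    """
    candidates = []
    for i, n_i in enumerate(units_per_segment):
        if pending[i] + step <= n_i:   # assumption: skip exhausted segments
            candidate = list(pending)  # other segments stay unchanged
            candidate[i] += step
            candidates.append(candidate)
    return candidates

# Two segments with 3 units each, starting from the initial [1, 1].
print(expand([1, 1], [3, 3]))          # [[2, 1], [1, 2]]
print(expand([2, 1], [3, 3]))          # [[3, 1], [2, 2]]
print(expand([1, 1], [3, 3], step=2))  # [[3, 1], [1, 3]]
```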
Step S43: Combine each of the candidate sub-networks with the second sub-network to form candidate models, and take the candidate sub-network corresponding to the best-performing candidate model as the pending sub-network.
Validation samples corresponding to the target task may be used to validate each of the candidate models obtained from the candidate sub-networks and the second sub-network, yielding a performance score for each candidate model on the target task; the best-performing candidate model is determined from the performance scores. The validation samples may be the same as the second training sample set or a part of it. For details, refer to the description in the foregoing embodiments, which is not repeated here. Suppose that, after validation with the validation samples, the candidate sub-network [2,1] is taken as the pending sub-network.
Step S44: Determine whether the number of feature extraction units in the pending sub-network is less than the preset number; if so, perform step S42, otherwise perform step S45.
本公开实施例中,预设数量可以根据实际应用需要进行预先设置。具体地,可以根据目标模型的期望复杂度进行设置。例如,可以设置为4、5、6等等,在此不做限定。例如,当预设数量为4时,由于待定子网络[2,1]中特征提取单元数量为3,故小于预设数量,则可以执行步骤S42。又例如,当预设数量为3时,由于待定子网络[2,1]中特征提取单元数量为3,不小于预设数量,则可以执行步骤S45,即直接将待定子网络[2,1],作为选中子网络。其他情况可以以此类推,在此不再一一举例。In the embodiment of the present disclosure, the preset number may be preset according to actual application requirements. Specifically, it can be set according to the desired complexity of the target model. For example, it can be set to 4, 5, 6, etc., which is not limited here. For example, when the preset number is 4, since the number of feature extraction units in the undetermined sub-network [2, 1] is 3, which is less than the preset number, step S42 may be executed. For another example, when the preset number is 3, since the number of feature extraction units in the undetermined sub-network [2,1] is 3, which is not less than the preset number, step S45 can be executed, that is, the undetermined sub-network [2,1] can be directly ], as the selected subnet. Other situations can be deduced by analogy, and no examples are given here.
在一个实施场景中,第一子网络可以包括第一数量个特征提取单元,第一子网络可以包括第二数量个网络区段,则预设数量可以设置为小于第一数量,且大于第二数量。In one implementation scenario, the first sub-network may include a first number of feature extraction units, the first sub-network may include a second number of network segments, and the preset number may be set to be smaller than the first number and greater than the second number quantity.
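The branch taken in step S44 and the bound on the preset number described above can be written as two small predicates. The names `next_step` and `preset_in_range` are illustrative assumptions, as is the example first sub-network with six units in two segments.

```python
from typing import List


def next_step(pending: List[int], preset: int) -> str:
    """Step S44: keep growing (S42) while the unit count is below the preset number."""
    return "S42" if sum(pending) < preset else "S45"


def preset_in_range(preset: int, total_units: int, num_segments: int) -> bool:
    """Check the bound stated above: second number < preset number < first number."""
    return num_segments < preset < total_units


print(next_step([2, 1], preset=4))  # S42
print(next_step([2, 1], preset=3))  # S45
print(preset_in_range(4, total_units=6, num_segments=2))  # True
```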
In one implementation scenario, the preset number may be set according to the computational complexity of the target model that is to complete the target task.
The description continues with the example in which, after validation with the validation samples, the candidate sub-network [2,1] is taken as the pending sub-network. When step S44 determines that the number of feature extraction units in the pending sub-network [2,1] is less than the preset number, the method returns to step S42. At this point, new candidate sub-networks can be obtained based on the pending sub-network [2,1].
In one implementation scenario, the first feature extraction unit located after the selected feature extraction units in the target segment may be appended after the selected feature extraction units of that segment, yielding a new candidate sub-network. Referring again to FIG. 3 and taking the pending sub-network [2,1] as an example, the first network segment may first be taken as the target segment: the first feature extraction unit (the third feature extraction unit) located after the selected feature extraction unit (the second feature extraction unit) is appended after the selected feature extraction unit (the second feature extraction unit) of the target segment, yielding a new candidate sub-network. For ease of description it may be denoted [3,1]; that is, the new candidate sub-network consists of the first three feature extraction units of the first network segment and the first feature extraction unit of the second network segment. Similarly, the second network segment may then be taken as the target segment: the first feature extraction unit (the second feature extraction unit) located after the selected feature extraction unit (the first feature extraction unit) in the target segment is appended after the selected feature extraction unit (the first feature extraction unit) of the target segment, yielding another new candidate sub-network, which may be denoted [2,2]. Other cases follow by analogy and are not enumerated here.
As the above steps show, for a given number of feature extraction units, the best-performing sub-network among the current candidate sub-networks is selected as the pending sub-network; at least one feature extraction unit is then added to the previous count, new candidate sub-networks are derived from the previous pending sub-network, and the best-performing one among them is selected as the new pending sub-network. Continuing the example above, the pending sub-network is chosen from the candidate sub-networks [2,1] and [1,2]; assuming [2,1] performs better, [2,1] becomes the pending sub-network, and the next pending sub-network is then chosen from [3,1] and [2,2], which are derived from [2,1]. In this case, [1,3] is not evaluated as a candidate sub-network in the second round, which reduces the number of candidate sub-networks whose performance scores must be computed in step S43. When there are many network segments, this approach can save a significant amount of computation.
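To make the saving concrete, the sketch below compares, for one round, the candidates reachable from the current pending sub-network with every allocation that has the same total number of units. It is an illustration under assumed segment sizes (three units per segment in the two-segment case, four units per segment in the four-segment case); `all_allocations` and `greedy_neighbours` are hypothetical helper names.

```python
from itertools import product
from typing import List


def all_allocations(segment_sizes: List[int], total: int) -> List[List[int]]:
    """Every way to keep at least one leading unit per segment with `total` units overall."""
    ranges = [range(1, size + 1) for size in segment_sizes]
    return [list(combo) for combo in product(*ranges) if sum(combo) == total]


def greedy_neighbours(pending: List[int], segment_sizes: List[int]) -> List[List[int]]:
    """Candidates obtained by adding one unit to a single segment of the pending sub-network."""
    neighbours = []
    for index, count in enumerate(pending):
        if count < segment_sizes[index]:
            candidate = list(pending)
            candidate[index] += 1
            neighbours.append(candidate)
    return neighbours


print(all_allocations([3, 3], 4))         # [[1, 3], [2, 2], [3, 1]]
print(greedy_neighbours([2, 1], [3, 3]))  # [[3, 1], [2, 2]]
print(len(all_allocations([4, 4, 4, 4], 8)))               # 31
print(len(greedy_neighbours([2, 2, 2, 1], [4, 4, 4, 4])))  # 4
```

With two segments the difference in one round is a single candidate ([1,3] is skipped); with four segments of four units each, a round at eight units evaluates at most 4 candidates instead of 31, which matches the observation that the saving grows with the number of segments.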
Please also refer to FIG. 5, which is a schematic framework diagram of a pending sub-network according to an embodiment of the present disclosure. As shown in FIG. 5, feature extraction units drawn as solid rectangles are selected, and feature extraction units drawn as dashed rectangles are not selected. The pending sub-network shown in FIG. 5 is [2,2]. Taking the pending sub-network [2,2] and a preset number of 4 as an example, the number of feature extraction units in the pending sub-network equals 4, so step S45 below may be performed; that is, the pending sub-network [2,2] may be taken as the selected sub-network.
Step S45: Take the pending sub-network as the selected sub-network.
Specifically, when the number of feature extraction units in the pending sub-network is not less than the preset number, the pending sub-network may be taken as the selected sub-network. In this way, the best-performing selected sub-network can be obtained under the model complexity constrained by the preset number.
Step S46: Obtain the target model using the selected sub-network and the second sub-network.
Specifically, the selected sub-network and the second sub-network may be connected in sequence to obtain the target model.
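The sequential connection in step S46 can be pictured as concatenating the kept leading units of each segment with the layers of the second sub-network. The sketch below uses plain strings as stand-ins for layers; the function name `connect` and all layer names are assumptions made for illustration.

```python
from typing import List


def connect(first_subnetwork: List[List[str]], selection: List[int],
            second_subnetwork: List[str]) -> List[str]:
    """Keep the first selection[i] units of segment i, then append the second sub-network."""
    layers: List[str] = []
    for segment, keep in zip(first_subnetwork, selection):
        layers.extend(segment[:keep])  # leading units of this segment only
    layers.extend(second_subnetwork)   # task-specific part comes last
    return layers


segments = [["s1u1", "s1u2", "s1u3"], ["s2u1", "s2u2", "s2u3"]]
print(connect(segments, [2, 2], ["head"]))
# ['s1u1', 's1u2', 's2u1', 's2u2', 'head']
```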
In the embodiments of the present disclosure, the above method not only constrains the target model in terms of "model complexity" and "model performance", but also improves the efficiency of acquiring the target model.
In one implementation scenario, when the first sub-network includes multiple branch networks, one branch network may first be selected from the first sub-network, and steps S41 to S46 are then performed for that branch network.
Please refer to FIG. 6, which is a schematic flowchart of a method for acquiring a target model according to another embodiment of the present disclosure. Specifically, in this embodiment, the original model includes a first sub-network for feature extraction; the first sub-network includes one branch network, the branch network includes multiple network segments connected in sequence, and each network segment includes at least one feature extraction unit connected in sequence. The method may include steps S601 to S612.
Step S601: Before each training pass of the original model, select one feature extraction unit in each network segment using a preset selection strategy.
For details, reference may be made to the relevant descriptions in the foregoing disclosed embodiments, which are not repeated here.
Step S602: Using the first training sample set, train the part of each network segment located before the selected feature extraction unit, so as to adjust the network parameters of that part.
For details, reference may be made to the relevant descriptions in the foregoing disclosed embodiments, which are not repeated here.
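Steps S601 and S602 can be sketched as sampling one unit per segment and activating only the leading part of each segment for that training pass. Two points in the sketch are assumptions rather than statements of the disclosure: the preset selection strategy is taken to be uniform random choice, and the trained part is taken to include the selected unit itself. The name `sample_active_units` is hypothetical.

```python
import random
from typing import List, Tuple


def sample_active_units(segment_sizes: List[int],
                        rng: random.Random) -> Tuple[List[int], List[List[int]]]:
    """Pick one unit per segment and list the unit indices trained in this pass."""
    # Assumed strategy: uniform random choice of a unit index (1-based) per segment.
    selection = [rng.randint(1, size) for size in segment_sizes]
    # Assumed scope: units 1..k of each segment are trained, the rest are skipped.
    active = [list(range(1, k + 1)) for k in selection]
    return selection, active


rng = random.Random(0)  # fixed seed only so that repeated runs are reproducible
for _ in range(3):
    selection, active = sample_active_units([3, 3], rng)
    print(selection, active)
```

Over many passes each prefix of each segment is visited, which is consistent with the stated aim of training every part of the first sub-network.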
Step S603: Train the original model using the second training sample set corresponding to the target task, so as to adjust the network parameters of the original model.
For details, reference may be made to the relevant descriptions in the foregoing disclosed embodiments, which are not repeated here.
Step S604: Select the first feature extraction unit in each network segment to obtain the initial pending sub-network.
For details, reference may be made to the relevant descriptions in the foregoing disclosed embodiments, which are not repeated here.
Step S605: For the pending sub-network, add at least one feature extraction unit to obtain candidate sub-networks, where each added feature extraction unit is located after the feature extraction units selected in at least one network segment.
For details, reference may be made to the relevant descriptions in the foregoing disclosed embodiments, which are not repeated here.
Step S606: Combine each of the multiple candidate sub-networks with the second sub-network to form candidate models, and select, as the pending sub-network, the candidate sub-network corresponding to the best-performing candidate model among the multiple candidate models.
In the embodiments of the present disclosure, the second sub-network is used to perform the target task based on the features extracted by the first sub-network. For details, reference may be made to the relevant descriptions in the foregoing disclosed embodiments, which are not repeated here.
Step S607: Determine whether the number of feature extraction units in the pending sub-network is less than the preset number; if so, perform step S605; otherwise, perform step S608.
For details, reference may be made to the relevant descriptions in the foregoing disclosed embodiments, which are not repeated here.
Step S608: Obtain new candidate sub-networks based on the pending sub-network.
For details, reference may be made to the relevant descriptions in the foregoing disclosed embodiments, which are not repeated here.
Step S609: Take the pending sub-network as the selected sub-network.
For details, reference may be made to the relevant descriptions in the foregoing disclosed embodiments, which are not repeated here.
Step S610: Obtain the target model using the selected sub-network and the second sub-network.
Referring again to FIG. 5, when the pending sub-network is the sub-network [2,2] shown in FIG. 5 and the preset number is 4, the pending sub-network [2,2] may be taken as the selected sub-network, and the target model is obtained using the selected sub-network and the second sub-network. Specifically, the selected sub-network and the second sub-network may be connected in sequence to obtain the target model.
For details, reference may be made to the relevant descriptions in the foregoing disclosed embodiments, which are not repeated here.
Step S611: Train the target model using the first training sample set, so as to adjust the network parameters of the target model.
Referring again to FIG. 5, when the selected sub-network is the pending sub-network [2,2] shown in FIG. 5, the target model formed by the pending sub-network [2,2] and the second sub-network may be trained using the first training sample set, so as to adjust the network parameters of the target model.
For details, reference may be made to the relevant descriptions in the foregoing disclosed embodiments, which are not repeated here.
Step S612: Train the target model using the second training sample set corresponding to the target task, so as to adjust the network parameters of the target model.
Referring again to FIG. 5, when the selected sub-network is the pending sub-network [2,2] shown in FIG. 5, the target model formed by the pending sub-network [2,2] and the second sub-network may be further trained using the second training sample set, so as to adjust the network parameters of the target model.
For details, reference may be made to the relevant descriptions in the foregoing disclosed embodiments, which are not repeated here.
In the embodiments of the present disclosure, the above method helps improve the efficiency of obtaining the selected sub-network at the level of "network structure adjustment". Because the original model is adjusted at both the "network parameter adjustment" level and the "network structure adjustment" level, the freedom of network adjustment is greatly increased, and the potential of the pre-trained original model can be fully exploited in both the "network parameter dimension" and the "network structure dimension", which helps improve the performance of the target model.
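The order of parameter adjustment and structure adjustment in steps S601 to S612 can be summarised as a small driver. This is only a sketch of the ordering: every callable (`pretrain`, `train`, `search`, `connect`) is a hypothetical placeholder supplied by the caller, and the stubs in `demo` merely record the sequence of calls.

```python
from typing import Any, Callable, List, Tuple


def acquire_target_model(original: Any, head: Any, first_set: Any, second_set: Any,
                         pretrain: Callable, train: Callable,
                         search: Callable, connect: Callable) -> Any:
    """Run the stages in the order described for steps S601 to S612."""
    pretrain(original, first_set)      # S601-S602: pre-train on the first sample set
    train(original, second_set)        # S603: adjust parameters on the target-task set
    selected = search(original, head)  # S604-S609: structure search for the selected sub-network
    target = connect(selected, head)   # S610: connect selected and second sub-networks
    train(target, first_set)           # S611: train the target model on the first set
    train(target, second_set)          # S612: train the target model on the second set
    return target


def demo() -> List[Tuple[str, str, str]]:
    """Record the call order using string stand-ins for models and data sets."""
    log: List[Tuple[str, str, str]] = []
    target = acquire_target_model(
        "original", "head", "set1", "set2",
        pretrain=lambda model, data: log.append(("pretrain", model, data)),
        train=lambda model, data: log.append(("train", model, data)),
        search=lambda model, head: log.append(("search", model, head)) or "selected",
        connect=lambda selected, head: f"{selected}+{head}",
    )
    log.append(("done", target, ""))
    return log


for entry in demo():
    print(entry)
```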
In one implementation scenario, when the first sub-network includes multiple branch networks, one branch network may first be selected from the first sub-network, and steps S601 to S612 are then performed for that branch network.
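Please refer to FIG. 7, which is a schematic framework diagram of an apparatus 70 for acquiring a target model according to an embodiment of the present disclosure. The apparatus 70 may include a first training module 71, a model acquisition module 72, and a second training module 73. The first training module 71 is configured to pre-train the original model using the first training sample set so as to adjust the network parameters of the original model, where the original model may include a first sub-network for feature extraction. The model acquisition module 72 is configured to obtain the target model using a second sub-network and at least part of the structure of the pre-trained first sub-network, where the second sub-network is used to perform the target task based on the features extracted by the first sub-network. The second training module 73 is configured to train the target model using the second training sample set corresponding to the target task so as to adjust the network parameters of the target model.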
In some disclosed embodiments, the first sub-network may include multiple network segments, each network segment may include at least one feature extraction unit connected in sequence, and the feature extraction unit is used to perform feature extraction.
In some disclosed embodiments, the first sub-network may include at least one branch network, and each branch network includes at least one network segment connected in sequence. The first sub-network can therefore be configured as a "single-chain" single-branch network or as a "multi-chain" multi-branch network, so that the target model can be obtained from a multi-branch network and can also be obtained from a single-branch network, which helps broaden the scope of use.
In some disclosed embodiments, the model acquisition module 72 may include a structure search sub-module configured to obtain at least one candidate sub-network using different partial structures of the first sub-network and to select a candidate sub-network that satisfies a preset condition as the selected sub-network; the model acquisition module 72 may also include a model construction module configured to obtain the target model using the selected sub-network and the second sub-network.
By obtaining at least one candidate sub-network from different partial structures of the first sub-network, selecting a candidate sub-network that satisfies the preset condition as the selected sub-network, and obtaining the target model from the selected sub-network and the second sub-network, the adjustment space in the "network structure dimension" is enlarged, which in turn helps improve the performance of the target model.
In some disclosed embodiments, the preset condition may include: the candidate model obtained from the candidate sub-network and the second sub-network satisfies a preset performance condition.
In some disclosed embodiments, the preset condition may further include: the number of feature extraction units in the candidate sub-network reaches the preset number.
Because the number of feature extraction units reflects the complexity of the target model to some extent, and the preset performance condition reflects the performance of the target model to some extent, the target model can be constrained in terms of both "model complexity" and "model performance".
In some disclosed embodiments, a candidate sub-network may include at least one feature extraction unit from each network segment of the same branch network, and the feature extraction units in different candidate sub-networks are at least partially different.
In some disclosed embodiments, the structure search sub-module may include an initialization unit configured to select the first feature extraction unit in each network segment of the same branch network to obtain an initial pending sub-network; the initialization unit is further configured to add at least one feature extraction unit to the pending sub-network to obtain candidate sub-networks, where each added feature extraction unit is located after the feature extraction units selected in its network segment. The structure search sub-module may include a performance evaluation unit configured to combine each of the multiple candidate sub-networks with the second sub-network to form candidate models and to select, as the pending sub-network, the candidate sub-network corresponding to the best-performing candidate model among the multiple candidate models. The structure search sub-module includes a repeated search unit configured to, when the number of feature extraction units in the pending sub-network is less than the preset number, obtain new candidate sub-networks based on the pending sub-network and select, as the new pending sub-network, the candidate sub-network corresponding to the best-performing candidate model among the candidate models formed from the new candidate sub-networks and the second sub-network. The structure search sub-module may include a selection acquisition unit configured to take the pending sub-network as the selected sub-network when the number of feature extraction units in the pending sub-network is not less than the preset number.
In some disclosed embodiments, the initialization unit may be configured to take each network segment in turn as the target segment and to increase the number of feature extraction units selected in the target segment by an integer value while keeping the number of feature extraction units selected in the other network segments unchanged, yielding the candidate sub-network corresponding to that target segment. The repeated search unit may be configured to add at least one feature extraction unit to the selected sub-network to obtain a new candidate sub-network, where each added feature extraction unit is located after the feature extraction units selected in its network segment.
By taking each network segment in turn as the target segment and using the first feature extraction unit of each target segment to obtain the initial pending sub-network, the network structure adjustment can start from the head of each network segment of the first sub-network. By increasing the number of feature extraction units selected in the target segment by an integer value while keeping the number selected in the other network segments unchanged to obtain the candidate sub-network corresponding to that target segment, the feature extraction units of different network segments can be adjusted one by one in the subsequent adjustment process, which helps improve the precision of the network adjustment.
In some disclosed embodiments, the performance evaluation unit may be configured to validate the candidate model obtained from the candidate sub-network and the second sub-network using validation samples corresponding to the target task, yielding a performance score of the candidate model on the target task.
In some disclosed embodiments, the first sub-network includes a first number of feature extraction units and a second number of network segments, and the preset number is less than the first number and greater than or equal to the second number.
By validating the candidate model obtained from the candidate sub-network and the second sub-network using validation samples corresponding to the target task, obtaining a performance score of the candidate model on the target task, and determining from the performance score whether the candidate model satisfies the preset performance condition, the accuracy of choosing the selected sub-network can be improved. In addition, because the first sub-network includes a first number of feature extraction units and a second number of network segments, and the preset number is less than the first number and greater than or equal to the second number, the complexity of the target model can be reduced.
In some disclosed embodiments, the first sub-network includes one branch network, the branch network includes multiple network segments connected in sequence, and each network segment includes at least one feature extraction unit connected in sequence. The first training module 71 includes a unit selection sub-module configured to select, before each training pass, one feature extraction unit in each network segment using a preset selection strategy, and a sample training sub-module configured to train, using the first training sample set, the part of each network segment located before the selected feature extraction unit, so as to adjust the network parameters of that part.
In some disclosed embodiments, the first sub-network includes multiple branch networks, each branch network includes at least one network segment connected in sequence, and each network segment includes at least one feature extraction unit connected in sequence. The unit selection sub-module is further configured to, before each training pass and using a preset selection strategy, select one branch network in the first sub-network and select one feature extraction unit in each network segment of the selected branch network. The sample training sub-module is configured to train, using the first training sample set, the part of each such network segment located before the selected feature extraction unit, so as to adjust the network parameters of that part.
With the above configuration, every part of the first sub-network can be fully trained over multiple training passes, and pre-training efficiency is improved.
In some disclosed embodiments, one branch network may be selected at random in the first sub-network, and one feature extraction unit may be selected in each network segment of the selected branch network.
In some disclosed embodiments, the branch network containing the largest number of feature extraction units may be selected in the first sub-network, and one feature extraction unit may be selected in each network segment of the selected branch network.
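The two branch-selection options just described (random choice, or the branch with the most feature extraction units) can be sketched as follows. Each branch is represented here by the list of its segment sizes; that representation, the function name `choose_branch`, and the strategy labels are assumptions made for the example.

```python
import random
from typing import List, Optional


def choose_branch(branches: List[List[int]], strategy: str = "random",
                  rng: Optional[random.Random] = None) -> int:
    """Return the index of the branch to use for this training pass.

    branches[b] lists the number of feature extraction units in each segment of branch b.
    """
    if strategy == "largest":
        # Branch whose segments contain the most feature extraction units in total.
        return max(range(len(branches)), key=lambda b: sum(branches[b]))
    rng = rng or random.Random()
    return rng.randrange(len(branches))  # uniform random choice of a branch


branches = [[2, 2], [3, 3, 2], [1]]
print(choose_branch(branches, "largest"))  # 1
```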
In some disclosed embodiments, the first sub-network may further include a downsampling layer located between adjacent network segments.
In some disclosed embodiments, a feature extraction unit may include a convolutional layer, an activation layer, and a batch processing layer connected in sequence.
Configuring the feature extraction unit to include a convolutional layer, an activation layer, and a batch processing layer connected in sequence helps improve the learning effect of the feature extraction unit during training. Placing a downsampling layer between adjacent network segments of the first sub-network helps reduce feature dimensionality, compress the amount of data and the number of parameters, reduce overfitting, and improve fault tolerance.
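The layout described above (units made of three sequential layers, with a downsampling layer between adjacent segments) can be listed programmatically. The sketch only generates layer names; `first_subnetwork_layout` and the naming scheme are hypothetical, and no particular deep-learning framework is implied.

```python
from typing import List


def first_subnetwork_layout(segment_sizes: List[int]) -> List[str]:
    """List layer names in order: conv, activation, batch layer per unit; downsample between segments."""
    layers: List[str] = []
    last = len(segment_sizes)
    for seg, size in enumerate(segment_sizes, start=1):
        for unit in range(1, size + 1):
            prefix = f"seg{seg}.unit{unit}"
            layers += [f"{prefix}.conv", f"{prefix}.activation", f"{prefix}.batch"]
        if seg < last:
            layers.append(f"downsample{seg}")  # only between adjacent segments
    return layers


for name in first_subnetwork_layout([1, 2]):
    print(name)
```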
In some disclosed embodiments, the apparatus 70 for acquiring a target model may further include a third training module configured to train the original model using the second training sample set so as to adjust the network parameters of the original model.
Training the original model with the second training sample set corresponding to the target task after pre-training, so as to adjust the network parameters of the original model, helps improve the accuracy of the subsequent adjustment in the network structure dimension.
In some disclosed embodiments, the apparatus 70 for acquiring a target model may further include a fourth training module configured to train the target model using the first training sample set so as to adjust the network parameters of the target model.
After the adjustment in the network structure dimension is complete, first training the target model with the first training sample set and then training it again with the second training sample set corresponding to the target task helps improve the performance of the target model.
In some disclosed embodiments, the original model may further include a third sub-network, which is used to perform a preset task based on the extracted features, where the preset task is the same as or different from the target task.
Configuring the original model to include a third sub-network that performs a preset task based on the extracted features, where the preset task is the same as or different from the target task, further broadens the range of original models from which a target model can be obtained.
In some disclosed embodiments, the number of first training samples in the first training sample set may be greater than the number of second training samples in the second training sample set.
Setting the number of first training samples in the first training sample set to be greater than the number of second training samples in the second training sample set helps reduce the sample annotation workload for the target task.
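Please refer to FIG. 8, which is a schematic framework diagram of an electronic device 80 according to an embodiment of the present disclosure. The electronic device 80 includes a memory 81 and a processor 82 coupled to each other, and the processor 82 is configured to execute program instructions stored in the memory 81 to implement the steps of any of the above embodiments of the method for acquiring a target model. In a specific implementation scenario, the electronic device 80 may include, but is not limited to, a microcomputer or a server; the electronic device 80 may also include mobile devices such as a notebook computer or a tablet computer, which is not limited here.
Specifically, the processor 82 is configured to control itself and the memory 81 to implement the steps of any of the above embodiments of the method for acquiring a target model. The processor 82 may also be referred to as a CPU (Central Processing Unit). The processor 82 may be an integrated circuit chip with signal processing capability. The processor 82 may also be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The general-purpose processor may be a microprocessor or any conventional processor. In addition, the processor 82 may be implemented jointly by multiple integrated circuit chips.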
In the embodiments of the present disclosure, adjustment can be made not only in the "network parameter dimension" using the second training sample set corresponding to the target task, but also in the "network structure dimension" of the original model. This greatly increases the freedom of network adjustment, allows the potential of the pre-trained original model to be fully exploited in both the "network parameter dimension" and the "network structure dimension", and helps improve the performance of the target model.
Please refer to FIG. 9, which is a schematic framework diagram of an embodiment of a computer-readable storage medium 90 of the present disclosure. The computer-readable storage medium 90 stores program instructions 901 that can be run by a processor, and the program instructions 901 are used to implement the steps of any of the above embodiments of the method for acquiring a target model.
In the embodiments of the present disclosure, adjustment can be made not only in the "network parameter dimension" using the second training sample set corresponding to the target task, but also in the "network structure dimension" of the original model. This greatly increases the freedom of network adjustment, allows the potential of the pre-trained original model to be fully exploited in both the "network parameter dimension" and the "network structure dimension", and helps improve the performance of the target model.
In some embodiments, the functions or modules of the apparatus provided in the embodiments of the present disclosure may be used to execute the methods described in the above method embodiments. For their specific implementation, reference may be made to the descriptions of the above method embodiments, which are not repeated here for brevity.
The above descriptions of the various embodiments tend to emphasize the differences between them; for parts that are the same or similar, the embodiments may be referred to one another, and details are not repeated herein for brevity.
In the several embodiments provided in the present disclosure, it should be understood that the disclosed methods and apparatuses may be implemented in other manners. For example, the apparatus implementations described above are merely illustrative: the division into modules or units is only a division by logical function, and other divisions are possible in an actual implementation. For instance, units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.
Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed over network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this implementation.
In addition, the functional units in the embodiments of the present disclosure may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present disclosure in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to execute all or part of the steps of the methods of the various implementations of the present disclosure. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
Claims (25)
Priority Applications (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| KR1020217038252A KR20220023825A (en) | 2020-08-21 | 2020-11-30 | Method and apparatus for acquiring a target model, electronic device and storage medium |
| JP2021569395A JP2022548341A (en) | 2020-08-21 | 2020-11-30 | Get the target model |
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202010852192.4A CN112052949B (en) | 2020-08-21 | 2020-08-21 | Image processing method, device, equipment and storage medium based on transfer learning |
| CN202010852192.4 | 2020-08-21 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2022036921A1 (en) | 2022-02-24 |
Family
ID=73599559
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/CN2020/132785 (Ceased), WO2022036921A1 (en) | 2020-08-21 | 2020-11-30 | Acquisition of target model |
Country Status (5)
| Country | Link |
|---|---|
| JP (1) | JP2022548341A (en) |
| KR (1) | KR20220023825A (en) |
| CN (1) | CN112052949B (en) |
| TW (1) | TWI785739B (en) |
| WO (1) | WO2022036921A1 (en) |
Cited By (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN118411128A (en) * | 2024-07-01 | 2024-07-30 | 一智科技有限公司 | Engineering construction acceptance sheet generation method, system and storage medium |
| CN118869463A (en) * | 2024-09-26 | 2024-10-29 | 杭州数云信息技术有限公司 | Risk network positioning method, system and computer equipment based on community mining |
Families Citing this family (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN112633156B (en) * | 2020-12-22 | 2024-05-31 | 浙江大华技术股份有限公司 | Vehicle detection method, image processing device, and computer-readable storage medium |
| CN112634992A (en) * | 2020-12-29 | 2021-04-09 | 上海商汤智能科技有限公司 | Molecular property prediction method, training method of model thereof, and related device and equipment |
| CN112784912B (en) * | 2021-01-29 | 2024-11-26 | 北京百度网讯科技有限公司 | Image recognition method and device, neural network model training method and device |
| CN116827497A (en) * | 2022-03-15 | 2023-09-29 | 中国移动通信有限公司研究院 | Model transmission method, terminal and network side equipment |
Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN110321964A (en) * | 2019-07-10 | 2019-10-11 | 重庆电子工程职业学院 | Identification model update method and relevant apparatus |
| CN110363233A (en) * | 2019-06-28 | 2019-10-22 | 西安交通大学 | A fine-grained image recognition method and system based on block detector and feature fusion convolutional neural network |
| CN110443286A (en) * | 2019-07-18 | 2019-11-12 | 广州华多网络科技有限公司 | Training method, image-recognizing method and the device of neural network model |
| US20200117894A1 (en) * | 2018-10-10 | 2020-04-16 | Drvision Technologies Llc | Automated parameterization image pattern recognition method |
Family Cites Families (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| GB201709672D0 (en) * | 2017-06-16 | 2017-08-02 | Ucl Business Plc | A system and computer-implemented method for segmenting an image |
| CN110348428B (en) * | 2017-11-01 | 2023-03-24 | 腾讯科技(深圳)有限公司 | Fundus image classification method and device and computer-readable storage medium |
| CN110163234B (en) * | 2018-10-10 | 2023-04-18 | 腾讯科技(深圳)有限公司 | Model training method and device and storage medium |
| CN111368998B (en) * | 2020-03-04 | 2025-02-11 | 深圳前海微众银行股份有限公司 | Spark cluster-based model training method, device, equipment and storage medium |
| CN111507985A (en) * | 2020-03-19 | 2020-08-07 | 北京市威富安防科技有限公司 | Image instance segmentation optimization processing method and device and computer equipment |
| CN111522944B (en) * | 2020-04-10 | 2023-11-14 | 北京百度网讯科技有限公司 | Method, apparatus, device and storage medium for outputting information |
2020
- 2020-08-21 CN CN202010852192.4A patent/CN112052949B/en active Active
- 2020-11-30 JP JP2021569395A patent/JP2022548341A/en active Pending
- 2020-11-30 KR KR1020217038252A patent/KR20220023825A/en not_active Withdrawn
- 2020-11-30 WO PCT/CN2020/132785 patent/WO2022036921A1/en not_active Ceased
2021
- 2021-08-16 TW TW110130162A patent/TWI785739B/en not_active IP Right Cessation
Also Published As
| Publication number | Publication date |
|---|---|
| CN112052949A (en) | 2020-12-08 |
| KR20220023825A (en) | 2022-03-02 |
| TWI785739B (en) | 2022-12-01 |
| CN112052949B (en) | 2023-09-08 |
| TW202209194A (en) | 2022-03-01 |
| JP2022548341A (en) | 2022-11-18 |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| TWI785739B (en) | Method of acquiring target model, electronic device and storage medium | |
| CN112561027B (en) | Neural network architecture search method, image processing method, device and storage medium | |
| CN116229319B (en) | Multi-scale feature fusion classroom behavior detection method and system | |
| US10726313B2 (en) | Active learning method for temporal action localization in untrimmed videos | |
| CN109840531B (en) | Method and device for training multi-label classification model | |
| Wang et al. | Detect globally, refine locally: A novel approach to saliency detection | |
| CN113159073B (en) | Knowledge distillation method and device, storage medium and terminal | |
| CN111144329B (en) | A lightweight and fast crowd counting method based on multi-label | |
| US10275719B2 (en) | Hyper-parameter selection for deep convolutional networks | |
| CN113688894B (en) | Fine granularity image classification method integrating multiple granularity features | |
| WO2023207163A1 (en) | Object detection model and method for detecting object occupying fire escape route, and use | |
| KR102582194B1 (en) | Selective backpropagation | |
| US20210056357A1 (en) | Systems and methods for implementing flexible, input-adaptive deep learning neural networks | |
| WO2021238366A1 (en) | Neural network construction method and apparatus | |
| CN111523470A (en) | Feature fusion block, convolutional neural network, pedestrian re-identification method and related equipment | |
| US20160321784A1 (en) | Reducing image resolution in deep convolutional networks | |
| WO2019100724A1 (en) | Method and device for training multi-label classification model | |
| CN116524379B (en) | Aerial Target Detection Method Based on Attention Mechanism and Adaptive Feature Fusion | |
| TWI761813B (en) | Video analysis method and related model training methods, electronic device and storage medium thereof | |
| CN117173422B (en) | Fine granularity image recognition method based on graph fusion multi-scale feature learning | |
| US20240303497A1 (en) | Robust test-time adaptation without error accumulation | |
| CN116051828B (en) | A real-time segmentation method for SAR images based on LOVASZ loss and lightweight bilateral network | |
| US12333811B2 (en) | Permutation invariant convolution (PIC) for recognizing long-range activities, generating global representation of input streams and classifying activity based on global representation | |
| HK40032409A (en) | Method and device for obtaining target model, electronic equipment and storage medium | |
| KR20220075521A (en) | Layer optimization system for 3d rram device using artificial intelligence technology and method thereof |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | ENP | Entry into the national phase | Ref document number: 2021569395; Country of ref document: JP; Kind code of ref document: A |
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 20950124; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 20950124; Country of ref document: EP; Kind code of ref document: A1 |