WO2022151591A1 - Coupled multi-task feature extraction method and apparatus, electronic device and storage medium - Google Patents
Coupled multi-task feature extraction method and apparatus, electronic device and storage medium
- Publication number
- WO2022151591A1 (PCT/CN2021/084284)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- network
- picture set
- input
- train
- picture
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/951—Indexing; Web crawling techniques
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/211—Selection of the most significant subset of features
- G06F18/2113—Selection of the most significant subset of features by ranking or filtering the set of features, e.g. using a measure of variance or of feature cross-correlation
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/168—Feature extraction; Face representation
- G06V40/171—Local features and components; Facial parts ; Occluding parts, e.g. glasses; Geometrical relationships
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Definitions
- the present application belongs to the technical field of deep learning networks, and particularly relates to a coupled multi-task feature extraction method and apparatus, an electronic device, and a storage medium.
- Deep learning learns the inherent laws and representation levels of sample data, and the information obtained during learning greatly helps the interpretation of data such as images.
- the ultimate goal of deep learning is to enable machines to have the ability to analyze and learn like humans, and to recognize data such as images.
- the classification task is a classic deep learning task, but it ignores some spatial structures of objects, or finds them difficult to learn. If the basic backbone network can capture more spatial-structure information and local-texture information, features can be extracted more accurately. At the same time, the extracted features should allow the network to completely restore the items of interest, so that information extraction for those items is complete.
- current training networks do not guarantee this algorithmically.
- Figure 1 is the traditional deep learning network structure
- the traditional deep learning network structure requires manually labeled data; the sample labels are then used to train on the images.
- classification tasks classify according to the type of items.
- the data available for network training is only the labeled data: the quantity is small, the picture types are limited, and labeling pictures is difficult.
- the inventor is aware of the following problem in the prior art: the input pictures used in traditional deep learning all need to be manually labeled; limited by the number of labels, some spatial structures of objects are ignored and it is difficult to attend to the key items, which affects the final recognition accuracy.
- the purpose of this application is to provide a coupled multi-task feature extraction method, apparatus, electronic device and storage medium, so as to solve the prior-art technical problem that the number of manually annotated pictures is insufficient, which makes it difficult to attend to the items of interest during deep learning and affects the final recognition accuracy.
- a first aspect of the present application provides a coupled multi-task feature extraction method, comprising the following steps:
- Initial training: input the mixed picture set into the first network to train the first network; the features output by the first network are fed as input into the third network, and the third network is trained until its LOSS converges; after training, the weight coefficients of each layer of the first network and the third network are obtained;
- Retraining: input the first picture set and the second picture set into the first network at the same time, and train the first network again; the features output by the first network are shunted: the features extracted from the mixed picture set are fed into the third network to train the third network until its LOSS converges, and the features extracted from the labeled picture set are fed into the second network to train the second network until its LOSS converges; the training of the first, second, and third networks is thus completed, and the weight coefficients of each layer of the three networks are obtained; the first picture set is the mixed picture set, and the second picture set is the labeled picture set;
- Recognition: collect the picture to be recognized, input it into the deep learning network composed of the retrained first network and second network, and output the corresponding classification label to complete the recognition (a training sketch follows below).
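To make these steps concrete, the following is a minimal PyTorch-style sketch of the two training phases. The stand-in network definitions, the Adam optimizer, the fixed iteration counts standing in for "until the LOSS converges", and the specific self-supervised target for the third network are illustrative assumptions, not details fixed by this application (which names VGG16 only as an example backbone).

```python
# Minimal sketch of the initial-training / retraining scheme described above.
# Stand-in networks, optimizers, and self-supervised targets are assumptions.
import torch
import torch.nn as nn

first_net = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU())                  # shallow shared extractor
second_net = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 10))  # classification branch
third_net = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 4))    # self-supervised branch
ce = nn.CrossEntropyLoss()

def make_batch(n=8):
    """Placeholder for the real loaders: images, self-supervised targets, class labels."""
    return torch.randn(n, 3, 64, 64), torch.randint(0, 4, (n,)), torch.randint(0, 10, (n,))

# Initial training: mixed picture set -> first network -> third network.
opt1 = torch.optim.Adam(list(first_net.parameters()) + list(third_net.parameters()), lr=1e-3)
for _ in range(100):  # fixed count stands in for "until the LOSS converges"
    mixed, ss_targets, _ = make_batch()
    loss = ce(third_net(first_net(mixed)), ss_targets)
    opt1.zero_grad(); loss.backward(); opt1.step()

# Retraining: shunt the shared features -- mixed batch to the third network,
# labeled batch to the second network -- and update all three networks.
params = [p for m in (first_net, second_net, third_net) for p in m.parameters()]
opt2 = torch.optim.Adam(params, lr=1e-3)
for _ in range(100):
    mixed, ss_targets, _ = make_batch()
    labeled, _, labels = make_batch()
    loss = (ce(third_net(first_net(mixed)), ss_targets)     # self-supervised LOSS
            + ce(second_net(first_net(labeled)), labels))   # classification LOSS
    opt2.zero_grad(); loss.backward(); opt2.step()

# Recognition: the retrained first network + second network form the classifier.
with torch.no_grad():
    label = second_net(first_net(torch.randn(1, 3, 64, 64))).argmax(dim=1)
```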
- a second aspect of the present application provides a coupled multi-task feature extraction device, comprising:
- the data collection unit is used to collect a labeled picture set and an unlabeled picture set;
- the data preprocessing unit is used to mix the pictures of the labeled picture set and the unlabeled picture set and perform augmentation preprocessing to obtain the mixed picture set;
- the initial training unit is used to input the mixed picture set into the first network to train the first network; the feature quantity output by the first network is used as input to enter the third network, and the third network is trained until the LOSS converges; the training is completed to obtain The weight coefficients of each layer of the first network and the third network;
- the retraining unit is used to input the first picture set and the second picture set into the first network at the same time, and train the first network again; the features output by the first network are shunted: the features extracted from the mixed picture set are fed into the third network to train the third network until its LOSS converges, and the features extracted from the labeled picture set are fed into the second network to train the second network until its LOSS converges; the training of the first, second, and third networks is thus completed, and the weight coefficients of each layer of the three networks are obtained; the first picture set is the mixed picture set, and the second picture set is the labeled picture set;
- the identification unit is used to collect the pictures to be identified, input the deep learning network formed by the first network and the second network, and output the corresponding classification labels to complete the identification.
- a third aspect of the present application provides an electronic device, the electronic device includes a processor and a memory, and the processor is configured to implement the coupled multi-task feature extraction method when executing a computer program stored in the memory.
- a fourth aspect of the present application provides a computer-readable storage medium, where the computer-readable storage medium stores at least one instruction, and when the at least one instruction is executed by a processor, the coupled multi-task feature extraction method is implemented.
- This application does not require additional labeled data, and a large amount of unlabeled data can be crawled using crawler software; the present training method adds spatial-location semantic information to the classification network, and adds a variety of identification and classification information to the positioning network.
- the coupling of these two kinds of information improves the generalization of the entire network, which can also achieve better results on data other than the labeled data.
- the output of the hidden layers of the network can extract more features; at the same time, in feature-point localization problems such as face key points, the network learns the spatial-structure information instead of merely fine-tuning fixed feature-point positions.
- Figure 1 shows the traditional deep learning network structure
- FIG. 2 is the deep learning network structure used in the coupled multi-task feature extraction method of Embodiment 1 of the present application;
- FIG. 3 is a flowchart of a coupled multi-task feature extraction method according to Embodiment 1 of the present application.
- FIG. 5 is a flowchart of a coupled multi-task feature extraction method in Embodiment 2 of the present application.
- FIG. 6 is a schematic diagram of applying the method of the present application to identify the area of the Bauhinia flower pattern on the Hong Kong identity card;
- FIG. 7 is a structural block diagram of a coupled multi-task feature extraction device in Embodiment 3.
- FIG. 8 is a structural block diagram of a coupled multi-task feature extraction apparatus in Embodiment 4;
- FIG. 9 is an electronic device implementing a coupled multi-task feature extraction method provided by the present application.
- the present application provides a coupled multi-task feature extraction method, which includes the following steps:
- the data includes labeled picture sets and unlabeled picture sets; the collection cost of unlabeled data is very low, and the quantity can generally be more than 10 times that of labeled data; a large number of unlabeled pictures can be crawled by crawler software.
- the images of the labeled image set and the unlabeled image set are mixed and augmented and preprocessed to obtain the mixed image set.
- the augmentation preprocessing includes, but is not limited to, cropping, scaling, flipping, and the like (an example pipeline follows below).
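Purely as an illustration, an augmentation pipeline covering these three operations could be sketched with torchvision as follows; the specific transforms and parameters are assumptions, since only cropping, scaling, and flipping are named as examples.

```python
# Sketch of the augmentation preprocessing for the mixed picture set;
# transform choices and parameters are illustrative assumptions.
from torchvision import transforms

augment = transforms.Compose([
    transforms.RandomResizedCrop(224, scale=(0.6, 1.0)),  # cropping + scaling
    transforms.RandomHorizontalFlip(p=0.5),               # flipping
    transforms.ToTensor(),
])
```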
- the mixed picture set and the labeled picture set are input into the first network at the same time, and the first network is retrained; the features output by the first network are shunted: the features extracted from the mixed picture set are fed into the third network to train the third network.
- the first network is a shallow feature extraction network of a deep learning network (such as VGG16).
- the second network and the third network have the same structure and different weight coefficients, which are the rest of the deep learning network (such as VGG16).
- the shared first network not only learns the texture features of the image, but also further preserves the geometry-related features of the image (a split sketch follows below).
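A minimal sketch of this split, assuming torchvision's VGG16 and an arbitrary split depth (the application does not fix where the shallow extractor ends or the sizes of the output layers):

```python
# Sketch: split VGG16 into a shared shallow extractor (first network) and two
# structurally identical branches with independent weights (second and third
# networks). Split index and head sizes are illustrative assumptions.
import copy
import torch.nn as nn
from torchvision.models import vgg16

full = vgg16(weights=None)  # torchvision >= 0.13 API
split = 10                  # assumed split depth inside full.features
first_net = nn.Sequential(*list(full.features.children())[:split])   # shallow shared extractor
tail = nn.Sequential(*list(full.features.children())[split:],
                     nn.AdaptiveAvgPool2d(1), nn.Flatten())
second_net = nn.Sequential(copy.deepcopy(tail), nn.LazyLinear(10))   # classification branch
third_net = nn.Sequential(copy.deepcopy(tail), nn.LazyLinear(4))     # same structure, its own weights
```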
- the present application also provides another coupled multi-task feature extraction method, which includes the following steps:
- the data includes labeled picture sets and unlabeled picture sets; the collection cost of unlabeled data is very low, and the quantity can generally be more than 10 times that of labeled data; a large number of unlabeled pictures can be crawled by crawler software.
- the images of the labeled image set and the unlabeled image set are mixed and augmented and preprocessed to obtain the mixed image set.
- the augmentation preprocessing includes, but is not limited to, cropping, scaling, flipping, and the like.
- the augmentation preprocessing is performed on the original image; the preprocessing can be to crop a part of the area, for example, cropping a nine-square grid from the original image so that the nine patches form a task group; the corresponding training task is to predict which cell of the nine-square grid the current patch belongs to, i.e., the topological relationship within the image.
- the cropped nine-square grid is not limited to patches whose edges are closely adjacent; the grid cells can also be taken at key positions of parts of the object, such as the nose, mouth, and eyes of a human face (a sketch of the task group follows below).
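A minimal sketch of building such a task group, assuming an edge-adjacent grid and illustrative patch sizes:

```python
# Sketch of a nine-square-grid task group: each patch is paired with its
# grid-position label (0..8), the target of the topological-relationship task.
import torch

def nine_grid_task_group(img: torch.Tensor, cell: int = 32):
    """Split a CxHxW image into 9 edge-adjacent patches plus position labels."""
    patches, labels = [], []
    for row in range(3):
        for col in range(3):
            patches.append(img[:, row * cell:(row + 1) * cell,
                                  col * cell:(col + 1) * cell])
            labels.append(row * 3 + col)
    return torch.stack(patches), torch.tensor(labels)

patches, labels = nine_grid_task_group(torch.randn(3, 96, 96))
assert patches.shape == (9, 3, 32, 32) and labels.tolist() == list(range(9))
```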
- the 5 kinds of picture sets include: the first kind, the preprocessed mixed picture set; the second kind, the labeled picture set; the third kind, the original mixed picture set composed of labeled and unlabeled pictures; the fourth kind, the original mixed picture set composed of labeled and unlabeled pictures; the fifth kind, the picture set restored by the fourth network.
- the feature quantity output by the first network is shunted:
- the features extracted from the first picture set are input into the third network to train the third network, until the third network's LOSS converges;
- the features extracted from the second picture set are input into the second network to train the second network, until the second network's LOSS converges;
- the features extracted from the third picture set are input into the fourth network to train the fourth network, until the fourth network's LOSS converges;
- the features extracted from the fourth picture set and the fifth picture set are input into the fifth network to train the fifth network, until the fifth network's LOSS converges;
- the first network is a shallow feature extraction network of a deep learning network (such as VGG16);
- the second network, the third network, and the fifth network have the same structure and different weight coefficients, which are the rest of the deep learning network (such as VGG16);
- the fourth network is an upsampling network.
- Embodiment 2 further improves the network structure of Embodiment 1: it adds a new adversarial-generation structure and adds a fourth network for the original-image task.
- its input is the features extracted from the original pictures (with or without labels); the fourth network performs image restoration, and the restored image is compared with the original image; the difference between them is the loss.
- this task enforces the completeness of the features extracted from the original image: once the extracted features lose some information, restoration is necessarily distorted, resulting in a larger loss value (a restoration sketch follows below).
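A minimal sketch of this restoration branch; the decoder layers and the pixel-wise L1 criterion are illustrative assumptions, since the text states only that the fourth network upsamples and that the restored image is compared with the original.

```python
# Sketch of the fourth-network restoration task: features extracted from
# original images are upsampled back to image space, and the restoration
# error serves as the loss. Layer shapes are illustrative assumptions.
import torch
import torch.nn as nn

first_net = nn.Sequential(nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU())  # stand-in extractor
fourth_net = nn.Sequential(                                                     # upsampling network
    nn.ConvTranspose2d(16, 8, 4, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(8, 3, 3, padding=1),
)
originals = torch.randn(4, 3, 64, 64)              # original labeled/unlabeled pictures
restored = fourth_net(first_net(originals))        # restored pictures (fifth picture set)
loss = nn.functional.l1_loss(restored, originals)  # lost feature information -> larger loss
```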
- the main difference between the original training and the improved training is that more loss functions act on the feature extraction network; since the new tasks require no labels at all, this part of the training images is easy to obtain.
- the unlabeled image and the labeled image are of the same or similar domain.
- for example, the labeled images can be of a large number of Asian faces, while the unlabeled images can be of both white and black faces.
- alternatively, the labeled pictures may have a central object, such as cats, dogs, or other animals, while the unlabeled pictures can be of different kinds of cats and dogs at various angles and of various breeds.
- the performance indicators have the following improvements.
- Figure 6 shows the area of the Bauhinia flower on the Hong Kong Identity Card.
- the present application also provides a coupled multi-task feature extraction device, including:
- the data collection unit 11 is used to collect a labeled picture set and an unlabeled picture set;
- the data preprocessing unit 12 is used for mixing the pictures of the labeled picture set and the unlabeled picture set and performing augmentation preprocessing to obtain the mixed picture set;
- the initial training unit 13 is used to input the mixed picture set into the first network and train the first network; the features output by the first network are fed as input into the third network, and the third network is trained until the third network's LOSS converges; after training, the weight coefficients of each layer of the first network and the third network are obtained;
- the retraining unit 14 is used to input the first picture set and the second picture set into the first network at the same time, and train the first network again; the features output by the first network are shunted: the features extracted from the mixed picture set are fed into the third network to train the third network until its LOSS converges, and the features extracted from the labeled picture set are fed into the second network to train the second network until its LOSS converges; the training of the first, second, and third networks is thus completed, and the weight coefficients of each layer of the three networks are obtained; the first picture set is the mixed picture set, and the second picture set is the labeled picture set;
- the identification unit 15 is used for collecting the pictures to be identified, inputting the deep learning network (eg VGG16) composed of the first network and the second network, and outputting the corresponding classification labels to complete the identification.
- the present application also provides another coupled multi-task feature extraction device, including:
- the data collection unit 101 is used to collect a labeled picture set and an unlabeled picture set; the cost of collecting the unlabeled data is very low, and the quantity can generally be more than 10 times that of the labeled data;
- the data preprocessing unit 102 is configured to mix the pictures of the labeled picture set and the unlabeled picture set and perform augmentation preprocessing to obtain the mixed picture set.
- the initial training unit 103 is used to input the mixed picture set into the first network and train the first network; the features output by the first network are fed as input into the third network, and the third network is trained until the third network's LOSS converges; after training, the weight coefficients of each layer of the first network and the third network are obtained;
- the retraining unit 104 is used to input the first, second, third, fourth, and fifth picture sets into the first network at the same time, and train the first network again;
- the third picture set is the original mixed picture set composed of labeled and unlabeled pictures;
- the fourth picture set is the original mixed picture set composed of labeled and unlabeled pictures;
- the fifth picture set is the picture set restored by the fourth network;
- the feature quantity output by the first network is shunted:
- the features extracted from the first picture set are input into the third network to train the third network, until the third network's LOSS converges;
- the features extracted from the second picture set are input into the second network to train the second network, until the second network's LOSS converges;
- the features extracted from the third picture set are input into the fourth network to train the fourth network, until the fourth network's LOSS converges;
- the features extracted from the fourth picture set and the fifth picture set are input into the fifth network to train the fifth network, until the fifth network's LOSS converges;
- the first network is a shallow feature extraction network of a deep learning network (such as VGG16);
- the second network, the third network, and the fifth network have the same structure and different weight coefficients, which are the rest of the deep learning network (such as VGG16);
- the fourth network is an upsampling network.
- the identification unit is used to collect the pictures to be identified, input the deep learning network (eg VGG16) composed of the first network and the second network, and output the corresponding classification labels to complete the identification.
- the present application further provides an electronic device 100 for implementing the coupled multi-task feature extraction method; the electronic device 100 includes a memory 101, at least one processor 102, a computer program 103 stored in the memory 101 and runnable on the at least one processor 102, and at least one communication bus 104.
- the memory 101 can be used to store the computer program 103, and the processor 102 implements various functions of the electronic device 100 by running or executing the computer program stored in the memory 101 and calling the data stored in the memory 101.
- the memory 101 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required for at least one function (such as a sound playback function, an image playback function, etc.), etc.; the data storage area may store data (such as audio data) created according to the use of the electronic device 100.
- the memory 101 may include non-volatile memory, such as a hard disk, internal memory, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, a Flash Card, at least one magnetic disk storage device, a flash memory device, or another non-volatile solid-state storage device.
- the at least one processor 102 may be a central processing unit (Central Processing Unit, CPU), and may also be other general-purpose processors, digital signal processors (Digital Signal Processor, DSP), application-specific integrated circuits (Application Specific Integrated Circuit, ASIC), Field-Programmable Gate Array (FPGA) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, etc.
- the processor 102 can be a microprocessor or the processor 102 can also be any conventional processor, etc.
- the processor 102 is the control center of the electronic device 100, and uses various interfaces and lines to connect the various parts of the entire electronic device 100.
- the memory 101 in the electronic device 100 stores multiple instructions to implement a coupled multi-task feature extraction method, and the processor 102 can execute the multiple instructions to implement:
- Initial training: input the mixed picture set into the first network to train the first network; the features output by the first network are fed as input into the third network, and the third network is trained until its LOSS converges; after training, the weight coefficients of each layer of the first network and the third network are obtained;
- Retraining: input the first picture set and the second picture set into the first network at the same time, and train the first network again; the features output by the first network are shunted: the features extracted from the mixed picture set are fed into the third network to train the third network until its LOSS converges, and the features extracted from the labeled picture set are fed into the second network to train the second network until its LOSS converges; the training of the first, second, and third networks is thus completed, and the weight coefficients of each layer of the three networks are obtained; the first picture set is the mixed picture set, and the second picture set is the labeled picture set;
- Recognition: collect the picture to be recognized, input it into the deep learning network composed of the retrained first network and second network, and output the corresponding classification label to complete the recognition.
- in this way, a labeled picture set and an unlabeled picture set are obtained; the pictures of the labeled picture set and the unlabeled picture set are mixed and augmentation-preprocessed to obtain a mixed picture set; in initial training, the mixed picture set is input into the first network to train the first network, the features output by the first network are fed into the third network, and the third network is trained until its LOSS converges; after training, the weight coefficients of each layer of the first network and the third network are obtained;
- in retraining, the first picture set and the second picture set are input into the first network at the same time and the first network is trained again; the features output by the first network are shunted: the features extracted from the mixed picture set are fed into the third network to train the third network until its LOSS converges, and the features extracted from the labeled picture set are fed into the second network to train the second network until its LOSS converges; the training of the first, second, and third networks is completed, and the weight coefficients of each layer of the three networks are obtained;
- the first picture set is the mixed picture set, and the second picture set is the labeled picture set; the picture to be recognized is collected, input into the deep learning network composed of the retrained first network and second network, and the corresponding classification label is output to complete the recognition.
- this application does not require additional labeled data, and a large amount of unlabeled data can be crawled using crawler software; the present training method adds spatial-location semantic information to the classification network, and adds a variety of identification and classification information to the positioning network.
- the coupling of these two kinds of information improves the generalization of the entire network, which can also achieve better results on data other than the labeled data.
- the output of the hidden layers of the network can extract more features; at the same time, in feature-point localization problems such as face key points, the network learns the spatial-structure information instead of merely fine-tuning fixed feature-point positions.
- if the modules/units integrated in the electronic device 100 are implemented in the form of software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium.
- the present application can implement all or part of the processes in the methods of the above embodiments, which can also be completed by instructing the relevant hardware through a computer program; the computer program can be stored in a computer-readable storage medium, which may be non-volatile or volatile, and when the computer program is executed by the processor, the steps of the above method embodiments can be implemented.
- the computer program includes computer program code, and the computer program code may be in the form of source code, object code, executable file or some intermediate form, and the like.
- the computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, and a read-only memory (ROM, Read-Only Memory).
- the embodiments of the present application may be provided as a method, an apparatus, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
- These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture comprising instruction means, the instruction means implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
- These computer program instructions can also be loaded onto a computer or other programmable data processing device to cause a series of operational steps to be performed on the computer or other programmable device to produce a computer-implemented process, such that the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Oral & Maxillofacial Surgery (AREA)
- Health & Medical Sciences (AREA)
- Computer Vision & Pattern Recognition (AREA)
- General Engineering & Computer Science (AREA)
- Bioinformatics & Computational Biology (AREA)
- Evolutionary Computation (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Databases & Information Systems (AREA)
- Artificial Intelligence (AREA)
- General Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Human Computer Interaction (AREA)
- Multimedia (AREA)
- Image Analysis (AREA)
Abstract
A coupled multi-task feature extraction method and apparatus, an electronic device and a storage medium, relating to the technical field of deep learning networks. The method comprises: obtaining a labeled picture set and an unlabeled picture set; mixing pictures of the labeled picture set and pictures of the unlabeled picture set and performing augmentation preprocessing to obtain a mixed picture set; performing initial training and retraining; inputting a feature quantity extracted from the labeled picture set into a second network to train the second network until the LOSS converges; completing the training of a first network, the second network and a third network to obtain weight coefficients of each layer of the first network, the second network and the third network; and collecting a picture to be recognized, inputting same into a retrained deep learning network composed of the first network and the second network, and outputting a corresponding classification label to complete the recognition. The method can solve the technical problem in the prior art that final recognition accuracy is affected because an insufficient number of manually annotated pictures makes it difficult to attend to the items of interest.
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202110065929.2A CN112861926B (zh) | 2021-01-18 | 2021-01-18 | Coupled multi-task feature extraction method and apparatus, electronic device and storage medium |
| CN202110065929.2 | 2021-01-18 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2022151591A1 true WO2022151591A1 (fr) | 2022-07-21 |
Family
ID=76006810
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/CN2021/084284 (Ceased) WO2022151591A1 | Coupled multi-task feature extraction method and apparatus, electronic device and storage medium | 2021-01-18 | 2021-03-31 |
Country Status (2)
| Country | Link |
|---|---|
| CN (1) | CN112861926B (fr) |
| WO (1) | WO2022151591A1 (fr) |
Cited By (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN115965967A (zh) * | 2022-12-26 | 2023-04-14 | 江苏美克医学技术有限公司 | Artificial intelligence network training method, recognition method and apparatus for reproductive tract microorganisms |
Citations (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN107958263A (zh) * | 2017-11-13 | 2018-04-24 | 浙江工业大学 | Semi-supervised image classifier training method |
| CN108416370A (zh) * | 2018-02-07 | 2018-08-17 | 深圳大学 | Image classification method and apparatus based on semi-supervised deep learning, and storage medium |
| WO2018184187A1 (fr) * | 2017-04-07 | 2018-10-11 | Intel Corporation | Methods and systems for advanced and augmented training of deep neural networks using synthetic data and innovative generative networks |
| CN110866140A (zh) * | 2019-11-26 | 2020-03-06 | 腾讯科技(深圳)有限公司 | Image feature extraction model training method, image search method, and computer device |
| CN111353577A (zh) * | 2018-12-24 | 2020-06-30 | Tcl集团股份有限公司 | Optimization method and apparatus for multi-task cascaded combination model, and terminal device |
| CN112215248A (zh) * | 2019-07-11 | 2021-01-12 | 深圳先进技术研究院 | Deep learning model training method and apparatus, electronic device, and storage medium |
Family Cites Families (7)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN106845549B (zh) * | 2017-01-22 | 2020-08-21 | 珠海习悦信息技术有限公司 | Method and apparatus for scene and target recognition based on multi-task learning |
| CA3085897C (fr) * | 2017-12-13 | 2023-03-14 | Cognizant Technology Solutions U.S. Corporation | Evolutionary architectures for evolution of deep neural networks |
| CN108805160B (zh) * | 2018-04-17 | 2020-03-24 | 平安科技(深圳)有限公司 | Transfer learning method and apparatus, computer device, and storage medium |
| CN110889325B (zh) * | 2019-10-12 | 2023-05-23 | 平安科技(深圳)有限公司 | Multi-task facial action recognition model training and multi-task facial action recognition method |
| CN110728255B (zh) * | 2019-10-22 | 2022-12-16 | Oppo广东移动通信有限公司 | Image processing method and apparatus, electronic device, and storage medium |
| CN110866564B (zh) * | 2019-11-22 | 2023-04-25 | 上海携程国际旅行社有限公司 | Season classification method, system, electronic device, and medium for multiple semi-supervised images |
| CN111881968B (zh) * | 2020-07-22 | 2024-04-09 | 平安科技(深圳)有限公司 | Multi-task classification method and apparatus, and related device |
- 2021-01-18: application filed in China as CN202110065929.2A, granted as CN112861926B (status: Active)
- 2021-03-31: international application PCT/CN2021/084284 filed, published as WO2022151591A1 (status: Ceased)
Also Published As
| Publication number | Publication date |
|---|---|
| CN112861926B (zh) | 2023-10-31 |
| CN112861926A (zh) | 2021-05-28 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 21918790; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 21918790; Country of ref document: EP; Kind code of ref document: A1 |