
WO2021019748A1 - Classification method, classification program, classification device, learning method, learning program, and learning device

Classification method, classification program, classification device, learning method, learning program, and learning device

Info

Publication number
WO2021019748A1
WO2021019748A1 (PCT/JP2019/030085)
Authority
WO
WIPO (PCT)
Prior art keywords
classifier
classification
input
data
feature vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/JP2019/030085
Other languages
English (en)
Japanese (ja)
Inventor
Yasuto Yokota
Kanata Suzuki
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fujitsu Ltd
Original Assignee
Fujitsu Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fujitsu Ltd filed Critical Fujitsu Ltd
Priority to PCT/JP2019/030085 priority Critical patent/WO2021019748A1/fr
Publication of WO2021019748A1 publication Critical patent/WO2021019748A1/fr
Anticipated expiration legal-status Critical
Ceased legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis

Definitions

  • the present invention relates to a classification method, a classification program, a classification device, a learning method, a learning program, and a learning device.
  • in a DNN (Deep Neural Network), characteristic information about the input data (an intermediate feature vector) appears in each intermediate layer.
  • the above technology has a problem that the accuracy of the classifier that classifies objects into classes may decrease.
  • the accuracy of the classifier may decrease.
  • the likelihood of appearance of features suitable for a classifier may differ for each intermediate layer. Therefore, in the learning of the classifier, if the influence of the intermediate feature vector in which the feature suitable for the classifier does not appear becomes large, the accuracy of the classifier may decrease.
  • One aspect is to improve the accuracy of the classifier.
  • the computer executes a process of inputting data of an image showing one or more objects into the input layer of a neural network model having an input layer, a plurality of intermediate layers, and an output layer, and acquiring a feature vector from each of the plurality of intermediate layers.
  • the computer updates each classifier based on the classification result when the acquired feature vector is input to the classifier corresponding to each of the feature vectors.
  • the computer determines the weight of each classifier based on the comparison result between the output result of each classifier and the correct answer data.
  • the computer executes a process of classifying the object reflected in the image data into one of a predetermined class based on the output result of the classifier and the weight.
  • the accuracy of the classifier can be improved.
  • FIG. 1 is a diagram for explaining the configuration of a classification system.
  • FIG. 2 is a block diagram showing a configuration example of the classification device.
  • FIG. 3 is a diagram showing an example of the shape of an object.
  • FIG. 4 is a diagram for explaining the class.
  • FIG. 5 is a diagram for explaining the configuration of the model.
  • FIG. 6 is a diagram showing an example of the output of the classifier.
  • FIG. 7 is a diagram showing an example of the weighted output of the classifier.
  • FIG. 8 is a flowchart showing the flow of the learning process.
  • FIG. 9 is a flowchart showing the flow of the classification process.
  • FIG. 10 is a diagram for explaining the difference between the intermediate layers.
  • FIG. 11 is a diagram illustrating a hardware configuration example.
  • FIG. 1 is a diagram for explaining the configuration of a classification system.
  • the classification system 1 includes a picking system 20 and a classification device 10.
  • the picking system 20 includes a picking robot 21, a camera 22, and a tray 23.
  • the camera 22 photographs the tray 23 from above and transmits the captured image data to the classification device 10.
  • the picking robot 21 is controlled based on the output that the classification device 10 produces for the image data, and grips an object on the tray 23.
  • the classification device 10 detects the position of an object in the image using a model, and further classifies the class of the detected object.
  • the model includes a detector that detects the position of the object in the image and a classifier that classifies the class of the object.
  • the classification device 10 also functions as a learning device for learning the model.
  • FIG. 2 is a block diagram showing a configuration example of the classification device.
  • the classification device 10 includes an input / output unit 11, a storage unit 12, and a control unit 13.
  • the input / output unit 11 is an interface for inputting / outputting data to / from the picking system 20 and communicating data with other devices.
  • the input / output unit 11 may input / output data to / from an input device such as a keyboard or mouse, an output device such as a display or speaker, or an external storage device such as a USB memory.
  • the input / output unit 11 may be a NIC (Network Interface Card) that communicates data via the Internet.
  • the storage unit 12 is an example of a storage device that stores data, a program executed by the control unit 13, and the like, such as a hard disk and a memory.
  • the storage unit 12 stores the learning data information 121, the weight information 122, the detector information 123, and the classifier information 124.
  • the learning data information 121 is image data taken by the camera 22 of the picking system 20.
  • the learning data information 121 may be the image data itself, or may be the image data to which a predetermined process is applied. Further, it is assumed that the position of the object to be classified in the image included in the learning data information 121 and the class to which the object belongs are known. That is, the learning data information 121 is learning data with a correct answer label for learning the detector and the classifier.
  • the weight information 122 is a weight determined by the determination unit 132. The method of determining the weight by the determination unit 132 will be described later.
  • the weight information 122 is stored when training the model, and is referred to and used when classifying unknown data.
  • the detector information 123 is a parameter for constructing a detector.
  • the detector may be an SSD (Single Shot Multibox Detector).
  • SSD is a model for detecting the position of an object using DNN (Deep Neural Network).
  • the detector information 123 includes the weights, biases, and the like of the DNN. Further, the detector information 123 is updated by learning.
  • the classifier information 124 is a parameter for constructing a classifier.
  • the classifier is an SVM (Support Vector Machine).
  • the classifier information 124 consists of the vectors that define the SVM.
  • the classifier information 124 is updated by learning. Also, in the embodiment, there are as many classifiers as there are intermediate layers of the detector. Therefore, the classifier information 124 holds the parameters of each of the plurality of classifiers.
  • the control unit 13 is realized by, for example, a CPU (Central Processing Unit), an MPU (Micro Processing Unit), or a GPU (Graphics Processing Unit) executing a program stored in an internal storage device, using RAM as a work area. Further, the control unit 13 may be realized by an integrated circuit such as an ASIC (Application Specific Integrated Circuit) or an FPGA (Field Programmable Gate Array).
  • the control unit 13 has an acquisition unit 131, a determination unit 132, a classification unit 133, and an update unit 134.
  • the detector used in the classification device 10 is a neural network model having an input layer, an intermediate layer, and an output layer, and is an SSD that detects the position of an object in an image.
  • FIG. 3 is a diagram showing an example of the shape of an object.
  • the classification device 10 classifies the objects into classes corresponding to each shape.
  • FIG. 4 is a diagram for explaining the class.
  • each image is, for example, a 300 × 300 pixel color image in which one object is placed at a random position.
  • the correct label is represented by the coordinates (x, y) indicating the position of the object, the grip width (w), and the inclination ( ⁇ ).
  • the update unit 134 learns the detector by backpropagating the error of the DNN using the learning data for the detector, and updates the detector information 123. At this time, the update unit 134 uses the data of the above-mentioned 2,000 images and performs learning with the number of epochs set to 200, for example.
  • the acquisition unit 131 acquires a feature vector from each of the plurality of intermediate layers when data of an image showing one or more objects is input to the input layer of a neural network model having an input layer, a plurality of intermediate layers, and an output layer.
  • the detector has a plurality of intermediate layers.
  • FIG. 5 is a diagram for explaining the configuration of the model.
  • the detector is a convolutional neural network such as SSD. If the detector is an SSD, there are, for example, 23 intermediate layers.
  • the acquisition unit 131 compresses the dimensionality of the feature vector of each intermediate layer by PCA (principal component analysis) or the like before acquiring it.
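The patent leaves the compression method open ("PCA or the like"). As a rough illustration only, a self-contained top-k PCA via power iteration could look like the following; the function name, dimensions, and iteration counts are hypothetical assumptions, not part of the patent.

```python
import random

def pca_compress(vectors, k, iters=200):
    """Project equal-length feature vectors onto their top-k principal components.

    vectors: list of feature vectors (one per training image).
    Returns the list of k-dimensional compressed vectors.
    """
    n, d = len(vectors), len(vectors[0])
    mean = [sum(v[j] for v in vectors) / n for j in range(d)]
    centered = [[v[j] - mean[j] for j in range(d)] for v in vectors]

    def matvec(w):
        # computes (X^T X) w without forming the covariance matrix explicitly
        xw = [sum(row[j] * w[j] for j in range(d)) for row in centered]
        return [sum(xw[i] * centered[i][j] for i in range(n)) for j in range(d)]

    components = []
    for _ in range(k):
        w = [random.gauss(0, 1) for _ in range(d)]
        for _ in range(iters):
            w = matvec(w)
            # deflate: remove directions of components already found
            for c in components:
                dot = sum(wi * ci for wi, ci in zip(w, c))
                w = [wi - dot * ci for wi, ci in zip(w, c)]
            norm = sum(x * x for x in w) ** 0.5 or 1.0
            w = [x / norm for x in w]
        components.append(w)

    return [[sum((v[j] - mean[j]) * c[j] for j in range(d)) for c in components]
            for v in vectors]
```

In practice a library implementation (for example, an off-the-shelf PCA) would replace this sketch; the point is only that each layer's high-dimensional feature vector is reduced to a small, fixed dimension before classifier training.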
  • the determination unit 132 inputs the feature vector acquired by the acquisition unit 131 into each corresponding classifier and obtains an output result. Then, the determination unit 132 determines the weight of each classifier based on the comparison result between the output result of each classifier and the correct answer data.
  • the classification unit 133 classifies the object reflected in the image data into one of a predetermined set of classes based on the output results and the weights of the classifiers. Then, the update unit 134 updates each classifier based on the classification result obtained when each feature vector is input to its corresponding classifier. Further, when each classifier is an SVM, a single round of training may suffice, and the classifiers may be trained in parallel.
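The per-layer training step can be sketched as follows. Since the patent's SVMs would require an SVM library, this sketch substitutes a simple nearest-centroid classifier as a stand-in; `CentroidClassifier`, `train_per_layer`, the layer names, and the toy data are illustrative assumptions, not the patent's implementation.

```python
import math

class CentroidClassifier:
    """Stand-in for the per-layer SVM: one centroid per class in feature space."""

    def fit(self, vectors, labels):
        sums, counts = {}, {}
        for v, y in zip(vectors, labels):
            acc = sums.setdefault(y, [0.0] * len(v))
            for j, x in enumerate(v):
                acc[j] += x
            counts[y] = counts.get(y, 0) + 1
        self.centroids = {y: [x / counts[y] for x in s] for y, s in sums.items()}
        return self

    def scores(self, v):
        # softmax over negative Euclidean distance to each class centroid
        d = {y: -sum((a - b) ** 2 for a, b in zip(v, c)) ** 0.5
             for y, c in self.centroids.items()}
        z = sum(math.exp(x) for x in d.values())
        return {y: math.exp(x) / z for y, x in d.items()}

    def predict(self, v):
        s = self.scores(v)
        return max(s, key=s.get)

def train_per_layer(layer_features, labels):
    # one classifier per intermediate layer, trained independently; since the
    # classifiers do not interact, this loop is trivially parallelizable,
    # matching the patent's note that SVM training can run in parallel
    return {layer: CentroidClassifier().fit(vecs, labels)
            for layer, vecs in layer_features.items()}
```

Each classifier only ever sees the (compressed) feature vectors of its own layer, which is what makes the later per-layer weighting meaningful.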
  • FIG. 6 is a diagram showing an example of the output of the classifier. As shown in FIG. 6, each classifier calculates a score for each class and outputs the class with the highest score as the classification result class. In addition, the score calculated by each classifier for the correct class is used as the accuracy.
  • the No. 1 classifier calculates, for example, a class A score of "0.70", a class B score of "0.25", and a class C score of "0.05".
  • the classification result of the No. 1 classifier is therefore class A, which does not match the correct answer.
  • the accuracy of the No. 1 classifier is "0.25".
  • the No. 2 classifier calculates, for example, a class A score of "0.10", a class B score of "0.80", and a class C score of "0.10". The classification result of the No. 2 classifier is class B, which matches the correct answer. The accuracy of the No. 2 classifier is "0.80".
  • the classification results of 3 out of the 5 classifiers are incorrect. Therefore, if the classification result of a randomly selected intermediate layer were adopted as the final classification result, there would be a 60% probability of an incorrect answer.
  • the determination unit 132 determines the weight of each classifier so that the higher the accuracy of the classifier, the larger the weight. For example, the determination unit 132 uses the square of the accuracy as the weight of each classifier. Further, the determination unit 132 stores the determined weights in the storage unit 12 as the weight information 122.
  • FIG. 7 is a diagram showing an example of the weighted output of the classifier.
  • the classification unit 133 can calculate the combined output as shown in FIG. 7 based on the weights determined by the determination unit 132. In this case, since the combined output of class B is the largest, the classification unit 133 finally classifies the training data into class B.
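Using the illustrative scores from FIG. 6 for classifiers No. 1 and No. 2 (the remaining three classifiers are omitted here), the weighting (square of accuracy) and combination steps can be sketched as:

```python
def determine_weights(accuracies):
    # weight = accuracy squared, so high-accuracy classifiers dominate
    return {name: acc ** 2 for name, acc in accuracies.items()}

def combined_output(scores, weights):
    # weighted sum of each classifier's per-class scores
    combined = {}
    for name, per_class in scores.items():
        for cls, s in per_class.items():
            combined[cls] = combined.get(cls, 0.0) + weights[name] * s
    return combined

# per-class scores of classifiers No. 1 and No. 2 from FIG. 6
scores = {
    "No.1": {"A": 0.70, "B": 0.25, "C": 0.05},
    "No.2": {"A": 0.10, "B": 0.80, "C": 0.10},
}
# accuracy = the score each classifier assigned to the correct class (B)
weights = determine_weights({"No.1": 0.25, "No.2": 0.80})
result = combined_output(scores, weights)
print(max(result, key=result.get))  # prints "B": class B wins despite No.1 voting A
```

The squaring step is what lets classifier No. 2 (weight 0.64) outvote classifier No. 1 (weight 0.0625) even though No. 1 gave class A a high raw score.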
  • FIG. 8 is a flowchart showing the flow of the learning process.
  • the classification device 10 learns the detector using the learning data for the detector (step S101).
  • the detector is a DNN having an input layer, an intermediate layer, and an output layer.
  • the classification device 10 inputs the training data for the classifier into the trained detector, and acquires feature vectors from each of the plurality of intermediate layers (step S102).
  • the classification device 10 inputs the acquired feature vector into the classifier corresponding to each intermediate layer, and learns each classifier (step S103).
  • the classification device 10 determines the weight of each classifier based on the classification result of the classifier (step S104).
  • FIG. 9 is a flowchart showing the flow of the classification process.
  • the classification device 10 inputs the image data to the trained detector and acquires the feature vector from each of the plurality of intermediate layers (step S201).
  • the classification device 10 inputs the acquired feature vector into the corresponding classifier (step S202). Then, the classification device 10 outputs the classification result based on the combined output calculated by weighting the classification result of the classifier (step S203).
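Steps S202 and S203 amount to reapplying the weights stored during training to the per-layer classifier outputs for an unknown image. A minimal sketch, with hypothetical layer names and score values:

```python
def classify(per_layer_scores, weights):
    """Weight each layer's classifier output and combine (steps S202-S203).

    per_layer_scores: {layer: {class: score}} for one input image.
    weights: {layer: weight} stored as weight information during training.
    Returns the winning class and the combined per-class scores.
    """
    combined = {}
    for layer, scores in per_layer_scores.items():
        w = weights.get(layer, 0.0)
        for cls, s in scores.items():
            combined[cls] = combined.get(cls, 0.0) + w * s
    return max(combined, key=combined.get), combined

# hypothetical per-layer classifier outputs for one unknown image,
# combined with weights assumed to have been stored at training time
label, detail = classify(
    {"layer3": {"A": 0.6, "B": 0.4}, "layer7": {"A": 0.2, "B": 0.8}},
    {"layer3": 0.09, "layer7": 0.49},
)
```

Note that no weight is recomputed at classification time; the stored weight information is simply looked up and applied.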
  • the classification device 10 acquires a feature vector from each of a plurality of intermediate layers when data of an image showing one or more objects is input to the input layer of a neural network model having an input layer, the plurality of intermediate layers, and an output layer.
  • the classification device 10 updates each classifier based on the classification result when the acquired feature vector is input to the classifier corresponding to each of the feature vectors.
  • the classification device 10 determines the weight of each classifier based on the comparison result between the output result of each classifier and the correct answer data.
  • the classification device 10 classifies the objects reflected in the image data into one of a predetermined class based on the output result and the weight of the classifier.
  • the feature vector acquired from any given intermediate layer may or may not be suitable for classification.
  • as shown in FIG. 10, in a DNN that performs image recognition, features of fine (narrow) regions tend to appear in intermediate layers close to the input layer, while abstract (wide-region) features tend to appear in intermediate layers close to the output layer.
  • the classification device 10 can assign a weight according to accuracy to each of the classifiers corresponding to the plurality of intermediate layers. Therefore, the classification device 10 can improve the accuracy of the classifier.
  • the classification device 10 compresses the dimensionality of the feature vectors before acquiring them. As a result, the classification device 10 can reduce the amount of computation.
  • the classification device 10 determines the weight of each classifier so that the higher the accuracy of the classifier, the larger the weight. As a result, the classification device 10 can increase the influence of particularly accurate classifiers and improve the final accuracy while still utilizing the information output by every classifier.
  • the classification device 10 acquires a feature vector from the intermediate layer of the convolutional neural network. As a result, the classification device 10 can improve the accuracy of the classifier for image recognition in general.
  • the classification device 10 has been described as performing both the learning process and the classification process.
  • the classification device 10 may receive information on the model trained by the learning device and perform only the classification process.
  • the classifier is assumed to be an SVM.
  • the classifier is not limited to an SVM, and may be, for example, a DNN.
  • the method of the example can be applied not only to the classification problem but also to the regression problem.
  • in that case, the classification device 10 calculates, for the output of each classifier, a weighted average weighted by the accuracy itself.
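For the regression case, a weighted average using the accuracy itself as the weight could be sketched as follows; the layer names and predicted values are hypothetical, and "grip width" is used only because it is one of the quantities the embodiment's labels contain.

```python
def weighted_average(predictions, accuracies):
    """Combine per-layer regression outputs, weighted by accuracy itself."""
    total = sum(accuracies[name] for name in predictions)
    return sum(accuracies[name] * y for name, y in predictions.items()) / total

# hypothetical grip-width predictions from three per-layer regressors,
# with hypothetical accuracies used directly (not squared) as weights
pred = weighted_average(
    {"layer3": 10.0, "layer7": 12.0, "layer11": 11.0},
    {"layer3": 0.5, "layer7": 0.9, "layer11": 0.6},
)
```

Normalizing by the total weight keeps the combined prediction on the same scale as the individual per-layer predictions.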
  • each component of each device shown in the figures is a functional concept and does not necessarily have to be physically configured as shown. That is, the specific forms of distribution and integration of each device are not limited to those illustrated, and all or a part of them can be functionally or physically distributed or integrated in arbitrary units according to various loads, usage conditions, and the like. Further, each processing function performed by each device may be realized by a CPU and a program analyzed and executed by the CPU, or may be realized as hardware by wired logic.
  • FIG. 11 is a diagram illustrating a hardware configuration example.
  • the classification device 10 includes a communication interface 10a, an HDD (Hard Disk Drive) 10b, a memory 10c, and a processor 10d. Further, the parts shown in FIG. 11 are connected to each other by a bus or the like.
  • the communication interface 10a is a network interface card or the like, and communicates with other servers.
  • the HDD 10b stores programs and a DB for operating the functions shown in FIG. 2.
  • the processor 10d is a hardware circuit that reads a program for executing the same processing as each processing unit shown in FIG. 2 from the HDD 10b or the like and expands it into the memory 10c, thereby operating a process that executes each function described with reference to FIG. 2. That is, this process executes the same functions as the processing units of the classification device 10. Specifically, the processor 10d reads a program having the same functions as the acquisition unit 131, the determination unit 132, the classification unit 133, and the update unit 134 from the HDD 10b or the like, and executes a process that performs the same processing as those units.
  • the classification device 10 operates as an information processing device that executes a learning method by reading and executing a program. Further, the classification device 10 can realize the same function as that of the above-described embodiment by reading the program from the recording medium by the medium reading device and executing the read program.
  • the program referred to in the other examples is not limited to being executed by the classification device 10.
  • the present invention can be similarly applied when another computer or server executes a program, or when they execute a program in cooperation with each other.
  • This program can be distributed via networks such as the Internet.
  • this program is recorded on a computer-readable recording medium such as a hard disk, flexible disk (FD), CD-ROM, MO (Magneto-Optical disk), or DVD (Digital Versatile Disc), and can be executed by being read from the recording medium by a computer.
  • 1 Classification system
  • 10 Classification device
  • 11 Input / output unit
  • 12 Storage unit
  • 13 Control unit
  • 20 Picking system
  • 21 Picking robot
  • 22 Camera
  • 23 Tray
  • 121 Learning data information
  • 122 Weight information
  • 123 Detector information
  • 124 Classifier information
  • 131 Acquisition unit
  • 132 Determination unit
  • 133 Classification unit
  • 134 Update unit

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A classification device (10) acquires a feature vector from each of a plurality of intermediate layers of a neural network model, which has an input layer, the plurality of intermediate layers, and an output layer, when data of a captured image of one or more objects is input to the input layer. When the acquired feature vectors are input to their respective corresponding classifiers, the classification device (10) updates each classifier on the basis of the classification results. The classification device (10) determines the weight of each classifier on the basis of the result of a comparison between the output result of each classifier and correct answer data. The classification device (10) classifies an object represented by the image data into one of prescribed classes on the basis of the output result and the weight of each classifier.
PCT/JP2019/030085 2019-07-31 2019-07-31 Classification method, classification program, classification device, learning method, learning program, and learning device Ceased WO2021019748A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/JP2019/030085 WO2021019748A1 (fr) 2019-07-31 2019-07-31 Classification method, classification program, classification device, learning method, learning program, and learning device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2019/030085 WO2021019748A1 (fr) 2019-07-31 2019-07-31 Classification method, classification program, classification device, learning method, learning program, and learning device

Publications (1)

Publication Number Publication Date
WO2021019748A1 (fr) 2021-02-04

Family

ID=74228679

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2019/030085 Ceased WO2021019748A1 (fr) 2019-07-31 2019-07-31 Classification method, classification program, classification device, learning method, learning program, and learning device

Country Status (1)

Country Link
WO (1) WO2021019748A1 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2024004405A (ja) * 2022-06-28 2024-01-16 Toyota Motor Corporation Learning method, re-identification device, re-identification method, and program

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2015008567A1 (fr) * 2013-07-18 2015-01-22 NEC Solution Innovators, Ltd. Method, device, and program for estimating facial impression

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2015008567A1 (fr) * 2013-07-18 2015-01-22 NEC Solution Innovators, Ltd. Method, device, and program for estimating facial impression

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
YOKOTA, YASUTO ET AL.: "Filtering of middle layer outputs for object classification using a model trained detecting grasping positions", THE 33RD ANNUAL CONFERENCE OF THE JAPANESE SOCIETY FOR ARTIFICIAL INTELLIGENCE, 1 June 2019 (2019-06-01), pages 1 - 3, XP055788104, DOI: https://doi.org/10.11517/pjsai.JSAI2019.0_1L2J1103 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2024004405A (ja) * 2022-06-28 2024-01-16 Toyota Motor Corporation Learning method, re-identification device, re-identification method, and program
JP7609132B2 (ja) 2022-06-28 2025-01-07 Toyota Motor Corporation Learning method, re-identification device, re-identification method, and program

Similar Documents

Publication Publication Date Title
US11023806B2 (en) Learning apparatus, identifying apparatus, learning and identifying system, and recording medium
CN111563601B (zh) Representation learning using joint semantic vectors
US11640518B2 (en) Method and apparatus for training a neural network using modality signals of different domains
JP6159489B2 (ja) Face authentication method and system
US20170116467A1 (en) Facial Expression Capture for Character Animation
CN112368719A (zh) Gradient adversarial training of neural networks
US20150325046A1 (en) Evaluation of Three-Dimensional Scenes Using Two-Dimensional Representations
KR20190118387A (ko) Convolutional neural network-based image processing system and method
CN107533665A (zh) Incorporating top-down information in deep neural networks via bias terms
CN106548190A (zh) Model training method and device, and data recognition method
US20190042942A1 (en) Hybrid spiking neural network and support vector machine classifier
US11301723B2 (en) Data generation device, data generation method, and computer program product
WO2018207334A1 (fr) Image recognition device, method, and program
CN114118259A (zh) Target detection method and device
Mathe et al. A deep learning approach for human action recognition using skeletal information
JP2007128195A (ja) Image processing system
WO2021200392A1 (fr) Data adjustment system, data adjustment device, data adjustment method, terminal device, and information processing device
EP4293579A1 (fr) Machine learning method for continual learning and electronic device
US12079717B2 (en) Data processing apparatus, training apparatus, method of detecting an object, method of training, and medium
US20220391698A1 (en) Training Recognition Device
KR102120007B1 (ko) 객체 추적 장치 및 객체 추적 방법
JP7472471B2 (ja) Estimation system, estimation device, and estimation method
JP7310927B2 (ja) Object tracking device, object tracking method, and recording medium
WO2021019748A1 (fr) Classification method, classification program, classification device, learning method, learning program, and learning device
JP7446338B2 (ja) Method, apparatus, device, and storage medium for detecting the degree of association between a face and a hand

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19939956

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19939956

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: JP