WO2024201655A1

WO2024201655A1 - Learning data generation device, robot system, learning data generation method, and learning data generation program

Info

Publication number: WO2024201655A1
Application number: PCT/JP2023/012217
Authority: WO
Inventors: 維佳李
Original assignee: Fanuc Corp
Current assignee: Fanuc Corp
Priority date: 2023-03-27
Filing date: 2023-03-27
Publication date: 2024-10-03
Anticipated expiration: 2025-09-27
Also published as: CN120530406A; TW202439201A; JPWO2024201655A1; DE112023005644T5

Abstract

The present invention provides a learning data generation device that can generate learning data without high labor costs or an enormous amount of work time. The learning data generation device, which generates learning data to be used in machine learning, comprises a data acquisition unit, a data processing unit, and a data storage unit. The data acquisition unit acquires an image of an area in which a plurality of workpieces exist; the data processing unit estimates, on the basis of the acquired image, at least area information reflecting the area range of the whole or a part of at least one workpiece, and generates teaching data that includes the estimated area information; and the data storage unit stores the generated teaching data and the image as the learning data.

Description

LEARNING DATA GENERATION DEVICE, ROBOT SYSTEM, LEARNING DATA GENERATION METHOD, AND LEARNING DATA GENERATION PROGRAM

The present disclosure relates to a learning data generation device, a robot system, a learning data generation method, and a learning data generation program.

In recent years, machine learning has come into practical use in a variety of fields. One known machine learning technique is supervised learning. In supervised learning, a large number of pairs of features (variables that represent characteristics of the data and serve as clues for prediction) and teacher data (labels, i.e., correct-answer data) are prepared and given to a computer (machine learning device). Annotation (annotation teaching) is one way of generating the teacher data.

Annotation is the work of generating teacher data (teaching data) by attaching labels such as relevant tags and metadata to large amounts of image, audio, video, and text data. For images specifically, a person manually clicks on the image to teach the area information of an object (workpiece), for example. Performing annotation by hand in this way requires high labor costs and an enormous amount of work time.

Various techniques have been proposed for easily generating the teacher data used in supervised learning.

JP 2022-118300 A; JP 2014-059729 A

As described above, annotation has conventionally been performed by hand and therefore consumes high labor costs and an enormous amount of work time. Furthermore, when the position, orientation, and shape region of objects are annotated in a large amount of image data, for example, differences in judgment among the individuals doing the work (the annotators) must be kept to a minimum. This requires detailed annotation rules to be drawn up and the annotators to be trained and selected, which drives labor costs still higher and makes the work take even longer.

There is therefore a demand for a learning data generation device, a robot system, a learning data generation method, and a learning data generation program that can perform annotation without high labor costs or an enormous amount of work time.

According to one embodiment of the present disclosure, a learning data generation device is provided that includes a data acquisition unit, a data processing unit, and a data storage unit, and that generates learning data to be used in machine learning. The data acquisition unit acquires an image of an area in which a plurality of workpieces exist. The data processing unit estimates, based on the acquired image, at least area information reflecting the area range of all or part of at least one workpiece, and generates teaching data including the estimated area information. The data storage unit stores the generated teaching data and the image as learning data.

FIG. 1 is a diagram schematically showing an example of an entire robot system, for explaining an example of the learning data generation device according to the present embodiment.
FIG. 2 is a functional block diagram for explaining an example of the learning data generation device according to the present embodiment.
FIG. 3 is a flowchart for explaining an example of processing in a first example of the learning data generation program according to the present embodiment.
FIG. 4 is a flowchart for explaining an example of processing in a second example of the learning data generation program according to the present embodiment.
FIG. 5 is a flowchart for explaining an example of processing in a third example of the learning data generation program according to the present embodiment.
FIG. 6 is a flowchart for explaining an example of processing in a fourth example of the learning data generation program according to the present embodiment.
FIG. 7 is a diagram for explaining an example of a process in which a user corrects teaching data in the learning data generation method according to the present embodiment.
FIG. 8 is a flowchart for explaining an example of processing in a fifth example of the learning data generation program according to the present embodiment.
FIG. 9 is a diagram showing an example of a workpiece in a robot system in which an example of the learning data generation device according to the present embodiment is used.
FIG. 10 is a diagram for explaining the shape region of a workpiece in an example of the learning data generation device according to the present embodiment.

Examples of the learning data generation device, robot system, learning data generation method, and learning data generation program according to the present embodiment are described in detail below with reference to the attached drawings. In the drawings, identical or similar components are given identical or similar reference numerals. The embodiments described below do not limit the technical scope of the invention recited in the claims or the meaning of the terms used there.

FIG. 1 is a diagram schematically showing an example of an entire robot system, for explaining an example of the learning data generation device according to the present embodiment. As shown in FIG. 1, the robot system 100 includes a robot 1, a robot control device 2, a learning data generation device 3, and a camera 4. The robot 1 includes a robot mechanism unit 10, an arm 11, and an end effector (hand unit) 12.

In the robot system 100 shown in FIG. 1, the machine learning device that performs machine learning (supervised learning) is built into the robot control device 2 and is therefore not shown. However, if the amount of computation or data is so large that building it into the robot control device 2 is difficult, the machine learning device can instead be implemented as, for example, a dedicated workstation installed near the robot control device 2, or a host computer or general-purpose computer installed at a location remote from the robot system 100. Furthermore, when a large amount of data is fed into a learning model for training, a general-purpose computer or processor may be used, but GPGPU (General-Purpose computing on Graphics Processing Units) or a large-scale PC cluster allows faster processing.

The robot 1 is configured as, for example, a multi-axis robot, and the end effector 12 is provided at the tip of the arm 11. In FIG. 1 the end effector 12 is a suction device (suction hand), but it can of course be replaced with other types depending on the workpieces (objects) handled by the robot system 100 and the work to be done. The robot mechanism unit 10 makes the robot 1 perform a specified operation based on control commands from the robot control device 2.

The robot control device 2 receives the output of the camera 4 and the output of the learning data generation device 3, generates control commands for making the robot 1 perform a specified operation based on, for example, programs and control data stored in advance in an internal storage device, and outputs the commands to the robot mechanism unit 10. As mentioned above, the machine learning device that performs supervised learning need not be built into the robot control device 2; depending on the amount of computation and data, it may be provided as a separate unit near the robot control device 2 or at a location remote from the robot system 100.

The learning data generation device 3 receives a plurality of images in which workpieces have been captured, and generates teaching data (teacher data) that includes area information (information on the shape region of each workpiece) for the workpieces D1 to D9 in each image. The learning data generation device 3 then outputs the generated teaching data and the corresponding images to the robot control device 2 (machine learning device) as learning data. Here the learning data generation device 3 receives image data of the workpieces D1 to D9 captured by the camera 4, but the image data given to the learning data generation device 3 is not limited to images captured by the camera 4 of the robot system 100. It may be any of various kinds of image data, such as images of workpieces captured in advance or images obtained by another robot system. The images input to the learning data generation device 3 are also not limited to two-dimensional images; as described in detail later, three-dimensional data (three-dimensional images, three-dimensional point cloud data, three-dimensional measurement data) may be input together with them.

The camera 4 acquires a two-dimensional image, or a two-dimensional image and three-dimensional data (three-dimensional point cloud data), of the area in which a plurality of workpieces (for example, a plurality of cardboard boxes) D1 to D9 exist, and includes two cameras 4a and 4b and a projector 4c. The projector 4c projects a predetermined pattern onto the area in which the workpieces D1 to D9 exist, and the two cameras 4a and 4b capture that area with the pattern projected onto it to measure the three-dimensional shapes of the workpieces D1 to D9. The camera 4 may thus be configured to measure the three-dimensional shape of the area in which the workpieces D1 to D9 exist and acquire three-dimensional point cloud data, but it can also be configured as, for example, a single two-dimensional camera. The camera 4 in FIG. 1 can also acquire a two-dimensional image of the area in which the workpieces D1 to D9 exist by using the image captured by one of the two cameras 4a and 4b. In other words, the camera 4 only needs to be able to capture images appropriate to the workpieces handled and the work performed by the robot 1, or appropriate to the annotation for supervised learning.

FIG. 2 is a functional block diagram for explaining an example of the learning data generation device according to the present embodiment. As shown in FIG. 2, the learning data generation device 3, which generates learning data to be used in machine learning, includes a data acquisition unit 31, a data processing unit (arithmetic processing device) 32, a data storage unit 33, a reception unit 34, and a display unit 35. The data acquisition unit 31 acquires an image (a two-dimensional image, or a two-dimensional image and three-dimensional point cloud data) of the area in which a plurality of workpieces (D1 to D9) exist, and a trained model. The data processing unit 32 estimates, based on the image acquired by the data acquisition unit 31, at least area information reflecting the area range of all or part of at least one workpiece (D0, D), and generates teaching data including the estimated area information. The trained model can be, for example, a learning model generated by another learning data generation device 3, or a learning model generated earlier by the same learning data generation device 3 (a previous learning model). As described in detail later with reference to FIGS. 3 to 8, in examples of the present embodiment that do not use a trained model, the data acquisition unit 31 does not need to acquire one. Similarly, in examples that do not use three-dimensional point cloud data, the data acquisition unit 31 does not need to acquire three-dimensional point cloud data.

The area information of the workpiece (D0) includes, for example, at least one of: outer shape area information reflecting the outer shape of the workpiece; pick-up area information for picking the workpiece up by adhesion, suction, or gripping; and local area information on the workpiece. The local area information of the workpiece (D0) includes, for example, at least one of: a flat surface, a curved surface, or a large-area portion on the workpiece; a slip-resistant area on the workpiece; and a high-density area on the workpiece. The local area information is described in detail later with reference to FIG. 9.
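The kinds of area information listed above could be held in a small record type. The following is a minimal sketch, not the structure used by the disclosed device: the names `AreaKind`, `LocalProperty`, `Area`, and `kinds_present`, and the choice of a polygon in image coordinates as the representation, are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import List, Set, Tuple


class AreaKind(Enum):
    OUTLINE = auto()   # outer shape of the workpiece
    PICKUP = auto()    # area used to pick the workpiece up (suction, gripping)
    LOCAL = auto()     # local area on the workpiece


class LocalProperty(Enum):
    FLAT = auto()
    CURVED = auto()
    LARGE_AREA = auto()
    NON_SLIP = auto()
    HIGH_DENSITY = auto()


@dataclass
class Area:
    kind: AreaKind
    # Hypothetical representation: polygon vertices in image coordinates.
    polygon: List[Tuple[float, float]]
    # Only meaningful for LOCAL areas; empty otherwise.
    properties: Set[LocalProperty] = field(default_factory=set)


def kinds_present(areas: List[Area]) -> Set[AreaKind]:
    """Return which kinds of area information a workpiece record contains."""
    return {a.kind for a in areas}
```

A record for one workpiece is then a list of `Area` objects, and `kinds_present` makes it easy to check that at least one of the three kinds is present.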

The data storage unit 33 stores, as learning data, the teaching data generated by the data processing unit 32 together with the image (for example, a two-dimensional image, or a two-dimensional image and three-dimensional point cloud data). The reception unit 34 receives correction information, entered by an operator (user) for example, that is based on the area information of at least one workpiece. In that case the data processing unit 32 corrects the teaching data based on the correction information received by the reception unit 34 and outputs the result. The display unit 35 displays the image and the teaching data so that, for example, the user can make further corrections to the teaching data (learning data).

When the data acquisition unit 31 acquires a trained model, the data processing unit 32 estimates the area information of the workpiece based on the image and the trained model, and generates teaching data. When the data acquisition unit 31 acquires a two-dimensional image of the area in which a plurality of workpieces (D0 to D9, D) exist, the data processing unit 32 estimates the area information based on that two-dimensional image and generates teaching data. When the data acquisition unit 31 acquires both a two-dimensional image and three-dimensional data (three-dimensional point cloud data) of the area in which the workpieces (D0 to D9, D) exist, the data processing unit 32 estimates the area information based on the two-dimensional image and the three-dimensional point cloud data and generates teaching data.
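The three cases above amount to choosing an estimation path from whatever inputs were acquired. A minimal sketch of that choice follows; `AcquiredData` and `select_estimation_path` are hypothetical names, and the order in which the cases are checked (trained model first) is an assumption, since the text does not state a priority.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

Image = List[List[int]]                          # grayscale image as nested lists
PointCloud = List[Tuple[float, float, float]]    # (x, y, z) points
Model = Callable[[Image], list]                  # stand-in for a trained model


@dataclass
class AcquiredData:
    image: Image
    point_cloud: Optional[PointCloud] = None     # optional, per the text
    trained_model: Optional[Model] = None        # optional, per the text


def select_estimation_path(data: AcquiredData) -> str:
    """Pick which inputs the area estimation will be based on."""
    # Assumed priority: a trained model, if present, is used first.
    if data.trained_model is not None:
        return "trained_model"
    if data.point_cloud is not None:
        return "image_and_point_cloud"
    return "image_only"
```

For example, `select_estimation_path(AcquiredData(image=[[0]]))` selects the image-only path, which corresponds to the case where neither a trained model nor point cloud data is acquired.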

As analysis of the three-dimensional point cloud data, the data processing unit 32 can perform at least one of, for example, plane analysis, curved surface analysis, blob analysis, coordinate system analysis, orientation analysis, depth analysis, scale analysis, feature analysis, neighboring point analysis, mesh analysis, and voxel analysis. Based on the result, it can estimate three-dimensional area information of the workpiece, match the three-dimensional point cloud data against the two-dimensional image to estimate the area information of the workpiece in the two-dimensional image, and generate teaching data. The data processing unit 32 can also correct the teaching data based on the area information of the workpiece estimated from the analysis of the three-dimensional point cloud data, and output the result. In addition, the data processing unit 32 can perform image processing based on the image and the area information of at least one workpiece received by the reception unit 34 to extract features, match the extracted features to estimate the area information of workpieces in the image, and generate teaching data.
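One way to picture "depth analysis followed by matching the point cloud against the two-dimensional image" is sketched below. This is an illustration under stated assumptions rather than the disclosed method: it assumes the points are expressed in the camera frame with z as depth, that the topmost workpiece surface is the set of points nearest the camera, and that a pinhole camera model with known intrinsics `fx`, `fy`, `cx`, `cy` applies. All function and parameter names are hypothetical.

```python
from typing import List, Set, Tuple

Point3D = Tuple[float, float, float]   # (x, y, z) in the camera frame, z = depth


def top_surface_points(points: List[Point3D], tol: float) -> List[Point3D]:
    """Depth analysis: keep points within `tol` of the smallest depth."""
    z_min = min(p[2] for p in points)
    return [p for p in points if p[2] - z_min <= tol]


def project_to_pixels(points: List[Point3D],
                      fx: float, fy: float,
                      cx: float, cy: float) -> Set[Tuple[int, int]]:
    """Project 3D points into the 2D image with a pinhole camera model."""
    pixels = set()
    for x, y, z in points:
        if z <= 0:          # points behind the camera cannot be projected
            continue
        u = int(round(fx * x / z + cx))
        v = int(round(fy * y / z + cy))
        pixels.add((u, v))
    return pixels
```

The pixel set returned by `project_to_pixels(top_surface_points(cloud, tol), ...)` would then serve as a candidate area for the workpiece in the two-dimensional image.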

In the above, the data processing unit 32 can also perform image processing on the image and estimate the area information of the workpiece based on the image and the result of the image processing to generate teaching data. As the image processing, the data processing unit 32 may perform at least one of, for example, pattern matching, plane matching, curved surface matching, blob analysis, feature analysis, gradient analysis, edge analysis, contrast analysis, histogram analysis, and color information analysis, and estimate the area information of the workpiece based on the result. The data processing unit 32 can also correct the teaching data based on the area information of the workpiece estimated from the result of the image processing, and output the result.
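Of the image processing methods listed, blob analysis is easy to show in a few lines. The sketch below thresholds a grayscale image and groups bright pixels into 4-connected blobs with a breadth-first search; each blob is a candidate workpiece area. The function name `find_blobs`, the fixed threshold, and the nested-list image format are assumptions made for illustration, not details from the disclosure.

```python
from collections import deque
from typing import List, Tuple

Pixel = Tuple[int, int]   # (row, col)


def find_blobs(image: List[List[int]], threshold: int) -> List[List[Pixel]]:
    """Group pixels with value >= threshold into 4-connected blobs."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    blobs = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or image[r][c] < threshold:
                continue
            # Breadth-first search from an unvisited bright pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            blob = []
            while queue:
                y, x = queue.popleft()
                blob.append((y, x))
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and image[ny][nx] >= threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            blobs.append(blob)
    return blobs
```

A production system would more likely use an image processing library for this step; the pure-Python version is shown only to make the idea concrete.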

Before the first to fifth examples of the learning data generation program (learning data generation method) according to the present embodiment are described with reference to FIGS. 3 to 8, an example of the workpiece (object) D0 to be handled is described with reference to FIG. 9, and the term "shape region of the workpiece" as used in this specification is explained with reference to FIG. 10.

FIG. 9 is a diagram showing an example of a workpiece in a robot system in which an example of the learning data generation device according to the present embodiment is used. In the robot system according to the present embodiment, the workpiece (object) D0 to be handled is not limited to rectangular parallelepiped objects of a single material such as the cardboard boxes D1 to D9 shown in FIG. 1; various objects are possible. The workpiece D0 shown in FIG. 9 is an air fitting in which areas Da, Db, and Dd are made of plastic (for example, PTFE: polytetrafluoroethylene) and area Dc is made of metal (for example, brass or stainless steel). The metal forming area Dc has a higher specific gravity than the plastic forming areas Da, Db, and Dd. In addition, area Dc surrounds the region from Db to Dd with hexagonal flat faces so that it can easily be tightened with a tool such as a wrench.

Local areas on a workpiece come in many kinds, such as flat surfaces, curved surfaces, large-area portions, slip-resistant areas, and high-density (heavy) areas. When a workpiece is picked up with the suction hand (12), for example, the success rate of the pick-up varies with the area of the workpiece to which suction is applied. It is therefore preferable for the learning data (teaching data) to include area information that encloses the outline of a local area on the workpiece. Specifically, when the workpiece is the air fitting D0 shown in FIG. 9, applying the suction hand 12 to the flat area Dc is expected to give a higher pick-up success rate than applying it to the curved areas Da, Db, and Dd. Furthermore, applying suction to area Dc, which is made of high-density (heavy) metal, rather than to areas Da, Db, and Dd, which are made of low-density (light) plastic, allows the workpiece to be picked up at a position closer to the center of gravity of the whole workpiece D0, which is also expected to raise the success rate. In other words, having the suction hand 12 hold the flat metal area Dc of the air fitting D0 makes it possible to pick up the workpiece D0 stably.
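The reasoning above (prefer flat over curved, and dense over light, when choosing where to apply suction) can be expressed as a simple scoring rule. The sketch below is purely illustrative: the weights, the `LocalRegion` type, and the scoring formula are assumptions and do not come from the disclosure; the density figures are approximate textbook values for PTFE and brass in g/cm^3.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LocalRegion:
    name: str
    is_flat: bool
    density: float   # approximate material density in g/cm^3


def pickup_score(region: LocalRegion,
                 w_flat: float = 10.0,
                 w_density: float = 1.0) -> float:
    """Hypothetical score: flat surfaces and dense material score higher."""
    flat_term = w_flat if region.is_flat else 0.0
    return flat_term + w_density * region.density


def best_pickup_region(regions: List[LocalRegion]) -> LocalRegion:
    """Return the region with the highest pick-up score."""
    return max(regions, key=pickup_score)


# The air fitting D0 of FIG. 9, with illustrative material densities.
fitting = [
    LocalRegion("Da", is_flat=False, density=2.2),   # PTFE, curved
    LocalRegion("Db", is_flat=False, density=2.2),   # PTFE, curved
    LocalRegion("Dc", is_flat=True, density=8.5),    # brass, flat faces
    LocalRegion("Dd", is_flat=False, density=2.2),   # PTFE, curved
]
```

With these inputs `best_pickup_region(fitting)` selects `Dc`, matching the conclusion in the text. In the disclosed approach such preferences are captured through the area information in the teaching data rather than hand-written weights.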

Here it is preferable, as in the fourth example described with reference to FIGS. 6 and 7, to have the user refer to the image and the teaching data displayed on the display unit 35 and correct the teaching data (learning data). Even then the user performs only part of the processing, so far less effort is needed than with the conventional approach in which a person manually clicks on the image to teach the area information of an object. Thus the area information of the workpiece D0 preferably includes at least one of: outer shape area information reflecting the outer shape of the workpiece D0; pick-up area information for picking the workpiece D0 up by adhesion, suction, or gripping; and local area information on the workpiece D0.

FIG. 10 is a diagram for explaining the shape region of a workpiece in an example of the learning data generation device according to the present embodiment. FIG. 10(a) shows the shape region of the workpiece D, and FIG. 10(b) shows a region that also includes the background of the workpiece D. In this specification, the term "shape region of the workpiece" means, as shown in FIG. 10(a), the region indicated by reference sign A1 in, for example, the two-dimensional image P3 output from the camera 4: the region that contains only the workpiece D and none of its background (the region of the workpiece D itself). It does not mean a region that contains both the workpiece D and the background, as shown in FIG. 10(b). In other words, the "shape region of the workpiece" in this specification is area information that reflects the shape or outline of the workpiece, from which the shape or outline of the workpiece can be calculated.
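The distinction between FIG. 10(a) and FIG. 10(b) can be made concrete by comparing a pixel mask of the workpiece with its bounding box. The sketch below is a minimal illustration under the assumption that the shape region is given as a collection of (row, col) pixels; the function names are hypothetical.

```python
from typing import Iterable, Tuple

Pixel = Tuple[int, int]   # (row, col)


def bounding_box(mask_pixels: Iterable[Pixel]) -> Tuple[int, int, int, int]:
    """Smallest axis-aligned box (r0, c0, r1, c1) containing the mask."""
    pixels = list(mask_pixels)
    rows = [r for r, _ in pixels]
    cols = [c for _, c in pixels]
    return min(rows), min(cols), max(rows), max(cols)


def background_pixels_in_box(mask_pixels: Iterable[Pixel]) -> int:
    """Count pixels inside the bounding box that are not part of the workpiece."""
    pixels = set(mask_pixels)
    r0, c0, r1, c1 = bounding_box(pixels)
    box_area = (r1 - r0 + 1) * (c1 - c0 + 1)
    return box_area - len(pixels)
```

For an L-shaped mask such as `{(0, 0), (1, 0), (1, 1)}` the bounding box covers four pixels, so one background pixel is included. A shape region in the sense of FIG. 10(a) corresponds to the mask itself, with zero background pixels.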

The first to fifth examples of the learning data generation program (learning data generation method) according to the present embodiment are described below with reference to FIGS. 3 to 8. The description is based on the functional block diagram of the learning data generation device 3 described with reference to FIG. 2, but the learning data generation device 3 is of course not limited to the configuration shown in FIG. 2.

FIG. 3 is a flowchart for explaining an example of processing in the first example of the learning data generation program (learning data generation method) according to the present embodiment, and shows the case in which the data acquisition unit 31 acquires a two-dimensional image and a trained model. As shown in FIG. 3, when the processing of the learning data generation program of the first example starts (START), the camera 4 captures a two-dimensional image of the area in which the workpiece D exists in step ST11. That is, as described with reference to FIG. 1, the camera 4 captures a two-dimensional image of the area in which the workpieces (for example, a plurality of cardboard boxes) D1 to D9 exist and outputs it to the learning data generation device 3.

Next, in step ST12, the data acquisition unit 31 acquires the two-dimensional image and the trained model and outputs them to the data processing unit 32. As described with reference to FIG. 2, the two-dimensional image received by the data acquisition unit 31 is the image, captured by the camera 4, of the area in which the workpieces exist. The trained model received by the data acquisition unit 31 can be, for example, one prepared in advance by the provider of the robot system. Alternatively, as mentioned above, the trained model can be a learning model generated by another learning data generation device 3, or a learning model generated earlier by the same learning data generation device 3 (a previous learning model).

Then, in step ST13, the data processing unit 32 estimates the workpiece area based on the two-dimensional image from the data acquisition unit 31 and the trained model, and in step ST14 the data processing unit 32 generates teaching data (teacher data) based on the estimated shape region of the workpiece D. That is, the data processing unit 32 estimates the shape region (area information) of the workpiece based on the two-dimensional image and the trained model, and generates teaching data. The shape region of the workpiece D corresponds to, for example, the shape region A1 in FIG. 10(a), as described with reference to FIG. 10.

The process then proceeds to step ST15, where the data storage unit 33 stores the teaching data and the two-dimensional image as learning data, and the process ends (END). Step ST16 is a process of displaying the two-dimensional image data from step ST12 and the teaching data from step ST14 on the display unit 35; for example, the user can refer to the two-dimensional image and teaching data displayed on the display unit 35 and correct the teaching data. Needless to say, the two-dimensional image and teaching data may also simply be displayed on the display unit 35 without any processing by the user. The process of step ST16 in the first example is the same as steps ST26, ST36, ST47, and ST55 in the examples described below.

In this way, the learning data generation program (learning data generation method) of the first example makes it possible to generate learning data (teaching data) easily, without incurring high labor costs or an enormous amount of work time.
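The flow of steps ST12 to ST15 can be sketched in a few lines of Python. This is only an illustrative sketch: the names `LearningSample`, `DataStorage`, `generate_learning_data`, and `toy_model` are hypothetical, the image is assumed to be a grayscale grid of integers, and the "trained model" is a placeholder threshold function rather than an actual learned model.

```python
from dataclasses import dataclass, field
from typing import Callable, List

Image = List[List[int]]  # grayscale image as rows of pixel values (assumption)
Mask = List[List[int]]   # same size as the image; 1 marks a workpiece pixel


@dataclass
class LearningSample:
    image: Image         # two-dimensional image acquired in ST12
    teaching_data: Mask  # area information estimated in ST13 and ST14


@dataclass
class DataStorage:
    """Hypothetical stand-in for the data storage unit 33."""
    samples: List[LearningSample] = field(default_factory=list)

    def save(self, sample: LearningSample) -> None:
        self.samples.append(sample)


def generate_learning_data(image: Image,
                           trained_model: Callable[[Image], Mask],
                           storage: DataStorage) -> LearningSample:
    mask = trained_model(image)           # ST13: estimate the workpiece area
    sample = LearningSample(image, mask)  # ST14: build the teaching data
    storage.save(sample)                  # ST15: store image and teaching data
    return sample


def toy_model(image: Image) -> Mask:
    """Placeholder for a trained model: marks bright pixels as workpiece."""
    return [[1 if px > 128 else 0 for px in row] for row in image]
```

Passing the model in as a callable keeps the same flow usable whether the model comes from a provider, from another device, or from an earlier run of the same device.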

FIG. 4 is a flowchart for explaining an example of processing in a second example of the learning data generation program according to this embodiment, and shows a case in which the data acquisition unit 31 acquires a two-dimensional image but does not acquire a trained model. As shown in FIG. 4, when the processing of the second example starts (START), in step ST21 the camera 4 captures a two-dimensional image of the area in which the workpieces D (D1 to D9) are present. The process then proceeds to step ST22, where the data acquisition unit 31 acquires the two-dimensional image and outputs it to the data processing unit 32. Here, the two-dimensional image received by the data acquisition unit 31 is, as described with reference to FIG. 2, a two-dimensional image captured by the camera 4 of the area in which the workpieces are present.

Next, the process proceeds to step ST23, where the data processing unit 32 performs image processing on the two-dimensional image from the data acquisition unit 31 to estimate the shape area of the workpiece D. The process then proceeds to step ST24, where the data processing unit 32 generates teaching data based on the estimated shape area of the workpiece D. The process then proceeds to step ST25, where the data storage unit 33 stores the teaching data and the two-dimensional image as learning data, and the process ends (END). Step ST26 corresponds to step ST16 in the first example described above, and its description is omitted.

In this way, because the learning data generation program of the second example does not use the trained model of the first example, processing can be simplified. Without a trained model, however, the accuracy of the teaching data may be somewhat lower than in the first example. It is therefore preferable to decide whether to adopt the second example based on the shape of the target workpiece, the content of the work, the time available before the robot system actually goes into operation, and so on.

As a modification of the learning data generation program (learning data generation method) according to this embodiment, among a plurality of workpieces (D0 to D9, D), the user (operator) may, for example, teach the area information (shape area) of just one workpiece (D0, D) per workpiece type. Image processing is then performed based on the taught area information and the image (two-dimensional image) to extract features. A matching process is then performed on the extracted features to estimate the areas of all (the plurality of) workpieces in the two-dimensional image, and teaching data is generated.

FIG. 5 is a flowchart for explaining an example of processing in a third example of the learning data generation program according to this embodiment, and shows a case in which the data acquisition unit 31 acquires only a two-dimensional image and no trained model, the reception unit 34 receives area information (shape area) of a workpiece, and the data processing unit 32 performs image processing. As shown in FIG. 5, when the processing of the third example starts (START), in step ST31 the camera 4 captures a two-dimensional image of the area in which the workpieces D are present. The process then proceeds to step ST32, where the data acquisition unit 31 acquires the two-dimensional image and outputs it to the data processing unit 32.

Next, the process proceeds to step ST33, where the data processing unit 32 performs image processing based on the two-dimensional image from the data acquisition unit 31 and the area information of the workpiece D, and estimates the shape areas of all (the plurality of) workpieces D1 to D9 (D0, D) in the image. Here, the image processing performed by the data processing unit 32 is, for example, at least one of pattern matching, plane matching, curved-surface matching, blob analysis, feature analysis, gradient analysis, edge analysis, contrast analysis, histogram analysis, and color information analysis.
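As a rough illustration of estimating all workpiece areas from one taught area, the sketch below uses plain pattern matching with a sum of absolute differences (SAD). It is an assumption-laden toy, not the method of the disclosure: `match_taught_region`, the `Box` layout, and the `max_sad` threshold are hypothetical, and a practical system would also need to handle rotation, scale, and overlapping hits.

```python
from typing import List, Tuple

Image = List[List[int]]
Box = Tuple[int, int, int, int]  # (top, left, height, width), hypothetical layout


def crop(image: Image, box: Box) -> Image:
    top, left, h, w = box
    return [row[left:left + w] for row in image[top:top + h]]


def sad(image: Image, template: Image, top: int, left: int) -> int:
    """Sum of absolute differences between the template and one image window."""
    return sum(abs(image[top + r][left + c] - template[r][c])
               for r in range(len(template))
               for c in range(len(template[0])))


def match_taught_region(image: Image, taught: Box, max_sad: int) -> List[Box]:
    """Return every window whose appearance is close to the taught region."""
    template = crop(image, taught)
    h, w = len(template), len(template[0])
    hits: List[Box] = []
    for top in range(len(image) - h + 1):
        for left in range(len(image[0]) - w + 1):
            if sad(image, template, top, left) <= max_sad:
                hits.append((top, left, h, w))
    return hits
```

The taught region itself is always among the hits, so the result covers both the workpiece the user marked and the others that resemble it.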

The process then proceeds to step ST34, where the data processing unit 32 generates teaching data based on the estimated shape area of the workpiece D, and then to step ST35, where the data storage unit 33 stores the teaching data and the two-dimensional image as learning data, and the process ends (END). Step ST36 corresponds to steps ST16 and ST26 described above; for example, as in steps ST45 to ST47 of the fourth example described with reference to FIG. 6, the user can also refer to the two-dimensional image and teaching data displayed on the display unit 35 and correct the teaching data. Here, the teaching data can be corrected by the user based on, for example, the surface shape or condition of the workpiece D or the material forming each area, as described with reference to FIG. 9. It is also possible, however, to use a machine learning device or the like to automate the identification of various areas on the workpiece D, such as flat surfaces, curved surfaces, large-area regions, slip-resistant regions, and high-density (heavy) regions.

FIG. 6 is a flowchart for explaining an example of processing in a fourth example of the learning data generation program according to this embodiment, and shows a case in which the data acquisition unit 31 acquires a two-dimensional image and a trained model and the user corrects the teaching data. As shown in FIG. 6, when the processing of the fourth example starts (START), in step ST41 the camera 4 captures a two-dimensional image of the area in which the workpieces D are present. The process then proceeds to step ST42, where the data acquisition unit 31 acquires the two-dimensional image and the trained model and outputs them to the data processing unit 32.

Next, the process proceeds to step ST43, where the data processing unit 32 estimates the shape area of the workpiece D based on the two-dimensional image and the trained model from the data acquisition unit 31, and then to step ST44. In step ST44, the data processing unit 32 generates teaching data based on the estimated shape area of the workpiece D.

The process then proceeds to step ST45, where the user (operator) corrects the teaching data while viewing (referring to) the display unit 35, and then to step ST46, where the data storage unit 33 stores the teaching data and the two-dimensional image as learning data, after which the process ends (END). Step ST47 corresponds to steps ST16, ST26, and ST36 described above. In the fourth example, as shown in steps ST45 to ST47, the user can refer to the two-dimensional image and teaching data displayed on the display unit 35 and correct the teaching data. Here, the teaching data can be corrected by the user based on, for example, the surface shape or condition of the workpiece D or the material forming each area, as described with reference to FIG. 9. It is also possible, however, to use a machine learning device or the like to automate the identification of various areas on the workpiece D, such as flat surfaces, curved surfaces, large-area regions, slip-resistant regions, and high-density (heavy) regions.

FIG. 7 is a diagram for explaining an example of the process by which the user corrects teaching data in the learning data generation method according to this embodiment. FIGS. 7(a), 7(b), and 7(c) in the upper row show a case where the teaching data is correct and no correction by the user is needed, while FIGS. 7(d), 7(e), and 7(f) in the lower row show a case where the teaching data is incorrect and correction by the user is needed. FIGS. 7(a) and 7(d) are two-dimensional images of a plurality of workpieces, FIGS. 7(b) and 7(e) show the contour lines extracted by image processing superimposed on the two-dimensional images, and FIGS. 7(c) and 7(f) show the final teaching results (final learning data).

First, the camera 4 captures the area in which a plurality of workpieces (for example, cardboard boxes) are arranged; the arrangement is then changed and further images are captured, so that a plurality of two-dimensional images of differently arranged workpieces are obtained (FIGS. 7(a) and 7(d)). Next, image processing is performed on each image to extract the contour-line features of each workpiece in the image (FIGS. 7(b) and 7(e)). A contour line can be extracted, for example, by binarizing the two-dimensional image to produce a black-and-white image and searching for the boundaries between white and black. For example, when the workpieces are cardboard boxes, the color within each workpiece area is nearly uniform, and the larger the color difference between the workpiece area and the background area, the more accurately the contour lines can be obtained.
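The binarize-then-search-for-boundaries approach described above can be sketched as follows. This is a minimal illustration under stated assumptions: the image is a grayscale grid of integers, workpieces are brighter than the background, the threshold is fixed by hand, and the function names `binarize` and `contour_pixels` are hypothetical.

```python
from typing import List, Set, Tuple

Image = List[List[int]]


def binarize(image: Image, threshold: int) -> Image:
    """Turn a grayscale image into a black (0) and white (1) image."""
    return [[1 if px >= threshold else 0 for px in row] for row in image]


def contour_pixels(binary: Image) -> Set[Tuple[int, int]]:
    """White pixels that touch a black pixel or the image border (4-neighbourhood)."""
    rows, cols = len(binary), len(binary[0])
    contour: Set[Tuple[int, int]] = set()
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] != 1:
                continue
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr, nc = r + dr, c + dc
                outside = not (0 <= nr < rows and 0 <= nc < cols)
                if outside or binary[nr][nc] == 0:
                    contour.add((r, c))
                    break
    return contour
```

With a single global threshold, a workpiece whose color is close to the background falls on the wrong side of the threshold and loses part of its contour, which is the kind of failure the user correction step is meant to catch.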

First, consider the case where the contour lines extracted by image processing are correct: for example, image processing of the captured image of workpieces a11 to a14 in FIG. 7(a) yields the contour-line features of workpieces b11 to b14 in FIG. 7(b). The user then views, on the display unit 35, the two-dimensional image of the workpieces together with the teaching data, as shown in FIG. 7(b), and judges whether this learning data (two-dimensional image and teaching data) is correct. If it is judged correct, the learning data (teaching data) is left unmodified, the final learning data (teaching data c11 to c14) shown in FIG. 7(c) is obtained, and this learning data is stored in the data storage unit 33.

Next, consider the case where the contour lines extracted by image processing are incorrect: for example, image processing of the captured image of workpieces d11 to d14 in FIG. 7(d) yields the contour-line features of workpieces e11 to e14 in FIG. 7(e). Suppose, as shown in FIG. 7(d), that the color within the area of workpiece d13 is very close to the background color, and that workpieces d12 and d14 are touching so that the boundary between them is unclear. Then, as shown in FIG. 7(e), the image processing fails to recognize the left and lower contour lines of workpiece e13, and fails to recognize the contour line between workpieces e12 and e14, so that they are erroneously judged to be a single workpiece. The user views, on the display unit 35, the two-dimensional image of the workpieces together with the teaching data, as shown in FIG. 7(e), and judges whether this learning data (two-dimensional image and teaching data) is correct.

Here, if the user judges that the learning data shown in FIG. 7(e) is incorrect, the user corrects it as follows. If the user judges that no workpiece has been recognized at e13 in FIG. 7(e), the user adds the missing contour lines of e13 by clicking with a mouse or the like, correcting e13 into a workpiece area. If the user judges that e12 and e14 in FIG. 7(e) are not one workpiece but two, the user adds the missing boundary line between e12 and e14 by operating the mouse or the like, correcting them into two workpieces. If the user judges that the contour lines of a label inside the area of e11 in FIG. 7(e) have been extracted and the label enclosed by them has been misrecognized as a cardboard box, the user deletes the contour lines surrounding the label. Likewise, if the user judges that contour lines have been erroneously extracted in a background area where no cardboard box exists, the user deletes those extra contour lines. In this way, the final learning data corrected by the user (teaching data f11 to f14), as shown in FIG. 7(f), is obtained, and this learning data is stored in the data storage unit 33. Correction of the teaching data by the user is preferably used, for example, as described with reference to FIG. 9, for judgments that a person can make immediately by looking at an image (for example, a color or grayscale image), such as distinguishing metal regions from plastic regions.
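One of the corrections described above, adding a missing boundary so that a merged region becomes two workpieces, can be illustrated with a small mask-editing sketch. The representation is an assumption: teaching data is treated as a binary mask, the user's drawn line as a list of pixel coordinates, and `count_regions` and `add_boundary` are hypothetical helper names.

```python
from collections import deque
from typing import Iterable, List, Tuple

Mask = List[List[int]]
Pixel = Tuple[int, int]  # (row, column)


def count_regions(mask: Mask) -> int:
    """Count 4-connected regions of 1-pixels with a breadth-first search."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    regions = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] != 1 or seen[r][c]:
                continue
            regions += 1
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:
                cr, cc = queue.popleft()
                for nr, nc in ((cr - 1, cc), (cr + 1, cc), (cr, cc - 1), (cr, cc + 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and mask[nr][nc] == 1 and not seen[nr][nc]):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
    return regions


def add_boundary(mask: Mask, boundary: Iterable[Pixel]) -> Mask:
    """Return a copy of the mask with the user-drawn boundary pixels cleared."""
    corrected = [row[:] for row in mask]
    for r, c in boundary:
        corrected[r][c] = 0
    return corrected
```

Returning a corrected copy rather than editing in place keeps the automatically generated teaching data available for comparison with the corrected version.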

As described above, various problems can arise: if the color within a workpiece area is not uniform, contour lines are also extracted inside the workpiece area; if the color difference between the workpiece area and the background area is small, part of the workpiece contour is missing; and if workpieces are in close contact, the contour lines between adjacent workpieces cannot be extracted. In addition, if the resolution of the captured image is low, or if poor lighting at the time of capture makes the image prone to noise, contour lines may be erroneously extracted in background areas where no workpiece exists. It is therefore preferable to superimpose the extraction results on the image and present them to the user on the display unit 35, so that the user can visually check them, delete or correct wrong contour lines, add missing ones, and thereby correct the teaching data to reflect the correct workpiece areas before the learning data is generated.

In other words, if learning data whose teaching data contains incorrect workpiece areas is used for machine learning, learning cannot proceed correctly, and it becomes difficult to generate a trained model that can correctly infer workpiece areas. By contrast, presenting the generated teaching data and image (learning data) to the user and correcting the teaching data based on the user's judgment makes it possible to improve the accuracy (reliability) of the teaching data. Although this processing is performed by the user, the user does not need to perform teaching from scratch and only corrects the parts that are wrong, so, needless to say, high labor costs and an enormous amount of work time are not required. Correction of teaching data by the user as described above is not limited to the fourth example, and can be applied broadly to the examples of the learning data generation device, robot system, learning data generation method, and learning data generation program according to this embodiment.

FIG. 8 is a flowchart for explaining an example of processing in a fifth example of the learning data generation program according to this embodiment. As shown in FIG. 8, when the processing of the fifth example starts (START), in step ST51 the camera 4 captures a two-dimensional image of the area in which the workpieces D are present, and a three-dimensional sensor acquires three-dimensional point cloud data (three-dimensional data) of the same area. The process then proceeds to step ST52, where the data acquisition unit 31 performs image processing on the two-dimensional image to estimate the shape area.

Next, the process proceeds to step ST53, where the area information of the workpiece D is corrected based on the three-dimensional point cloud data. Here, correcting the area information of the workpiece D based on the three-dimensional point cloud data means, for example, that the shape areas calculated for each of the plurality of workpieces in the two-dimensional image are checked against the three-dimensional point cloud data. Differences in the three-dimensional positions contained in the point cloud data are then used to exclude background areas, obstacle areas, areas of neighboring workpieces, and the like that were erroneously included in the calculation results, yielding corrected teaching data for the workpiece.
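The exclusion step described above can be sketched with a per-pixel height map. This is a simplified illustration, assuming the three-dimensional point cloud has already been projected onto the image grid so that each pixel has one height value; `correct_with_height`, the median-based reference, and the `tolerance` parameter are assumptions made for this sketch.

```python
from statistics import median
from typing import List

Mask = List[List[int]]
HeightMap = List[List[float]]  # per-pixel height, assumed aligned with the image


def correct_with_height(mask: Mask, height: HeightMap, tolerance: float) -> Mask:
    """Drop masked pixels whose height differs from the region's median height."""
    inside = [height[r][c]
              for r, row in enumerate(mask)
              for c, flag in enumerate(row) if flag == 1]
    if not inside:
        return [row[:] for row in mask]
    reference = median(inside)
    return [[1 if flag == 1 and abs(height[r][c] - reference) <= tolerance else 0
             for c, flag in enumerate(row)]
            for r, row in enumerate(mask)]
```

The median is used as the reference so that a minority of background or neighboring-workpiece pixels inside the estimated area does not shift the reference height.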

The process then proceeds to step ST54, where the data storage unit 33 stores the teaching data and the two-dimensional image as learning data, and the process ends (END). When the area information of the workpiece D obtained from the two-dimensional image is corrected based on the three-dimensional point cloud data, it is preferable to extract features such as grooves, gaps, steps, three-dimensional planes, and three-dimensional curved surfaces in the two-dimensional image. As a result, even when the boundary between a workpiece and its background, or between workpieces, cannot be identified in the two-dimensional image alone, checking against the three-dimensional point cloud data makes it possible to improve the reliability (accuracy) of the teaching data. In the fifth example as well, step ST55 corresponds to, for example, step ST47 of the fourth example described with reference to FIG. 6, and, as described above, the processes of steps ST45 and ST46 of the fourth example can be added.
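Of the features named above, a step is the simplest to illustrate: a boundary that is invisible in color can still appear as a jump in height between neighboring pixels. The sketch below again assumes a height map aligned with the image; `step_edges` and `min_step` are hypothetical names.

```python
from typing import List, Set, Tuple

HeightMap = List[List[float]]
Edge = Tuple[Tuple[int, int], Tuple[int, int]]  # pair of adjacent pixels


def step_edges(height: HeightMap, min_step: float) -> Set[Edge]:
    """Pairs of neighbouring pixels whose heights differ by at least min_step."""
    rows, cols = len(height), len(height[0])
    edges: Set[Edge] = set()
    for r in range(rows):
        for c in range(cols):
            for nr, nc in ((r, c + 1), (r + 1, c)):  # right and lower neighbour
                if (nr < rows and nc < cols
                        and abs(height[nr][nc] - height[r][c]) >= min_step):
                    edges.add(((r, c), (nr, nc)))
    return edges
```

Two touching boxes of different heights produce an edge between them even when their colors are identical in the two-dimensional image.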

The learning data generation program according to this embodiment described above may be provided recorded on a computer-readable non-transitory recording medium or a non-volatile semiconductor memory, or may be provided over a wired or wireless connection. Examples of computer-readable non-transitory recording media include optical discs such as CD-ROMs (Compact Disc Read Only Memory) and DVD-ROMs, and hard disk devices. Examples of non-volatile semiconductor memory include PROMs (Programmable Read Only Memory) and flash memory. Distribution from a server device may be over a wired or wireless LAN (Local Area Network), or over a WAN such as the Internet.

As described in detail above, the learning data generation device, robot system, learning data generation method, and learning data generation program according to this embodiment make it possible to perform annotation without incurring high labor costs or an enormous amount of work time.

Although the present disclosure has been described in detail, it is not limited to the individual embodiments described above. Various additions, substitutions, modifications, partial deletions, and the like can be made to these embodiments without departing from the gist of the present disclosure, or from the spirit of the present disclosure derived from the content of the claims and their equivalents. These embodiments can also be implemented in combination. For example, the order of operations and the order of processes in the embodiments described above are given only as examples, and the embodiments are not limited to them. The same applies where numerical values or formulas are used in the description of the embodiments.

Regarding the above embodiment and modifications, the following appendices are further disclosed.
[Appendix 1]
A learning data generation device (3) that generates learning data for use in machine learning, comprising:
a data acquisition unit (31) that acquires an image of an area where a plurality of workpieces (D0 to D9, D) are present;
a data processing unit (32) that estimates at least area information reflecting the area range of all or part of at least one of the workpieces (D0, D) based on the acquired image, and generates teaching data including the estimated area information; and
a data storage unit (33) that stores the generated teaching data and the image as the learning data.
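As a non-limiting illustration of the flow in Appendix 1 (acquire an image, estimate area information, generate teaching data, store both as learning data), a minimal Python sketch follows. All names here (`TeachingData`, `LearningDataStore`, `acquire_image`, `estimate_area`, `generate_learning_data`) are hypothetical and do not appear in the disclosure. The threshold-based bounding-box estimator is only an assumed stand-in for whatever estimation the data processing unit (32) actually performs.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Image = List[List[int]]          # grayscale image as rows of pixel values
Box = Tuple[int, int, int, int]  # (row_min, col_min, row_max, col_max), inclusive


@dataclass
class TeachingData:
    """Teaching data holding the estimated area information."""
    areas: List[Box]


@dataclass
class LearningDataStore:
    """Stands in for the data storage unit (33); keeps records in memory."""
    records: List[Tuple[Image, TeachingData]] = field(default_factory=list)

    def save(self, image: Image, teaching: TeachingData) -> None:
        self.records.append((image, teaching))


def acquire_image() -> Image:
    """Stands in for the data acquisition unit (31); returns a fixed toy image."""
    return [
        [0, 0, 0, 0, 0],
        [0, 9, 9, 0, 0],
        [0, 9, 9, 0, 0],
        [0, 0, 0, 0, 0],
    ]


def estimate_area(image: Image, threshold: int = 5) -> List[Box]:
    """Assumed estimator: one bounding box around all pixels above a threshold."""
    hits = [(r, c) for r, row in enumerate(image)
            for c, value in enumerate(row) if value > threshold]
    if not hits:
        return []
    rows = [r for r, _ in hits]
    cols = [c for _, c in hits]
    return [(min(rows), min(cols), max(rows), max(cols))]


def generate_learning_data(store: LearningDataStore) -> TeachingData:
    """Acquire an image, estimate areas, and store image plus teaching data."""
    image = acquire_image()
    teaching = TeachingData(areas=estimate_area(image))
    store.save(image, teaching)
    return teaching
```

Calling `generate_learning_data(LearningDataStore())` on the toy image stores one image and teaching data pair without any manual annotation step.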
[Appendix 2]
The learning data generation device described in Appendix 1, wherein the area information of the workpiece (D0, D) includes at least one of outer shape area information reflecting the outer shape of the workpiece (D0, D), removal area information for adsorbing, suctioning or grasping the workpiece (D0, D), and local area information on the workpiece (D0, D).
[Appendix 3]
The learning data generation device described in Appendix 2, wherein the local area information of the workpiece (D0, D) includes at least one of a flat surface, a curved surface, or a large area on the workpiece (D0, D), a non-slip area on the workpiece (D0, D), and a high-density area on the workpiece (D0, D).
[Appendix 4]
The learning data generation device according to any one of Appendices 1 to 3, further comprising a reception unit (34), wherein
the reception unit (34) receives correction information based on area information of at least one of the workpieces (D0, D), and
the data processing unit (32) corrects and outputs the teaching data based on the correction information received by the reception unit (34).
[Appendix 5]
The data acquisition unit (31) further acquires a trained model, and
the data processing unit (32) estimates area information of the workpiece (D0, D) based on the image and the trained model, and generates the teaching data.
[Appendix 6]
The learning data generation device according to any one of Appendices 1 to 5, wherein
the data acquisition unit (31) acquires a two-dimensional image of an area where the plurality of workpieces (D0 to D9, D) are present, and
the data processing unit (32) estimates the area information based on the two-dimensional image and generates the teaching data.
[Appendix 7]
The learning data generation device according to any one of Appendices 1 to 5, wherein
the data acquisition unit (31) acquires a two-dimensional image and three-dimensional data of an area where the plurality of workpieces (D0 to D9, D) are present, and
the data processing unit (32) estimates the area information based on the two-dimensional image and the three-dimensional data, and generates the teaching data.
[Appendix 8]
The data processing unit (32) analyzes the three-dimensional data, estimates three-dimensional area information of the workpiece based on the analysis results, compares the three-dimensional data with the two-dimensional image to estimate area information of the workpiece in the two-dimensional image, and generates the teaching data.
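As a non-limiting illustration of Appendix 8, the following Python sketch analyzes three-dimensional points, keeps those on the surface nearest the camera, and projects them into the two-dimensional image to obtain an area there. The pinhole camera model, the intrinsic parameters `fx`, `fy`, `cx`, `cy`, the depth tolerance, and all function names are assumptions introduced for this example. The disclosure does not specify how the three-dimensional data is compared with the two-dimensional image.

```python
from typing import List, Tuple

Point3D = Tuple[float, float, float]  # (x, y, z) in the camera frame, z = depth


def top_surface_points(points: List[Point3D], tolerance: float) -> List[Point3D]:
    """Simplified plane analysis: keep points within `tolerance` of the nearest depth."""
    nearest = min(z for _, _, z in points)
    return [p for p in points if p[2] - nearest <= tolerance]


def project(point: Point3D, fx: float, fy: float,
            cx: float, cy: float) -> Tuple[int, int]:
    """Pinhole projection of a 3D point to (u, v) pixel coordinates."""
    x, y, z = point
    return round(fx * x / z + cx), round(fy * y / z + cy)


def region_in_image(points: List[Point3D], fx: float, fy: float,
                    cx: float, cy: float,
                    tolerance: float = 0.01) -> Tuple[int, int, int, int]:
    """Return (u_min, v_min, u_max, v_max) of the projected top surface."""
    pixels = [project(p, fx, fy, cx, cy)
              for p in top_surface_points(points, tolerance)]
    us = [u for u, _ in pixels]
    vs = [v for _, v in pixels]
    return min(us), min(vs), max(us), max(vs)
```

In this sketch, a point lying well behind the nearest surface (for example, a workpiece lower in a pile) is excluded before projection, so the resulting box covers only the topmost surface.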
[Appendix 9]
The learning data generation device described in Appendix 8, wherein the data processing unit (32) estimates three-dimensional area information of the workpiece (D0, D) based on the result of at least one of plane analysis, curved surface analysis, blob analysis, coordinate system analysis, posture analysis, depth analysis, scale analysis, feature analysis, nearby point analysis, mesh analysis, and voxel analysis performed as the analysis of the three-dimensional data.
[Appendix 10]
The learning data generation device described in Appendix 8 or Appendix 9, wherein the data processing unit (32) corrects and outputs the teaching data based on area information of the workpiece (D0, D) estimated from the analysis results of the three-dimensional data.
[Appendix 11]
The data processing unit (32) performs image processing based on the image and on area information of at least one of the workpieces (D0, D) received by the reception unit (34) to extract features, performs matching processing on the extracted features to estimate area information of a plurality of the workpieces (D0, D) in the image, and generates the teaching data.
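As a non-limiting illustration of Appendix 11, the following Python sketch takes an area taught for a single workpiece, uses the pixels inside it as the extracted feature, and searches the image for other positions that match. Using the raw pixel patch as the feature and the sum of absolute differences as the matching score are assumptions chosen for brevity. The function names and the `max_sad` parameter are hypothetical.

```python
from typing import List, Tuple

Image = List[List[int]]          # grayscale image as rows of pixel values
Box = Tuple[int, int, int, int]  # (row, col, height, width)


def crop(image: Image, box: Box) -> Image:
    """Cut the taught area out of the image to use as a template."""
    r, c, h, w = box
    return [row[c:c + w] for row in image[r:r + h]]


def sad(image: Image, template: Image, r: int, c: int) -> int:
    """Sum of absolute differences between the template and the patch at (r, c)."""
    return sum(abs(image[r + i][c + j] - template[i][j])
               for i in range(len(template))
               for j in range(len(template[0])))


def propagate_region(image: Image, taught: Box, max_sad: int = 0) -> List[Box]:
    """Return every box whose patch matches the taught area within `max_sad`."""
    template = crop(image, taught)
    h, w = len(template), len(template[0])
    matches = []
    for r in range(len(image) - h + 1):
        for c in range(len(image[0]) - w + 1):
            if sad(image, template, r, c) <= max_sad:
                matches.append((r, c, h, w))
    return matches
```

The returned list includes the taught area itself. The sketch does not suppress overlapping matches or handle rotated workpieces, both of which a practical matcher would need to address.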
[Appendix 12]
The data processing unit (32) performs image processing based on the image, estimates area information of the workpiece (D0, D) based on the result of the image processing and the image, and generates the teaching data.
[Appendix 13]
The learning data generation device described in Appendix 11 or Appendix 12, wherein the data processing unit (32) estimates area information of the workpiece (D0, D) based on the result of at least one of pattern matching, plane matching, curved surface matching, blob analysis, feature analysis, gradient analysis, edge analysis, contrast analysis, histogram analysis, and color information analysis performed as the image processing.
[Appendix 14]
The data processing unit (32) corrects and outputs the teaching data based on area information of the workpiece (D0, D) estimated from the result of the image processing.
[Appendix 15]
The learning data generation device according to any one of Appendices 1 to 14, further comprising a display unit (35), wherein
the display unit (35) displays the image and the teaching data.
[Appendix 16]
The learning data generation device described in Appendix 15, wherein the data processing unit (32) includes a process of correcting the teaching data by referring to the image and the teaching data displayed on the display unit (35).
[Appendix 17]
A robot system (100) comprising:
a robot (1) that performs a predetermined process on a plurality of workpieces (D0 to D9, D);
a camera (4) that captures an area where the plurality of workpieces (D0 to D9, D) are present and outputs the image;
the learning data generation device (3) according to any one of Appendices 1 to 16, which receives the image from the camera (4) and outputs the learning data;
a machine learning device that receives the learning data from the learning data generation device (3) and performs machine learning; and
a robot control device (2) that receives the output of the machine learning device and controls the robot (1).
[Appendix 18]
A learning data generation method for generating learning data for use in machine learning, comprising:
a data acquisition step of acquiring an image of an area where a plurality of workpieces (D0 to D9, D) are present;
a data processing step of estimating at least area information reflecting the area range of all or part of at least one of the workpieces (D0, D) based on the acquired image, and generating teaching data including the estimated area information; and
a data storage step of storing the generated teaching data and the image as the learning data.
[Appendix 19]
The learning data generation method described in Appendix 18, wherein
the data acquisition step further acquires a trained model generated by training in advance, and
the data processing step estimates area information of the workpiece (D0, D) based on the image and the trained model, and generates the teaching data.
[Appendix 20]
The learning data generation method described in Appendix 18 or Appendix 19, further comprising a reception step of receiving area information of at least one of the workpieces (D0, D), wherein
the data processing step performs, for the plurality of workpieces (D0 to D9, D), image processing based on the received area information and the image to extract features, performs matching processing on the extracted features to estimate area information of the plurality of workpieces (D0 to D9, D) in the image, and generates the teaching data.
[Appendix 21]
The learning data generation method according to any one of Appendices 18 to 20, wherein the data processing step performs image processing based on the image, estimates area information of the workpieces (D0 to D9, D) based on the result of the image processing and the image, and generates the teaching data.
[Appendix 22]
The learning data generation method according to any one of Appendices 18 to 21, wherein
the data acquisition step further acquires a two-dimensional image and three-dimensional data of the area where the plurality of workpieces (D0 to D9, D) are present, and
the data processing step estimates the area information based on the two-dimensional image and the three-dimensional data, and generates the teaching data.
[Appendix 23]
The learning data generation method according to any one of Appendices 18 to 22, further comprising a reception step of receiving correction information based on area information of at least one of the workpieces (D0, D), wherein
the data processing step corrects the teaching data based on the received correction information.
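As a non-limiting illustration of the correction in Appendix 23, the following Python sketch applies received correction information to the areas held in the teaching data. The disclosure does not define the format of the correction information. The tuple format used here (`"replace"`, `"delete"`, `"add"`) and the function name `apply_corrections` are assumptions introduced only for this example.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (row, col, height, width)


def apply_corrections(areas: List[Box], corrections: List[tuple]) -> List[Box]:
    """Return a corrected copy of `areas`; the input list is left unchanged.

    Assumed correction formats:
      ("replace", index, new_box)  overwrite the area at `index`
      ("delete", index)            drop the area at `index`
      ("add", new_box)             append an area the estimator missed
    Indices refer to positions in the original list. Deletions are applied
    last so that earlier indices stay valid.
    """
    corrected = list(areas)
    deletions = set()
    for correction in corrections:
        kind = correction[0]
        if kind == "replace":
            _, index, box = correction
            corrected[index] = box
        elif kind == "delete":
            deletions.add(correction[1])
        elif kind == "add":
            corrected.append(correction[1])
        else:
            raise ValueError(f"unknown correction kind: {kind!r}")
    return [box for i, box in enumerate(corrected) if i not in deletions]
```

Because the function returns a new list, the automatically estimated teaching data can be kept alongside the corrected version for comparison.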
[Appendix 24]
A learning data generation program for generating learning data for use in machine learning, the program causing an arithmetic processing device to execute:
a data acquisition step of acquiring an image of an area where a plurality of workpieces (D0 to D9, D) are present;
a data processing step of estimating at least area information reflecting the area range of all or part of at least one of the workpieces (D0, D) based on the acquired image, and generating teaching data including the estimated area information; and
a data storage step of storing the generated teaching data and the image as the learning data.

REFERENCE SIGNS LIST
1 Robot
2 Robot control device
3 Learning data generation device
4 Camera
10 Robot mechanism unit
11 Arm
12 End effector (hand unit)
31 Data acquisition unit
32 Data processing unit
33 Data storage unit
34 Reception unit
35 Display unit
100 Robot system
A1 Shape area
A2 Area
D, D1 to D9 Workpiece (object)

Claims

A learning data generation device that generates learning data for use in machine learning, comprising:
a data acquisition unit that acquires an image of an area where a plurality of workpieces are present;
a data processing unit that estimates at least area information reflecting the area range of all or part of at least one of the workpieces based on the acquired image, and generates teaching data including the estimated area information; and
a data storage unit that stores the generated teaching data and the image as the learning data.

The learning data generating device according to claim 1, wherein the area information of the workpiece includes at least one of outer shape area information reflecting the outer shape of the workpiece, removal area information for adsorbing, suctioning or gripping the workpiece, and local area information on the workpiece.

The learning data generating device according to claim 2, wherein the local area information of the workpiece includes at least one of flat surfaces, curved surfaces or large areas on the workpiece, non-slip areas on the workpiece, and high-density areas on the workpiece.

The learning data generating device according to claim 1, further comprising a reception unit, wherein
the reception unit receives correction information based on area information of at least one of the workpieces, and
the data processing unit corrects and outputs the teaching data based on the correction information received by the reception unit.

The learning data generating device according to claim 1, wherein
the data acquisition unit further acquires a trained model, and
the data processing unit estimates area information of the workpiece based on the image and the trained model, and generates the teaching data.

The learning data generating device according to claim 1, wherein
the data acquisition unit acquires a two-dimensional image of an area where the plurality of workpieces are present, and
the data processing unit estimates the area information based on the two-dimensional image and generates the teaching data.

The learning data generating device according to claim 1, wherein
the data acquisition unit acquires a two-dimensional image and three-dimensional data of an area where the plurality of workpieces are present, and
the data processing unit estimates the area information based on the two-dimensional image and the three-dimensional data, and generates the teaching data.

The learning data generating device according to claim 7, wherein the data processing unit analyzes the three-dimensional data, estimates three-dimensional area information of the workpiece based on the analysis results, compares the three-dimensional data with the two-dimensional image to estimate area information of the workpiece in the two-dimensional image, and generates the teaching data.

The learning data generating device according to claim 8, wherein the data processing unit estimates the three-dimensional area information of the workpiece based on the results of processing the three-dimensional data by performing at least one of plane analysis, curved surface analysis, blob analysis, coordinate system analysis, attitude analysis, depth analysis, scale analysis, feature analysis, nearby point analysis, mesh analysis, and voxel analysis.

The learning data generating device according to claim 8 or claim 9, wherein the data processing unit modifies and outputs the teaching data based on area information of the workpiece estimated based on the analysis results of the three-dimensional data.

The learning data generating device according to any one of claims 4 to 6, wherein the data processing unit performs image processing based on the area information of at least one of the workpieces received by the receiving unit and the image to extract features, performs matching processing of the extracted features to estimate area information of multiple workpieces on the image, and generates the teaching data.

The learning data generating device according to any one of claims 1 to 11, wherein the data processing unit performs image processing based on the image, estimates area information of the workpiece based on the result of the image processing and the image, and generates the teaching data.

The learning data generating device according to claim 11 or 12, wherein the data processing unit estimates area information of the workpiece based on the result of at least one of pattern matching, plane matching, curved surface matching, blob analysis, feature analysis, gradient analysis, edge analysis, contrast analysis, histogram analysis, and color information analysis performed as the image processing.

The learning data generating device according to any one of claims 11 to 13, wherein the data processing unit modifies and outputs the teaching data based on area information of the workpiece estimated based on the results of the image processing.

The learning data generating device according to claim 1, further comprising a display unit, wherein
the display unit displays the image and the teaching data.

The learning data generating device according to claim 15, wherein the data processing unit includes a process for correcting the teaching data by referring to the image and the teaching data displayed on the display unit.

A robot system comprising:
a robot that performs a predetermined process on a plurality of workpieces;
a camera that captures an area where the plurality of workpieces are present and outputs the image;
the learning data generating device according to claim 1, which receives the image from the camera and outputs the learning data;
a machine learning device that receives the learning data from the learning data generating device and performs machine learning; and
a robot control device that receives the output of the machine learning device and controls the robot.

A learning data generation method for generating learning data for use in machine learning, comprising:
a data acquisition step of acquiring an image of an area where a plurality of workpieces are present;
a data processing step of estimating at least area information reflecting the area range of all or part of at least one of the workpieces based on the acquired image, and generating teaching data including the estimated area information; and
a data storage step of storing the generated teaching data and the image as the learning data.

The learning data generation method according to claim 18, wherein
the data acquisition step further acquires a trained model generated by training in advance, and
the data processing step estimates area information of the workpiece based on the image and the trained model, and generates the teaching data.

The learning data generation method according to claim 18 or claim 19, further comprising a reception step of receiving area information of at least one of the workpieces, wherein
the data processing step performs, for the plurality of workpieces, image processing based on the received area information and the image to extract features, performs matching processing on the extracted features to estimate area information of the plurality of workpieces in the image, and generates the teaching data.

The learning data generation method according to any one of claims 18 to 20, wherein the data processing step performs image processing based on the image, estimates area information of the workpiece based on the result of the image processing and the image, and generates the teaching data.

The learning data generating method according to claim 18, wherein
the data acquisition step further acquires a two-dimensional image and three-dimensional data of the area where the plurality of workpieces are present, and
the data processing step estimates the area information based on the two-dimensional image and the three-dimensional data, and generates the teaching data.

The learning data generating method according to claim 18, further comprising a reception step of receiving correction information based on area information of at least one of the workpieces, wherein
the data processing step corrects the teaching data based on the received correction information.

A learning data generation program for generating learning data for use in machine learning, the program causing an arithmetic processing device to execute:
a data acquisition step of acquiring an image of an area where a plurality of workpieces are present;
a data processing step of estimating at least area information reflecting the area range of all or part of at least one of the workpieces based on the acquired image, and generating teaching data including the estimated area information; and
a data storage step of storing the generated teaching data and the image as the learning data.