
WO2022059117A1 - Video processing device, image capturing device, and video processing method - Google Patents


Info

Publication number
WO2022059117A1
WO2022059117A1 (PCT/JP2020/035209)
Authority
WO
WIPO (PCT)
Prior art keywords
processing unit
moving image
action
worker
result information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/JP2020/035209
Other languages
French (fr)
Japanese (ja)
Inventor
嵩臣 神田
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Kokusai Denki Electric Inc
Original Assignee
Hitachi Kokusai Electric Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hitachi Kokusai Electric Inc filed Critical Hitachi Kokusai Electric Inc
Priority to JP2022550114A priority Critical patent/JP7471435B2/en
Priority to PCT/JP2020/035209 priority patent/WO2022059117A1/en
Publication of WO2022059117A1 publication Critical patent/WO2022059117A1/en
Anticipated expiration (legal status: Critical)
Ceased (current legal status: Critical)

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00 Details of television systems
    • H04N5/76 Television signal recording
    • H04N5/91 Television signal processing therefor
    • H04N5/92 Transformation of the television signal for recording, e.g. modulation, frequency changing; Inverse transformation for playback

Definitions

  • The present invention relates to a video processing device, an imaging device, and a video processing method.
  • In recent years, in the field of image recognition, machine learning using Deep Neural Networks (DNNs), known as deep learning, has been actively studied, and highly accurate machine learning models can now be generated using vast computational resources and training data.
  • Patent Document 1 discloses a method of performing character recognition on an image captured by an imaging unit and assigning the character string obtained by the character recognition to the captured image as an annotation.
  • One example of a real system to which such image recognition is applied is an inspection system on a factory production line: objects passing along the line are photographed by a camera, image recognition is performed on the captured images, and abnormalities and defects in the objects are detected.
  • On a production line, however, the rate at which abnormal objects occur is generally extremely low compared to normal objects. It is therefore inefficient to indiscriminately photograph every object and label the images as "abnormal" afterwards.
  • Accordingly, an object of the present invention is to provide a video processing device, an imaging device, and a video processing method capable of efficiently carrying out the work of assigning labels to captured images.
  • To solve the above problems, a representative video processing device of the present invention comprises: a moving image processing unit that processes the video signal output from an image sensor into a moving image file covering a predetermined period and outputs it; a storage unit that stores worker action data including the action content of a worker and the corresponding processing content; and a recognition processing unit that determines the action content of the worker based on a sensor signal output from a sensor element, generates action recognition result information based on the worker action data in the storage unit, and outputs it to the moving image processing unit.
  • When the recognition processing unit outputs information relating to label assignment as the action recognition result information, the moving image processing unit assigns the corresponding label to the moving image file of the predetermined period that includes the time at which the action recognition result information was output.
  • According to the present invention, since the moving image processing unit assigns label information to the moving image file in response to the worker's labeling action, the work of assigning labels to moving image files can be carried out efficiently.
  • FIG. 1 is an overall configuration diagram showing the video processing system 1 according to the first embodiment.
  • FIG. 2 is a schematic configuration diagram showing the video processing device 100 according to the first embodiment.
  • FIG. 3 is a data configuration diagram showing the worker action data 170 stored in the storage unit 17 of the video processing device 100 according to the first embodiment.
  • FIG. 4 is a flowchart showing the operation of the video processing device 100 according to the first embodiment.
  • FIG. 5 is a diagram explaining the operation of the video processing device 100 according to the first embodiment when the action content of the worker U is determined to be the start point setting action 172.
  • FIG. 6 is a diagram explaining the operation of the video processing device 100 according to the first embodiment when the action content of the worker U is determined to be the labeling action 171.
  • FIG. 7 is a diagram explaining the operation of the video processing device 100 according to the first embodiment when the action content of the worker U is determined to be the end point setting action 173.
  • FIG. 8 is a schematic configuration diagram showing the video processing device 100 according to the second embodiment.
  • FIG. 9 is a schematic configuration diagram showing the image pickup device 200 according to the third embodiment.
  • FIG. 10 is a schematic configuration diagram showing the image pickup device 200 according to the fourth embodiment.
  • FIG. 11 is a block diagram showing a configuration example of the computer system 300.
  • FIG. 1 is an overall configuration diagram showing a video processing system 1 according to the first embodiment.
  • The video processing system 1 is used on a production line 102 in a factory that manufactures objects 101.
  • The production line 102 is composed of, for example, a belt conveyor that conveys the objects 101 in the conveying direction indicated by arrow A in FIG. 1.
  • The worker U is an inspector who visually inspects whether each object 101 flowing along the production line 102 one after another is normal (non-defective) or abnormal (defective).
  • The video processing system 1 comprises a video processing device 100, an image pickup element 2 for the object, a sensor element 3 for the worker, a worker terminal device 4, and a moving image file storage device 5.
  • The video processing device 100 is connected to the image pickup element 2, the sensor element 3, the worker terminal device 4, and the moving image file storage device 5, and transmits and receives various signals and data by wire or wirelessly.
  • The video processing device 100 processes the video signal output from the image pickup element 2 into a moving image file covering a predetermined period and outputs the file to the moving image file storage device 5. Details of the video processing device 100 will be described later.
  • The image pickup element 2 for the object is composed of, for example, a CMOS sensor or a CCD sensor.
  • The image pickup element 2 is installed, for example, above the production line 102; it images the production line 102 and the objects 101 flowing along it, and outputs the captured video signal to the video processing device 100.
  • The sensor element 3 for the worker includes at least one of an image pickup element 30, a microphone element 31, and a contact element 32.
  • The sensor element 3 is installed around the image pickup element 2 in order to monitor the behavior of the worker U working near the installation location of the image pickup element 2.
  • In this embodiment, the sensor element 3 is described as including all of the image pickup element 30, the microphone element 31, and the contact element 32.
  • The image pickup element 30 for the worker is composed of, for example, a CMOS sensor or a CCD sensor, like the image pickup element 2 for the object.
  • The image pickup element 30 images the whole or part of the body of the worker U and outputs the captured video signal to the video processing device 100.
  • The microphone element 31 converts the voice uttered by the worker U into an audio signal as an electric signal and outputs the converted audio signal to the video processing device 100.
  • The contact element 32 is composed of, for example, a contact-type or non-contact-type switch, button, lever, or the like.
  • The contact element 32 converts an operation performed by the worker U by hand or foot into a contact signal as an electric signal and outputs the converted contact signal to the video processing device 100.
  • The worker terminal device 4 is composed of, for example, a desktop computer, a notebook computer, a tablet computer, or a smartphone.
  • The worker terminal device 4 is installed at a position visible to the worker U and displays a screen based on the display data output from the video processing device 100.
  • The worker terminal device 4 may instead be configured as a display device such as a liquid crystal display or an organic EL display.
  • The moving image file storage device 5 is composed of, for example, a server device such as a file server, or a storage device.
  • The moving image file storage device 5 stores the moving image files output from the video processing device 100 and operates as a database of moving image files.
  • The moving image file storage device 5 may instead be composed of, for example, a desktop computer, a notebook computer, a tablet computer, or a smartphone, like the worker terminal device 4.
  • In FIG. 1, there is one each of the image pickup element 2, the image pickup element 30, the microphone element 31, and the contact element 32, but the number is not limited to this and may be plural. The installation positions of the image pickup element 2, the image pickup element 30, the microphone element 31, and the contact element 32 are also not limited to the example of FIG. 1 and may be changed as appropriate.
  • FIG. 2 is a schematic configuration diagram showing the video processing apparatus 100 according to the first embodiment.
  • The video processing device 100 includes a video signal input unit 11, a video signal processing unit 12, an audio signal input unit 13, an audio signal processing unit 14, a contact signal input unit 15, a contact signal processing unit 16, a storage unit 17, a recognition processing unit 18, and a moving image processing unit 19.
  • The video signal input unit 11 is composed of, for example, a connector such as a BNC connector and a receiver IC.
  • The image pickup element 2 for the object and the image pickup element 30 for the worker are connected to the video signal input unit 11, and each inputs a video signal.
  • The video signal processing unit 12 performs processing such as color correction and gamma correction on the video signals input via the video signal input unit 11.
  • The audio signal input unit 13 is composed of, for example, an XLR connector and a receiver IC.
  • The microphone element 31 is connected to the audio signal input unit 13 and inputs an audio signal.
  • The audio signal processing unit 14 performs processing such as spectrum analysis and noise removal on the audio signal input via the audio signal input unit 13.
  • The contact signal input unit 15 is composed of, for example, a connector of any type and a receiver IC.
  • The contact element 32 is connected to the contact signal input unit 15 and inputs a contact signal.
  • The contact signal processing unit 16 performs processing such as on/off state detection on the contact signal input via the contact signal input unit 15.
  • The storage unit 17 stores the worker action data 170, in which the processing content to be executed by the video processing device 100 is associated with the action content of the worker U when the worker U takes a predetermined action.
  • FIG. 3 is a data configuration diagram showing the worker action data 170 stored in the storage unit 17 of the video processing device 100 according to the first embodiment.
  • the "behavior content” of the worker U obtained by analyzing the sensor signal as the worker behavior data 170, and the "sensor type”, “processing type”, and “processing” corresponding to each behavior content are stored. "Content” is remembered.
  • the sensor type and the processing type are additional information, and one or both of them may be omitted.
  • the processing contents include "labeling process” for assigning an electronic label to a video file, "start point setting process” for setting a start point of a video file, and "end point setting process” for setting an end point of a video file. It is remembered.
  • label information 171a (“abnormal” in the example of FIG. 3) indicating a label to be assigned to the moving image file is also stored.
  • labeling for assigning a label to a video file and "period setting" for the start point and end point of the video file are stored as information for classifying the processing contents executed by the video processing apparatus 100. ing.
  • the specific action of the worker U is stored in correspondence with each processing content.
  • two action contents of "raising both hands” and “operating the foot switch” are stored as action contents (labeling action 171) corresponding to the label giving process.
  • action contents corresponding to the start point setting process start point setting action 172
  • two action contents of "raising the left hand” and “starting and speaking” are stored.
  • action content corresponding to the end point setting process end point setting action 173
  • two action contents of "raising the right hand” and “speaking with the end” are stored.
  • the sensor type stores information that identifies the sensor element 3 to be analyzed when the sensor signal output from the sensor element 3 is analyzed to determine the action content of the worker U.
  • an "image sensor” is stored as a sensor type corresponding to the action contents of "raising both hands", “raising the left hand”, and “raising the right hand”.
  • a “microphone element” is stored as a sensor type corresponding to the action contents of "start and speak” and “end and speak”.
  • a “contact element” is stored as a sensor type corresponding to the action content of "operating the foot switch”.
  • The action contents, sensor types, processing types, and processing contents stored in the worker action data 170 are not limited to the above; the sensor type, processing type, and processing content can be set for various action contents. For example, two consecutive actions can be classified as one action content and the processing content set accordingly, and a plurality of sensor types can be set for one action content.
  • In this embodiment, the label related to label assignment is of one type, "abnormal", but a plurality of label types can also be supported.
  • For example, it suffices for the worker action data 170 to store a first labeling action corresponding to a first label such as "first abnormality present", and a distinguishing action such as "raising the left hand with three fingers extended" corresponding to another label. A minimal sketch of such a table follows.
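  • As an aid to understanding, the following is a minimal sketch, in Python, of how a table like the worker action data 170 of FIG. 3 might be represented. All identifiers and the exact action strings are assumptions for illustration, not the disclosed implementation.

        # Hypothetical sketch of the worker action data 170 (mirrors FIG. 3).
        # Each entry: (action content, sensor type, processing type, processing content).
        WORKER_ACTION_DATA = [
            ("raise both hands",    "image pickup element", "labeling",       "labeling process"),
            ("operate foot switch", "contact element",      "labeling",       "labeling process"),
            ("raise left hand",     "image pickup element", "period setting", "start point setting process"),
            ("say 'start'",         "microphone element",   "period setting", "start point setting process"),
            ("raise right hand",    "image pickup element", "period setting", "end point setting process"),
            ("say 'end'",           "microphone element",   "period setting", "end point setting process"),
        ]

        # Label information 171a attached by the labeling process ("abnormal" in FIG. 3).
        LABEL_INFO = "abnormal"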
  • The recognition processing unit 18 analyzes the sensor signals (video signal, audio signal, and contact signal) output from the sensor element 3 and determines the action content of the worker U working near the installation location of the image pickup element 2. When it determines that the worker U has taken any of the action contents stored as the worker action data 170 in the storage unit 17, that is, the labeling action 171, the start point setting action 172, or the end point setting action 173, it outputs action recognition result information specifying the determined action content to the moving image processing unit 19.
  • The recognition processing unit 18 also analyzes the video signal output from the image pickup element 2 and acquires the characters included in the video.
  • The recognition processing unit 18 includes a video recognition unit 180 and a character recognition unit 183 connected to the video signal processing unit 12, a voice recognition unit 181 connected to the audio signal processing unit 14, and a contact recognition unit 182 connected to the contact signal processing unit 16.
  • The video recognition unit 180 applies a known video analysis method to the video signal of the image pickup element 30 for the worker processed by the video signal processing unit 12 and determines the action content of the worker U.
  • The voice recognition unit 181 applies a known voice analysis method to the audio signal of the microphone element 31 processed by the audio signal processing unit 14 and determines the action content of the worker U.
  • The contact recognition unit 182 applies a known contact analysis method to the contact signal of the contact element 32 processed by the contact signal processing unit 16 and determines the action content of the worker U. The video recognition unit 180, the voice recognition unit 181, and the contact recognition unit 182 then generate action recognition result information based on the worker action data 170 and output it to the moving image processing unit 19; a sketch of this lookup is given below.
  • For example, a mark may be worn by the worker U; the video recognition unit 180 recognizes the worker U in response to detection of the mark and, by tracking the position of the mark, may recognize, for example, the position or movement of the left hand wearing the mark.
  • The character recognition unit 183 applies a known character recognition method to the video signal of the image pickup element 2 for the object processed by the video signal processing unit 12 and acquires the characters written on the object 101 itself or on the packaging of the object 101. The character recognition unit 183 then generates character recognition result information based on the acquired characters and outputs it to the moving image processing unit 19.
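  • For illustration, the conversion from a determined action content to action recognition result information might look like the following sketch; the function name and dictionary keys are assumptions, and the table format follows the sketch given above.

        # Hypothetical sketch: look up a determined action content in the worker
        # action data and produce action recognition result information for the
        # moving image processing unit 19. Returns None for unregistered actions.
        def generate_action_recognition_result(action_content, worker_action_data):
            for action, sensor_type, processing_type, processing_content in worker_action_data:
                if action == action_content:
                    return {
                        "action_content": action,
                        "processing_content": processing_content,  # e.g. "labeling process"
                    }
            return None  # not a registered action; nothing is output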
  • Based on the action recognition result information and the character recognition result information output by the recognition processing unit 18, the moving image processing unit 19 processes the video signal of the image pickup element 2 for the object processed by the video signal processing unit 12 into a moving image file covering a predetermined period and outputs it to the moving image file storage device 5.
  • Specifically, when the recognition processing unit 18 outputs information relating to a start point instruction as the action recognition result information, the moving image processing unit 19 sets the output time of that action recognition result information as the start point of the moving image file. Likewise, when the recognition processing unit 18 outputs information relating to an end point instruction as the action recognition result information, the moving image processing unit 19 sets the output time of that action recognition result information as the end point of the moving image file. The moving image processing unit 19 then outputs the moving image file for the predetermined period delimited by the set start point and end point.
  • Any method may be adopted for the moving image processing unit 19 to process the video signal into a moving image file for the predetermined period and output it.
  • For example, the moving image processing unit 19 may output the moving image file for the predetermined period by starting recording of the moving image file based on the video signal when the start point is set and ending the recording when the end point is set.
  • Alternatively, the moving image processing unit 19 may temporarily store the video signal for a certain past interval in the storage unit 17 and, after the start point and end point are set, cut out the stored video signal between them to output the moving image file for the predetermined period; a sketch of this approach follows. Any file format and resolution may be adopted for the moving image file.
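  • A minimal sketch of this buffering approach, assuming frames can be held in memory with timestamps, is as follows (illustrative only; all names are assumptions):

        from collections import deque

        # Hold frames for a recent interval; once the start point and end point
        # are set, cut out the span between them as the moving image file for
        # the predetermined period.
        class PrebufferedRecorder:
            def __init__(self, max_frames):
                self.buffer = deque(maxlen=max_frames)  # (timestamp, frame) pairs

            def push(self, timestamp, frame):
                self.buffer.append((timestamp, frame))

            def cut_out(self, start_time, end_time):
                """Return the frames between the set start point and end point."""
                return [frame for (t, frame) in self.buffer if start_time <= t <= end_time]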
  • When the recognition processing unit 18 outputs information relating to label assignment as the action recognition result information, the moving image processing unit 19 assigns the label related to that label assignment ("abnormal" in this embodiment) to the moving image file for the predetermined period that includes the output time of the action recognition result information. At that time, the moving image processing unit 19 also adds the character recognition result information to the moving image file.
  • The label information and the character recognition result information may be recorded as attribute information of the moving image file, or may be recorded in a separate file that is then associated with the moving image file; a sketch of the latter approach follows.
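  • As one concrete realization of recording the information in a separate, associated file, a sidecar file sharing the video's base name could be used; the JSON format and the field names below are assumptions for illustration.

        import json
        from pathlib import Path

        # Write label information and character recognition result information
        # to a sidecar file associated with the moving image file by name.
        def write_label_sidecar(video_path, label, character_result):
            sidecar = Path(video_path).with_suffix(".json")
            sidecar.write_text(json.dumps({
                "video_file": Path(video_path).name,
                "label": label,                 # e.g. "abnormal"
                "characters": character_result, # from the character recognition unit 183
            }, ensure_ascii=False, indent=2))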
  • The moving image processing unit 19 also outputs display data for displaying various information to the worker terminal device 4.
  • The display data includes, for example, the action recognition result information, the character recognition result information, the label information assigned to the moving image file, real-time video based on the video signal of the image pickup element 2 for the object or the image pickup element 30 for the worker, and time information indicating the start point and end point of the moving image file.
  • FIG. 4 is a flowchart showing the operation of the video processing apparatus 100 according to the first embodiment.
  • The video processing device 100 repeatedly executes the series of processes (the video processing method) shown in FIG. 4, for example at a predetermined control cycle.
  • In step S10, the recognition processing unit 18 monitors the action content of the worker U by analyzing the sensor signals output from the sensor element 3. Specifically, the video recognition unit 180, the voice recognition unit 181, and the contact recognition unit 182 analyze the video signal, the audio signal, and the contact signal, respectively, to monitor whether the worker U has taken any of the labeling action 171, the start point setting action 172, or the end point setting action 173 stored as the worker action data 170 in the storage unit 17.
  • In step S20, the recognition processing unit 18 judges, from the result of analyzing the sensor signals in step S10, whether the action content of the worker U corresponds to any of the labeling action 171, the start point setting action 172, or the end point setting action 173. If "Yes" in step S20, the process proceeds to step S21. If "No" in step S20, the process returns to step S10 and the monitoring of the worker U's behavior continues.
  • In step S21, the recognition processing unit 18 generates action recognition result information specifying the action content of the worker U based on the worker action data 170, outputs it to the moving image processing unit 19, and the process proceeds to step S30.
  • In step S30, the moving image processing unit 19 determines the processing content corresponding to the action content by referring to the worker action data 170 in the storage unit 17, based on the action content of the worker U output by the recognition processing unit 18 as the action recognition result information. That is, when the action recognition result information indicates the start point setting action 172 (information regarding start point setting), the process proceeds to the start point setting process (step S40) corresponding to the start point setting action 172; when it indicates the labeling action 171 (information regarding label assignment), the process proceeds via the start point setting confirmation process (step S50) to the labeling process (step S51) corresponding to the labeling action 171; and when it indicates the end point setting action 173 (information regarding end point setting), the process proceeds via the start point setting confirmation process (step S60) to the end point setting process (step S61) corresponding to the end point setting action 173.
  • In the above description, the action recognition result information specifies the action content in the worker action data 170, but the action recognition result information may instead specify the processing content.
  • That is, the recognition processing unit 18 may refer to the worker action data 170 in the storage unit 17 based on the action content of the worker U obtained by analyzing the sensor signals, generate action recognition result information specifying the processing content corresponding to that action content, and output it to the moving image processing unit 19.
  • In that case, when the recognition processing unit 18 outputs the start point setting process (information regarding start point setting) as the action recognition result information, the moving image processing unit 19 proceeds to step S40; when it outputs the labeling process (information regarding label assignment), the process proceeds to step S50; and when it outputs the end point setting process (information regarding end point setting), the process proceeds to step S60.
  • In step S40, the moving image processing unit 19 sets the time at which the recognition processing unit 18 output the action recognition result information, that is, the time at which the action content of the worker U was determined to be the start point setting action 172, as the start point of the moving image file, and starts recording the moving image file.
  • In step S41, the moving image processing unit 19 outputs display data indicating that recording of the moving image file has started to the worker terminal device 4.
  • FIG. 5 is a diagram illustrating an operation when the action content of the worker U is determined to be a start point setting action 172 in the video processing device 100 according to the first embodiment.
  • In FIG. 5, illustration of the microphone element 31 and the contact element 32 is omitted.
  • In step S50, it is confirmed whether the start point has been set before the labeling action 171. If the start point has been set, the process proceeds to step S51; if it has not been set, the process ends without proceeding to steps S51 and S52. As a result, the labeling process is not performed unless recording of the moving image file has started, so malfunction can be prevented.
  • In step S51, the moving image processing unit 19 assigns the label to the moving image file being recorded at the time the recognition processing unit 18 output the action recognition result information, that is, at the time the action content of the worker U was determined to be the labeling action 171.
  • In step S52, the moving image processing unit 19 outputs display data indicating that the label has been assigned to the moving image file to the worker terminal device 4.
  • FIG. 6 is a diagram illustrating an operation when the action content of the worker U is determined to be the labeling action 171 in the video processing device 100 according to the first embodiment.
  • The worker U visually checks each object 101 that arrives in front of him and raises both hands upon finding an abnormality. This action of "raising both hands" is determined by the recognition processing unit 18 to be the labeling action 171, whereby the moving image processing unit 19 assigns the "abnormal" label to the moving image file currently being recorded. Further, based on the display data output from the moving image processing unit 19, the display screen 40 of the worker terminal device 4 shows that the labeling action 171 has been recognized and that the "abnormal" label has been assigned to the moving image file.
  • When a plurality of labeling actions 171 are stored in the storage unit 17 and a plurality of labels are triggered during recording of a moving image file, for example by a first labeling action followed by a second labeling action, both the first label information and the second label information may be assigned to the moving image file, or only one piece of label information may be assigned according to a predetermined priority (for example, giving priority to the action taken later by the worker U), as in the sketch below.
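  • A sketch of such a priority rule, assuming each labeling action is recorded together with its time of occurrence, might be:

        # Resolve multiple labels assigned during one recording: either keep all
        # of them, or keep only the label from the action taken last (one
        # possible priority rule; the policy itself is a design choice).
        def resolve_labels(labeled_actions, keep_all=False):
            """labeled_actions: list of (timestamp, label) pairs."""
            if keep_all:
                return [label for _, label in sorted(labeled_actions)]
            latest = max(labeled_actions, key=lambda pair: pair[0])  # later wins
            return [latest[1]]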
  • In step S60, it is confirmed whether the start point has been set before the end point setting action 173. If the start point has been set, the process proceeds to step S61; if it has not been set, the process ends without proceeding to steps S61 to S63. As a result, the end point setting process is not performed unless recording of the moving image file has started, so malfunction can be prevented.
  • In step S61, the moving image processing unit 19 sets the time at which the recognition processing unit 18 output the action recognition result information, that is, the time at which the action content of the worker U was determined to be the end point setting action 173, as the end point of the moving image file, and ends recording of the moving image file. Then, in step S62, the moving image processing unit 19 outputs display data indicating that recording of the moving image file has ended to the worker terminal device 4.
  • In step S63, the moving image processing unit 19 outputs the moving image file to the moving image file storage device 5.
  • When the label was assigned in step S51, the moving image file is output with the "abnormal" label information attached.
  • On the other hand, when no labeling action 171 by the worker U has been recognized, the moving image file may be output with the label information "normal" attached as the default label, or may be output without any label information attached.
  • FIG. 7 is a diagram illustrating an operation when the action content of the worker U is determined to be the end point setting action 173 in the video processing apparatus 100 according to the first embodiment.
  • In the above processing, steps S10 to S21 correspond to the recognition processing step, and steps S30 to S62 correspond to the moving image processing step.
  • The timing of adding the label information to the moving image file may be the timing of outputting the moving image file in step S63 instead of the timing of step S51.
  • The moving image file may also be stored in the storage unit 17.
  • The timing of outputting the moving image file in step S63 is not limited to immediately after step S62 and may be any timing. The execution order of step S62 and step S63 may also be interchanged.
  • With the above, the series of processes (the video processing method) shown in FIG. 4 is completed; the overall flow is sketched below.
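  • The flow of FIG. 4 can be summarized by the following sketch of one control cycle. All names are hypothetical stand-ins (for example, determine_action() stands in for steps S10 to S21 of the recognition processing unit 18), and the display output of steps S41, S52, and S62 is omitted for brevity.

        from dataclasses import dataclass, field

        @dataclass
        class RecorderState:
            recording: bool = False
            labels: list = field(default_factory=list)

        # One control cycle of the flow in FIG. 4 (simplified, hypothetical).
        def control_cycle(state, determine_action, recorder, storage):
            result = determine_action()                       # S10-S21
            if result is None:                                # S20: "No"
                return                                        # keep monitoring
            processing = result["processing_content"]         # S30
            if processing == "start point setting process":   # S40
                state.recording = True
                state.labels = []
                recorder.start()
            elif processing == "labeling process":            # S50-S51
                if state.recording:                           # start point must be set
                    state.labels.append("abnormal")
            elif processing == "end point setting process":   # S60-S63
                if state.recording:
                    recorder.stop()
                    storage.save(recorder.file(), state.labels)
                    state.recording = False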
  • By repeating the above processing, the video processing device 100 outputs a plurality of moving image files in which each object 101 flowing one after another along the production line 102 is imaged by the image pickup element 2.
  • Among them, a moving image file capturing an object 101 in which the worker U found an abnormality during visual inspection is output with the "abnormal" label information attached, in accordance with the labeling action 171 of the worker U.
  • In this way, according to this embodiment, the moving image processing unit 19 assigns label information to the moving image file in response to the labeling action 171 of the worker U, so the work of assigning labels to moving image files can be carried out efficiently.
  • Moreover, since the moving image processing unit 19 sets the start point and end point of the moving image file in response to the start point setting action 172 and the end point setting action 173 of the worker U, the period to be recorded as a moving image file can be easily designated. Compared with constant recording, the size of the moving image files is reduced, and because each moving image file is limited to the period in which the object 101 is imaged, its usability as learning data is improved.
  • FIG. 8 is a schematic configuration diagram showing the video processing apparatus 100 according to the second embodiment.
  • In the first embodiment described above, the video processing system 1 includes the image pickup element 2 that images the object 101 and the image pickup element 30 (an example of the sensor element 3) that images the worker U as separate devices.
  • In contrast, the video processing system 1 according to the second embodiment includes an image pickup element 2 that images the object 101 and the worker U within the same angle of view, so that the image pickup element 2 also serves as the image pickup element 30 of the sensor element 3.
  • The video processing system 1 shown in FIG. 8 includes the microphone element 31 and the contact element 32 as the sensor element 3, but the microphone element 31 and the contact element 32 may be omitted.
  • The video processing device 100 according to this embodiment receives the video signal of the image pickup element 2, the audio signal of the microphone element 31, and the contact signal of the contact element 32. Since the basic configuration and operation are the same as those of the video processing device 100 according to the first embodiment, the differences between the two are mainly described here.
  • The image pickup element 2 is connected to the video signal input unit 11, and the video signal output from the image pickup element 2 is output to the recognition processing unit 18 and the moving image processing unit 19 via the video signal input unit 11 and the video signal processing unit 12.
  • The video recognition unit 180, the voice recognition unit 181, and the contact recognition unit 182 of the recognition processing unit 18 analyze the video signal output from the image pickup element 2 and the sensor signals (audio signal and contact signal) output from the sensor element 3, and determine the action content of the worker U working near the installation location of the image pickup element 2. The character recognition unit 183 of the recognition processing unit 18 analyzes the video signal output from the image pickup element 2 and acquires the characters included in the video.
  • The moving image processing unit 19 processes the video signal of the image pickup element 2 processed by the video signal processing unit 12 into a moving image file for a predetermined period and outputs it, based on the action recognition result information and the character recognition result information from the recognition processing unit 18.
  • Also in this embodiment, the moving image processing unit 19 assigns label information to the moving image file in response to the labeling action 171 of the worker U. The work of assigning labels to moving image files can therefore be carried out efficiently while simplifying the system configuration.
  • FIG. 9 is a schematic configuration diagram showing the image pickup apparatus 200 according to the third embodiment.
  • The video processing system 1 according to the first embodiment includes the video processing device 100 and the image pickup element 2 as separate devices.
  • In contrast, the video processing system 1 according to the third embodiment includes an image pickup device 200 in which the video processing device 100 and the image pickup element 2 are configured as one device.
  • Also in this embodiment, the moving image processing unit 19 assigns label information to the moving image file in response to the labeling action 171 of the worker U. The work of assigning labels to moving image files can therefore be carried out efficiently while simplifying the system configuration.
  • FIG. 10 is a schematic configuration diagram showing the image pickup apparatus 200 according to the fourth embodiment.
  • In the third embodiment, the video processing system 1 includes the image pickup device 200 (image pickup element 2) that images the object 101 and the image pickup element 30 (an example of the sensor element 3) that images the worker U as separate devices.
  • In contrast, the image pickup device 200 according to the fourth embodiment includes an image pickup element 2 that images the object 101 and the worker U within the same angle of view, so that the image pickup element 2 also serves as the image pickup element 30 of the sensor element 3.
  • The video processing system 1 shown in FIG. 10 includes the microphone element 31 and the contact element 32 as the sensor element 3, but the microphone element 31 and the contact element 32 may be omitted.
  • Also in this embodiment, the moving image processing unit 19 assigns label information to the moving image file in response to the labeling action 171 of the worker U. The work of assigning labels to moving image files can therefore be carried out efficiently while simplifying the system configuration.
  • FIG. 11 is a block diagram showing a configuration example of the computer system 300.
  • The computer system 300 described below may be applied to each device of each of the above embodiments. The configuration of the computer system 300 may be modified as appropriate according to the function of each device.
  • The computer system 300 includes, as its main components, a processor 302, a memory 304, a terminal interface 312, a storage interface 314, an I/O (input/output) device interface 316, and a network interface 318. These main components are connected to each other via a bus structure including a memory bus 306, an I/O bus 308, a bus interface unit 309, and an I/O bus interface unit 310.
  • The processor 302 is composed of one or a plurality of arithmetic processing units (CPU, MPU, GPU, DSP, etc.). In the example of FIG. 11, the processor 302 is composed of two CPUs 302A and 302B. The processor 302 may execute instructions stored in the memory 304 and may include an onboard cache.
  • The memory 304 is composed of volatile memory (DRAM, SRAM, etc.) or non-volatile memory (ROM, flash memory, etc.) for storing various data and the program 350.
  • The memory 304 conceptually represents the virtual memory of the computer system 300 and may include the virtual memory of another computer system connected to the computer system 300 via the network 330.
  • The program 350 may include instructions or descriptions executed on the processor 302, or instructions or descriptions interpreted by other instructions or descriptions.
  • The program 350 may also be implemented in hardware, in place of or in addition to a processor-based system, via semiconductor devices, chips, logic gates, circuits, circuit cards, and/or other physical hardware devices.
  • The program 350 may be stored in the storage device 322 instead of the memory 304, or may be stored and provided on a computer-readable recording medium such as a CD, DVD, or USB memory in an installable or executable file format. The program 350 may also be stored in an external computer system and provided by downloading via the network 330.
  • The display system 324 is composed of a display controller, a display memory, or both.
  • The display controller outputs video data, audio data, or both to the display device 326.
  • The display memory is a dedicated memory for buffering video data.
  • The functions provided by the display system 324 may be realized by an integrated circuit including the processor 302.
  • The display device 326 is composed of, for example, a liquid crystal display, an organic EL display, electronic paper, or a projector.
  • Another device having a display screen (for example, a portable device such as a notebook computer, a tablet computer, or a smartphone, or a television) may be connected so that the other device functions as the display device 326.
  • The terminal interface 312 is configured to be connectable to a user I/O device 320 such as a user input device or a user output device, and provides a communication path for the processor 302 and the user I/O device 320 to communicate with each other by wire or wirelessly.
  • The user I/O device 320 is, for example, a keyboard, mouse, keypad, touchpad, trackball, button, light pen, or other pointing device.
  • The storage interface 314 is configured to be connectable to the storage device 322 and provides a communication path for the processor 302 and the storage device 322 to communicate with each other.
  • The storage device 322 has one or more disk drives and is composed of, for example, a magnetic disk drive or a semiconductor disk drive.
  • The I/O device interface 316 is configured to be connectable to an external I/O device 340 and provides a communication path for the processor 302 and the external I/O device 340 to communicate with each other by wire or wirelessly.
  • The external I/O device 340 is, for example, any device such as a camera, scanner, microphone, speaker, printer, or fax machine, or any sensor such as a biometric sensor, environment sensor, or motion sensor.
  • The computer system 300 may include some or all of the above-mentioned external I/O devices 340.
  • The network interface 318 is configured to be connectable to a network 330 such as the Internet or an intranet and provides a communication path for the processor 302 and the network 330 to communicate with each other by wire or wirelessly.
  • The external I/O device 340 may also be connected to the network interface 318 via the network 330.
  • The bus structure in the computer system 300 provides communication paths that directly connect the processor 302, the memory 304, the bus interface unit 309, the display system 324, and the I/O bus interface unit 310.
  • The bus structure may include, for example, point-to-point links in hierarchical, star, or web configurations, a plurality of hierarchical buses, and parallel or redundant communication paths.
  • The functions provided by the bus interface unit 309 may be realized by an integrated circuit including the processor 302.
  • The computer system 300 is, for example, a desktop computer, a laptop computer, a tablet computer, a smartphone, a mobile phone, or any other suitable electronic device.
  • The computer system 300 may be a client-type computer, a server-type computer, or a cloud-type computer.
  • When the computer system 300 is applied to the video processing device 100, the video signal input unit 11, the audio signal input unit 13, and the contact signal input unit 15 are composed of one or a plurality of I/O device interfaces 316.
  • The video signal processing unit 12, the audio signal processing unit 14, the contact signal processing unit 16, the recognition processing unit 18, and the moving image processing unit 19 are composed of one or a plurality of processors 302.
  • The storage unit 17 is composed of at least one of one or a plurality of memories 304 and one or a plurality of storage devices 322.
  • Each unit may be composed of a separate processor 302, or any combination of units may be composed of separate processors 302.
  • For example, the video signal processing unit 12, the audio signal processing unit 14, and the contact signal processing unit 16 may be configured by a first processor, and the recognition processing unit 18 and the moving image processing unit 19 by a second processor.
  • Alternatively, by the processor 302 executing the program 350, at least some of the functions of the video signal processing unit 12, the audio signal processing unit 14, the contact signal processing unit 16, the recognition processing unit 18, and the moving image processing unit 19, for example the functions of the recognition processing unit 18 and the moving image processing unit 19, may be realized.
  • The image pickup element 2 and the sensor element 3 in the video processing system 1 correspond to the external I/O device 340, and the worker terminal device 4 and the moving image file storage device 5 correspond to other computer systems connected to the network 330.
  • The worker terminal device 4 may also correspond to the display device 326.
  • The present invention is not limited to the above-described embodiments, and various modifications are possible without departing from the gist of the present invention.
  • The present invention is not limited to configurations including all of the elements described in the above embodiments and includes configurations in which some elements are deleted. It is also possible to add or replace part of the configuration of one embodiment with the configuration of another embodiment.
  • The worker action data 170 may be changed as appropriate by the worker U operating, for example, the worker terminal device 4. Likewise, the analysis parameters used when the recognition processing unit 18 analyzes the sensor signals and determines the action content of the worker U may be adjusted as appropriate by the worker U operating, for example, the worker terminal device 4.
  • In the above embodiments, the recognition processing unit 18 is described as recognizing both the start point setting action 172 and the end point setting action 173, but the process of recognizing at least one of them may be omitted.
  • In that case, the start point and end point of the moving image file are set by other means.
  • For example, the recognition processing unit 18 may analyze the video signal output from the image pickup element 2 and set the frame-in and frame-out times of the object 101 as the start point and end point of the moving image file, respectively.
  • Alternatively, trigger generators such as transmission sensors or reflection sensors may be installed upstream and downstream of the installation location of the image pickup element 2, and the start point and end point of the moving image file may each be set based on the trigger signals of the trigger generators, as in the sketch below.
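  • A sketch of this trigger-based alternative, with hypothetical names, could be:

        # Upstream/downstream trigger sensors set the start and end points of
        # the moving image file (hypothetical sketch of the alternative above).
        def on_trigger(sensor_position, recorder):
            if sensor_position == "upstream":      # object is about to enter the view
                recorder.start()                   # set start point
            elif sensor_position == "downstream":  # object has left the view
                recorder.stop()                    # set end point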
  • The video processing device 100 and the image pickup device 200 have been described as including the audio signal input unit 13, the audio signal processing unit 14, the contact signal input unit 15, and the contact signal processing unit 16, but these may be omitted depending on which sensor elements 3 are connected to the video processing device 100 or the image pickup device 200. For example, if the microphone element 31 is not connected, the audio signal input unit 13 and the audio signal processing unit 14 may be omitted, and if the contact element 32 is not connected, the contact signal input unit 15 and the contact signal processing unit 16 may be omitted.
  • The recognition processing unit 18 has been described as including the character recognition unit 183, but the character recognition unit 183 may be omitted.
  • In that case, the moving image processing unit 19 may output the moving image file treating the character recognition result of the character recognition unit 183 as not acquired.
  • 183 ... Character recognition unit, 200 ... Imaging device, 300 ... Computer system, 302 ... Processor, 302A, 302B ... CPU, 304 ... Memory, 306 ... Memory bus, 308 ... I/O bus, 309 ... Bus interface unit, 310 ... I/O bus interface unit, 312 ... Terminal interface, 314 ... Storage interface, 316 ... I/O device interface, 318 ... Network interface, 320 ... User I/O device, 322 ... Storage device, 324 ... Display system, 326 ... Display device, 330 ... Network, 340 ... External I/O device, 350 ... Program

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Television Signal Processing For Recording (AREA)
  • Studio Devices (AREA)
  • Image Analysis (AREA)
  • General Factory Administration (AREA)

Abstract

The present invention provides a video processing device capable of efficiently carrying out the work of assigning labels to captured images. To this end, a video processing device 100 comprises: a moving image processing unit 19 which processes a video signal output by an image pickup element 2 into moving image files of predetermined periods and outputs said moving image files; a storage unit 17 which stores worker action data 170 that includes the action content of the worker and the processing content corresponding thereto; and a recognition processing unit 18 which determines the action content of the worker on the basis of sensor signals output by sensor elements 3, generates action recognition result information on the basis of the worker action data 170 in the storage unit 17, and outputs the result to the moving image processing unit 19. When the recognition processing unit 18 has output information relating to label assignment as the action recognition result information, the moving image processing unit 19 assigns a label relating to said label assignment to the moving image file of the predetermined period which includes the point in time when the action recognition result information was output.

Description

Video processing device, image pickup device, and video processing method

The present invention relates to a video processing device, an imaging device, and a video processing method.

In recent years, in the field of image recognition, machine learning using Deep Neural Networks (DNNs), known as deep learning, has been actively studied, and it has become possible to generate highly accurate machine learning models using vast computational resources and training data.

Among machine learning approaches, supervised machine learning requires a large amount of training data pairing image data with annotations (correct labels) when generating a machine learning model. However, collecting a large amount of image data and annotating each item takes considerable time and effort. This is a major barrier to applying machine-learning-based image recognition technology to real systems.

As a method of annotating image data, Patent Document 1 discloses performing character recognition on an image captured by an imaging unit and assigning the character string obtained by the character recognition to the captured image as an annotation.

Japanese Unexamined Patent Publication No. 2010-28486

One example of a real system applying image recognition technology based on a machine learning model is an inspection system on a factory production line, in which objects passing along the line are photographed by a camera, image recognition is performed on the captured images, and abnormalities and defects in the objects are detected.

When collecting the training data required for the machine learning model used in such an inspection system, the method disclosed in Patent Document 1 could be used. However, even when an abnormality occurs in an object, the character string "abnormal" does not appear in the captured image of that object. Therefore, even with the method of Patent Document 1, only character strings contained in the captured image are assigned to it as annotations, and a label such as "abnormal" cannot be assigned to a captured image of an object in which an abnormality has occurred.

Moreover, on a production line, the rate at which abnormal objects occur is generally extremely low compared to normal objects. It is therefore inefficient to indiscriminately photograph every object and later label the images as "abnormal".

Therefore, an object of the present invention is to provide a video processing device, an imaging device, and a video processing method capable of efficiently carrying out the work of assigning labels to captured images.

To solve the above problems, a representative video processing device of the present invention comprises:
a moving image processing unit that processes the video signal output from an image sensor into a moving image file covering a predetermined period and outputs it;
a storage unit that stores worker action data including the action content of a worker and the corresponding processing content; and
a recognition processing unit that determines the action content of the worker based on a sensor signal output from a sensor element, generates action recognition result information based on the worker action data in the storage unit, and outputs it to the moving image processing unit,
wherein, when the recognition processing unit outputs information relating to label assignment as the action recognition result information, the moving image processing unit assigns the label related to that label assignment to the moving image file of the predetermined period that includes the time at which the action recognition result information was output.

According to the present invention, since the moving image processing unit assigns label information to the moving image file in response to the worker's labeling action, the work of assigning labels to moving image files can be carried out efficiently.

Problems, configurations, and effects other than those described above will be clarified by the following description of the embodiments.

FIG. 1 is an overall configuration diagram showing the video processing system 1 according to the first embodiment.
FIG. 2 is a schematic configuration diagram showing the video processing device 100 according to the first embodiment.
FIG. 3 is a data configuration diagram showing the worker action data 170 stored in the storage unit 17 of the video processing device 100 according to the first embodiment.
FIG. 4 is a flowchart showing the operation of the video processing device 100 according to the first embodiment.
FIG. 5 is a diagram explaining the operation of the video processing device 100 according to the first embodiment when the action content of the worker U is determined to be the start point setting action 172.
FIG. 6 is a diagram explaining the operation of the video processing device 100 according to the first embodiment when the action content of the worker U is determined to be the labeling action 171.
FIG. 7 is a diagram explaining the operation of the video processing device 100 according to the first embodiment when the action content of the worker U is determined to be the end point setting action 173.
FIG. 8 is a schematic configuration diagram showing the video processing device 100 according to the second embodiment.
FIG. 9 is a schematic configuration diagram showing the image pickup device 200 according to the third embodiment.
FIG. 10 is a schematic configuration diagram showing the image pickup device 200 according to the fourth embodiment.
FIG. 11 is a block diagram showing a configuration example of the computer system 300.

Hereinafter, a video processing device, an image pickup apparatus, and a video processing method according to embodiments of the present invention will be described with reference to the drawings.

(Example 1)
FIG. 1 is an overall configuration diagram showing a video processing system 1 according to the first embodiment.

The video processing system 1 is used on a production line 102 of a factory that manufactures objects 101. The production line 102 is composed of, for example, a belt conveyor that conveys the objects 101 in the conveying direction indicated by arrow A in FIG. 1. The worker U is an inspector who visually inspects whether each object 101 flowing one after another on the production line 102 is normal (non-defective) or abnormal (defective).

The video processing system 1 includes, as its components, a video processing device 100, an image pickup element 2 for the object, a sensor element 3 for the worker, a worker terminal device 4, and a moving image file storage device 5.

The video processing device 100 is connected to the image pickup element 2, the sensor element 3, the worker terminal device 4, and the moving image file storage device 5, and transmits and receives various signals and data by wire or wirelessly. The video processing device 100 processes the video signal output from the image pickup element 2 into a moving image file of a predetermined period and outputs it to the moving image file storage device 5. Details of the video processing device 100 will be described later.

The image pickup element 2 for the object is composed of, for example, a CMOS sensor or a CCD sensor. The image pickup element 2 is installed, for example, above the production line 102, images the production line 102 and the objects 101 flowing on it, and outputs the captured video signal to the video processing device 100.

The sensor element 3 for the worker includes at least one of an image pickup element 30, a microphone element 31, and a contact element 32. The sensor element 3 is installed around the image pickup element 2 in order to monitor the behavior of the worker U working near the installation location of the image pickup element 2. In this embodiment, the sensor element 3 is described as including all of the image pickup element 30, the microphone element 31, and the contact element 32.
The image pickup element 30 for the worker is composed of, for example, a CMOS sensor or a CCD sensor, like the image pickup element 2 for the object. The image pickup element 30 images the whole or a part of the body of the worker U and outputs the captured video signal to the video processing device 100.

The microphone element 31 converts the voice uttered by the worker U into an audio signal as an electric signal and outputs the converted audio signal to the video processing device 100.

The contact element 32 is composed of, for example, a contact or non-contact switch, button, or lever. The contact element 32 converts an operation performed by the worker U with a hand or foot into a contact signal as an electric signal and outputs the converted contact signal to the video processing device 100.

The worker terminal device 4 is composed of, for example, a desktop computer, a notebook computer, a tablet computer, or a smartphone. The worker terminal device 4 is installed at a position visible to the worker U and displays a display screen based on display data output from the video processing device 100. The worker terminal device 4 may instead be configured as a display device such as a liquid crystal display or an organic EL display.

The moving image file storage device 5 is composed of, for example, a server device such as a file server, or a storage device. The moving image file storage device 5 stores the moving image files output from the video processing device 100 and operates as a database of moving image files. Like the worker terminal device 4, the moving image file storage device 5 may be composed of, for example, a desktop computer, a notebook computer, a tablet computer, or a smartphone.

In FIG. 1, one each of the image pickup element 2, the image pickup element 30, the microphone element 31, and the contact element 32 is installed, but the number is not limited to one and may be plural. The installation positions of the image pickup element 2, the image pickup element 30, the microphone element 31, and the contact element 32 are also not limited to the example of FIG. 1 and may be changed as appropriate.

(Configuration of the video processing device 100)
FIG. 2 is a schematic configuration diagram showing the video processing device 100 according to the first embodiment. The video processing device 100 includes a video signal input unit 11, a video signal processing unit 12, an audio signal input unit 13, an audio signal processing unit 14, a contact signal input unit 15, a contact signal processing unit 16, a storage unit 17, a recognition processing unit 18, and a moving image processing unit 19.

The video signal input unit 11 is composed of, for example, a connector such as a BNC connector and a receiver IC. The image pickup element 2 for the object and the image pickup element 30 for the worker are connected to the video signal input unit 11, and their respective video signals are input to it. The video signal processing unit 12 performs processing such as color correction and gamma correction on the video signals input via the video signal input unit 11.

The audio signal input unit 13 is composed of, for example, an XLR connector and a receiver IC. The microphone element 31 is connected to the audio signal input unit 13, and the audio signal is input to it. The audio signal processing unit 14 performs processing such as spectrum analysis and noise removal on the audio signal input via the audio signal input unit 13.

The contact signal input unit 15 is composed of, for example, a connector of any type and a receiver IC. The contact element 32 is connected to the contact signal input unit 15, and the contact signal is input to it. The contact signal processing unit 16 performs processing such as on/off state detection on the contact signal input via the contact signal input unit 15.

The storage unit 17 stores worker behavior data 170 in which the processing content to be executed by the video processing device 100 when the worker U takes a predetermined action is associated with the action content of the worker U.

FIG. 3 is a data configuration diagram showing the worker behavior data 170 stored in the storage unit 17 of the video processing device 100 according to the first embodiment.

The storage unit 17 stores, as the worker behavior data 170, the "action content" of the worker U obtained by analyzing the sensor signals, together with the "sensor type", "processing type", and "processing content" corresponding to each action content. In the worker behavior data 170, the sensor type and the processing type are additional information, and one or both of them may be omitted.

The processing content includes a "labeling process" that assigns an electronic label to a moving image file, a "start point setting process" that sets the start point of a moving image file, and an "end point setting process" that sets the end point of a moving image file. For the labeling process, label information 171a indicating the label to be assigned to the moving image file ("abnormal" in the example of FIG. 3) is also stored.

The processing type stores information classifying the processing contents executed by the video processing device 100, for example, "labeling" for assigning a label to a moving image file and "period setting" relating to the start point and end point of a moving image file.

The action content stores the specific actions of the worker U in correspondence with each processing content. In the example of FIG. 3, two action contents, "raise both hands" and "operate the foot switch", are stored as the action contents corresponding to the labeling process (labeling action 171). Two action contents, "raise the left hand" and "say 'start'", are stored as the action contents corresponding to the start point setting process (start point setting action 172). Two action contents, "raise the right hand" and "say 'end'", are stored as the action contents corresponding to the end point setting process (end point setting action 173).

The sensor type stores information specifying the sensor element 3 to be analyzed when the sensor signal output from the sensor element 3 is analyzed to determine the action content of the worker U. In the example of FIG. 3, "image pickup element" is stored as the sensor type corresponding to the action contents "raise both hands", "raise the left hand", and "raise the right hand". "Microphone element" is stored as the sensor type corresponding to the action contents "say 'start'" and "say 'end'". "Contact element" is stored as the sensor type corresponding to the action content "operate the foot switch".

The action contents, sensor types, processing types, and processing contents stored in the worker behavior data 170 are not limited to the above; sensor types, processing types, and processing contents can be set according to various action contents. For example, it is also possible to classify two consecutive actions as one action content and set a corresponding processing content. Furthermore, a plurality of sensor types can be set for one action content.

In the example of FIG. 3, there is only one type of label, "abnormal", but a plurality of label types can also be supported. For example, when the content of the abnormality is subdivided into three types, the worker behavior data 170 may store a first action content "raise the left hand with one finger extended" corresponding to a first labeling process that assigns a first label "first abnormality", a second action content "raise the left hand with two fingers extended" corresponding to a second labeling process that assigns a second label "second abnormality", and a third action content "raise the left hand with three fingers extended" corresponding to a third labeling process that assigns a third label "third abnormality".
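For illustration only, the correspondence table of FIG. 3 might be represented in software roughly as follows. This is a minimal sketch in Python; all field names and key strings are assumptions chosen for readability, not identifiers taken from the embodiment.

```python
from typing import Optional

# A minimal sketch of the worker behavior data 170 of FIG. 3.
# Field names and key strings are illustrative assumptions.
WORKER_BEHAVIOR_DATA = [
    {"action": "raise both hands",    "sensor": "image pickup element", "process": "label", "label": "abnormal"},
    {"action": "operate foot switch", "sensor": "contact element",      "process": "label", "label": "abnormal"},
    {"action": "raise left hand",     "sensor": "image pickup element", "process": "set_start"},
    {"action": "say 'start'",         "sensor": "microphone element",   "process": "set_start"},
    {"action": "raise right hand",    "sensor": "image pickup element", "process": "set_end"},
    {"action": "say 'end'",           "sensor": "microphone element",   "process": "set_end"},
]

def lookup_entry(action: str) -> Optional[dict]:
    """Return the stored entry for a recognized action, or None if the
    action is not registered in the worker behavior data."""
    for entry in WORKER_BEHAVIOR_DATA:
        if entry["action"] == action:
            return entry
    return None
```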

The recognition processing unit 18 analyzes the sensor signals (video signal, audio signal, and contact signal) output from the sensor element 3 and determines the action content of the worker U working near the installation location of the image pickup element 2. When the recognition processing unit 18 determines that the worker U has taken one of the action contents stored in the storage unit 17 as the worker behavior data 170, that is, the labeling action 171, the start point setting action 172, or the end point setting action 173, it outputs action recognition result information specifying the determined action content to the moving image processing unit 19. The recognition processing unit 18 also analyzes the video signal output from the image pickup element 2 and acquires characters contained in the video.

The recognition processing unit 18 includes, as its components, a video recognition unit 180 and a character recognition unit 183 connected to the video signal processing unit 12, a voice recognition unit 181 connected to the audio signal processing unit 14, and a contact recognition unit 182 connected to the contact signal processing unit 16.

The video recognition unit 180 applies a known video analysis method to the video signal of the image pickup element 30 for the worker processed by the video signal processing unit 12 and determines the action content of the worker U. The voice recognition unit 181 applies a known voice analysis method to the audio signal of the microphone element 31 processed by the audio signal processing unit 14 and determines the action content of the worker U. The contact recognition unit 182 applies a known contact analysis method to the contact signal of the contact element 32 processed by the contact signal processing unit 16 and determines the action content of the worker U. The video recognition unit 180, the voice recognition unit 181, and the contact recognition unit 182 then generate action recognition result information based on the worker behavior data 170 and output it to the moving image processing unit 19.
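To make the path from a recognized action to the action recognition result information concrete, a sketch such as the following could sit behind each recognition unit. The data shape and function names are assumptions; lookup_entry refers to the sketch of FIG. 3 above.

```python
import time
from dataclasses import dataclass
from typing import Optional

@dataclass
class ActionRecognitionResult:
    """Illustrative shape of the action recognition result information."""
    action: str             # recognized worker action
    process: str            # processing content: 'label', 'set_start', 'set_end'
    label: Optional[str]    # label to assign, if the process is labeling
    timestamp: float        # output time of the result information

def on_action_recognized(action: str) -> Optional[ActionRecognitionResult]:
    """Called by a recognition unit when a registered action is detected."""
    entry = lookup_entry(action)  # see the sketch of FIG. 3 above
    if entry is None:
        return None  # not a registered action; ignore
    return ActionRecognitionResult(action=action,
                                   process=entry["process"],
                                   label=entry.get("label"),
                                   timestamp=time.time())
```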

On the premise that the worker U wears, for example, an emblem or other marker on the left hand, the video recognition unit 180 may recognize the worker U by detecting the marker and, by further tracking the position of the marker, recognize, for example, the position and movement of the left hand wearing it.

The character recognition unit 183 applies a known character recognition method to the video signal of the image pickup element 2 for the object processed by the video signal processing unit 12 and acquires the characters written on the object 101 itself or on the package of the object 101. The character recognition unit 183 then generates character recognition result information based on the acquired characters and outputs it to the moving image processing unit 19.

Based on the action recognition result information and the character recognition result information output by the recognition processing unit 18, the moving image processing unit 19 processes the video signal of the image pickup element 2 for the object processed by the video signal processing unit 12 into a moving image file of a predetermined period and outputs it to the moving image file storage device 5.

Specifically, when the recognition processing unit 18 outputs information related to a start point instruction as the action recognition result information, the moving image processing unit 19 sets the output time of that action recognition result information as the start point of the moving image file. When the recognition processing unit 18 outputs information related to an end point instruction as the action recognition result information, the moving image processing unit 19 sets the output time of that action recognition result information as the end point of the moving image file. The moving image processing unit 19 then outputs a moving image file of the predetermined period delimited by the set start point and end point.

Any method may be adopted for the moving image processing unit 19 to process the video signal into a moving image file of a predetermined period and output it. For example, the moving image processing unit 19 may start recording a moving image file based on the video signal when the start point is set and end the recording when the end point is set, thereby outputting a moving image file of the predetermined period. Alternatively, the moving image processing unit 19 may temporarily store the video signal for a fixed past period in the storage unit 17 and, after setting the start point and end point, cut out the stored video signal according to them to output a moving image file of the predetermined period. Any file format and resolution may be adopted for the moving image file.
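The second approach, buffering a fixed window of recent frames and cutting a clip once the start point and end point are known, might be realized along the following lines. Buffer length, frame type, and names are assumptions, not part of the embodiment.

```python
import collections
from typing import Any, List

class FrameBuffer:
    """Rolling buffer of recent frames so that a clip can be cut out
    after the start point and end point become known."""

    def __init__(self, max_seconds: float, fps: float):
        # Oldest frames are discarded automatically once the window is full.
        self._frames: collections.deque = collections.deque(
            maxlen=int(max_seconds * fps))

    def push(self, timestamp: float, frame: Any) -> None:
        """Store one (timestamp, frame) pair from the video signal."""
        self._frames.append((timestamp, frame))

    def cut(self, start: float, end: float) -> List[Any]:
        """Return the frames whose timestamps fall within [start, end]."""
        return [frame for t, frame in self._frames if start <= t <= end]
```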

Further, when the recognition processing unit 18 outputs information related to label assignment as the action recognition result information, the moving image processing unit 19 assigns the label related to that label assignment ("abnormal" in this embodiment) to the moving image file of the predetermined period that includes the output time of the action recognition result information. At that time, the moving image processing unit 19 also attaches the character recognition result information to the moving image file.

When the moving image processing unit 19 attaches the label information and character recognition result information to the moving image file, the label information and character recognition result information may be recorded as attribute information of the moving image file, or they may be recorded in a separate file that is associated with the moving image file.
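The separate-file variant could, for example, take the form of a JSON sidecar associated with the clip by a shared base name, as in the sketch below; the naming convention and field names are assumptions, not part of the embodiment.

```python
import json
from pathlib import Path
from typing import Optional

def write_sidecar(video_path: str, label: Optional[str], ocr_text: str) -> Path:
    """Record label information and character recognition results in a
    separate file associated with the moving image file by base name."""
    sidecar = Path(video_path).with_suffix(".json")
    sidecar.write_text(json.dumps(
        {"video": Path(video_path).name, "label": label, "ocr": ocr_text},
        ensure_ascii=False))
    return sidecar

# Example: write_sidecar("clip_0001.mp4", "abnormal", "LOT-1234")
```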

The moving image processing unit 19 outputs display data for displaying various information to the worker terminal device 4. The display data includes, for example, the action recognition result information, the character recognition result information, the label information assigned to the moving image file, real-time video based on the video signal of the image pickup element 2 for the object, real-time video based on the video signal of the image pickup element 30 for the worker, and time information indicating the start point and end point of the moving image file.

(Operation of the video processing device 100)
FIG. 4 is a flowchart showing the operation of the video processing device 100 according to the first embodiment. The video processing device 100 repeatedly executes the series of processes (the video processing method) shown in FIG. 4, for example, at every predetermined control cycle.

First, in step S10, the recognition processing unit 18 monitors the action content of the worker U by analyzing the sensor signals output from the sensor element 3. Specifically, the video recognition unit 180, the voice recognition unit 181, and the contact recognition unit 182 analyze the video signal, the audio signal, and the contact signal, respectively, to monitor whether the worker U has taken any of the labeling action 171, the start point setting action 172, and the end point setting action 173 stored in the storage unit 17 as the worker behavior data 170.

In step S20, the recognition processing unit 18 determines, as a result of analyzing the sensor signals in step S10, whether the action content of the worker U corresponds to any of the labeling action 171, the start point setting action 172, and the end point setting action 173. If "Yes" in step S20, the process proceeds to step S21. If "No" in step S20, the process returns to step S10 and continues monitoring the behavior of the worker U.

In step S21, the recognition processing unit 18 generates action recognition result information specifying the action content of the worker U based on the worker behavior data 170, outputs it to the moving image processing unit 19, and the process proceeds to step S30.

In step S30, the moving image processing unit 19 refers to the worker behavior data 170 in the storage unit 17 based on the action content of the worker U output by the recognition processing unit 18 as the action recognition result information, and determines the processing content corresponding to that action content. That is, if the action recognition result information indicates the start point setting action 172 (information related to start point setting), the process proceeds to the start point setting process corresponding to the start point setting action 172 (step S40); if it indicates the labeling action 171 (information related to label assignment), the process proceeds, via a start point setting confirmation process (step S50), to the labeling process corresponding to the labeling action 171 (step S51); and if it indicates the end point setting action 173 (information related to end point setting), the process proceeds, via a start point setting confirmation process (step S60), to the end point setting process corresponding to the end point setting action 173 (step S61).

In this embodiment, the action recognition result information has been described as information specifying the action content in the worker behavior data 170, but the action recognition result information may instead be information specifying the processing content, such as that of the labeling action 171. In that case, the recognition processing unit 18 may refer to the worker behavior data 170 in the storage unit 17 based on the action content of the worker U obtained by analyzing the sensor signals, specify the processing content corresponding to that action content, generate the action recognition result information, and output it to the moving image processing unit 19. Then, when the recognition processing unit 18 outputs, as the action recognition result information, the start point setting process (information related to start point setting), the moving image processing unit 19 proceeds to step S40; when it outputs the labeling process (information related to label assignment), the process proceeds to step S50; and when it outputs the end point setting process (information related to end point setting), the process proceeds to step S60.
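As a rough illustration of the step S30 branching, including the start point guards of steps S50 and S60 described below, a dispatch skeleton might look like this; the class and state variable names are assumptions.

```python
class MovingImageProcessorSketch:
    """Skeleton of the step S30 branching in FIG. 4, including the
    step S50/S60 guards that require a start point to have been set."""

    def __init__(self):
        self.recording = False  # True between start point and end point
        self.label = None       # label assigned during the current clip

    def handle(self, result) -> None:
        """result is an ActionRecognitionResult (see the earlier sketch)."""
        if result.process == "set_start":        # step S40
            self.recording, self.label = True, None
        elif result.process == "label":          # steps S50-S51
            if self.recording:                   # guard: start point set?
                self.label = result.label
        elif result.process == "set_end":        # steps S60-S61
            if self.recording:                   # guard: start point set?
                self.recording = False
                # step S63 would output the clip here, with self.label
```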

In step S40, the moving image processing unit 19 sets the time when the recognition processing unit 18 output the action recognition result information, that is, the time when the action content of the worker U was determined to be the start point setting action 172, as the start point of the moving image file, and starts recording the moving image file. In step S41, the moving image processing unit 19 outputs display data indicating that recording of the moving image file has started to the worker terminal device 4.

FIG. 5 is a diagram explaining the operation of the video processing device 100 according to the first embodiment when the action content of the worker U is determined to be the start point setting action 172. In FIG. 5 and in FIGS. 6 and 7 described later, the microphone element 31 and the contact element 32 are omitted from the drawings.

When the worker U standing by near the installation location of the image pickup element 2 visually confirms that an object 101 flowing on the production line 102 has arrived in front of him or her, the worker raises the left hand. This action of "raising the left hand" is determined by the recognition processing unit 18 to be the start point setting action 172, and the moving image processing unit 19 starts recording the moving image file. The display screen 40 of the worker terminal device 4 displays, based on the display data output from the moving image processing unit 19, that the start point setting action 172 has been recognized and recording of the moving image file has started. The display screen 40 also displays a real-time video 41 based on the video signal of the image pickup element 2 and a real-time video 42 based on the video signal of the image pickup element 30 for the worker.

In step S50, it is confirmed whether the start point was set before the labeling action 171. If the start point has been set, the process proceeds to step S51; if not, the process ends without proceeding to steps S51 and S52. As a result, the labeling process is not performed when recording of the moving image file has not started, so malfunction can be prevented.

In step S51, the moving image processing unit 19 assigns the label to the moving image file being recorded at the time when the recognition processing unit 18 output the action recognition result information, that is, when the action content of the worker U was determined to be the labeling action 171. In step S52, the moving image processing unit 19 outputs display data indicating that the label has been assigned to the moving image file to the worker terminal device 4.

FIG. 6 is a diagram explaining the operation of the video processing device 100 according to the first embodiment when the action content of the worker U is determined to be the labeling action 171.

The worker U visually inspects the object 101 that has arrived in front of him or her and, upon finding an abnormality, raises both hands. This action of "raising both hands" is determined by the recognition processing unit 18 to be the labeling action 171, and the moving image processing unit 19 assigns the label "abnormal" to the moving image file currently being recorded. The display screen 40 of the worker terminal device 4 displays, based on the display data output from the moving image processing unit 19, that the labeling action 171 has been recognized and the label "abnormal" has been assigned to the moving image file.

When a plurality of labeling actions 171 are stored in the storage unit 17 and a plurality of labeling actions 171, for example a first labeling action and a second labeling action, are recognized during recording of a moving image file, both the first label information and the second label information may be assigned to the moving image file, or one of them may be assigned according to a predetermined priority (for example, giving priority to the action the worker U took later).
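The "later action wins" priority mentioned above could be realized, for example, by keeping only the last label recognized during the recording; the following is a minimal sketch covering both options, with names and the event format assumed for illustration.

```python
from typing import List, Tuple

def resolve_labels(events: List[Tuple[float, str]],
                   keep_all: bool = False) -> List[str]:
    """events: (timestamp, label) pairs recognized during one recording.

    keep_all=True  -> assign every recognized label to the clip.
    keep_all=False -> 'later action wins': only the last label is kept.
    """
    events = sorted(events)  # chronological order by timestamp
    if not events:
        return []
    if keep_all:
        return [label for _, label in events]
    return [events[-1][1]]

# resolve_labels([(1.2, "abnormality 1"), (3.4, "abnormality 2")])
# -> ["abnormality 2"]
```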

In step S60, it is confirmed whether the start point was set before the end point setting action 173. If the start point has been set, the process proceeds to step S61; if not, the process ends without proceeding to steps S61 to S63. As a result, the end point setting process is not performed when recording of the moving image file has not started, so malfunction can be prevented.

In step S61, the moving image processing unit 19 sets the time when the recognition processing unit 18 output the action recognition result information, that is, the time when the action content of the worker U was determined to be the end point setting action 173, as the end point of the moving image file, and ends recording of the moving image file. In step S62, the moving image processing unit 19 outputs display data indicating that recording of the moving image file has ended to the worker terminal device 4.

In step S63, the moving image processing unit 19 outputs the moving image file to the moving image file storage device 5. If the labeling action 171 of the worker U was recognized during recording of the moving image file, the file is output with the label information "abnormal" assigned in step S51. On the other hand, if the labeling action 171 of the worker U was not recognized, the moving image file may be output with the label information "normal" assigned as a default label, or it may be output without any label information.
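The default-label behavior at output time could be expressed as a simple fallback; "normal" as the default string follows the text above, while the function shape is an assumption.

```python
from typing import Optional

def label_at_output(assigned: Optional[str],
                    use_default: bool = True) -> Optional[str]:
    """Label written when the file is output at step S63: the label
    assigned during recording if any, otherwise either the default
    'normal' label or no label at all."""
    if assigned is not None:
        return assigned
    return "normal" if use_default else None
```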

FIG. 7 is a diagram explaining the operation of the video processing device 100 according to the first embodiment when the action content of the worker U is determined to be the end point setting action 173.

When the worker U visually confirms that the object 101 has passed in front of him or her, the worker raises the right hand. This action of "raising the right hand" is determined by the recognition processing unit 18 to be the end point setting action 173, and the moving image processing unit 19 ends recording of the moving image file. The display screen 40 of the worker terminal device 4 displays, based on the display data output from the moving image processing unit 19, that the end point setting action 173 has been recognized and recording of the moving image file has ended.

Among the steps shown in FIG. 4, steps S10 to S21 correspond to a recognition processing step, and steps S30 to S62 correspond to a moving image processing step. As a modification of the steps shown in FIG. 4, the timing of assigning the label information to the moving image file may be the timing of outputting the moving image file in step S63 instead of the timing of step S51. Instead of outputting the moving image file to the moving image file storage device 5 in step S63, the moving image file may be stored, for example, in the storage unit 17. Furthermore, the timing of outputting the moving image file in step S63 is not limited to immediately after step S62 and may be any timing. The execution order of step S62 and step S63 may also be interchanged.

The series of processes (the video processing method) shown in FIG. 4 is thus completed. By repeatedly executing the series of processes shown in FIG. 4, the video processing device 100 outputs a plurality of moving image files, each capturing, with the image pickup element 2, one of the objects 101 flowing one after another on the production line 102. A moving image file capturing an object 101 in which an abnormality was found by the visual inspection of the worker U is output with the label information "abnormal" assigned in response to the labeling action 171 of the worker U.

Therefore, according to this embodiment, the moving image processing unit 19 assigns label information to the moving image file in response to the labeling action 171 of the worker U, so the work of assigning labels to moving image files can be carried out efficiently.

Further, the moving image processing unit 19 sets the start point and end point of the moving image file in response to the start point setting action 172 and the end point setting action 173 of the worker U, so the period to be recorded as a moving image file can be easily indicated. In addition, the size of the moving image files is reduced compared to constant recording, and since each moving image file is limited to the period in which an object 101 was imaged, its usability as learning data is improved.

(Example 2)
FIG. 8 is a schematic configuration diagram showing a video processing device 100 according to the second embodiment.

The video processing system 1 according to the first embodiment includes the image pickup element 2 for imaging the object 101 and the image pickup element 30 (an example of the sensor element 3) for imaging the worker U as separate devices. In contrast, the video processing system 1 according to the second embodiment includes an image pickup element 2 that images the object 101 and the worker U within the same angle of view, so that the image pickup element 2 also serves as the image pickup element 30 of the sensor element 3. The video processing system 1 shown in FIG. 8 includes the microphone element 31 and the contact element 32 as the sensor element 3, but the microphone element 31 and the contact element 32 may be omitted.

The video signal of the image pickup element 2, the audio signal of the microphone element 31, and the contact signal of the contact element 32 are input to the video processing device 100 according to this embodiment. Since the basic configuration and operation of the video processing device 100 according to this embodiment are the same as those of the video processing device 100 according to the first embodiment, the differences between the two will mainly be described here.

The image pickup element 2 is connected to the video signal input unit 11, and the video signal output from the image pickup element 2 is output to the recognition processing unit 18 and the moving image processing unit 19 via the video signal input unit 11 and the video signal processing unit 12.

The video recognition unit 180, the voice recognition unit 181, and the contact recognition unit 182 of the recognition processing unit 18 analyze the video signal output from the image pickup element 2 and the sensor signals (audio signal and contact signal) output from the sensor element 3, and determine the action content of the worker U working near the installation location of the image pickup element 2. The character recognition unit 183 of the recognition processing unit 18 analyzes the video signal output from the image pickup element 2 and acquires the characters contained in the video.

Based on the action recognition result information and the character recognition result information from the recognition processing unit 18, the moving image processing unit 19 processes the video signal of the image pickup element 2 processed by the video signal processing unit 12 into a moving image file of a predetermined period and outputs it.

According to this embodiment, even when the image pickup element 2 is installed to image both the object 101 and the worker U, the moving image processing unit 19 assigns label information to the moving image file in response to the labeling action 171 of the worker U. Therefore, the work of assigning labels to moving image files can be carried out efficiently while simplifying the system configuration.

(Example 3)
FIG. 9 is a schematic configuration diagram showing an image pickup apparatus 200 according to the third embodiment.

The video processing system 1 according to the first embodiment includes the video processing device 100 and the image pickup element 2 as separate devices. In contrast, the video processing system 1 according to the third embodiment includes an image pickup apparatus 200 in which the video processing device 100 and the image pickup element 2 are configured as one device.

According to this embodiment, even when the image pickup apparatus 200 is configured as a device integrating the video processing device 100 and the image pickup element 2, the moving image processing unit 19 assigns label information to the moving image file in response to the labeling action 171 of the worker U. Therefore, the work of assigning labels to moving image files can be carried out efficiently while simplifying the system configuration.

(Example 4)
FIG. 10 is a schematic configuration diagram showing an image pickup apparatus 200 according to the fourth embodiment.

As described above, the video processing system 1 according to the third embodiment includes the image pickup apparatus 200 (image pickup element 2) for imaging the object 101 and the image pickup element 30 (an example of the sensor element 3) for imaging the worker U as separate devices. In contrast, in the video processing system 1 according to the fourth embodiment, the image pickup apparatus 200 includes an image pickup element 2 that images the object 101 and the worker U within the same angle of view, so that the image pickup element 2 also serves as the image pickup element 30 of the sensor element 3. The video processing system 1 shown in FIG. 10 includes the microphone element 31 and the contact element 32 as the sensor element 3, but the microphone element 31 and the contact element 32 may be omitted.

According to this embodiment, even when the image pickup element 2 is installed to image both the object 101 and the worker U, the moving image processing unit 19 assigns label information to the moving image file in response to the labeling action 171 of the worker U. Therefore, the work of assigning labels to moving image files can be carried out efficiently while simplifying the system configuration.

(Configuration example of the computer system 300)
FIG. 11 is a block diagram showing a configuration example of a computer system 300. The computer system 300 described below may be applied to each device of each of the above embodiments. In that case, the configuration of the computer system 300 may be modified as appropriate to suit the functions of each device.

The computer system 300 includes, as its main components, a processor 302, a memory 304, a terminal interface 312, a storage interface 314, an I/O (input/output) device interface 316, and a network interface 318. These main components are connected to one another via a bus structure consisting of a memory bus 306, an I/O bus 308, a bus interface unit 309, and an I/O bus interface unit 310.

The processor 302 is composed of one or more arithmetic processing units (CPU, MPU, GPU, DSP, etc.). In the example of FIG. 11, the processor 302 is composed of two CPUs 302A and 302B. The processor 302 executes instructions stored in the memory 304 and may include an on-board cache.

The memory 304 is composed of volatile memory (DRAM, SRAM, etc.) or non-volatile memory (ROM, flash memory, etc.) for storing various data and a program 350. The memory 304 represents the virtual memory of the computer system 300 and may include the virtual memory of another computer system connected to the computer system 300 via a network 330.

The program 350 may include instructions or descriptions executed on the processor 302, or instructions or descriptions interpreted by other instructions or descriptions. The program 350 may also be implemented in hardware via semiconductor devices, chips, logic gates, circuits, circuit cards, and/or other physical hardware devices instead of, or in addition to, a processor-based system.

The program 350 may be stored in a storage device 322 instead of the memory 304, or may be provided stored in a computer-readable recording medium such as a CD, DVD, or USB memory in an installable or executable file format. The program 350 may also be stored in an external computer system and provided by downloading it via the network 330.

A display system 324 is composed of a display controller, a display memory, or both. The display controller outputs video data, audio data, or both to a display device 326. The display memory is a dedicated memory for buffering video data. The functions provided by the display system 324 may be realized by an integrated circuit including the processor 302.

The display device 326 is composed of, for example, a liquid crystal display, an organic EL display, electronic paper, or a projector. Alternatively, another device having a display screen (for example, a portable device such as a notebook computer, tablet computer, or smartphone, or a television) may be connected and caused to function as the display device 326.

The terminal interface 312 is configured to be connectable to user I/O devices 320 such as user input devices and user output devices, and provides a communication path through which the processor 302 and the user I/O devices 320 communicate with each other by wire or wirelessly. The user I/O device 320 is, for example, a keyboard, mouse, keypad, touchpad, trackball, button, light pen, or other pointing device.

The storage interface 314 is configured to be connectable to the storage device 322 and provides a communication path through which the processor 302 and the storage device 322 communicate with each other. The storage device 322 has one or more disk drives and is composed of, for example, magnetic disk drives or solid-state disk drives.

The I/O device interface 316 is configured to be connectable to external I/O devices 340 and provides a communication path through which the processor 302 and the external I/O devices 340 communicate with each other by wire or wirelessly. The external I/O device 340 is, for example, any device such as a camera, scanner, microphone, speaker, printer, or fax machine, or any sensor such as a biometric sensor, environment sensor, or motion sensor. The computer system 300 may include some or all of the external I/O devices 340 described above.

The network interface 318 is configured to be connectable to the network 330, such as the Internet or an intranet, and provides a communication path through which the processor 302 and the network 330 communicate with each other by wire or wirelessly. The external I/O devices 340 may be connected to the network interface 318 via the network 330.

The bus structure in the computer system 300 provides communication paths that directly connect the processor 302, the memory 304, the bus interface unit 309, the display system 324, and the I/O bus interface unit 310. The bus structure may include, for example, point-to-point links in a hierarchical, star, or web configuration, multiple hierarchical buses, and parallel or redundant communication paths. The functions provided by the bus interface unit 309 may be realized by an integrated circuit including the processor 302.

The computer system 300 is, for example, a desktop computer, notebook computer, tablet computer, smartphone, mobile phone, or any other suitable electronic device. The computer system 300 may be a client computer, a server computer, or a cloud computer.

When the video processing device 100 and the image pickup apparatus 200 are realized by the computer system 300 having the above configuration, the video signal input unit 11, the audio signal input unit 13, and the contact signal input unit 15 are configured by one or more I/O device interfaces 316. The video signal processing unit 12, the audio signal processing unit 14, the contact signal processing unit 16, the recognition processing unit 18, and the moving image processing unit 19 are configured by one or more processors 302. The storage unit 17 is configured by at least one of one or more memories 304 and one or more storage devices 322.

When the video signal processing unit 12, the audio signal processing unit 14, the contact signal processing unit 16, the recognition processing unit 18, and the moving image processing unit 19 are configured by a plurality of processors 302, each unit may be configured by a separate processor 302, or any combination of the units may be configured by separate processors 302. For example, the video signal processing unit 12, the audio signal processing unit 14, and the contact signal processing unit 16 may be configured by a first processor, and the recognition processing unit 18 and the moving image processing unit 19 may be configured by a second processor. Alternatively, the processor 302 may realize at least some of the functions of the video signal processing unit 12, the audio signal processing unit 14, the contact signal processing unit 16, the recognition processing unit 18, and the moving image processing unit 19 by executing the program 350; for example, it may realize the functions of the recognition processing unit 18 and the moving image processing unit 19.

 The image pickup element 2 and the sensor element 3 in the video processing system 1 correspond to the external I/O devices 340, and the worker terminal device 4 and the moving image file storage device 5 correspond to other computer systems connected to the network 330. The worker terminal device 4 may also correspond to the display device 326.

(Supplementary Information)
 Although embodiments of the present invention have been described above, the present invention is not limited to those embodiments, and various modifications are possible without departing from the gist of the present invention. For example, the present invention is not limited to configurations including all of the elements described in the above embodiments, and also includes configurations from which some of those elements have been deleted. It is also possible to add or substitute part of the configuration of one embodiment into the configuration of another embodiment.

 The worker behavior data 170 may be changed as appropriate by the worker U operating, for example, the worker terminal device 4. Furthermore, the analysis parameters used when the recognition processing unit 18 analyzes the sensor signal and determines the action content of the worker U may likewise be adjusted as appropriate by the worker U operating, for example, the worker terminal device 4.
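 As a minimal sketch of such an adjustment, assuming the analysis parameters are held in a simple dictionary to which updates from the worker terminal device 4 are applied (the parameter names below are hypothetical):

```python
# Hypothetical sketch: the recognition processing unit reads its analysis
# parameters from this dictionary; the worker terminal device sends updates.
analysis_params = {
    "gesture_confidence_threshold": 0.8,  # assumed: minimum score to accept an action
    "voice_keyword_sensitivity": 0.5,     # assumed: speech keyword sensitivity
}

def apply_terminal_update(update: dict) -> None:
    # Apply a parameter change received from the worker terminal device 4.
    for key, value in update.items():
        if key in analysis_params:        # ignore unknown parameter names
            analysis_params[key] = value

# Example: the worker raises the threshold to reduce spurious label assignments.
apply_terminal_update({"gesture_confidence_threshold": 0.9})
print(analysis_params["gesture_confidence_threshold"])  # 0.9
```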

 In each of the above embodiments, the recognition processing unit 18 has been described as recognizing both the start point setting action 172 and the end point setting action 173; however, the processing for recognizing at least one of them may be omitted. In that case, the start point and the end point of the moving image file are set by other means. For example, the recognition processing unit 18 may analyze the video signal output from the image pickup element 2 and set the frame-in and frame-out times of the object 101 as the start point and the end point of the moving image file, respectively. Alternatively, trigger generators such as transmission sensors or reflection sensors may be installed on the upstream and downstream sides of the installation location of the image pickup element 2, and the start point and the end point of the moving image file may be set based on the trigger signals of those trigger generators.
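 The frame-in/frame-out alternative could be sketched as follows, assuming per-frame foreground detection via OpenCV background subtraction; the choice of detector, the source name, and the presence threshold are assumptions, not something the embodiment prescribes:

```python
# Sketch: deriving the moving image file's start/end points from the
# frame-in and frame-out of the object 101, assuming OpenCV (cv2) and a
# simple foreground-ratio test.
import cv2

subtractor = cv2.createBackgroundSubtractorMOG2()
PRESENCE_RATIO = 0.01  # assumed: >1% foreground pixels means the object is in frame

def object_present(frame) -> bool:
    mask = subtractor.apply(frame)
    return (mask > 0).mean() > PRESENCE_RATIO

cap = cv2.VideoCapture("line_camera.mp4")  # hypothetical video source
start_point = end_point = None
frame_index, was_present = 0, False
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    present = object_present(frame)
    if present and not was_present:
        start_point = frame_index          # frame-in -> start of the period
    if was_present and not present:
        end_point = frame_index            # frame-out -> end of the period
        break
    was_present, frame_index = present, frame_index + 1
cap.release()
```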

 In each of the above embodiments, the video processing device 100 and the image pickup device 200 have been described as including the audio signal input unit 13, the audio signal processing unit 14, the contact signal input unit 15, and the contact signal processing unit 16; however, these may be omitted depending on which sensor elements 3 are connected to the video processing device 100 and the image pickup device 200. For example, if the microphone element 31 is not connected, the audio signal input unit 13 and the audio signal processing unit 14 may be omitted, and if the contact element 32 is not connected, the contact signal input unit 15 and the contact signal processing unit 16 may be omitted.

 In each of the above embodiments, the recognition processing unit 18 has been described as including the character recognition unit 183; however, the character recognition unit 183 may be omitted. In that case, the moving image processing unit 19 may output the moving image file on the assumption that no character recognition result from the character recognition unit 183 has been acquired.

1... video processing system, 2... image pickup element, 3... sensor element, 4... worker terminal device, 5... moving image file storage device, 11... video signal input unit, 12... video signal processing unit, 13... audio signal input unit, 14... audio signal processing unit, 15... contact signal input unit, 16... contact signal processing unit, 17... storage unit, 18... recognition processing unit, 19... moving image processing unit, 30... image pickup element, 31... microphone element, 32... contact element, 100... video processing device, 170... worker behavior data, 171... label assignment action, 172... start point setting action, 173... end point setting action, 180... video recognition unit, 181... voice recognition unit, 182... contact recognition unit, 183... character recognition unit, 200... image pickup device, 300... computer system, 302... processor, 302A, 302B... CPU, 304... memory, 306... memory bus, 308... I/O bus, 309... bus interface unit, 310... I/O bus interface unit, 312... terminal interface, 314... storage interface, 316... I/O device interface, 318... network interface, 320... user I/O device, 322... storage device, 324... display system, 326... display device, 330... network, 340... external I/O device, 350... program

Claims (5)

1. A video processing device comprising:
a moving image processing unit that processes a video signal output from an image pickup element into a moving image file covering a predetermined period and outputs the moving image file;
a storage unit that stores worker behavior data including action contents of a worker and processing contents corresponding thereto; and
a recognition processing unit that determines an action content of the worker based on a sensor signal output from a sensor element, generates action recognition result information based on the worker behavior data in the storage unit, and outputs the action recognition result information to the moving image processing unit,
wherein, when the recognition processing unit outputs information relating to label assignment as the action recognition result information, the moving image processing unit assigns the label relating to the label assignment to the moving image file covering the predetermined period that includes the time at which the action recognition result information was output.
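For orientation only, and forming no part of the claims, the label assignment recited above might be realized minimally as follows; the data structures and names are assumptions for illustration:

```python
# Illustrative sketch of claim 1's labeling flow: a label output by the
# recognition processing unit is attached to the moving image file whose
# predetermined period contains the output time. Structures are hypothetical.
from dataclasses import dataclass, field

@dataclass
class MovingImageFile:
    start: float                      # start point of the predetermined period (s)
    end: float                        # end point of the predetermined period (s)
    labels: list = field(default_factory=list)

def on_action_recognition_result(clips, info_type, label, output_time):
    # Called whenever the recognition processing unit emits result information.
    if info_type == "label_assignment":
        for clip in clips:
            if clip.start <= output_time <= clip.end:
                clip.labels.append(label)

clips = [MovingImageFile(start=10.0, end=25.0)]
on_action_recognition_result(clips, "label_assignment", "abnormal", 17.3)
print(clips[0].labels)  # ['abnormal']
```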
2. The video processing device according to claim 1, wherein the moving image processing unit:
sets, when the recognition processing unit outputs information relating to a start point instruction as the action recognition result information, the time at which the action recognition result information was output as the start point of the predetermined period; and
sets, when the recognition processing unit outputs information relating to an end point instruction as the action recognition result information, the time at which the action recognition result information was output as the end point of the predetermined period.
3. An image pickup device comprising:
an image pickup element;
a moving image processing unit that processes a video signal output from the image pickup element into a moving image file covering a predetermined period and outputs the moving image file;
a storage unit that stores worker behavior data including action contents of a worker and processing contents corresponding thereto; and
a recognition processing unit that determines an action content of the worker based on a sensor signal output from a sensor element, generates action recognition result information based on the worker behavior data in the storage unit, and outputs the action recognition result information to the moving image processing unit,
wherein, when the recognition processing unit outputs information relating to label assignment as the action recognition result information, the moving image processing unit assigns the label relating to the label assignment to the moving image file covering the predetermined period that includes the time at which the action recognition result information was output.
4. The image pickup device according to claim 3, wherein the moving image processing unit:
sets, when the recognition processing unit outputs information relating to a start point instruction as the action recognition result information, the time at which the action recognition result information was output as the start point of the predetermined period; and
sets, when the recognition processing unit outputs information relating to an end point instruction as the action recognition result information, the time at which the action recognition result information was output as the end point of the predetermined period.
5. A video processing method for processing a video signal output from an image pickup element into a moving image file covering a predetermined period and outputting the moving image file, the method comprising:
a recognition processing step of determining an action content of a worker based on a sensor signal output from a sensor element, and of generating and outputting action recognition result information based on worker behavior data stored in a storage unit, the worker behavior data including action contents of the worker and processing contents corresponding thereto; and
a moving image processing step of:
assigning, when the recognition processing step outputs information relating to label assignment as the action recognition result information, the label relating to the label assignment to the moving image file covering the predetermined period that includes the time at which the action recognition result information was output;
setting, when the recognition processing step outputs information relating to a start point instruction as the action recognition result information, the time at which the action recognition result information was output as the start point of the predetermined period; and
setting, when the recognition processing step outputs information relating to an end point instruction as the action recognition result information, the time at which the action recognition result information was output as the end point of the predetermined period.
PCT/JP2020/035209 2020-09-17 2020-09-17 Video processing device, image capturing device, and video processing method Ceased WO2022059117A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
JP2022550114A JP7471435B2 (en) 2020-09-17 2020-09-17 Image processing device, imaging device, and image processing method
PCT/JP2020/035209 WO2022059117A1 (en) 2020-09-17 2020-09-17 Video processing device, image capturing device, and video processing method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2020/035209 WO2022059117A1 (en) 2020-09-17 2020-09-17 Video processing device, image capturing device, and video processing method

Publications (1)

Publication Number Publication Date
WO2022059117A1 true WO2022059117A1 (en) 2022-03-24

Family

ID=80776603

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2020/035209 Ceased WO2022059117A1 (en) 2020-09-17 2020-09-17 Video processing device, image capturing device, and video processing method

Country Status (2)

Country Link
JP (1) JP7471435B2 (en)
WO (1) WO2022059117A1 (en)

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2008225704A (en) * 2007-03-09 2008-09-25 Omron Corp Work evaluation apparatus, work evaluation method, and control program
EA201401064A1 (en) * 2014-10-28 2016-04-29 Общество с ограниченной ответственностью "Синезис" METHOD (OPTIONS) SYSTEMATIZATION OF VIDEO DATA PRODUCTION PROCESS AND SYSTEM (OPTIONS)
JP6783713B2 (en) * 2017-06-29 2020-11-11 株式会社 日立産業制御ソリューションズ Human behavior estimation system
JP6613343B1 (en) * 2018-08-01 2019-11-27 三菱ロジスネクスト株式会社 Determination apparatus and determination method

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2003143594A (en) * 2001-11-05 2003-05-16 Emiiru Denshi Kaihatsusha:Kk Manufacturing shop presentation management system
JP2010258729A (en) * 2009-04-24 2010-11-11 Yoshiro Mizuno Image/sound monitoring system
JP2019110420A (en) * 2017-12-18 2019-07-04 トヨタ自動車株式会社 Moving image editing device

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2024190673A1 (en) * 2023-03-16 2024-09-19 Panasonic Intellectual Property Management Co., Ltd. Information processing method, information processing device, and information processing program

Also Published As

Publication number Publication date
JPWO2022059117A1 (en) 2022-03-24
JP7471435B2 (en) 2024-04-19

Similar Documents

Publication Publication Date Title
US6708081B2 (en) Electronic equipment with an autonomous function
JP6780767B2 (en) Inspection support device, inspection support method and program
JP2019101919A (en) Information processor, information processing method, computer program, and storage medium
US12008724B2 (en) Method for determining correct scanning distance using augmented reality and machine learning models
JP6998521B2 (en) Information processing method and information processing program
JP2019193019A (en) Work analysis device and work analysis method
KR20210044116A (en) Electronic Device and the Method for Classifying Development Condition of Infant thereof
JP7471435B2 (en) Image processing device, imaging device, and image processing method
US11070716B2 (en) Image sensor system, image sensor, data generation method of image sensor in image sensor system, and non-transitory computer-readable recording medium
JP7769009B2 (en) work analysis device
TW201709022A (en) Non-contact control system and method
JPWO2021124873A5 (en)
US20240028638A1 (en) Systems and Methods for Efficient Multimodal Search Refinement
KR102170416B1 (en) Video labelling method by using computer and crowd-sourcing
US20150154775A1 (en) Display control method, information processor, and computer program product
CN118820878A (en) A transformer fault detection method and system integrating voiceprint and oil temperature features
US20250182482A1 (en) Moving image integration device, moving image integration method, and moving image integration program
JP7158534B1 (en) Behavior analysis device, behavior analysis method, and behavior analysis program
JP7566497B2 (en) Information processing device, system, control method for information processing device, and program
CN115033107A (en) Self-service tool use guide method, system, equipment and medium
JP7766886B1 (en) Information processing system, information processing method, learning model generation method, and computer program
JP7086322B2 (en) Motion analysis device, motion analysis method, and motion analysis program
JP7464263B2 (en) DETECTION APPARATUS, DETECTION METHOD, AND PROGRAM
JP2020035384A (en) Information processing device, method, program, and system
JP7568059B2 (en) Determination device, determination method, and program

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20954103

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2022550114

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20954103

Country of ref document: EP

Kind code of ref document: A1