WO2024232239A1 - Attention extraction system and attention extraction method - Google Patents
Attention extraction system and attention extraction method
- Publication number
- WO2024232239A1 (application PCT/JP2024/015567)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- information
- gaze
- attention
- instructor
- work
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/063—Operations research, analysis or management
- G06Q10/0631—Resource planning, allocation, distributing or scheduling for enterprises or organisations
- G06Q10/06311—Scheduling, planning or task assignment for a person or group
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/011—Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
- G06F3/013—Eye tracking input arrangements
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/063—Operations research, analysis or management
- G06Q10/0639—Performance analysis of employees; Performance analysis of enterprise or organisation operations
- G06Q10/06398—Performance of employee with respect to a job function
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/10—Services
- G06Q50/20—Education
- G06Q50/205—Education administration or guidance
- G06Q50/2057—Career enhancement or continuing education service
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/20—Scenes; Scene-specific elements in augmented reality scenes
-
- G—PHYSICS
- G09—EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
- G09B—EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
- G09B5/00—Electrically-operated educational appliances
- G09B5/02—Electrically-operated educational appliances with visual presentation of the material to be studied, e.g. using film strip
Definitions
- the present invention relates to an attention extraction system and an attention extraction method that extract attention information for a task.
- the skill transfer system disclosed in Patent Document 1 performs machine learning, based on teacher data, on first image data and gaze data of a first user's field of view, extracts feature data of the images, and registers information on the first user's focus point in the first image, input as teacher data, in association with the feature data. The system then transmits second image data of a second user's field of view to a server, reads out focus point data associated with the feature data corresponding to the second image, creates support display data and transmits it to the second user's glasses terminal, and displays information on the focus point based on the support display data on the second user's glasses terminal, superimposed on the second user's field of view.
- the wiring work support system disclosed in Patent Document 2 includes an eye tracking device that tracks the worker's line of sight and an image capture device that captures the wiring and other objects being worked on, and processes images of the worker's visual range to decipher characters and colors. It also discloses a portable work management device that has an image processing unit which provides the worker with information about objects stored in a data storage management unit, and that stores work completion signals from limit switches installed on the work tools together with images of the objects being worked on.
- Patent Document 1 is premised on users coordinating their gazes and passing on skills between them. That is, it is difficult to grasp the state of gaze based on gaze duration, gaze shifts, and the like, and to extract appropriate attention information and provide it to the worker. Furthermore, there is no mention or suggestion of identifying attention behavior that indicates what the instructor gazed at and in what situation.
- Patent Document 2 is premised on linking the worker's line of sight with the work object and providing information about the object to the worker. That is, it is difficult to grasp the state of gaze based on the gaze duration and gaze shift of the viewpoint, and to extract appropriate attention information and provide it to the worker. Furthermore, there is no mention or suggestion of identifying attention behavior that indicates what the instructor gazed at and in what situation.
- the present invention was devised in consideration of the above-mentioned problems, and its purpose is to provide an attention extraction system and method that grasps the instructor's gaze state and extracts and provides appropriate attention information.
- the attention extraction system is an attention extraction system that extracts attention information for a task, and is characterized by comprising: an acquisition means that acquires, in time series and in association with the task, video information in the visual field of an instructor performing the task and coordinate information indicating the viewpoint at which the instructor gazes in the visual field; an identification means that determines the viewpoint displacement of the instructor in the visual field based on the video information and coordinate information acquired by the acquisition means and identifies the instructor's gaze mode; an extraction means that sets a gaze area based on the gaze mode identified by the identification means and extracts a gaze image of a task object gazed at by the instructor in the gaze area; and a storage means that links the gaze image extracted by the extraction means to the visual field, the viewpoint displacement, and the gaze mode, and stores the gaze image in a database as attention information for the task.
- the attention extraction system of the first invention further includes a determination means for determining whether the task object included in the gaze image is correct or incorrect, and a display means for displaying the determination result of the determination means.
- the attention extraction system of the second invention further comprises a database storing the correlation between previously acquired past gaze image information and reference information indicating the correctness or incorrectness of the work object linked to the gaze image; the determination means refers to the database to determine the correctness or incorrectness of the work object and obtains corresponding information from the database according to the result of the determination, and the display means further outputs the corresponding information obtained by the determination means.
- the attention extraction system of the second invention further comprises an input means for inputting the corresponding information output by the determination means, and the storage means links the corresponding information input by the input means with the gaze image and stores it as an attention data set including setting conditions for recognizing it as a gaze target.
- in the attention extraction system of the second invention, the video information acquired by the acquisition means includes recording date and time information, recording position information, and recording control information related to the operation of acquiring the video information of the work performed by the instructor, and the display means displays the recording date and time information, the recording position information, and the recording control information in the center of the field of view before the acquisition means acquires the video information, and switches to displaying only the recording control information in a corner of the field of view while the video information is being acquired.
- the sixth aspect of the present invention is an attention extraction system according to the second aspect of the present invention, characterized in that the display means further comprises an acquisition display area that displays instructions for the worker performing the work to select the type of the acquisition means for acquiring the video information, the type of the attention data set stored in the database, and the start of the work, and an attention information display area that, after the selection, switches between displaying corresponding information corresponding to the gaze image of the worker and attention information based on the attention data set. The attention information displayed in the attention information display area includes nudge information that is displayed at least at the start of work, in the middle of work, or at the end of work, depending on the progress of the worker's work, and the information including at least any of the corresponding information, the attention information, and the nudge information displayed in the attention information display area is displayed in a manner distributed according to the gaze state of the worker performing the work, based on the attention data set and the result of the determination by the determination means.
- the attention extraction method is characterized in that a computer is made to execute the following steps: an acquisition step of acquiring, in time series and in association with the task, video information in the visual field of an instructor performing the task and coordinate information indicating the viewpoint of the instructor in the visual field; an identification step of determining, based on the video information and the coordinate information acquired in the acquisition step, the viewpoint displacement of the instructor in the visual field from the time-series coordinate information and identifying the instructor's gaze mode, the gaze mode including a wide field of view mode in which the viewpoint displacement is concentrated at the center of the visual field and an alert mode in which the viewpoint displacement is dispersed; an extraction step of setting a gaze area based on the gaze mode identified in the identification step, and extracting a gaze image of a work object gazed at by the instructor in the gaze area; and a storage step of linking the gaze image extracted in the extraction step to the visual field, the viewpoint displacement, and the gaze mode, and storing the gaze image in a database as attention information for the task.
- the identification means determines the displacement of the instructor's viewpoint within the field of view based on the video information and coordinate information, and identifies the instructor's gaze mode. Therefore, the extraction means can set a gaze area based on the identified gaze mode, and extract a gaze image of the work object on which the instructor gazed in the gaze area. This makes it possible to extract a gaze image that shows what and in what situation the instructor gazed, based on the gaze time and gaze shift of the viewpoint, and thus makes it possible to accurately grasp the state of gaze and provide appropriate attention information.
- the gaze mode includes the instructor's wide field of view mode and alert mode. Therefore, it is possible to set a wide field of view mode in which the instructor's viewpoint displacement is concentrated at the center of the field of view range, and an alert mode in which the instructor's viewpoint displacement is dispersed outside the center of the field of view range. This makes it possible to extract a gaze image that shows what the instructor is gazing at and in what situation.
- the attention extraction system further includes a determination means. Therefore, it is possible to determine whether the task object contained in the gaze image is correct or incorrect. This makes it possible to accurately grasp the gaze states of the instructor and the worker, and provide appropriate attention information.
- the database stores the correlation between previously acquired past gaze image information and reference information indicating the correctness of the work object linked to the gaze image. Therefore, the determination means can refer to the database to determine the correctness of the work object, and the display means can output corresponding information according to the determination result. This makes it possible to accurately grasp the gaze state of the worker and provide appropriate attention information.
- the input means inputs the corresponding information output by the determination means. Therefore, the storage means can link the corresponding information with the gaze image and store it as an attention data set including setting conditions for recognizing it as a gaze target. This makes it possible to accurately grasp the gaze state and provide appropriate attention information.
- the display means switches the information displayed in the field of view before and during acquisition of video information. Therefore, before acquisition of video information, recording date and time information, recording position information, and recording control information of the instructor's work are displayed in the center of the field of view, and during acquisition, only the recording control information is displayed in the corner of the field of view. This makes it possible to acquire video information of the instructor performing the work and extract the gaze image without placing a burden on the instructor.
- the display means switches between displaying the acquisition display area and the attention display area. Therefore, based on the attention data set and the determination result, the corresponding information, attention information, or nudge information can be displayed in accordance with the gaze state of the worker performing the work. This makes it possible to accurately grasp the gaze state of the worker and provide appropriate attention information.
- the identification step determines the instructor's viewpoint displacement within the field of view based on the video information and coordinate information, and identifies the instructor's gaze mode. Therefore, in the extraction step, a gaze area can be set based on the identified gaze mode, and a gaze image of the work object on which the instructor gazed in the gaze area can be extracted. This makes it possible to extract a gaze image that shows what and in what situation the instructor gazed based on the gaze time and gaze shift of the viewpoint, making it possible to accurately grasp the gaze state and provide appropriate attention information.
- FIG. 1 is a schematic diagram showing an example of the configuration of an attention extraction system according to this embodiment.
- FIG. 2 is a schematic diagram showing an example of attention extraction by an instructor and task evaluation by an operator in the attention extraction system 100 according to this embodiment.
- FIGS. 3A to 3C are schematic diagrams showing an example of a method for extracting attention in this embodiment.
- FIG. 4A is a schematic diagram showing an example of the configuration of an attention extraction device
- FIG. 4B is a schematic diagram showing an example of the function of the attention extraction device.
- FIG. 5 is a schematic diagram showing an example of a database in this embodiment.
- FIG. 6 is a schematic diagram showing a first modified example of the database in this embodiment.
- FIG. 7 is a schematic diagram showing an example of a data table stored in a database in this embodiment.
- FIG. 8 is a schematic diagram showing an example of a flowchart illustrating the attention extraction method according to this embodiment.
- FIGS. 9A to 9E are schematic diagrams showing an example of a method for extracting attention in this embodiment.
- FIGS. 10A and 10B are schematic diagrams showing an example of a display on the worker device in this embodiment.
- FIG. 11 is a schematic diagram showing an example of a display of the attention extraction device in this embodiment.
- FIGS. 12A to 12E are schematic diagrams showing examples of displays on the worker terminal in this embodiment.
- FIG. 1 is a schematic diagram showing an example of the configuration of the attention extraction system 100 in this embodiment.
- FIG. 2 is a schematic diagram showing an example of attention extraction by an instructor and task evaluation by an operator in the attention extraction system 100 in this embodiment.
- the attention extraction system 100 is used to extract the task object that the instructor should pay attention to, using image information within the visual field of the instructor performing the task and coordinate information indicating the viewpoint at which the instructor is gazing within the visual field.
- the attention extraction system 100 can acquire, within the instructor's visual field range and under various conditions, various types of information related to the task, including video information, coordinate information, and task information.
- the attention extraction system 100 includes an attention extraction device 1, an instructor device 2, a worker device 3, and a server 4, and, for example, a plurality of instructor devices 2 and worker devices 3 may be provided in a work area 50.
- the attention extraction system 100 may transmit and receive various information to and from the attention extraction device 1, instructor device 2, worker device 3, server 4, and other user devices (not shown), for example, via a known communication network 5.
- the attention extraction system 100 acquires viewpoint displacement 2b of the field of view at which the instructor is gazing, which is included in the instructor's visual field range 2a, via an instructor device 2 worn by the instructor, for example.
- the information acquired by the attention extraction system 100 from the instructor includes, for example, video information (images within the visual field range where work is performed), coordinate information (viewpoint, position information, etc.), work information (work date and time, work instructions, process information, etc.), instructor information (instructor ID, equipment ID, etc.), and may also include various information about the work performed by the instructor (assigned work, group work, etc.).
- the attention extraction system 100 acquires, from an instructor device 2 worn by the instructor, image information within the visual field of the instructor performing a task and coordinate information indicating the viewpoint at which the instructor is gazing within the visual field in chronological order, linking the information to the task performed by the instructor.
- the image information and coordinate information may be acquired using, for example, known eye tracking technology, a head-mounted display, or a technology for acquiring the viewpoint provided in smart glasses, etc.
- the attention extraction system 100 acquires video information and coordinate information, for example, using the aforementioned technology provided in the instructor device 2, identifies the instructor's gaze mode in the attention extraction device 1, and extracts a gaze image based on the identified gaze mode.
- the attention extraction system 100 for example, links the extracted instructor's gaze image to reference information linked to the gaze image, and stores it in a database as an attention dataset.
- the attention extraction system 100 may, for example, refer to a database, acquire attention information linked to the reference information, and display it on the worker device 3 of the worker using the attention data set.
- the attention extraction system 100 evaluates the work of the worker based on, for example, the image information of the work object 6 included in the field of view 3a of the worker, the numerical information of the viewpoint displacement, the coordinate information of the field of view, and the like, in accordance with the attention data set selected by the worker, identifies, for example, the work object 6 as gaze information based on the evaluation result, and further obtains corresponding information such as "(1) Check" and "(2) Adjust", which is superimposed on the field of view 3a of the worker device 3 and displayed together with the actual view. Details of the configurations of the attention extraction device 1, the instructor device 2, and the worker device 3 will be described later.
- the attention extraction system 100 identifies the instructor's gaze mode by, for example, an attention extraction method such as that shown in FIG. 3(a) to (c). For example, as shown in FIG. 3(a), the attention extraction system 100 identifies the gaze mode in the instructor's visual field range based on the image information in the visual field range 2a acquired by the instructor device 2 and the coordinate information (x-axis, y-axis) of the instructor's viewpoint displacement.
- the attention extraction system 100 obtains the instructor's viewpoint displacement from the coordinates in the field of view 2a based on the image information and coordinate information acquired via the instructor device 2, and identifies the instructor's gaze mode. For example, as shown in FIG. 3(b), the attention extraction system 100 may identify a "wide field of view mode" if the instructor's viewpoint displacement is characterized by a "narrow displacement range (concentrated)" and "few points of interest of the viewpoint". The attention extraction system 100 may also identify an "alert mode" if the instructor's viewpoint displacement is characterized by a "wide displacement range (dispersed)" and "many points of interest of the viewpoint", as shown in FIG. 3(c).
- the attention extraction system 100 sets a gaze area on which the instructor gazes based on the identified gaze mode, extracts a gaze image of the work object on which the instructor gazes in the set gaze area, links the extracted gaze image to the field of view, viewpoint displacement, and gaze mode, and stores it in a database on the server 4 as attention information for the instructor's work.
- the attention extraction system 100 may assign, for example, one work process of the instructor to be shared among multiple workers.
- the allocation of instructors and workers by the attention extraction system 100 may be determined based on, for example, the number of steps in a work process, the difficulty level, the skills of the workers, the deadline for the work, and the like, or may be determined by, for example, an evaluator via the attention extraction device 1. How many workers are assigned to which tasks, and how the workers are arranged, is arbitrary and may be specified as appropriate.
- the attention extraction device 1 determines the displacement of the instructor's viewpoint within the field of view based on, for example, video information in the field of view of the instructor's work acquired by the instructor device 2 and coordinate information indicating the viewpoint at which the instructor is gazing within the field of view, and identifies the instructor's gaze mode.
- the gaze mode identified by the identification means includes, for example, a wide field of view mode in which the instructor's viewpoint displacement is within a field of view radius of 5 degrees to 20 degrees and is concentrated at the center of the field of view range, and an alert mode in which the viewpoint displacement is within a field of view radius of 4 degrees and is dispersed outside the center of the field of view range.
- the attention extraction device 1 identifies a "wide field of view mode" when the instructor's viewpoint displacement is characterized by a "narrow displacement range (concentrated)" and "few points of interest" as shown in FIG. 3(b) above, and identifies an "alert mode" when the instructor's viewpoint displacement is characterized by a "wide displacement range (dispersed)" and "many points of interest" as shown in FIG. 3(c) above.
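- the gaze mode identification described above can be sketched, for example, as follows. This is a minimal illustration only: the dispersion and fixation-count thresholds, the units (degrees of visual angle), and the function names are assumptions, not values prescribed by this embodiment.

```python
# Minimal sketch: classify a time series of gaze coordinates into the
# "wide field of view mode" (narrow, concentrated displacement, few points of
# interest) or the "alert mode" (wide, dispersed displacement, many points of
# interest). All thresholds below are hypothetical.
from math import hypot
from statistics import mean

def count_fixation_points(samples, cluster_radius=1.0):
    """Greedily cluster gaze samples (in degrees) into distinct points of interest."""
    points = []
    for x, y in samples:
        if not any(hypot(x - px, y - py) <= cluster_radius for px, py in points):
            points.append((x, y))
    return len(points)

def classify_gaze_mode(samples, spread_threshold=4.0, fixation_threshold=5):
    """Return 'wide_field_of_view' or 'alert' for a list of (x, y) gaze samples."""
    cx = mean(x for x, _ in samples)
    cy = mean(y for _, y in samples)
    spread = max(hypot(x - cx, y - cy) for x, y in samples)   # displacement range
    if spread <= spread_threshold and count_fixation_points(samples) <= fixation_threshold:
        return "wide_field_of_view"   # concentrated, few points of interest
    return "alert"                    # dispersed, many points of interest

# Example: samples clustered near the centre of the visual field
print(classify_gaze_mode([(0.2, 0.1), (0.3, -0.2), (-0.1, 0.0), (0.0, 0.3)]))
```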
- the attention extraction device 1 sets a gaze area based on, for example, a specified gaze mode, and extracts a gaze image of the work object that the instructor gazes at in the gaze area.
- the attention extraction device 1 links, for example, the extracted gaze image to the visual field range, viewpoint displacement, and gaze mode, and stores it in a database as attention information for the work.
- FIG. 11 shows an example of the attention extraction device screen 1a of the attention extraction device 1 in the attention extraction system 100.
- the screen in FIG. 11 is operated by, for example, an evaluator, and displays various information acquired by the instructor device 2.
- the attention extraction device 1 refers to a database and displays on the attention extraction device screen 1a a gaze monitoring area that includes at least a gaze display area 1b (eye gaze movement distance graph, gaze tracking, gaze area, gaze target, etc.) that displays coordinate information, gaze area, gaze image, and gaze shift in time series, a target setting area 1c (gaze target evaluation, recognizer generation, etc.) that displays setting information for setting the gaze image displayed in the gaze display area 1b, and a judgment setting area 1d (correspondence information setting, etc.) that displays judgment conditions for the setting information displayed in the target setting area 1c.
- the attention extraction device 1 accepts, for example, conditions and numerical values related to various settings and adjustments from the evaluator via a setting item menu displayed on the attention extraction device screen 1a.
- the attention extraction device 1 sets and adjusts the display and conditions in the gaze display area 1b, the target setting area 1c, and the judgment setting area 1d, for example, based on the various conditions and numerical values accepted.
- the attention extraction device 1 determines whether the task object included in the instructor's gaze image is correct or incorrect according to various conditions received from the evaluator, for example, via a setting item menu displayed on the attention extraction device screen 1a.
- the attention extraction device 1 may display the determination result on the instructor device 2 or the worker device 3.
- the attention extraction device 1 accepts input of corresponding information from the evaluator, for example, via a setting item menu displayed on the attention extraction device screen 1a. In addition to accepting input of new corresponding information, the attention extraction device 1 may also update or delete existing conditions, settings, corresponding information, etc., and store them in the database of the server 4.
- the attention extraction device 1 may, for example, link the inputted corresponding information with the gaze image, generate an attention dataset including setting conditions for recognizing it as a gaze target, and store the generated attention dataset in a database of the server 4.
- FIG. 4(a) is a schematic diagram showing an example of the configuration of the attention extraction device 1.
- as the attention extraction device 1, a single-board computer such as a Raspberry Pi (registered trademark) may be used, or a well-known electronic device such as a personal computer (PC) may be used.
- the attention extraction device 1 includes, for example, a housing 10, a CPU (Central Processing Unit) 101, a ROM (Read Only Memory) 102, a RAM (Random Access Memory) 103, a storage unit 104, and I/Fs 105 to 107. Each component 101 to 107 is connected by an internal bus 110.
- the CPU 101 controls the entire attention extraction device 1.
- the ROM 102 stores the operating code of the CPU 101.
- the RAM 103 is a working area used when the CPU 101 is operating.
- the storage unit 104 stores various information such as learning models and databases.
- as the storage unit 104, well-known data storage media such as an SD memory card, a hard disk drive (HDD), a solid state drive (SSD), etc. may be used.
- the I/F 105 is a known interface for transmitting and receiving various information with the instructor device 2, the worker device 3, the server 4, the communication network 5, etc., which are connected depending on the application. For example, multiple I/Fs 105 may be provided.
- the I/F 106 is a known interface for transmitting and receiving various types of information to and from an input section 108 that is connected depending on the application.
- a keyboard is used as the input section 108, and an administrator or the like who manages the attention extraction system 100 inputs or selects various types of information or control commands for the attention extraction device 1 via the input section 108.
- the I/F 107 is a known interface for transmitting and receiving various information to and from a display unit 109 that is connected depending on the application.
- the display unit 109 outputs various information stored in the storage unit 104, the processing status of the attention extraction device 1, and the like.
- a display is used as the display unit 109, and may be, for example, a touch panel type.
- the display unit 109 may be configured to include the input unit 108.
- a common I/F may be used for I/F 105 to I/F 107, or, for example, multiple I/Fs may be used for each of I/F 105 to I/F 107.
- at least one of the instructor device 2, worker device 3, server 4, communication network 5, input section 108, and display section 109 may be omitted depending on the situation.
- the storage unit 14 stores various databases, for example, in the storage unit 104.
- the storage unit 14 stores associations between previously acquired past evaluation target information and reference information linked to the past evaluation target information, and stores, for example, a learning model having associations.
- the storage unit 14 may be provided in, for example, the instructor device 2, the operator device 3, and the server 4.
- the storage unit 14 records the instructor's attention behavior (points of attention, viewpoint, etc.) when, for example, observing the instructor's work.
- the attention extraction system 100 may determine the gaze mode based on information such as the gaze image, field of view, and viewpoint displacement acquired via the instructor device 2.
- the attention extraction system 100 may determine, for example, a "wide field of view mode" when the instructor's viewpoint displacement is characterized by a "narrow displacement range (concentration)" and "few points of attention for the viewpoint," and may determine, for example, an "alert mode" when the viewpoint displacement is characterized by a "wide displacement range (dispersed)" and "many points of attention for the viewpoint," as shown in FIG. 3(c).
- the storage unit 14 stores various information related to the instructor's work acquired by gestures, voice, gaze, and other input devices, for example, through various input interfaces (not shown) provided in the instructor device 2.
- the attention extraction system 100 stores various information acquired, for example, through an input device, in the storage unit 14, by linking it with information related to the instructor and information related to the instructor's work.
- the attention extraction system 100 may record various information appropriately, for example, in response to recording instructions such as start recording and end recording given by the instructor via the instructor device 2.
- for example, the coordinate position of the viewpoint captured from the instructor via the instructor device 2, changes in the coordinate position, the speed of change, and the like are sequentially recorded.
- the position and orientation of the head in the work space coordinates may be recorded for each spatial layout of the place where each task is performed.
- the attention extraction system 100 may acquire various information recorded for each spatial layout using an imaging device such as a well-known 360° camera, surveillance camera, or network camera, and store this information in the storage unit 14 in association with information indicating the coordinate position of the instructor's head and changes in the coordinate position.
- the storage unit 14 may record various data recorded in association with the image data and the photographed data as an attention data set for the instructor's work.
- the attention data set is evaluated for attention points and know-how by an evaluator via the attention extraction device 1, for example, and is registered (recorded) in the storage unit 14 as the attention object recognizer and the corresponding information for monitoring the attention object.
- the attention object recognizer is generated using attention object video that has been evaluated in the attention extraction system 100, and may be, for example, a publicly known image processing program such as pattern matching, or an image recognition model.
- the image processing program or image recognition model may be installed in, for example, the worker device 3 that the worker wears when performing work. This allows the worker to evaluate attention information for work, for example, based on time-series image information and coordinate information acquired via the worker device 3 when performing work.
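- as one possible concrete form of such a recognizer, a pattern-matching sketch is shown below; it assumes OpenCV template matching against a registered gaze image, and the matching threshold is an illustrative assumption rather than a value defined in this embodiment.

```python
# Illustrative pattern-matching recognizer for a gaze (attention) target.
# Assumes OpenCV is available; not the specific implementation of this embodiment.
import cv2

class AttentionObjectRecognizer:
    def __init__(self, gaze_image_path, threshold=0.8):
        # Template built from the instructor's extracted gaze image.
        self.template = cv2.imread(gaze_image_path, cv2.IMREAD_GRAYSCALE)
        self.threshold = threshold  # hypothetical matching threshold

    def recognize(self, field_of_view_frame):
        """Return the best-matching location of the work object, or None if absent."""
        gray = cv2.cvtColor(field_of_view_frame, cv2.COLOR_BGR2GRAY)
        result = cv2.matchTemplate(gray, self.template, cv2.TM_CCOEFF_NORMED)
        _, max_val, _, max_loc = cv2.minMaxLoc(result)
        return max_loc if max_val >= self.threshold else None
```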
- the storage unit 14 may further include a database that stores, for example, correlations between past gaze image information acquired in advance and reference information indicating the correctness of the work object linked to the gaze image.
- the storage unit 14 may be referenced, for example, by the judgment unit 15 and used when judging the correctness of the work object in the instructor's work.
- the judgment unit 15 may acquire corresponding correspondence information according to the judgment result, for example.
- the display unit 16 may further output the corresponding correspondence information from the storage unit 14 according to the content judged by the judgment unit 15, for example.
- the storage unit 14 may register, for example, the gaze target recognizer and the corresponding information for monitoring the gaze target, which are acquired by the instructor device 2 and evaluated as attention know-how, as an attention data set for the work in the worker device 3.
- the storage unit 14 may store, for example, past evaluation target information and reference information acquired by the instructor device 2.
- the correlation is constructed by machine learning using multiple pieces of learning data, for example, with the past evaluation target information and the reference information as a set of learning data.
- a learning method for example, deep learning such as a convolutional neural network is used.
- association indicates the degree of connection between many-to-many information (multiple data sets contained in the past evaluation target information versus multiple data sets contained in the reference information).
- the correlation is updated appropriately during the machine learning process. That is, the correlation represents a function optimized based on, for example, past image data A, corresponding information A, and reference information. Therefore, an evaluation result for the evaluation target information is generated using the correlation constructed based on all past evaluation results of the evaluation target information. This makes it possible to generate optimal evaluation results even when the evaluation target information has a configuration state in which it is combined with other image data A, corresponding information A, etc.
- the evaluation target information stored in the storage unit 14 may be the same as or similar to past evaluation target information, or it may be dissimilar. This allows the attention extraction system 100 to quantitatively generate optimal evaluation results. Furthermore, the attention extraction system 100 can improve generalization capabilities when performing machine learning, and can improve evaluation accuracy for unknown evaluation target information.
- the association may have multiple association degrees that indicate the degree of connection between multiple data included in the past evaluation target information and multiple data included in the reference information.
- the association degrees can correspond to weight variables.
- the past evaluation target information indicates information of the same type as the evaluation target information described above.
- the past evaluation target information includes, for example, multiple pieces of evaluation target information obtained when evaluating image data A in the past.
- the reference information is linked to past evaluation target information and indicates information related to image data A and corresponding information A.
- the reference information indicates, for example, the scope of work for image data A, work instructions, work procedures, details of subsequent processes and related work, details of shared work or alternative work of other workers working in the same area, and evaluations based on various work relationships and mutual influences (for example, "Confirm," "Adjust," "Pay attention (WATCH!)," "Judge," "Respond," etc.), and may also include various information related to the progress of the work process, checklists, images of interest, priority of displayed information, combinations of display and non-display, numerical values, data, distribution, etc.
- the specific content included in the reference information can be set arbitrarily.
- the correlation may indicate the degree of connection between the past evaluation target information and the reference information, for example as shown in FIG. 5.
- the association has multiple degrees of association that link multiple data included in the past evaluation target information with multiple data included in the reference information.
- the degree of association is shown in three or more levels, such as a percentage, a 10-point scale, or a 5-point scale, and is shown, for example, by the characteristics of the line (e.g., thickness, etc.).
- "image data A" included in the past evaluation target information shows an association degree AA of "80%" with "reference A" included in the reference information, and an association degree AB of "55%" with "reference B" included in the reference information.
- the "association degree” indicates the degree of connection between each piece of data, and for example, the higher the association degree, the stronger the connection between each piece of data. Note that when constructing associations using the above-mentioned machine learning, the association may be set to have three or more levels of association.
- the past evaluation target information may be divided into past gaze images of the work target and past corresponding information of the work target, as shown in FIG. 6, for example, and stored in the database.
- the degree of association is calculated based on the relationship between the combination of image data of the past gaze images and past corresponding information, and the reference information.
- the past evaluation target information may be divided into past corresponding information, for example, and stored in the database.
- the combination of "image data a" contained in the past gaze image and "correspondence information a" contained in the past correspondence information indicates a correlation degree AAA of "85%” with “reference A” and a correlation degree ABA of "25%” with “reference B.”
- the past target data and the past correspondence information can be stored independently. This makes it possible to improve accuracy and expand the range of options when generating evaluation results.
- the past evaluation target information may include, for example, synthetic data and a similarity.
- the synthetic data is indicated by three or more levels of similarity between the past target data or the past corresponding information.
- the synthetic data is stored in the database in the form of a numerical value, a matrix, a histogram, or the like, and may also be stored in the form of, for example, an image or a character string.
- in the database, various information related to the instructor, evaluator, worker, etc., who are users of the attention extraction system 100, as well as information related to the content and process of various tasks, and multiple data sets used by workers, are stored as attention data sets.
- the attention dataset stores at least, for example, a "task information table,” a "task procedure table,” a "correspondence information table,” a "teaching task record table,” a "viewpoint record table,” a “viewpoint record data table,” and a "gaze target table.”
- the "task information table” stores (contains) data for identifying a task to be performed, for example.
- the "task information table” stores, for example, a "task ID” and a "task name” for identifying a task to be performed by an instructor or worker.
- the "work procedure table” stores (contains) data on the procedures of work to be performed by, for example, an instructor or a worker.
- the "work procedure table” stores, for example, a "work procedure ID” that identifies the procedure of the work, and a "work ID,””work procedure name,” and “work sequence” that are linked to the "work procedure ID” and correspond to each other.
- the "work procedure table” is linked to the "work information table” by, for example, the "work ID.”
- the "corresponding information table” stores (contains) data for identifying information corresponding to the work displayed by the display means, for example, when a worker performs work, after the judgment means judges whether the work object is correct or incorrect based on the image information acquired by the acquisition means and the coordinate information of the viewpoint.
- the "Response Information Table” stores, for example, a "Response Information ID” for identifying information to respond to a task, a "Work Procedure ID,” "Advance” for indicating an action by nudging or the presentation of advance confirmation information, and a "Display Type” for indicating a display type such as an "Interrupt Type” that is displayed depending on the work situation, including oversights.
- as the "display/judgment start condition," for example, a trigger for the start timing, a trigger indicating the display time, or a trigger indicating the end of display, such as an animation effect timing setting, is set. When the display type is, for example, "interrupt," information indicating the judgment timing for the interrupt is stored as the "display/judgment start condition."
- the judgment start condition may, for example, specify a trigger for interrupt judgment.
- the "Corresponding Information Table” stores the "Determination Criteria” that indicate the criteria or thresholds that the instructor uses to determine whether or not something has been seen when the "interrupt determination” process is started, and the display content that is displayed as corresponding information as the "Corresponding Information Content.”
- the "corresponding information content” may be stored with corresponding information created at the work site, for example, by specifying the file name, or by using Markdown or the like.
- the "corresponding information content” may be generated by storing, for example, a one-off instruction such as contacting a superior or a veteran, and inputting the instruction via an editing screen.
- the "Corresponding Information Table” also stores "Work Result Record Display Conditions,” which indicate conditions such as the timing for displaying content that records the results of each work step, such as work reports and checklists, and "Work Result Record Content,” which indicates content for recording work results, such as work reports and checklists.
- the attention extraction system 100 may refer to the "work result record display conditions" and determine by image comparison that the gaze target has reached an appropriate state, for example, based on a signal from a gesture or voice input.
- the attention extraction system 100 may store various information and data in the "work result record display conditions" as conditions for storing the determination results, for example.
- the attention extraction system 100 may require, for example, the input of the results of the content in the "work result recording content.”
- the attention extraction system 100 may, for example, determine this result input and determine that one work procedure has been completed based on the result of the determination.
- the "correspondence information table” is linked to the "work procedure table” by, for example, the "work procedure ID.”
- <<Teaching work record table>> In the "teaching work record table," various information related to the work performed by the instructor is stored. For example, a "teaching work record ID" for identifying the recorded teaching work, a "teaching work date and time" indicating the date and time when the teaching work was performed, a "teaching worker" indicating the instructor who performed the teaching work, a "teaching work location" indicating the place or area where the teaching work was performed, and a "work ID" are stored in association with each other.
- the "teaching work record table" is associated with the "work information table" by, for example, the "work ID".
- the "viewpoint record table” stores (contains) information and data related to the viewpoint from which the instructor performs work, acquired by, for example, the acquisition unit 11.
- the "viewpoint record table” stores, as a “viewpoint record data file", a "viewpoint record ID” that identifies the viewpoint record of the work performed by the instructor and data on the coordinate information of the viewpoint obtained by eye tracking.
- the viewpoint record table may store stream data acquired in milliseconds. This allows the attention extraction system 100 to store only the actual file name of data such as JavaScript (registered trademark) Object Notation (JSON) data, without directly storing the data in a database.
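- the stream data kept outside the database might look, for example, like the following sketch, in which millisecond gaze samples are written to a JSON file and only the file name would be registered in the "viewpoint record table"; all field names are illustrative assumptions.

```python
# Hypothetical millisecond viewpoint stream saved as a JSON file; only the file
# name ("viewpoint_record_0001.json") would be stored in the viewpoint record table.
import json

viewpoint_stream = [
    {"elapsed_ms": 0,  "x": 0.12, "y": -0.05},
    {"elapsed_ms": 16, "x": 0.13, "y": -0.04},
    {"elapsed_ms": 33, "x": 0.15, "y": -0.02},
]

with open("viewpoint_record_0001.json", "w", encoding="utf-8") as f:
    json.dump(viewpoint_stream, f, ensure_ascii=False, indent=2)
```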
- the "viewpoint record table” also stores a "viewpoint range record video file” that indicates data on video information within the instructor's field of view, as recorded by an outside camera, for example.
- the "viewpoint record table” also stores a "teaching work record ID” that identifies the history of teaching work by the instructor.
- the "viewpoint record table” is linked to the “teaching work record table” by, for example, the “teaching work record ID.”
- in the "viewpoint record data table," for example, detailed information on the coordinate information of the viewpoint acquired by the acquisition unit 11 when the instructor performed the work is stored.
- a "viewpoint identification data file” for identifying a file storing the viewpoint recording data and a "viewpoint recording elapsed time” indicating the time of the viewpoint identification recorded by the instructor are stored.
- the "viewpoint recording elapsed information” is stored (held) as a record of the elapsed time of recording the viewpoint, with the start of recording the instructor's work being set as 00:00:00.
- the "viewpoint recording elapsed information” may be recorded, for example, at unit time intervals (for example, 1 second intervals) together with various position information.
- the "viewpoint recording data table” is linked to the “teaching work record table” by, for example, the "viewpoint identification data file”.
- the “viewpoint recording data table” also stores the position information of the teaching operator in the work space as “teaching operator position information (X, Y, Z).”
- the “teaching operator position information (X, Y, Z)” may, for example, record the position of the HMD worn by the teaching operator, i.e., the three-dimensional coordinates of the head position information, and may also store information such as "teaching operator orientation information (rad),” “teaching operator eye position information (x, y, z),” and “teaching operator eye angle information (rad)” acquired by the acquisition unit 11.
- the “viewpoint recording data table” is linked to the “teaching work record table” by, for example, a "viewpoint recording data file.”
- in the "gaze target table," various information related to the object gazed upon by the instructor is stored, including a plurality of pieces of information set by the evaluator. For example, a "gaze target ID" for identifying the object gazed upon by the instructor, a "visual field range recording video file", a "viewpoint recording data file", a "viewpoint recording elapsed time", and a "work procedure ID" are stored.
- the "gaze target table” stores a "viewpoint range image at time of gaze” as a still image extracted from the "viewpoint range recording video file” at a moment during the gaze recording elapsed time.
- the "gaze target position (x, y)” indicating the position of the gaze target extracted by the extraction unit 13
- the "gaze target range (w, h)” indicating the range
- the "gaze target image” indicating the image of the gaze target may be stored as attention information for the work, based on the gaze mode identified by the identification unit 12 based on the video information and coordinate information associated in the "gaze target recording table".
- the attention information may be generated by the identification unit 12 and the extraction unit 13 based on information acquired by the acquisition unit 11, for example, and then evaluated, set, and recorded by an evaluator.
- the "gaze target table” is linked, for example, to the "viewpoint recording table” by the “gaze target ID,” the "work procedure table” by the “work procedure ID,” and the "viewpoint recording data table” by the “viewpoint recording data file.”
- FIG. 4(b) is a schematic diagram showing an example of the functions of the attention extraction device 1.
- the attention extraction device 1 includes, for example, an acquisition unit 11, an identification unit 12, an extraction unit 13, a storage unit 14, a determination unit 15, a display unit 16, an input unit 17, and a monitoring display unit 18.
- Each function shown in FIG. 4(b) is realized by the CPU 101 using the RAM 103 as a working area to execute a program stored in the storage unit 104 or the like.
- the acquisition unit 11 acquires image information in the visual field of an instructor performing a task and coordinate information indicating a viewpoint at which the instructor gazes within the visual field in a time series manner, in association with the task. For example, the information is used when performing an acquisition step S110 described later.
- the timing at which the acquisition unit 11 acquires the image information and the coordinate information from the instructor device 2 can be set arbitrarily.
- the image information and coordinate information acquired by the acquisition unit 11 are, for example via the storage unit 14, linked to the work performed by the instructor and stored in the storage unit 104 in chronological order.
- the identification unit 12 obtains the viewpoint displacement of the instructor in the visual field range based on the image information and the coordinate information acquired by the acquisition unit 11, and identifies the gaze mode of the instructor.
- the identification unit 12 is used when performing step S120. For example, as shown in FIG. 3, the identification unit 12 identifies the gaze mode within the instructor's visual field range based on the image information in the visual field range 2a acquired by the instructor device 2 and the coordinate information (x-axis, y-axis) of the viewpoint displacement.
- the identification unit 12 may also identify a gaze mode other than the existing modes as a new mode.
- the identification unit 12 can arbitrarily set various modes using the attention extraction device 1. For example, via the storage unit 14, the identification unit 12 changes the acquired existing mode, and also links the new mode to the instructor and the work performed by the instructor and stores it in the storage unit 104 in chronological order. Note that the identification of the instructor's various modes by the identification unit 12 may be performed by, for example, an evaluator setting the various modes based on information acquired by the acquisition unit 11 and stored in a database.
- <<Extraction unit 13 (extraction means)>> The extraction unit 13 sets a gaze area based on the gaze mode identified by the identification unit 12, and extracts a gaze image of the work target 6 gazed at by the instructor in the gaze area.
- gaze images of the work object 6 included in the set gaze area may be identified based on each gaze mode and extracted in chronological order.
- the extraction unit 13 extracts gaze images of the work target 6 included in the set gaze area based on, for example, a gaze mode, and if there is an order to the instructor's work or confirmation, extracts them in chronological order with emphasis on the order, or if there is no order, may link multiple gaze images to one work and extract them.
- the extraction unit 13 may output the extraction result to the storage unit 14, the attention extraction device 1, or the like, for example, via the communication network 5.
- the extraction unit 13 stores the extracted gaze image in the storage unit 104, for example, via the storage unit 14, by linking it with the instructor, the work performed by the instructor, the gaze mode, and the like. Note that the linking of the gaze image and the gaze mode by the extraction unit 13 may be set by the evaluator based on various information stored in the storage unit 104.
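- extracting the gaze image can be sketched, for example, as a simple crop of the field-of-view frame around the gaze point, with a gaze-area size chosen per gaze mode; the sizes below and the assumption that frames are NumPy/OpenCV arrays are illustrative only.

```python
# Minimal sketch: crop the gaze image of the work object around the gaze point.
# Gaze-area sizes per mode are hypothetical; frames are assumed to be NumPy arrays.
GAZE_AREA_SIZE = {"wide_field_of_view": 200, "alert": 80}   # pixels, illustrative

def extract_gaze_image(frame, gaze_x, gaze_y, gaze_mode):
    """Return the cropped gaze image centred on (gaze_x, gaze_y)."""
    half = GAZE_AREA_SIZE[gaze_mode] // 2
    height, width = frame.shape[:2]
    top, left = max(0, gaze_y - half), max(0, gaze_x - half)
    bottom, right = min(height, gaze_y + half), min(width, gaze_x + half)
    return frame[top:bottom, left:right]
```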
- <<Storage unit 14 (storage means, database)>>
- the storage unit 14 associates the gaze image extracted by the extraction unit 13 with the visual field range, the viewpoint displacement, and the gaze mode, and stores the image in a database as attention information for the instructor's work.
- the storage unit 14 stores various information in the storage unit 104, or retrieves various information from the storage unit 104.
- the storage unit 14 stores multiple correlations between previously acquired past gaze image information and reference information indicating the correctness or incorrectness of the work object linked to the gaze image.
- the storage unit 14 stores or retrieves various information depending on the processing contents of, for example, the acquisition unit 11, the identification unit 12, the extraction unit 13, the determination unit 15, the display unit 16, the input unit 17, and the monitoring display unit 18.
- the determination unit 15 determines whether a task object included in a gaze image of a task acquired via the worker device 3 of the worker is correct or incorrect by using a task-related attention dataset.
- The determination unit 15 refers to the database stored in the storage unit 14 or the storage unit 104 and, based on the image information acquired via the worker's worker device 3, determines whether the work performed on the worker's gaze target (for example, the work object 6) is correct or incorrect.
- the determination unit 15 acquires various types of corresponding information from a database, for example, depending on the result of the judgment.
- the determination unit 15 acquires information related to the work that the worker should respond to, such as the target information displayed in the visual field 3a of the worker device 3 in FIG. 2 described above ("(1) Confirmation OK; 01234", "(2) Adjustment Procedure: XXX").
- the determination unit 15 transmits the acquired corresponding information to the display unit 16 of the worker device 3, for example.
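- The judgment of a worker's gaze target against the attention data set could, for example, be sketched as below. The similarity function, threshold, and record fields are hypothetical; the disclosure does not prescribe a particular matching method.

```python
from typing import Callable, List, Optional

def judge_gaze_target(worker_gaze_image: bytes,
                      attention_dataset: List[dict],
                      similarity: Callable[[bytes, bytes], float],
                      threshold: float = 0.8) -> dict:
    """Compare a worker's gaze image against the entries of an attention data
    set and return a correctness judgment plus any corresponding information."""
    best: Optional[dict] = None
    best_score = 0.0
    for entry in attention_dataset:
        score = similarity(worker_gaze_image, entry["gaze_image"])
        if score > best_score:
            best, best_score = entry, score
    if best is None or best_score < threshold:
        # No sufficiently similar gaze target: treat the work object as not confirmed.
        return {"correct": False, "corresponding_info": None, "score": best_score}
    return {
        "correct": bool(best.get("reference_correct", True)),
        # e.g. "(1) Confirmation OK: 01234" or "(2) Adjustment procedure: XXX"
        "corresponding_info": best.get("corresponding_info"),
        "score": best_score,
    }
```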
- the judgment unit 15 may collectively judge the correctness of the task objects of the multiple workers.
- the multiple workers set a common task attention data set in advance in each worker device 3.
- From the coordinate information acquired by the acquisition unit 11, the determination unit 15 may, for example, acquire multiple pieces of coordinate information for each task workspace layout based on the relationship between the coordinate information indicating the position and orientation of each worker's head and the coordinate information indicating the task workspace, and judge the tasks and task objects of the workers collectively or group by group.
- The determination unit 15 acquires status information such as whether the original worker is facing the correct direction toward the work object that the worker should be working on and, if not (that is, the worker is unaware), whether there are other workers who can provide support and whether any of them are facing that work object. Based on the acquired status information and the work process information of each worker, the determination unit 15 may, for example, transmit corresponding information to the worker device 3 of another worker working nearby who can handle the work object appropriately even though the original worker is not looking at it.
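- The multi-worker handling described above could be sketched as follows: each worker's head orientation is checked against a given work object, and a supporting worker is selected when the original worker is not facing it. The geometry, coordinates, and tolerance are simplified assumptions.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class WorkerPose:
    worker_id: str
    x: float             # head position in workspace coordinates (m)
    y: float
    heading_deg: float   # head orientation, 0 deg = +x axis

def facing(pose: WorkerPose, target: Tuple[float, float],
           tolerance_deg: float = 30.0) -> bool:
    """True if the worker's head is oriented toward the target within a tolerance."""
    bearing = math.degrees(math.atan2(target[1] - pose.y, target[0] - pose.x))
    diff = (bearing - pose.heading_deg + 180.0) % 360.0 - 180.0
    return abs(diff) <= tolerance_deg

def find_supporting_worker(poses: List[WorkerPose], original_id: str,
                           work_object: Tuple[float, float]) -> Optional[str]:
    """If the original worker is not facing the work object, return another
    worker who is facing it and could receive the corresponding information."""
    by_id = {p.worker_id: p for p in poses}
    if original_id in by_id and facing(by_id[original_id], work_object):
        return None  # the original worker is already aware of the work object
    for p in poses:
        if p.worker_id != original_id and facing(p, work_object):
            return p.worker_id
    return None
```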
- «Display unit 16 (display means)»
- the display unit 16 outputs various kinds of correspondence information acquired by the determination unit 15, for example, to the worker device 3 of the worker performing the corresponding work by using the attention data set generated based on the work of the instructor.
- The display unit 16 (the display portion 109 of the worker device 3) displays, for example, the various types of corresponding information transmitted from the determination unit 15 in the visual field range 3a of the worker device 3 worn by the worker, superimposing the corresponding information on the actual image.
- The display unit 16 appropriately displays the corresponding information obtained from the determination unit 15 according to the work situation and work position of each worker.
- the display unit 16 also displays on the instructor device 2 a display that extracts attention information for the task, as shown in FIG. 9, for example.
- FIG. 9 is a schematic diagram showing an example of an attention extraction method in this embodiment, for example.
- Before the acquisition unit 11 acquires video information, the display unit 16 displays the recording date and time information, recording position information, and recording control information in the center of the field of view 2a, as shown in FIG. 9(a). While the acquisition unit 11 is acquiring video information, the display unit 16 may switch to displaying only the recording control information (for example, "recording paused," "recording ended," etc.) in a corner of the field of view 2a.
- The display unit 16 may also display indications for each operation in the center of the visual field 2a of the instructor device 2, as shown in, for example, Figures 9(c) to (e). These indications by the display unit 16 may be based on, for example, the work process planned by the instructor, the results of the work performed by the instructor, and the timing of the start or end of the work detected by a position detection sensor provided in advance in the instructor device 2.
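- A minimal sketch of the display switching described for FIG. 9 is shown below: full recording information in the center of the visual field before acquisition, and only the recording control information in a corner during acquisition. The layout keys and strings are assumptions for illustration.

```python
def recording_overlay(acquiring: bool, rec_datetime: str,
                      rec_position: str, rec_control: str) -> dict:
    """Overlay layout for the instructor device: full recording information in
    the center before acquisition, only the control information in a corner
    while acquiring (dictionary keys are illustrative)."""
    if not acquiring:
        return {"position": "center",
                "items": [rec_datetime, rec_position, rec_control]}
    return {"position": "corner", "items": [rec_control]}

# Before starting: everything shown in the center of field of view 2a.
print(recording_overlay(False, "2024-04-19 10:00", "Line 3 / Station A", "REC ready"))
# While recording: only control information such as "recording paused".
print(recording_overlay(True, "2024-04-19 10:00", "Line 3 / Station A", "recording paused"))
```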
- the display unit 16 displays, for example, displays such as those shown in Figs. 10(a) and (b) and Figs. 12(a) to (e) on the worker device 3.
- Figs. 10(a) and (b) are schematic diagrams showing an example of attention extraction displayed when a worker performs work via the worker device 3, for example, using the above-mentioned attention data set in this embodiment
- Figs. 12(a) to (e) are schematic diagrams showing an example of a display on the worker device 3 (worker terminal, smartphone, display device, etc.) held by a worker in this embodiment.
- The display unit 16 displays the result of the judgment by the determination unit 15 on, for example, the gaze target 6a included in the visual field 3a of the worker device 3 worn by the worker, or on a display device held by the worker. Furthermore, the display unit 16 displays the corresponding information that the determination unit 15 obtains by referring to the database, depending on the result of the judgment.
- FIG. 10(a) is a display of the visual field 3a when, for example, a worker is working and there is no corresponding information for the worker.
- FIG. 10(b) is a display of the visual field 3a when, for example, a worker is working and there is corresponding information for the worker, with the corresponding information displayed in the visual field 3a including the points that should be focused on (correct target for gaze in the visual field 3a: circle + "WATCH!"), procedures and related information that the worker should confirm or refer to (center of the visual field 3a: "Check-3-A", "Check", "Judgement", "Response", "Reference", etc.), and the work status and check results of the worker or co-workers (left side of the visual field 3a: work location, work content, work progress, etc.).
- the display unit 16 may further include an acquisition display area that displays, for example, at the beginning of work by the worker, an instruction to select the type of acquisition unit 11 that acquires the video information, the type of attention data set stored in the database, and the start of work, and an attention information display area that switches between displaying corresponding information corresponding to the worker's gaze image and attention information based on the attention data set selected by the worker.
- The display unit 16 may cause the attention information displayed in the visual field 3a of the worker device 3 to include nudge information displayed at least at one of the timings of the start, middle, or end of the work, depending on the progress of the worker's work.
- The display unit 16 refers to the database based on, for example, the attention data set and the result of the judgment by the determination unit 15, acquires response information, attention information, nudge information, and the like, and distributes and displays the acquired information depending on the gaze state of the worker performing the work.
- The nudge information displayed by the display unit 16 may take, for example, various display approaches toward the worker, such as a default (encouraging unconsciously), a mechanism (encouraging naturally), a nudge that makes the worker aware of it but leaves the choice to the worker, labeling (encouraging intentionally), and an incentive (encouraging with rewards).
- the display of nudge information may be linked, for example, to the work process or work progress, the worker's personality, temperament, evaluation, difficulty of the work, expected completion time, etc., and may be extracted from a database and displayed as appropriate according to the worker's actual work progress.
- the display unit 16 may display the nudge information in a random pattern, such as asking a question, such as "Are you OK?" or "Wait a minute?", in addition to directly displaying the reference information raised as a display candidate as a result of the judgment by the judgment unit 15.
- the display unit 16 may also display the nudge information in the center of the visual field 3a of the worker device 3, or at a location of the gaze target 6a within the visual field 3a, at random.
- The display unit 16 may also appropriately set and display, depending on the worker's skills, the progress of the work, and the like, a mixture of, or random changes in, display output methods (visual, auditory, haptic (vibration)), nudge information approaches (such as labeling), and nudge information techniques (talking, presenting succinctly, making the viewer wonder, changing drastically).
- the display unit 16 may also display, for example, heuristics, normalcy (maintaining the status quo), and variations in the nudge information to prevent overlooking of reference information due to familiarity, boredom, or an expectation to maintain the status quo as related biases for the nudge information to be displayed.
- the display unit 16 may also include, for example, a means (not shown) for grasping the overall flow and procedure of the work being performed by the worker as a prerequisite process for displaying nudge information, and may further include a means (not shown) for detecting deviations from appropriate attention.
- the display unit 16 may refer to the database and display words of appreciation to the worker, and the type of nudge information and timing of display are optional. This makes it possible to vary the corresponding information displayed to the worker, for example, and prevents the worker from overlooking reference information due to familiarity, boredom, or an expectation of maintaining the status quo.
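- As an illustrative sketch only, the selection of nudge information could combine a timing (start, middle, or end of work), an approach, a technique, and a placement, varied randomly to counter familiarity and boredom. The category names follow the description above; the selection rules, wording, and skill threshold are assumptions.

```python
import random

NUDGE_APPROACHES = ["default", "mechanism", "labeling", "incentive"]
NUDGE_TECHNIQUES = ["ask_question", "present_succinctly", "make_wonder", "change_drastically"]

def select_nudge(progress: float, judged_correct: bool, skill_level: int,
                 rng: random.Random) -> dict:
    """Pick a nudge display for the worker device. Timing follows the work
    progress; approach, technique, and wording are varied randomly so that
    reference information is not overlooked through familiarity or boredom."""
    timing = "start" if progress < 0.1 else "end" if progress > 0.9 else "middle"
    if judged_correct:
        text = rng.choice(["OK, keep going", "Thank you for the careful check"])
    else:
        text = rng.choice(["Are you OK?", "Wait a minute?", "Check the gaze target again"])
    return {
        "timing": timing,
        "approach": rng.choice(NUDGE_APPROACHES),
        "technique": rng.choice(NUDGE_TECHNIQUES),
        # e.g. show centrally for less skilled workers, near the gaze target 6a otherwise
        "placement": "center" if skill_level < 3 else "gaze_target",
        "text": text,
    }

print(select_nudge(progress=0.5, judged_correct=False, skill_level=2,
                   rng=random.Random(0)))
```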
- the display unit 16 further displays, for example, a display such as that shown in FIG. 11 on the attention extraction device 1.
- FIG. 11 displays, for example, the gaze monitoring area of the attention extraction device 1 in this embodiment as the attention extraction device screen 1a.
- the display unit 16 displays, for example, the coordinate information, gaze area, gaze image, and gaze shift in chronological order in the gaze display area 1b when an evaluator refers to a database via the attention extraction device 1 for the work performed by an instructor.
- the display unit 16 displays in the target setting area 1c setting information for setting the gaze image displayed in the gaze display area 1b. Furthermore, the display unit 16 displays, for example, in the judgment setting area 1d, judgment conditions for the setting information displayed in the target setting area 1c.
- The input unit 17 inputs, for example, the corresponding information output by the determination unit 15.
- The input unit 17 accepts, for example via the attention extraction device screen 1a displayed on the attention extraction device 1, conditions and numerical values relating to various settings and adjustments from the evaluator, as well as input through the setting item menus displayed on the attention extraction device screen 1a.
- the acquisition unit 11, the identification unit 12, the extraction unit 13, the storage unit 14, the determination unit 15, the display unit 16, and the monitoring display unit 18 perform various processes based on the various conditions and numerical values accepted by the input unit 17.
- the corresponding information input by the input unit 17 is linked to the gaze image and stored in the database of the server 4 as an attention data set that includes the setting conditions for recognizing it as a gaze target.
- the monitor display unit 18 displays, for example, various pieces of information that an evaluator refers to and sets while an instructor is working on a task, as a gaze monitoring area on the attention extraction device screen 1a.
- The monitoring display unit 18 displays, for example, a gaze display area 1b that displays, in time series, the coordinate information, gaze area, gaze image, and gaze shift stored in the database; a target setting area 1c that displays setting information for setting the gaze image displayed in the gaze display area; and a judgment setting area 1d that displays the judgment conditions for the displayed setting information.
- the attention extraction device 1 acquires video information related to work, for example, via the instructor's instructor device 2.
- the work performed by the instructor may be preselected for each of various attention datasets that are set according to the work of the worker, for example, and a gaze dataset for the work performed by the instructor may be generated using the attention dataset corresponding to the selected work.
- the attention data set extracted by the attention extraction device 1 is selected by workers A to C, for example, and the work of workers A to C is evaluated using the selected attention data set.
- the evaluator may monitor the work status of workers A to C who are performing work using, for example, the attention data set via the attention extraction device 1.
- The attention extraction device 1 may, for example, acquire the work status of workers A to C together with their work using the attention data set, acquire images and position information of the work target from the worker device 3 worn by each worker, and display them on the display unit 16 of the attention extraction device 1.
- The evaluator may refer to the work status of workers A to C displayed on the attention extraction device screen 1a of the attention extraction device 1. If a task or evaluation arises that is not covered by the attention data set, for example when worker C is not performing the correct task (skipping the task order and standing in front of another task) and worker B is in a position from which the work that worker C is actually doing can be confirmed, the evaluator may "ignore" the task corresponding to the gaze information that worker C should be focusing on, and instruct worker B in real time to "perform" the task corresponding to the gaze information that should be focused on.
- the instructor device 2 is worn by the instructor of the work, as well as by, for example, an expert or a qualified person, and acquires video information seen by the instructor's eyes through the instructor's work.
- The instructor device 2 may be, for example, a device using known eye-tracking technology, a head-mounted display, or smart glasses, and may acquire voice and surrounding sound information, temperature and humidity, position information, spatial information, and the like in addition to the video information.
- the instructor device 2 may be connected to, for example, the attention extraction device 1, other instructor devices 2, the worker device 3, and the server 4 in a state capable of data communication, or may have, for example, the attention extraction device 1 built in.
- the worker device 3 is worn by the worker performing the work, or by a person other than the instructor, and acquires image information seen by the worker's eyes through the worker's work.
- The worker device 3 may be, for example, a device using known eye-tracking technology, a head-mounted display, or smart glasses, and may acquire voice and surrounding sound information, temperature and humidity, position information, spatial information, and the like in addition to the image information.
- the worker device 3 is connected to, for example, the attention extraction device 1, the instructor device 2, other worker devices 3, and the server 4 in a state capable of data communication, and may also have, for example, the attention extraction device 1 built in.
- the communication network 5 indicates, for example, an Internet network to which the attention extraction device 1, the instructor device 2, and the worker device 3 are connected via communication circuits, and may be configured as an optical fiber communication network.
- the communication network 5 can be realized by a known communication network such as a wired communication network or a wireless communication network.
- the learning model generates a database by, for example, machine learning.
- the learning model acquires, as a pair of learning data, a learning target image including a gaze image captured by the instructor device 2 and reference information indicating the correctness of the gaze image captured by the instructor device 2, in multiple pairs for each task in the instructor device 2.
- the learning model is generated by machine learning using the multiple learning data, from a database in which associations between multiple learning target images and multiple pieces of reference information are stored.
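- A minimal sketch of such a learning model is shown below, pairing learning target images with reference information indicating correctness and fitting a simple classifier. The disclosure does not specify the learning algorithm; scikit-learn logistic regression and the image size are used here purely as placeholders.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def train_attention_model(images: np.ndarray, reference: np.ndarray) -> LogisticRegression:
    """Fit a simple classifier from pairs of learning target images (flattened
    pixel features) and reference information (1 = correct gaze target,
    0 = incorrect)."""
    X = images.reshape(len(images), -1).astype(np.float32) / 255.0
    model = LogisticRegression(max_iter=1000)
    model.fit(X, reference)
    return model

# Hypothetical usage: 32x32 grayscale gaze-image crops labeled by the evaluator.
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(20, 32, 32))
labels = rng.integers(0, 2, size=20)
model = train_attention_model(images, labels)
print(model.predict(images.reshape(20, -1).astype(np.float32) / 255.0)[:5])
```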
- FIG. 8 is a schematic diagram showing an example of an attention extraction method in this embodiment.
- the attention extraction method includes an acquisition step S110, a specification step S120, an extraction step S130, a storage step S140, and a determination step S150.
- the attention extraction method can be implemented using the attention extraction system 100.
- «Acquisition step S110» Image information and coordinate information are acquired in chronological order by linking them with the work.
- the information is acquired using an instructor device 2 equipped with a known camera or imaging device.
- the image information and coordinate information may be acquired in chronological order by linking them with the work from an operator device 3.
- the image information is acquired by a device selected by the instructor and the operator.
- the identification step S120 obtains the viewpoint displacement of the instructor and identifies the instructor's gaze mode.
- the identification step S120 obtains the viewpoint displacement in the instructor's visual field range based on the image information and the coordinate information acquired in the acquisition step S110, for example, and identifies the instructor's gaze mode.
- For example, if the instructor's viewpoint displacement is within the instructor's field of view radius of 5 degrees to 20 degrees and is characterized by being concentrated at the center of the field of view range, it is identified as "wide field of view mode."
- If the instructor's viewpoint displacement is within the instructor's field of view radius of 4 degrees and is characterized by being dispersed outside the center of the field of view range, it is identified as "alert mode."
- the identification step S120 identifies the gaze modes in the instructor's visual field range as "wide field mode” and "alert mode” based on, for example, image information in the visual field range 2a acquired by the instructor device 2 and coordinate information (x-axis, y-axis) of the instructor's viewpoint displacement.
- the identification step S120 identifies the gaze modes as "wide field mode” and "alert mode” based on, for example, coordinate information of the instructor's viewpoint displacement, but if other characteristics or trends can be confirmed, the confirmed gaze modes may be set as new ones.
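- The identification step S120 could be sketched as below, classifying the gaze mode from the time-series viewpoint coordinates. The wide field and alert characteristics follow the description above; the concrete dispersion thresholds and the fallback to a new mode candidate are assumptions.

```python
import math
from typing import List, Tuple

def identify_gaze_mode(points: List[Tuple[float, float]],
                       center: Tuple[float, float] = (0.0, 0.0)) -> str:
    """Classify the gaze mode from time-series viewpoint coordinates (degrees).
    Displacement concentrated around the center of the visual field range
    within roughly a 5-20 degree radius is treated as wide field mode;
    displacement dispersed outside the center as alert mode. The dispersion
    thresholds used here are assumptions."""
    if not points:
        return "unknown"
    radii = [math.hypot(x - center[0], y - center[1]) for x, y in points]
    mean_r = sum(radii) / len(radii)
    std_r = (sum((r - mean_r) ** 2 for r in radii) / len(radii)) ** 0.5
    if max(radii) <= 20.0 and std_r < 2.0:
        return "wide_field"            # concentrated near the center
    if mean_r > 4.0 and std_r >= 2.0:
        return "alert"                 # dispersed outside the center
    return "new_mode_candidate"        # other trends may be set as a new gaze mode
```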
- In the extraction step S130, a gaze area is set, and a gaze image of the work object gazed at by the instructor is extracted.
- a gaze area gazed by the instructor is set based on the gaze mode (for example, "wide field mode” or "alert mode") specified in the specification step S120.
- In the extraction step S130, for example, a gaze image of the work object gazed at by the instructor is extracted from the instructor's video information within the set gaze area.
- In the extraction step S130, for example, if there are multiple gaze images in the video information of the instructor's gaze area, they may be extracted in order of gaze or by duration of gaze.
- the instructor's gaze image of the work object is extracted from the video information in the gaze area. This makes it possible to grasp the instructor's gaze state and extract appropriate attention information.
- «Storage step S140» The gaze image is linked to the field of view, the viewpoint displacement, and the gaze mode, and stored as attention information.
- the gaze image extracted in the extraction step S130 is linked to the field of view, the viewpoint displacement, and the gaze mode, and stored in the database as attention information for the work, as described above.
- «Determination step S150» The correctness of the task object included in the gaze image acquired, for example, via the worker device 3 is determined using the attention data set.
- the correctness of the task object is determined, for example, based on the gaze image acquired by the worker device 3 in the acquisition step S110.
- the determination step S150 may be performed based on evaluation by an evaluator monitoring the task, for example, via the attention extraction device 1.
- the judgment step S150 may be configured to judge whether the work object being worked on by the worker is correct or incorrect based on an attention data set selected in advance on the worker device 3.
- the judgment step S150 may be configured to judge the work of multiple workers in the same work area.
- the judgment step S150 may be configured to judge the work of multiple workers simultaneously, for example, using the same or a common attention data set. This makes it possible to make judgments according to the work position and work situation of each worker, for example, regarding work steps that a certain worker overlooked, skipped, or were performed out of order.
- the judgment step S150 refers to various data tables, such as the "work information table”, “work procedure table”, “correspondence information table”, “instruction work record table”, “viewpoint record table”, “viewpoint record data table”, and “gaze target table”, stored in a database, and judges whether the work target is correct or incorrect based on the various data tables and gaze data sets, and acquires various correspondence information stored in the corresponding work or reference link depending on the judgment result.
- the correspondence information acquired by the judgment step S150 may be displayed on the worker device 3 of the corresponding teaching worker or other workers, for example.
- In the judgment step S150, for example, additions or updates to the corresponding information in the various data tables stored in the database are input via the input unit 17.
- the input unit 17 can appropriately input, for example, the work processes, judgment conditions, corresponding work, reference links, etc. of the specific targets and judgment targets stored in each of the various data tables.
- The attention extraction device 1 acquires video information from the instructor device 2 via the acquisition unit 11.
- the video information may include, for example, recording date and time information of the instructor's work, recording position information, and recording control information related to the operation of acquiring the video information.
- The display unit 16 switches the display of the instructor device 2 as shown in FIG. 9. For example, before the acquisition unit 11 acquires video information, the display unit 16 displays the recording date and time information, recording position information, and recording control information in the center of the field of view. Furthermore, while the acquisition unit 11 is acquiring video information, the display unit 16 switches to displaying only the recording control information in a corner of the field of view. This makes it possible to perform a display that does not interfere with the instructor's work.
- the display unit 16 displays attention information.
- The displayed information includes nudge information based on nudge theory, shown at the start, middle, or end of work depending on information such as the progress and status of the work performed by the worker. This makes it possible to reduce bias that affects interactions with conventional computer systems and information, and by displaying nudge displays based on nudge theory, it becomes possible to enhance the effectiveness of notifications from the attention extraction device 1.
1: Attention extraction device
1a: Attention extraction device screen
1b: Gaze display area
1c: Target setting area
1d: Judgment setting area
1e: Work space map
1f: Work proxy flag
2: Instructor device
2a: Field of view range (instructor)
2b: Gaze mode
3: Worker device
3a: Field of view (worker)
4: Server
5: Communication network
6: Work object
6a: Gaze object
10: Housing
11: Acquisition unit
12: Identification unit
13: Extraction unit
14: Storage unit (database)
15: Determination unit
16: Display unit
17: Input unit
18: Monitoring display unit
50: Work area
100: Attention extraction system
S110: Acquisition step
S120: Identification step
S130: Extraction step
S140: Storage step
S150: Determination step
Abstract
Description
An example of the configuration of the attention extraction system 100 in this embodiment will be described with reference to FIG. 1 and FIG. 2. FIG. 1 is a schematic diagram showing an example of the configuration of the attention extraction system 100 in this embodiment, and FIG. 2 is a schematic diagram showing an example of attention extraction by the instructor and work evaluation by the worker in the attention extraction system 100 in this embodiment.
The attention extraction device 1 obtains the viewpoint displacement of the instructor in the visual field range based on, for example, the image information in the visual field range of the work performed by the instructor acquired by the instructor device 2 and the coordinate information indicating the viewpoint at which the instructor gazes in the visual field range, and identifies the instructor's gaze mode.
The storage unit 14 stores, for example, various databases in the storage unit 104. The storage unit 14 stores the correlations between previously acquired past evaluation target information and the reference information linked to that past evaluation target information, and stores, for example, a learning model having such correlations. The storage unit 14 may be provided in, for example, the instructor device 2, the worker device 3, or the server 4.
The "work information table" stores, for example, data for identifying the work to be performed. The "work information table" stores, for example, a "work ID" and a "work name" for identifying the work performed by the instructor or the worker.
The "work procedure table" stores, for example, data related to the procedure of the work performed by the instructor or the worker. The "work procedure table" stores a "work procedure ID" identifying the work procedure and, linked to the "work procedure ID", a "work ID", a "work procedure name", and a "work order" in association with one another. The "work procedure table" is linked to the "work information table" by, for example, the "work ID".
The "correspondence information table" stores, for example, data for identifying the information, corresponding to the work, that is displayed by the display means after the correctness of the work object has been determined by the determination means based on the image information and the viewpoint coordinate information acquired by the acquisition means when the worker performs the work.
The "teaching work record table" stores, for example, various information related to the work performed by the instructor. The "teaching work record table" stores, linked to one another, a "teaching work record ID" identifying the recorded teaching work, a "teaching work date and time" indicating when the teaching work was performed, a "teaching worker" indicating the instructor who performed the teaching work, a "teaching work place" indicating the place or area where the teaching work was performed, and a "work ID". The "teaching work record table" is linked to the "work information table" by, for example, the "work ID".
The "viewpoint record table" stores, for example, information and data, acquired by the acquisition unit 11, on the viewpoint with which the instructor performed the work. The "viewpoint record table" stores a "viewpoint record ID" identifying the viewpoint record of the work performed by the instructor, and the coordinate data of the viewpoint obtained by eye tracking as a "viewpoint record data file".
The "viewpoint record data table" stores, for example, multiple pieces of detailed information on the viewpoint coordinate information acquired by the acquisition unit 11 when the instructor performed the work. The "viewpoint record data table" stores a "viewpoint identification data file" identifying the file in which the viewpoint record data is stored, and a "viewpoint record elapsed time" indicating the time of the viewpoint identification recorded for the instructor. The "viewpoint record elapsed time" is stored (held) as records of the elapsed time at which the viewpoint was recorded, taking the start of recording of the instructor's work as 00:00:00, and may be recorded at unit-time intervals (for example, every second) together with various position information. The "viewpoint record data table" is linked to the "teaching work record table" by, for example, the "viewpoint identification data file".
The "gaze target table" stores, for example, various information related to the object gazed at by the instructor, including multiple pieces of information set by the evaluator. The "gaze target table" stores, for example, a "gaze target ID" identifying the object gazed at by the instructor, a "visual field range recorded video file", a "viewpoint record data file", a "viewpoint record elapsed time", and a "work procedure ID".
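The data tables described above can be pictured as a small relational schema. The sketch below is illustrative only; table and column names are translations and assumptions, not the stored format of this disclosure.

```python
import sqlite3

DDL = """
CREATE TABLE IF NOT EXISTS work_info (             -- work information table
    work_id   TEXT PRIMARY KEY,
    work_name TEXT);
CREATE TABLE IF NOT EXISTS work_procedure (         -- work procedure table
    work_procedure_id   TEXT PRIMARY KEY,
    work_id             TEXT REFERENCES work_info(work_id),
    work_procedure_name TEXT,
    work_order          INTEGER);
CREATE TABLE IF NOT EXISTS correspondence_info (    -- correspondence information table
    correspondence_id TEXT PRIMARY KEY,
    work_id           TEXT REFERENCES work_info(work_id),
    display_text      TEXT,
    reference_link    TEXT);
CREATE TABLE IF NOT EXISTS teaching_work_record (   -- teaching work record table
    teaching_work_record_id TEXT PRIMARY KEY,
    teaching_work_datetime  TEXT,
    teaching_worker         TEXT,
    teaching_work_place     TEXT,
    work_id                 TEXT REFERENCES work_info(work_id));
CREATE TABLE IF NOT EXISTS viewpoint_record (       -- viewpoint record table
    viewpoint_record_id TEXT PRIMARY KEY,
    viewpoint_data_file TEXT);
CREATE TABLE IF NOT EXISTS viewpoint_record_data (  -- viewpoint record data table
    viewpoint_data_file TEXT,
    elapsed_time        TEXT,  -- e.g. '00:00:01' from the start of recording
    x REAL,
    y REAL);
CREATE TABLE IF NOT EXISTS gaze_target (            -- gaze target table
    gaze_target_id         TEXT PRIMARY KEY,
    field_range_video_file TEXT,
    viewpoint_data_file    TEXT,
    elapsed_time           TEXT,
    work_procedure_id      TEXT REFERENCES work_procedure(work_procedure_id));
"""

con = sqlite3.connect(":memory:")
con.executescript(DDL)  # build the illustrative schema in an in-memory database
print([r[0] for r in con.execute("SELECT name FROM sqlite_master WHERE type='table'")])
```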
The acquisition unit 11 acquires, linked to the work and in chronological order, the image information in the visual field range of the instructor performing the work and the coordinate information indicating the viewpoint at which the instructor gazes in the visual field range. The acquisition unit 11 is used, for example, when performing the acquisition step S110 described later. The timing at which the acquisition unit 11 acquires the image information and the coordinate information from the instructor device 2 can be set arbitrarily. The acquisition unit 11 stores the acquired image information and coordinate information in the storage unit 104 in chronological order, linked to the work performed by the instructor, for example via the storage unit 14.
The identification unit 12 obtains the viewpoint displacement of the instructor in the visual field range based on the image information and the coordinate information acquired by the acquisition unit 11, and identifies the instructor's gaze mode. The identification unit 12 is used, for example, when performing the identification step S120 described later. As shown in FIG. 3 described above, the identification unit 12 identifies the gaze mode in the instructor's visual field range based on, for example, the image information in the visual field range 2a acquired by the instructor device 2 and the coordinate information (x-axis, y-axis) of the instructor's viewpoint displacement.
The extraction unit 13 sets a gaze area based on the gaze mode identified by the identification unit 12, and extracts a gaze image of the work object 6 gazed at by the instructor in the gaze area. If there are multiple gaze modes identified by the identification unit 12, the extraction unit 13 may determine the gaze images of the work object 6 included in the gaze area set on the basis of each gaze mode, and extract them in chronological order.
The storage unit 14 links the gaze image extracted by the extraction unit 13 with the visual field range, the viewpoint displacement, and the gaze mode, and stores it in the database as attention information for the instructor's work. The storage unit 14 stores various information in the storage unit 104, or retrieves various information from the storage unit 104.
The determination unit 15 determines, using the attention data set related to the work, the correctness of the work object included in the gaze image of the work acquired, for example, via the worker's worker device 3. The determination unit 15 refers to the database stored in the storage unit 14, or to the storage unit 104, and determines, based on the image information acquired via the worker's worker device 3, whether the work performed on the worker's gaze information (for example, the work object 6) is correct or incorrect.
The display unit 16 outputs the various types of corresponding information acquired by the determination unit 15, for example, to the worker device 3 of the worker performing the corresponding work, using the attention data set generated based on the instructor's work. The display unit 16 (the display portion 109 of the worker device 3) displays, for example, the various types of corresponding information transmitted from the determination unit 15 in the visual field range 3a of the worker device 3 worn by the worker, superimposed on the actual image.
The input unit 17 inputs, for example, the corresponding information output by the determination unit 15. The input unit 17 accepts, for example via the attention extraction device screen 1a displayed on the attention extraction device 1, conditions and numerical values relating to various settings and adjustments from the evaluator, as well as input through the setting item menus displayed on the attention extraction device screen 1a. Thereby, the acquisition unit 11, the identification unit 12, the extraction unit 13, the storage unit 14, the determination unit 15, the display unit 16, and the monitoring display unit 18 perform various processes based on the various conditions and numerical values accepted by the input unit 17.
The monitoring display unit 18 displays, as a gaze monitoring area on the attention extraction device screen 1a, various information that the evaluator refers to and sets for the work performed by the instructor. The monitoring display unit 18 displays, for example, a gaze display area 1b that displays in time series the coordinate information, the gaze area, the gaze image, and the gaze shift stored in the database, a target setting area 1c that displays setting information for configuring the gaze image displayed in the gaze display area, and a judgment setting area 1d that displays the judgment conditions for the setting information displayed in the target setting area.
The instructor device 2 is worn by the instructor of the work, or by, for example, an expert or a qualified person, and acquires image information seen through the instructor's eyes during the instructor's work. The instructor device 2 may be, for example, a device using known eye-tracking technology, a head-mounted display, or smart glasses, and may acquire, in addition to the image information, voice and ambient sound information, temperature and humidity, position information, spatial information, and the like.
The worker device 3 is worn by the worker performing the work, or by a person other than the instructor, and acquires image information seen through the worker's eyes during the worker's work. The worker device 3 may be, for example, a device using known eye-tracking technology, a head-mounted display, or smart glasses, and may acquire, in addition to the image information, voice and ambient sound information, temperature and humidity, position information, spatial information, and the like.
The communication network 5 refers to, for example, an Internet network to which the attention extraction device 1, the instructor device 2, and the worker device 3 are connected via communication circuits, and may be configured as an optical fiber communication network. The communication network 5 can be realized by a known communication network such as a wired communication network or a wireless communication network.
The learning model generates the database by, for example, machine learning. The learning model acquires, as a pair of learning data, a learning target image including a gaze image captured using the instructor device 2 and reference information indicating the correctness of the gaze image captured using the instructor device 2, acquiring multiple such pairs for each work on the instructor device 2. The learning model is generated by machine learning using the multiple pieces of learning data, as a database in which the correlations between the multiple learning target images and the multiple pieces of reference information are stored.
The acquisition step S110 acquires, for example, the image information and the coordinate information in chronological order, linked to the work. The acquisition step S110 acquires them using, for example, the instructor device 2 equipped with a known camera or imaging device. The acquisition step S110 may also acquire the image information and the coordinate information from the worker device 3 in chronological order, linked to the work. In the acquisition step S110, the image information and the like are acquired by a device selected by the instructor and the worker.
The identification step S120 obtains the viewpoint displacement of the instructor and identifies the instructor's gaze mode. The identification step S120 obtains the viewpoint displacement in the instructor's visual field range based on, for example, the image information and the coordinate information acquired in the acquisition step S110, and identifies the instructor's gaze mode.
The extraction step S130 sets, for example, a gaze area and extracts a gaze image of the work object gazed at by the instructor. The extraction step S130 sets the gaze area gazed at by the instructor based on, for example, the gaze mode identified in the identification step S120 (for example, "wide field mode" or "alert mode"). In the extraction step S130, for example, a gaze image of the work object gazed at by the instructor is extracted from the instructor's video information within the set gaze area.
The storage step S140 links, for example, the gaze image with the visual field range, the viewpoint displacement, and the gaze mode, and stores it as attention information. The storage step S140 links the gaze image extracted in the extraction step S130 with the visual field range, the viewpoint displacement, and the gaze mode, and stores it in the database as attention information for the work, as described above.
The determination step S150 determines, using the attention data set, the correctness of the work object included in the gaze image acquired, for example, via the worker device 3. The determination step S150 determines the correctness of the work object based on, for example, the gaze image acquired by the worker device 3 in the acquisition step S110. The determination step S150 may make the determination using the attention data set selected for each work, or by evaluation by an evaluator monitoring the work, for example via the attention extraction device 1.
1a : Attention extraction device screen
1b : Gaze display area
1c : Target setting area
1d : Judgment setting area
1e : Work space map
1f : Work proxy flag
2 : Instructor device
2a : Visual field range (instructor)
2b : Gaze mode
3 : Worker device
3a : Visual field range (worker)
4 : Server
5 : Communication network
6 : Work object
6a : Gaze target
10 : Housing
11 : Acquisition unit
12 : Identification unit
13 : Extraction unit
14 : Storage unit (database)
15 : Determination unit
16 : Display unit
17 : Input unit
18 : Monitoring display unit
50 : Work area
100 : Attention extraction system
S110 : Acquisition step
S120 : Identification step
S130 : Extraction step
S140 : Storage step
S150 : Determination step
Claims (7)
- 1. An attention extraction system for extracting attention information for a work, comprising: an acquisition means for acquiring, linked to the work and in chronological order, image information in a visual field range of an instructor performing the work and coordinate information indicating a viewpoint at which the instructor gazes in the visual field range; an identification means for identifying a gaze mode of the instructor based on the image information and the coordinate information acquired by the acquisition means, wherein the gaze mode is identified as a wide field mode when the time-series coordinate information indicating the viewpoint displacement of the instructor in the visual field range concentrates at the center of the visual field range, and as an alert mode when it disperses outside the center of the visual field range; an extraction means for setting a gaze area based on the gaze mode identified by the identification means and extracting a gaze image of a work object gazed at by the instructor in the gaze area; and a storage means for linking the gaze image extracted by the extraction means with the visual field range, the viewpoint displacement, and the gaze mode, and storing it in a database as attention information for the work.
- 2. The attention extraction system according to claim 1, wherein the acquisition means acquires a gaze image of a worker performing a work, the system further comprising: a determination means for determining the correctness of the work performed by the worker on a work object, based on image information included in the gaze image and target information, stored in advance in a database, relating to the work to which the worker should respond; and a display means for displaying a result of the determination by the determination means.
- 3. The attention extraction system according to claim 2, further comprising a database in which correlations between previously acquired past gaze image information and reference information indicating the correctness of the work object linked to the gaze image are stored, wherein the determination means refers to the database to determine the correctness of the work object and acquires, from the database, corresponding information according to the result of the determination, and the display means further outputs the corresponding information acquired by the determination means.
- 4. The attention extraction system according to claim 2, further comprising an input means for inputting the corresponding information output by the determination means, wherein the storage means stores the corresponding information input by the input means, linked with the gaze image, as an attention data set including setting conditions for recognizing it as a gaze target.
- 5. The attention extraction system according to claim 2, wherein the image information acquired by the acquisition means includes recording date and time information of the work by the instructor, recording position information, and recording control information relating to the operation of acquiring the image information, and the display means displays the recording date and time information, the recording position information, and the recording control information in the center of the visual field range before the acquisition means acquires the image information, and switches to displaying only the recording control information in a corner of the visual field range while the image information is being acquired.
- 6. The attention extraction system according to claim 2, wherein the display means further comprises an attention information display area that switches between an acquisition display area that displays, to the worker performing the work, instructions to respectively select the type of the acquisition means that acquires the image information, the type of attention data set including setting conditions for recognition as a gaze target stored in the database, and the start of the work, and an attention display area that displays, after the selection and based on the attention data set, corresponding information and attention information corresponding to the worker's gaze image; the attention information displayed in the attention display area is caused to include nudge information displayed at at least one of the timings of work start, work middle, or work end depending on the worker's work progress; and information including at least one of the corresponding information, the attention information, or the nudge information displayed in the attention information display area is distributed and displayed according to the gaze state of the worker performing the work, based on the attention data set and the result of the determination by the determination means.
- 7. An attention extraction method for extracting attention information for a work, the method causing a computer to execute: an acquisition step of acquiring, linked to the work and in chronological order, image information in a visual field range of an instructor performing the work and coordinate information indicating a viewpoint at which the instructor gazes in the visual field range; an identification step of identifying a gaze mode of the instructor based on the image information and the coordinate information acquired in the acquisition step, wherein the gaze mode is identified as a wide field mode when the time-series coordinate information indicating the viewpoint displacement of the instructor in the visual field range concentrates at the center of the visual field range, and as an alert mode when it disperses outside the center of the visual field range; an extraction step of setting a gaze area based on the gaze mode identified in the identification step and extracting a gaze image of a work object gazed at by the instructor in the gaze area; and a storage step of linking the gaze image extracted in the extraction step with the visual field range, the viewpoint displacement, and the gaze mode, and storing it in a database as attention information for the work.
Priority Applications (3)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202480002623.4A CN119317931A (zh) | 2023-05-11 | 2024-04-19 | 注意提取系统以及注意提取方法 |
| DE112024000075.4T DE112024000075T5 (de) | 2023-05-11 | 2024-04-19 | Aufmerksamkeitsextraktionssystem und Aufmerksamkeitsextraktionsverfahren |
| US18/869,057 US20250348137A1 (en) | 2023-05-11 | 2024-04-19 | Attention extraction system and attention extraction method |
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2023-078265 | 2023-05-11 | ||
| JP2023078265A JP7418711B1 (ja) | 2023-05-11 | 2023-05-11 | 注意抽出システム、及び注意抽出方法 |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2024232239A1 true WO2024232239A1 (ja) | 2024-11-14 |
Family
ID=89616093
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/JP2024/015567 Pending WO2024232239A1 (ja) | 2023-05-11 | 2024-04-19 | 注意抽出システム、及び注意抽出方法 |
Country Status (5)
| Country | Link |
|---|---|
| US (1) | US20250348137A1 (ja) |
| JP (1) | JP7418711B1 (ja) |
| CN (1) | CN119317931A (ja) |
| DE (1) | DE112024000075T5 (ja) |
| WO (1) | WO2024232239A1 (ja) |
Citations (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2010137165A1 (ja) * | 2009-05-29 | 2010-12-02 | 新日本製鐵株式会社 | 技術管理装置及び技術管理方法 (Technology management device and technology management method) |
| JP2013097466A (ja) * | 2011-10-28 | 2013-05-20 | Hitachi Ltd | 作業支援システム、作業支援方法、および、作業用端末 (Work support system, work support method, and work terminal) |
| US20160027336A1 (en) * | 2012-04-23 | 2016-01-28 | The Boeing Company | Methods for Evaluating Human Performance in Aviation |
Family Cites Families (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP2015061339A (ja) | 2013-09-17 | 2015-03-30 | 株式会社日立製作所 | 結線作業支援システム (Wiring work support system) |
| JP6646511B2 (ja) | 2016-04-14 | 2020-02-14 | 株式会社フジタ | 技能伝承システム及び方法 (Skill transfer system and method) |
- 2023-05-11 JP JP2023078265A patent/JP7418711B1/ja active Active
- 2024-04-19 CN CN202480002623.4A patent/CN119317931A/zh active Pending
- 2024-04-19 US US18/869,057 patent/US20250348137A1/en active Pending
- 2024-04-19 WO PCT/JP2024/015567 patent/WO2024232239A1/ja active Pending
- 2024-04-19 DE DE112024000075.4T patent/DE112024000075T5/de active Pending
Also Published As
| Publication number | Publication date |
|---|---|
| US20250348137A1 (en) | 2025-11-13 |
| DE112024000075T5 (de) | 2025-03-06 |
| JP7418711B1 (ja) | 2024-01-22 |
| JP2024162591A (ja) | 2024-11-21 |
| CN119317931A (zh) | 2025-01-14 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | WWE | Wipo information: entry into national phase | Ref document number: 202480002623.4; Country of ref document: CN |
| | WWE | Wipo information: entry into national phase | Ref document number: 18869057; Country of ref document: US |
| | WWE | Wipo information: entry into national phase | Ref document number: 112024000075; Country of ref document: DE |
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 24803353; Country of ref document: EP; Kind code of ref document: A1 |
| | WWP | Wipo information: published in national office | Ref document number: 202480002623.4; Country of ref document: CN |
| | WWP | Wipo information: published in national office | Ref document number: 112024000075; Country of ref document: DE |
| | WWP | Wipo information: published in national office | Ref document number: 18869057; Country of ref document: US |