WO2024128212A1 - Information processing device for outputting information based on point-of-gaze, and information processing method and program for outputting information based on point-of-gaze - Google Patents
- Publication number: WO2024128212A1
- Application number: PCT/JP2023/044369
- Authority: WIPO (PCT)
- Prior art keywords
- information processing
- processing device
- captured image
- gaze point
- surgery
- Prior art date
- Legal status: Ceased (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/70—Determining position or orientation of objects or cameras
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H30/00—ICT specially adapted for the handling or processing of medical images
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N7/00—Television systems
- H04N7/18—Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Medical Informatics (AREA)
- Health & Medical Sciences (AREA)
- Software Systems (AREA)
- Artificial Intelligence (AREA)
- Data Mining & Analysis (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- Mathematical Physics (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Evolutionary Computation (AREA)
- Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
- Radiology & Medical Imaging (AREA)
- Epidemiology (AREA)
- General Health & Medical Sciences (AREA)
- Primary Health Care (AREA)
- Public Health (AREA)
- Image Analysis (AREA)
Description
The present invention relates to an information processing device, an information processing method, and a program.
There are systems that capture and record images of the surgical site during surgery for use in academic presentations, medical education, and the like. Patent Document 1 discloses a technology for distributing and recording video of the surgical site (operative field).
When recording images of a surgical site, it is important to know where the gaze point related to the surgical site lies.
According to one aspect of the present invention, an information processing device is provided. This information processing device acquires a captured image of a surgery, determines a gaze point related to the surgery based on the captured image of the surgery, and outputs information based on the determined gaze point.
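(Brief Description of Drawings)
FIG. 1 is a diagram showing an example of the system configuration of an information processing system 1000.
FIG. 2 is a diagram showing an example of the hardware configuration of the server device 100.
FIG. 3 is a diagram showing an example of the hardware configuration of the client device 110.
FIG. 4 is an activity diagram showing an example of information processing for storing captured images.
FIG. 5 is an activity diagram showing an example of the main information processing.
FIG. 6 is a diagram showing an example of an acquired captured image.
FIG. 7 is a diagram for explaining a method for determining the axis of an object.
FIG. 8 is a diagram showing an example of the gaze point of the captured image (current frame) being processed.
FIG. 9 is a diagram showing an example of determining an overall gaze point based on the gaze point of the current frame and the gaze points of several past frames.
FIG. 10 is a diagram showing an example of the gaze point of the captured image (current frame) being processed in Modification 4.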
(Embodiment 1)
Hereinafter, an embodiment of the present invention will be described with reference to the drawings. The various features shown in the following embodiment can be combined with one another. In particular, in this specification, the term "part" may include, for example, a combination of hardware resources implemented by a circuit in a broad sense and software information processing that can be concretely realized by these hardware resources. Although various information is handled in this embodiment, such information can be communicated and computed on a circuit in a broad sense regardless of whether it is represented by signal levels as a binary bit collection of 0s and 1s, by physical numerical signal values, or by quantum superposition.
A circuit in the broad sense is a circuit realized by at least appropriately combining a circuit, circuitry, a processor, memory, and the like. That is, it includes application-specific integrated circuits (ASICs) and programmable logic devices (for example, simple programmable logic devices (SPLDs), complex programmable logic devices (CPLDs), and field-programmable gate arrays (FPGAs)).
The programs for realizing the software appearing in the embodiments may be implemented in a form that allows them to be downloaded from a server, may be executed on a cloud computer, or may be stored in a non-volatile or volatile non-transitory storage medium and distributed.
1. System Configuration
FIG. 1 is a diagram showing an example of the system configuration of an information processing system 1000. As shown in FIG. 1, the information processing system 1000 includes, as its system configuration, a server device 100, a client device 110, and a camera 120. The server device 100, the client device 110, and the camera 120 are communicably connected via a network 150.
The camera 120 is a device that captures an image of the surgical site (operative field) in an operating room. In the case of open surgery, the camera 120 is, for example, a web camera installed in the operating room. Based on control information from the server device 100, the camera 120 changes its imaging direction, imaging position, and camera settings (for example, ISO sensitivity, white balance, shutter mode, microphone, self-timer, focus setting, camera-shake reduction, flicker prevention, storage destination, time-lapse shooting interval, and the like), and changes its digital zoom setting. Digital zoom is a process of enlarging a part of an image by image processing.
The server device 100 is a device that executes the processing according to the embodiment on captured images transmitted from the camera 120 and the like.
The client device 110 is a device that displays the results of the processing executed by the server device 100, and the like.
Note that, for simplicity of explanation, only one client device 110 is illustrated in FIG. 1, but the information processing system 1000 may include multiple client devices 110. Likewise, only one camera 120 is illustrated in FIG. 1, but the information processing system 1000 may include multiple cameras 120. When multiple cameras 120 are included, they may be of the same type or of different types; examples of the types of the camera 120 include a web camera and an endoscope camera. In addition, the processing of the following embodiment and its modifications will be described as being executed by the server device 100, but a PC (Personal Computer), a cloud system, or the like may execute the processing instead of the server device 100.
2. Hardware Configuration
(1) Hardware configuration of the server device 100
FIG. 2 is a diagram showing an example of the hardware configuration of the server device 100. The server device 100 includes, as its hardware configuration, a control unit 210, a storage unit 220, and a communication unit 230. The control unit 210 is a CPU (Central Processing Unit) or the like, and controls the entire server device 100. The storage unit 220 is any one of an HDD (Hard Disk Drive), a ROM (Read Only Memory), a RAM (Random Access Memory), an SSD (Solid State Drive), and the like, or any combination thereof, and stores programs and the data used when the control unit 210 executes processing based on the programs. The control unit 210 executes processing based on the programs stored in the storage unit 220, thereby realizing the functions of the server device 100. The communication unit 230 is a NIC (Network Interface Card) or the like, connects the server device 100 to the network 150, and controls communication with other devices.
In the first embodiment, the data used by the control unit 210 when executing processing based on a program is described as being stored in the storage unit 220, but the data may instead be stored in a storage unit or the like of another device with which the server device 100 can communicate. Also, although FIG. 2 shows a single control unit 210, multiple control units may execute processing based on programs stored in a storage unit or the like.
(2) Hardware configuration of the client device 110
FIG. 3 is a diagram showing an example of the hardware configuration of the client device 110. The client device 110 includes, as its hardware configuration, a control unit 310, a storage unit 320, an input unit 330, an output unit 340, and a communication unit 350. The control unit 310 is a CPU or the like, and controls the entire client device 110. The storage unit 320 is any one of an HDD, a ROM, a RAM, and an SSD, or any combination thereof, and stores programs and the data used when the control unit 310 executes processing based on the programs. The control unit 310 executes processing based on the programs stored in the storage unit 320, thereby realizing the functions of the client device 110. The input unit 330 is a keyboard and/or a mouse or the like, and inputs user operation information and the like to the control unit 310. The output unit 340 is a display or the like, and displays the results of information processing by the control unit 310 and the like. The communication unit 350 is a NIC or the like, connects the client device 110 to the network 150, and controls communication with other devices.
The camera 120 also has at least a control unit, a storage unit, and an imaging unit. The imaging unit captures an image of a subject (the surgical site in this embodiment) under the control of the control unit. The control unit controls the overall processing of the camera 120 and realizes the functions of the camera 120 by executing processing based on a program stored in the storage unit. The storage unit stores the program for the camera 120 and the data used when the control unit executes processing based on the program. The storage unit may also store images captured by the camera 120.
3. Information Processing
Hereinafter, the information processing according to the first embodiment will be described.
(1) Captured-image storage processing
FIG. 4 is an activity diagram showing an example of the information processing for storing captured images.
In activity A401, the control unit 210 waits to receive a captured image of the surgical site (hereinafter also simply referred to as a captured image) from the camera 120. If the control unit 210 receives a captured image, the processing proceeds to activity A402; if not, the control unit 210 repeats activity A401.
In activity A402, the control unit 210 stores the received captured image in the storage unit 220 or the like.
While running, the server device 100 repeatedly executes the processing shown in FIG. 4. The camera 120 captures images of the surgical site in the operating room and transmits chronologically continuous captured images (that is, video) to the server device 100. The server device 100 therefore stores the video transmitted from the camera 120 in the storage unit 220 or the like.
In the embodiment described below, the server device 100 executes the information processing described below after storing the captured images received from the camera 120 in the storage unit 220 or the like. However, the server device 100 may instead acquire a captured image from the predetermined data received from the camera 120 and execute the information processing described below before storing it in the storage unit 220 or the like.
(2) Main processing
(Overview of the processing)
An overview of the main information processing of the first embodiment will be described with reference to FIG. 5. The control unit 210 acquires a captured image from the storage unit 220 or from data received from the camera 120. The control unit 210 then determines a gaze point related to the surgery based on the acquired captured image, and outputs information based on the determined gaze point. By performing such processing, information based on the gaze point related to the surgical site can be output.
(Details of the processing)
FIG. 5 is an activity diagram showing an example of the main information processing.
In activity A501, the control unit 210 acquires a captured image (frame) from the storage unit 220 or from data received from the camera 120. FIG. 6 is a diagram showing an example of the acquired captured image. As shown in FIG. 6, the captured image includes the surgical site of an open abdominal surgery and contains an operator's hand 610, a surgical instrument 620, a surgical instrument 630, and a surgical instrument 640. That is, the captured image of the surgery shown in FIG. 6 is an image of the surgical site during an open abdominal surgery. The operator's hand 610 and the surgical instruments 620, 630, and 640 included in the captured image are examples of objects.
In activity A502, the control unit 210 extracts objects from the acquired captured image. More specifically, the control unit 210 inputs the acquired captured image to a trained model. Here, the trained model is a model that has been trained using captured images of surgeries as input data and the objects contained in those captured images as output data. The control unit 210 extracts the objects from the captured image by acquiring the objects output from the trained model. As another example, the control unit 210 may analyze the captured image and extract the objects based on the image analysis. When multiple objects are present, the control unit 210 extracts each of them. In the following explanation, unless otherwise specified, it is assumed that multiple objects are present.
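The description fixes the training pairing (surgical image in, contained objects out) but not a model interface, so the following sketch assumes a hypothetical segmentation model with a `predict` method that returns one mask per detected object; these names are assumptions, not an API from the source.

```python
import numpy as np


def extract_objects(model, frame: np.ndarray) -> list[np.ndarray]:
    """Activity A502: return one binary mask per extracted object.

    `model` is a placeholder for the trained model described in the text;
    any instance-segmentation network trained on (surgical image ->
    contained objects) pairs would fit. `predict` is an assumed interface.
    """
    masks = model.predict(frame)  # hypothetical inference call
    return [np.asarray(m, dtype=bool) for m in masks]
```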
In activity A503, the control unit 210 determines the axis of each of the multiple objects extracted from the captured image, and determines a gaze point from the intersections of the determined axes.
A method for determining the axis of an object will be described with reference to FIG. 7. FIG. 7 is a diagram for explaining the method for determining the axis of an object. The control unit 210 determines a rectangular area that circumscribes the area from which an object included in the captured image was extracted (the extraction area), and determines the width W and height H of the rectangular area. The control unit 210 also determines the center of gravity (X, Y) 710 of the rectangular area.
Next, the control unit 210 determines the area and the center of gravity of the extraction area existing within a circle of radius √(W^2+H^2)*3/8 centered at each of the following eight points:
1. ((X-W)/4, (Y-H)/4)
2. (X, (Y-H)/4)
3. ((X+W)/4, (Y-H)/4)
4. ((X-W)/4, Y)
5. ((X+W)/4, Y)
6. ((X-W)/4, (Y+H)/4)
7. (X, (Y+H)/4)
8. ((X+W)/4, (Y+H)/4)
In FIG. 7, these eight circles are shown as the areas of interest. The control unit 210 calculates the area and the center of gravity of the extraction area that exists within each of these circles.
Next, the control unit 210 selects, from among the eight areas of interest (circles), the one whose overlap with the extraction area has the largest area. In the example of FIG. 7, the area of interest 720 is selected as the selected area of interest. The control unit 210 also selects the three areas of interest located on the opposite side of the selected area of interest with respect to the center of gravity. In the example of FIG. 7, the areas of interest 721, 722, and 723 are selected as these three areas of interest.
The control unit 210 determines the center of gravity 730 of the area where the selected area of interest 720 overlaps the extraction area. Likewise, the control unit 210 determines the centers of gravity 731, 732, and 733 of the areas where the areas of interest 721, 722, and 723, respectively, overlap the extraction area, and then determines the center of gravity 735 of the centers of gravity 731, 732, and 733.
The control unit 210 then sets the straight line connecting the center of gravity 730, the center of gravity 710, and the center of gravity 735 as the axis of the object being processed among the extracted objects.
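The axis construction can be sketched as follows, as a minimal NumPy implementation over one binary object mask. Two readings are assumptions flagged in the comments: the eight listed coordinates are read as offsets of ±W/4 and ±H/4 around the center of gravity (X, Y), matching the eight circles drawn around the object in FIG. 7, and the "three areas of interest opposite the selected one" are approximated by the largest-overlap circles on the far side of the center. Function and variable names are illustrative.

```python
import numpy as np


def object_axis(mask: np.ndarray):
    """Return two points (730 and 735) defining the axis of one object."""
    ys, xs = np.nonzero(mask)                      # pixels of the extraction area
    x0, x1, y0, y1 = xs.min(), xs.max(), ys.min(), ys.max()
    W, H = x1 - x0, y1 - y0                        # circumscribing rectangle
    X, Y = (x0 + x1) / 2.0, (y0 + y1) / 2.0       # its center of gravity (710)
    r = np.sqrt(W**2 + H**2) * 3.0 / 8.0          # radius of each area of interest

    # Assumed reading of the eight listed points: offsets of +/-W/4, +/-H/4.
    offsets = [(-W/4, -H/4), (0, -H/4), (W/4, -H/4),
               (-W/4, 0),                (W/4, 0),
               (-W/4, H/4),  (0, H/4),  (W/4, H/4)]
    centers = [(X + dx, Y + dy) for dx, dy in offsets]

    def overlap(cx, cy):
        """Area and centroid of the extraction area inside one circle."""
        inside = (xs - cx) ** 2 + (ys - cy) ** 2 <= r ** 2
        if not inside.any():
            return 0, np.array([cx, cy])
        return int(inside.sum()), np.array([xs[inside].mean(), ys[inside].mean()])

    stats = [overlap(cx, cy) for cx, cy in centers]
    k = int(np.argmax([a for a, _ in stats]))      # selected area of interest (720)
    g_sel = stats[k][1]                            # center of gravity 730

    # Approximation: the three opposite areas are the circles whose centers lie
    # in the half-plane opposite the selected circle, largest overlaps first.
    sel_dir = np.array(centers[k]) - np.array([X, Y])
    opposite = [i for i, c in enumerate(centers)
                if np.dot(np.array(c) - np.array([X, Y]), sel_dir) < 0]
    opp3 = sorted(opposite, key=lambda i: -stats[i][0])[:3]    # 721, 722, 723
    g_opp = np.mean([stats[i][1] for i in opp3], axis=0)       # center of gravity 735

    return g_sel, g_opp  # the axis is the line through 730, (X, Y) 710, and 735
```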
The control unit 210 determines the axis of each of the multiple objects extracted from the captured image, and determines the gaze point of the captured image being processed from the intersections of the determined axes. More specifically, the control unit 210 removes outlier intersections from the set of all intersections of the determined axes. The control unit 210 removes outliers using an outlier-removal algorithm called LocalOutlierFactor, but the method is not limited to this. The control unit 210 sets the average of the intersections remaining after outlier removal as the gaze point of the captured image being processed (the current frame). FIG. 8 is a diagram showing an example of the gaze point of the captured image being processed (the current frame). A point 810 is an example of the determined gaze point.
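A sketch of the intersection-and-averaging step follows, using scikit-learn's LocalOutlierFactor as named in the text. The axes are taken as (point_a, point_b) pairs such as those returned by the axis sketch above; pairing every two axes and the n_neighbors choice are assumptions.

```python
from itertools import combinations

import numpy as np
from sklearn.neighbors import LocalOutlierFactor


def line_intersection(p1, p2, q1, q2):
    """Intersection of the infinite lines through (p1, p2) and (q1, q2)."""
    p1, p2, q1, q2 = (np.asarray(v, dtype=float) for v in (p1, p2, q1, q2))
    d1, d2 = p2 - p1, q2 - q1
    denom = d1[0] * d2[1] - d1[1] * d2[0]
    if abs(denom) < 1e-9:              # parallel axes: no intersection
        return None
    t = ((q1[0] - p1[0]) * d2[1] - (q1[1] - p1[1]) * d2[0]) / denom
    return p1 + t * d1


def frame_gaze_point(axes):
    """Gaze point of one frame: mean of axis intersections after outlier removal."""
    points = [line_intersection(*a, *b) for a, b in combinations(axes, 2)]
    points = np.array([p for p in points if p is not None])
    if len(points) == 0:
        return None
    if len(points) < 3:                # too few intersections to rate outliers
        return points.mean(axis=0)
    # LocalOutlierFactor labels inliers +1 and outliers -1.
    lof = LocalOutlierFactor(n_neighbors=min(20, len(points) - 1))
    inliers = points[lof.fit_predict(points) == 1]
    return (inliers if len(inliers) else points).mean(axis=0)
```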
The control unit 210 acquires, from the storage unit 220 or the like, the gaze points of past captured images (past frames) that are chronologically continuous with the captured image (current frame). In this embodiment, the control unit 210 acquires the gaze points of four past frames, but the number is not limited to four. The control unit 210 then averages the gaze point of the current frame and the gaze points of the past frames to obtain an overall gaze point. This is an example of processing that determines the gaze point related to the surgery based on the gaze point determined from the captured image being processed (the current frame) and the gaze points of past captured images (past frames) that are chronologically continuous with it. FIG. 9 is a diagram showing an example of determining the overall gaze point based on the gaze point of the current frame and the gaze points of several past frames.
The determined overall gaze point is the gaze point determined in activity A503.
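The temporal averaging can be sketched with a small ring buffer; the four-frame window follows the embodiment and is configurable.

```python
from collections import deque

import numpy as np


class GazePointSmoother:
    """Overall gaze point: average of the current frame's gaze point and the
    gaze points of the most recent past frames (four in the embodiment)."""

    def __init__(self, past_frames: int = 4):
        self.history = deque(maxlen=past_frames)

    def update(self, current_gaze) -> np.ndarray:
        current = np.asarray(current_gaze, dtype=float)
        overall = np.mean(list(self.history) + [current], axis=0)
        self.history.append(current)  # the current frame becomes a past frame
        return overall
```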
In activity A504, the control unit 210 outputs information based on the gaze point determined in activity A503. For example, the control unit 210 outputs information based on the determined gaze point (for example, an object indicating the gaze point) by displaying information indicating the determined gaze point superimposed on the captured image displayed on the output unit 340 of the client device 110 or the like. As another example, the control unit 210 outputs information based on the determined gaze point by transmitting, to the imaging device that captures the captured image, control information for controlling the imaging device based on the determined gaze point. One example of the control information is information for controlling the camera 120 so that the gaze point is included in the captured image. For example, when a part of the operator's body or the like comes between the camera 120 and the gaze point and the gaze point is no longer included in the captured image, the control unit 210 transmits, to the camera 120, control information for shifting the imaging direction or position of the camera 120 so that the gaze point is included in the captured image. Upon receiving the control information, the camera 120 moves its imaging direction or its position based on the control information so that the gaze point is included in the captured image.
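A minimal sketch of the superimposed-display output follows, assuming OpenCV drawing calls on the frame shown at the client device; the marker shape and color are illustrative choices, as the text only requires an object indicating the gaze point.

```python
import cv2
import numpy as np


def overlay_gaze_point(frame: np.ndarray, gaze, radius: int = 12) -> np.ndarray:
    """Superimpose an object indicating the gaze point on the captured image
    before it is displayed on the client device's output unit 340."""
    x, y = int(round(gaze[0])), int(round(gaze[1]))
    annotated = frame.copy()
    cv2.circle(annotated, (x, y), radius, (0, 0, 255), thickness=2)
    cv2.drawMarker(annotated, (x, y), (0, 0, 255), cv2.MARKER_CROSS,
                   markerSize=radius * 2, thickness=2)
    return annotated
```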
According to the first embodiment, the camera 120 can be controlled to record images that include the gaze point of the surgical site. As a result, images including the gaze point of the surgical site are recorded in the camera 120 or the server device 100.
(Modification 1)
A modification of the first embodiment will be described. Modification 1 describes another example of a part of the processing of the first embodiment. Modification 1 is included in the first embodiment and does not describe a separate embodiment. The same applies to the modifications described below.
Another example of the control information is information for controlling the imaging device so that it focuses on the gaze point. For example, when the focus of the camera 120 deviates from the gaze point, the control unit 210 transmits, to the camera 120, control information for changing the focus setting of the camera 120 so that the focus of the camera 120 coincides with the gaze point of the surgical site (operative field). Upon receiving the control information, the camera 120 focuses on the gaze point of the surgical site (operative field) based on the control information.
By combining Modification 1 with the first embodiment, the camera 120 can be controlled to record images in which the gaze point of the surgical site is in focus. As a result, images that include the gaze point of the surgical site and are not out of focus are recorded in the camera 120 or the server device 100.
(Modification 2)
Another modification of the first embodiment will be described.
The control unit 210 may output audio information as the information based on the determined gaze point. An example of the audio information is a message such as "The gaze point is blocked by the operator's head."
According to Modification 2, information regarding the gaze point can be output as audio information.
(Modification 3)
Another modification of the first embodiment will be described.
The first embodiment has been described taking open abdominal surgery as an example, but the information processing system 1000 is also applicable to other types of surgery, for example endoscopic surgery, which includes laparoscopic surgery, robot-assisted surgery, microsurgery, and the like. For simplicity, Modification 3 describes laparoscopic surgery as the example of endoscopic surgery. In the case of laparoscopic surgery, the camera 120 is a laparoscopic camera, and the captured image of the surgery is an image of the surgical site during the laparoscopic surgery. In the configuration of Modification 3, as the information based on the gaze point, the control unit 210 transmits, to the laparoscopic camera, control information for controlling the laparoscopic camera so that images including the gaze point of the surgical site (operative field) are always captured during the laparoscopic surgery. Based on the control information, the laparoscopic camera moves, for example, automatically during the laparoscopic surgery so that images including the gaze point of the surgical site (operative field) are always captured.
According to Modification 3, the camera 120 can be controlled to record images including the gaze point of the surgical site even in surgeries other than open surgery, such as laparoscopic surgery. As a result, images including the gaze point of the surgical site are recorded in the camera 120 or the server device 100.
(変形例4)
実施形態1の変形例を説明する。
変形例4の制御部210は、アクティビティA502において、撮像画像から手術に関する器具と、解剖学的構造物と、を抽出する。手術に関する器具は、第1のオブジェクトの一例である。解剖学的構造物は、第2のオブジェクトの一例である。なお、解剖学的構造物とは、人体等を構成する解剖学的部位のことであり、例えば、胆嚢、肝臓、心臓等が存在する。制御部210は、取得した撮像画像を学習済みモデルに入力する。ここで、学習済みモデルは、手術の撮像画像を入力データ、手術の撮像画像に含まれるオブジェクトとオブジェクトの種類とを出力データとして学習された学習済みモデルである。制御部210は、学習済みモデルから出力されたオブジェクトの種類が手術に関する器具を示しており、かつ、手術に関する器具が1つだけ抽出された場合、手術に関する器具の軸を求める。そして、制御部210は、求めた軸と、抽出した解剖学的構造物と、の交点を求め、求めた交点を注視点とする。
(Variation 4)
A modification of the first embodiment will be described.
In the fourth modified example, the control unit 210 extracts surgical instruments and anatomical structures from the captured image in the activity A502. The surgical instruments are an example of a first object. The anatomical structures are an example of a second object. The anatomical structures are anatomical parts that constitute the human body, and examples of such anatomical structures include the gallbladder, liver, and heart. The control unit 210 inputs the acquired captured image into the trained model. Here, the trained model is a trained model trained using the surgical captured image as input data and the objects and object types included in the surgical captured image as output data. When the type of object output from the trained model indicates a surgical instrument and only one surgical instrument is extracted, the control unit 210 determines the axis of the surgical instrument. Then, the control unit 210 determines the intersection of the determined axis and the extracted anatomical structure, and sets the determined intersection as the gaze point.
FIG. 10 is a diagram showing an example of the gaze point of the captured image (current frame) to be processed in the fourth modification. The axis 1010 is the axis of the surgical instrument extracted from the captured image. The gaze point 1020 is the intersection of the axis 1010 and the anatomical structure.
According to the fourth modification, the gaze point can be determined from the captured image even when only one surgical instrument is present. Therefore, even in that case, the camera 120 can be controlled to record an image including the gaze point of the surgical site, and as a result an image including the gaze point of the surgical site is recorded by the camera 120 or the server device 100.
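The following sketch illustrates one way to realize the computation described above, assuming binary segmentation masks for the instrument and the anatomical structure are obtained from the trained model; the function names, the PCA axis fit, and the marching search are illustrative assumptions rather than the embodiment's actual implementation.

```python
# Illustrative sketch of the Variation 4 gaze-point computation.
# Assumption: `instrument_mask` and `anatomy_mask` are binary NumPy arrays
# produced by some trained segmentation model (hypothetical here).
import numpy as np

def instrument_axis(mask: np.ndarray):
    """Fit the major axis of a binary instrument mask via PCA.

    Returns (centroid, direction), where direction is a unit vector along
    the instrument shaft (axis 1010 in FIG. 10).
    """
    ys, xs = np.nonzero(mask)
    pts = np.stack([xs, ys], axis=1).astype(float)
    centroid = pts.mean(axis=0)
    # The leading right-singular vector of the centered points gives the
    # direction of greatest extent, i.e. the shaft direction.
    _, _, vt = np.linalg.svd(pts - centroid, full_matrices=False)
    return centroid, vt[0]

def gaze_point(instrument_mask, anatomy_mask, step=1.0, max_len=4000):
    """Walk along the instrument axis until it first enters the anatomy mask."""
    centroid, direction = instrument_axis(instrument_mask)
    h, w = anatomy_mask.shape
    for sign in (+1, -1):            # the shaft may point either way
        for t in np.arange(0.0, max_len, step):
            x, y = centroid + sign * t * direction
            xi, yi = int(round(x)), int(round(y))
            if 0 <= xi < w and 0 <= yi < h and anatomy_mask[yi, xi]:
                return xi, yi        # first intersection: gaze point 1020
    return None                      # the axis never meets the anatomy
```

Marching from the mask centroid in both directions covers the case where the shaft orientation is ambiguous; the first anatomy pixel hit along the axis plays the role of the gaze point 1020 in FIG. 10.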
Furthermore, the present invention may be provided in each of the following aspects:
(1) An information processing device that acquires a captured image of a surgery, determines a gaze point related to the surgery based on the captured image of the surgery, and outputs information based on the determined gaze point.
(2) The information processing device according to (1) above, which extracts a plurality of objects included in the captured image, determines axes of the extracted objects, and determines the gaze point from an intersection of the determined axes (an illustrative computation of this intersection is sketched after this list).
(3) The information processing device according to (2) above, wherein the objects are instruments related to the surgery.
(4) The information processing device according to (2) or (3) above, which inputs the captured image into a trained model, the trained model having been trained using captured images of surgery as input data and objects included in the captured images as output data, and which extracts the objects from the captured image by acquiring the objects output from the trained model.
(5) The information processing device according to (1) above, which extracts a first object and a second object included in the captured image, determines an axis of the extracted first object, and determines the gaze point from an intersection of the axis and the second object.
(6) The information processing device according to (5) above, wherein the first object is an instrument related to the surgery, and the second object is an anatomical structure.
(7) The information processing device according to any one of (1) to (6) above, which determines the gaze point related to the surgery based on the gaze point determined from the captured image and the gaze points of past captured images chronologically consecutive with the captured image (a minimal smoothing sketch follows after this list).
(8) The information processing device according to any one of (1) to (7) above, which outputs the information based on the determined gaze point by displaying information indicating the determined gaze point superimposed on the captured image (an illustrative overlay sketch follows after this list).
(9) The information processing device according to any one of (1) to (8) above, which outputs the information based on the determined gaze point by transmitting, to an imaging device that captures the captured image, control information for controlling the imaging device based on the determined gaze point.
(10) The information processing device according to (9) above, wherein the control information is information for controlling the imaging device so that the gaze point is included in the captured image.
(11) The information processing device according to (9) or (10) above, wherein the control information is information for controlling the imaging device so that the gaze point is in focus.
(12) The information processing device according to any one of (1) to (11) above, wherein the captured image of the surgery is a captured image of the surgical site during open surgery.
(13) The information processing device according to any one of (1) to (11) above, wherein the captured image of the surgery is a captured image of the surgical site during endoscopic surgery.
(14) An information processing method that acquires a captured image of a surgery, determines a gaze point related to the surgery based on the captured image of the surgery, and outputs information based on the determined gaze point.
(15) A program for causing a computer to function as the information processing device according to any one of (1) to (13) above.
Of course, the aspects are not limited to these.
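Aspect (2) determines the gaze point from the intersection of the axes of a plurality of objects. When more than two axes are involved, they generally have no exact common point; the sketch below, a hypothetical illustration and not the embodiment's stated computation, takes the least-squares point closest to all axes instead.

```python
# Illustrative sketch for aspect (2): the point closest (in the least-squares
# sense) to several instrument axes, each given as (point, unit direction).
import numpy as np

def closest_point_to_axes(axes):
    """axes: list of (p, d) pairs; p is a point on the axis, d its direction.

    Solves sum_i (I - d_i d_i^T) x = sum_i (I - d_i d_i^T) p_i, whose
    solution minimizes the summed squared distances from x to every axis.
    """
    A = np.zeros((2, 2))
    b = np.zeros(2)
    for p, d in axes:
        p = np.asarray(p, float)
        d = np.asarray(d, float)
        d = d / np.linalg.norm(d)
        proj = np.eye(2) - np.outer(d, d)   # projector onto the axis normal
        A += proj
        b += proj @ p
    return np.linalg.solve(A, b)            # least-squares "intersection"

# Two instrument axes crossing at (100, 50):
axes = [((0.0, 50.0), (1.0, 0.0)), ((100.0, 0.0), (0.0, 1.0))]
print(closest_point_to_axes(axes))          # -> [100.  50.]
```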
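Aspect (7) combines the gaze point of the current captured image with those of chronologically consecutive past images (cf. FIG. 9). The following is a minimal smoothing sketch; the window length and the recency weighting are illustrative assumptions.

```python
# Minimal sketch of aspect (7): smoothing the gaze point over consecutive
# frames. The window length and weights are assumptions, not values taken
# from the embodiment.
from collections import deque

import numpy as np

class GazeSmoother:
    """Combine the current frame's gaze point with those of recent frames."""

    def __init__(self, window: int = 5):
        self.history = deque(maxlen=window)  # most recent gaze points (x, y)

    def update(self, gaze_xy):
        self.history.append(np.asarray(gaze_xy, dtype=float))
        # Recency-weighted average: newer frames count more than older ones.
        weights = np.arange(1, len(self.history) + 1, dtype=float)
        pts = np.stack(list(self.history))
        smoothed = (pts * weights[:, None]).sum(axis=0) / weights.sum()
        return tuple(smoothed)
```

For example, constructing one `GazeSmoother` per video stream and calling `update()` once per frame yields a comprehensive gaze point that is stable against single-frame detection noise.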
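Aspect (8) superimposes information indicating the determined gaze point on the captured image. A minimal sketch follows; the use of OpenCV and the marker style are assumptions for illustration only.

```python
# Illustrative sketch of aspect (8): draw a marker at the gaze point.
import cv2  # OpenCV, assumed available in the environment

def draw_gaze_overlay(frame, gaze_xy):
    """Superimpose a cross marker for the determined gaze point on the frame."""
    x, y = int(gaze_xy[0]), int(gaze_xy[1])
    cv2.drawMarker(frame, (x, y), color=(0, 255, 0),
                   markerType=cv2.MARKER_CROSS, markerSize=40, thickness=2)
    return frame
```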
Finally, although various embodiments of the present invention have been described, they are presented as examples and are not intended to limit the scope of the invention. These novel embodiments can be carried out in various other forms, and various omissions, substitutions, and modifications can be made without departing from the gist of the invention. The embodiments and their modifications fall within the scope and gist of the invention, and likewise within the scope of the invention set forth in the claims and its equivalents.
100: Server device, 110: Client device, 120: Camera, 150: Network, 210: Control unit, 220: Memory unit, 230: Communication unit, 310: Control unit, 320: Memory unit, 330: Input unit, 340: Output unit, 350: Communication unit, 610: Hand, 620: Instrument, 630: Instrument, 640: Instrument, 710: Center of gravity, 720: Area of interest, 721: Area of interest, 722: Area of interest, 723: Area of interest, 730: Center of gravity, 731: Center of gravity, 732: Center of gravity, 733: Center of gravity, 735: Center of gravity, 1000: Information processing system
Claims (15)
1. An information processing device that acquires a captured image of a surgery, determines a gaze point related to the surgery based on the captured image of the surgery, and outputs information based on the determined gaze point.
2. The information processing device according to claim 1, wherein the information processing device extracts a plurality of objects included in the captured image, determines axes of the extracted objects, and determines the gaze point from an intersection of the determined axes.
3. The information processing device according to claim 2, wherein the objects are instruments related to the surgery.
4. The information processing device according to claim 2, wherein the information processing device inputs the captured image into a trained model, the trained model having been trained using captured images of surgery as input data and objects included in the captured images of surgery as output data, and extracts the objects from the captured image by acquiring the objects output from the trained model.
5. The information processing device according to claim 1, wherein the information processing device extracts a first object and a second object included in the captured image, determines an axis of the extracted first object, and determines the gaze point from an intersection of the axis and the second object.
6. The information processing device according to claim 5, wherein the first object is an instrument related to the surgery, and the second object is an anatomical structure.
7. The information processing device according to claim 1, wherein the information processing device determines the gaze point related to the surgery based on the gaze point determined from the captured image and the gaze points of past captured images chronologically consecutive with the captured image.
8. The information processing device according to claim 1, wherein the information processing device outputs the information based on the determined gaze point by displaying information indicating the determined gaze point superimposed on the captured image.
9. The information processing device according to claim 1, wherein the information processing device outputs the information based on the determined gaze point by transmitting, to an imaging device that captures the captured image, control information for controlling the imaging device based on the determined gaze point.
10. The information processing device according to claim 9, wherein the control information is information for controlling the imaging device so that the gaze point is included in the captured image.
11. The information processing device according to claim 9, wherein the control information is information for controlling the imaging device so that the gaze point is in focus.
12. The information processing device according to claim 1, wherein the captured image of the surgery is a captured image of the surgical site during open surgery.
13. The information processing device according to claim 1, wherein the captured image of the surgery is a captured image of the surgical site during endoscopic surgery.
14. An information processing method comprising: acquiring a captured image of a surgery; determining a gaze point related to the surgery based on the captured image of the surgery; and outputting information based on the determined gaze point.
15. A program for causing a computer to function as the information processing device according to any one of claims 1 to 13.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2024564388A JPWO2024128212A1 (en) | 2022-12-15 | 2023-12-12 | |
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2022-200600 | 2022-12-15 | | |
| JP2022200600 | 2022-12-15 | | |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2024128212A1 (en) | 2024-06-20 |
Family
ID=91485739
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/JP2023/044369 Ceased WO2024128212A1 (en) | | 2022-12-15 | 2023-12-12 |
Country Status (2)
| Country | Link |
|---|---|
| JP (1) | JPWO2024128212A1 (en) |
| WO (1) | WO2024128212A1 (en) |
Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20090245600A1 (en) * | 2008-03-28 | 2009-10-01 | Intuitive Surgical, Inc. | Automated panning and digital zooming for robotic surgical systems |
| WO2018159328A1 (en) * | 2017-02-28 | 2018-09-07 | ソニー株式会社 | Medical arm system, control device, and control method |
| WO2019087904A1 (en) * | 2017-11-01 | 2019-05-09 | ソニー株式会社 | Surgical arm system and surgical arm control system |
| JP2021013412A (en) * | 2019-07-10 | 2021-02-12 | ソニー株式会社 | Medical observation system, control device, and control method |
| WO2021193697A1 (en) * | 2020-03-26 | 2021-09-30 | 国立大学法人筑波大学 | Multi-viewpoint video filming device |
2023
- 2023-12-12 JP JP2024564388A patent/JPWO2024128212A1/ja active Pending
- 2023-12-12 WO PCT/JP2023/044369 patent/WO2024128212A1/en not_active Ceased
Also Published As
| Publication number | Publication date |
|---|---|
| JPWO2024128212A1 (en) | 2024-06-20 |
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| US12444094B2 (en) | Systems and methods for controlling surgical data overlay | |
| US10169535B2 (en) | Annotation of endoscopic video using gesture and voice commands | |
| JP5904812B2 (en) | Surgeon assistance for medical display | |
| JP7160033B2 (en) | Input control device, input control method, and surgical system | |
| JP6663571B2 (en) | Endoscope image processing apparatus, method of operating endoscope image processing apparatus, and program | |
| JP2005110878A (en) | Operation supporting system | |
| JPWO2016208246A1 (en) | Medical stereoscopic observation apparatus, medical stereoscopic observation method, and program | |
| EP3415076B1 (en) | Medical image processing device, system, method, and program | |
| JPWO2018096987A1 (en) | Information processing apparatus and method, and program | |
| CN113768619B (en) | Path positioning method, information display device, storage medium and integrated circuit chip | |
| US20230410491A1 (en) | Multi-view medical activity recognition systems and methods | |
| WO2023002661A1 (en) | Information processing system, information processing method, and program | |
| JP7521538B2 (en) | Medical imaging system, medical imaging method and image processing device | |
| JP7417337B2 (en) | Information processing system, information processing method and program | |
| WO2020152758A1 (en) | Endoscopic instrument and endoscopic system | |
| WO2024128212A1 (en) | Information processing device for outputting information based on point-of-gaze, and information processing method and program for outputting information based on point-of-gaze | |
| US12008682B2 (en) | Information processor, information processing method, and program image to determine a region of an operation target in a moving image | |
| JP7563384B2 (en) | Medical image processing device and medical image processing program | |
| Green et al. | Microanalysis of video from a robotic surgical procedure: implications for observational learning in the robotic environment | |
| WO2018087977A1 (en) | Information processing device, information processing method, and program | |
| JP7571722B2 (en) | Medical observation system and method, and medical observation device | |
| JP7480783B2 (en) | ENDOSCOPE SYSTEM, CONTROL DEVICE, AND CONTROL METHOD | |
| JP7451707B2 (en) | Control device, data log display method, and medical centralized control system | |
| WO2020049993A1 (en) | Image processing device, image processing method, and program | |
| US12290232B2 (en) | Enhanced video enabled software tools for medical environments |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 23903496; Country of ref document: EP; Kind code of ref document: A1 |
| | WWE | Wipo information: entry into national phase | Ref document number: 2024564388; Country of ref document: JP |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 23903496; Country of ref document: EP; Kind code of ref document: A1 |