WO2024157360A1 - Treatment instrument detection device for endoscopic images, treatment instrument detection method for endoscopic images, and treatment instrument detection device program for endoscopic images
- Publication number
- WO2024157360A1 (PCT/JP2023/002125)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- image
- treatment tool
- unit
- color
- surgical
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B1/00—Instruments for performing medical examinations of the interior of cavities or tubes of the body by visual or photographical inspection, e.g. endoscopes; Illuminating arrangements therefor
- A61B1/04—Instruments for performing medical examinations of the interior of cavities or tubes of the body by visual or photographical inspection, e.g. endoscopes; Illuminating arrangements therefor combined with photographic or television appliances
- A61B1/045—Control thereof
Definitions
- the present invention relates to a treatment tool detection device for endoscopic images, a treatment tool detection method for endoscopic images, and a treatment tool detection device program for endoscopic images, and in particular to a treatment tool detection device, method, and program for endoscopic images that complements treatment tool recognition using artificial intelligence in surgical support for laparoscopic surgery.
- Patent Document 1 discloses a method of detecting a treatment tool by attaching a marker to its tip.
- However, when retrofitting a marker to an existing treatment tool, it is necessary to consider sterilization of the marker and the risk of the marker falling off. For this reason, there are problems such as popularizing marker-equipped or marker-compatible treatment tools requiring responsiveness from treatment tool manufacturers.
- Furthermore, the method of attaching a marker involves issues such as the possibility of interfering with operation of the treatment tool.
- Patent Document 2 discloses a method for detecting treatment tools in endoscopic images by combining a support vector machine (SVM), a type of machine learning, with a mathematical model.
- The machine learning used in the method disclosed in Patent Document 2 is a support vector machine, which recognizes only predefined features. Recognizing a single treatment tool therefore requires learning a very large number of features, covering, for example, changes in the tool's texture over the course of surgery, partial occlusion of the tool, and the presence of other treatment tools sharing the features used for learning. Moreover, even if this configuration were simply replaced with a method capable of recognition under more complex conditions, such as deep learning, it would remain difficult to recognize multiple treatment tools of the same type as individual tools when they appear close to each other in an endoscopic image. Furthermore, accurately tracking the three-dimensional position of the treatment tool tip during surgery requires extracting the tip point in greater detail.
- In view of these problems, an invention has been filed that aims to provide a device for detecting the tip of a treatment tool in an endoscopic image, a method for detecting the tip of a treatment tool in an endoscopic image, and a program for detecting the tip of a treatment tool in an endoscopic image, which can appropriately detect the position of the tip point of a treatment tool even in a complex environment where multiple treatment tools of the same type are included in an endoscopic image and are close to each other (see Patent Document 3).
- Patent Document 3 relates to a device for detecting the tip of a treatment tool in an endoscopic image. The device includes an image acquisition unit that acquires a surgical image, an image recognition unit that performs image recognition of the surgical image using a pre-trained deep learning model, a treatment tool extraction unit that generates a segmentation mask and a bounding box of the treatment tool in the surgical image, and a tip detection unit that detects the tip of the treatment tool using the generated segmentation mask and bounding box. The tip detection unit includes a root identification unit that compares each of the four sides of the image edge region present in the surgical image with the segmentation mask and identifies as the root the side whose coverage area is largest, an end position detection unit that detects the end portion of the treatment tool, an angle detection unit that detects the angle of the treatment tool, and a tip point detection unit that detects the tip point of the treatment tool.
- Patent Document 1: JP 2017-164007 A
- Patent Document 2: JP 2016-506260 A (Japanese national-phase publication)
- Patent Document 3: Japanese Patent Application No. 2021-135936
- The method described in Patent Document 3 detects the tip point based on the segmentation mask and bounding box of the treatment tool present in the acquired image. However, obstacles appearing in the frame can prevent the segmentation mask and bounding box from contacting the edge region of the image, which is thought to affect detection of the tip point.
- In particular, when the treatment tool appears in the frame together with a trocar (a cylindrical instrument for introducing the treatment tool into the abdominal cavity), the trocar portion is not recognized, so the segmentation mask and bounding box cannot be acquired accurately, degrading tip detection performance.
- In view of these problems, the present invention aims to provide a device for detecting the tip of a treatment tool in an endoscopic image, a method for detecting the tip of a treatment tool in an endoscopic image, and a program for detecting the tip of a treatment tool in an endoscopic image, which can quickly and appropriately detect the position of the tip point of a treatment tool even in situations where an obstacle appears in the surgical image.
- The endoscopic image treatment tool detection device of the embodiment includes an image acquisition unit that acquires a surgical image, an image recognition unit that performs image recognition of the surgical image using a pre-trained deep learning model, a treatment tool extraction unit that generates a treatment tool area in the surgical image consisting of a segmentation mask and a bounding box of the treatment tool image, and a treatment tool area completion unit that completes treatment tool areas missing due to an obstacle present in the surgical image.
- the treatment tool area completion unit includes an image completion unit that performs color processing on the surgical image to generate a color-processed image and completes the treatment tool area using the color-processed image.
- When generating the color-processed image, the image completion unit includes a color reduction unit that reduces a predetermined color from the surgical image to generate a color-reduced image, a binarization unit that binarizes the color-reduced image, an extraction unit that deletes areas unrelated to the target treatment tool present in the surgical image and extracts the treatment tool image from the color-processed image, a combination unit that generates a combined image by combining the treatment tool image acquired from the color-processed image with the inferred treatment tool image acquired through the deep learning model, and an expansion unit that generates a bounding box according to the combined image.
- The treatment tool area completion unit of the endoscopic image treatment tool detection device may include a rectangle completion unit that generates the minimum rectangle surrounding the treatment tool area, obtained by image processing of the treatment tool segmentation mask output by the treatment tool extraction unit, and performs completion using that minimum rectangle.
- The rectangle completion unit may include a tip determination unit that detects the tip of the treatment tool from the treatment tool segmentation mask output by the treatment tool extraction unit and determines whether or not there is an opening at the tip, a direction detection unit that detects the extension direction of the treatment tool from that segmentation mask, a brightness detection unit that detects the brightness around the treatment tool in the surgical image and detects the extension direction of the treatment tool based on the inference mask from the brightness level, an extension unit that extends the inference mask in the extension direction to generate an extended treatment tool image, a combination unit that generates a combined image by combining the extended treatment tool image with the inferred treatment tool image obtained through the deep learning model, and an expansion unit that generates the bounding box according to the combined image.
- The treatment tool area completion unit may include both an image completion unit and a rectangle completion unit, and may selectively output either of their outputs.
- The image completion unit of the endoscopic image treatment tool detection device may be configured to remove background areas from the color-processed image using a brightness-adjusted image generated by adjusting the brightness of the surgical image.
- The image completion unit of the endoscopic image treatment tool detection device may identify the treatment tool image in the color-processed image by comparing it with the inferred treatment tool image.
- The image completion unit of the endoscopic image treatment tool detection device may reduce the image size of the surgical image to shorten the time required for image processing.
- The device for detecting the tip of a treatment tool in an endoscopic image of an embodiment includes an image acquisition unit that acquires a surgical image, an image recognition unit that performs image recognition of the surgical image using a pre-trained deep learning model, a treatment tool extraction unit that generates a segmentation mask and a bounding box of the treatment tool in the surgical image, a treatment tool area completion unit that completes treatment tool areas missing due to obstacles present in the surgical image, and a tip detection unit that detects the tip of the treatment tool using the generated segmentation mask and bounding box. The tip detection unit is characterized by including a root identification unit that compares each of the four sides of the image edge area present in the surgical image with the segmentation mask and identifies as the root the side whose coverage area is largest, an end position detection unit that detects the end portion of the treatment tool, an angle detection unit that detects the angle of the treatment tool, and a tip point detection unit that detects the tip point of the treatment tool.
- The endoscopic image treatment tool detection device of the present invention includes an image acquisition unit that acquires a surgical image, an image recognition unit that performs image recognition of the surgical image using a pre-trained deep learning model, a treatment tool extraction unit that generates a treatment tool area in the surgical image consisting of a segmentation mask and a bounding box of the treatment tool image, and a treatment tool area completion unit that completes treatment tool areas missing due to an obstacle present in the surgical image. The treatment tool area completion unit is provided with an image completion unit that performs color processing on the surgical image to generate a color-processed image and completes the treatment tool area using the color-processed image. When generating the color-processed image, the image completion unit is provided with a color reduction unit that reduces a predetermined color from the surgical image to generate a color-reduced image.
- The device further includes a binarization unit that binarizes the color-reduced image, an extraction unit that removes areas in the surgical image unrelated to the treatment tool of interest and extracts the treatment tool image from the color-processed image, a combination unit that combines the treatment tool image obtained from the color-processed image with the inferred treatment tool image obtained through the deep learning model to generate a combined image, and an expansion unit that generates a bounding box according to the combined image.
- The device for detecting the tip of a treatment tool in an endoscopic image of the present invention includes an image acquisition unit for acquiring a surgical image, an image recognition unit for performing image recognition of the surgical image using a pre-trained deep learning model, a treatment tool extraction unit for generating a segmentation mask and a bounding box of the treatment tool in the surgical image, a treatment tool area completion unit for completing a treatment tool area missing due to an obstacle present in the surgical image, and a tip detection unit for detecting the tip of the treatment tool using the generated segmentation mask and bounding box.
- The tip detection unit is equipped with a root identification unit that compares each of the four sides of the image edge region present in the surgical image with the segmentation mask and identifies as the root the side whose coverage area is largest, an end position detection unit that detects the end portion of the treatment tool, an angle detection unit that detects the angle of the treatment tool, and a tip point detection unit that detects the tip point of the treatment tool.
- FIG. 1 is a schematic diagram showing an example of the overall configuration of a treatment tool detection device for endoscopic images according to an embodiment of the present invention.
- FIG. 2 is a diagram illustrating an image displayed on the endoscope display shown in FIG. 1.
- FIG. 3 is a block diagram showing the configuration of the treatment tool detection device for endoscopic images shown in FIG. 1.
- FIG. 4 is a first block diagram showing the configuration of functional units in the computer of the treatment tool detection device for endoscopic images.
- FIG. 5 is a second block diagram showing the configuration of functional units in the computer of the treatment tool detection device for endoscopic images.
- FIG. 6 is a schematic diagram illustrating a segmentation mask and a bounding box.
- FIGS. 7(A) and 7(B) are diagrams explaining how the root identification unit determines the side on which the treatment tool appears: FIG. 7(A) is a schematic diagram of the entire inference image, and FIG. 7(B) is an enlarged schematic diagram of the root of the treatment tool.
- FIGS. 8(A) to 8(C) are first to third schematic diagrams showing generation of a color-processed image in the color reduction unit and the binarization unit.
- FIGS. 9(A) and 9(B) are first and second schematic diagrams showing generation of a color-processed image in the extraction unit.
- FIGS. 10(A) and 10(B) are first and second schematic diagrams showing generation of a complementary image in the combination unit and the expansion unit.
- FIG. 11 is a flow diagram of the process of adjusting the brightness of a surgical image and deleting the background image in the image completion unit.
- FIG. 12 is a flowchart showing the process of identifying the treatment tool image in the color-processed image in the image completion unit.
- FIGS. 13(A) to 13(C) are schematic diagrams showing image size compression in the image completion unit at actual size, 1/4 size, and 1/8 size, respectively.
- FIGS. 14(A) and 14(B) are first and second schematic diagrams showing detection of an opening of a treatment tool by the tip determination unit.
- FIGS. 15(A) and 15(B) are schematic diagrams showing a first detection example of the extension direction of the treatment tool in the direction detection unit.
- FIG. 16 is a schematic diagram showing a second detection example of the extension direction of the treatment tool in the brightness detection unit.
- FIGS. 17(A) and 17(B) are schematic diagrams showing a third detection example of the extension direction of the treatment tool in the direction detection unit.
- FIGS. 18(A) and 18(B) are schematic diagrams showing generation of a complementary rectangle in the extension unit and the combination unit.
- FIG. 19 is a flowchart illustrating a treatment tool detection method for endoscopic images according to an embodiment of the present invention.
- FIG. 20 is a flowchart of the tip detection unit.
- FIGS. 21(A) and 21(B) are first and second flowcharts in the treatment tool area completion unit.
- FIG. 22 is a flowchart of the image completion unit.
- FIG. 23 is a flowchart of the rectangle completion unit.
- FIG. 1 shows an example of the overall configuration of an endoscopic image treatment tool detection device 10 (hereinafter referred to as the "treatment tool detection device") of an embodiment.
- the endoscopic image treatment tool detection device 10 also functions as a treatment tool tip detection device for endoscopic images.
- In the following description, the treatment tool tip detection device is treated as included in the treatment tool detection device 10.
- the treatment tool detection device 10 shown in FIG. 1 is composed of an endoscope 103, treatment tools 101 and 102, a processing device 105, and an endoscope display 106.
- the endoscope 103 and the treatment tools 101 and 102 are inserted into the abdominal cavity of a patient 107.
- the endoscope 103 inserted into the abdominal cavity of the patient 107 photographs an affected area 104, and the image obtained by photographing is transmitted to the processing device 105.
- the image transmitted from the endoscope 103 to the processing device 105 is displayed on the endoscope display 106.
- the treatment tools 101 and 102 are, for example, grasping forceps, high-frequency knives, hemostatic clips, etc. Furthermore, in addition to the aforementioned forceps, knives, etc., the treatment tools 101 and 102 also include a trocar 108 that is used when introducing the treatment tools into the abdominal cavity of the patient 107.
- the treatment tools 101 and 102 are each housed inside the sheath of the trocar 108 (see the two-dot chain line), and the tips of the treatment tools 101 and 102 are exposed from the trocar 108.
- the endoscope 103 has an image sensor and a light source (not shown), and the light source of the endoscope 103 irradiates light into the abdominal cavity of the patient 107, and the image sensor captures an image of the affected area 104, generating image data.
- the processing device 105 is an endoscope system that includes various electronic computers (computing resources, so-called computers), such as a personal computer (PC), a mainframe, a workstation, a cloud computing system, etc.
- Figure 2 shows an image captured by the endoscope 103 shown in Figure 1 and displayed on the endoscope display 106.
- the image shown in Figure 2 includes treatment tools 101 and 102 and an affected area 104.
- the treatment tools 101 and 102 shown in Figure 2 are the same type of grasping forceps.
- Treatment tool 101 represents a state in which the tip of the grasping forceps is open
- treatment tool 102 represents a state in which the tip of the grasping forceps is closed.
- treatment tools 101 and 102 are also housed inside the sheath of trocar 108 (see dashed double-dashed line), and the tips of treatment tools 101 and 102 are exposed from trocar 108.
- the treatment tool detection device for endoscopic images (treatment tool tip detection device for endoscopic images) 10 of the embodiment recognizes a treatment tool in an endoscopic image captured by an endoscope and detects the tip of the recognized treatment tool.
- a high-frequency knife has one tip, so the treatment tool tip detection device of the embodiment detects one tip for the high-frequency knife.
- Grasping forceps have one tip when the gripper at the tip is closed. When the gripper is open there are actually two tips, but even then, depending on the orientation of the grasping forceps relative to the endoscope, the tip may appear in the endoscopic image as either two points or one.
- the treatment tool tip detection device of the embodiment can detect treatment tools in endoscopic images regardless of the type, number, state, or orientation of the treatment tool. It can also detect treatment tools in the screen, including trocars.
- the block diagram in FIG. 3 is an example of the configuration of the processing device 105 shown in FIG. 1.
- the processing device 105 in FIG. 3 is provided with a CPU 111 for executing various calculations, a ROM 112 for storing processing programs, a RAM 113 for storing data, etc., a storage unit 114 for storing various data and observation results, etc., and an I/O (input/output interface) 115, etc.
- the I/O 115 is an interface for communication (transmission and reception), a buffer, etc.
- the I/O 115 is used for transmitting and receiving image data to and from the endoscope 103, transmitting and receiving image data to and from the endoscope display 106, etc., and works in conjunction with the CPU 111.
- the RAM 113 is used for temporarily expanding the processing programs, image data imported from the I/O 115, etc.
- a GPU specialized for image processing may be provided.
- The storage unit 114 stores the deep learning model used by the image recognition unit 160 (described below) when performing image recognition.
- When each functional unit of the CPU 111 is realized by software, the CPU 111 does so by executing the instructions of a program, namely the software that implements each function.
- each functional unit is an image acquisition unit 150, an image recognition unit 160, a treatment tool extraction unit 170, a tip detection unit 180, and a treatment tool area completion unit 200, and the tip detection unit 180 is equipped with a root identification unit 210, an end position detection unit 220, an angle detection unit 230, and a tip point detection unit 240.
- the treatment tool area completion unit 200 is equipped with an image completion unit 250 and a rectangle completion unit 260.
- the image completion unit 250 is equipped with a color reduction unit 251, a binarization unit 252, an extraction unit 253, a combination unit 254, and an expansion unit 255.
- the rectangle completion unit 260 includes a tip determination unit 261, a direction detection unit 262, a brightness detection unit 263, an extension unit 264, a combination unit 265, and an expansion unit 266. Each functional unit will be described below with reference to the drawings.
- the image acquisition unit 150 acquires surgical images.
- the surgical images are images captured by the endoscope 103 shown in FIG. 1 and displayed on the endoscope display 106.
- the image recognition unit 160 performs image recognition of surgical images using a pre-trained deep learning model.
- the deep learning model used by the image recognition unit 160 of the treatment tool detection device 10 of the embodiment when performing image recognition of a surgical image is a model that has been trained in advance based on an instance segmentation technique and is stored in the storage unit 114.
- This deep learning model is a Convolutional Neural Network (CNN) that is constructed by adding annotations to image data of a surgical image captured in advance and using the annotated image data.
- the deep learning model is constructed using at least image data that has annotations added to the treatment tool within the image data captured in advance.
- the annotation here refers to the creation of training data for the treatment tool when using the deep learning model (machine learning model).
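- As a concrete illustration of this stage, the sketch below runs a generic instance-segmentation model in Python. It is a minimal sketch only: the COCO-pretrained Mask R-CNN from torchvision stands in for the patent's surgically trained CNN, and the score threshold and function name are illustrative assumptions.

```python
import torch
from torchvision.models.detection import (
    MaskRCNN_ResNet50_FPN_Weights,
    maskrcnn_resnet50_fpn,
)

# COCO-pretrained stand-in for the patent's surgically trained model.
weights = MaskRCNN_ResNet50_FPN_Weights.DEFAULT
model = maskrcnn_resnet50_fpn(weights=weights).eval()

def infer_instances(image_chw: torch.Tensor, score_thresh: float = 0.5):
    """Instance segmentation: per-instance masks and boxes for one image.

    image_chw: float tensor of shape (3, H, W) with values in [0, 1].
    """
    with torch.no_grad():
        out = model([image_chw])[0]  # dict with 'boxes', 'labels', 'scores', 'masks'
    keep = out["scores"] > score_thresh
    masks = out["masks"][keep][:, 0] > 0.5  # (N, H, W) boolean segmentation masks
    boxes = out["boxes"][keep]              # (N, 4) as (x_min, y_min, x_max, y_max)
    return masks, boxes
```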
- The treatment tool extraction unit 170 generates a treatment tool area in the surgical image, composed of a segmentation mask and a bounding box.
- the tip detection unit 180 detects the tip of the treatment tool using the generated segmentation mask and the generated bounding box.
- the treatment tool extraction unit 170 extracts a segmentation mask 402 and a bounding box 403 of a treatment tool 401 contained in the image data from the inference result of instance segmentation by the image recognition unit 160.
- instance segmentation is a method of detecting individual segmentation masks and bounding boxes for multiple objects present in the image data.
- the treatment tool segmentation mask is the pixel-unit area of the treatment tool in the endoscopic image, and is the area indicated by diagonal lines in Figure 6.
- the bounding box is rectangular data surrounded by the minimum x value, minimum y value, maximum x value, and maximum y value of the treatment tool segmentation mask, and is the rectangle indicated by dashed lines in Figure 6, and the four sides of the bounding box are parallel to the four sides of the endoscopic image.
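- A minimal sketch of this bounding box rule, assuming the segmentation mask is available as a boolean NumPy array:

```python
import numpy as np

def bounding_box(mask: np.ndarray):
    """Axis-aligned bounding box (x_min, y_min, x_max, y_max) of a boolean
    segmentation mask: the extreme x and y values of the mask pixels, with
    the four sides parallel to the image sides, as described for FIG. 6."""
    ys, xs = np.nonzero(mask)
    if xs.size == 0:
        return None  # empty mask: no treatment tool pixels
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())
```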
- the treatment tool area complementing unit 200 complements missing treatment tool areas caused by obstacles present in the surgical image. In surgical images, various tissues or other treatment tools may overlap the treatment tool as obstacles. Therefore, the treatment tool area complementing unit 200 complements the treatment tool area that is invisible due to these obstacles through image processing, enabling proper detection of the tip of the treatment tool.
- The root identification unit 210 compares each of the four sides of the image edge region present in the surgical image with the segmentation mask, and identifies as the root the side whose coverage area is largest.
- Figure 7 shows an inference image 60 including a treatment tool 601.
- an image edge line 602 is set on the inside at a distance r from the four sides ABCD of the inference image 60, and the area between the image edge line 602 and the outer frame line of the inference image 60 is set as the image edge 603.
- the side that covers the largest area between the image edge 603 and the treatment tool 601 is determined to be the side on which the treatment tool 601 appears.
- FIG. 7(B) is a schematic diagram showing an enlarged view of the root of the treatment tool 601 in FIG. 7(A).
- a perpendicular line L1 is drawn from the corner closest to side C of the image edge line 602 to side C.
- a perpendicular line L2 is drawn from the corner closest to side D of the image edge line 602 to side D.
- the area of a region 604 of the treatment tool 601 surrounded by the image edge line 602 and the perpendicular line L1 is compared with the area of a region 605 surrounded by the image edge line 602 and the perpendicular line L2.
- the area of region 605 is larger than the area of region 604, and therefore it is determined that the treatment tool 601 is emerging from side D.
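- The edge-coverage rule of FIGS. 7(A) and 7(B) can be sketched as follows. The side labels top/bottom/left/right stand in for the patent's sides A to D (their correspondence is not specified here), and the corner disambiguation of FIG. 7(B) is omitted:

```python
import numpy as np

def identify_root_side(mask: np.ndarray, r: int):
    """Pick the side whose edge strip of width r (the image edge region)
    covers the most segmentation-mask pixels; that side is the root."""
    h, w = mask.shape
    strips = {
        "top":    mask[:r, :],
        "bottom": mask[h - r:, :],
        "left":   mask[:, :r],
        "right":  mask[:, w - r:],
    }
    coverage = {side: int(s.sum()) for side, s in strips.items()}
    return max(coverage, key=coverage.get), coverage
```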
- the end position detection unit 220 detects the end portion of the treatment tool.
- the end position detection unit 220 detects the rearmost point of the treatment tool extending from the base toward the tip point of the treatment tool in the endoscopic image, i.e., the rear end point, as the end position.
- In addition, among the corners of the treatment tool's bounding box, the corner closest to the tip of the treatment tool (the tip corner) and the corner furthest from it (the rear end corner) are detected.
- the angle detection unit 230 detects the angle that the treatment tool in the endoscopic image makes with one of the four sides surrounding the endoscopic image as the angle of the treatment tool.
- Methods by which the angle detection unit 230 detects the angle of the treatment tool include a method of detection using the rear end point detected by the end position detection unit 220, a method of detection using the tip angle and rear end angle detected by the end position detection unit 220, and a method of detection without using any of the rear end point, tip angle, or rear end angle. Note that in the embodiment, these methods are applied regardless of the number of tips of the treatment tool.
- The tip point detection unit 240 detects the tip point of the treatment tool in the endoscopic image. For example, the point on the treatment tool's segmentation mask farthest from the rear end point of the treatment tool is detected as the tip point, or the point on the segmentation mask closest to the tip corner of the treatment tool is detected as the tip point; these methods apply only when the treatment tool has one tip. Alternatively, the intersection of the treatment tool's bounding box with a straight line that passes through the center of gravity of the segmentation mask and has the angle of the treatment tool as its inclination is detected as the tip point. These detection methods are only examples; it is also possible, for instance, to set a determination circle at the tip of the treatment tool and determine the angle of the arc portion overlapping each tip point.
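- A minimal sketch of the first of these methods (the mask pixel farthest from the rear end point; single-tip tools only), assuming a boolean NumPy mask and a rear end point in (x, y) pixel coordinates:

```python
import numpy as np

def tip_point_farthest(mask: np.ndarray, rear_end: tuple):
    """Return the mask pixel farthest from the rear end point as the tip."""
    ys, xs = np.nonzero(mask)
    rx, ry = rear_end
    d2 = (xs - rx) ** 2 + (ys - ry) ** 2  # squared distances suffice for argmax
    i = int(np.argmax(d2))
    return int(xs[i]), int(ys[i])
```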
- the image completion unit 250 performs color processing on the surgical image to generate a color-processed image, and uses the color-processed image to complete the treatment tool area.
- the image completion unit 250 performs the following processes in an integrated manner:
- the color reduction unit 251 reduces a specific color from the surgical image to generate a color-reduced image. This process is explained in FIG. 8.
- the binarization unit 252 binarizes the color-reduced image. This process is explained in FIG. 8.
- the extraction unit 253 removes areas that are unrelated to the treatment tool in question that are present in the surgical image, and extracts the treatment tool image from the color-processed image. This process is explained in FIG. 9.
- the combination unit 254 combines the inferred treatment tool image acquired through the deep learning model with the treatment tool image acquired from the color-processed image to generate a combined image 125 (see FIG. 10). This process is described in FIG. 10.
- the expansion unit 255 generates a bounding box for the combined image. This process is explained in FIG. 10.
- the image completion unit 250 also adjusts the brightness of the surgical image to remove background areas in the color-processed image.
- In the treatment tool area completion unit 200 of the embodiment, the rectangle completion unit 260 generates the smallest rectangle that encloses the treatment tool area for the treatment tool image in the color-processed image, and performs completion using that smallest rectangle.
- the rectangle completion unit 260 performs the following processes in an integrated manner.
- the tip determination unit 261 detects the tip of the treatment tool using an inference mask and determines whether or not there is an opening at the tip. This process is explained in Figure 14.
- the direction detection unit 262 detects the extension direction of the treatment tool from the inference mask. This process is explained in Figure 15.
- the brightness detection unit 263 detects the brightness around the treatment tool area in the endoscopic image and detects the extension direction of the treatment tool based on the brightness level. This process is explained in FIG. 16.
- the extension unit 264 extends the inference mask in the extension direction to generate an extended treatment tool image. This process is explained in FIG. 18.
- the combination unit 265 generates a combined image by combining the inferred treatment tool image acquired through the deep learning model with the extended treatment tool image. This process is explained in FIG. 18.
- the expansion unit 266 expands the bounding box to fit the combined image. This process is explained in FIG. 18.
- the inference mask referred to here is an image (data) that has been processed to separate the area in the endoscopic image where the treatment tool is present from unrelated areas and to cover (mask) the area where the treatment tool is present.
- the minimum rectangle referred to here is a rectangle that includes all points inside the treatment tool and is sized and oriented so as to have the smallest area.
- The generation of a color-processed image is illustrated in the schematic diagrams of FIGS. 8 and 9.
- the schematic diagrams of Figures 8 and 9 show an overview of the processing of a surgical image acquired through the image acquisition unit 150 and image-recognized by the image recognition unit 160.
- the processing is in the order of Figures 8(A), (B), (C), and then Figures 9(A) and (B).
- Figure 8 (A) shows a surgical image 121, which is an image of the abdominal cavity captured by a camera (not shown) connected to the endoscope 103.
- A color-reduced image is generated by taking the difference between two of the three elements R (red), G (green), and B (blue) that make up each pixel of the surgical image; for example, G (green) is subtracted from R (red).
- the brightness of the color-reduced image represents the level of saturation in the surgical image.
- Figure 8 (B) shows a color-reduced image 122 after color reduction.
- FIG. 8(C) is a binarized image (binarized image 123).
- a binarized image is generated by outputting 1 for pixels below a predetermined threshold and 0 for other pixels. As a result, areas with low saturation in the surgical image are extracted as treatment tool areas.
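- A minimal sketch of the color reduction and binarization steps described above, assuming OpenCV; the threshold value and file name are illustrative assumptions, not values from the patent:

```python
import cv2

def color_reduce_and_binarize(path: str = "surgical_image.png", thresh: int = 30):
    img_bgr = cv2.imread(path)
    b, g, r = cv2.split(img_bgr)
    # Color reduction: subtract G from R (the example above); cv2.subtract
    # saturates at 0, so gray, low-saturation pixels come out dark.
    reduced = cv2.subtract(r, g)  # color-reduced image 122
    # Binarization: 1 for pixels at or below the threshold (low saturation,
    # i.e. treatment tool candidates), 0 for the rest.
    _, binary = cv2.threshold(reduced, thresh, 1, cv2.THRESH_BINARY_INV)
    return reduced, binary  # binarized image 123
```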
- FIG. 9(A) shows a method for removing areas unrelated to the target treatment tool from a binarized image 123 (FIG. 8(C)).
- the treatment tool area 123a in the figure is a segmentation mask obtained through a deep learning model.
- Areas unrelated to the target treatment tool are removed to generate a color-processed image 124 (treatment tool mask image).
- the schematic diagram in Figure 10 (A) shows an inferred treatment tool image obtained by image recognition of a surgical image using a deep learning model.
- The inference mask 126 does not include the portion of the treatment tool housed inside the trocar. It is therefore unclear which of the four sides of the image edge area the treatment tool contacts (from which side it emerges). The treatment tool mask 127 in the above-mentioned color-processed image 124 is therefore combined with it, making clear from which side of the image the treatment tool emerges.
- the bounding box Bx1 corresponding to the initial inference mask 126 is expanded to the bounding box Bx2 of the combined image 125.
- a color-reduced image 122 is generated from the above-mentioned surgical image 121, and a binary image 123 is then generated.
- a grayscale image 122g is generated from the surgical image 121, and a binary image 123g is generated from the grayscale image 122g.
- Using the binarized image 123 and the binarized image 123g, the image of the bright parts of the biological tissue other than the treatment tool is deleted, and a corrected image (brightness-adjusted image 128) is generated in which only the treatment tool appears clearly. This is performed to prevent the bright parts of the biological tissue from being mistakenly recognized as the treatment tool.
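- One plausible reading of this combination, sketched under the assumption that the bright tissue found in the grayscale-based binarized image 123g is simply removed from the saturation-based binarized image 123; the brightness threshold is an illustrative assumption:

```python
import cv2
import numpy as np

def remove_bright_tissue(img_bgr: np.ndarray, binary: np.ndarray,
                         bright_thresh: int = 200) -> np.ndarray:
    gray = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2GRAY)  # grayscale image 122g
    _, bright = cv2.threshold(gray, bright_thresh, 1, cv2.THRESH_BINARY)  # 123g
    # Zero out bright biological tissue in the saturation-based mask.
    return np.where(bright.astype(bool), 0, binary)  # brightness-adjusted image 128
```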
- the treatment tool image in the color-processed image is identified by comparing the treatment tool image with the inferred treatment tool image.
- This process is shown as a flow diagram in FIG. 12.
- a color-reduced image 122 is generated from the above-mentioned surgical image 121, and then a binarized image 123 is generated.
- An inferred image 129p is generated from a segmentation mask obtained through the deep learning model; it includes all treatment tool areas contained in the surgical image 121. Since each treatment tool is distinguished in the inference result of instance segmentation, the area of the target treatment tool is deleted from the inferred image 129p to generate a non-target treatment tool image 129.
- the area that overlaps with the non-target treatment tool image 129 is deleted from the binarized image 123, and a corrected image (overlap adjusted image 130) is generated in which treatment tools other than the target treatment tool have been deleted.
- The image completion unit 250 of the embodiment reduces the image size of the surgical image to shorten the time required for image processing. Specifically, the image size is reduced for the binarized image 123 generated from the surgical image 121 via the color-reduced image 122.
- Figure 13 is a schematic diagram of an example of image size reduction.
- Figure 13(A) is a schematic diagram of the binarized image 123 at "actual size”.
- Figure 13(B) is a schematic diagram of the binarized image 123 in (A) with one side at "1/4 size”.
- Figure 13(C) is a schematic diagram of the binarized image 123 in (A) with one side at "1/8 size".
- the binarized image with reduced image size is used as the binarized image for each process in Figures 8 to 12 described above.
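- A minimal sketch of this size reduction, assuming OpenCV; nearest-neighbour interpolation keeps the reduced mask strictly binary:

```python
import cv2
import numpy as np

def shrink_binary(binary: np.ndarray, factor: int) -> np.ndarray:
    """Reduce each side of the binarized image by `factor` (4 or 8 in FIG. 13)."""
    h, w = binary.shape
    return cv2.resize(binary, (w // factor, h // factor),
                      interpolation=cv2.INTER_NEAREST)
```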
- the tip determination unit 261 detects the tip of the treatment tool from the treatment tool segmentation mask, which is the output of the treatment tool extraction unit, and determines whether or not there is an opening at the tip.
- the schematic diagram in FIG. 14(A) shows a rectangular region 138 cut out as the smallest rectangle that encompasses the treatment tool region, with the long axis direction of the rectangle aligned horizontally.
- the treatment tool in rectangular region 138 has its tip open on the right side of the page. It is shaped like a sideways V.
- the white parts in the diagram correspond to the treatment tools.
- the direction detection unit 262 determines the extension direction of the treatment tool based on the shape of the treatment tool segmentation mask, which is the output of the treatment tool extraction unit.
- the schematic diagram in Figure 15 (A) shows a segmentation mask of a treatment tool whose tip opens and closes.
- the treatment tool in rectangular area 138 has its tip open to the right of the page. It is shaped like a horizontal V.
- the white part in the figure corresponds to the treatment tool.
- the root part (O) of the horizontal V-shape and the ends (P) and (Q) that spread out from the root part are defined.
- the vector (OP) of the root part (O) and the end part (P) and the vector (OQ) of the root part (O) and the end part (Q) are calculated.
- the directional vector (v) indicating the extension direction of the treatment tool is the opposite of the combined direction of vectors (OP) and (OQ). In other words, it is to the left of the page.
- the schematic diagram in Figure 15 (B) shows a segmentation mask for a treatment tool whose tip does not open or close, with the white part in the diagram corresponding to the treatment tool.
- The areas (numbers of pixels) of the left and right parts of the segmentation mask are compared, taking the center (Oi) of the cut-out smallest rectangle as the boundary.
- the direction with the larger area is determined to be the extension direction of the treatment tool. This is based on the structure of the treatment tool, in which the shaft of the treatment tool is thicker than the tip.
- the widest position of the segmentation mask becomes the origin (O), and the intersection of the straight line extended from the origin (O) in the extension direction of the treatment tool determined earlier and the smallest rectangle becomes the end (R).
- the directional vector (v) indicating the extension direction of the treatment tool becomes the vector (OR).
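- Both direction-detection cases of FIG. 15 can be sketched as follows; the closed-tip case assumes, as in the figures, that the mask has already been cut out along its minimum rectangle with the long axis horizontal:

```python
import numpy as np

def extension_direction_open(O, P, Q):
    """Open tip (FIG. 15(A)): v is the reverse of the sum of the vectors
    from the root O of the sideways V to its ends P and Q."""
    OP = np.asarray(P, float) - np.asarray(O, float)
    OQ = np.asarray(Q, float) - np.asarray(O, float)
    v = -(OP + OQ)
    return v / np.linalg.norm(v)

def extension_direction_closed(mask: np.ndarray):
    """Closed tip (FIG. 15(B)): compare mask areas on either side of the
    rectangle's centre; the larger side (the thicker shaft) is the
    extension direction."""
    cx = mask.shape[1] // 2
    left, right = mask[:, :cx].sum(), mask[:, cx:].sum()
    return np.array([-1.0, 0.0]) if left >= right else np.array([1.0, 0.0])
```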
- The treatment tool area completion unit 200 includes a brightness detection unit 263 that detects the brightness around the treatment tool area in the surgical image and detects the extension direction of the treatment tool from the brightness level.
- a straight line is drawn in the long axis direction of the rectangular area, and points on the line near the outside of the treatment tool mask are designated as X and Y (to make the position of the points easier to understand, point (Y) in Figure 16 is displayed off the straight line).
- Since point (X) on the treatment tool shaft has a lower brightness than point (Y) on the biological tissue, it can be seen that the extension direction of the treatment tool is the direction of OX.
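- A minimal sketch of this brightness comparison, assuming a grayscale copy of the surgical image; a small neighbourhood around each point is averaged for robustness (the neighbourhood size is an illustrative assumption):

```python
import numpy as np

def brightness_at(gray: np.ndarray, pt: tuple, k: int = 5) -> float:
    """Mean brightness of a (2k+1)-pixel square around pt = (x, y)."""
    x, y = pt
    return float(gray[max(y - k, 0):y + k + 1, max(x - k, 0):x + k + 1].mean())

def extension_direction_by_brightness(gray: np.ndarray, X: tuple, Y: tuple):
    """The darker sample point lies on the tool shaft, so the extension
    direction is toward that point."""
    return X if brightness_at(gray, X) < brightness_at(gray, Y) else Y
```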
- A determination of the extension direction of the treatment tool as shown in the schematic diagram of FIG. 17 may also be included.
- the segmentation mask of the treatment tool is displayed as if it is floating in the inferred treatment tool image 141. Then, the distances K1 and K2 between the edge of the screen and the image corresponding to the treatment tool are obtained. From a mutual comparison of the distances, the side with the closer distance (K1 side) becomes the root side of the treatment tool, and the extension direction is determined. As shown in FIG. 17(B), an extension image 142 with the extension direction corrected is generated.
- FIG. 18 corresponds to an explanation of the processing in the extension unit 264.
- the inference mask 133 is corrected based on the extension direction determined by the direction detection unit in the combined image 125. Then, the origin (O) and direction vector (v) of the image of the treatment tool in the inference mask 133 are defined, and it is extended to the edge of the screen as shown in the addition unit 137 in FIG. 18(B).
- the width of the addition unit 137 is defined by the number of pixels of the cross section where the perpendicular line of the direction vector at the origin (O) intersects with the inference mask 133.
- the combination unit 265 combines the inference treatment tool image (inference mask 133) acquired through the deep learning model with the addition unit 137 to generate a combined image (inference mask 134).
- the expansion unit 266 generates a bounding box (expanded bounding box 136) to match the combined image (inference mask 134).
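- The extension, combination, and expansion steps can be sketched together as follows, assuming a uint8 (0/1) inference mask and approximating the addition unit 137 as a thick line segment drawn from the origin O along the direction vector v to the image border:

```python
import cv2
import numpy as np

def extend_and_rebox(mask: np.ndarray, origin, v, width: int):
    """mask: uint8 (0/1) inference mask; origin: (x, y) point O; v: unit
    direction vector; width: cross-section pixel count measured at O."""
    h, w = mask.shape
    ox, oy = origin
    # Aim well past the border; cv2.line clips the segment to the image.
    end = (int(round(ox + v[0] * (h + w))), int(round(oy + v[1] * (h + w))))
    extended = mask.copy()
    cv2.line(extended, (int(ox), int(oy)), end,
             color=1, thickness=max(int(width), 1))  # addition unit 137
    ys, xs = np.nonzero(extended)
    # Expanded bounding box fitted to the combined image.
    expanded_box = (int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max()))
    return extended, expanded_box
```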
- the techniques described in Figures 14 to 17 may be used individually or in combination as appropriate.
- the optimal technique is selected taking into consideration various influences such as the treatment site (influence of organs and tissues) in the surgical image 121 and the type of treatment tool used.
- The endoscopic image treatment tool detection device (endoscopic image treatment tool tip detection device) 10 of the embodiment generates a color-processed image before generating a bounding box and uses it for the bounding box, improving the recognition accuracy of the treatment tool. This makes it easy to obtain information about which side of the endoscopic image the treatment tool emerges from, which could not be handled adequately by the previous bounding box processing alone. The treatment tool detection device 10 of the embodiment is therefore expected to contribute to supporting laparoscopic surgery using artificial intelligence.
- the endoscopic image treatment tool detection method is executed by the CPU 111 of the processing device (computer) 105 based on the endoscopic image treatment tool detection program.
- The endoscopic image treatment tool detection program causes the processing device (computer) 105 in FIG. 3 to execute an image acquisition function, an image recognition function, a treatment tool extraction function, and a tip detection function; a treatment tool area completion function; a root identification function, an end position detection function, an angle detection function, and a tip point detection function; an image completion function and a rectangle completion function; a color reduction function, a binarization function, an extraction function, a combination function, and an expansion function; and a tip determination function, a direction detection function, a brightness detection function, an extension function, a combination function, and an expansion function.
- Each function overlaps with the description of the treatment tool detection device 10 described above, so details will be omitted.
- the processing of the processing device (computer) 105 includes various steps such as an image acquisition step (S150), an image recognition step (S160), a treatment tool extraction step (S170), and a treatment tool area completion step (S200).
- Steps necessary for the operation of the processing device (computer) 105 (CPU 111) itself, output steps necessary for information output, a tip detection step (S180), and the like are naturally also included.
- the image acquisition function acquires a surgical image (S150; image acquisition step).
- the image recognition function performs image recognition of the surgical image using a pre-trained deep learning model (S160; image recognition step).
- the treatment tool extraction function generates a treatment tool area in the surgical image consisting of a segmentation mask and a bounding box in the treatment tool image (S170; treatment tool extraction step).
- the treatment tool area completion step (S200) executes each process of the image completion step (S250) or rectangle completion step (S260) in FIG. 21.
- the processing by the processing device (computer) 105 (CPU 111) includes various steps such as a root identification step (S210), an end point detection step (S220), an angle detection step (S230), and a tip point detection step (S240).
- the root identification function identifies as the root the side of the image edge region that has the largest coverage area between each of the four sides of the image edge region in the surgical image and the segmentation mask (S210; root identification step).
- the end position detection function detects the end portion of the treatment tool (end position detection step; S220).
- the angle detection function detects the angle of the treatment tool (angle detection step; S230).
- the tip point detection function detects the tip point of the treatment tool (tip point detection step; S240).
- The treatment tool area completion step (S200) selectively executes either an image completion step (S250) as shown in FIG. 21(A) or a rectangle completion step (S260) as shown in FIG. 21(B). Note that in the treatment tool detection device for endoscopic images (treatment tool tip detection device for endoscopic images) 10, the treatment tool area completion step (S200) can also execute both the image completion step (S250) and the rectangle completion step (S260).
- FIG. 22 is a flowchart of the image completion step (S250) in the treatment tool area completion step (S200), and the processing by the processing device (computer) 105 (CPU 111) includes various steps of a color reduction step (S251), a binarization step (S252), an extraction step (S253), a combination step (S254), and an expansion step (S255).
- the color reduction function reduces a specified color from the surgical image to generate a color-reduced image (S251; color reduction step).
- the binarization function binarizes the color-reduced image (S252; binarization step).
- the extraction function removes areas unrelated to the target treatment tool present in the surgical image and extracts the treatment tool image from the color-processed image (S253; extraction step).
- the combination function combines the inferred treatment tool image obtained through the deep learning model with the treatment tool image obtained from the color-processed image to generate a combined image (S254; combination step).
- the expansion function generates the bounding box to match the combined image (S255; expansion step).
- FIG. 23 is a flow chart of the rectangle completion step (S260) in the treatment tool area completion step (S200), and the processing by the processing device (computer) 105 (CPU 111) includes various steps, such as a tip determination step (S261), a direction detection step (S262), a brightness detection step (S263), an extension step (S264), a combination step (S265), and an expansion step (S266).
- the tip determination function detects the tip of the treatment tool from the treatment tool segmentation mask, which is the output of the treatment tool extraction unit, and determines whether or not there is an opening at the tip (S261; tip determination step).
- the direction detection function detects the extension direction of the treatment tool from the treatment tool segmentation mask, which is the output of the treatment tool extraction unit (S262; direction detection step).
- the brightness detection function detects the brightness around the treatment tool image in the color-processed image and detects the extension direction of the treatment tool from the high and low brightness (S263; brightness detection step).
- the extension function extends the minimum rectangle in the extension direction to generate an extended treatment tool image (S264; extension step).
- the combination function combines the extended treatment tool image with the inferred treatment tool image obtained through the deep learning model to generate a combined image (S265; combination step).
- the expansion function generates a bounding box to match the combined image (S266; expansion step).
- the computer program of the present invention described above may be recorded on a processor-readable recording medium, and the recording medium may be a "non-transitory tangible medium” such as a disk, card, semiconductor memory, or programmable logic circuit.
- the computer program can be implemented using, for example, a scripting language such as ActionScript or JavaScript (registered trademark), an object-oriented programming language such as Objective-C or Java (registered trademark), or a markup language such as HTML5.
- 10 Treatment tool detection device
- 60 Inference image
- 101, 102, 401, 601 Treatment tool
- 103 Endoscope
- 104 Affected area
- 105 Processing device
- 106 Endoscope display
- 107 Patient
- 108 Trocar
- 111 CPU
- 112 ROM
- 113 RAM
- 114 Storage unit
- 115 I/O
- 121 Surgical image
- 122 Color-reduced image
- 122g Grayscale image
- 123 Binarized image
- 123a Treatment tool area
- 123g Binarized image based on grayscale image
- 124 Color-processed image
- 125 Combined image
- 126, 133, 134 Inference mask
- 127 Treatment tool mask
- 128 Brightness-adjusted image
- 129 Non-target treatment tool image
- 129p Inferred image
- 130 Overlap-adjusted image
- 135, 136 Bounding box
- 137 Addition unit
- 138 Rectangular area
- 141 Inferred treatment tool image
- 142 Extension image
- 150 Image acquisition unit
- 160 Image recognition unit
- 170 Treatment tool extraction unit
- 180 Tip detection unit
- 200 Treatment tool area completion unit
- 210 Root identification unit
- 220 End position detection unit
- 230 Angle detection unit
- 240 Tip point detection unit
- 250 Image completion unit
- 251 Color reduction unit
- 252 Binarization unit
- 253 Extraction unit
- 254 Combination unit
- 255 Expansion unit
- 260 Rectangle completion unit
- 261 Tip determination unit
- 262 Direction detection unit
- 263 Brightness detection unit
- 264 Extension unit
- 265 Combination unit
- 266 Expansion unit
Description
The present invention relates to a treatment tool detection device for endoscopic images, a treatment tool detection method for endoscopic images, and a treatment tool detection device program for endoscopic images, and in particular to a treatment tool detection device, method, and program for endoscopic images that complements treatment tool recognition using artificial intelligence in surgical support for laparoscopic surgery.
In endoscopic surgery, determining the exact position of the tip of a treatment tool within an endoscopic image makes various applications possible, such as enabling more accurate operation when using a surgical support robot system. Development of technology for locating treatment tools within endoscopic images is under way (see Patent Documents 1 and 2, etc.).
The method disclosed in Patent Document 1 attaches a marker to the tip of a treatment tool and detects the treatment tool. However, retrofitting a marker to an existing treatment tool requires consideration of marker sterilization and of the risk of the marker falling off. For this reason, popularizing marker-equipped or marker-compatible treatment tools demands responsiveness from treatment tool manufacturers, among other problems. Furthermore, the method of attaching a marker involves issues such as the possibility of interfering with operation of the treatment tool.
The method disclosed in Patent Document 2 detects treatment tools in endoscopic images by combining a support vector machine (SVM), a type of machine learning, with a mathematical model.
The machine learning used in the method disclosed in Patent Document 2 is a support vector machine, which recognizes only predefined features. Recognizing a single treatment tool therefore requires learning a very large number of features, covering, for example, changes in the tool's texture over the course of surgery, partial occlusion of the tool, and the presence of other treatment tools sharing the features used for learning. Moreover, even if this configuration were simply replaced with a method capable of recognition under more complex conditions, such as deep learning, it would remain difficult to recognize multiple treatment tools of the same type as individual tools when they appear close to each other in an endoscopic image. Furthermore, accurately tracking the three-dimensional position of the treatment tool tip during surgery requires extracting the tip point in greater detail.
In consideration of the above problems, an invention has been filed that aims to provide a device for detecting the tip of a treatment tool in an endoscopic image, a method for detecting the tip of a treatment tool in an endoscopic image, and a program for detecting the tip of a treatment tool in an endoscopic image, which can appropriately detect the position of the tip point of a treatment tool even in a complex environment where multiple treatment tools of the same type are included in an endoscopic image and are close to each other (see Patent Document 3).
Patent Document 3 relates to a device for detecting the tip of a treatment tool in an endoscopic image. The device includes an image acquisition unit that acquires a surgical image, an image recognition unit that performs image recognition of the surgical image using a pre-trained deep learning model, a treatment tool extraction unit that generates a segmentation mask and a bounding box of the treatment tool in the surgical image, and a tip detection unit that detects the tip of the treatment tool using the generated segmentation mask and bounding box. The tip detection unit includes a root identification unit that compares each of the four sides of the image edge region present in the surgical image with the segmentation mask and identifies as the root the side whose coverage area is largest, an end position detection unit that detects the end portion of the treatment tool, an angle detection unit that detects the angle of the treatment tool, and a tip point detection unit that detects the tip point of the treatment tool.
特許文献3に記載された方法では取得画像内に存在する処置具のセグメンテーションマスクおよびバウンディングボックスの情報を基に先端点を検出する技術であるものの、画面内に映り込む障害物により、セグメンテーションマスクとバウンディングボックスが画像縁部領域に接していないことがあり、先端点の検出に影響が出ることが考えられる。特に、腹腔内への処置具の導入のためのトロッカ(トロカー等と称される筒状物)を含めて処置具が画面内に存在する場合、トロッカ部分が認識されず、セグメンテーションマスクとバウンディングボックスを正確に取得できない状況が発生し、先端検出の性能に影響が生じる。 The method described in Patent Document 3 is a technology that detects the tip point based on the segmentation mask and bounding box information of the treatment tool present in the acquired image, but due to obstacles reflected in the screen, the segmentation mask and bounding box may not be in contact with the edge area of the image, which is thought to affect the detection of the tip point. In particular, if the treatment tool is present on the screen, including a trocar (a cylindrical object called a trocar, etc.) for introducing the treatment tool into the abdominal cavity, the trocar part will not be recognized, resulting in a situation where the segmentation mask and bounding box cannot be accurately acquired, and this will affect the performance of tip detection.
上記問題点を鑑み、本発明は、障害物が手術画像内に映り込む状況においても迅速、かつ適切に処置具の先端点の位置を検出する内視鏡画像の処置具の先端検出装置、内視鏡画像の処置具の先端検出方法、及び内視鏡画像の処置具の先端検出プログラムを提供することを目的とする。 In consideration of the above problems, the present invention aims to provide a device for detecting the tip of a treatment tool in an endoscopic image, a method for detecting the tip of a treatment tool in an endoscopic image, and a program for detecting the tip of a treatment tool in an endoscopic image, which can quickly and appropriately detect the position of the tip point of a treatment tool even in situations where an obstacle is reflected in the surgical image.
すなわち、実施形態の内視鏡画像の処置具検出装置は、手術画像を取得する画像取得部と、予め学習された深層学習モデルを用いて、手術画像の画像認識を行う画像認識部と、手術画像内において当該処置具画像内のセグメンテーションマスクとバウンディングボックスからなる処置具領域を生成する処置具抽出部と、手術画像内に存在する障害物による処置具領域の欠落を補完する処置具領域補完部とを備え、処置具領域補完部は、手術画像に対して色調処理して色調処理画像を生成し、色調処理画像を用いて処置具領域の補完を行う画像補完部を備え、画像補完部は色調処理画像の生成に際して、手術画像から所定の色を減色して減色画像を生成する減色部と、減色画像を二値化する二値化部と、手術画像内に存在する対象となる処置具と無関係な領域を削除して色調処理画像から処置具画像を抽出する抽出部と、深層学習モデルを通じて取得される推論処置具画像に、色調処理画像から取得される処置具画像を組み合わせて組合せ画像を生成する組合せ部と、組合せ画像に合わせてバウンディングボックスを生成する拡張部とを備えることを特徴とする。 That is, the endoscopic image treatment tool detection device of the embodiment includes an image acquisition unit that acquires a surgical image, an image recognition unit that performs image recognition of the surgical image using a pre-trained deep learning model, a treatment tool extraction unit that generates a treatment tool area in the surgical image consisting of a segmentation mask and a bounding box in the treatment tool image, and a treatment tool area completion unit that completes the treatment tool area due to an obstacle present in the surgical image. The treatment tool area completion unit includes an image completion unit that performs color processing on the surgical image to generate a color-processed image and completes the treatment tool area using the color-processed image. When generating the color-processed image, the image completion unit includes a color reduction unit that reduces a predetermined color from the surgical image to generate a reduced-color image, a binarization unit that binarizes the reduced-color image, an extraction unit that deletes areas unrelated to the target treatment tool present in the surgical image and extracts the treatment tool image from the color-processed image, a combination unit that generates a combined image by combining the treatment tool image acquired from the color-processed image with the inferred treatment tool image acquired through the deep learning model, and an expansion unit that generates a bounding box according to the combined image.
さらに、内視鏡画像の処置具検出装置の処置具領域補完部は、処置具抽出部の出力である処置具のセグメンテーションマスクに対し、画像処理によって取得した処置具領域を囲う最小矩形を生成し、最小矩形を用いて補完を行う矩形補完部を備え、矩形補完部は最小矩形を用いた補完に際して、処置具抽出部の出力である処置具のセグメンテーションマスクより処置具の先端を検出し先端の開口の有無を判定する先端判定部と、処置具抽出部の出力である処置具のセグメンテーションマスクより処置具の伸長方向を検出する方向検出部と、手術画像内の処置具周囲の輝度を検出して輝度の高低から推論マスクに基づいて処置具の伸長方向を検出する輝度検出部と、推論マスクを伸長方向に拡張して伸長処置具画像を生成する伸長部と、深層学習モデルを通じて取得される推論処置具画像に、伸長処置具画像を組み合わせて組合せ画像を生成する組合せ部と、組合せ画像に合わせて前記バウンディングボックスを生成する拡張部とを備えることとしてもよい。 Furthermore, the treatment tool area complementing section of the endoscopic image treatment tool detection device may include a rectangle complementing section that generates a minimum rectangle surrounding the treatment tool area obtained by image processing for the treatment tool segmentation mask that is the output of the treatment tool extraction section, and performs complementing using the minimum rectangle. When performing complementing using the minimum rectangle, the rectangle complementing section may include a tip determination section that detects the tip of the treatment tool from the treatment tool segmentation mask that is the output of the treatment tool extraction section and determines whether or not there is an opening at the tip, a direction detection section that detects the extension direction of the treatment tool from the treatment tool segmentation mask that is the output of the treatment tool extraction section, a brightness detection section that detects the brightness around the treatment tool in the surgical image and detects the extension direction of the treatment tool based on the inference mask from the level of brightness, an extension section that extends the inference mask in the extension direction to generate an extended treatment tool image, a combination section that generates a combined image by combining the extended treatment tool image with the inferred treatment tool image obtained through the deep learning model, and an extension section that generates the bounding box according to the combined image.
さらに、内視鏡画像の処置具検出装置にあっては、処置具領域補完部は、画像補完部と、矩形補完部を備え、何れかの出力を選択的に出力することとしてもよい。 Furthermore, in the treatment tool detection device for endoscopic images, the treatment tool area completion section may include an image completion section and a rectangle completion section, and may selectively output either of the outputs.
さらに、内視鏡画像の処置具検出装置の画像補完部は、手術画像の明度を調整して生成した明度調整画像から色調処理画像内の背景領域を削除することとしてもよい。 Furthermore, the image complementing section of the endoscopic image treatment tool detection device may be configured to remove background areas in the color-processed image from the brightness-adjusted image generated by adjusting the brightness of the surgical image.
さらに、内視鏡画像の処置具検出装置の画像補完部は、処置具画像と推論処置具画像との比較から色調処理画像内の処置具画像を特定することとしてもよい。 Furthermore, the image complementation section of the endoscopic image treatment tool detection device may identify the treatment tool image in the color-processed image by comparing the treatment tool image with the inferred treatment tool image.
さらに、内視鏡画像の処置具検出装置の画像補完部は、手術画像の画像サイズを縮小して画像処理に要する時間を圧縮することとしてもよい。 Furthermore, the image complementation section of the endoscopic image treatment tool detection device may reduce the image size of the surgical image to compress the time required for image processing.
また、実施形態の内視鏡画像の処置具の先端検出装置は、手術画像を取得する画像取得部と、予め学習された深層学習モデルを用いて手術画像の画像認識を行う画像認識部と、手術画像内において当該手術画像内の処置具のセグメンテーションマスクとバウンディングボックスとを生成する処置具抽出部と、手術画像内に存在する障害物による処置具領域の欠落を補完する処置具領域補完部と、生成された前記セグメンテーションマスクと、生成されたバウンディングボックスとを用いて、処置具の先端を検出する先端検出部を備え、先端検出部は、手術画像内に存在する画像縁部領域の4辺のそれぞれと、セグメンテーションマスクとの被覆面積が最大となる画像縁部領域の辺を根部として特定する根部特定部と、処置具の端部位を検出する端部位検出部と、処置具の角度を検出する角度検出部と、処置具の先端点を検出する先端点検出部を備えることを特徴とする。 In addition, the device for detecting the tip of a treatment tool in an endoscopic image of an embodiment includes an image acquisition unit that acquires a surgical image, an image recognition unit that performs image recognition of the surgical image using a pre-trained deep learning model, a treatment tool extraction unit that generates a segmentation mask and a bounding box of the treatment tool in the surgical image, a treatment tool area completion unit that completes missing treatment tool areas due to obstacles present in the surgical image, and a tip detection unit that detects the tip of the treatment tool using the generated segmentation mask and the generated bounding box, and the tip detection unit is characterized by including a root identification unit that identifies, as a root, each of the four sides of the image edge area present in the surgical image and the side of the image edge area that has the largest coverage area with the segmentation mask, an end position detection unit that detects the end portion of the treatment tool, an angle detection unit that detects the angle of the treatment tool, and a tip point detection unit that detects the tip point of the treatment tool.
本発明の内視鏡画像の処置具検出装置によれば、手術画像を取得する画像取得部と、予め学習された深層学習モデルを用いて、手術画像の画像認識を行う画像認識部と、手術画像内において当該処置具画像内のセグメンテーションマスクとバウンディングボックスからなる処置具領域を生成する処置具抽出部と、手術画像内に存在する障害物による処置具領域の欠落を補完する処置具領域補完部とを備え、処置具領域補完部は、手術画像に対して色調処理して色調処理画像を生成し、色調処理画像を用いて処置具領域の補完を行う画像補完部を備え、画像補完部は色調処理画像の生成に際して、手術画像から所定の色を減色して減色画像を生成する減色部と、減色画像を二値化する二値化部と、手術画像内に存在する対象となる処置具と無関係な領域を削除して色調処理画像から処置具画像を抽出する抽出部と、深層学習モデルを通じて取得される推論処置具画像に、色調処理画像から取得される処置具画像を組み合わせて組合せ画像を生成する組合せ部と、組合せ画像に合わせてバウンディングボックスを生成する拡張部とを備えるため、内視鏡画像内の処置具の認識においてセグメンテーションマスクとバウンディングボックスを用いる手法をさらに発展させて処置具領域の色調・形状・開口の有無などの評価を加えることにより、処置具の内視鏡画像内の画面における位置把握の精度を高めることができる。 The endoscopic image treatment tool detection device of the present invention includes an image acquisition unit that acquires a surgical image, an image recognition unit that performs image recognition of the surgical image using a pre-trained deep learning model, a treatment tool extraction unit that generates a treatment tool area in the surgical image consisting of a segmentation mask and a bounding box in the treatment tool image, and a treatment tool area completion unit that completes the treatment tool area due to an obstacle present in the surgical image, and the treatment tool area completion unit is provided with an image completion unit that performs color processing on the surgical image to generate a color-processed image and completes the treatment tool area using the color-processed image, and the image completion unit is provided with a color reduction unit that reduces a predetermined color from the surgical image to generate a color-reduced image when generating the color-processed image. The system includes a binarization unit that binarizes the color-reduced image, an extraction unit that removes areas in the surgical image that are unrelated to the treatment tool of interest and extracts the treatment tool image from the color-processed image, a combination unit that combines the treatment tool image obtained from the color-processed image with the inferred treatment tool image obtained through the deep learning model to generate a combined image, and an expansion unit that generates a bounding box according to the combined image. This further develops the method of using a segmentation mask and a bounding box to recognize treatment tools in endoscopic images, and adds evaluation of the color tone, shape, and presence or absence of an opening in the treatment tool area, thereby improving the accuracy of identifying the position of the treatment tool on the screen of the endoscopic image.
加えて、本発明の内視鏡画像の処置具の先端検出装置によれば、手術画像を取得する画像取得部と、予め学習された深層学習モデルを用いて手術画像の画像認識を行う画像認識部と、手術画像内において当該手術画像内の処置具のセグメンテーションマスクとバウンディングボックスとを生成する処置具抽出部と、手術画像内に存在する障害物による処置具領域の欠落を補完する処置具領域補完部と、生成された前記セグメンテーションマスクと、生成されたバウンディングボックスとを用いて、処置具の先端を検出する先端検出部を備え、先端検出部は、手術画像内に存在する画像縁部領域の4辺のそれぞれと、セグメンテーションマスクとの被覆面積が最大となる画像縁部領域の辺を根部として特定する根部特定部と、処置具の端部位を検出する端部位検出部と、処置具の角度を検出する角度検出部と、処置具の先端点を検出する先端点検出部を備えるため、内視鏡画像内の処置具の認識においてセグメンテーションマスクとバウンディングボックスを用いる手法をさらに発展させて処置具領域の色調・形状・開口の有無などの評価を加えることにより、処置具の内視鏡画像内の画面における位置把握の精度を高めることができる。 In addition, the device for detecting the tip of a treatment tool in an endoscopic image of the present invention includes an image acquisition unit for acquiring a surgical image, an image recognition unit for performing image recognition of the surgical image using a pre-trained deep learning model, a treatment tool extraction unit for generating a segmentation mask and a bounding box of the treatment tool in the surgical image, a treatment tool area completion unit for completing a missing treatment tool area due to an obstacle present in the surgical image, and a tip detection unit for detecting the tip of the treatment tool using the generated segmentation mask and the generated bounding box, and the tip detection unit detects the tip of the treatment tool in the surgical image. The system is equipped with a root identification unit that identifies as the root the side of the image edge region that has the largest coverage area with each of the four sides of the image edge region present in the image and the segmentation mask, an end portion detection unit that detects the end portion of the treatment tool, an angle detection unit that detects the angle of the treatment tool, and a tip point detection unit that detects the tip point of the treatment tool. By further developing the method of using a segmentation mask and bounding box to recognize treatment tools in endoscopic images and adding evaluation of the color tone, shape, and presence or absence of an opening in the treatment tool region, the accuracy of identifying the position of the treatment tool on the screen in the endoscopic image can be improved.
次に、図面を参照して、本発明の実施形態を説明する。実施形態に係る図面の記載において、同一又は類似の部分には同一又は類似の符号を付している。但し、図面は模式的なものであり、厚みと平面寸法との関係、各部材の厚みの比率等は現実のものとは異なることに留意すべきである。したがって、具体的な厚み、寸法は以下の説明を参酌して判断すべきものである。また、図面相互間においても互いの寸法の関係、比率が異なる部分が含まれていることは勿論である。 Next, an embodiment of the present invention will be described with reference to the drawings. In describing the drawings relating to the embodiment, identical or similar parts are given the same or similar reference symbols. However, it should be noted that the drawings are schematic, and the relationship between thickness and planar dimensions, the thickness ratio of each component, etc., differ from the actual ones. Therefore, specific thicknesses and dimensions should be determined with reference to the following explanation. In addition, it goes without saying that there are parts in which the dimensional relationships and ratios differ between the drawings.
実施形態は本発明の技術的思想を具体化するための装置、方法の例示であって、本発明の技術的思想は、各構成要素の構成、配置、レイアウト等を下記に特定されない。本発明の技術的思想は、特許請求の範囲に記載された請求項が規定する技術的範囲内において、種々の変更を加えることができる。 The embodiments are merely examples of devices and methods for embodying the technical ideas of the present invention, and the technical ideas of the present invention are not specified below in terms of the configuration, arrangement, layout, etc. of each component. The technical ideas of the present invention may be modified in various ways within the technical scope defined by the claims.
FIG. 1 shows an example of the overall configuration of a treatment tool detection device 10 for endoscopic images (hereinafter, "treatment tool detection device") according to the embodiment. The treatment tool detection device 10 also functions as a tip detection device for treatment tools in endoscopic images; in the following description, the tip detection device is included in the treatment tool detection device 10. The treatment tool detection device 10 shown in FIG. 1 comprises an endoscope 103, treatment tools 101 and 102, a processing device 105, and an endoscope display 106. In FIG. 1, the endoscope 103 and the treatment tools 101 and 102 are inserted into the abdominal cavity of a patient 107. The affected area 104 is imaged by the endoscope 103 inserted into the abdominal cavity, the captured image is transmitted to the processing device 105, and the image transmitted from the endoscope 103 to the processing device 105 is displayed on the endoscope display 106.
The treatment tools 101 and 102 are, for example, grasping forceps, a high-frequency knife, hemostatic clips, or the like. In addition to such forceps, knives, and the like, the treatment tools 101 and 102 include a trocar 108 used to introduce them into the abdominal cavity of the patient 107. In FIG. 1, the treatment tools 101 and 102 are each housed inside the sheath of a trocar 108 (see the two-dot chain lines), with their tips exposed from the trocar 108.
The endoscope 103 has an image sensor and a light source (not shown); the light source of the endoscope 103 illuminates the abdominal cavity of the patient 107, and the image sensor captures the affected area 104 to generate image data.
The processing device 105 is an endoscope system comprising various electronic computers (computing resources, i.e., computers) such as a personal computer (PC), mainframe, workstation, or cloud computing system.
FIG. 2 shows an image captured by the endoscope 103 of FIG. 1 and displayed on the endoscope display 106. The image in FIG. 2 contains the treatment tools 101 and 102 and the affected area 104. As an example, the treatment tools 101 and 102 are grasping forceps of the same type: treatment tool 101 is shown with the tip of the forceps open and treatment tool 102 with the tip closed. In FIG. 2 as well, the treatment tools 101 and 102 are each housed inside the sheath of a trocar 108 (see the two-dot chain lines), with their tips exposed from the trocar 108.
The treatment tool detection device for endoscopic images (tip detection device for treatment tools in endoscopic images) 10 of the embodiment recognizes treatment tools in an endoscopic image captured by the endoscope and detects the tip of each recognized tool. For example, a high-frequency knife has a single tip, so the device detects one tip for it. Grasping forceps, by contrast, have one tip when the gripping portion at the tip is closed; when it is open there are actually two tips, yet even then, depending on the orientation of the forceps relative to the endoscope, the tip may appear as either one or two points in the endoscopic image. The device of the embodiment can detect treatment tools in endoscopic images regardless of the tool type and the number, state, or orientation of its tips, and can also detect tools on screen including the trocar.
The block diagram of FIG. 3 shows an example configuration of the processing device 105 in FIG. 1. The processing device 105 includes a CPU 111 for executing various computations, a ROM 112 storing processing programs, a RAM 113 for storing data, a storage unit 114 for storing various data and observation results, and an I/O (input/output interface) 115. The I/O 115 comprises communication (transmit/receive) interfaces, buffers, and the like; it works with the CPU 111 to exchange image data with the endoscope 103 and with the endoscope display 106. The RAM 113 is used for temporarily loading processing programs, image data read in through the I/O 115, and so on. A GPU specialized for image processing may be provided in addition to the CPU 111.
The storage unit 114 stores the deep learning model used by the image recognition unit 160 (described later) when performing image recognition.
The block diagrams of FIGS. 4 and 5 show the functional units within the CPU 111. When these functional units are implemented in software, they are realized by the CPU 111 executing the instructions of programs implementing each function. Specifically, the functional units are an image acquisition unit 150, an image recognition unit 160, a treatment tool extraction unit 170, a tip detection unit 180, and a treatment tool area completion unit 200; the tip detection unit 180 includes a root identification unit 210, an end portion detection unit 220, an angle detection unit 230, and a tip point detection unit 240. The treatment tool area completion unit 200 includes an image completion unit 250 and a rectangle completion unit 260. The image completion unit 250 includes a color reduction unit 251, a binarization unit 252, an extraction unit 253, a combination unit 254, and an expansion unit 255. The rectangle completion unit 260 includes a tip determination unit 261, a direction detection unit 262, a brightness detection unit 263, an extension unit 264, a combination unit 265, and an expansion unit 266. Each functional unit is described below with reference to the drawings.
The image acquisition unit 150 acquires a surgical image, i.e., the image captured by the endoscope 103 of FIG. 1 and displayed on the endoscope display 106.
The image recognition unit 160 performs image recognition of the surgical image using a pre-trained deep learning model.
The deep learning model used by the image recognition unit 160 of the treatment tool detection device 10 of the embodiment is a model pre-trained based on an instance segmentation technique and is stored in the storage unit 114. This deep learning model is a convolutional neural network (CNN) built from image data of previously captured surgical images to which annotations have been added. In the embodiment, the deep learning model is built using image data in which at least the treatment tools have been annotated. Annotation here means creating teacher data for the treatment tools for the deep learning (machine learning) model.
The treatment tool extraction unit 170 generates, within the surgical image, a treatment tool area consisting of the segmentation mask and bounding box in that surgical image. The tip detection unit 180 detects the tip of the treatment tool using the generated segmentation mask and the generated bounding box.
As shown in FIG. 6, the treatment tool extraction unit 170 extracts the segmentation mask 402 and bounding box 403 of a treatment tool 401 contained in the image data from the instance segmentation inference results produced by the image recognition unit 160. Instance segmentation is a technique that detects an individual segmentation mask and bounding box for each of multiple objects present in image data.
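As a point of reference only, an off-the-shelf instance segmentation network can stand in for the pre-trained model. The patent does not name a specific architecture, so the following Python sketch using torchvision's Mask R-CNN is an assumption for illustration:

```python
import torch
import torchvision

# Stand-in network for illustration; the patent's own model is a CNN trained
# on annotated surgical images, which this generic model does not replicate.
model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

frame = torch.rand(3, 480, 640)      # stand-in for one endoscopic frame (RGB, 0-1)
with torch.no_grad():
    out = model([frame])[0]          # one result dict per input image
masks = out["masks"]                 # [N, 1, H, W] per-instance soft masks
boxes = out["boxes"]                 # [N, 4] as (x_min, y_min, x_max, y_max)
```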
The segmentation mask of a treatment tool is the pixel-level region of the tool in the endoscopic image, shown hatched in FIG. 6. The bounding box is the rectangle, shown by dashed lines in FIG. 6, bounded by the minimum x, minimum y, maximum x, and maximum y values of the segmentation mask; the four sides of the bounding box are parallel to the four sides of the endoscopic image.
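A minimal sketch (helper name and array convention are assumptions, not from the patent) of deriving the axis-aligned bounding box from a binary segmentation mask as defined above:

```python
import numpy as np

def bounding_box(mask: np.ndarray):
    """mask: 2-D array, nonzero where the treatment tool is present."""
    ys, xs = np.nonzero(mask)
    if xs.size == 0:
        return None                          # no tool pixels detected
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())
```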
The treatment tool area completion unit 200 compensates for portions of the treatment tool area missing because of obstacles in the surgical image. In a surgical image, various tissues or other treatment tools may overlap a treatment tool as obstacles. The treatment tool area completion unit 200 therefore uses image processing to fill in the treatment tool area made invisible by these obstacles so that tip detection can be carried out properly.
The root identification unit 210 identifies, as the root, the side of the image edge region in the surgical image whose coverage area with the segmentation mask is largest among the region's four sides.
The method by which the root identification unit 210 determines the side from which a treatment tool appears is explained with reference to FIGS. 7(A) and (B). FIG. 7 shows an inference image 60 containing a treatment tool 601. In FIG. 7(A), an image edge line 602 is set inside the four sides ABCD of the inference image 60 at a distance r, and the region between the image edge line 602 and the outer frame of the inference image 60 is the image edge 603. Among the four sides ABCD of the inference image 60, the side where the coverage area between the image edge 603 and the treatment tool 601 is largest is determined to be the side from which the treatment tool 601 appears.
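The edge-coverage test of FIG. 7(A) might be sketched as follows; the strip width r and the function name are assumptions:

```python
import numpy as np

def root_side(mask: np.ndarray, r: int = 10) -> str:
    """Return the image side whose border strip of width r overlaps the mask most."""
    h, w = mask.shape
    coverage = {
        "top":    int(np.count_nonzero(mask[:r, :])),
        "bottom": int(np.count_nonzero(mask[h - r:, :])),
        "left":   int(np.count_nonzero(mask[:, :r])),
        "right":  int(np.count_nonzero(mask[:, w - r:])),
    }
    return max(coverage, key=coverage.get)
```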
FIG. 7(B) is an enlarged schematic view of the root of the treatment tool 601 in FIG. 7(A). A perpendicular line L1 is drawn to side C from the corner of the image edge line 602 closest to side C; likewise, a perpendicular line L2 is drawn to side D from the corner closest to side D. The area of the region 604 of the treatment tool 601 enclosed by the image edge line 602 and perpendicular L1 is compared with the area of the region 605 enclosed by the image edge line 602 and perpendicular L2. Since the area of region 605 is larger than that of region 604, the treatment tool 601 is determined to appear from side D.
The end portion detection unit 220 detects the end portions of the treatment tool. The rearmost point of the treatment tool extending from the root toward the tip point in the endoscopic image, i.e., the rear end point, is detected as an end portion. In addition, among the four corners of the bounding box, the corner closest to the tool tip (the tip corner) and the corner farthest from it (the rear corner) are detected as end portions.
The angle detection unit 230 detects, as the angle of the treatment tool, the angle that the tool in the endoscopic image makes with one of the four surrounding sides of the image. The angle can be detected using the rear end point detected by the end portion detection unit 220, using the tip corner and rear corner detected by the end portion detection unit 220, or without using any of the rear end point, tip corner, and rear corner. In the embodiment, these methods apply regardless of the number of tool tips.
The tip point detection unit 240 detects the tip point of the treatment tool in the endoscopic image. For example, the point on the tool's segmentation mask farthest from the rear end point is detected as the tip point, or the point on the segmentation mask closest to the tip corner is detected as the tip point; these methods are limited to tools with a single tip. In addition, the intersection of the bounding box with a straight line passing through the centroid of the segmentation mask with the tool's angle as its slope can be detected as the tip point. These detection methods are examples; alternatively, a judgment circle can be set at each candidate tip point and the decision made from the angle of the arc overlapping each tip point.
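The first tip rule above (the mask pixel farthest from the rear end point) admits a short sketch; names are assumptions:

```python
import numpy as np

def tip_point(mask: np.ndarray, rear_point: tuple) -> tuple:
    """rear_point: (x, y) rear end point; returns the (x, y) mask pixel farthest from it."""
    ys, xs = np.nonzero(mask)
    d2 = (xs - rear_point[0]) ** 2 + (ys - rear_point[1]) ** 2
    i = int(np.argmax(d2))
    return int(xs[i]), int(ys[i])
```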
In the treatment tool area completion unit 200 of the embodiment, the image completion unit 250 applies color-tone processing to the surgical image to generate a color-processed image and completes the treatment tool area using that image.
The image completion unit 250 performs the following processes in an integrated manner.
The color reduction unit 251 subtracts a predetermined color from the surgical image to generate a color-reduced image. This process is explained with FIG. 8.
The binarization unit 252 binarizes the color-reduced image. This process is explained with FIG. 8.
The extraction unit 253 deletes regions unrelated to the target treatment tool in the surgical image and extracts the treatment tool image from the color-processed image. This process is explained with FIG. 9.
The combination unit 254 combines the treatment tool image obtained from the color-processed image with the inferred treatment tool image obtained through the deep learning model to generate a combined image 125 (see FIG. 10). This process is explained with FIG. 10.
The expansion unit 255 generates a bounding box matching the combined image. This process is explained with FIG. 10.
The image completion unit 250 also adjusts the brightness of the surgical image to remove background regions from the color-processed image.
In the treatment tool area completion unit 200 of the embodiment, the rectangle completion unit 260 generates, for the treatment tool image in the color-processed image, a minimum rectangle enclosing the treatment tool area and performs completion using that rectangle. The rectangle completion unit 260 performs the following processes in an integrated manner.
The tip determination unit 261 detects the tool tip from the inference mask and determines whether the tip has an opening. This process is explained with FIG. 14.
The direction detection unit 262 detects the extension direction of the treatment tool from the inference mask. This process is explained with FIG. 15.
The brightness detection unit 263 detects the brightness around the treatment tool area in the endoscopic image and determines the tool's extension direction from the brightness levels. This process is explained with FIG. 16.
The extension unit 264 extends the inference mask in the extension direction to generate an extended treatment tool image. This process is explained with FIG. 18.
The combination unit 265 combines the extended treatment tool image with the inferred treatment tool image obtained through the deep learning model to generate a combined image. This process is explained with FIG. 18.
The expansion unit 266 expands the bounding box to match the combined image. This process is explained with FIG. 18.
The inference mask referred to here is an image-processed image (data) that separates, within the endoscopic image, the portion where the treatment tool exists from unrelated regions and covers (masks) the region where the tool exists. The minimum rectangle referred to here is a rectangle placed at the size and orientation that contains all points inside the tool with the smallest possible area.
The generation of the color-processed image is illustrated in the schematic diagrams of FIGS. 8 and 9, which outline the processing applied to the surgical image acquired through the image acquisition unit 150 and recognized by the image recognition unit 160. The processing proceeds in the order of FIGS. 8(A), (B), (C), then FIGS. 9(A), (B).
FIG. 8(A) shows a surgical image 121, an intra-abdominal image captured by a camera (not shown) connected to the endoscope 103. A color-reduced image is generated by subtracting one of the three components R (red), G (green), and B (blue) of each pixel from another; for example, G (green) is subtracted from R (red). As a result, the light and dark values of the color-reduced image represent the saturation in the surgical image. FIG. 8(B) shows the color-reduced image 122 after subtraction.
FIG. 8(C) shows the binarized image 123. The binarized image is generated from the color-reduced image of FIG. 8(B) by outputting 1 for pixels below a predetermined threshold and 0 for all other pixels. As a result, low-saturation regions in the surgical image are extracted as treatment tool areas.
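A sketch of the color-reduction and binarization steps of FIGS. 8(A) to (C), assuming an 8-bit BGR frame; the file name and the threshold value 40 are illustrative assumptions:

```python
import cv2

frame = cv2.imread("surgical_frame.png")   # hypothetical input file
b, g, r = cv2.split(frame)                 # OpenCV loads images as BGR
reduced = cv2.subtract(r, g)               # R minus G, clipped at 0 (Fig. 8(B))
# Pixels BELOW the threshold (low saturation, e.g. metallic tools) become 1.
_, binary = cv2.threshold(reduced, 40, 1, cv2.THRESH_BINARY_INV)
```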
FIG. 9(A) shows how regions unrelated to the target treatment tool are removed from the binarized image 123 (FIG. 8(C)). The region 123a of the target tool in the figure is the segmentation mask obtained through the deep learning model. As shown in FIG. 9(B), the regions unrelated to the target tool are removed to generate a color-processed image 124 (treatment tool mask image).
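One assumed way to realize the extraction step of FIG. 9 is to keep only the connected components of the binarized image that overlap the inferred segmentation mask; the variable seg_mask is a placeholder introduced for illustration:

```python
import cv2
import numpy as np

# binary: 0/1 uint8 image from the previous sketch.
seg_mask = np.zeros_like(binary)            # placeholder for the model's inferred mask
num, labels = cv2.connectedComponents(binary)
target_only = np.zeros_like(binary)
for lbl in range(1, num):                   # label 0 is the background
    component = labels == lbl
    if np.any(component & (seg_mask > 0)):  # keep parts touching the inferred mask
        target_only[component] = 1
```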
The schematic diagram of FIG. 10(A) shows an inferred treatment tool image obtained by image recognition of the surgical image using the deep learning model. Here, the inference mask 126 does not include the portion housed in the trocar, so it is unclear which of the four sides of the image edge region the treatment tool touches (from which side it appears). The treatment tool mask 127 from the color-processed image 124 described above is therefore combined with it, making clear from which side the treatment tool in the image appears.
Then, as shown in FIG. 10(B), the bounding box Bx1 corresponding to the original inference mask 126 is expanded to the bounding box Bx2 of the combined image 125.
The process of removing background regions from the color-processed image is shown in the flow diagram of FIG. 11. The color-reduced image 122 is generated from the surgical image 121, and the binarized image 123 is then generated. In parallel with this flow, a grayscale image 122g is generated from the surgical image 121, and a binarized image 123g is generated from the grayscale image 122g.
The common portion of the binarized images 123 and 123g is then taken. As a result, images of bright parts of biological tissue other than the treatment tool are removed, producing a corrected image (brightness-adjusted image 128) in which only the treatment tool is clearly delineated. This is done to prevent bright parts of biological tissue from being misrecognized as treatment tools.
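The intersection step of FIG. 11 might be sketched as follows, continuing the variables of the previous sketches; the grayscale threshold sense (dropping very bright, glare-like pixels) is an assumption:

```python
import cv2

gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
# Assumption: pixels at or below 200 are kept (value 1), bright glare is dropped.
_, gray_binary = cv2.threshold(gray, 200, 1, cv2.THRESH_BINARY_INV)
adjusted = cv2.bitwise_and(binary, gray_binary)   # brightness-adjusted mask (image 128)
```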
Furthermore, the treatment tool image in the color-processed image is identified by comparison with the inferred treatment tool image. This process is shown in the flow diagram of FIG. 12. The color-reduced image 122 is generated from the surgical image 121, and the binarized image 123 is then generated. In parallel with this flow, an inference image 129p is generated from the segmentation masks obtained through the deep learning model; it contains all treatment tool regions in the surgical image 121. Because the instance segmentation inference results distinguish each tool, the region of the target tool is removed from the inference image 129p to generate a non-target treatment tool image 129.
Regions overlapping the non-target treatment tool image 129 are then removed from the binarized image 123, producing a corrected image (overlap-adjusted image 130) from which all tools other than the target tool have been removed.
In addition, the image completion unit 250 of the embodiment reduces the size of the surgical image to shorten the time required for image processing. Specifically, the binarized image 123 generated from the surgical image 121 via the color-reduced image 122 is reduced in size. FIG. 13 shows schematic examples of size reduction: FIG. 13(A) is the binarized image 123 at full size, FIG. 13(B) is the image of (A) at 1/4 scale per side, and FIG. 13(C) is the image of (A) at 1/8 scale per side.
Once an image has been binarized, a lower resolution does not affect identification of the treatment tool area. Reducing the image size shortens the time required for image processing such as background removal. As illustrated in FIG. 13, the size-reduced binarized image is used as the binarized image in each of the processes of FIGS. 8 to 12 described above.
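Because the input is a 0/1 mask, nearest-neighbour interpolation is the natural choice for this reduction; a sketch at 1/8 scale per side:

```python
import cv2

# Nearest-neighbour interpolation keeps the mask strictly 0/1.
small = cv2.resize(binary, None, fx=1 / 8, fy=1 / 8,
                   interpolation=cv2.INTER_NEAREST)   # "1/8x per side", Fig. 13(C)
```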
The tip determination unit 261 detects the tool tip from the segmentation mask output by the treatment tool extraction unit and determines whether the tip has an opening.
The schematic diagram of FIG. 14(A) shows a rectangular region 138 cut out by the minimum rectangle containing the treatment tool area, with the long axis of the rectangle aligned horizontally. The tool in the rectangular region 138 opens its tip toward the right of the page, forming a sideways V; the white portion in the figure corresponds to the tool.
In the schematic diagram of FIG. 14(B), the base of the sideways V (the space between the tips of the forceps or similar) is enlarged. Note that black and white are inverted in this figure, so the black portion corresponds to the tool. The ends e1 and e2 of the background (the white portion) are recognized, and the number of pixels in each is calculated. In the illustrated example, end e1 is narrow and therefore has fewer pixels than end e2, which shows that end e2 is wider than end e1. The tool is thus determined to have its opening on the e2 side.
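One assumed reading of the e1/e2 pixel-count test of FIG. 14(B), measuring the background gap enclosed by the jaws in the two end columns of the crop:

```python
import numpy as np

def opening_side(crop: np.ndarray) -> str:
    """crop: 0/1 mask in the minimum rectangle, long axis horizontal (Fig. 14(A))."""
    def gap(col):
        rows = np.nonzero(col)[0]
        if rows.size == 0:
            return 0
        span = col[rows[0]:rows[-1] + 1]         # between outermost tool pixels
        return int(np.count_nonzero(span == 0))  # background pixels between the jaws
    e1, e2 = gap(crop[:, 0]), gap(crop[:, -1])
    return "right" if e2 > e1 else "left"        # the wider gap marks the open tip
```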
The direction detection unit 262 identifies the extension direction of the treatment tool based on the shape of the segmentation mask output by the treatment tool extraction unit.
The schematic diagram of FIG. 15(A) shows the segmentation mask of a tool whose tip opens and closes. The tool in the rectangular region 138 opens its tip toward the right of the page, forming a sideways V; the white portion corresponds to the tool. The base (O) of the sideways V and the ends (P) and (Q) spreading from it are defined, and the vector (OP) from base (O) to end (P) and the vector (OQ) from base (O) to end (Q) are obtained. The direction vector (v) indicating the tool's extension direction is then the opposite of the combined direction of vectors (OP) and (OQ), i.e., toward the left of the page.
The schematic diagram of FIG. 15(B) shows the segmentation mask of a tool whose tip does not open and close; the white portion corresponds to the tool. The areas (pixel counts) of the segmentation mask on the left and right of the center (Oi) of the cut-out minimum rectangle are compared, and the side with the larger area is judged to be the extension direction. This is based on the tool's structure, in which the shaft is thicker than the tip portion. Next, the widest position of the segmentation mask is taken as the origin (O), and the intersection of the minimum rectangle with a straight line extended from the origin (O) in the previously determined extension direction is taken as the end (R). The direction vector (v) indicating the extension direction is then the vector (OR).
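The area-comparison rule of FIG. 15(B) reduces to a few lines; names are assumptions:

```python
import numpy as np

def extension_direction(crop: np.ndarray) -> str:
    """crop: 0/1 mask in the minimum rectangle, long axis horizontal (Fig. 15(B))."""
    mid = crop.shape[1] // 2
    left_area = int(np.count_nonzero(crop[:, :mid]))
    right_area = int(np.count_nonzero(crop[:, mid:]))
    return "left" if left_area > right_area else "right"   # thicker shaft side
```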
The treatment tool area completion unit 200 further includes a brightness detection unit 263 that detects the brightness around the treatment tool area in the surgical image and determines the tool's extension direction from the brightness levels. With the centroid O of the tool's segmentation mask as origin, a straight line is drawn along the long axis of the rectangular region, and points X and Y are taken on the line just outside the tool mask (for clarity of the point positions, point (Y) in FIG. 16 is drawn off the line). Since the point (X) on the tool's shaft has lower brightness than the point (Y) in biological tissue, the extension direction of the tool is found to be the direction OX.
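The brightness test of FIG. 16 might be sketched as below; the patch half-size k is an assumption:

```python
import numpy as np

def darker_point(gray: np.ndarray, pt_x: tuple, pt_y: tuple, k: int = 5) -> str:
    """pt_x, pt_y: (row, col) points just outside the mask on the long-axis line."""
    def mean_patch(p):
        r, c = p
        return float(gray[r - k:r + k + 1, c - k:c + k + 1].mean())
    # Per Fig. 16, the point on the metallic shaft is darker than tissue.
    return "X" if mean_patch(pt_x) < mean_patch(pt_y) else "Y"
```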
In addition, the determination of the tool's extension orientation shown in the schematic diagram of FIG. 17 may be included. In FIG. 17(A), the tool's segmentation mask appears to float within the inferred treatment tool image 141. The distances K1 and K2 between the image edges and the image corresponding to the tool are obtained; comparing them, the nearer side (the K1 side) is taken as the root side of the tool, which determines the orientation in which it extends. As shown in FIG. 17(B), an extended image 142 with the corrected extension orientation is generated.
FIG. 18 corresponds to the processing in the extension unit 264. In FIG. 18(A), the inference mask 133 is corrected based on the extension direction determined by the direction detection unit for the combined image 125. The origin (O) and direction vector (v) of the tool image within the inference mask 133 are defined, and the mask is extended to the image edge as the added portion 137 shown in FIG. 18(B). The width of the added portion 137 is defined by the pixel count of the cross-section where the perpendicular to the direction vector at the origin (O) intersects the inference mask 133.
In FIG. 18(B), the combination unit 265 combines the added portion 137 with the inferred treatment tool image (inference mask 133) obtained through the deep learning model to generate a combined image (inference mask 134). In addition, the expansion unit 266 generates a bounding box (expanded bounding box 136) matching the combined image (inference mask 134).
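Finally, the combination and expansion of FIG. 18(B) can be sketched by reusing bounding_box() from the earlier sketch; inference_mask and extension_strip stand in for the arrays produced by the preceding steps:

```python
import numpy as np

inference_mask = np.zeros((480, 640), np.uint8)   # placeholder for mask 133
extension_strip = np.zeros((480, 640), np.uint8)  # placeholder for added portion 137
combined = np.logical_or(inference_mask, extension_strip).astype(np.uint8)
expanded_box = bounding_box(combined)             # Bx1 -> Bx2 expansion, Fig. 18(B)
```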
The techniques explained in FIGS. 14 to 17 may be used individually or in any combination, selected optimally in consideration of factors such as the treatment site in the surgical image 121 (the influence of organs and tissues) and the type of treatment tool used.
In particular, because the treatment tool detection device for endoscopic images (tip detection device for treatment tools in endoscopic images) 10 of the embodiment generates the color-processed image for the bounding box before generating the bounding box that improves tool recognition accuracy, it makes it easy to obtain information on which side of the endoscopic image screen a tool appears from, something that conventional bounding box processing alone could not adequately provide. The treatment tool detection device 10 of the embodiment is therefore expected to contribute to AI-assisted support of laparoscopic surgery.
The treatment tool detection method and treatment tool detection program for endoscopic images in the treatment tool detection device (tip detection device for treatment tools in endoscopic images) 10 are now described together using the flowcharts of FIGS. 19 to 23. The method is executed by the CPU 111 of the processing device (computer) 105 based on the program. The program causes the processing device (computer) 105 of FIG. 3 to execute an image acquisition function, an image recognition function, a treatment tool extraction function, and a tip detection function; a treatment tool area completion function, a root identification function, an end portion detection function, an angle detection function, and a tip point detection function; an image completion function and a rectangle completion function; a color reduction function, a binarization function, an extraction function, a combination function, and an expansion function; and a tip determination function, a direction detection function, a brightness detection function, an extension function, a combination function, and an expansion function. Since each function duplicates the description of the treatment tool detection device 10 above, the details are omitted.
As the flowchart of FIG. 19 shows, the processing of the processing device (computer) 105 (CPU 111) comprises various steps such as an image acquisition step (S150), an image recognition step (S160), a treatment tool extraction step (S170), and a treatment tool area completion step (S200). Naturally, the various steps necessary for operating the processing device (computer) 105 (CPU 111) itself, the output steps necessary for outputting information, the tip detection step (S180), and so on are also included.
The image acquisition function acquires a surgical image (S150; image acquisition step). The image recognition function performs image recognition of the surgical image using a pre-trained deep learning model (S160; image recognition step). The treatment tool extraction function generates, in the surgical image, a treatment tool area consisting of the segmentation mask and bounding box of the treatment tool image (S170; treatment tool extraction step). The treatment tool area completion step (S200) executes the processes of the image completion step (S250) or rectangle completion step (S260) of FIG. 21.
As the flowchart of the tip detection step (S180) in FIG. 20 shows, the processing of the processing device (computer) 105 (CPU 111) comprises various steps such as a root identification step (S210), an end portion detection step (S220), an angle detection step (S230), and a tip point detection step (S240).
The root identification function identifies, as the root, the side of the image edge region in the surgical image whose coverage area with the segmentation mask is largest among its four sides (S210; root identification step). The end portion detection function detects the end portions of the treatment tool (S220; end portion detection step). The angle detection function detects the angle of the treatment tool (S230; angle detection step). The tip point detection function detects the tip point of the treatment tool (S240; tip point detection step).
As the flowchart of FIG. 21 shows, the treatment tool area completion step (S200) selectively executes either the image completion step (S250) as in FIG. 21(A) or the rectangle completion step (S260) as in FIG. 21(B). In the treatment tool detection device (tip detection device for treatment tools in endoscopic images) 10, it is also possible for the treatment tool area completion step (S200) to execute both the image completion step (S250) and the rectangle completion step (S260).
FIG. 22 is a flowchart of the image completion step (S250) within the treatment tool area completion step (S200); the processing of the processing device (computer) 105 (CPU 111) comprises a color reduction step (S251), a binarization step (S252), an extraction step (S253), a combination step (S254), and an expansion step (S255).
The color reduction function subtracts a predetermined color from the surgical image to generate a color-reduced image (S251; color reduction step). The binarization function binarizes the color-reduced image (S252; binarization step). The extraction function deletes regions unrelated to the target treatment tool in the surgical image and extracts the treatment tool image from the color-processed image (S253; extraction step). The combination function combines the treatment tool image obtained from the color-processed image with the inferred treatment tool image obtained through the deep learning model to generate a combined image (S254; combination step). The expansion function generates the bounding box matching the combined image (S255; expansion step).
FIG. 23 is a flowchart of the rectangle completion step (S260) within the treatment tool area completion step (S200); the processing of the processing device (computer) 105 (CPU 111) comprises a tip determination step (S261), a direction detection step (S262), a brightness detection step (S263), an extension step (S264), a combination step (S265), and an expansion step (S266).
The tip determination function detects the tool tip from the segmentation mask output by the treatment tool extraction unit and determines whether the tip has an opening (S261; tip determination step). The direction detection function detects the tool's extension direction from the segmentation mask output by the treatment tool extraction unit (S262; direction detection step). The brightness detection function detects the brightness around the treatment tool image in the color-processed image and determines the extension direction from the brightness levels (S263; brightness detection step). The extension function extends the minimum rectangle in the extension direction to generate an extended treatment tool image (S264; extension step). The combination function combines the extended treatment tool image with the inferred treatment tool image obtained through the deep learning model to generate a combined image (S265; combination step). The expansion function generates the bounding box matching the combined image (S266; expansion step).
The computer program of the present invention described above may be recorded on a processor-readable recording medium; a "non-transitory tangible medium" such as a disk, card, semiconductor memory, or programmable logic circuit can be used as the recording medium.
The computer program can be implemented using, for example, a scripting language such as ActionScript or JavaScript (registered trademark), an object-oriented programming language such as Objective-C or Java (registered trademark), or a markup language such as HTML5.
10 Treatment tool detection device
60 Inference image
101, 102, 401, 601 Treatment tool
103 Endoscope
104 Affected area
105 Processing device
106 Endoscope display
107 Patient
108 Trocar
111 CPU
112 ROM
113 RAM
114 Storage unit
115 I/O
121 Surgical image
122 Color-reduced image
122g Grayscale image
123 Binarized image
123a Treatment tool area
123g Binarized image based on the grayscale image
124 Color-processed image
125 Combined image
126, 133, 134 Inference mask
127 Treatment tool mask
128 Brightness-adjusted image
129 Non-target treatment tool image
129p Inference image
130 Overlap-adjusted image
135, 136 Bounding box
137 Added portion
138 Rectangular region
141 Inferred treatment tool image
142 Extended image
150 Image acquisition unit
160 Image recognition unit
170 Treatment tool extraction unit
180 Tip detection unit
200 Treatment tool area completion unit
210 Root identification unit
220 End portion detection unit
230 Angle detection unit
240 Tip point detection unit
250 Image completion unit
251 Color reduction unit
252 Binarization unit
253 Extraction unit
254 Combination unit
255 Expansion unit
260 Rectangle completion unit
261 Tip determination unit
262 Direction detection unit
263 Brightness detection unit
264 Extension unit
265 Combination unit
266 Expansion unit
402 Segmentation mask
403, Bx1, Bx2 Bounding box
602 Image edge line
603 Image edge
604, 605 Region
Claims (11)
1. A treatment tool detection device for an endoscopic image, comprising:
an image acquisition unit for acquiring a surgical image;
an image recognition unit that performs image recognition of the surgical image using a pre-trained deep learning model;
a treatment tool extraction unit that generates, in the surgical image, a treatment tool area consisting of a segmentation mask and a bounding box of the treatment tool image; and
a treatment tool area completion unit that complements a missing treatment tool area caused by an obstacle present in the surgical image,
wherein the treatment tool area completion unit comprises an image completion unit that performs color-tone processing on the surgical image to generate a color-processed image and complements the treatment tool area using the color-processed image, and
wherein, in generating the color-processed image, the image completion unit comprises:
a color reduction unit that subtracts a predetermined color from the surgical image to generate a color-reduced image;
a binarization unit that binarizes the color-reduced image;
an extraction unit that deletes areas in the surgical image unrelated to the target treatment tool and extracts the treatment tool image from the color-processed image;
a combination unit that combines the treatment tool image acquired from the color-processed image with an inferred treatment tool image acquired through the deep learning model to generate a combined image; and
an expansion unit that generates the bounding box to fit the combined image.
2. The treatment tool detection device for an endoscopic image according to claim 1, wherein the treatment tool area completion unit comprises a rectangle completion unit that generates, for the treatment tool segmentation mask output by the treatment tool extraction unit, a minimum rectangle enclosing the treatment tool area obtained by image processing and performs completion using the minimum rectangle, and
wherein, in generating the minimum rectangle enclosing the treatment tool area, the rectangle completion unit comprises:
a tip determination unit that detects the tip of the treatment tool from the segmentation mask output by the treatment tool extraction unit and determines whether the tip has an opening;
a direction detection unit that detects the extension direction of the treatment tool from the segmentation mask output by the treatment tool extraction unit;
a brightness detection unit that detects the brightness around the treatment tool in the surgical image and detects the extension direction of the treatment tool from the brightness levels based on an inference mask;
an extension unit that extends the minimum rectangle in the extension direction to generate an extended treatment tool image;
a combination unit that combines the extended treatment tool image with the inferred treatment tool image acquired through the deep learning model to generate a combined image; and
an expansion unit that generates the bounding box to fit the combined image.
3. The treatment tool detection device according to claim 1, wherein the treatment tool area completion unit comprises the image completion unit and the rectangle completion unit and selectively outputs the output of either one.
7. A tip detection device for a treatment tool in an endoscopic image, comprising:
an image acquisition unit for acquiring a surgical image;
an image recognition unit that performs image recognition of the surgical image using a pre-trained deep learning model;
a treatment tool extraction unit that generates a segmentation mask and a bounding box of the treatment tool in the surgical image;
a treatment tool area completion unit that complements a missing treatment tool area caused by an obstacle present in the surgical image; and
a tip detection unit that detects the tip of the treatment tool using the generated segmentation mask and the generated bounding box,
wherein the tip detection unit comprises:
a root identification unit that identifies, as a root, the side of the image edge region present in the surgical image whose coverage area with the segmentation mask is largest among the region's four sides;
an end portion detection unit that detects an end portion of the treatment tool;
an angle detection unit that detects the angle of the treatment tool; and
a tip point detection unit that detects the tip point of the treatment tool.
8. A treatment tool detection method for an endoscopic image, in which a computer in an endoscopic image treatment tool detection device executes:
an image acquisition step of acquiring a surgical image;
an image recognition step of performing image recognition of the surgical image using a pre-trained deep learning model;
a treatment tool extraction step of generating, in the surgical image, a treatment tool area consisting of a segmentation mask and a bounding box of the treatment tool image; and
a treatment tool area completion step of complementing a missing treatment tool area caused by an obstacle present in the surgical image,
wherein the treatment tool area completion step comprises an image completion step of performing color-tone processing on the surgical image to generate a color-processed image and complementing the treatment tool area using the color-processed image, and
wherein, in generating the color-processed image, the image completion step executes:
a color reduction step of subtracting a predetermined color from the surgical image to generate a color-reduced image;
a binarization step of binarizing the color-reduced image;
an extraction step of deleting areas in the surgical image unrelated to the target treatment tool and extracting the treatment tool image from the color-processed image;
a combination step of combining the treatment tool image acquired from the color-processed image with an inferred treatment tool image acquired through the deep learning model to generate a combined image; and
an expansion step of generating the bounding box to fit the combined image.
9. The treatment tool detection method for an endoscopic image according to claim 8, wherein the treatment tool area completion step executes a rectangle completion step of generating, for the treatment tool segmentation mask output by the treatment tool extraction step, a minimum rectangle enclosing the treatment tool area obtained by image processing and performing completion using the minimum rectangle, and
wherein, in generating the minimum rectangle enclosing the treatment tool area, the rectangle completion step executes:
a tip determination step of detecting the tip of the treatment tool from the segmentation mask output by the treatment tool extraction step and determining whether the tip has an opening;
a direction detection step of detecting the extension direction of the treatment tool from the segmentation mask output by the treatment tool extraction step;
a brightness detection step of detecting the brightness around the treatment tool in the surgical image and detecting the extension direction of the treatment tool from the brightness levels based on an inference mask;
an extension step of extending the minimum rectangle in the extension direction to generate an extended treatment tool image;
a combination step of combining the extended treatment tool image with the inferred treatment tool image acquired through the deep learning model to generate a combined image; and
an expansion step of generating the bounding box to fit the combined image.
10. A treatment tool detection program for an endoscopic image that causes a computer in an endoscopic image treatment tool detection device to realize:
an image acquisition function of acquiring a surgical image;
an image recognition function of performing image recognition of the surgical image using a pre-trained deep learning model;
a treatment tool extraction function of generating, in the surgical image, a treatment tool area consisting of a segmentation mask and a bounding box of the treatment tool image; and
a treatment tool area completion function of complementing a missing treatment tool area caused by an obstacle present in the surgical image,
wherein the treatment tool area completion function includes an image completion function of performing color-tone processing on the surgical image to generate a color-processed image and complementing the treatment tool area using the color-processed image, and
wherein, in generating the color-processed image, the image completion function realizes:
a color reduction function of subtracting a predetermined color from the surgical image to generate a color-reduced image;
a binarization function of binarizing the color-reduced image;
an extraction function of deleting areas in the surgical image unrelated to the target treatment tool and extracting the treatment tool image from the color-processed image;
a combination function of combining the treatment tool image acquired from the color-processed image with an inferred treatment tool image acquired through the deep learning model to generate a combined image; and
an expansion function of generating the bounding box to fit the combined image.
11. The treatment tool detection program for an endoscopic image according to claim 10, wherein the treatment tool area completion function realizes a rectangle completion function of generating, for the treatment tool segmentation mask output by the treatment tool extraction function, a minimum rectangle enclosing the treatment tool area obtained by image processing and performing completion using the minimum rectangle, and
wherein, in generating the minimum rectangle enclosing the treatment tool area, the rectangle completion function realizes:
a tip determination function of detecting the tip of the treatment tool from the segmentation mask output by the treatment tool extraction function and determining whether the tip has an opening;
a direction detection function of detecting the extension direction of the treatment tool from the segmentation mask output by the treatment tool extraction function;
a brightness detection function of detecting the brightness around the treatment tool in the surgical image and detecting the extension direction of the treatment tool from the brightness levels based on an inference mask;
an extension function of extending the minimum rectangle in the extension direction to generate an extended treatment tool image;
a combination function of combining the extended treatment tool image with the inferred treatment tool image acquired through the deep learning model to generate a combined image; and
an expansion function of generating the bounding box to fit the combined image.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/JP2023/002125 WO2024157360A1 (en) | 2023-01-24 | 2023-01-24 | Treatment instrument detection device for endoscopic images, treatment instrument detection method for endoscopic images, and treatment instrument detection device program for endoscopic images |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2024157360A1 true WO2024157360A1 (en) | 2024-08-02 |
Family
ID=91969985
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/JP2023/002125 Ceased WO2024157360A1 (en) | 2023-01-24 | 2023-01-24 | Treatment instrument detection device for endoscopic images, treatment instrument detection method for endoscopic images, and treatment instrument detection device program for endoscopic images |
Country Status (1)
| Country | Link |
|---|---|
| WO (1) | WO2024157360A1 (en) |
Patent Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2013141155A1 (en) * | 2012-03-17 | 2013-09-26 | Waseda University | Image completion system for in-image cutoff region, image processing device, and program therefor |
| JP2014064657A (en) * | 2012-09-25 | 2014-04-17 | Canon Inc | Stereoscopic endoscope apparatus |
| WO2022054883A1 (en) * | 2020-09-10 | 2022-03-17 | Olympus Corporation | Control device, endoscope system, control method, and control program |
| JP2023030681A (en) * | 2021-08-23 | 2023-03-08 | National Cancer Center Japan | Tip detection device of treatment instrument of endoscopic image, tip detection method of treatment instrument of endoscopic image and tip detection program of treatment instrument of endoscopic image |
| JP7081862B1 (en) * | 2021-11-26 | 2022-06-07 | Jmees Inc. | Surgery support system, surgery support method, and surgery support program |
Non-Patent Citations (1)
| Title |
|---|
| NAKASUJI, HISA; KAWAI, TOSHIKAZU; IWAMOTO, NORIYASU; NISHIKAWA, ATSUSHI; NISHIZAWA, YUJI; NAKAMURA, TATSUO: "1A1-H01 Image Recognition of Surgical Tool using Saturation for Endoscope Holding Robot", The Japan Society of Mechanical Engineers, Robotics and Mechatronics Division, JP, 1-5 June 2018, pages 1A1-H01, XP009556547, ISSN: 2424-3124, DOI: 10.1299/jsmermd.2018.1A1-H01 * |
Similar Documents
| Publication | Title |
|---|---|
| CN111968091B (en) | Method for detecting and classifying lesion areas in clinical image |
| CN101420897B (en) | Endoscope insertion direction detecting device and endoscope insertion direction detecting method |
| EP1994879B1 (en) | Image analysis device and image analysis method |
| US9672610B2 (en) | Image processing apparatus, image processing method, and computer-readable recording medium |
| US20220122244A1 (en) | Defect image generation method for deep learning and system therefor |
| JP4434705B2 (en) | Image analysis method |
| US12285151B2 (en) | Target part identification among human-body internal images |
| JP2006006359A (en) | Image generator, image generator method, and its program |
| CN112508835A (en) | Non-contrast agent medical image enhancement modeling method based on GAN |
| CN113974837A (en) | Identification system and method of end tool, end tool and surgical robot system |
| CN111105427A (en) | A method and system for lung image segmentation based on connected region analysis |
| US12417530B2 (en) | Systems and methods to process electronic images to provide localized semantic analysis of whole slide images |
| CN113706515B (en) | Tongue image anomaly determination method, tongue image anomaly determination device, computer equipment and storage medium |
| WO2024157360A1 (en) | Treatment instrument detection device for endoscopic images, treatment instrument detection method for endoscopic images, and treatment instrument detection device program for endoscopic images |
| JP2006325937A (en) | Image determination device, image determination method, and program therefor |
| US20250090135A1 (en) | Diagnosis support device, ultrasound endoscope, diagnosis support method, and program |
| CN113793316B (en) | Ultrasonic scanning area extraction method, device, equipment and storage medium |
| CN117974603A (en) | Method, device, equipment and storage medium for detecting and segmenting multiple lesions of digestive tract |
| WO2023026632A1 (en) | Device for detecting front end of treatment tool in endoscopic image, method for detecting front end of treatment tool in endoscopic image, and program for detecting front end of treatment tool in endoscopic image |
| Gadermayr et al. | Getting one step closer to fully automatized celiac disease diagnosis |
| CN120070287B (en) | Endoscopic image color correction method, device, equipment and storage medium |
| CN119338786B (en) | A multi-scale detection method and device for tumors in digestive tract endoscopy images |
| CN112508834A (en) | Non-contrast agent medical image enhancement method based on GAN |
| CN121147063A (en) | A method, system, and device for highlight removal in grayscale images of cervical cancer based on image inpainting |
| JP2005177037A (en) | Calcified shadow judgment method, calcified shadow judgment apparatus and program |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 23918329; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |