CN116612411A - Method and system for identifying blood vessels, lymph nodes and nerves in operation - Google Patents
Method and system for identifying blood vessels, lymph nodes and nerves in operation
- Publication number
- CN116612411A CN116612411A CN202310496990.1A CN202310496990A CN116612411A CN 116612411 A CN116612411 A CN 116612411A CN 202310496990 A CN202310496990 A CN 202310496990A CN 116612411 A CN116612411 A CN 116612411A
- Authority
- CN
- China
- Prior art keywords
- layer
- attention
- blood vessels
- recognition model
- nerves
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B34/00—Computer-aided surgery; Manipulators or robots specially adapted for use in surgery
- A61B34/10—Computer-aided planning, simulation or modelling of surgical operations
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/22—Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/46—Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B34/00—Computer-aided surgery; Manipulators or robots specially adapted for use in surgery
- A61B34/10—Computer-aided planning, simulation or modelling of surgical operations
- A61B2034/101—Computer-aided simulation of surgical operations
- A61B2034/105—Modelling of the patient, e.g. for ligaments or bones
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/03—Recognition of patterns in medical or anatomical images
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- General Health & Medical Sciences (AREA)
- Evolutionary Computation (AREA)
- Life Sciences & Earth Sciences (AREA)
- Computing Systems (AREA)
- Biomedical Technology (AREA)
- Molecular Biology (AREA)
- Software Systems (AREA)
- Artificial Intelligence (AREA)
- Medical Informatics (AREA)
- Surgery (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Public Health (AREA)
- Robotics (AREA)
- Veterinary Medicine (AREA)
- Animal Behavior & Ethology (AREA)
- Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
- Databases & Information Systems (AREA)
- Heart & Thoracic Surgery (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- General Engineering & Computer Science (AREA)
- Mathematical Physics (AREA)
- Medical Treatment And Welfare Office Work (AREA)
Abstract
The invention discloses a method and system for identifying blood vessels, lymph nodes and nerves during surgery, and relates to the field of computer technology. The method includes: S1, acquiring surgical videos, marking them and building a blood vessel sample database, a lymph node sample database and a nerve sample database; S2, constructing an initial recognition model; S3, training the initial recognition model to obtain a blood vessel recognition model, a lymph node recognition model and a nerve recognition model; S4, capturing surgical video, recognizing and outlining the contours of blood vessels, lymph nodes and nerves; S5, visually presenting the surgical video and the contours to the surgeon. The system includes an acquisition module, a central processing unit and a display module. An artificial-intelligence model recognizes the blood vessel, lymph node and nerve structures in the laparoscopic field of view in real time and, combined with a visual display function, gives the surgeon guidance for resecting or avoiding the corresponding blood vessels, lymph nodes and nerves, guiding the surgeon to reasonably identify and resect blood vessels and lymph nodes, to avoid those that do not need to be removed, and assisting the operation to proceed smoothly.
Description
Technical Field
The present invention relates to the field of computer technology, and in particular to a method and system for identifying blood vessels, lymph nodes and nerves during surgery.
Background Art
During surgery, owing to the complexity of the anatomical structures and the subjective factors of the operator, resection of and injury to blood vessels, lymph nodes and nerves are relatively common. Improper injury or erroneous resection is very likely to lead to postoperative complications, which hinders patient recovery and affects the patient's quality of life.
Summary of the Invention
The object of the present invention is to design a method and system for identifying blood vessels, lymph nodes and nerves during surgery in order to solve the above problems.
The present invention achieves the above object through the following technical solution:
A method for identifying blood vessels, lymph nodes and nerves during surgery, comprising:
S1, acquiring surgical videos of multiple types of operations, marking the surgical stage as well as the blood vessels, lymph nodes and nerves in the field of view, and building a blood vessel sample database, a lymph node sample database and a nerve sample database;
S2, constructing three initial recognition models, each comprising an FPN neural network and a multi-feature extraction neural network; the FPN neural network has a bottom-up, top-down and laterally connected structure; the multi-feature extraction neural network includes four pooling attention layers connected to form a bottom-up down-sampling structure; the bottom-up path of the FPN neural network has four levels, one pooling attention layer corresponding to one bottom-up down-sampling level, and the outputs of the previous pooling attention layer and of the previous down-sampling level both serving as inputs to the next down-sampling level;
S3, importing the blood vessel sample database into an initial recognition model and training and optimizing it to obtain a blood vessel recognition model; importing the lymph node sample database into an initial recognition model and training and optimizing it to obtain a lymph node recognition model; importing the nerve sample database into an initial recognition model and training and optimizing it to obtain a nerve recognition model;
S4, capturing surgical video in real time and feeding it into the blood vessel recognition model, the lymph node recognition model and the nerve recognition model respectively to obtain blood vessel, lymph node and nerve recognition results, and outlining the contours of these results;
S5, visually displaying the surgical video and the contours to the surgeon.
A system for identifying blood vessels, lymph nodes and nerves during surgery, comprising:
an acquisition module for capturing surgical video in real time;
a central processing unit for analyzing the surgical video to identify blood vessels, lymph nodes and nerves, and for marking the obtained blood vessel, lymph node and nerve recognition results in the surgical video;
a display module for displaying the annotated surgical video to the surgeon.
The beneficial effect of the present invention is that an artificial-intelligence model is used to recognize the blood vessel, lymph node and nerve structures in the laparoscopic field of view in real time and, combined with a visual display function, provides the surgeon with guidance for resecting or avoiding the corresponding blood vessels, lymph nodes and nerves, thereby guiding the surgeon to reasonably identify and resect structures such as blood vessels and lymph nodes, to avoid those that do not need to be removed, and assisting the operation to proceed smoothly.
Brief Description of the Drawings
Fig. 1 is a schematic flow chart of the method for identifying blood vessels, lymph nodes and nerves during surgery according to the present invention;
Fig. 2 is a schematic structural diagram of the recognition model in the present invention;
Fig. 3 is a schematic structural diagram of the multi-feature extraction neural network in the present invention.
Detailed Description of the Embodiments
In order to make the purpose, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be described clearly and completely below in conjunction with the accompanying drawings. Apparently, the described embodiments are some, but not all, of the embodiments of the present invention. The components of the embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a variety of different configurations.
Accordingly, the following detailed description of the embodiments of the present invention provided in the accompanying drawings is not intended to limit the scope of the claimed invention, but merely represents selected embodiments of the invention. Based on the embodiments of the present invention, all other embodiments obtained by persons of ordinary skill in the art without creative effort fall within the protection scope of the present invention.
It should be noted that similar reference numerals and letters denote similar items in the following figures; therefore, once an item has been defined in one figure, it does not need to be further defined or explained in subsequent figures.
In the description of the present invention, it should be understood that the orientations or positional relationships indicated by terms such as "upper", "lower", "inner", "outer", "left" and "right" are based on the orientations or positional relationships shown in the accompanying drawings, or the orientations or positional relationships in which the product of the invention is usually placed when in use, or those commonly understood by persons skilled in the art; they are used only to facilitate and simplify the description of the present invention, and do not indicate or imply that the device or element referred to must have a particular orientation or be constructed and operated in a particular orientation, and therefore cannot be understood as limiting the present invention.
In addition, terms such as "first" and "second" are used only to distinguish the description and cannot be understood as indicating or implying relative importance.
In the description of the present invention, it should also be noted that, unless otherwise expressly specified and limited, terms such as "arranged" and "connected" should be understood in a broad sense; for example, "connected" may be a fixed connection, a detachable connection or an integral connection; it may be a mechanical connection or an electrical connection; it may be a direct connection or an indirect connection through an intermediary, or internal communication between two elements. Persons of ordinary skill in the art can understand the specific meanings of the above terms in the present invention according to the specific situation.
The specific embodiments of the present invention will be described in detail below in conjunction with the accompanying drawings.
As shown in Fig. 1, Fig. 2 and Fig. 3, the method for identifying blood vessels, lymph nodes and nerves during surgery comprises:
S1, acquiring surgical videos of multiple types of operations, marking the surgical stage as well as the blood vessels, lymph nodes and nerves in the field of view, and building a blood vessel sample database, a lymph node sample database and a nerve sample database;
S2, constructing three initial recognition models, each comprising an FPN neural network and a multi-feature extraction neural network; the FPN neural network has a bottom-up, top-down and laterally connected structure; the multi-feature extraction neural network includes four pooling attention layers connected to form a bottom-up down-sampling structure; the bottom-up path of the FPN neural network has four levels, one pooling attention layer corresponding to one bottom-up down-sampling level, and the outputs of the previous pooling attention layer and of the previous down-sampling level both serving as inputs to the next down-sampling level;
S3, importing the blood vessel sample database into an initial recognition model and training and optimizing it to obtain a blood vessel recognition model; importing the lymph node sample database into an initial recognition model and training and optimizing it to obtain a lymph node recognition model; importing the nerve sample database into an initial recognition model and training and optimizing it to obtain a nerve recognition model;
S4, capturing surgical video in real time and feeding it into the blood vessel recognition model, the lymph node recognition model and the nerve recognition model respectively to obtain blood vessel, lymph node and nerve recognition results, and outlining the contours of these results;
S5, visually displaying the surgical video and the contours to the surgeon, as illustrated by the pipeline sketch below.
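The following is a minimal sketch, in Python with OpenCV, of the per-frame inference loop behind steps S4 and S5. The three recognition models are assumed to be already trained and to expose a `predict(frame)` call returning a binary mask; the model interface, colours and window name are illustrative assumptions, not details taken from the patent.

```python
# Illustrative per-frame pipeline for steps S4-S5 (assumed model interface).
import cv2
import numpy as np

def outline(frame, mask, color):
    """Draw the contour of a binary mask onto the frame."""
    contours, _ = cv2.findContours(mask.astype(np.uint8),
                                   cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    cv2.drawContours(frame, contours, -1, color, 2)
    return frame

def run(video_source, vessel_model, lymph_model, nerve_model):
    cap = cv2.VideoCapture(video_source)                  # S4: real-time acquisition
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        frame = outline(frame, vessel_model.predict(frame), (0, 0, 255))  # blood vessels
        frame = outline(frame, lymph_model.predict(frame), (0, 255, 0))   # lymph nodes
        frame = outline(frame, nerve_model.predict(frame), (255, 0, 0))   # nerves
        cv2.imshow("annotated surgical video", frame)     # S5: display to the surgeon
        if cv2.waitKey(1) == 27:                          # Esc stops the loop
            break
    cap.release()
    cv2.destroyAllWindows()
```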
Each pooling attention layer includes a pooling layer and a first self-attention layer; the output of the pooling layer serves as the input of the first self-attention layer, and the first self-attention layer down-samples the surgical image through local aggregation while computing global self-attention.
The multi-feature extraction neural network further includes four second self-attention layers; a second self-attention layer divides its input into non-overlapping windows and computes local self-attention within each window, and the output of the second self-attention layer serves as an input of the FPN neural network.
The multi-feature extraction neural network further includes three mixed window layers; a mixed window layer computes local attention within a window, except in its last block, and the output of the mixed window layer serves as an input of the FPN neural network.
For any input sequence, the pooling attention layer applies a linear projection to obtain the three tensors Q, K and V, expressed as Q = XW_Q, K = XW_K, V = XW_V. The three tensors are then pooled in turn and the attention is computed, with relative position information incorporated into the attention computation, which is expressed as Attention(Q, K, V) = Softmax((QK^T + E^(rel)) / sqrt(D)) V; the distance computation between elements i and j is decomposed along the spatio-temporal axes as R_{p(i),p(j)} = R^h_{h(i),h(j)} + R^w_{w(i),w(j)}, where h and w denote the vertical and horizontal directions, respectively.
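A minimal PyTorch sketch of one pooling attention layer as described above is given below. It uses a single attention head, average pooling as the pooling operator P, and, for simplicity, pools only K and V; the stride, head count and the placeholder comment for the relative-position term are assumptions made for illustration.

```python
# Sketch of one pooling attention layer: linear projection, pooling, global attention.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PoolingAttention(nn.Module):
    """Single-head pooling attention: Q = X W_Q, K and V are pooled before attention."""
    def __init__(self, dim, kv_stride=2):
        super().__init__()
        self.q_proj = nn.Linear(dim, dim)
        self.k_proj = nn.Linear(dim, dim)
        self.v_proj = nn.Linear(dim, dim)
        self.kv_stride = kv_stride

    def _pool(self, tokens, h, w):
        # Reshape the token sequence back to an h x w map and average-pool it (operator P).
        b, n, c = tokens.shape
        grid = tokens.transpose(1, 2).reshape(b, c, h, w)
        grid = F.avg_pool2d(grid, kernel_size=self.kv_stride, stride=self.kv_stride)
        return grid.flatten(2).transpose(1, 2)

    def forward(self, x, h, w):
        # x: (batch, h*w, dim)
        q = self.q_proj(x)                      # Q = X W_Q
        k = self._pool(self.k_proj(x), h, w)    # pooled K, shorter sequence
        v = self._pool(self.v_proj(x), h, w)    # pooled V, shorter sequence
        scores = q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5   # QK^T / sqrt(D)
        # The decomposed relative-position term E^(rel) would be added to `scores` here.
        return scores.softmax(dim=-1) @ v
```

Pooling K and V shortens the sequences the attention is computed over, which is where the cost saving discussed later in this section comes from.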
A system for identifying blood vessels, lymph nodes and nerves during surgery comprises:
an acquisition module for capturing surgical video in real time;
a central processing unit for analyzing the surgical video to identify blood vessels, lymph nodes and nerves, and for marking the obtained blood vessel, lymph node and nerve recognition results in the surgical video;
a display module for displaying the annotated surgical video to the surgeon.
The working principle of the method and system for identifying blood vessels, lymph nodes and nerves during surgery according to the present invention is as follows:
In general, the diameter of arteries and veins does not exceed 3 cm, the diameter of lymph nodes is usually 0.5-2 cm, and the width of nerves does not exceed 1 cm. The three types of targets to be detected are therefore all small objects, usually occupying no more than 32x32 pixels in the laparoscopic image; the information they carry is limited and contains too little discriminative content, and in addition there are problems of dataset imbalance and of reference (anchor) boxes being difficult to match.
To better solve these problems, the FPN neural network introduces a bottom-up, top-down network structure that fuses the features of adjacent levels to achieve feature enhancement, which works well for targets with few pixels and insufficiently rich features. A strong baseline is created by improving pooled attention along two axes: decomposed position distances inject position information into the pooling attention layer, and a pooled residual connection compensates for the effect of the pooling stride in the attention computation; this is combined with the FPN neural network for object detection and instance segmentation.
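A minimal sketch of the FPN-style top-down pathway with lateral connections described above; the channel counts and the number of pyramid levels are illustrative assumptions.

```python
# Sketch of an FPN head: lateral 1x1 convs, top-down upsampling and addition.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleFPN(nn.Module):
    """Fuse four bottom-up feature maps (c2..c5) into enhanced maps p2..p5."""
    def __init__(self, in_channels=(96, 192, 384, 768), out_channels=256):
        super().__init__()
        self.lateral = nn.ModuleList(nn.Conv2d(c, out_channels, 1) for c in in_channels)
        self.smooth = nn.ModuleList(nn.Conv2d(out_channels, out_channels, 3, padding=1)
                                    for _ in in_channels)

    def forward(self, feats):
        # feats: [c2, c3, c4, c5], ordered from high to low resolution.
        laterals = [lat(f) for lat, f in zip(self.lateral, feats)]
        for i in range(len(laterals) - 1, 0, -1):        # top-down: upsample and add
            laterals[i - 1] = laterals[i - 1] + F.interpolate(
                laterals[i], size=laterals[i - 1].shape[-2:], mode="nearest")
        return [sm(p) for sm, p in zip(self.smooth, laterals)]
```

Each fused map keeps the semantic strength of the deeper levels while preserving the resolution of the shallower ones, which is what helps the small vessel, lymph node and nerve targets.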
The key idea of the algorithm is to build the different stages of high-dimensional and low-dimensional visual modelling by expanding the channel width while reducing the resolution, instead of using single-scale blocks. Because features of different scales need to be extracted and fused through up- and down-sampling, the algorithm introduces pooling attention, as shown in Fig. 3. For any input sequence X, a linear projection yields the three tensors Q (query), K (key) and V (value): Q = XW_Q, K = XW_K, V = XW_V,
where Q denotes the query vector, K the vector describing the correlation between the queried information and other information, V the vector of the queried information, W the linear projection matrices, X the input to the self-attention, and P the pooling operator.
Q, K and V then undergo pooling, mainly to shorten the lengths of the K and V sequences, after which the attention is computed on the pooled tensors: Attention(Q, K, V) = Softmax(QK^T / sqrt(D)) V,
where K^T is the transpose of the matrix K and D is the vector dimension used in the multi-head pooled self-attention.
The first self-attention layer of the multi-feature extraction neural network can apply pooling at every step, which greatly reduces the memory cost and the amount of computation of the Q-K-V calculation; the hardware and computing-power requirements for running the system are therefore much lower, allowing more medical institutions to operate it at low cost.
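A back-of-the-envelope illustration of the saving described above: pooling K and V with stride 2 shrinks the Q-K score matrix by a factor of four. The 56x56 token grid is an assumed example size, not a figure taken from the patent.

```python
# Rough count of attention-score entries with and without K/V pooling,
# for an assumed 56x56 token grid and a pooling stride of 2.
tokens = 56 * 56                    # query sequence length
pooled = (56 // 2) * (56 // 2)      # K/V sequence length after stride-2 pooling
full_cost = tokens * tokens         # entries in QK^T without pooling
pooled_cost = tokens * pooled       # entries in QK^T with pooled K and V
print(full_cost, pooled_cost, full_cost / pooled_cost)  # ~9.8M vs ~2.5M entries, 4.0x fewer
```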
Because absolute position encoding only provides position information and ignores the translation invariance of features, the dependency between two regions changes when their absolute positions change, even though their relative position has not changed. To solve this problem, the model incorporates relative position information into the self-attention computation of the pooling attention layer, where the added term depends only on the relative position distance between tokens: Attention(Q, K, V) = Softmax((QK^T + E^(rel)) / sqrt(D)) V, where E^(rel)_{ij} = Q_i · R_{p(i),p(j)},
where i denotes the i-th token in the time dimension, j the j-th token in the space dimension, R the relative position encoding, d the absolute value of the range of relative distances encoded by R, and p(i), p(j) the spatio-temporal positions of elements i and j.
To simplify the computation and reduce complexity, the algorithm decomposes the distance computation between elements i and j along the spatio-temporal axes into: R_{p(i),p(j)} = R^h_{h(i),h(j)} + R^w_{w(i),w(j)} + R^t_{t(i),t(j)},
where h and w denote the vertical and horizontal directions, respectively, and t is the time dimension; because this project uses single surgical frames rather than surgical video clips, the t term is zero, i.e. R_{p(i),p(j)} = R^h_{h(i),h(j)} + R^w_{w(i),w(j)}.
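A minimal PyTorch sketch of the decomposed relative position term for a single head, under the assumption of learnable embedding tables along the height and width axes (the time term is dropped, as stated above) and of queries and keys lying on the same h x w grid.

```python
# Sketch of the decomposed relative position bias E^(rel)_ij = Q_i . (R^h + R^w).
import torch
import torch.nn as nn

class DecomposedRelPosBias(nn.Module):
    """Relative position term decomposed along the vertical (h) and horizontal (w) axes."""
    def __init__(self, h, w, head_dim):
        super().__init__()
        self.h, self.w = h, w
        self.rel_h = nn.Parameter(torch.zeros(2 * h - 1, head_dim))  # R^h table
        self.rel_w = nn.Parameter(torch.zeros(2 * w - 1, head_dim))  # R^w table

    def forward(self, q):
        # q: (batch, h*w, head_dim); returns the bias added to QK^T, shape (batch, h*w, h*w).
        ys, xs = torch.arange(self.h), torch.arange(self.w)
        dy = ys[:, None] - ys[None, :] + self.h - 1       # (h, h) indices into rel_h
        dx = xs[:, None] - xs[None, :] + self.w - 1       # (w, w) indices into rel_w
        Rh, Rw = self.rel_h[dy], self.rel_w[dx]           # (h, h, d) and (w, w, d)
        q = q.reshape(-1, self.h, self.w, q.shape[-1])
        bias_h = torch.einsum("bhwd,hkd->bhwk", q, Rh)    # Q_i . R^h, per query row
        bias_w = torch.einsum("bhwd,wkd->bhwk", q, Rw)    # Q_i . R^w, per query column
        bias = bias_h[:, :, :, :, None] + bias_w[:, :, :, None, :]  # broadcast-added
        return bias.reshape(-1, self.h * self.w, self.h * self.w)
```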
After the above improvements, the structure of the multi-feature extraction neural network can be integrated into the FPN neural network. Structurally, the multi-feature extraction neural network generates multi-scale feature maps in four stages and is therefore naturally integrated into the FPN neural network used for the object detection task; the top-down pyramid with lateral connections in the FPN neural network constructs semantically strong feature maps from the multi-feature extraction neural network at all scales, as shown in Fig. 2.
In addition to the pooling attention layer, the algorithm introduces a second self-attention layer to significantly reduce computational and memory complexity. The pooling attention layer down-samples its input through local aggregation but keeps the global self-attention computation, whereas the second self-attention layer keeps the resolution of the tensor but divides the input into non-overlapping windows and computes self-attention only locally, within each window. The intrinsic difference between the two approaches makes them complementary for the object detection task. A mixed window layer is also introduced to add cross-window connections: it computes local attention within a window, except in the last block of each of the last three stages; these blocks are all fed into the FPN neural network, so that the resulting feature maps contain global information.
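A minimal PyTorch sketch of the second self-attention layer described above: the token grid is partitioned into non-overlapping windows and self-attention is computed independently inside each window. The window size and the single-head simplification are assumptions for illustration.

```python
# Sketch of local self-attention inside non-overlapping windows (single head).
import torch
import torch.nn as nn

class WindowSelfAttention(nn.Module):
    """Partition the feature map into windows and attend only within each window."""
    def __init__(self, dim, window=7):
        super().__init__()
        self.qkv = nn.Linear(dim, 3 * dim)
        self.window = window

    def forward(self, x):
        # x: (batch, H, W, dim); H and W are assumed divisible by the window size.
        b, H, W, c = x.shape
        s = self.window
        # Partition into (batch * num_windows, s*s, dim) token groups.
        x = x.reshape(b, H // s, s, W // s, s, c).permute(0, 1, 3, 2, 4, 5)
        x = x.reshape(-1, s * s, c)
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        attn = (q @ k.transpose(-2, -1)) * (c ** -0.5)   # attention restricted to the window
        x = attn.softmax(dim=-1) @ v
        # Reverse the window partition back to (batch, H, W, dim).
        x = x.reshape(b, H // s, W // s, s, s, c).permute(0, 1, 3, 2, 4, 5)
        return x.reshape(b, H, W, c)
```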
The two attention mechanisms enable the blood vessel recognition model, the lymph node recognition model and the nerve recognition model to accurately locate targets at every scale: no matter how the laparoscope moves during the operation, a structure can be recognized accurately as long as its features appear under the lens. At the same time the required computation is modest: the model's floating-point operation count (FLOPs) is only 42.1 G and its parameter count (Params) is only 56 M, so the system can segment and annotate in real time during the operation, helping medical staff to quickly find blood vessels, lymph nodes and nerves and reducing surgical risk.
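The FLOPs and parameter figures above are the patent's own. The snippet below only illustrates how the parameter count of such a PyTorch model could be checked; it is not part of the patent.

```python
import torch.nn as nn

def count_params(model: nn.Module) -> float:
    """Return the number of trainable parameters in millions."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad) / 1e6
```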
An artificial-intelligence model is used to recognize the blood vessel, lymph node and nerve structures in the laparoscopic field of view in real time and, combined with a visual display function, provides the surgeon with guidance for resecting or avoiding the corresponding blood vessels, lymph nodes and nerves, thereby guiding the surgeon to reasonably identify and resect structures such as blood vessels and lymph nodes, to avoid the blood vessels and lymph nodes that do not need to be removed, and assisting the operation to proceed smoothly.
The technical solution of the present invention is not limited by the above specific embodiments; any technical modification made according to the technical solution of the present invention falls within the protection scope of the present invention.
Claims (6)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202310496990.1A CN116612411A (en) | 2023-05-05 | 2023-05-05 | Method and system for identifying blood vessels, lymph nodes and nerves in operation |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202310496990.1A CN116612411A (en) | 2023-05-05 | 2023-05-05 | Method and system for identifying blood vessels, lymph nodes and nerves in operation |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN116612411A true CN116612411A (en) | 2023-08-18 |
Family
ID=87673917
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202310496990.1A Pending CN116612411A (en) | 2023-05-05 | 2023-05-05 | Method and system for identifying blood vessels, lymph nodes and nerves in operation |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN116612411A (en) |
Citations (8)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN112164082A (en) * | 2020-10-09 | 2021-01-01 | 深圳市铱硙医疗科技有限公司 | Method for segmenting multi-modal MR brain image based on 3D convolutional neural network |
| CN113226148A (en) * | 2018-07-16 | 2021-08-06 | 爱惜康有限责任公司 | Integration of imaging data |
| CN113409456A (en) * | 2021-08-19 | 2021-09-17 | 江苏集萃苏科思科技有限公司 | Modeling method, system, device and medium for three-dimensional model before craniocerebral puncture operation |
| CN114359317A (en) * | 2021-12-17 | 2022-04-15 | 浙江大学滨江研究院 | Blood vessel reconstruction method based on small sample identification |
| CN114724682A (en) * | 2022-06-08 | 2022-07-08 | 成都与睿创新科技有限公司 | Auxiliary decision-making method and device for minimally invasive surgery |
| CN115049605A (en) * | 2022-06-09 | 2022-09-13 | 栢华科技(南京)有限公司 | Artificial intelligence processing and analyzing method for liver disease pathological tissue image |
| CN115279252A (en) * | 2019-12-30 | 2022-11-01 | 西拉格国际有限公司 | Visualization system using structured light |
| CN115311249A (en) * | 2022-08-29 | 2022-11-08 | 中国科学院长春光学精密机械与物理研究所 | Method for segmenting blood vessels and nerves in neurosurgical microsurgery image |
Patent Citations (8)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN113226148A (en) * | 2018-07-16 | 2021-08-06 | 爱惜康有限责任公司 | Integration of imaging data |
| CN115279252A (en) * | 2019-12-30 | 2022-11-01 | 西拉格国际有限公司 | Visualization system using structured light |
| CN112164082A (en) * | 2020-10-09 | 2021-01-01 | 深圳市铱硙医疗科技有限公司 | Method for segmenting multi-modal MR brain image based on 3D convolutional neural network |
| CN113409456A (en) * | 2021-08-19 | 2021-09-17 | 江苏集萃苏科思科技有限公司 | Modeling method, system, device and medium for three-dimensional model before craniocerebral puncture operation |
| CN114359317A (en) * | 2021-12-17 | 2022-04-15 | 浙江大学滨江研究院 | Blood vessel reconstruction method based on small sample identification |
| CN114724682A (en) * | 2022-06-08 | 2022-07-08 | 成都与睿创新科技有限公司 | Auxiliary decision-making method and device for minimally invasive surgery |
| CN115049605A (en) * | 2022-06-09 | 2022-09-13 | 栢华科技(南京)有限公司 | Artificial intelligence processing and analyzing method for liver disease pathological tissue image |
| CN115311249A (en) * | 2022-08-29 | 2022-11-08 | 中国科学院长春光学精密机械与物理研究所 | Method for segmenting blood vessels and nerves in neurosurgical microsurgery image |
Non-Patent Citations (1)
| Title |
|---|
| YANGHAO LI et al.: "MViTv2: Improved Multiscale Vision Transformers for Classification and Detection", COMPUTER VISION AND PATTERN RECOGNITION, 2 December 2021 (2021-12-02), pages 4804-4814 * |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| US12175684B2 (en) | Pedestrian tracking method, computing device, pedestrian tracking system and storage medium | |
| CN112800903B (en) | Dynamic expression recognition method and system based on space-time diagram convolutional neural network | |
| US20220180520A1 (en) | Target positioning method, apparatus and system | |
| CN114998934B (en) | Clothes-changing pedestrian re-identification and retrieval method based on multi-mode intelligent perception and fusion | |
| CN109472198B (en) | Gesture robust video smiling face recognition method | |
| CN112541433B (en) | Two-stage human eye pupil accurate positioning method based on attention mechanism | |
| Gou et al. | Cascade learning from adversarial synthetic images for accurate pupil detection | |
| CN113435236A (en) | Home old man posture detection method, system, storage medium, equipment and application | |
| CN109086706A (en) | Applied to the action identification method based on segmentation manikin in man-machine collaboration | |
| Oszust et al. | Recognition of signed expressions observed by Kinect Sensor | |
| CN115187550B (en) | Target registration method, device, equipment, storage medium and program product | |
| CN114093024A (en) | Human body action recognition method, device, equipment and storage medium | |
| CN114708952B (en) | Image annotation method and device, storage medium and electronic equipment | |
| CN115994944B (en) | Training method of key point prediction model, three-dimensional key point prediction method and related equipment | |
| CN114283178A (en) | Image registration method and device, computer equipment and storage medium | |
| CN117315137A (en) | Monocular RGB image gesture reconstruction method and system based on self-supervision learning | |
| CN114781393B (en) | Image description generation method and device, electronic equipment and storage medium | |
| CN116630292A (en) | Target detection method, target detection device, electronic equipment and storage medium | |
| CN116612411A (en) | Method and system for identifying blood vessels, lymph nodes and nerves in operation | |
| Joshi et al. | Real-time object detection and identification for visually challenged people using mobile platform | |
| CN118196835B (en) | Method and system for re-identifying pedestrians changing clothes based on spatial consistency | |
| CN113343927A (en) | Intelligent face recognition method and system suitable for facial paralysis patient | |
| CN119723657A (en) | Action recognition method and action recognition system combining segmentation and depth estimation | |
| Chang et al. | Multi-view 3d human pose estimation with self-supervised learning | |
| CN118470774A (en) | Self-supervision face AU detection method, equipment and medium without label guidance |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |