
WO2025036995A1 - Annotation overlay through a streaming interface - Google Patents

Annotation overlay through a streaming interface

Info

Publication number
WO2025036995A1
Authority
WO
WIPO (PCT)
Prior art keywords
surgical
interest
region
highlight
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
PCT/EP2024/073050
Other languages
English (en)
Inventor
Petros GIATAGANAS
Gauthier Camille Louis GRAS
Danail V. Stoyanov
Imanol Luengo Muntion
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Digital Surgery Ltd
Original Assignee
Digital Surgery Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Digital Surgery Ltd filed Critical Digital Surgery Ltd
Publication of WO2025036995A1

Classifications

    • G - PHYSICS
    • G16 - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H - HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H 30/00 - ICT specially adapted for the handling or processing of medical images
    • G16H 30/40 - ICT specially adapted for the handling or processing of medical images for processing medical images, e.g. editing
    • A - HUMAN NECESSITIES
    • A61 - MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B - DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B 34/00 - Computer-aided surgery; Manipulators or robots specially adapted for use in surgery
    • A61B 34/25 - User interfaces for surgical systems
    • G - PHYSICS
    • G16 - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H - HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H 20/00 - ICT specially adapted for therapies or health-improving plans, e.g. for handling prescriptions, for steering therapy or for monitoring patient compliance
    • G16H 20/40 - ICT specially adapted for therapies or health-improving plans, e.g. for handling prescriptions, for steering therapy or for monitoring patient compliance relating to mechanical, radiation or invasive therapies, e.g. surgery, laser therapy, dialysis or acupuncture
    • A - HUMAN NECESSITIES
    • A61 - MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B - DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B 34/00 - Computer-aided surgery; Manipulators or robots specially adapted for use in surgery
    • A61B 34/20 - Surgical navigation systems; Devices for tracking or guiding surgical instruments, e.g. for frameless stereotaxis
    • A61B 2034/2046 - Tracking techniques
    • A61B 2034/2065 - Tracking using image or pattern recognition
    • A - HUMAN NECESSITIES
    • A61 - MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B - DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B 34/00 - Computer-aided surgery; Manipulators or robots specially adapted for use in surgery
    • A61B 34/25 - User interfaces for surgical systems
    • A61B 2034/256 - User interfaces for surgical systems having a database of accessory information, e.g. including context sensitive help or scientific articles
    • A - HUMAN NECESSITIES
    • A61 - MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B - DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B 90/00 - Instruments, implements or accessories specially adapted for surgery or diagnosis and not covered by any of the groups A61B1/00 - A61B50/00, e.g. for luxation treatment or for protecting wound edges
    • A61B 90/36 - Image-producing devices or illumination devices not otherwise provided for
    • A61B 2090/364 - Correlation of different images or relation of image positions in respect to the body
    • A61B 2090/365 - Correlation of different images or relation of image positions in respect to the body augmented reality, i.e. correlating a live optical image with another image

Definitions

  • the present disclosure relates in general to computing technology and more particularly to computing technology for overlay annotation through a streaming interface.
  • Computer-assisted systems can rely on video data digitally captured during a surgery in an operating room.
  • video data can be stored and/or streamed.
  • the video data can be used within a system to augment a person’s physical sensing, perception, and reaction capabilities.
  • such systems can effectively provide the information corresponding to an expanded field of vision, both temporal and spatial, that enables a person to adjust current and future actions based on the part of an environment not included in his or her physical field of view.
  • the video data, which can include or be accompanied by audio data captured by one or more microphones, can be stored and/or transmitted for several purposes such as archival, operational notes, training, post-surgery analysis, and/or patient consultation.
  • Streaming systems may allow users within an operating room environment to collaborate with users outside of the operating room environment. Further, streaming systems may be observational to allow remote users to view and discuss a live surgical procedure without directly interacting with the surgeon or surgical team performing an operation.
  • a computer-implemented method includes detecting, by a surgical monitoring system, a user input through a user interface of a streaming session to select a region of interest to highlight in a surgical video stream during a surgical procedure.
  • the computer-implemented method also includes highlighting the region of interest, by the surgical monitoring system, based on determining that the region of interest is associated with a known structure and performing a boundary search, by the surgical monitoring system, to locate one or more feature boundaries and highlight within the one or more feature boundaries based on determining that the region of interest is associated with an unknown structure.
  • the computer-implemented method further includes tracking, by the surgical monitoring system, an orientation and position of the region of interest during the surgical procedure to maintain the highlight within the one or more feature boundaries to account for movement in the surgical video stream until a user command is received or a time period has elapsed to remove the highlight.
  • a computer program product includes a memory device having computer executable instructions stored thereon, which when executed by one or more processors cause the one or more processors to perform operations including detecting a user input through a user interface of a streaming session to select a region of interest to highlight in a surgical video stream during a surgical procedure, and searching a frame of the surgical video stream to locate one or more feature boundaries and highlight within the one or more feature boundaries based on determining that the region of interest is associated with an unknown structure not yet identified. The operations further including tracking the region of interest during the surgical procedure to maintain the highlight within the one or more feature boundaries to account for movement in the surgical video stream until a user command is received or a time period has elapsed to remove the highlight.
  • a system includes a memory system and a processing system coupled to the memory system.
  • the processing system is configured to execute a plurality of instructions to detect a user input through a user interface of a streaming session to select a region of interest to highlight in a surgical video stream during a surgical procedure, search a frame of the surgical video stream to locate two or more feature points of the region of interest and highlight at least a portion of the region of interest based on determining that the region of interest is associated with an unknown structure not yet identified by the system, and track the region of interest during the surgical procedure to maintain the highlight to account for movement in the surgical video stream until a user command is received or a time period has elapsed to remove the highlight.
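The summaries above describe a flow of detecting a selection, branching on whether the structure is known or unknown, and then tracking the highlight. The sketch below is a minimal, hypothetical rendering of that flow; the class and function names (`Highlight`, `boundary_search`, `handle_selection`, `maintain_highlight`) and their signatures are illustrative assumptions, not the disclosed implementation.

```python
# Hypothetical sketch of the claimed flow; all names and signatures are
# illustrative assumptions, not the disclosed implementation.
from dataclasses import dataclass


@dataclass
class Highlight:
    boundary: list            # (x, y) vertices enclosing the region of interest
    label: str = ""           # optional user-editable text label
    active: bool = True


def boundary_search(frame, point):
    """Stand-in boundary search for an unknown structure (see later sketches)."""
    x, y = point
    return [(x - 10, y - 10), (x + 10, y - 10), (x + 10, y + 10), (x - 10, y + 10)]


def handle_selection(frame, point, identify_structure):
    """Highlight a known structure directly, or search for feature boundaries."""
    structure = identify_structure(frame, point)     # returns None when unknown
    if structure is not None:
        return Highlight(boundary=structure["boundary"], label=structure["name"])
    return Highlight(boundary=boundary_search(frame, point))


def maintain_highlight(frames, highlight, track_boundary, timeout_s, removal_requested):
    """Track the region of interest until a user command or timeout removes it."""
    for timestamp, frame in frames:                  # (seconds, image) pairs
        if removal_requested() or timestamp > timeout_s:
            highlight.active = False
            break
        highlight.boundary = track_boundary(frame, highlight.boundary)
    return highlight
```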
  • FIG. 1 depicts a computer-assisted surgery (CAS) system according to one or more aspects
  • FIG. 2 depicts a surgical procedure system in accordance with one or more aspects
  • FIG. 3 depicts a system for prediction generation that can be incorporated according to one or more aspects
  • FIG. 4A depicts a user interface of a streaming session according to one or more aspects
  • FIG. 4B depicts a user interface of a streaming session where a user input is detected according to one or more aspects
  • FIG. 4C depicts a user interface of a streaming session where highlighting is applied based on detecting the user input of FIG. 4B according to one or more aspects
  • FIG. 5A depicts a user interface of a streaming session according to one or more aspects
  • FIG. 5B depicts a user interface of a streaming session where a first user input is detected according to one or more aspects
  • FIG. 5C depicts a user interface of a streaming session where highlighting is applied based on detecting the first user input of FIG. 5B according to one or more aspects
  • FIG. 5D depicts a user interface of a streaming session where a second user input is detected according to one or more aspects
  • FIG. 5E depicts a user interface of a streaming session where highlighting is applied based on detecting the second user input of FIG. 5D according to one or more aspects
  • FIG. 5F depicts a user interface of a streaming session with highlighting and adjustable vertices of feature boundaries according to one or more aspects
  • FIG. 5G depicts a user interface of a streaming session with highlighting adjusted after moving an adjustable vertex according to one or more aspects
  • FIG. 6A depicts a user interface of a streaming session according to one or more aspects
  • FIG. 6B depicts a user interface of a streaming session where a user input is detected according to one or more aspects
  • FIG. 6C depicts a user interface of a streaming session where highlighting is applied based on detecting the user input of FIG. 6B according to one or more aspects
  • FIG. 6D depicts a user interface of a streaming session where highlighting of FIG. 6C is labeled according to one or more aspects
  • FIG. 7 depicts a flowchart of a method for annotation overlay according to one or more aspects.
  • FIG. 8 depicts a computer system according to one or more aspects.
  • a video streaming system can allow one or more participants external to an operating room to observe and interact with a surgeon or surgical team within the operating room.
  • a user interface within the operating room can provide a surgeon or surgical team with the ability to invite one or more participants to observe the surgical procedure and interact through audio, video, and/or telestration.
  • Participants outside of the operating room can use a different user interface that allows the participants to customize viewing preferences as well as interacting through audio, video, and/or telestration.
  • Other interactions can occur through an interactive chat while streaming is active. Further, comments can be added by users during streaming or postoperatively to link with a recording of the surgical video.
  • surgical video can include an endoscopic view of a surgical procedure. Further, there can be multiple cameras or selectable points-of-view that capture the surgical procedure.
  • the surgical video can be captured with overlaid content, such as structural identification using color and/or text overlays.
  • the overlaid content can be merged with the surgical video or managed as another stream such that viewers may have an option of turning the overlaid content on or off.
  • annotations can be smart overlays that track to underlying structures. For example, selection of a structure by a user can result in a visual highlight of the structure that tracks to movement and orientation changes of the structure. Annotations can also include text labels, which may be user editable. Various approaches can be used to select a region of interest for highlighting, and the approaches can differ depending on whether a selected structure is readily identifiable by a machine learning model or is unknown at the time of selection.
  • Machine learning models can monitor one or more surgical video streams and/or other data sources to track progress through a surgical procedure.
  • the machine learning models can be trained to predict the occurrence of events and presence of structures within the context of a surgical procedure.
  • machine learning models can learn a sequence of phases for one or more types of surgical procedures along with expected occurrences of events or structures within particular phases. Phases or events can be associated with expected anatomy and/or surgical instruments. By tracking phase information, higher probability predictions can be made for structures which may not be fully visible within a scene as a surgical procedure progresses. Aspects can also track user modifications to annotations to aid in later identification of the same structure in the surgical procedure and/or similar structures in future surgical procedures. Further details are described in greater detail herein.
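As a rough illustration of how phase information could raise the probability assigned to a structure that is only partially visible, a detector score might be blended with a phase-conditioned prior. The sketch below is hypothetical; the phase names, structure names, prior values, and blending weight are invented for illustration.

```python
# Hypothetical illustration: combining a detector score with a phase prior.
PHASE_PRIOR = {
    # P(structure visible | surgical phase); values are invented for illustration.
    "dissection": {"cystic_duct": 0.7, "cystic_artery": 0.6, "gallbladder": 0.9},
    "clipping":   {"cystic_duct": 0.9, "cystic_artery": 0.8, "gallbladder": 0.8},
}


def phase_weighted_score(detector_score, structure, phase, weight=0.5):
    """Blend a raw detector confidence with the phase-conditioned prior."""
    prior = PHASE_PRIOR.get(phase, {}).get(structure, 0.5)
    return (1 - weight) * detector_score + weight * prior


# Example: a weak detection of a partially occluded duct during clipping.
print(phase_weighted_score(0.35, "cystic_duct", "clipping"))  # 0.625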
  • An operating room may contain a camera and microphone located on a central console and/or one or more cameras and microphones affixed (e.g., via a clip or other means) to medical personnel or objects in the operating room.
  • one or more cameras and microphones can be attached or integrated into one or more devices in the operating room such as, but not limited to surgical tools, goggles, personal computers, smart watches, and/or smart phones.
  • One or more cameras with microphones can be designated as providing a view of a surgeon or surgical team within the operating room.
  • One or more surgical cameras can be incorporated with surgical tools, such as a laparoscopic or, more generally, an endoscopic camera.
  • While some resulting video feeds can include an audio portion, other video feeds may not include audio.
  • the CAS system 100 includes at least a computing system 102, a video/audio recording system 104, and a surgical instrumentation system 106.
  • an actor 112 can be medical personnel that uses the CAS system 100 to perform a surgical procedure on a patient 110.
  • Medical personnel, or health care professionals can be a surgeon, assistant, nurse, administrator, or any other actor that interacts with the CAS system 100 in a surgical environment.
  • the surgical procedure can be any type of surgery, such as but not limited to open or laparoscopic hernia repair, laparoscopic cholecystectomy, robotic laparoscopic surgery, or any other surgical procedure with or without a robot.
  • actor 112 can be a surgeon, anesthesiologist, theatre nurse, technician, an administrator, an engineer, or any other such personnel that interacts with the CAS system 100.
  • actor 112 can record data from the CAS system 100, configure/update one or more attributes of the CAS system 100, review past performance of the CAS system 100, repair the CAS system 100, etc.
  • a surgical procedure can include multiple phases, and each phase can include one or more surgical actions.
  • a “surgical action” can include an incision, a compression, a stapling, a clipping, a suturing, a cauterization, a sealing, or any other such actions performed to complete a phase in the surgical procedure.
  • a “phase” represents a surgical event that is composed of a series of steps (e.g., closure).
  • a “step” refers to the completion of a named surgical objective (e.g., hemostasis).
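A simple way to picture the phase/step/action terminology defined above is as a nested data model, sketched hypothetically below; the field names are illustrative only.

```python
# Hypothetical data model for the phase/step/action terminology defined above.
from dataclasses import dataclass, field
from typing import List


@dataclass
class Action:
    name: str                 # e.g., "incision", "clipping", "suturing"


@dataclass
class Step:
    objective: str            # named surgical objective, e.g., "hemostasis"
    actions: List[Action] = field(default_factory=list)


@dataclass
class Phase:
    name: str                 # surgical event composed of a series of steps, e.g., "closure"
    steps: List[Step] = field(default_factory=list)
```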
  • certain surgical instruments 108 (e.g., forceps)
  • the video/audio recording system 104 shown in FIG. 1 includes one or more cameras 105, such as operating room cameras, endoscopic cameras, etc.
  • the cameras 105 capture video data of the surgical procedure being performed.
  • the video/audio recording system 104 includes one or more video capture devices that can include cameras 105 placed in the surgical room to capture events surrounding (i.e., outside) the patient being operated upon.
  • the video/audio recording system 104 further includes cameras 105 that are passed inside (e.g., endoscopic cameras) the patient 110 to capture endoscopic data.
  • the endoscopic data provides video and images of the surgical procedure.
  • the video/audio recording system 104 also includes one or more microphones 107, which can be located on a central console, affixed (e.g., via a clip or other means) to medical personnel or objects in the operating room, and/or attached to or integrated into one or more devices in the operating room. Examples of devices in the operating room can include, but are not limited to surgical tools, video recorders, cameras, goggles, personal computers, smart watches, and/or smart phones.
  • the microphones 107 capture audio data, and can be wired or wireless or a combination of both.
  • the video data captured by the cameras 105 and the audio data captured by the microphones 107 can both include timestamps (or other indicia) that are used to correlate the video data and the audio data.
  • the timestamps can be used to correlate, or synchronize, the sounds captured in the operating room with the images of the medical procedure performed in the operating room.
  • the computing system 102 includes one or more memory devices, one or more processors, and a user interface device, among other components. All or a portion of the computing system 102 shown in FIG. 1 can be implemented, for example, by all or a portion of computer system 800 of FIG. 8. Computing system 102 can execute one or more computer-executable instructions. The execution of the instructions facilitates the computing system 102 to perform one or more methods, including those described herein.
  • the computing system 102 can communicate with other computing systems via a wired and/or a wireless network.
  • the computing system 102 includes one or more trained machine learning models that can detect and/or predict features of/from the surgical procedure that is being performed or has been performed earlier.
  • Features can include structures, such as anatomical structures and surgical instruments 108, in the captured video of the surgical procedure.
  • Features can further include events, such as phases and actions, in the surgical procedure.
  • Features that are detected can further include the actor 112 and/or patient 110.
  • the computing system 102 in one or more examples, can provide recommendations for subsequent actions to be taken by the actor 112.
  • the computing system 102 can provide one or more reports based on the detections.
  • the detections by the machine learning models can be performed in an autonomous or semi-autonomous manner.
  • the machine learning models can include artificial neural networks, such as deep neural networks, convolutional neural networks, recurrent neural networks, encoders, decoders, or any other type of machine learning model.
  • the machine learning models can be trained in a supervised, unsupervised, or hybrid manner.
  • the machine learning models can be trained to perform detection and/or prediction using one or more types of data acquired by the CAS system 100.
  • the machine learning models can use the video data captured via the video/audio recording system 104.
  • the machine learning models use the surgical instrumentation data from the surgical instrumentation system 106.
  • the machine learning models use a combination of video data and surgical instrumentation data.
  • the machine learning models can also use audio data captured by the one or more microphones 107 during the surgical procedure.
  • the audio data can include sounds emitted by the surgical instrumentation system 106 while activating one or more surgical instruments 108.
  • the audio data can include voice commands, snippets, or dialog from one or more actors 112.
  • the audio data can further include sounds made by the surgical instruments 108 during their use.
  • the one or more machine learning models can then be used in real-time to process one or more data streams (e.g., video streams, audio streams, RFID data, etc.).
  • the processing can include predicting and characterizing visualization modifications in images of a video of a surgical procedure based on one or more surgical phases, instruments, and/or other structures within various instantaneous or block time periods.
  • the visualization can be modified to highlight the presence, position, and/or use of one or more structures.
  • the structures can be used to identify a stage within a workflow (e.g., as represented via a surgical data structure), predict a future stage within a workflow, etc.
  • the visualization can be selectively displayed based on a user input through a user interface, such as tapping or drawing on the user interface.
  • the machine learning models can detect surgical actions, surgical phases, anatomical structures, surgical instruments, activities, events, and various other features from the data associated with a surgical procedure. The detection can be performed in real-time in some examples. Alternatively, or in addition, the computing system 102 analyzes the surgical data, i.e., the various types of data captured during the surgical procedure, in an offline manner (e.g., post-surgery). In one or more examples, the machine learning models detect surgical phases based on detecting some of the features such as the anatomical structure, surgical instruments, etc.
  • Machine learning models executed by or accessible by the computing system 102 can include surgical monitoring modules 103.
  • the surgical monitoring modules 103 can monitor video data, audio data and/or other surgical data (e.g., sensor data, instrument data, etc.) to determine context information to assist in identifying and tracking structures depicted in a surgical video captured by cameras 105 upon a user request.
  • a data collection system 150 can be employed to store the surgical data, including the video(s) captured during the surgical procedures and the audio data captured during the surgical procedure.
  • the data collection system 150 includes one or more storage devices 152.
  • the data collection system 150 can be a local storage system, a cloud-based storage system, or a combination thereof. Further, the data collection system 150 can use any type of cloud-based storage architecture, for example, public cloud, private cloud, hybrid cloud, etc. In some examples, the data collection system can use a distributed storage, i.e., the storage devices 152 are located at different geographic locations.
  • the storage devices 152 can include any type of electronic data storage media used for recording machine-readable data, such as semiconductor-based, magnetic-based, or optical-based storage media, or a combination thereof.
  • the data storage media can include flash-based solid-state drives (SSDs), magnetic-based hard disk drives, magnetic tape, optical discs, etc.
  • the data collection system 150 can be part of the video/audio recording system 104, or vice-versa.
  • the data collection system 150, the video/audio recording system 104, and the computing system 102 can communicate with each other via a communication network, which can be wired, wireless, or a combination thereof.
  • the communication between the systems can include the transfer of data (e.g., video data, audio data, instrumentation data, etc.), data manipulation commands (e.g., browse, copy, paste, move, delete, create, compress, etc.), data manipulation results, etc.
  • the computing system 102 can manipulate the data already stored/being stored in the data collection system 150 based on outputs from the one or more machine learning models, e.g., phase detection, structure detection, etc. Alternatively, or in addition, the computing system 102 can manipulate the data already stored/being stored in the data collection system 150 based on information from the surgical instrumentation system 106.
  • the video captured by the video/audio recording system 104 is stored on the data collection system 150.
  • the computing system 102 curates parts of the video data being stored on the data collection system 150.
  • the computing system 102 filters the video captured by the video/audio recording system 104 before it is stored on the data collection system 150.
  • the computing system 102 filters the video captured by the video/audio recording system 104 after it is stored on the data collection system 150.
  • a surgical data management system 160 can provide access to portions of data captured in the data collection system 150, as well as data and records stored in other systems. Participant systems 165A-165N can access the surgical data management system 160 through one or more applications or secure web pages. Participant systems 165A-165N can include various types of computing devices, such as personal computers, laptop computers, tablet computers, mobile devices, smart appliances, and the like.
  • the surgical data management system 160 can be a stand-alone application, module, and/or an extension of another system, including processing system and networking support hardware and software to support operation of the surgical data management system 160.
  • Video, with or without audio, can pass through the data collection system 150 and surgical data management system 160 to support real-time streaming between users of the participant systems 165A-165N and one or more actors 112 through cameras 105 and/or microphones 107.
  • Surgical instruments 108 can also be a source of streaming.
  • Annotation information can be stored in a data store 162, such as annotation database 163, to track annotation information associated with surgical data, where the annotations are collected based on user input through the participant systems 165A-165N.
  • labels, geometry information, and context information can be stored in the annotation database 163 to supplement or modify tracking information identified by machine learning models associated with the surgical monitoring modules 103.
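A hypothetical record layout for such annotation entries might look like the sketch below; the field names are assumptions chosen to mirror the labels, geometry, and context information described above, not a documented schema of the annotation database 163.

```python
# Hypothetical record layout for entries in an annotation database such as the
# annotation database 163 described above; field names are illustrative only.
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class AnnotationRecord:
    session_id: str                       # streaming session the annotation belongs to
    created_by: str                       # participant who supplied the input
    timestamp_s: float                    # position in the surgical video stream
    label: str                            # user-editable text label, if any
    geometry: List[Tuple[float, float]]   # boundary vertices of the highlighted region
    context: dict = field(default_factory=dict)   # e.g., detected phase, instruments in view
```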
  • the combination of the computing system 102, surgical monitoring modules 103, video/audio recording system 104, cameras 105, microphones 107, data collection system 150, storage devices 152, surgical data management system 160, and/or data store 162 can be referred to as a surgical monitoring system 101.
  • a portion of the surgical instrumentation system 106 that provides feedback on surgical instruments can also be part of the surgical monitoring system 101.
  • Components of the surgical monitoring system 101 can be combined or further subdivided. Further, the surgical monitoring system 101 may include additional components beyond those specifically described above.
  • Referring to FIG. 2, a surgical procedure system 200 is generally shown in accordance with one or more aspects.
  • the example of FIG. 2 depicts a surgical procedure support system 202 that can include or may be coupled to the CAS system 100 of FIG. 1.
  • the surgical procedure support system 202 can acquire image or video data using one or more cameras 204 (e.g., cameras 105 of FIG. 1).
  • the surgical procedure support system 202 can also acquire audio data using one or more microphones 220 (e.g., microphones 107 of FIG. 1).
  • the surgical procedure support system 202 can further interface with a plurality of sensors 206 and effectors 208.
  • the sensors 206 may be associated with surgical support equipment and/or patient monitoring.
  • the effectors 208 can be robotic components or other equipment (e.g., surgical instruments 108 of FIG. 1) controllable through the surgical procedure support system 202.
  • the surgical procedure support system 202 can also interact with one or more user interfaces 210, such as various input and/or output devices.
  • the surgical procedure support system 202 can store, access, and/or update surgical data 214 associated with a training dataset and/or live data as a surgical procedure is being performed on patient 110 of FIG. 1.
  • the surgical procedure support system 202 can store, access, and/or update surgical objectives 216 to assist in training and guidance for one or more surgical procedures.
  • User configurations 218 can track and store user preferences.
  • the surgical procedure support system 202 can also communicate with other systems through a network 230.
  • the surgical procedure support system 202 can communicate with an electronic medical record (EMR) system 240, a surgical data post-processing system 250, and/or other types of devices, such as computing devices 234, 264 (e.g., a mobile phone, tablet computer, or laptop), through a network 230.
  • user interfaces 210 may be connected to or integrated with the surgical procedure support system 202 by a local connection (e.g., within an operating room), while the mobile computing device 234 may connect to the surgical procedure support system 202 via a wireless connection directly or through the network 230.
  • the EMR system 240 can access and/or modify EMR data 242 used to track medical records including data associated with surgical procedures.
  • the EMR data 242 can be used to track status and outcomes of surgical procedures along with other medical records associated with the patient 110 of FIG. 1.
  • the computing device 234 can execute or link to another computer system that executes the surgical data management system 160 of FIG. 1 to access various data sources through the network 230.
  • the surgical data post-processing system 250 can receive surgical data and associated data generated by the surgical procedure support system 202 and may be separately stored and secured through other data storage. Access to specific data or portions of data through the surgical data post-processing system 250 may be limited by associated permissions.
  • the surgical data post-processing system 250 may include features such as video viewing, video sharing, data analytics, and selective data extraction.
  • One or more computing device 264 can interact with the surgical data management system 160 of FIG. 1 to access various data sources through a network 260.
  • the network 230 may be within a facility or multiple facilities maintained within a private network.
  • the network 260 may be a wider area network, such as the internet. Accordingly, the networks 230 and 260 may have access to different files and data sets along with shared access to select files and data sets. In some aspects, networks 230 and 260 can be combined.
  • the computing devices 234, 264 are examples of the participant systems 165A-165N of FIG. 1. Accordingly, the surgical data management system 160 of FIG. 1 can be interposed between the computing devices 234, 264 and the network 230 to manage data flow, streaming, and access constraints. Further, portions of the surgical data management system 160 can be executed locally by the computing devices 234, 264 and/or the surgical procedure support system 202.
  • a system 300 for analyzing data that includes video data is generally shown according to one or more aspects.
  • the video data can be captured from video/audio recording system 104 of FIG. 1.
  • the analysis can result in predicting surgical phases and structures (e.g., instruments, anatomical structures, etc.) in the video data using machine learning.
  • System 300 can be the CAS system 100 of FIG. 1, or a part thereof in one or more examples.
  • System 300 uses data streams in the surgical data to identify procedural states according to some aspects.
  • System 300 includes a data reception system 305 that collects surgical data, including the video data and surgical instrumentation data.
  • the data reception system 305 can include one or more devices (e.g., one or more user devices and/or servers) located within and/or associated with a surgical operating room and/or control center.
  • the data reception system 305 can receive surgical data in real-time, i.e., as the surgical procedure is being performed. Alternatively, or in addition, the data reception system 305 can receive or access surgical data in an offline manner, for example, by accessing data that is stored in the data collection system 150 of FIG. 1.
  • System 300 further includes a machine learning processing system 310 that processes the surgical data using one or more machine learning models to identify one or more features, such as surgical phase, instrument, anatomical structure, etc., in the surgical data.
  • machine learning processing system 310 can include one or more devices (e.g., one or more servers), each of which can be configured to include part or all of one or more of the depicted components of the machine learning processing system 310.
  • a part or all of the machine learning processing system 310 is in the cloud and/or remote from an operating room and/or physical location corresponding to a part or all of data reception system 305.
  • the components of the machine learning processing system 310 are depicted and described herein. However, the components represent just one example structure of the machine learning processing system 310; in other examples, the machine learning processing system 310 can be structured using a different combination of the components. Such variations in the combination of the components are encompassed by the technical solutions described herein.
  • the machine learning processing system 310 includes a machine learning training system 325, which can be a separate device (e.g., server) that stores its output as one or more trained machine learning models 330.
  • the machine learning models 330 are accessible by a machine learning execution system 340.
  • the machine learning execution system 340 can be separate from the machine learning training system 325 in some examples.
  • devices that “train” the models are separate from devices that “infer,” i.e., perform real-time processing of surgical data using the trained machine learning models 330.
  • Machine learning processing system 310 further includes a data generator 315 to generate simulated surgical data, such as a set of virtual images, or record the video data from the video/audio recording system 104, to train the machine learning models 330.
  • Data generator 315 can access (read/write) a data store 320 to record data, including multiple images and/or multiple videos.
  • the images and/or videos can include images and/or videos collected during one or more procedures (e.g., one or more surgical procedures). For example, the images and/or video may have been collected by a user device worn by the actor 112 of FIG. 1.
  • the data store 320 can be separate from the data collection system 150 of FIG. 1 in some examples. In other examples, the data store 320 can be part of the data collection system 150.
  • Each of the images and/or videos recorded in the data store 320 for training the machine learning models 330 can be defined as a base image and can be associated with other data that characterizes an associated procedure and/or rendering specifications.
  • the other data can identify a type of procedure, a location of a procedure, one or more people involved in performing the procedure, surgical objectives, and/or an outcome of the procedure.
  • the other data can indicate a stage of the procedure with which the image or video corresponds, rendering specification with which the image or video corresponds and/or a type of imaging device that captured the image or video (e.g., and/or, if the device is a wearable device, a role of a particular person wearing the device, etc.).
  • the other data can include image-segmentation data that identifies and/or characterizes one or more objects (e.g., tools, anatomical objects, etc.) that are depicted in the image or video.
  • the characterization can indicate the position, orientation, or pose of the object in the image.
  • the characterization can indicate a set of pixels that correspond to the object and/or a state of the object resulting from a past or current user handling. Localization can be performed using a variety of techniques for identifying objects in one or more coordinate systems.
  • the machine learning training system 325 uses the recorded data in the data store 320, which can include the simulated surgical data (e.g., set of virtual images) and actual surgical data to train the machine learning models 330.
  • the machine learning model 330 can be defined based on a type of model and a set of hyperparameters (e.g., defined based on input from a client device).
  • the machine learning models 330 can be configured based on a set of parameters that can be dynamically defined based on (e.g., continuous or repeated) training (i.e., learning, parameter tuning).
  • Machine learning training system 325 can use one or more optimization algorithms to define the set of parameters to minimize or maximize one or more loss functions.
  • the set of (learned) parameters can be stored as part of a trained machine learning model 330 using a specific data structure for that trained machine learning model 330.
  • the data structure can also include one or more non-learnable variables (e.g., hyperparameters and/or model definitions).
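A minimal sketch of such a model data structure, holding learned parameters alongside non-learnable variables, might look like the following; the field names are illustrative assumptions.

```python
# Hypothetical container for a trained model as described above: learned
# parameters plus non-learnable variables (hyperparameters, model definition).
from dataclasses import dataclass
from typing import Dict, Any


@dataclass
class TrainedModel:
    model_type: str                       # e.g., "fully_convolutional", "encoder_decoder"
    hyperparameters: Dict[str, Any]       # non-learnable settings fixed before training
    parameters: Dict[str, Any]            # learned weights produced by the optimizer
    definition: str = ""                  # serialized model/architecture definition
```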
  • Examples of the trained machine learning model 330 can include a surgical video monitoring model 332 and a surgical data monitoring model 334, where the surgical video monitoring model 332 and surgical data monitoring model 334 can be surgical monitoring modules 103 of FIG. 1.
  • the surgical video monitoring model 332 can be trained to classify and/or detect features in a surgical video stream.
  • the surgical data monitoring model 334 can be trained to classify and/or detect features in other sources of surgical data, such as sensor data, instrument data, audio data, and/or other data accessible to the surgical monitoring system 101.
  • Machine learning execution system 340 can access the data structure(s) of the machine learning models 330 and accordingly configure the machine learning models 330 for inference (i.e., prediction).
  • the machine learning models 330 can include, for example, a fully convolutional network adaptation, an adversarial network model, an encoder, a decoder, or other types of machine learning models.
  • the type of the machine learning models 330 can be indicated in the corresponding data structures.
  • the machine learning model 330 can be configured in accordance with one or more hyperparameters and the set of learned parameters.
  • the one or more machine learning models 330 during execution, receive, as input, surgical data to be processed and subsequently generate one or more inferences according to the training.
  • For example, the surgical data can include the video data captured by the video/audio recording system 104 of FIG. 1.
  • the video data that is captured by the video/audio recording system 104 can be received by the data reception system 305, which can include one or more devices located within an operating room where the surgical procedure is being performed.
  • the data reception system 305 can include devices that are located remotely, to which the captured video data is streamed live during the performance of the surgical procedure.
  • the data reception system 305 accesses the data in an offline manner from the data collection system 150 or from any other data source (e.g., local or remote storage device).
  • the data reception system 305 can process the video and/or other data received.
  • the processing can include decoding when a video stream is received in an encoded format such that data for a sequence of images can be extracted and processed.
  • the data reception system 305 can also process other types of data included in the input surgical data.
  • the surgical data can include additional data streams, such as audio data, RFID data, textual data, measurements from one or more surgical instruments/sensors, etc., that can represent stimuli/procedural states from the operating room.
  • the data reception system 305 synchronizes the different inputs from the different devices/sensors before inputting them in the machine learning processing system 310.
  • audio data can also be used as a data source to generate predictions.
  • Synchronization can be achieved by using a common reference clock to generate time stamps alongside each data stream.
  • the clocks can be shared via network protocols or through hardware locking or through any other means.
  • Such time stamps can be associated with any processed data format, such as, but not limited to text or other discrete data created from the audio signal.
  • Additional synchronization can be performed by linking actions, events, or phase segments that have been automatically processed from the raw signals using machine learning models. For example, text generated from an audio signal can be associated with specific phases of the procedure that are extracted from that audio or any other data stream signal. Text generated may be captured and/or displayed through a user interface.
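As a concrete, hypothetical illustration of timestamp-based synchronization against a shared reference clock, items derived from one stream (such as text generated from audio) can be aligned to the nearest frame of another stream by comparing time stamps; the example values below are invented.

```python
# Hypothetical sketch of timestamp-based synchronization: align items from one
# stream (e.g., text derived from audio) to the nearest frame of another stream
# using time stamps generated from a shared reference clock.
from bisect import bisect_left


def align_to_frames(frame_times, events):
    """Map each (timestamp, payload) event to the index of the nearest frame."""
    aligned = []
    for t, payload in events:
        i = bisect_left(frame_times, t)
        # Choose whichever neighboring frame is closer in time.
        if i > 0 and (i == len(frame_times) or t - frame_times[i - 1] <= frame_times[i] - t):
            i -= 1
        aligned.append((i, payload))
    return aligned


frames = [0.0, 0.033, 0.066, 0.100]              # frame timestamps (seconds)
speech = [(0.05, "clip applied"), (0.09, "irrigate")]
print(align_to_frames(frames, speech))           # [(2, 'clip applied'), (3, 'irrigate')]
```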
  • the machine learning models 330 can analyze the input surgical data, and in one or more aspects, predict and/or characterize structures included in the video data included with the surgical data.
  • the video data can include sequential images and/or encoded video data (e.g., using digital video file/stream formats and/or codecs, such as MP4, MOV, AVI, WEBM, AVCHD, OGG, etc.).
  • the prediction and/or characterization of the structures can include segmenting the video data or predicting the localization of the structures with a probabilistic heatmap.
  • the one or more machine learning models include or are associated with a preprocessing or augmentation (e.g., intensity normalization, resizing, cropping, etc.) that is performed prior to segmenting the video data.
  • An output of the one or more machine learning models can include image-segmentation or probabilistic heatmap data that indicates which (if any) of a defined set of structures are predicted within the video data, a location and/or position and/or pose of the structure(s) within the video data, and/or state of the structure(s).
  • the location can be a set of coordinates in an image/frame in the video data.
  • the coordinates can provide a bounding box.
  • the coordinates can provide boundaries that surround the structure(s) being predicted.
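The per-frame output described above might be represented as in the hypothetical sketch below, combining structure identities, confidences, bounding-box coordinates, optional probabilistic heatmaps, and pose estimates; the field names are illustrative.

```python
# Hypothetical shape of a per-frame model output as described above: which
# structures were predicted, where, and in what state; field names are illustrative.
from dataclasses import dataclass, field
from typing import Dict, Tuple, Optional
import numpy as np


@dataclass
class StructurePrediction:
    label: str                                   # e.g., "grasper", "cystic_duct"
    confidence: float                            # prediction confidence in [0, 1]
    bounding_box: Tuple[int, int, int, int]      # (x_min, y_min, x_max, y_max) in pixels
    heatmap: Optional[np.ndarray] = None         # per-pixel probability of the structure
    pose: Optional[Tuple[float, float, float]] = None   # position/orientation estimate


@dataclass
class FrameOutput:
    frame_index: int
    structures: Dict[str, StructurePrediction] = field(default_factory=dict)
```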
  • the trained machine learning models 330 in one or more examples, are trained to perform higher-level predictions and tracking, such as predicting a phase of a surgical procedure and tracking one or more surgical instruments used in the surgical procedure.
  • the machine learning processing system 310 includes a detector 350 that uses the machine learning models to identify a phase within the surgical procedure (“procedure”).
  • Detector 350 uses a particular procedural tracking data structure 355 from a list of procedural tracking data structures. Detector 350 selects the procedural tracking data structure 355 based on the type of surgical procedure that is being performed. In one or more examples, the type of surgical procedure is predetermined or input by actor 112. The procedural tracking data structure 355 identifies a set of potential phases that can correspond to a part of the specific type of procedure.
  • the procedural tracking data structure 355 can be a graph that includes a set of nodes and a set of edges, with each node corresponding to a potential phase.
  • the edges can provide directional connections between nodes that indicate (via the direction) an expected order during which the phases will be encountered throughout an iteration of the procedure.
  • the procedural tracking data structure 355 may include one or more branching nodes that feed to multiple next nodes and/or can include one or more points of divergence and/or convergence between the nodes.
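A procedural tracking data structure of this kind can be pictured as a directed graph keyed by phase, as in the hypothetical sketch below; the phase names and transitions are invented for illustration and do not represent any specific procedure.

```python
# Hypothetical directed-graph encoding of a procedural tracking data structure:
# nodes are candidate phases, edges point to phases expected to follow, and a
# node may branch to multiple next phases. Phase names are illustrative only.
PROCEDURE_GRAPH = {
    "access":      ["dissection"],
    "dissection":  ["clipping", "hemostasis"],   # branching node
    "clipping":    ["transection"],
    "hemostasis":  ["dissection", "transection"],
    "transection": ["extraction"],
    "extraction":  ["closure"],
    "closure":     [],
}


def allowed_next_phases(current_phase):
    """Return the phases that may be expected to follow the current phase."""
    return PROCEDURE_GRAPH.get(current_phase, [])


print(allowed_next_phases("dissection"))   # ['clipping', 'hemostasis']
```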
  • a phase indicates a procedural action (e.g., surgical action) that is being performed or has been performed and/or indicates a combination of actions that have been performed.
  • a phase relates to a biological state of a patient undergoing a surgical procedure.
  • the biological state can indicate a complication (e.g., blood clots, clogged arteries/veins, etc.) or a pre-condition (e.g., lesions, polyps, etc.).
  • the machine learning models 330 are trained to detect an “abnormal condition,” such as hemorrhaging, arrhythmias, blood vessel abnormality, etc.
  • Each node within the procedural tracking data structure 355 can identify one or more characteristics of the phase corresponding to that node.
  • the characteristics can include visual characteristics.
  • the node identifies one or more tools that are typically in use or availed for use (e.g., on a tool tray) during the phase.
  • the node also identifies one or more roles of people who are typically performing a surgical task, a typical type of movement (e.g., of a hand or tool), etc.
  • detector 350 can use the segmented data generated by machine learning execution system 340 that indicates the presence and/or characteristics of particular objects within a field of view to identify an estimated node to which the real image data corresponds.
  • Identification of the node can further be based upon previously detected phases for a given procedural iteration and/or other detected input (e.g., verbal audio data that includes person-to-person requests or comments, explicit identifications of a current or past phase, information requests, etc.).
  • the detector 350 outputs the prediction associated with a portion of the video data that is analyzed by the machine learning processing system 310.
  • the prediction is associated with the portion of the video data by identifying a start time and an end time of the portion of the video that is analyzed by the machine learning execution system 340.
  • the prediction that is output can include an identity of a surgical phase, activity, or event as detected by the detector 350 based on the output of the machine learning execution system 340.
  • the prediction in one or more examples, can include identities of the structures (e.g., instrument, anatomy, etc.) that are identified by the machine learning execution system 340 in the portion of the video that is analyzed.
  • the prediction can also include a confidence score of the prediction.
  • Various types of information in the prediction that can be output may include phases, actions, and/or events associated with a surgical procedure.
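The prediction output described above might be packaged as a record like the hypothetical sketch below; the field names are illustrative assumptions.

```python
# Hypothetical record for a detector output as described above: the detected
# phase/activity/event, the analyzed time window, any structures identified in
# that window, and a confidence score.
from dataclasses import dataclass, field
from typing import List


@dataclass
class DetectorPrediction:
    kind: str                     # "phase", "action", or "event"
    identity: str                 # e.g., "clipping"
    start_time_s: float           # start of the analyzed video portion
    end_time_s: float             # end of the analyzed video portion
    structures: List[str] = field(default_factory=list)   # e.g., ["clip_applier"]
    confidence: float = 0.0
```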
  • the technical solutions described herein can be applied to analyze video and image data captured by cameras that are not endoscopic (i.e., cameras external to the patient’s body) when performing open surgeries (i.e., not laparoscopic surgeries).
  • the video and image data can be captured by cameras that are mounted on one or more personnel in the operating room, e.g., surgeon.
  • the cameras can be mounted on surgical instruments, walls, or other locations in the operating room.
  • FIG. 4A depicts a user interface 400 of a streaming session 401 according to one or more aspects.
  • a main display window 402 can display a surgical video stream observed by a plurality of participants watching the surgical procedure in real-time.
  • the user interface 400 can include a plurality of controls 406 accessible to participants and participant video streams 408A, 408B, 408C, 408D.
  • One or more surgical monitoring modules 103 of FIG. 1 can process the surgical video stream and/or other surgical data and determine whether any features are known in the currently displayed surgical video stream in the main display window 402.
  • machine learning models 330 of FIG. 3 can track surgical phase information to assist with identifying anatomical structures and/or surgical instruments.
  • a portion of surgical instrument 412 may be identified in proximity to a portion of another surgical instrument 414.
  • Video and image processing can be performed in combination with using surgical instrument data collected from the surgical instrumentation system 106 of FIG. 1.
  • Context information can assist in identifying one or more anatomical structures upon a user input from one or more users associated with the participant video streams 408A, 408B, 408C, 408D.
  • FIG. 4B depicts the user interface 400 of the streaming session 401 of FIG. 4A, where a user input 404 is detected according to one or more aspects.
  • the user input 404 can be a tap, a drawing motion, or other input gesture detected through the user interface 400.
  • one of the users appearing in the participant video streams 408A, 408B, 408C, 408D may select a tool from the controls 406 that supports annotation for the streaming session 401.
  • the surgical monitoring system 101 can detect the user input 404 through the user interface 400 of the streaming session 401 to select a region of interest to highlight in a surgical video stream during a surgical procedure.
  • the surgical monitoring system 101 can determine whether the region of interest is associated with a known structure or an unknown structure.
  • one or more other approaches can be used to attempt to identify the region of interest. For instance, a boundary search can be performed to locate one or more feature boundaries and highlight within the one or more feature boundaries based on determining that the region of interest is associated with an unknown structure.
  • FIG. 4C depicts the user interface 400 of the streaming session 401 where highlight 405 is applied based on detecting the user input 404 of FIG. 4B according to one or more aspects.
  • Whether the region of interest is identified by the machine learning models 330 of FIG. 3 or located through a boundary search, the surgical monitoring system 101 of FIG. 1 can continue to track an orientation and position of the region of interest during the surgical procedure to maintain the highlight 405 within the one or more feature boundaries to account for movement in the surgical video stream until a user command is received or a time period has elapsed to remove the highlight 405. For instance, tapping the location of the highlight 405 may toggle the highlight 405 off. Further, the highlight 405 may be active for a maximum period of time or until a phase transition or event is detected. Moreover, there can be different inputs selectable through the controls 406 to select a timeout-based annotation or an event/command-based annotation.
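As an illustration of the tap-to-toggle behavior mentioned above, a tap can be hit-tested against the highlight's boundary polygon with a standard ray-casting point-in-polygon check; the sketch below is hypothetical and assumes highlight objects carrying `active` and `boundary` fields.

```python
# Hypothetical hit test for the tap-to-toggle behavior described above: a tap
# toggles a highlight off only if the tap falls inside its boundary polygon
# (standard ray-casting point-in-polygon check).
def point_in_polygon(point, polygon):
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def handle_tap(tap, highlights):
    """Toggle off the first active highlight whose boundary contains the tap."""
    for h in highlights:
        if h.active and point_in_polygon(tap, h.boundary):
            h.active = False
            break
```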
  • FIG. 5A depicts a user interface 500 of a streaming session 501 according to one or more aspects.
  • a main display window 502 can display a surgical video stream observed by a plurality of participants watching the surgical procedure in real-time.
  • the user interface 500 can include a plurality of controls 506 accessible to participants and participant video streams 508A, 508B, 508C, 508D.
  • One or more surgical monitoring modules 103 of FIG. 1 can process the surgical video stream and/or other surgical data and determine whether any features are known in the currently displayed surgical video stream in the main display window 502.
  • machine learning models 330 of FIG. 3 can track surgical phase information to assist with identifying anatomical structures and/or surgical instruments.
  • FIG. 5B depicts the user interface 500 of the streaming session 501 of FIG. 5A where a first user input 504 is detected according to one or more aspects.
  • one of the users appearing in the participant video streams 508A, 508B, 508C, 508D may select a tool from the controls 506 that supports annotation for the streaming session 501.
  • the first user input 504 can be a freehand-drawn substantially complete shape that approximates a circle, oval, or other polygon.
  • a region of interest can lie within the area indicated by the first user input 504.
  • the surgical monitoring system 101 can detect the first user input 504 through the user interface 500 of the streaming session 501 to select a region of interest to highlight in a surgical video stream during a surgical procedure.
  • the surgical monitoring system 101 can determine whether the region of interest is associated with a known structure or an unknown structure and determine boundaries for highlighting. If the machine learning models 330 of FIG. 3 are unable to identify a structure within the area of the first user input 504, one or more other approaches can be used to attempt to identify a structure in the region of interest. For instance, a boundary search can be performed to locate one or more feature boundaries and highlight within the one or more feature boundaries based on determining that the region of interest is associated with an unknown structure.
  • FIG. 5C depicts the user interface 500 of the streaming session 501 where highlight 505 is applied based on detecting the first user input 504 of FIG. 5B according to one or more aspects.
  • the surgical monitoring system 101 of FIG. 1 can continue to track an orientation and position of the region of interest during the surgical procedure to maintain the highlight 505 within one or more feature boundaries to account for movement in the surgical video stream until a user command is received or a time period has elapsed to remove the highlight 505. For instance, tapping the location of the highlight 505 may toggle the highlight 505 off. Further, the highlight 505 may be active for a maximum period of time or until a phase transition is detected.
  • FIG. 5D depicts the user interface 500 of the streaming session 501 with the highlight 505 where a second user input 510 is detected according to one or more aspects.
  • the second user input 510 can be from a same user or a different user who provided the first user input 504 of FIG. 5B.
  • the second user input 510 can be a freehand-drawn substantially complete shape and need not closely approximate a circle, oval, or other polygon.
  • a boundary search and/or segmentation process can be performed to search for potential boundaries of the structure located at the second user input 510.
  • geodesic segmentation or other such segmentation approach can be used to look for features that may approximate boundaries of the region of interest.
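The sketch below is a much-simplified stand-in for such a boundary search: it grows a region outward from the selected point by intensity similarity and treats the region's extent as candidate feature boundaries. Real geodesic segmentation weights path cost by image gradients; this example only illustrates the general idea and all names are hypothetical.

```python
# Much-simplified stand-in for the boundary search described above: grow a
# region outward from the selected point, admitting neighboring pixels whose
# intensity stays close to the seed, and treat the region's extent as candidate
# feature boundaries.
from collections import deque
import numpy as np


def region_grow(gray, seed, tol=12):
    """Return a boolean mask of pixels reachable from `seed` within `tol` intensity."""
    h, w = gray.shape
    mask = np.zeros((h, w), dtype=bool)
    seed_val = float(gray[seed])
    queue = deque([seed])
    mask[seed] = True
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < h and 0 <= nc < w and not mask[nr, nc]:
                if abs(float(gray[nr, nc]) - seed_val) <= tol:
                    mask[nr, nc] = True
                    queue.append((nr, nc))
    return mask


# Toy example: a bright blob on a dark background.
img = np.zeros((6, 6), dtype=np.uint8)
img[2:5, 2:5] = 200
print(region_grow(img, (3, 3)).sum())   # 9 pixels belong to the grown region
```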
  • FIG. 5E depicts the user interface 500 of the streaming session 501 where highlight 511 is applied based on detecting the second user input 510 of FIG. 5D according to one or more aspects.
  • As depicted in FIG. 5E, whether a machine learning model 330 or a searching/segmentation approach is used, it is possible that the boundaries selected for applying the highlight 511 may not be precise. As such, an option to adjust the boundaries manually may be available through the controls 506 or gesture detection, for example.
  • FIG. 5F depicts the user interface 500 of a streaming session 501 with highlight 511 and adjustable vertices 512 of feature boundaries according to one or more aspects.
  • a user of the user interface 500 can select one or more of the adjustable vertices 512 to shift to a new location 514 to redefine the feature boundaries.
  • the locations of the vertices 512 and updates to locations of the vertices 512 can be tracked, for example, in the annotation database 163 of FIG. 1. This can assist with continued tracking and/or in selecting boundaries in future surgical procedures.
  • FIG. 5G depicts the user interface 500 of the streaming session 501 with highlight 511 adjusted after moving an adjustable vertex 512 to a new location 514 according to one or more aspects.
  • the resulting highlight 511 can remain in use and modified in position and/or orientation as the scene changes. Users can also select to turn off viewing of the vertices 512 to reduce potential distraction.
  • FIG. 6A depicts a user interface 600 of a streaming session 601 according to one or more aspects.
  • a main display window 602 can display a surgical video stream observed by a plurality of participants watching the surgical procedure in real-time.
  • the user interface 600 can include a plurality of controls 606 accessible to participants and participant video streams 608A, 608B, 608C, 608D.
  • One or more surgical monitoring modules 103 of FIG. 1 can process the surgical video stream and/or other surgical data and determine whether any features are known in the currently displayed surgical video stream in the main display window 602.
  • machine learning models 330 of FIG. 3 can track surgical phase information to assist with identifying anatomical structures and/or surgical instruments.
  • FIG. 6B depicts the user interface 600 of the streaming session 601 of FIG. 6A where a user input 604 is detected according to one or more aspects.
  • one of the users appearing in the participant video streams 608A, 608B, 608C, 608D may select a tool from the controls 606 that supports annotation for the streaming session 601.
  • the user input 604 can be a freehand-drawn shape that approximates a line.
  • a boundary search and/or segmentation process can be performed to search for potential boundaries of the structure located at the user input 604.
  • geodesic segmentation can be used to look for features that may approximate boundaries of the region of interest. The one or more boundaries may be initially defined at an edge of the main display window 602 where other features (e.g., edges) are not readily detectable in a current view.
  • FIG. 6C depicts the user interface 600 of the streaming session 601 where highlight 605 is applied based on detecting the user input 604 of FIG. 6B according to one or more aspects.
  • whether a machine learning model 330 or a searching/segmentation approach is used, it is possible that the boundaries selected for applying the highlight 605 may not be fully defined.
  • Identified boundaries or user modified boundaries of the region of interest with highlight 605 can be tracked in the annotation database 163 of FIG. 1. If a viewing perspective changes to reveal previously hidden portions of the region of interest with highlight 605, the highlight 605 can be expanded into newly revealed portions.
  • the expanded highlighting information can also be recorded in the annotation database 163 of FIG. 1 to assist with future identification and tracking of the region of interest.
  • FIG. 6D depicts the user interface 600 of the streaming session 601 where highlighting 605 of FIG. 6C is labeled according to one or more aspects.
  • a user can add a label 610 associated with a region of interest whether or not highlight 605 is actively displayed. For instance, the label 610 and highlight 605 can be displayed together or either may be toggled on or off.
  • the association formed between the label 610 and underlying region of interest can be tracked in the annotation database 163 of FIG. 1.
  • the label 610 can be suggested based on a most-likely prediction determined by a machine learning model 330 or previous annotation information captured in the annotation database 163.
  • the label 610 can be directly edited and may support auto-complete using a list of potential labels associated with the surgical procedure, such as expected anatomical structures and surgical instruments likely to be encountered. Additionally, users can type comments in a chat session that further describe the label 610. Information captured about the label 610 may also be captured in an operative note stored in EMR data 242 by the EMR system 240 of FIG. 2.
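  • A simple prefix-based auto-complete over a procedure-specific vocabulary could look like the sketch below; the example vocabulary is illustrative only and would in practice come from the expected anatomical structures and instruments for the procedure.
```python
# Hedged sketch of label auto-complete against an illustrative vocabulary.
EXPECTED_LABELS = [
    "cystic duct", "cystic artery", "gallbladder", "common bile duct",
    "grasper", "clip applier", "hook cautery",
]


def suggest_labels(prefix: str, limit: int = 5):
    """Return up to `limit` labels starting with the typed prefix."""
    prefix = prefix.strip().lower()
    return [label for label in EXPECTED_LABELS if label.startswith(prefix)][:limit]


# Example: suggest_labels("c")
# -> ["cystic duct", "cystic artery", "common bile duct", "clip applier"]
```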
  • in FIG. 7, a flowchart of a method 700 of annotation overlay is generally shown in accordance with one or more aspects. All or a portion of the method 700 can be implemented, for example, by all or a portion of the CAS system 100 of FIG. 1, the surgical procedure system 200, the system 300 of FIG. 3, and/or the computer system 800 of FIG. 8.
  • a surgical monitoring system 101 can detect a user input 404, 504, 510, 604 through a user interface 400, 500, 600 of a streaming session 401, 501, 601 to select a region of interest to highlight in a surgical video stream during a surgical procedure.
  • Two or more users can participate in the streaming session 401, 501, 601 through participant systems 165A-165N, for example.
  • the surgical monitoring system 101 can highlight the region of interest based on determining that the region of interest is associated with a known structure.
  • the surgical monitoring system 101 can perform a boundary search to locate one or more feature boundaries and highlight (e.g., highlight 405, 505, 511, 605) within the one or more feature boundaries based on determining that the region of interest is associated with an unknown structure.
  • the surgical monitoring system 101 can track an orientation and position of the region of interest during the surgical procedure to maintain the highlight within the one or more feature boundaries to account for movement in the surgical video stream until a user command is received or a time period has elapsed to remove the highlight.
  • tapping the region of interest can toggle the highlight 405, 505, 511, 605 off or back on.
  • the time period may be a configurable or predetermined time period to fade or remove the highlight. Where fading is used to make the highlight more transparent over a period of time, a user input can refresh the highlight to extend the period of time before the highlight is fully removed.
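  • One possible fade schedule is sketched below: opacity decays linearly over the configured period, and a refreshing user input restarts the countdown before the highlight is fully removed; the period and base opacity are assumed defaults.
```python
# Illustrative linear fade for a highlight overlay.
import time


class FadingHighlight:
    def __init__(self, fade_period_s: float = 60.0, base_alpha: float = 0.6):
        self.fade_period_s = fade_period_s     # assumed configurable fade period
        self.base_alpha = base_alpha           # starting opacity of the overlay
        self.refresh()

    def refresh(self) -> None:
        self.start = time.monotonic()          # user input restarts the fade

    def current_alpha(self) -> float:
        elapsed = time.monotonic() - self.start
        remaining = max(0.0, 1.0 - elapsed / self.fade_period_s)
        return self.base_alpha * remaining     # 0.0 means the highlight is removed
```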
  • a frame of the surgical video stream can be searched to locate two or more feature points of the region of interest and highlight at least a portion of the region of interest based on determining that the region of interest is associated with an unknown structure not yet identified.
  • Two or more feature points can define a line of orientation of the region of interest.
  • the surgical monitoring system 101 can track the orientation of the region of interest in one or more subsequent frames of the surgical video stream.
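  • As a sketch of orientation tracking from two feature points, Lucas-Kanade optical flow (OpenCV) is used below as one way to follow the points across frames and recompute the line of orientation; it is an illustrative choice, not the only tracking approach contemplated.
```python
# Sketch: track two feature points with optical flow and derive the
# orientation of the line joining them in the next frame.
import cv2
import numpy as np


def track_orientation(prev_gray, next_gray, points):
    """`points` is a (2, 1, 2) float32 array holding the two feature points."""
    next_points, status, _err = cv2.calcOpticalFlowPyrLK(
        prev_gray, next_gray, points, None)
    if not status.all():
        return None, points                    # a point was lost; keep previous points
    (x0, y0), (x1, y1) = next_points.reshape(2, 2)
    angle_deg = float(np.degrees(np.arctan2(y1 - y0, x1 - x0)))  # line of orientation
    return angle_deg, next_points
```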
  • the highlight 405, 505, 511, 605 can be viewed by a single user, a group of users, or all users. For instance, an observer can make private notes and see privately selected highlights or other annotations that are not shared by all users. Further, external users may have shared annotations that are not visible to a surgeon performing the surgical procedure. As another example, all users including the surgeon may see the highlights and/or other annotations on respective user interfaces 400, 500, 600. While a limited number of examples of user interfaces and user inputs are provided, it will be understood that many variations and combinations are contemplated. For instance, any number of two or more participants may use audio and/or video feeds when participating in streaming sessions. Highlighting can be limited to selected users (e.g., surgeon or instructor only) or a group of users can collaboratively add highlights and/or annotations.
  • user input can include at least a partial circle drawn upon the user interface 400, 500, 600 or a tap upon the user interface 400, 500, 600.
  • the highlight can include a change in color or contrast of an anatomical structure or surgical instrument at a location proximate to the user input.
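  • A minimal way to render such a highlight, assuming a binary mask for the region and OpenCV for blending, is to tint the masked pixels toward a chosen color while leaving the rest of the frame untouched:
```python
# Sketch: blend a colored tint over the highlighted region only.
import cv2
import numpy as np


def apply_highlight(frame_bgr: np.ndarray, mask: np.ndarray,
                    color=(0, 255, 255), alpha: float = 0.4) -> np.ndarray:
    overlay = frame_bgr.copy()
    overlay[mask > 0] = color                                  # paint the region
    return cv2.addWeighted(overlay, alpha, frame_bgr, 1.0 - alpha, 0.0)
```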
  • the surgical monitoring system 101 can populate an annotation database 163 of FIG. 1 with information associating one or more feature boundaries of a region of interest as a tracked structure.
  • the tracked structure can be identified based on receiving a label 610 input through a user interface to convert the unknown structure into the known structure.
  • the label 610 input can be prepopulated with a most likely predicted structure for user confirmation or user editing through the user interface.
  • one or more machine learning models 330 can be used to determine whether the region of interest is associated with the known structure.
  • the surgical monitoring system 101 can search the annotation database 163 for the known structure associated with the region of interest based on determining that the one or more machine learning models 330 were unable to identify the region of interest with a prediction confidence above an identification threshold.
  • one or more vertices 512 of the one or more feature boundaries can be displayed, and a position of the one or more vertices 512 can be adjusted based on a user adjustment input, for instance, to shift at least one vertex 512 to a new location 514.
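  • A sketch of such a manual adjustment step appears below: one vertex is moved to a new location and the highlight mask is rebuilt from the updated polygon; the polygon representation is an assumption for illustration.
```python
# Sketch: shift one boundary vertex and regenerate the highlight mask.
import cv2
import numpy as np


def move_vertex(vertices, index, new_xy):
    """Return a copy of the vertex list with one vertex moved to `new_xy`."""
    updated = list(vertices)
    updated[index] = new_xy
    return updated


def polygon_mask(vertices, frame_shape):
    """Rasterize the (possibly adjusted) boundary polygon into a binary mask."""
    mask = np.zeros(frame_shape[:2], dtype=np.uint8)
    pts = np.array(vertices, dtype=np.int32).reshape(-1, 1, 2)
    cv2.fillPoly(mask, [pts], 255)
    return mask
```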
  • one or more machine learning models 330 can determine a surgical phase when the region of interest is selected, and the surgical monitoring system 101 can use the surgical phase in determining whether the region of interest is associated with a known structure or an unknown structure.
  • the surgical phase can provide context that increases the probability of locating a predetermined set of structures or surgical instruments during a corresponding surgical phase.
  • a plurality of surgical instrument data can be tracked by the surgical monitoring system 101 during the surgical procedure, and the surgical instrument data can be used in combination with the surgical phase to determine whether the region of interest is associated with the known structure or the unknown structure.
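  • One way to combine such context with model output is a phase-conditioned re-weighting of structure scores, sketched below with assumed prior values for illustration.
```python
# Hedged sketch: bias vision-model scores by a per-phase prior so structures
# plausible in the current surgical phase are favored. Priors are illustrative.
PHASE_PRIORS = {
    "dissection": {"cystic duct": 0.50, "cystic artery": 0.40, "bowel": 0.10},
    "clipping":   {"cystic duct": 0.60, "cystic artery": 0.35, "bowel": 0.05},
}


def reweight(scores: dict, phase: str) -> dict:
    """Multiply model scores by phase priors and renormalize."""
    prior = PHASE_PRIORS.get(phase, {})
    combined = {name: score * prior.get(name, 0.01) for name, score in scores.items()}
    total = sum(combined.values()) or 1.0
    return {name: value / total for name, value in combined.items()}
```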
  • image/video/audio processing can be used in combination with other data sources, e.g., sensor data, to assist in structure identification in a surgical video based on time and spatial alignment, for example.
  • a user identifier can be tracked which indicates a user that selected the region of interest or modified one or more feature boundaries.
  • a record of the user identifier and one or more tracked structures can be output for post-processing analysis after completion of the surgical procedure.
  • records can be provided to the surgical data post-processing system 250 and/or the EMR system 240.
  • a user identifier of a user entering the user input can be tracked and one or more updates made by the user to modify an aspect of the region of interest can be tracked.
  • the user identifier and annotation associated with the region of interest can be stored in an operative note of the EMR system 240 for the surgical procedure as EMR data 242.
  • the surgical monitoring system 101 can detect a further user input through the user interface of the streaming session to select a further region of interest.
  • the surgical monitoring system 101 can track a position of the further region of interest and change an aspect of highlighting the region of interest based on a change of the position of the region of interest relative to the position of the further region of interest.
  • a region of interest can be indicated by highlight 405 and a further region of interest can be a portion of surgical instrument 412 of FIG. 4.
  • as the portion of the surgical instrument 412 moves relative to the region of interest identified by highlight 405, the color or another visual aspect of the highlight 405 may change (e.g., moving closer or further in distance may trigger a warning annotation).
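  • A sketch of such proximity-driven restyling is shown below; the distance threshold and colors are assumed values, and the instrument tip position is assumed to come from the instrument tracking described elsewhere herein.
```python
# Sketch: shift the highlight tint toward a warning color as the tracked
# instrument tip approaches the highlighted region's centroid.
import numpy as np


def proximity_color(roi_centroid_xy, instrument_tip_xy, warn_distance_px: float = 80.0):
    distance = float(np.hypot(roi_centroid_xy[0] - instrument_tip_xy[0],
                              roi_centroid_xy[1] - instrument_tip_xy[1]))
    if distance < warn_distance_px:
        return (0, 0, 255)     # BGR red as an assumed warning tint
    return (0, 255, 255)       # BGR yellow as an assumed default tint
```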
  • highlighting can be tracked as an overlay layer that is separate from the surgical video stream to support playback with or without highlighting being displayed over the surgical video stream.
  • a user interface such as user interface 400, 500, 600, can be configurable to allow for customized highlight settings.
  • user preferences can be set on an individual or group level that define how highlights should appear.
  • User interface configuration adjustments can change default settings and allow users to change, for instance, the color and opacity of highlight overlays.
  • highlight preferences can be defined with association to structures, such as a per-structure basis.
  • Configuration preferences can be saved as settings for a particular user (e.g. user-specific highlight settings) or group of users (e.g., a department, surgical group, institution, etc.).
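  • Such preferences could be persisted as a small settings record keyed by a user or group identifier, as in the sketch below; the field names and defaults are assumptions for illustration.
```python
# Hedged sketch of per-user/per-group highlight preferences saved as JSON.
import json
from dataclasses import dataclass, asdict, field
from typing import Dict


@dataclass
class HighlightPrefs:
    color: str = "#FFFF00"                 # default overlay color
    opacity: float = 0.4                   # default overlay opacity
    show_vertices: bool = False            # whether boundary vertices are drawn
    per_structure: Dict[str, str] = field(default_factory=dict)  # e.g. {"cystic duct": "#FF0000"}


def save_prefs(path: str, owner_id: str, prefs: HighlightPrefs) -> None:
    """Persist settings keyed by a user or group identifier."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump({owner_id: asdict(prefs)}, f, indent=2)
```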
  • the processing shown in FIG. 7 is not intended to indicate that the operations are to be executed in any particular order or that all of the operations shown in FIG. 7 are to be included in every case. Additionally, the processing shown in FIG. 7 can include any suitable number of additional operations.
  • the computer system 800 can be an electronic computer framework comprising and/or employing any number and combination of computing devices and networks utilizing various communication technologies, as described herein.
  • the computer system 800 can be easily scalable, extensible, and modular, with the ability to change to different services or reconfigure some features independently of others.
  • the computer system 800 may be, for example, a server, desktop computer, laptop computer, tablet computer, or smartphone.
  • computer system 800 may be a cloud computing node.
  • Computer system 800 may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer system.
  • program modules may include routines, programs, objects, components, logic, data structures, and so on that perform particular tasks or implement particular abstract data types.
  • Computer system 800 may be practiced in distributed cloud computing environments where tasks are performed by remote processing devices that are linked through a communications network.
  • program modules may be located in both local and remote computer system storage media, including memory storage devices.
  • the computer system 800 has one or more central processing units (CPU(s)) 801a, 801b, 801c, etc. (collectively or generically referred to as processor(s) 801).
  • the processors 801 can be a single-core processor, multi-core processor, computing cluster, or any number of other configurations.
  • the processors 801 can be any type of circuitry capable of executing instructions.
  • the processors 801, also referred to as processing circuits are coupled via a system bus 802 to a system memory 803 and various other components.
  • the system memory 803 can include one or more memory devices, such as read-only memory (ROM) 804 and a random-access memory (RAM) 805.
  • the ROM 804 is coupled to the system bus 802 and may include a basic input/output system (BIOS), which controls certain basic functions of the computer system 800.
  • the RAM is read-write memory coupled to the system bus 802 for use by the processors 801.
  • the system memory 803 provides temporary memory space for operations of said instructions during operation.
  • the system memory 803 can include random access memory (RAM), read-only memory, flash memory, or any other suitable memory systems.
  • the computer system 800 comprises an input/output (I/O) adapter 806 and a communications adapter 807 coupled to the system bus 802.
  • the I/O adapter 806 may be a small computer system interface (SCSI) adapter that communicates with a hard disk 808 and/or any other similar component.
  • the I/O adapter 806 and the hard disk 808 are collectively referred to herein as a mass storage 810.
  • Software 811 for execution on the computer system 800 may be stored in the mass storage 810.
  • the mass storage 810 is an example of a tangible storage medium readable by the processors 801, where the software 811 is stored as instructions for execution by the processors 801 to cause the computer system 800 to operate, such as is described hereinbelow with respect to the various Figures. Examples of computer program products and the execution of such instructions are discussed herein in more detail.
  • the communications adapter 807 interconnects the system bus 802 with a network 812, which may be an outside network, enabling the computer system 800 to communicate with other such systems.
  • a portion of the system memory 803 and the mass storage 810 collectively store an operating system, which may be any appropriate operating system to coordinate the functions of the various components shown in FIG. 8.
  • Additional input/output devices are shown as connected to the system bus 802 via a display adapter 815 and an interface adapter 816.
  • the adapters 806, 807, 815, and 816 may be connected to one or more I/O buses that are connected to the system bus 802 via an intermediate bus bridge (not shown).
  • a display 819 (e.g., a screen or a display monitor) is connected to the system bus 802 by the display adapter 815, which may include a graphics controller to improve the performance of graphics-intensive applications and a video controller.
  • a keyboard, a mouse, a touchscreen, one or more buttons, a speaker, etc. can be interconnected to the system bus 802 via the interface adapter 816, which may include, for example, a Super I/O chip integrating multiple device adapters into a single integrated circuit.
  • Suitable I/O buses for connecting peripheral devices such as hard disk controllers, network adapters, and graphics adapters typically include common protocols, such as the Peripheral Component Interconnect (PCI).
  • the computer system 800 includes processing capability in the form of the processors 801, storage capability including the system memory 803 and the mass storage 810, input means such as the buttons and touchscreen, and output capability including the speaker 823 and the display 819.
  • the communications adapter 807 can transmit data using any suitable interface or protocol, such as the internet small computer system interface, among others.
  • the network 812 may be a cellular network, a radio network, a wide area network (WAN), a local area network (LAN), or the Internet, among others.
  • An external computing device may connect to the computer system 800 through the network 812.
  • an external computing device may be an external web server or a cloud computing node.
  • the block diagram of FIG. 8 is not intended to indicate that the computer system 800 is to include all of the components shown in FIG. 8. Rather, the computer system 800 can include any appropriate fewer or additional components not illustrated in FIG. 8 (e.g., additional memory components, embedded controllers, modules, additional network interfaces, etc.). Further, the aspects described herein with respect to computer system 800 may be implemented with any appropriate logic, wherein the logic, as referred to herein, can include any suitable hardware (e.g., a processor, an embedded controller, or an application-specific integrated circuit, among others), software (e.g., an application, among others), firmware, or any suitable combination of hardware, software, and firmware, in various aspects. Various aspects can be combined to include two or more of the aspects described herein.
  • the present invention may be a system, a method, and/or a computer program product at any possible technical detail level of integration.
  • the computer program product may include a computer-readable storage medium (or media) having computer-readable program instructions thereon for causing a processor to carry out aspects of the present invention.
  • the computer-readable storage medium can be a tangible device that can retain and store instructions for use by an instruction execution device.
  • the computer-readable storage medium may be, for example, but is not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing.
  • a non-exhaustive list of more specific examples of the computer-readable storage medium includes the following: a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), a static random access memory (SRAM), a portable compact disc read-only memory (CD-ROM), a digital versatile disk (DVD), a memory stick, a floppy disk, a mechanically encoded device such as punch-cards or raised structures in a groove having instructions recorded thereon, and any suitable combination of the foregoing.
  • a computer-readable storage medium is not to be construed as being transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission media (e.g., light pulses passing through a fiber-optic cable), or electrical signals transmitted through a wire.
  • Computer-readable program instructions described herein can be downloaded to respective computing/processing devices from a computer-readable storage medium or to an external computer or external storage device via a network, for example, the Internet, a local area network, a wide area network, and/or a wireless network.
  • the network may comprise copper transmission cables, optical transmission fibers, wireless transmission, routers, firewalls, switches, gateway computers, and/or edge servers.
  • the wireless network(s) can include, but is not limited to, fifth generation (5G) and sixth generation (6G) protocols and connections.
  • a network adapter card or network interface in each computing/processing device receives computer-readable program instructions from the network and forwards the computer-readable program instructions for storage in a computer-readable storage medium within the respective computing/processing device.
  • Computer-readable program instructions for carrying out operations of the present invention may be assembler instructions, instruction-set-architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state-setting data, configuration data for integrated circuitry, or either source-code or object code written in any combination of one or more programming languages, including an object-oriented programming language such as Smalltalk, C++, high-level languages such as Python, or the like, and procedural programming languages, such as the “C” programming language or similar programming languages.
  • the computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server.
  • the remote computer may be connected to the user’s computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider).
  • electronic circuitry including, for example, programmable logic circuitry, field-programmable gate arrays (FPGA), or programmable logic arrays (PLA) may execute the computer-readable program instructions by utilizing state information of the computer-readable program instructions to personalize the electronic circuitry, in order to perform aspects of the present invention.
  • These computer-readable program instructions may be provided to a processor of a general-purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
  • These computer- readable program instructions may also be stored in a computer-readable storage medium that can direct a computer, a programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer-readable storage medium having instructions stored therein comprises an article of manufacture including instructions which implement aspects of the function/act specified in the flowchart and/or block diagram block or blocks.
  • the computer-readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other device to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer-implemented process, such that the instructions which execute on the computer, other programmable apparatus, or other device implement the functions/acts specified in the flowchart and/or block diagram block or blocks.
  • each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s).
  • the functions noted in the blocks may occur out of the order noted in the Figures.
  • two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved.
  • the term "exemplary" is used herein to mean "serving as an example, instance, or illustration." Any aspect or design described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other aspects or designs.
  • the terms “at least one” and “one or more” may be understood to include any integer number greater than or equal to one, i.e., one, two, three, four, etc.
  • the term "a plurality" may be understood to include any integer number greater than or equal to two, i.e., two, three, four, five, etc.
  • the term "connection" may include both an indirect "connection" and a direct "connection."
  • the described techniques may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software, the functions may be stored as one or more instructions or code on a computer-readable medium and executed by a hardware-based processing unit.
  • Computer-readable media may include non-transitory computer-readable media, which corresponds to a tangible medium such as data storage media (e.g., RAM, ROM, EEPROM, flash memory, or any other medium that can be used to store desired program code in the form of instructions or data structures and that can be accessed by a computer).
  • instructions may be executed by one or more processors, such as one or more digital signal processors (DSPs), graphics processing units (GPUs), general-purpose microprocessors, application-specific integrated circuits (ASICs), field programmable logic arrays (FPGAs), or other equivalent integrated or discrete logic circuitry.
  • the term "processor" may refer to any of the foregoing structure or any other physical structure suitable for implementation of the described techniques. Also, the techniques could be fully implemented in one or more circuits or logic elements.

Landscapes

  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Surgery (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Public Health (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Epidemiology (AREA)
  • Primary Health Care (AREA)
  • Human Computer Interaction (AREA)
  • Robotics (AREA)
  • Urology & Nephrology (AREA)
  • Biomedical Technology (AREA)
  • Heart & Thoracic Surgery (AREA)
  • Molecular Biology (AREA)
  • Animal Behavior & Ethology (AREA)
  • Veterinary Medicine (AREA)
  • Radiology & Medical Imaging (AREA)
  • Image Analysis (AREA)

Abstract

One aspect relates to annotation overlay via a streaming interface. A user input can be detected through a user interface of a streaming session to select a region of interest to highlight in a surgical video stream during a surgical procedure. A frame of the surgical video stream can be searched to locate two or more feature points of the region of interest and highlight at least a portion of the region of interest based on determining that the region of interest is associated with an unknown structure not yet identified by the system. The region of interest can be tracked during the surgical procedure to maintain the highlight to account for movement in the surgical video stream until a user command is received or a time period has elapsed to remove the highlight.
PCT/EP2024/073050 2023-08-17 2024-08-16 Superposition d'annotation par l'intermédiaire d'une interface de diffusion en continu Pending WO2025036995A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
GR20230100679 2023-08-17
GR20230100679 2023-08-17

Publications (1)

Publication Number Publication Date
WO2025036995A1 true WO2025036995A1 (fr) 2025-02-20

Family

ID=92458119

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2024/073050 Pending WO2025036995A1 (fr) 2023-08-17 2024-08-16 Superposition d'annotation par l'intermédiaire d'une interface de diffusion en continu

Country Status (1)

Country Link
WO (1) WO2025036995A1 (fr)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20230190136A1 (en) * 2020-04-13 2023-06-22 Kaliber Labs Inc. Systems and methods for computer-assisted shape measurements in video
WO2022195303A1 (fr) * 2021-03-19 2022-09-22 Digital Surgery Limited Prédiction de structures dans des données chirurgicales à l'aide d'un apprentissage automatique
WO2022195306A1 (fr) * 2021-03-19 2022-09-22 Digital Surgery Limited Détection d'états et d'instruments chirurgicaux
WO2022263870A1 (fr) * 2021-06-16 2022-12-22 Digital Surgery Limited Détection d'états chirurgicaux, de profils de mouvement et d'instruments
WO2023144570A1 (fr) * 2022-01-28 2023-08-03 Digital Surgery Limited Détection et distinction de structures critiques dans des procédures chirurgicales à l'aide d'un apprentissage machine

Similar Documents

Publication Publication Date Title
US20250148790A1 (en) Position-aware temporal graph networks for surgical phase recognition on laparoscopic videos
US12387005B2 (en) De-identifying data obtained from microphones
US20240037949A1 (en) Surgical workflow visualization as deviations to a standard
US20240153269A1 (en) Identifying variation in surgical approaches
EP4619949A1 (fr) Réseau spatio-temporel pour segmentation sémantique de vidéo dans des vidéos chirurgicales
CN118216156A (zh) 基于特征的手术视频压缩
US20240161934A1 (en) Quantifying variation in surgical approaches
US20250014717A1 (en) Removing redundant data from catalogue of surgical video
WO2025036995A1 (fr) Superposition d'annotation par l'intermédiaire d'une interface de diffusion en continu
US20240428956A1 (en) Query similar cases based on video information
WO2024223462A1 (fr) Interface utilisateur pour sélection de participants pendant une diffusion en continu d'opération chirurgicale
WO2024213571A1 (fr) Commande de permutation de chirurgiens
WO2025036996A1 (fr) Résumé chirurgical contextuel pour diffusion en continu
EP4430823A1 (fr) Adaptateurs de communication multimédia dans un environnement chirurgical
WO2025210184A1 (fr) Génération automatisée de rapport opératoire
WO2025036994A1 (fr) Notification de diffusion en continu conditionnelle au contexte
WO2025210185A1 (fr) Multimédia stocké et affiché avec une vidéo chirurgicale
WO2024110547A1 (fr) Tableau de bord d'analyse vidéo pour examen de cas
WO2025036993A1 (fr) Sélecteur de vue dynamique pour observation de salle d'opération multiple
WO2023084258A1 (fr) Compression de catalogue de vidéo chirurgicale
WO2025021978A1 (fr) Éditeur de paramètres d'intervention et base de données de paramètres d'intervention
US20250366948A1 (en) Media communication adaptors in a surgical environment
WO2025051987A1 (fr) Système d'aperçu de données chirurgicales
WO2024213771A1 (fr) Tableau de bord de données chirurgicales
EP4555529A1 (fr) Interface utilisateur pour structures détectées dans des procédures chirurgicales

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 24758218

Country of ref document: EP

Kind code of ref document: A1